# FreeBSD Manual Pages

Text::NSP::Measures::3D::MI::tmi(3)  User Contributed Perl Documentation  Text::NSP::Measures::3D::MI::tmi(3)

NAME
Text::NSP::Measures::3D::MI::tmi - Perl implementation for True Mutual
Information for trigrams.

SYNOPSIS
Basic Usage

use Text::NSP::Measures::3D::MI::tmi;

# Trigram counts in the notation used by count.pl.
$tmi_value = calculateStatistic( n111=>10,
                                 n1pp=>40,
                                 np1p=>45,
                                 npp1=>42,
                                 n11p=>20,
                                 n1p1=>23,
                                 np11=>21,
                                 nppp=>100);

# A non-zero error code means the counts were rejected.
if( ($errorCode = getErrorCode()))
{
  print STDERR $errorCode." - ".getErrorMessage()."\n";
}
else
{
  print getStatisticName()." value for trigram is ".$tmi_value."\n";
}

DESCRIPTION
True Mutual Information (tmi) is defined as the weighted average of the
pointwise mutual information (PMI) values over all pairs of observed
and expected cell values.

tmi = [n111/nppp * log(n111/m111) + n112/nppp * log(n112/m112) +
       n121/nppp * log(n121/m121) + n122/nppp * log(n122/m122) +
       n211/nppp * log(n211/m211) + n212/nppp * log(n212/m212) +
       n221/nppp * log(n221/m221) + n222/nppp * log(n222/m222)]

PMI = log(n111/m111)

Here n111 represents the observed value for the cell (1,1,1) and m111
represents the expected value for that cell. The expected values for
the internal cells are calculated by taking the product of their
associated marginals and dividing by the square of the sample size, so
that the expected values sum to nppp. For example:

         n1pp * np1p * npp1
  m111 = ------------------
               nppp^2
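
As a rough illustration of the definitions above, the following pure-Perl
sketch recovers the eight observed cells from the counts used in the
SYNOPSIS, computes the expected values, and sums the weighted log ratios.
The helper name tmi_sketch is hypothetical and is not part of Text::NSP.
The sketch assumes the complete-independence model (product of the three
marginals divided by nppp squared), treats an empty cell as contributing
zero, and uses the natural logarithm. The module itself may use a
different log base, so the value need not match calculateStatistic().

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::NSP: computes tmi from the
# trigram counts named in the SYNOPSIS.
sub tmi_sketch {
    my (%c) = @_;
    my $N = $c{nppp};

    # Observed cells, recovered from the marginals by inclusion-exclusion.
    my %n = (
        111 => $c{n111},
        112 => $c{n11p} - $c{n111},
        121 => $c{n1p1} - $c{n111},
        211 => $c{np11} - $c{n111},
        122 => $c{n1pp} - $c{n11p} - $c{n1p1} + $c{n111},
        212 => $c{np1p} - $c{n11p} - $c{np11} + $c{n111},
        221 => $c{npp1} - $c{n1p1} - $c{np11} + $c{n111},
    );
    # The last cell is whatever remains of the sample.
    $n{222} = $N - $n{111} - $n{112} - $n{121} - $n{211}
                 - $n{122} - $n{212} - $n{221};

    # Per-position totals: index 1 = the word itself, 2 = any other word.
    my @w1 = (undef, $c{n1pp}, $N - $c{n1pp});
    my @w2 = (undef, $c{np1p}, $N - $c{np1p});
    my @w3 = (undef, $c{npp1}, $N - $c{npp1});

    my $tmi = 0;
    my %m;
    for my $i (1, 2) {
        for my $j (1, 2) {
            for my $k (1, 2) {
                my $cell = "$i$j$k";
                # Assumed independence model: marginals over nppp squared.
                $m{$cell} = $w1[$i] * $w2[$j] * $w3[$k] / ($N ** 2);
                # Assumption: an empty cell contributes nothing.
                next if $n{$cell} <= 0;
                $tmi += $n{$cell} / $N * log($n{$cell} / $m{$cell});
            }
        }
    }
    return ($tmi, \%n, \%m);
}

my ($tmi, $observed, $expected) = tmi_sketch(
    n111 => 10, n1pp => 40, np1p => 45, npp1 => 42,
    n11p => 20, n1p1 => 23, np11 => 21, nppp => 100,
);
printf "m111 = %.2f, n222 = %d, tmi = %.4f (natural log)\n",
    $expected->{111}, $observed->{222}, $tmi;
```

With these counts the eight observed cells sum to nppp, and so do the
eight expected values, which is a quick sanity check on any set of
counts before computing the measure.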

Methods
calculateStatistic($count_values) - This method calculates the tmi
value

INPUT PARAMS  : $count_values .. Reference to a hash containing
                                 the count values computed by the
                                 count.pl program.

RETURN VALUES : $tmi          .. TMI value for this trigram.

getStatisticName() - Returns the name of this statistic

INPUT PARAMS  : none

RETURN VALUES : $name         .. Name of the measure.

AUTHOR
Ted Pedersen,                University of Minnesota Duluth
                             <tpederse@d.umn.edu>

Satanjeev Banerjee,          Carnegie Mellon University
                             <satanjeev@cmu.edu>

Amruta Purandare,            University of Pittsburgh
                             <amruta@cs.pitt.edu>

Bridget Thomson-McInnes,     University of Minnesota Twin Cities
                             <bthompson@d.umn.edu>

Saiyam Kohli,                University of Minnesota Duluth
                             <kohli003@d.umn.edu>

HISTORY
Last updated: $Id: tmi.pm,v 1.10 2006/06/21 11:10:53 saiyam_kohli Exp $

BUGS

SEE ALSO
@inproceedings{moore:2004:EMNLP,
  author    = {Moore, Robert C.},
  title     = {On Log-Likelihood-Ratios and the Significance of Rare Events},
  booktitle = {Proceedings of EMNLP 2004},
  editor    = {Dekang Lin and Dekai Wu},
  year      = 2004,
  month     = {July},
  publisher = {Association for Computational Linguistics},
  pages     = {333--340},
  url       = {http://acl.ldc.upenn.edu/acl2004/emnlp/pdf/Moore.pdf}
}

<http://groups.yahoo.com/group/ngram/>

<http://www.d.umn.edu/~tpederse/nsp.html>

COPYRIGHT
Copyright (C) 2000-2006, Ted Pedersen, Satanjeev Banerjee, Amruta
Purandare, Bridget Thomson-McInnes and Saiyam Kohli

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to

The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA  02111-1307, USA.

Note: a copy of the GNU General Public License is available on the web
at <http://www.gnu.org/licenses/gpl.txt> and is included in this
distribution as GPL.txt.

perl v5.32.0                 2006-06-2                 Text::NSP::Measures::3D::MI::tmi(3)