Text::NSP::Measures::2D::Fisher2(3)  User Contributed Perl Documentation  Text::NSP::Measures::2D::Fisher2(3)

NAME
       Text::NSP::Measures::2D::Fisher2 - Perl module that provides methods
                                          to compute Fisher's exact tests.

SYNOPSIS
       Basic Usage

         use Text::NSP::Measures::2D::Fisher2::left;

         # Counts for the bigram, taken from its 2x2 contingency table.
         my $npp = 60; my $n1p = 20; my $np1 = 20; my $n11 = 10;

         my $left_value = calculateStatistic( n11=>$n11,
                                              n1p=>$n1p,
                                              np1=>$np1,
                                              npp=>$npp);

         # A non-zero error code means the statistic could not be computed.
         my $errorCode;
         if( ($errorCode = getErrorCode()))
         {
           print STDERR $errorCode." - ".getErrorMessage()."\n";
         }
         else
         {
           print getStatisticName()." value for bigram is ".$left_value."\n";
         }

DESCRIPTION
       This module provides a framework for the naive implementation of
       Fisher's exact tests. That is, the implementation has no performance
       optimizations: it computes the factorials for the hypergeometric
       probabilities by direct multiplication.

       Use this measure if you need exact values without any rounding errors
       and are not concerned about performance. Otherwise, use the
       implementations under the Text::NSP::Measures::2D::Fisher module.

       To use this implementation with statistic.pl, you have to specify the
       entire module name. Usage:

       statistic.pl Text::NSP::Measures::2D::Fisher2::left dest.txt source.cnt

       Assume that the frequency count data associated with a bigram
       <word1><word2> is stored in a 2x2 contingency table:

                  word2   ~word2
          word1    n11      n12 | n1p
         ~word1    n21      n22 | n2p
                   --------------
                   np1      np2   npp

       where n11 is the number of times <word1><word2> occur together, n12
       is the number of times <word1> occurs with some word other than word2,
       and n1p is the total number of times that word1 occurs as the first
       word in a bigram.
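
       The remaining cells and marginals follow from the four counts passed
       in the SYNOPSIS. The sketch below shows that arithmetic. The helper
       name complete_table is hypothetical and is not part of Text::NSP.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::NSP): derive the remaining cells
# and marginal totals of the 2x2 table from n11, n1p, np1 and npp.
sub complete_table {
    my (%c) = @_;
    my %t = %c;
    $t{n12} = $c{n1p} - $c{n11};    # word1 followed by a word other than word2
    $t{n21} = $c{np1} - $c{n11};    # word2 preceded by a word other than word1
    $t{n2p} = $c{npp} - $c{n1p};    # bigrams whose first word is not word1
    $t{np2} = $c{npp} - $c{np1};    # bigrams whose second word is not word2
    $t{n22} = $t{n2p} - $t{n21};    # bigrams containing neither word in place
    return %t;
}

# Counts from the SYNOPSIS example.
my %table = complete_table(n11 => 10, n1p => 20, np1 => 20, npp => 60);
printf "%-4s %d\n", $_, $table{$_} for sort keys %table;
```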

       Fisher's exact tests are calculated by fixing the marginal totals
       and computing the hypergeometric probabilities for all the possible
       contingency tables.
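
       As a rough illustration, the probability of one table with fixed
       marginals is C(n1p, n11) * C(n2p, np1 - n11) / C(npp, np1). The sketch
       below builds each binomial coefficient by direct multiplication and
       lists the probability of every table the marginals allow. It uses
       ordinary floating-point numbers, so it is not exact, and the function
       names are hypothetical rather than taken from the module.

```perl
use strict;
use warnings;

# Binomial coefficient C(n, k), built up by direct multiplication.
sub choose {
    my ($n, $k) = @_;
    return 0 if $k < 0 || $k > $n;
    $k = $n - $k if $k > $n - $k;          # C(n, k) == C(n, n - k)
    my $c = 1;
    $c = $c * ($n - $k + $_) / $_ for 1 .. $k;
    return $c;
}

# Hypergeometric probability of the table with this n11 and fixed marginals.
sub table_probability {
    my ($n11, $n1p, $np1, $npp) = @_;
    return choose($n1p, $n11) * choose($npp - $n1p, $np1 - $n11)
         / choose($npp, $np1);
}

# Probabilities for every n11 the marginals allow, keyed by n11.
sub distribution {
    my ($n1p, $np1, $npp) = @_;
    my $low = $n1p + $np1 - $npp;          # smallest n11 that keeps n22 >= 0
    $low = 0 if $low < 0;
    my $high = $n1p < $np1 ? $n1p : $np1;  # n11 cannot exceed either marginal
    my %p;
    $p{$_} = table_probability($_, $n1p, $np1, $npp) for $low .. $high;
    return \%p;
}

# Marginals from the SYNOPSIS example.
my $dist = distribution(20, 20, 60);
printf "n11=%2d  p=%.6g\n", $_, $dist->{$_} for sort { $a <=> $b } keys %$dist;
```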

       A left-sided test is calculated by adding the probabilities of all the
       possible two-by-two contingency tables formed by fixing the marginal
       totals and changing the value of n11 to less than the given value. A
       left-sided Fisher's exact test tells us how likely it is to randomly
       sample a table where n11 is less than observed. In other words, it
       tells us how likely it is to sample an observation where the two words
       are less dependent than currently observed.

       A right-sided test is calculated by adding the probabilities of all the
       possible two-by-two contingency tables formed by fixing the marginal
       totals and changing the value of n11 to greater than or equal to the
       given value. A right-sided Fisher's exact test tells us how likely it
       is to randomly sample a table where n11 is at least as large as
       observed. In other words, it tells us how likely it is to sample an
       observation where the two words are at least as dependent as currently
       observed.
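
       The two one-sided sums can be sketched as follows. This is an
       illustration under stated assumptions, not the module's code: it
       follows the common convention of including the observed table in both
       tails (adjust the loop bound for a strict "less than" reading of the
       left-sided test), it uses floating-point arithmetic, and the function
       names are hypothetical.

```perl
use strict;
use warnings;

# Binomial coefficient C(n, k), built up by direct multiplication.
sub choose {
    my ($n, $k) = @_;
    return 0 if $k < 0 || $k > $n;
    $k = $n - $k if $k > $n - $k;
    my $c = 1;
    $c = $c * ($n - $k + $_) / $_ for 1 .. $k;
    return $c;
}

# Hypergeometric probability of the table with this n11 and fixed marginals.
sub table_probability {
    my ($n11, $n1p, $np1, $npp) = @_;
    return choose($n1p, $n11) * choose($npp - $n1p, $np1 - $n11)
         / choose($npp, $np1);
}

# Left tail: tables whose n11 runs from the smallest possible value up to
# the observed one (observed table included, by assumption).
sub left_tail {
    my ($n11, $n1p, $np1, $npp) = @_;
    my $low = $n1p + $np1 - $npp;
    $low = 0 if $low < 0;
    my $sum = 0;
    $sum += table_probability($_, $n1p, $np1, $npp) for $low .. $n11;
    return $sum;
}

# Right tail: tables whose n11 runs from the observed value up to the
# largest value the marginals allow.
sub right_tail {
    my ($n11, $n1p, $np1, $npp) = @_;
    my $high = $n1p < $np1 ? $n1p : $np1;
    my $sum = 0;
    $sum += table_probability($_, $n1p, $np1, $npp) for $n11 .. $high;
    return $sum;
}

# Counts from the SYNOPSIS example: n11, n1p, np1, npp.
my @counts = (10, 20, 20, 60);
printf "left  = %.6f\n", left_tail(@counts);
printf "right = %.6f\n", right_tail(@counts);
```

       Because both tails include the observed table in this sketch, the two
       values add up to one plus the probability of the observed table.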

       A two-tailed Fisher's test is calculated by adding the probabilities of
       all the contingency tables with probabilities less than the probability
       of the observed table. The two-tailed Fisher's test tells us how likely
       it would be to observe a contingency table that is less probable than
       the current table.
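
       A minimal sketch of the two-tailed sum follows. It assumes the common
       convention that the observed table and any equally probable tables are
       included in the sum, and it compares floating-point probabilities with
       a small relative tolerance. The function names and the tolerance are
       hypothetical choices, not taken from the module.

```perl
use strict;
use warnings;

# Binomial coefficient C(n, k), built up by direct multiplication.
sub choose {
    my ($n, $k) = @_;
    return 0 if $k < 0 || $k > $n;
    $k = $n - $k if $k > $n - $k;
    my $c = 1;
    $c = $c * ($n - $k + $_) / $_ for 1 .. $k;
    return $c;
}

# Hypergeometric probability of the table with this n11 and fixed marginals.
sub table_probability {
    my ($n11, $n1p, $np1, $npp) = @_;
    return choose($n1p, $n11) * choose($npp - $n1p, $np1 - $n11)
         / choose($npp, $np1);
}

# Two-tailed value: add up every table that is no more probable than the
# observed one (ties and the observed table included, by assumption).
sub two_tailed {
    my ($n11, $n1p, $np1, $npp) = @_;
    my $low = $n1p + $np1 - $npp;
    $low = 0 if $low < 0;
    my $high = $n1p < $np1 ? $n1p : $np1;
    my $observed = table_probability($n11, $n1p, $np1, $npp);
    my $sum = 0;
    for my $k ($low .. $high) {
        my $p = table_probability($k, $n1p, $np1, $npp);
        $sum += $p if $p <= $observed * (1 + 1e-7);   # tolerance for ties
    }
    return $sum;
}

# Counts from the SYNOPSIS example: n11, n1p, np1, npp.
printf "two-tailed = %.6f\n", two_tailed(10, 20, 20, 60);
```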

   Methods
       getValues() - This method calls the computeObservedValues() and the
       computeExpectedValues() methods to compute the observed and marginal
       total values. It checks these values for any errors that might cause
       the Fisher's exact test measures to fail.
           INPUT PARAMS  : $count_values       .. Reference to an array
                                                  containing the count values
                                                  computed by the count.pl
                                                  program.

           RETURN VALUES : 1/undef             .. Returns '1' to indicate
                                                  success and an undefined
                                                  (NULL) value to indicate
                                                  failure.

       computeDistribution() - This method calculates the probabilities for
       all the possible tables.
           INPUT PARAMS  : $n11_start          .. The value of cell 1,1 in
                                                  the first contingency table.
                           $final_limit        .. The value of cell 1,1 in
                                                  the last contingency table
                                                  for which we have to compute
                                                  the probability.

           RETURN VALUES : $probability        .. Reference to a hash
                                                  containing hypergeometric
                                                  probabilities for all the
                                                  possible contingency tables.

AUTHOR
       Ted Pedersen,		    University of Minnesota Duluth
				    <tpederse@d.umn.edu>

       Satanjeev Banerjee,	    Carnegie Mellon University
				    <satanjeev@cmu.edu>

       Amruta Purandare,	    University of Pittsburgh
				    <amruta@cs.pitt.edu>

       Bridget Thomson-McInnes,	    University of Minnesota Twin Cities
				    <bthompson@d.umn.edu>

       Saiyam Kohli,		    University of Minnesota Duluth
				    <kohli003@d.umn.edu>

HISTORY
       Last updated: $Id: Fisher2.pm,v 1.11 2008/03/26 17:18:26	tpederse Exp $

BUGS
SEE ALSO
       <http://groups.yahoo.com/group/ngram/>

       <http://www.d.umn.edu/~tpederse/nsp.html>

COPYRIGHT
       Copyright (C) 2000-2006,	Ted Pedersen, Satanjeev	Banerjee, Amruta
       Purandare, Bridget Thomson-McInnes and Saiyam Kohli

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either	version	2 of the License, or (at your
       option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A	PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.

       You should have received	a copy of the GNU General Public License along
       with this program; if not, write	to

	   The Free Software Foundation, Inc.,
	   59 Temple Place - Suite 330,
	   Boston, MA  02111-1307, USA.

       Note: a copy of the GNU General Public License is available on the web
       at <http://www.gnu.org/licenses/gpl.txt>	and is included	in this
       distribution as GPL.txt.

perl v5.24.1                      2008-03-2  Text::NSP::Measures::2D::Fisher2(3)
