Lingua::EN::Fathom(3) User Contributed Perl DocumentationLingua::EN::Fathom(3)

NAME
       Lingua::EN::Fathom - Measure readability of English text

SYNOPSIS
	   use Lingua::EN::Fathom;

	   my $text = Lingua::EN::Fathom->new();

	   # Analyse contents of a text file
	   $text->analyse_file("sample.txt");

	   # Analyse contents of a text string, accumulating statistics
	   $accumulate = 1;
	   $text->analyse_block($text_string,$accumulate);

	   # Retrieve individual statistics
	   $num_chars		  = $text->num_chars;
	   $num_words		  = $text->num_words;
	   $percent_complex_words = $text->percent_complex_words;
	   $num_sentences	  = $text->num_sentences;
	   $num_text_lines	  = $text->num_text_lines;
	   $num_blank_lines	  = $text->num_blank_lines;
	   $num_paragraphs	  = $text->num_paragraphs;
	   $syllables_per_word	  = $text->syllables_per_word;
	   $words_per_sentence	  = $text->words_per_sentence;

	   # List every unique word with its number of occurrences
	   %words = $text->unique_words;
	   foreach $word ( sort	keys %words )
	   {
	     print("$words{$word} :$word\n");
	   }

	   $fog	    = $text->fog;
	   $flesch  = $text->flesch;
	   $kincaid = $text->kincaid;

	   print($text->report);

REQUIRES
       Perl, version 5.001 or higher, Lingua::EN::Syllable

DESCRIPTION
       This module analyses English text in either a string or a file. Totals
       are then calculated for the number of characters, words, sentences,
       blank and non-blank (text) lines, and paragraphs.

       Three common readability statistics are also derived: the Fog, Flesch
       and Kincaid indices.

       All of these properties can be accessed through individual methods, or
       by generating a text report.

       A hash of all unique words and the number of times they occur is
       generated.

METHODS
   new
       The "new" method creates an instance of a text object. This must be
       called before any of the following methods are invoked. Note that the
       object only needs to be created once, and can be reused with new input
       data.

	  my $text = Lingua::EN::Fathom->new();

   analyse_file
       The "analyse_file" method takes as input the name of a text file.
       Various text-based statistics are calculated for the file. Either this
       method or "analyse_block" must be called before any of the following
       methods. An optional second argument may be supplied to control
       accumulation of statistics. If set to a nonzero value, all statistics
       are accumulated with each successive call.

	   $text->analyse_file("sample.txt");

   analyse_block
       The "analyse_block" method takes as input a text string. Various
       text-based statistics are calculated for the string. Either this method
       or "analyse_file" must be called before any of the following methods.
       An optional second argument may be supplied to control accumulation of
       statistics. If set to a nonzero value, all statistics are accumulated
       with each successive call.

	   $text->analyse_block($text_str);
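
       The accumulation argument described above can be sketched as follows.
       This is a minimal example, assuming the module is installed; the two
       input strings are made up for illustration.

```perl
use strict;
use warnings;
use Lingua::EN::Fathom;

my $text = Lingua::EN::Fathom->new();

# Hypothetical input: a document that arrives in two pieces.
my @chunks = (
    "The cat sat on the mat. It was a warm day.",
    "The dog slept in the shade. Nothing else happened.",
);

# The first call starts from fresh totals; the second passes a
# nonzero accumulate argument so its counts are added to the first.
$text->analyse_block($chunks[0]);
$text->analyse_block($chunks[1], 1);

printf "Words across both chunks: %d\n", $text->num_words;
```

       Omitting the second argument on a later call would discard the earlier
       totals and report on the latest string only.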

   num_chars
       Returns the number of characters in the analysed text file or block.
       This includes characters such as spaces and punctuation marks.

   num_words
       Returns the number of words in the analysed text	file or	block. A word
       must consist of letters a-z with	at least one vowel sound, and
       optionally an apostrophe	or hyphen. Items such as "&, K108, NW" are not
       counted as words.
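
       The word rule above can be approximated with a short sketch. This is
       not the module's own pattern: the helper name looks_like_word is
       hypothetical, and treating "vowel sound" as any of a, e, i, o, u or y
       is an assumption made for illustration.

```perl
use strict;
use warnings;

# Rough approximation of the documented rule, not the module's code.
# Assumption: a "vowel sound" means one of a, e, i, o, u, y.
sub looks_like_word {
    my ($token) = @_;

    # Letters only, with an optional apostrophe or hyphen between letters.
    return 0 unless $token =~ /^[a-z]+(?:['-][a-z]+)*$/i;

    # At least one vowel must be present.
    return $token =~ /[aeiouy]/i ? 1 : 0;
}

for my $token ("don't", "well-known", "&", "K108", "NW", "GPO") {
    printf "%-12s %s\n", $token,
        looks_like_word($token) ? "word" : "not a word";
}
```

       Under this approximation "GPO" passes because it contains a vowel,
       which matches the behaviour noted under LIMITATIONS.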

   percent_complex_words
       Returns the percentage of complex words in the analysed text file or
       block. A	complex	word must consist of three or more syllables. This
       statistic is used to calculate the fog index.
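
       To illustrate the definition, the sketch below computes the percentage
       from a small hand-made word list. The syllable counts are supplied by
       hand and are hypothetical input; the module itself obtains them from
       Lingua::EN::Syllable.

```perl
use strict;
use warnings;

# Hand-supplied syllable counts, for illustration only.
my %syllables = (
    readability => 5,
    measures    => 2,
    text        => 1,
    complexity  => 4,
    simply      => 2,
);

my @words = keys %syllables;

# A complex word has three or more syllables.
my $complex = grep { $syllables{$_} >= 3 } @words;
my $percent = 100 * $complex / @words;

printf "%.2f%% complex words\n", $percent;   # prints 40.00% complex words
```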

   num_sentences
       Returns the number of sentences in the analysed text file or block. A
       sentence	is any group of	words and non words terminated with a single
       full stop. Spaces may occur before and after the	full stop.

   num_text_lines
       Returns the number of lines containing some text	in the analysed	text
       file or block.

   num_blank_lines
       Returns the number of lines NOT containing any text in the analysed
       text file or block.

   num_paragraphs
       Returns the number of paragraphs	in the analysed	text file or block.

   syllables_per_word
       Returns the average number of syllables per word	in the analysed	text
       file or block.

   words_per_sentence
       Returns the average number of words per sentence	in the analysed	text
       file or block.

   READABILITY
       Three indices of	text readability are calculated. They all measure
       complexity as a function	of syllables per word and words	per sentence.
       They assume the text is well formed and logical.	You could analyse a
       passage of nonsensical English and find the readability is quite	good,
       provided	the words are not too complex and the sentences	not too	long.

       For more	information see:
       <http://www.plainlanguage.com/Resources/readability.html>

   fog
       Returns the Fog index for the analysed text file	or block.

	 ( words_per_sentence +	percent_complex_words )	* 0.4

       The Fog index, developed by Robert Gunning, is a well known and simple
       formula for measuring readability. The index indicates the number of
       years of formal education a reader of average intelligence would need
       to understand the text on a single reading, given its word and sentence
       workload.

	  18 unreadable
	  14 difficult
	  12 ideal
	  10 acceptable
	   8 childish
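
       As a worked illustration of the formula, the sketch below recomputes
       the index from the two averages shown in the sample output under
       report. The helper name fog_index is hypothetical and is not part of
       the module.

```perl
use strict;
use warnings;

# Fog index from the formula above; both arguments are plain numbers.
sub fog_index {
    my ($words_per_sentence, $percent_complex_words) = @_;
    return ($words_per_sentence + $percent_complex_words) * 0.4;
}

# 11.25 words per sentence and 20% complex words, as in the sample report.
printf "Fog: %.4f\n", fog_index(11.25, 20.00);   # prints Fog: 12.5000
```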

   flesch
       Returns the Flesch reading ease score for the analysed text file	or
       block.

	  206.835 - (1.015 * words_per_sentence) - (84.6 * syllables_per_word)

       This score rates text on a 100-point scale. The higher the score, the
       easier it is to understand the text. A score of 60 to 70 is considered
       to be optimal.
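
       The formula can be checked by hand with a small sketch. The helper
       name flesch_reading_ease is hypothetical; the inputs are the rounded
       averages printed in the sample output under report, so the result can
       differ from the reported score in the later decimal places.

```perl
use strict;
use warnings;

# Flesch reading ease from the formula above.
sub flesch_reading_ease {
    my ($words_per_sentence, $syllables_per_word) = @_;
    return 206.835
         - (1.015 * $words_per_sentence)
         - (84.6  * $syllables_per_word);
}

# 11.25 words per sentence, 1.7704 syllables per word.
printf "Flesch: %.2f\n", flesch_reading_ease(11.25, 1.7704);   # prints Flesch: 45.64
```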

   kincaid
       Returns the Flesch-Kincaid grade	level score for	the analysed text file
       or block.

	  (11.8 * syllables_per_word) + (0.39 * words_per_sentence) - 15.59

       This score rates text on a U.S. grade-school level. A score of 8.0
       means that the document can be understood by an eighth grader. A score
       of 7.0 to 8.0 is considered to be optimal.
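
       A minimal sketch of the same calculation, using a hypothetical helper
       named kincaid_grade and the rounded averages from the sample output
       under report:

```perl
use strict;
use warnings;

# Flesch-Kincaid grade level from the formula above.
sub kincaid_grade {
    my ($words_per_sentence, $syllables_per_word) = @_;
    return (11.8 * $syllables_per_word)
         + (0.39 * $words_per_sentence)
         - 15.59;
}

my $grade = kincaid_grade(11.25, 1.7704);
printf "Flesch-Kincaid: %.2f\n", $grade;   # prints Flesch-Kincaid: 9.69
```

       A result near 9.7 sits above the 7.0 to 8.0 range described above, so
       the sample text would read as somewhat harder than optimal.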

   unique_words
       Returns a hash of unique words. The words (in lower case) are held in
       the hash keys, while the number of occurrences of each word is held in
       the hash values.
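
       Because the result is an ordinary hash, it can be sorted by count
       rather than alphabetically. The sketch below uses a made-up hash with
       the same shape as the returned one, so that it runs without the
       module.

```perl
use strict;
use warnings;

# Stand-in for the hash returned by $text->unique_words:
# lower-case words as keys, occurrence counts as values (made-up data).
my %words = (the => 7, readability => 2, text => 4, of => 4);

# Most frequent first; ties are broken alphabetically.
my @by_frequency =
    sort { $words{$b} <=> $words{$a} || $a cmp $b } keys %words;

printf "%3d  %s\n", $words{$_}, $_ for @by_frequency;
```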

   report
	   print($text->report);

       Produces	a text based report containing all Fathom statistics for the
       currently analysed text block or	file. For example:

          Number of characters       : 813
          Number of words            : 135
          Percent of complex words   : 20.00
          Average syllables per word : 1.7704
          Number of sentences        : 12
          Average words per sentence : 11.2500
          Number of text lines       : 13
          Number of blank lines      : 8
          Number of paragraphs       : 4

          READABILITY INDICES

          Fog                        : 12.5000
          Flesch                     : 45.6429
          Flesch-Kincaid             : 9.6879

       The return value is a string containing the report contents.

SEE ALSO
       Lingua::EN::Syllable, Lingua::EN::Sentence, B::Fathom

POSSIBLE EXTENSIONS
	  Count	white space and	punctuation characters
	  Allow	user control over what strictly	defines	a word

LIMITATIONS
       The syllable count provided in Lingua::EN::Syllable is about 90%
       accurate.

       Acronyms that contain vowels, like GPO, will be counted as words.

       The Fog index should exclude proper names.

BUGS
       None known

AUTHOR
       Lingua::EN::Fathom was written by Kim Ryan <kimryan at cpan dot org>.

COPYRIGHT AND LICENSE
       Copyright (c) 2018 Kim Ryan. All	rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.0			  2018-10-31		 Lingua::EN::Fathom(3)
