AI::Categorizer::Collection(3)  User Contributed Perl Documentation  AI::Categorizer::Collection(3)

NAME
       AI::Categorizer::Collection - Access stored documents

SYNOPSIS
         use AI::Categorizer::Collection::Files;

         # Create a concrete collection backed by files on disk
         my $c = AI::Categorizer::Collection::Files->new(
           path => '/tmp/docs/training',
           category_file => '/tmp/docs/cats.txt');
         print "Total number of docs: ", $c->count_documents, "\n";
         while (my $document = $c->next) {
           ...
         }
         $c->rewind; # For further operations

DESCRIPTION
       This abstract class implements an iterator for accessing documents in
       their natively stored format.  Because the Collection class is
       abstract, you cannot create an instance of it directly; see the
       documentation for the "Files", "SingleFile", or "InMemory" subclasses
       for a concrete interface.

METHODS
       new()
	   Creates a new Collection object and returns it.  Accepts the
	   following parameters:

	   category_hash
               A reference to a hash that maps document names to category
               names.  The keys of the hash are the document names; each
               value should be a reference to an array containing the names
               of the categories to which that document belongs.

	   category_file
               Names a file that should be read in order to create the
               "category_hash".  Each line of the file should list a
               document's name, followed by that document's category names,
               all separated by whitespace.

	   stopword_file
               Specifies a file containing a list of "stopwords", which are
               words that should automatically be disregarded when
               scanning/reading documents.  The file should contain one word
               per line.  The file will be parsed and then fed as the
               "stopwords" parameter to the Document "new()" method.

	   verbose
               If true, some status/debugging information will be printed to
               "STDOUT" during operation.

	   document_class
               The class of Document object that should be created for each
               document.  This generally specifies the format that the
               documents are stored in.  The default is
               "AI::Categorizer::Document::Text".

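           The "category_file" and "stopword_file" layouts described above
           can be illustrated with a minimal sketch.  The helpers
           "parse_category_lines" and "read_stopwords" are hypothetical and
           are not part of AI::Categorizer; the module's own parsing may
           differ, and the hash-reference shape used for the stopwords is an
           assumption made only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Categorizer): build a category_hash
# from lines in the category_file layout described above.
sub parse_category_lines {
    my (@lines) = @_;
    my %category_hash;
    for my $line (@lines) {
        next unless $line =~ /\S/;             # skip blank lines
        my ($doc, @cats) = split ' ', $line;   # split on runs of whitespace
        $category_hash{$doc} = [@cats];        # value is an array reference
    }
    return \%category_hash;
}

# Hypothetical helper: read one stopword per line from a filehandle.
# Returning a hash reference is an assumption made for this sketch.
sub read_stopwords {
    my ($fh) = @_;
    my %stopwords;
    while (my $word = <$fh>) {
        $word =~ s/^\s+|\s+$//g;               # trim whitespace and the newline
        next if $word eq '';
        $stopwords{$word} = 1;
    }
    return \%stopwords;
}

my $category_hash = parse_category_lines(
    "doc1.txt sports hockey\n",
    "doc2.txt politics\n",
);
print "doc1.txt: @{ $category_hash->{'doc1.txt'} }\n";

# An in-memory filehandle stands in for a real stopword file.
my $stopword_text = "the\nand\n\nof\n";
open my $fh, '<', \$stopword_text or die "Cannot open in-memory file: $!";
my $stopwords = read_stopwords($fh);
close $fh;

my @kept = grep { !$stopwords->{$_} } qw(the history of hockey);
print "kept: @kept\n";
```

           A structure shaped like $category_hash here matches the
           description of the "category_hash" parameter: document names as
           keys, array references of category names as values.
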
       next()
	   Returns the next Document object in the Collection.

       rewind()
	   Resets the iterator for further calls to "next()".

       count_documents()
           Returns the total number of documents in the Collection.  Note that
           this usually resets the iterator, because it may not be possible to
           resume iterating from the previous position.  Avoid calling it
           partway through a loop over "next()".
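
           The interaction between "next()", "rewind()", and
           "count_documents()" can be sketched with a toy class.
           "My::ToyCollection" is a hypothetical stand-in written for this
           example, not the module's implementation; it assumes that
           "next()" returns a false value once the collection is exhausted
           (as the SYNOPSIS loop suggests) and that counting is done by
           walking the whole collection.

```perl
use strict;
use warnings;

# Toy stand-in for a concrete Collection subclass; NOT the module's code.
package My::ToyCollection;

sub new {
    my ($class, @docs) = @_;
    return bless { docs => [@docs], pos => 0 }, $class;
}

# Return the next item, or undef once the collection is exhausted.
sub next {
    my ($self) = @_;
    return if $self->{pos} >= @{ $self->{docs} };
    return $self->{docs}[ $self->{pos}++ ];
}

# Move the iterator back to the first item.
sub rewind {
    my ($self) = @_;
    $self->{pos} = 0;
    return;
}

# Count by walking the whole collection, which loses the old position.
sub count_documents {
    my ($self) = @_;
    $self->rewind;
    my $n = 0;
    $n++ while defined $self->next;
    $self->rewind;
    return $n;
}

package main;

my $c = My::ToyCollection->new(qw(a b c));
my $first = $c->next;              # first item
my $total = $c->count_documents;   # counting resets the iterator
my $again = $c->next;              # first item again, not the second
print "$first $total $again\n";
```

           In this sketch the call to "count_documents()" between the two
           "next()" calls sends the iterator back to the start, which is why
           counting before iterating is the safer order.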

AUTHOR
       Ken Williams, ken@mathforum.org

COPYRIGHT
       Copyright 2002-2003 Ken Williams.  All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO
       AI::Categorizer(3), Storable(3)

perl v5.32.0			  2020-08-09	AI::Categorizer::Collection(3)
