Index(3)	      User Contributed Perl Documentation	      Index(3)

NAME
       Search::OpenFTS::Index -	Provides functions for indexing

SYNOPSIS
   API
       my $fts=Search::OpenFTS::Index->new( DBI	);

       my $fts=Search::OpenFTS::Index->new( DBI, prefix	);

       my $fts=Search::OpenFTS::Index->init(
	   dbi=>DBI,
	   txttid=>NAME_TXT_ID,
	   dict=>[DICT1, DICT2,	...],
	   parser=>PARSER,
	   map=>'{IDTYPELEXEM1=>[IDDICT1, ...],	...}',
	   tsvector_field=>FIELD_NAME,
	   ignore_id_index=>"IDTYPELEXEM1 [IDTYPELEXEM2	[...]]",
	   ignore_headline=>"IDTYPELEXEM1 [IDTYPELEXEM2	[...]]",
	   prefix=>PREFIX );

       This is the initialization function. It is called only once, at the
       creation	of a new search	index, to create the configuration and
       indexing	tables.

       txttid
	 The table in which the documents are stored, together with its
	 primary key column (e.g. messages.msg_id).

       dict
	 List of available dictionaries. Dictionaries should support four
	 methods: lemms, is_stoplexem, drop and init. init is used for the
	 initialization of the dictionary. lemms returns an array of lexemes
	 for a given word, and is_stoplexem answers whether the given lexeme
	 corresponds to a stop word. drop is used for clearing the
	 dictionary's tables (if any) when dropping an OpenFTS instance. The
	 methods is_stoplexem, drop and init are optional.

       parser
	 The full name of the parser in	use. Parser should have	the same
	 interface as Search::OpenFTS::Parser module.

       map
	 A mapping from	types of lexemes to dictionaries. This is helpful for
	 optimizing the	search engine and it is	also helpful for indexing
	 multi-languages or exotic-text	documents.

       tsvector_field
	 The field name that holds the text index of integers for each
	 document. This field must have the tsvector type (from
	 contrib/tsearch).

       ignore_id_index
	 Type IDs of lexemes to	ignore while indexing documents.

       ignore_headline
	 Type IDs of lexemes to	ignore while constructing headlines of the
	 search	results.

       prefix
	 If more than one content table requires indexing and searching
	 functionality, the user can pass a special parameter named prefix,
	 which is a single character from a-z. The given prefix is used, as a
	 naming convention, to create separate instances of the configuration
	 and indexing tables.

	 To specify a dictionary that requires parameters (the Snowball
	 stemmer, for example), use the following syntax:

	     dict=>[
		   # example of how to use the Snowball stemmer
		   { mod=>'Search::OpenFTS::Dict::Snowball', param=>'{lang=>"english"}' },
		   'Search::OpenFTS::Dict::UnknownDict',
		   ]
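
       The dictionary interface described under dict above can be
       illustrated with a minimal sketch. The package name
       My::Dict::Simple, the constructor, its stopwords parameter, and the
       exact calling conventions (arguments and return values of each
       method) are assumptions made for illustration only; the bundled
       dictionaries, such as Search::OpenFTS::Dict::UnknownDict, remain the
       reference for the real conventions.

```perl
package My::Dict::Simple {    # hypothetical module name
    use strict;
    use warnings;

    # Assumption: the dictionary is built from an optional hash of
    # parameters; OpenFTS may construct dictionaries differently.
    sub new {
        my ( $class, %param ) = @_;
        my @words = @{ $param{stopwords} || [qw(a an the of)] };
        my %stop  = map { ( lc($_) => 1 ) } @words;
        return bless { stop => \%stop }, $class;
    }

    # lemms: return the lexemes for a word. The only normalization
    # here is lowercasing and stripping a trailing plural "s".
    sub lemms {
        my ( $self, $word ) = @_;
        my $lexeme = lc $word;
        $lexeme =~ s/s\z// if length($lexeme) > 3;
        return ($lexeme);
    }

    # is_stoplexem (optional): 1 if the lexeme is a stop word, else 0.
    sub is_stoplexem {
        my ( $self, $lexeme ) = @_;
        return exists $self->{stop}{ lc $lexeme } ? 1 : 0;
    }

    # init and drop (optional): this dictionary keeps no tables,
    # so both are no-ops that report success.
    sub init { return 1 }
    sub drop { return 1 }
}
```

       A dictionary of this shape would be listed by name in the dict
       parameter, in the same way as Search::OpenFTS::Dict::UnknownDict
       above.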

   Methods
       index( $txt_id, [ $FH | $text | $reftext	] );
       index( $txt_id, [ $FH | $text | $reftext	], $title );
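	   Indexes the text of the document identified by $txt_id. The text
	   may be given as a file handle, a string, or a reference to a
	   string. An optional $title may be supplied as a third argument.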

       delete (	$txt_id	)
	   Deletes all records of the given identifier.

       create_index
       create_index(1);
	   Creates indices for fast searching. A non-zero argument enables
	   verbose mode.

       drop_index()
	   Removes all indices on the tables corresponding to the current
	   instance of OpenFTS. Errors are ignored and only reported as
	   warnings. This method is the opposite of create_index and is
	   useful for bulk uploading.
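
	   The bulk-upload pattern mentioned above can be sketched as
	   follows. The helper name bulk_load and the $docs hash reference
	   (document ID mapped to text) are hypothetical; only drop_index,
	   index and create_index come from this manual page, and $idx is
	   assumed to be an already constructed Search::OpenFTS::Index
	   object.

```perl
use strict;
use warnings;

# $idx is assumed to be a Search::OpenFTS::Index object; $docs is a
# hypothetical hash reference mapping document IDs to their text.
sub bulk_load {
    my ( $idx, $docs ) = @_;

    $idx->drop_index;                 # drop the indices before loading
    for my $txt_id ( sort keys %$docs ) {
        $idx->index( $txt_id, $docs->{$txt_id} );
    }
    $idx->create_index;               # rebuild the indices once, at the end

    return scalar keys %$docs;        # number of documents indexed
}
```

	   Rebuilding the indices once after the load is presumably cheaper
	   than maintaining them for every inserted document, which is the
	   usual motivation for this pattern.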

       drop()
	   Removes all tables corresponding to the current instance of
	   OpenFTS. Errors are ignored and only reported as warnings.

       start_index( $tid )
	   Opens a session for indexing.

	   Usage:

	   use IO::File;

	   my $idx = Search::OpenFTS::Index->new( ... );
	   my $idx_chunk = $idx->start_index( ID );

	   # add every HTML file in the current directory as one chunk
	   foreach my $f ( glob('*.html') ) {
		   $idx_chunk->index_chunk( IO::File->new( $f ) );
	   }

	   $idx_chunk->flush;

       fix_permissions($user)
	   Grants read-only access on the indexes and the search table to
	   the user $user, or to PUBLIC if $user is not specified.

	   Returns a true value (1) on success or an error message on
	   failure. Because an error message is also a true value in Perl,
	   check the return value explicitly against '1'.

	   Calls fix_permissions for each dictionary that provides it.
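
	   The explicit check against '1' can be sketched as follows. The
	   wrapper name grant_readonly is hypothetical, and $idx is assumed
	   to be a Search::OpenFTS::Index object.

```perl
use strict;
use warnings;

# $idx is assumed to be a Search::OpenFTS::Index object.
# Without $user, fix_permissions grants access to PUBLIC.
sub grant_readonly {
    my ( $idx, $user ) = @_;

    my $rc = defined $user
        ? $idx->fix_permissions($user)
        : $idx->fix_permissions;

    # An error message is a true value too, so compare against '1'
    # instead of testing for truth.
    return 1 if defined $rc && $rc eq '1';

    my $msg = defined $rc ? $rc : 'no return value';
    die "fix_permissions failed: $msg\n";
}
```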

       index_chunk( [FH|REFTXT|TXT], direction=>[1|-1] )
       index_chunk( [FH|REFTXT|TXT], wclass=>[A|B|C|D] )
       index_chunk( FH, direction=>[1|-1], offset=>$offset, length=>$length );
       index_chunk( FH, wclass=>[A|B|C|D], offset=>$offset, length=>$length );
	   Adds a part to an index. The 'direction' option is kept for
	   compatibility with older versions of OpenFTS. The wclass option
	   defaults to 'D'.

       flush
	   Writes the accumulated index to the database.
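
	   The chunk methods above can be combined as in the following
	   sketch, which indexes a title and a body as separate chunks of
	   one document. The helper name index_with_title is hypothetical,
	   and assigning class 'A' to the title is an illustrative choice
	   rather than an OpenFTS requirement; wclass presumably corresponds
	   to the weight labels of tsvector, but that is an assumption.

```perl
use strict;
use warnings;

# $idx is assumed to be a Search::OpenFTS::Index object. Putting the
# title in class 'A' is an illustrative choice, not an OpenFTS rule.
sub index_with_title {
    my ( $idx, $txt_id, $title, $body ) = @_;

    my $chunk = $idx->start_index($txt_id);        # open an indexing session
    $chunk->index_chunk( $title, wclass => 'A' );  # title chunk, class 'A'
    $chunk->index_chunk($body);                    # body chunk, default 'D'
    $chunk->flush;                                 # finish the session

    return 1;
}
```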

DESCRIPTION
SEE ALSO
	   The OpenFTS Primer	       (  see doc/ subdirectory	)

	   The Crash-course to OpenFTS ( in examples/ subdirectory )

	   perldoc Search::OpenFTS::Search

	   perldoc Search::OpenFTS::Parser

	   perldoc Search::OpenFTS::Dict::PorterEng

	   perldoc Search::OpenFTS::Dict::Snowball

	   perldoc Search::OpenFTS::Dict::UnknownDict

	   perldoc Search::OpenFTS::Morph::ISpell

perl v5.24.1			  2004-01-26			      Index(3)
