Apache::Solr(3)	      User Contributed Perl Documentation      Apache::Solr(3)

       Apache::Solr - Apache Solr (Lucene) extension

        Apache::Solr is extended by Apache::Solr::JSON and Apache::Solr::XML

	 # use Log::Report mode	=> "DEBUG";
	 my $solr    = Apache::Solr->new(server	=> $url);

	 my $doc     = Apache::Solr::Document->new(...);
	 my $results = $solr->addDocument($doc);
	 $results or die $results->errors;

	 my $results = $solr->select(q => 'author:mark');
	 my $doc     = $results->selected(3);
	 print $doc->_author;

	 my $results = $solr->select(q => "really", hl => {fl=>'content'});
         while(my $doc = $results->nextSelected)
         {   my $hldoc = $results->highlighted($doc);
             print $hldoc->_content;
         }

	 # based on Log::Report, hence (for communication errors and such)
	 use Log::Report;
	 dispatcher SYSLOG => 'default';  # now	all warnings/error to syslog
	 try { $solr->select(...) }; print $@->wasFatal;

       Solr is a stand-alone full-text search-engine (based on Lucene), with
       loads of features.  This module tries to provide a high level interface
       to the Solr server.

       See	and

       Apache::Solr->new(%options)
           Create a client to connect to one "core" (collection) of the Solr
           server.

	    -Option	   --Default
	     agent	     <created internally>
	     autocommit	     true
	     core	     undef
	     format	     'XML'
	     server	     <required>
	     server_version  <latest>

	   agent => LWP::UserAgent object
	     Agent which implements the	communication between this client and
	     the Solr server.

	     When you have multiple "Apache::Solr" objects in your program,
	     you may want to share this	agent, to share	the connection.	Since
	     [0.94], this will happen automagically: the parameter defaults to
	     the agent created for the previous	object.

	     Do	not forget to install LWP::Protocol::https if you need to
	     connect via https.

	   autocommit => BOOLEAN
	     Commit all	changes	immediately unless specified differently.

	   core	=> NAME
	     Set the core name to be addressed by this client. When there is
	     no	core name specified, the core is selected by the server	or
	     already part of the URL.

             You probably want to set up one core dedicated to testing and
             one for the live environment.

	   format => 'XML'|'JSON'
	     Communication format between client and server.  You may also
	     instantiate Apache::Solr::XML or Apache::Solr::JSON directly.

	   server => URL
             The location of the Solr server depends on the way the Java
             environment is set up.  The URL is either a URI object or a
             string which can be instantiated as such.

	   server_version => VERSION
	     By	default	the latest version of the server software, currently
	     4.5.  Try to get this setting right, because it will help you a
	     lot in correct parameter use and support for the right features.
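       The constructor options above can be combined as in the following
       minimal sketch.  The URL, core name and version are placeholder
       values chosen for illustration, not defaults of the module.  The
       module is loaded at run time, so the script still reports something
       useful when Apache::Solr is not installed.

```perl
use strict;
use warnings;

# Placeholder values, assumed for this example only.
my %options =
  ( server         => 'http://localhost:8983/solr'
  , core           => 'testing'
  , format         => 'XML'
  , server_version => '4.5'     # passed as a string, not as a number
  , autocommit     => 0
  );

# Constructing the client inside eval keeps a missing module or a
# rejected option from aborting the script.
my $solr = eval { require Apache::Solr; Apache::Solr->new(%options) };

if(defined $solr)
{   # core() and autocommit() are the accessors documented in this page
    printf "core=%s autocommit=%s\n", $solr->core
      , ($solr->autocommit ? 'on' : 'off');
}
else
{   print "no client created: $@\n";
}
```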

	   Returns the LWP::UserAgent object which maintains the connection to
	   the server.

       $obj->autocommit( [BOOLEAN] )
       $obj->core( [$core] )
           Returns the $core; when it is not defined, returns the default
           core as set by new(core).  May return "undef".

       $obj->server( [$uri|STRING] )
	   Returns the URI object which	refers to the server base address.
	   You need to clone() it before modifying.  You may set a new value
	   as STRING or	$uri object.

           Returns the specified version of the Solr server software (by
           default the latest).  Treat this version as a string, to avoid
           rounding errors.
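       Why the version is best kept as a string can be seen in a small
       sketch.  The helper version_cmp() is hypothetical and not part of
       Apache::Solr; it assumes versions consist of numeric components
       separated by dots.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::Solr: compare two dotted
# version strings component by component.
sub version_cmp {
    my @left  = split /\./, $_[0];
    my @right = split /\./, $_[1];
    while(@left || @right) {
        # a missing component counts as 0, so '4.5' equals '4.5.0'
        my $cmp = (shift(@left) // 0) <=> (shift(@right) // 0);
        return $cmp if $cmp;
    }
    return 0;
}

# As a number, 4.10 collapses to 4.1 and sorts before 4.5.
print "numeric: ", (4.10 < 4.5 ? "4.10 before 4.5" : "4.10 after 4.5"), "\n";

# As strings compared per component, 4.10 sorts after 4.5.
print "string:  ", (version_cmp('4.10', '4.5') > 0
    ? "4.10 after 4.5" : "4.10 before 4.5"), "\n";
```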


       $obj->queryTerms($terms)
           Search for frequently used terms.  See

	   $terms are passed to	expandTerms() before being used.

	   Be warned: The result is not	sorted when XML	communication is used,
	   even	when you explicitly request it.


             my $r = $self->queryTerms(fl => 'subject', limit => 100);
             if($r)   # a result object is true when the call succeeded
             {   foreach my $hit ($r->terms('subject'))
                 {   my ($term, $count) = @$hit;
                     print "term=$term, count=$count\n";
                 }
             }

             # or test the result in the same statement
             if(my $r = $self->queryTerms(fl => 'subject', limit => 100))
             {   ...
             }

       $obj->select(@parameters)
           Find information in the document collection.

           This method has a HUGE number of parameters.  These values are
           passed in the URI of the HTTP query to the Solr server.  See
           expandSelect() for all the simplifications offered here.  Sets of
           these parameters may need configuration help in the server as well.


       See  Missing are the
       atomic updates.

       $obj->addDocument( <$doc|ARRAY>,	%options )
	   Add one or more documents (Apache::Solr::Document objects) to the
	   Solr	database on the	server.

	    -Option	       --Default
	     allowDups		 <false>
	     commit		 <autocommit>
	     commitWithin	 undef
	     overwrite		 <true>
	     overwriteCommitted	 <not allowDups>
	     overwritePending	 <not allowDups>

	   allowDups =>	BOOLEAN
	     [removed since Solr 4.0]  Use option "overwrite".

	   commit => BOOLEAN
	   commitWithin	=> SECONDS
             [Since Solr 3.4] Automatically translated into 'commit' for older
             servers.  Currently, the resolution is milliseconds.

	   overwrite =>	BOOLEAN
	   overwriteCommitted => BOOLEAN
	     [removed since Solr 4.0]  Use option "overwrite".

	   overwritePending => BOOLEAN
	     [removed since Solr 4.0]  Use option "overwrite".

	    -Option	   --Default
	     expungeDeletes  <false>
	     softCommit	     <false>
	     waitFlush	     <true>
	     waitSearcher    <true>

	   expungeDeletes => BOOLEAN
	     [since Solr 1.4]

	   softCommit => BOOLEAN
	     [since Solr 4.0]

	   waitFlush =>	BOOLEAN
	     [before Solr 1.4, removed in 4.0]

           waitSearcher => BOOLEAN

       $obj->delete(%options)
           Remove one or more documents, based on id or query.

	    -Option	  --Default
	     commit	    <autocommit>
	     fromCommitted  true
	     fromPending    true
	     id		    undef
	     query	    undef

	   commit => BOOLEAN
             When specified, it indicates whether to commit (update the
             indexes) after the last delete.  By default the value of
             new(autocommit).
	   fromCommitted => BOOLEAN
	     [deprecated since ?]

	   fromPending => BOOLEAN
	     [deprecated since ?]

	   id => ID|ARRAY-of-IDs
	     The expected content of the uniqueKey fields (usually named "id")
	     for the documents to be removed.

           query => QUERY|ARRAY-of-QUERYs

       $obj->extractDocument(%options)
           Call the Solr Tika built-in to have the server translate various
	   kinds of structured documents into Solr searchable documents.  This
	   component is	also called "Solr Cell".

	   The %options	are mostly passed on as	attributes to the server call,
	   but there are a few more.  You need to pass either a	"file" or
	   "string" with data.


	    -Option	 --Default
	     commit	   new(autocommit)
             content_type  <from filename>
	     file	   undef
	     string	   undef

	   commit => BOOLEAN
	     [0.94] commit the document	to the database.

           content_type => MIME
           file => FILENAME
             Either "file" or "string" must be used.

	   string => STRING|SCALAR
	     The document provided as normal text or a reference to raw	text.
	     You may also specify the "file" option with a filename.


	      my $r = $solr->extractDocument(file => 'design.pdf'
		, literal_id =>	'host');

	    -Option	 --Default
	     maxSegments   1
	     softCommit	   <false>
	     waitFlush	   <true>
	     waitSearcher  <true>

	   maxSegments => INTEGER
	     [since Solr 1.3]

	   softCommit => BOOLEAN
	     [since Solr 4.0]

	   waitFlush =>	BOOLEAN
	     [before Solr 1.4, removed from 4.0]

	   waitSearcher	=> BOOLEAN
	   [solr 1.4]

       Core management

       The CREATE, SWAP, ALIAS,	and RENAME actions are not yet supported,
       because they are	not very useful, it seems.

       $obj->coreReload( [$core] )
	   [0.94] Load a new core (on the server) from the configuration of
	   this	core. While the	new core is initializing, the existing one
	   will	continue to handle requests. When the new Solr core is ready,
	   it takes over and the old core is unloaded.

            -Option--Default
             core    <this core>

	   core	=> NAME


	     my	$result	= $solr->coreReload;
	     $result or	die $result->errors;

       $obj->coreStatus(%options)
           [0.94] Returns a HASH with information about this core.  There is
           no description about the exact structure and interpretation of this
           HASH.

            -Option--Default
             core    <this core>

	   core	=> NAME


	     my	$result	= $solr->coreStatus;
	     $result or	die $result->errors;

	     use Data::Dumper;
	     print Dumper $result->decoded->{status};

           Removes a core from Solr. Active requests will continue to be
           processed, but no new requests will be sent to the named core. If a
           core is registered under more than one name, only the given name is
           removed.

            -Option--Default
             core    <this core>

	   core	=> NAME

       Parameter pre-processing

       Many parameters are passed to the server.  The syntax of	the
       communication protocol is not optimal for the end-user: it is too
       verbose and depends on the Solr server version.

       General rules:

       o   you can group parameters by prefix

       o   use underscore as alternative to dots: less quoting needed

       o   boolean values in Perl will get translated into 'true' and 'false'

       o   when passed as an ARRAY (or LIST), the order of the parameters is
           preserved
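       The general rules above can be illustrated with a small pure-Perl
       sketch.  The helper flatten_params() and the list of boolean
       parameter names are assumptions made for this example; this is not
       the code Apache::Solr uses internally, and the real expansion also
       takes the server version into account.

```perl
use strict;
use warnings;

# Assumed subset of parameters which carry a boolean value.
my %is_boolean = map { ($_ => 1) } qw/facet.missing hl.requireFieldMatch/;

# Hypothetical helper, NOT part of Apache::Solr: turn grouped parameters
# into a flat list of dotted name/value pairs.
sub flatten_params {
    my ($prefix, $params) = @_;

    # A HASH has no reliable order, so its keys are sorted here; an ARRAY
    # of pairs is walked in the order given.
    my @pairs = ref $params eq 'HASH'
      ? (map { ($_ => $params->{$_}) } sort keys %$params)
      : @$params;

    my @flat;
    while(@pairs) {
        my ($key, $value) = splice @pairs, 0, 2;
        (my $name = $key) =~ tr/_/./;       # underscore as alternative to dot
        $name = "$prefix.$name" if length $prefix;

        if(ref $value eq 'HASH') {          # a group: recurse with the prefix
            push @flat, flatten_params($name, $value);
        }
        elsif(ref $value eq 'ARRAY') {      # several values for one name
            push @flat, map { ($name => $_) } @$value;
        }
        elsif($is_boolean{$name}) {         # Perl truth becomes 'true'/'false'
            push @flat, $name => ($value ? 'true' : 'false');
        }
        else {
            push @flat, $name => $value;
        }
    }
    return @flat;
}

my @flat = flatten_params('', [ q => 'inStock:true'
  , facet => { limit => -1, field => [qw/cat inStock/] }
  , facet_missing => 1, hl_fl => 'content' ]);

while(@flat) {
    my ($name, $value) = splice @flat, 0, 2;
    print "$name=$value\n";
}
```

       Note that this sketch replaces every underscore, so it would also
       rewrite field names which contain one.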

	   Produce a warning $message about deprecated parameters with the
	   indicated server version.

	   Used	by extractDocument().

           [0.93] If the key is "literal" or "literals", then the keys in the
           value HASH (or ARRAY of PAIRS) get 'literal.' prepended.
           "Literals" are fields you add yourself to the Solr Cell output.
           Unless "extractOnly", you need to specify the 'id' literal.

           [0.94] You can also use "fmap", "boost", and "resource" with a
           HASH (or ARRAY-of-PAIRS).  [0.97] The value in each PAIR may be a
           SCALAR (ref string) which circumvents some copying.


	     my	$result	= $solr->extractDocument(string	=> $document
		, resource_name	=> $fn,	extractOnly => 1
		, literals => {	id => 5, b => 'tic' }, literal_xyz => 42
		, fmap => { id => 'doc_id' }, fmap_subject => 'mysubject'
		, boost	=> { abc => 3.5	}, boost_xyz =>	2.0);

       $obj->expandSelect(PAIRS)
           The select() method accepts many, many parameters.  These are
	   passed to modules in	the server, which need configuration before
	   being usable.

	   Besides the common parameters, like 'q' (query) and 'rows', there
	   are parameters for various (pluggable) backends, usually prefixed
	   by the backend abbreviation.

	   o   facet ->

	   o   hl (highlight) ->

           o   mlt ->

	   o   stats ->

	   o   group ->

	   You may use WebService::Solr::Query to construct the	query ('q').


	     my	@r = $solr->expandSelect
	       ( q => 'inStock:true', rows => 10
	       , facet => {limit => -1,	field => [qw/cat inStock/], mincount =>	1}
	       , f_cat_facet =>	{missing => 1}
	       , hl    => {}
	       , mlt   => { fl => 'manu,cat', mindf => 1, mintf	=> 1 }
	       , stats => { field => [ 'price',	'popularity' ] }
               , group => { query => 'price:[0 TO 99.99]', limit => 3 }
               );

	     # becomes (one line)

       $obj->expandTerms(PAIRS|ARRAY)
           Used by queryTerms() only.


	     my	@t = $solr->expandTerms('terms.lower.incl' => 'true');
	     my	@t = $solr->expandTerms([lower_incl => 1]);   #	same

	     my	$r = $self->queryTerms(fl => 'subject',	limit => 100);

           Produce a warning $message about parameters which will get ignored
           because they were not yet supported by the indicated server
           version.

           Produce a warning $message about parameters which will not be
           passed on, because they were removed from the indicated server
           version.

       Other helpers

       $obj->endpoint($action, %options)
	   Compute the address to be called (for HTTP)

            -Option--Default
             core    new(core)
             params  []

	   core	=> NAME
	     If	no core	is specified, the default of the server	is addressed.

	   params => HASH|ARRAY-of-pairs
             The order of the parameters will be preserved when an ARRAY of
             parameters is passed; you never know for a HASH.
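       The difference between the two forms of "params" can be shown with a
       pure-Perl sketch.  The helpers escape() and query_string() are
       hypothetical and not part of Apache::Solr; the percent-encoding is
       simplified and assumes plain ASCII input.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except unreserved
# characters (ASCII input assumed).
sub escape {
    my $text = shift;
    $text =~ s/([^A-Za-z0-9._~-])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Hypothetical helper: join name/value pairs into a query string.
sub query_string {
    my $params = shift;
    my @pairs  = ref $params eq 'HASH' ? %$params : @$params;
    my @parts;
    while(@pairs) {
        my ($name, $value) = splice @pairs, 0, 2;
        push @parts, escape($name) . '=' . escape($value);
    }
    return join '&', @parts;
}

# ARRAY of pairs: the order is kept, and a name may appear more than once.
print query_string([ q => 'author:mark', fq => 'year:2019', fq => 'lang:en' ]), "\n";

# HASH: each name appears once, and the order may differ between runs.
print query_string({ q => 'author:mark', rows => 10 }), "\n";
```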

   Comparison with other implementations
       Compared	to WebService::Solr

       WebService::Solr is a good module, with a lot of miles.  The main
       difference is that "Apache::Solr" has much more abstraction.

       o   simplified parameter syntax, improving readability

       o   real	Perl-level boolean parameters, not 'true' and 'false'

       o   warnings for	deprecated and ignored parameters

       o   smart result	object with built-in trace and timing

       o   hidden paging of results

       o   flexible logging framework (Log::Report)

       o   both-way XML or both-way JSON, not requests in XML and answers in
           JSON

       o   access to plugins like terms and Tika

       o   no Moose

       This module is part of Apache-Solr distribution version 1.05, built on
       January 11, 2019. Website:

       Copyrights 2012-2019 by [Mark Overmeer].	For other contributors see

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.  See


perl v5.32.1			  2019-01-11		       Apache::Solr(3)

