Bio::DB::NCBIHelper(3)  User Contributed Perl Documentation  Bio::DB::NCBIHelper(3)

       Bio::DB::NCBIHelper - A collection of routines useful for queries to
       NCBI databases.

	# Do not use this module directly.

	# get a Bio::DB::NCBIHelper object somehow
	my $seqio = $db->get_Stream_by_acc(['J00522']);
	while ( my $seq = $seqio->next_seq ) {
	    # process seq
	}

       Provides a single place to set up some common methods for querying NCBI
       web databases.  This module just centralizes the methods for
       constructing a URL for querying NCBI GenBank and NCBI GenPept and the
       common HTML stripping done in postprocess_data().

       The base	NCBI query URL used is:

   Mailing Lists
       User feedback is an integral part of the evolution of this and other
       Bioperl modules. Send your comments and suggestions preferably to one
       of the Bioperl mailing lists. Your participation is much appreciated.

				- General discussion
				- About the mailing lists

       Please direct usage questions or	support	issues to the mailing list:

       rather than to the module maintainer directly. Many experienced and
       responsive experts will be able to look at the problem and quickly
       address it. Please include a thorough description of the problem with
       code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track
       of the bugs and their resolution.  Bug reports can be submitted via the

AUTHOR - Jason Stajich

       The rest	of the documentation details each of the object	methods.
       Internal	methods	are usually preceded with a _

	Title	: new
	Usage	:
	Function: the new way to make modules a	little more lightweight
	Returns	:
	Args	:

	Title	: get_params
	Usage	: my %params = $self->get_params($mode)
	Function: returns key,value pairs to be	passed to NCBI database
		  for either 'batch' or	'single' sequence retrieval method
	Returns	: a key,value pair hash
	Args	: 'single' or 'batch' mode for retrieval

	Title	: default_format
	Usage	: my $format = $self->default_format
	Function: returns default sequence format for this module
	Returns	: string
	Args	: none

	Title	: get_request
	Usage	: my $request = $self->get_request
	Function: builds the HTTP::Request used to query the web database
	Returns	: an HTTP::Request object
	Args	: %qualifiers =	a hash of qualifiers (ids, format, etc)

	Title	: get_seq_stream
	Usage	: my $seqio = $self->get_seq_stream(%qualifiers)
	Function: builds a url and queries a web db
	Returns	: a Bio::SeqIO stream capable of producing sequence
	Args	: %qualifiers =	a hash qualifiers that the implementing	class
		  will process to make a url suitable for web querying

	 Title	 : get_Stream_by_batch
	 Usage	 : $seq	= $db->get_Stream_by_batch($ref);
	 Function: Retrieves Seq objects from Entrez 'en masse', rather	than one
		   at a time.  For large numbers of sequences, this is far superior
		   to get_Stream_by_id or get_Stream_by_acc.
	 Example :
	 Returns : a Bio::SeqIO	stream object
	 Args	 : $ref	: either an array reference, a filename, or a filehandle
		   from	which to get the list of unique	ids/accession numbers.

		   NOTE: deprecated API.  Use get_Stream_by_id() instead.
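
       The three argument forms described above (array reference, filename,
       filehandle) can be reduced to one list of unique ids before a request
       is built. The following is a minimal sketch, not the module's own
       code: ids_from_source() is a hypothetical helper, and the
       one-id-per-line file layout is an assumption.

```perl
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# Hypothetical helper (not part of Bio::DB::NCBIHelper): accept an array
# reference, an open filehandle, or a filename, and return the unique
# ids in first-seen order. Files are assumed to hold one id per line.
sub ids_from_source {
    my ($src) = @_;
    my @ids;
    if ( ref $src eq 'ARRAY' ) {
        @ids = @$src;
    }
    else {
        # openhandle() returns the handle if $src is an open filehandle,
        # and undef for a plain string such as a filename.
        my $fh = openhandle($src);
        unless ($fh) {
            open $fh, '<', $src or die "cannot open '$src': $!\n";
        }
        while ( my $line = <$fh> ) {
            $line =~ s/^\s+|\s+$//g;    # trim whitespace and newline
            push @ids, $line if length $line;
        }
    }
    my %seen;
    return grep { !$seen{$_}++ } @ids;
}
```

       The resulting list can then be passed as an array reference to
       get_Stream_by_id(), which the note above recommends over this
       deprecated method.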

	 Title	 : get_Stream_by_query
	 Usage	 : $seq	= $db->get_Stream_by_query($query);
	 Function: Retrieves Seq objects from Entrez 'en masse', rather	than one
		   at a	time.  For large numbers of sequences, this is far superior
		   to get_Stream_by_id and get_Stream_by_acc.
	 Example :
	 Returns : a Bio::SeqIO	stream object
	 Args	 : An Entrez query string or a Bio::DB::Query::GenBank object.
		   It is suggested that	you create a Bio::DB::Query::GenBank object and	get
		   the entry count before you fetch a potentially large	stream.
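
       The suggestion above (check the entry count before fetching) might
       look like the following sketch. It assumes BioPerl is installed and
       NCBI is reachable when the function is called; fetch_ids_by_query()
       and its $max_entries cap are hypothetical names introduced only for
       illustration.

```perl
use strict;
use warnings;

# Sketch only: needs BioPerl and network access to NCBI when called.
# fetch_ids_by_query() and $max_entries are illustrative, not part of
# the module's API.
sub fetch_ids_by_query {
    my ( $query_string, $max_entries ) = @_;
    require Bio::DB::GenBank;
    require Bio::DB::Query::GenBank;

    my $query = Bio::DB::Query::GenBank->new(
        -db    => 'nucleotide',
        -query => $query_string,
    );

    # Ask for the entry count first, and refuse an unexpectedly large stream.
    my $count = $query->count;
    die "query matches $count entries, more than $max_entries\n"
        if $count > $max_entries;

    my $gb    = Bio::DB::GenBank->new;
    my $seqio = $gb->get_Stream_by_query($query);

    my @display_ids;
    while ( my $seq = $seqio->next_seq ) {
        push @display_ids, $seq->display_id;
    }
    return @display_ids;
}
```

       Loading the BioPerl classes with require inside the function keeps
       the file compilable on machines without BioPerl; a script that
       always needs them would normally place use statements at the top.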

	Title	: postprocess_data
	Usage	: $self->postprocess_data( 'type'     => 'string',
					    'location' => \$datastr );
	Function: Process downloaded data before loading into a Bio::SeqIO. This
		  works for GenBank and GenPept; other classes should override
		  it with their own method.
	Returns	: void
	Args	: hash with two	keys:

		  'type' can be	'string' or 'file'
		  'location' either file location or string reference containing data

	Title	: request_format
	Usage	: my ($req_format, $ioformat) =	$self->request_format;
	Function: Get/Set sequence format retrieval. The get-form will normally	not
		  be used outside of this and derived modules.
	Returns	: Array	of two strings,	the first representing the format for
		  retrieval, and the second specifying the corresponding SeqIO format.
	Args	: $format = sequence format

	Title	: redirect_refseq
	Usage	: $db->redirect_refseq(1)
	Function: simple getter/setter which redirects RefSeqs to use Bio::DB::RefSeq
	Returns	: Boolean value
	Args	: Boolean value	(optional)
	Throws	: 'unparseable output exception'
	Note	: This replaces	'no_redirect' as a more	straightforward	flag to
		  redirect possible RefSeqs to use Bio::DB::RefSeq (EBI	interface)
		  instead of retrieving	the NCBI records

	Title	: complexity
	Usage	: $db->complexity(3)
	Function: get/set complexity value
	Returns	: value	from 0-4 indicating level of complexity
	Args	: value	from 0-4 (optional); if	unset server assumes 1
	Throws	: if arg is not	an integer or falls outside of noted range above
	Note	: From efetch docs, the	complexity regulates the display:

		  0 - get the whole blob
		  1 - get the bioseq for gi of interest	(default in Entrez)
		  2 - get the minimal bioseq-set containing the	gi of interest
		  3 - get the minimal nuc-prot containing the gi of interest
		  4 - get the minimal pub-set containing the gi	of interest
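
       The documented contract (an integer from 0 to 4, otherwise an
       exception) can be sketched in plain Perl as below. check_complexity()
       and describe_complexity() are hypothetical helpers, not methods of
       this module; the descriptions are copied from the list above.

```perl
use strict;
use warnings;

# Illustrative table of the efetch complexity levels listed above.
my %COMPLEXITY = (
    0 => 'whole blob',
    1 => 'bioseq for gi of interest',
    2 => 'minimal bioseq-set containing the gi of interest',
    3 => 'minimal nuc-prot containing the gi of interest',
    4 => 'minimal pub-set containing the gi of interest',
);

# Hypothetical check mirroring the documented contract: die unless the
# argument is an integer from 0 to 4, otherwise return it unchanged.
sub check_complexity {
    my ($value) = @_;
    die "complexity must be an integer from 0 to 4\n"
        unless defined $value && $value =~ /\A[0-4]\z/;
    return $value;
}

# Return the human-readable meaning of a valid complexity level.
sub describe_complexity {
    my ($value) = @_;
    return $COMPLEXITY{ check_complexity($value) };
}
```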

	Title	: strand
	Usage	: $db->strand(1)
	Function: get/set strand value
	Returns	: strand value if set
	Args	: value	of 1 (plus) or 2 (minus); if unset server assumes 1
	Throws	: if arg is not	an integer or is not 1 or 2
	Note	: This differs from BioPerl's use of strand: 1 = plus, -1 = minus,
		  0 = not relevant. We should probably add in some functionality to
		  convert over in the future.
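
       Until such a conversion exists, a caller can map between the two
       conventions by hand. A minimal sketch follows; bioperl_to_ncbi_strand()
       is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::DB::NCBIHelper): map a BioPerl
# strand (1 = plus, -1 = minus, 0 = not relevant) to the value that
# strand() accepts (1 = plus, 2 = minus). Returns undef when there is
# no NCBI equivalent, so the caller can leave strand unset.
sub bioperl_to_ncbi_strand {
    my ($strand) = @_;
    return unless defined $strand;
    return 1 if $strand == 1;
    return 2 if $strand == -1;
    return;
}

# Usage sketch, assuming $db is an object that provides strand():
#   my $ncbi = bioperl_to_ncbi_strand($feature_strand);
#   $db->strand($ncbi) if defined $ncbi;
```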

	Title	: seq_start
	Usage	: $db->seq_start(123)
	Function: get/set sequence start location
	Returns	: sequence start value if set
	Args	: integer; if unset server assumes 1
	Throws	: if arg is not	an integer

	Title	: seq_stop
	Usage	: $db->seq_stop(456)
	Function: get/set sequence stop	(end) location
	Returns	: sequence stop	(end) value if set
	Args	: integer; if unset server assumes 1
	Throws	: if arg is not	an integer

	Title	: email
	Usage	: $db->email('')
	Function: get/set email	value
	Returns	: email	(string)  or undef
	Args	: string with a valid email address; note we do not validate this
	Throws	: if arg is not	an integer or falls outside of noted range above
	Note	: This is required if you wish to send multiple requests faster
		  than one every 4 seconds.

   Bio::DB::WebDBSeqI methods
       Overriding WebDBSeqI method to help newbies to retrieve sequences

	 Title	 : get_Stream_by_acc
	 Usage	 : $seq	= $db->get_Stream_by_acc([$acc1, $acc2]);
	 Function: gets	a series of Seq	objects	by accession numbers
	 Returns : a Bio::SeqIO	stream object
	 Args	 : $ref	: a reference to an array of accession numbers for
			  the desired sequence entries
	 Note	 : For GenBank,	this just calls	the same code for get_Stream_by_id()

	 Title	 : delay_policy
	 Usage	 : $secs = $self->delay_policy
	 Function: NCBI	requests a delay of 4 seconds between requests unless email is
		   provided. This method implements a 4	second delay; use 'delay()' to
		   override, though understand if no email is provided we are not
		   responsible for users being IP-blocked by NCBI
	 Returns : number of seconds to	delay
	 Args	 : none
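
	 The throttling described above can be pictured as a small calculation
	 on the time of the previous request. This is a sketch of the idea
	 only, not the module's implementation; seconds_to_wait() is a
	 hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: how many seconds to sleep before the next request,
# given the time of the previous request, the current time, and the
# delay in seconds reported by a method such as delay_policy().
sub seconds_to_wait {
    my ( $last_request, $now, $delay ) = @_;
    return 0 unless defined $last_request;    # first request: no wait
    my $remaining = $delay - ( $now - $last_request );
    return $remaining > 0 ? $remaining : 0;
}

# Usage sketch, assuming $db provides delay_policy():
#   my $wait = seconds_to_wait( $last, time, $db->delay_policy );
#   sleep $wait if $wait;
#   $last = time;
```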

	Title	: cookie
	Usage	: ($cookie,$querynum) =	$db->cookie
	Function: return the NCBI query	cookie,	this information is used by
		  Bio::DB::GenBank in conjunction with efetch, ripped from
	Returns	: list of (cookie,querynum)
	Args	: none

	Title	: _parse_response
	Usage	: $db->_parse_response($content)
	Function: parse	out response for cookie, this is a trimmed-down	version
		  of _parse_response from Bio::DB::Query::GenBank
	Returns	: empty
	Args	: $content = the response content to parse
	Throws	: 'unparseable output exception'

	Title	: no_redirect
	Usage	: $db->no_redirect($content)
	Function: DEPRECATED - Used to indicate	that Bio::DB::GenBank instance retrieves
		  possible RefSeqs from	EBI instead; default behavior is now to
		  retrieve directly from NCBI
	Returns	: None
	Args	: None
	Throws	: Method is deprecated in favor	of positive flag method	'redirect_refseq'

perl v5.32.1			  2021-06-30		Bio::DB::NCBIHelper(3)

