Image::Grab(3)	      User Contributed Perl Documentation	Image::Grab(3)

NAME
       Image::Grab - Perl extension for	Grabbing images	off the	Internet.

SYNOPSIS
	 # If you call grab without instantiating an Image::Grab, then you
	 # can pass grab args and it will instantiate one for you and return
	 # whatever the	image is.
	 use Image::Grab qw(grab);
	 # $image should contain GIF data after	this.
	 $image	= grab(URL=>'http://www.example.com/test.gif');

	 use Image::Grab;
	 $pic = Image::Grab->new;

	 # You can also	pass new arguments:
	 use Image::Grab;
	 $pic =	Image::Grab->new(SEARCH_URL=>'http://www.example.com/',
				 REGEXP	   =>'.*\.gif');

	 # The simplest	OO case	of a grab
	 use Image::Grab;
	 $pic->url('http://www.example.com/someimage.jpg');
	 $pic->grab;

	 # Now to save the image to disk
	 open(IMAGE, ">image.jpg") || die"image.jpg: $!";
	 binmode IMAGE;	 # for MSDOS derivations.
	 print IMAGE $pic->image;
	 close IMAGE;

	 # A slightly more complicated case
	 use Image::Grab;
	 $pic->regexp('.*logo.*\.gif');
	 $pic->search_url('http://www.example.com');
	 $pic->grab;

	 # Get a weather forecast
	 use Image::Grab;
	 $pic->regexp('msy.*\.gif');
	 $pic->search_url('http://www.example.com/weather/msy/content.shtml');
	 $pic->grab;

DESCRIPTION
       Image::Grab is a	simple way to get images with URLs that	are either not
       predictable or are "hidden" by some method.

RATIONALE
       I created this module so	that I would have a uniform API	for grabbing
       multiple	images from multiple sites that	use various methods of making
       their images difficult to retrieve automatically.

       I've tried to put into code all the ways	that website creators will use
       to try to "protect" their images.  If you know of any methods I've
       missed, please email me.

       This module was born from a script.  The	script was born	when a certain
       Comics Syndicate	stopped	having a static	(or even predictable) url for
       their comics.  I	generalized the	code for a friend when he needed to do
       something similar.

       Hopefully, others will find this	module useful as well.

Retrieval Methods and Properties
       The following are the retrieval methods and properties available	for
       any Image::Grab object.

       One of the following should be set to specify the image.  If either
       regexp or index is used to specify the image, then search_url must be
       set to specify the page to be searched for the image.

       Image::Grab will	use the	data in	the following order: url, regexp,
       index.

       refer, regexp, search_url and url all have POSIX	time string expansion
       performed on them by the	expand_url method when do_posix	is set.	 Thus,
       if you wish to have a '%' character in your URL,	you must put '%%'.
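
       The expansion rule above can be illustrated with the core POSIX
       module.  This is only a sketch, not the module's own code: the helper
       name expand_template is hypothetical, and it uses gmtime with a fixed
       time so that the result is reproducible, whereas the expansion
       described here uses the current time.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not part of Image::Grab): expand strftime(3)
# conversions in a URL template for a given epoch time.
sub expand_template {
    my ($template, $epoch) = @_;
    return strftime($template, gmtime($epoch));
}

# %Y, %m and %d become year, month and day; %% yields a literal '%'.
my $url = expand_template('http://www.example.com/%Y/%m/%d.gif?q=100%%', 0);
print "$url\n";
```

       With an epoch of 0 the date parts expand to 1970/01/01, and the
       doubled '%%' comes out as a single '%'.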

   url
       The fully qualified URL of the image.  This method is included simply
       for completeness	and convenience.  If this is all you need, you might
       check out LWP::Simple.  (Although, the date expansion is	nice...)

       POSIX time string expansion is performed	if do_posix is set.

       Example:

	 $url =	$image->url("http://www.example.com/%Y/%m/%d.gif");

   search_url
       If the regexp and/or index methods are used to specify an image, then
       the URL in the search_url field will be used to find the image.  For
       example, if regexp is "mac.*\.gif" and search_url is
       "http://www.example.com", then when a grab is performed, the page at
       www.example.com is searched to see if any images on the page match the
       regular expression "mac.*\.gif".

       Also, when Image::Grab finally grabs the	image, it uses the search_url
       as the referer field.

       POSIX time string expansion is performed	if do_posix is set.

       Example:

	 $image->search_url("http://www.example.com/weather_maps.html");

   index
       An integer indicating the image on the page to grab.  For instance, '1'
       would find the second image on the page pointed to by search_url.  Used
       in conjunction with regexp, it specifies which of the images matching
       the regular expression to grab.

       Example:

	 $image->search_url("http://www.example.com/index.html");
	 $image->regexp('.*\.gif');
	 $image->index(1);

   regexp
       A regular expression that will match the	URL of the image.  If index is
       not set,	then the first image that matches will be used.	 If index is
       set, then the nth image (base 0)	that matches will be used.

       Set search_url to the web page that you want to search with this
       regular expression.

       POSIX time string expansion is performed	if do_posix is set.

       Example:

	 $image->search_url("http://www.example.com/index.html");
	 $image->regexp('.*\.gif');
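
       How regexp and index combine can be sketched in plain Perl.  The
       helper pick_image and the URL list below are hypothetical and only
       illustrate the "nth match, base 0" rule; they are not how Image::Grab
       is implemented, and the unanchored match is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $index-th URL (base 0) among those
# matching $regexp, or undef when there is no such match.
sub pick_image {
    my ($urls, $regexp, $index) = @_;
    $index = 0 unless defined $index;    # no index: use the first match
    my @matches = grep { /$regexp/ } @$urls;
    return $matches[$index];
}

# Stand-in for the image URLs found on the page at search_url.
my @urls = qw(
    http://www.example.com/banner.jpg
    http://www.example.com/logo.gif
    http://www.example.com/map.gif
);

my $first  = pick_image(\@urls, '.*\.gif');       # first matching image
my $second = pick_image(\@urls, '.*\.gif', 1);    # second matching image
print "$first\n$second\n";
```

       Note the single quotes around the pattern: inside double quotes Perl
       would drop the backslash and '\.' would match any character.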

   grab	([$tries])
       Grab the image and return it (the same data as $image->image).

       If the url method is not used to give an absolute URL, then expand_url
       is called before the image is fetched.

       If $tries is specified, then up to $tries attempts are made before
       giving up.  $tries defaults to 10.

       Example:

	 $image->url("http://www.example.com/comic_strip/%Y/%m/%d.gif");
	 $pic =	$image->grab;

   grab_new ([$tries])
       If neither date nor md5 is set, then this method acts identically to
       grab.

       If md5 is set, then the grab is performed only if the checksum of the
       newer image is different from the current checksum.

       If date is set, then the grab is performed only if the image has been
       modified since date.

       If both date and md5 are set, then the conditions are ANDed.  That is,
       the image is returned only if it has been modified since date and its
       checksum is different from md5.
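
       The rules above amount to a small decision function.  The sketch
       below is an illustration only: is_new_image is a hypothetical helper,
       and it assumes the stored checksum is a hex digest as produced by
       Digest::MD5, which this page does not specify.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: decide whether freshly fetched image data counts
# as "new".  $modified is the image's modification time in epoch seconds.
#   $opt{md5}  - checksum of the image already held (optional)
#   $opt{date} - epoch seconds of the image already held (optional)
sub is_new_image {
    my ($data, $modified, %opt) = @_;
    return 0 if defined $opt{md5}  && md5_hex($data) eq $opt{md5};
    return 0 if defined $opt{date} && $modified <= $opt{date};
    return 1;    # every condition that was set is satisfied
}

my $old = md5_hex('old image bytes');
print is_new_image('new image bytes', 200, md5 => $old, date => 100)
    ? "new\n" : "unchanged\n";
```

       Because the conditions are ANDed, failing either test is enough to
       reject the image, and with neither option set every image is accepted.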

Image Properties
       These are various properties of the image.  Generally, you don't want
       to set these after you've grabbed an image.

   image
       Returns the actual image.

   date
       The date	that the image was last	updated.  The date is represented in
       the number of seconds from epoch.

       If this is set when grab_new is called, then an image will only be
       returned	if the date of the image is newer than the date	set in this
       field.  (See grab_new for full details.)

	 $image->date(time - 30);

	 # Grab the image only if it has changed in the past 30 seconds.
	 $pic =	$image->grab_new;
	 $date = $image->date;
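
       Since the date is expressed in seconds from the epoch, a calendar
       date has to be converted before it is passed in.  A minimal sketch
       using the core Time::Local module follows; the helper name
       epoch_for_date is hypothetical, and midnight UTC is an arbitrary
       choice for the example.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: midnight UTC on a calendar date, as epoch seconds.
sub epoch_for_date {
    my ($year, $month, $day) = @_;
    # timegm expects a 0-based month (0 = January).
    return timegm(0, 0, 0, $day, $month - 1, $year);
}

my $since = epoch_for_date(2004, 10, 27);
print "$since\n";
```

       The resulting number is what would be handed to $image->date before
       calling grab_new.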

   md5
       The md5 sum for the image.

       If this is set when grab_new is called, then an image will only be
       returned	if the md5 checksums don't match. (See grab_new	for full
       details.)

       This will only be used if the MD5 module	is available.  Otherwise,
       there will be no	effect.

   type
       The Content-Type	of information returned.  Usually it will be a MIME
       type such as "image/jpeg".

Other Properties
       These are miscellaneous properties.  do_posix and cookiefile are	the
       only ones you should need to use.

   do_posix
       Tells Image::Grab to do POSIX date substitution.  This is on by
       default in Perl 5.005 and later.

       In earlier versions of Perl it is off by default in order to avoid
       buggy behavior on long URLs (see BUGS).  If you have an earlier
       version of Perl and wish to use the expansion, then turn it on:

	 $image->do_posix(1);

   cookiefile
       Where the cookiefile is located.	 Set this to the file containing the
       cookies if you wish to use the cookie file for the image.

       For example, I use this to authenticate on sites	that require cookie
       authentication.	To do this, first load the cookie file by visiting the
       site with Netscape and getting a	cookie.	 Next, set the cookie file
       like this:

	 $image->cookiefile($ENV{HOME} . "/.netscape/cookies");

       Image::Grab will	automatically send the correct cookie when the remote
       server asks for it.

       The cookiefile is assumed to be in Netscape Navigator's format.

   cookiejar
       Usually only used internally.  Contains an HTTP::Cookies::Netscape
       blessed reference.

   ua
       This contains an Image::Grab::RequestAgent blessed reference.
       Image::Grab::RequestAgent is a subclass of LWP::UserAgent and inherits
       all its methods.

   refer
       When you	do a grab, this	url will be given as the referring URL.

       POSIX time string expansion is performed	if do_posix is set.

Other Methods
   auth($user, $password)
       Provides	a username/password pair for grabbing the image.

   getAllURLs ([$tries])
       Returns a list of URLs pointing to images from the page pointed to by
       search_url.  Of course, search_url must be set for this method to be of
       any use.

       If $tries is specified, then up to $tries attempts are made before
       giving up.  $tries defaults to 10.

       Returns undef if	no connection is made in $tries	attempts.
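
       The $tries behavior described here (retry up to a limit, then return
       undef) can be sketched as a generic loop.  The helper with_tries and
       the stand-in fetch routine are hypothetical; this is not the module's
       own retry code.

```perl
use strict;
use warnings;

# Hypothetical helper: call $fetch up to $tries times and return its
# first defined result, or undef if every attempt fails.
sub with_tries {
    my ($fetch, $tries) = @_;
    $tries = 10 unless defined $tries;    # same default as documented
    for my $attempt (1 .. $tries) {
        my $result = $fetch->($attempt);
        return $result if defined $result;
    }
    return;    # no attempt succeeded
}

# Stand-in for a network request that only succeeds on the third attempt.
my $page = with_tries(sub { $_[0] >= 3 ? 'page body' : undef }, 5);
print defined $page ? "got: $page\n" : "gave up\n";
```

       Callers should therefore test the return value with defined before
       using it.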

   expand_url ([$tries])
   getRealURL ([$tries])
       Returns the actual URL of the image specified.  Performs	POSIX time
       string expansion	(see strftime(3)) using	the current time if do_posix
       is set.

       You can use this	method to get the URL for an image if that is all you
       need.

       If $tries is specified, then up to $tries attempts are made before
       giving up.  $tries defaults to 10.

       Returns undef if	no connection is made in $tries	attempts, if the
       search_url URL is not of	type text/html,	or if no image that matches
       the specs is found.

       If url is given a full URL, then	it is returned with POSIX time string
       expansion performed if do_posix is set.

       The getRealURL method is	deprecated.

       Example:

	 $image->regexp('msy.*\.gif');
	 $image->search_url('http://www.example.com/weather/msy/content.shtml');
	 $url =	$image->expand_url;

	 # Grab	the image using	LWP::Simple.
	 use LWP::Simple;
	 $pic =	get($url);

   loadCookieJar
       Usually used only internally.  Loads up the cookiejar with cookies.

BUGS
       getAllURLs and expand_url should	really be fixed	so that	they go	out to
       the 'net	only once if they need to.

       POSIX date substitution screws up strings longer	than 127 chars.	 At
       least on	Perl 5.004_04 -- Perl 5.005_03 seems to	behave properly.

       Ummm... I am sure there are others...

LICENSE
       Same as Perl.

AUTHOR
       Mark A. Hershberger (mah@everybody.org),	http://everybody.org/mah/

SEE ALSO
       HTTP::Request, HTTP::Cookies, HTML::TreeBuilder,	LWP::UserAgent,
       Digest::MD5, URI::URL, strftime(3).

perl v5.32.0			  2004-10-27			Image::Grab(3)
