LW2(3)		      User Contributed Perl Documentation		LW2(3)

NAME
       LW2 - Perl HTTP library version 2.5

SYNOPSIS
       use LW2;

       require 'LW2.pm';

DESCRIPTION
       Libwhisker is a Perl library useful for HTTP testing scripts.  It
       contains	a pure-Perl reimplementation of	functionality found in the
       "LWP", "URI", "Digest::MD5", "Digest::MD4", "Data::Dumper",
       "Authen::NTLM", "HTML::Parser", "HTML::FormParser", "CGI::Upload",
       "MIME::Base64", and "Getopt::Std" modules.

       Libwhisker is designed to be portable (a	single perl file), fast
       (general	benchmarks show	libwhisker is faster than LWP),	and flexible
       (great care was taken to	ensure the library does	exactly	what you want
       to do, even if it means breaking	the protocol).

FUNCTIONS
       The following are the functions contained in Libwhisker:

       auth_brute_force
	   Params: $auth_method, \%req,	$user, \@passwords [, $domain,
	   $fail_code ]

	   Return: $first_valid_password, undef	if error/none found

           Perform an HTTP authentication brute force against a server (host
	   and URI defined in %req).  It will try every	password in the
	   password array for the given	user.  The first password (in
	   conjunction with the	given user) that doesn't return	HTTP 401 is
	   returned (and the brute force is stopped at that point).  You
	   should retry	the request with the given password and	double-check
	   that	you got	a useful HTTP return code that indicates successful
	   authentication (200,	302), and not something	a bit more abnormal
	   (407, 500, etc).  $domain is	optional, and is only used for NTLM
	   auth.

	   Note: set up	any proxy settings and proxy auth in %req before
	   calling this	function.

	   You can brute-force proxy authentication by setting up the target
	   proxy as proxy_host and proxy_port in %req, using an	arbitrary host
	   and uri (preferably one that	is reachable upon successful proxy
	   authorization), and setting the $fail_code to 407.  The
	   $auth_method	passed to this function	should be a proxy-based	one
	   ('proxy-basic', 'proxy-ntlm', etc).

           If your server returns something other than 401 upon auth failure,
	   then	set $fail_code to whatever is returned (and it needs to	be
	   something *different* than what is received on auth success,	or
	   this	function won't be able to tell the difference).

       auth_unset
	   Params: \%req

	   Return: nothing (modifies %req)

           Modifies %req to disable all authentication (regular and proxy).

           Note: this removes the values set by auth_set(), and it also
           deletes any manually defined [Proxy-]Authorization headers (but
           you shouldn't be using the auth_* functions if you're manually
           handling your own auth...)

       auth_set
	   Params: $auth_method, \%req,	$user, $password [, $domain]

	   Return: nothing (modifies %req)

           Modifies %req to use the indicated authentication info.

	   Auth_method can be: 'basic',	'proxy-basic', 'ntlm', 'proxy-ntlm'.

	   Note: this function may not necessarily set any headers after being
	   called.  Also, proxy-ntlm with SSL is not currently supported.

       cookie_new_jar
	   Params: none

	   Return: $jar

	   Create a new	cookie jar, for	use with the other functions.  Even
	   though the jar is technically just a	hash, you should still use
	   this	function in order to be	future-compatible (should the jar
	   format change).

       cookie_read
	   Params: $jar, \%response [, \%request, $reject ]

	   Return: $num_of_cookies_read

           Read in cookies from a %response hash, and put them in $jar.

	   Notice: cookie_read uses internal magic done	by http_do_request in
	   order to read cookies regardless of 'Set-Cookie[2]' header
	   appearance.

	   If the optional %request hash is supplied, then it will be used to
	   calculate default host and path values, in case the cookie doesn't
	   specify them	explicitly.  If	$reject	is set to 1, then the %request
	   hash	values are used	to calculate and reject	cookies	which are not
	   appropriate for the path and	domains	of the given request.

       cookie_parse
	   Params: $jar, $cookie [, $default_domain, $default_path, $reject ]

	   Return: nothing

	   Parses the cookie into the various parts and	then sets the
	   appropriate values in the cookie $jar. If the cookie	value is
	   blank, it will delete it from the $jar.  See	the 'docs/cookies.txt'
	   document for	a full explanation of how Libwhisker parses cookies
	   and what RFC	aspects	are supported.

	   The optional	$default_domain	value is taken literally.  Values with
	   no leading dot (e.g.	'www.host.com')	are considered to be strict
	   hostnames and will only match the identical hostname.  Values with
	   leading dots	(e.g.  '.host.com') are	treated	as sub-domain matches
	   for a single	domain level.  If the cookie does not indicate a
	   domain, and a $default_domain is not	provided, then the cookie is
	   considered to match all domains/hosts.

	   The optional	$default_path is used when the cookie does not specify
	   a path.  $default_path must be absolute (start with '/'), or	it
	   will	be ignored.  If	the cookie does	not specify a path, and
	   $default_path is not	provided, then the default value '/' will be
	   used.

	   Set $reject to 1 if you wish	to reject cookies based	upon the
	   provided $default_domain and	$default_path.	Note that
	   $default_domain and $default_path must be specified for $reject to
	   actually do something meaningful.

       cookie_write
	   Params: $jar, \%request, $override

	   Return: nothing

           Goes through the given $jar and sets the Cookie header in %request
           for cookies whose domain and path match.  If $override is true, then
	   the secure, domain and path restrictions of the cookies are ignored
	   and all cookies are essentially included.

           Notice: cookie expiration is currently not implemented.  URL
           restriction comparison is also case-insensitive.

       cookie_get
	   Params: $jar, $name

	   Return: @elements

	   Fetch the named cookie from the $jar, and return the	components.
	   The returned	items will be an array in the following	order:

	   value, domain, path,	expire,	secure

           value  = cookie value; should always be a non-empty string
           domain = domain root for the cookie; can be undefined
           path   = URL path for the cookie; should always be a non-empty
                    string
           expire = undefined (deprecated, but exists for
                    backwards-compatibility)
           secure = whether or not the cookie is limited to HTTPS; value is
                    0 or 1
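
           As a minimal sketch of how the jar functions fit together, the
           following stores one cookie and reads it back using only the
           calls and argument orders listed on this page.  The helper name
           cookie_roundtrip, the cookie name, and the example domain are
           made up for illustration, and the demonstration is skipped when
           LW2.pm cannot be loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: store one cookie and return its five components
# in the order documented for cookie_get().
sub cookie_roundtrip {
    my $jar = LW2::cookie_new_jar();

    # $expire is documented as ignored, so undef is passed in its place.
    LW2::cookie_set($jar, 'session', 'abc123', '.example.com', '/', undef, 0);

    my ($value, $domain, $path, $expire, $secure) =
        LW2::cookie_get($jar, 'session');

    return (
        value  => $value,
        domain => $domain,
        path   => $path,
        expire => $expire,
        secure => $secure,
    );
}

if (eval { require LW2; 1 }) {
    my %cookie = cookie_roundtrip();
    for my $field (qw(value domain path expire secure)) {
        my $shown = defined $cookie{$field} ? $cookie{$field} : '(undef)';
        printf "%-6s = %s\n", $field, $shown;
    }
}
else {
    print "LW2.pm not found; skipping the demonstration\n";
}
```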

       cookie_get_names
	   Params: $jar

	   Return: @names

           Fetch all the cookie names from the jar, which then lets you
           cookie_get() them individually.

       cookie_get_valid_names
	   Params: $jar, $domain, $url,	$ssl

	   Return: @names

	   Fetch all the cookie	names from the jar which are valid for the
	   given $domain, $url,	and $ssl values.  $domain should be string
	   scalar of the target	host domain ('www.example.com',	etc.).	$url
	   should be the absolute URL for the page ('/index.html',
	   '/cgi-bin/foo.cgi', etc.).  $ssl should be 0	for non-secure
	   cookies, or 1 for all (secure and normal) cookies.  The return
	   value is an array of	names compatible with cookie_get().

       cookie_set
	   Params: $jar, $name,	$value,	$domain, $path,	$expire, $secure

	   Return: nothing

           Set the named cookie with the provided values into the $jar.  $name
	   is required to be a non-empty string.  $value is required, and will
	   delete the named cookie from	the $jar if it is an empty string.
	   $domain and $path can be strings or undefined.  $expire is ignored
	   (but	exists for backwards-compatibility).  $secure should be	the
	   numeric value of 0 or 1.

       crawl_new
	   Params: $START, $MAX_DEPTH, \%request_hash [, \%tracking_hash ]

	   Return: $crawl_object

           The crawl_new() function initializes a crawl object (hash) to the
	   default values, and then returns it for later use by	crawl().
	   $START is the starting URL (in the form of
	   'http://www.host.com/url'), and MAX_DEPTH is	the maximum number of
	   levels to crawl (the	START URL counts as 1, so a value of 2 will
	   crawl the START URL and all URLs found on that page).  The
	   request_hash	is a standard initialized request hash to be used for
	   requests; you should	set any	authentication information or headers
	   in this hash	in order for the crawler to use	them.  The optional
	   tracking_hash lets you supply a hash	for use	in tracking URL
	   results (otherwise crawl_new() will allocate	a new anon hash).

       crawl
	   Params: $crawl_object [, $START, $MAX_DEPTH ]

	   Return: $count [ undef on error ]

	   The heart of	the crawl package.  Will perform an HTTP crawl on the
	   specified HOST, starting at START URI, proceeding up	to MAX_DEPTH.

	   Crawl_object	needs to be the	variable returned by crawl_new().  You
	   can also indirectly call crawl() via	the crawl_object itself:

		   $crawl_object->{crawl}->($START,$MAX_DEPTH)

	   Returns the number of URLs actually crawled (not including those
	   skipped).

       dump
	   Params: $name, \@array [, $name, \%hash, $name, \$scalar ]

	   Return: $code [ undef on error ]

	   The dump function will take the given $name and data	reference, and
	   will	create an ASCII	perl code representation suitable for eval'ing
	   later to recreate the same structure.  $name	is the name of the
	   variable that it will be saved as.  Example:

	    $output = LW2::dump('request',\%request);

	   NOTE: dump()	creates	anonymous structures under the name given.
	   For example,	if you dump the	hash %hin under	the name 'hin',	then
	   when	you eval the dumped code you will need to use %$hin, since
	   $hin	is now a *reference* to	a hash.

       dump_writefile
           Params: $file, $name, \@array [, $name, \%hash, $name, \$scalar ]

	   Return: 0 if	success; 1 if error

	   This	calls dump() and saves the output to the specified $file.

           Note: LW does no checking on the validity of the file name, its
           creation, or anything of the sort.  Files are opened in overwrite
           mode.

       encode_base64
	   Params: $data [, $eol]

	   Return: $b64_encoded_data

	   This	function does Base64 encoding.	If the binary MIME::Base64
	   module is available,	it will	use that; otherwise, it	falls back to
	   an internal perl version.  The perl version carries the following
	   copyright:

	    Copyright 1995-1999	Gisle Aas <gisle@aas.no>

           NOTE: the $eol parameter will be inserted every 76 characters.
           This is used to format the data for output on an 80-character-wide
           terminal.
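
           The wrapping can be observed with the core MIME::Base64 module,
           which the text above says is used when available.  This is only a
           sketch: it assumes the internal fallback wraps the same way, and
           the helper name wrapped_lines is made up for illustration.

```perl
use strict;
use warnings;
use MIME::Base64 ();

# Hypothetical helper: encode $data and return the output split on $eol.
sub wrapped_lines {
    my ($data, $eol) = @_;
    my $encoded = MIME::Base64::encode_base64($data, $eol);
    return split /\Q$eol\E/, $encoded;
}

# 100 input bytes encode to 136 Base64 characters, so the output is one
# full 76-character line followed by a 60-character remainder.
my @lines = wrapped_lines('A' x 100, "\n");
printf "%d characters\n", length $_ for @lines;
```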

       decode_base64
	   Params: $data

	   Return: $b64_decoded_data

	   A perl implementation of base64 decoding.  The perl code for	this
	   function was	actually taken from an older MIME::Base64 perl module,
	   and bears the following copyright:

	   Copyright 1995-1999 Gisle Aas <gisle@aas.no>

       encode_uri_hex
	   Params: $data

	   Return: $result

	   This	function encodes every character (except the / character) with
	   normal URL hex encoding.

       encode_uri_randomhex
	   Params: $data

	   Return: $result

	   This	function randomly encodes characters (except the / character)
	   with	normal URL hex encoding.

       encode_uri_randomcase
	   Params: $data

	   Return: $result

	   This	function randomly changes the case of characters in the
	   string.

       encode_unicode
	   Params: $data

	   Return: $result

	   This	function converts a normal string into Windows unicode format
	   (non-overlong or anything fancy).

       decode_unicode
	   Params: $unicode_string

	   Return: $decoded_string

	   This	function attempts to decode a unicode (UTF-8) string by
	   converting it into a	single-byte-character string.  Overlong
	   characters are converted to their standard characters in place;
           non-overlong (aka multi-byte) characters are substituted with
           0xff; invalid encoding characters are left as-is.

	   Note: this function is useful for dealing with the various unicode
	   exploits/vulnerabilities found in web servers; it is	*not* good for
	   doing actual	UTF-8 parsing, since characters	over a single byte are
	   basically dropped/replaced with a placeholder.
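
           To make the overlong case concrete, here is a rough sketch of the
           rule described above, restricted to two-byte sequences.  It is
           not LW2's implementation; the function name sketch_decode_2byte
           is hypothetical, and longer sequences are simply left untouched.

```perl
use strict;
use warnings;

# Hypothetical sketch (not LW2's code): handles two-byte sequences only.
# A lead byte 0xC0-0xDF followed by a continuation byte 0x80-0xBF carries
# 11 bits.  If the decoded value is below 0x80, the sequence was an
# overlong form of a plain ASCII character and is replaced by it; any
# other value is a real multi-byte character and becomes 0xff.
# Bytes that do not form such a pair are left as-is.
sub sketch_decode_2byte {
    my ($bytes) = @_;
    $bytes =~ s{([\xC0-\xDF])([\x80-\xBF])}{
        my $cp = ((ord($1) & 0x1F) << 6) | (ord($2) & 0x3F);
        $cp < 0x80 ? chr($cp) : "\xFF";
    }ge;
    return $bytes;
}

# 0xC0 0xAF is the overlong two-byte form of '/'.
print sketch_decode_2byte("\xC0\xAFindex.html"), "\n";
```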

       encode_anti_ids
	   Params: \%request, $modes

	   Return: nothing

           encode_anti_ids computes the proper anti-ids encoding/tricks
           specified by $modes, and sets up %request in order to use those
           tricks.  Valid modes are (the mode numbers are the same as those
           found in whisker 1.4):

	   1 Encode some of the	characters via normal URL encoding
	   2 Insert directory self-references (/./)
	   3 Premature URL ending (make	it appear the request line is done)
	   4 Prepend a long random string in the form of "/string/../URL"
	   5 Add a fake	URL parameter
	   6 Use a tab instead of a space as a request spacer
	   7 Change the	case of	the URL	(works against Windows and Novell)
           8 Change normal separators ('/') to Windows version ('\')
	   9 Session splicing [NOTE: not currently available]
	   A Use a carriage return (0x0d) as a request spacer
	   B Use binary	value 0x0b as a	request	spacer

	   You can set multiple	modes by setting the string to contain all the
	   modes desired; i.e. $modes="146" will use modes 1, 4, and 6.
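
           Each character of $modes selects one entry from the list above.
           The sketch below only illustrates that mapping; the helper
           describe_modes and its table are written for this example and are
           not part of LW2.  With the library itself, the documented call
           would be LW2::encode_anti_ids(\%request, '146').

```perl
use strict;
use warnings;

# Mode characters and short descriptions, copied from the list above.
my %ANTI_IDS_MODE = (
    '1' => 'encode some characters via normal URL encoding',
    '2' => 'insert directory self-references (/./)',
    '3' => 'premature URL ending',
    '4' => 'prepend a long random string (/string/../URL)',
    '5' => 'add a fake URL parameter',
    '6' => 'use a tab as the request spacer',
    '7' => 'change the case of the URL',
    '8' => 'change / separators to \\',
    '9' => 'session splicing (not currently available)',
    'A' => 'use a carriage return (0x0d) as the request spacer',
    'B' => 'use 0x0b as the request spacer',
);

# Hypothetical helper: list what each character of a $modes string selects.
sub describe_modes {
    my ($modes) = @_;
    my @described;
    for my $mode (split //, $modes) {
        my $what = exists $ANTI_IDS_MODE{$mode}
            ? $ANTI_IDS_MODE{$mode}
            : '(not a listed mode)';
        push @described, "$mode: $what";
    }
    return @described;
}

print "$_\n" for describe_modes('146');
```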

       FORMS FUNCTIONS
	   The goal is to parse	the variable, human-readable HTML into
           concrete structures useable by your program.  The forms functions
           do a good job at making these structures, but I will admit:
           they are not exactly simple, and thus not a cinch to work with.
           But then again, representing something as complex as an HTML form is
	   not a simple	thing either.  I think the results are acceptable for
	   what's trying to be done.  Anyways...

	   Forms are stored in perl hashes, with elements in the following
	   format:

            $form{'element_name'} = [ [ 'type', 'value', \@params ], ... ];

	   Thus	every element in the hash is an	array of anonymous arrays.
	   The first array value contains the element type (which is 'select',
	   'textarea', 'button', or an 'input' value of	the form 'input-text',
	   'input-hidden', 'input-radio', etc).

	   The second value is the value, if applicable	(it could be undef if
	   no value was	specified).  Note that select elements will always
	   have	an undef value--the actual values are in the subsequent
	   options elements.

	   The third value, if defined,	is an anonymous	array of additional
	   tag parameters found	in the element (like 'onchange="blah"',
	   'size="20"',	'maxlength="40"', 'selected', etc).

           The hash does contain one special element, which is stored
           under a NULL character ("\0") key.  This element is of the
	   format:

	    $form{"\0"}=['name', 'method', 'action', @parameters];

	   The element is an anonymous array that contains strings of the
	   form's name,	method,	and action (values can be undef), and a
	   @parameters array similar to	that found in normal elements (above).

	   Accessing individual	values stored in the form hash becomes a test
	   of your perl	referencing skills.  Hint: to access the 'value' of
	   the third element named 'choices', you would	need to	do:

	    $form{'choices'}->[2]->[1];

	   The '[2]' is	the third element (normal array	starts with 0),	and
	   the actual value is '[1]' (the type is '[0]', and the parameter
	   array is '[2]').
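
           The layout is easier to see on a hand-built hash.  The sketch
           below follows the description above (each entry holds a type, a
           value, and an anonymous array of tag parameters); the form name,
           action, and radio-button values are invented for illustration,
           and the tag parameters of the special "\0" entry are omitted.

```perl
use strict;
use warnings;

# A hand-built form hash for a hypothetical form with three radio buttons
# that share the name 'choices'.
my %form = (
    "\0"      => [ 'prefs', 'POST', '/prefs.cgi' ],    # name, method, action
    'choices' => [
        [ 'input-radio', 'red',   [] ],
        [ 'input-radio', 'green', [] ],
        [ 'input-radio', 'blue',  ['checked'] ],
    ],
);

# The third element named 'choices': [0] is the type, [1] the value,
# [2] the parameter array.
my $third = $form{'choices'}->[2];
print "type:   $third->[0]\n";
print "value:  $third->[1]\n";
print "params: @{ $third->[2] }\n";

# Walk the ordinary elements, skipping the special "\0" entry.
for my $name (grep { $_ ne "\0" } sort keys %form) {
    printf "%s has %d element(s)\n", $name, scalar @{ $form{$name} };
}
```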

       forms_read
	   Params: \$html_data

	   Return: \@found_forms

	   This	function parses	the given $html_data into libwhisker form
	   hashes.  It returns a reference to an array of hash references to
	   the found forms.

       forms_write
	   Params: \%form_hash

	   Return: $html_of_form [undef	on error]

	   This	function will take the given %form hash	and compose a generic
	   HTML	representation of it, formatted	with tabs and newlines in
	   order to make it neat and tidy for printing.

	   Note: this function does *not* escape any special characters	that
	   were	embedded in the	element	values.

       html_find_tags
	   Params: \$data, \&callback_function [, $xml_flag, $funcref,
	   \%tag_map]

	   Return: nothing

	   html_find_tags parses a piece of HTML and 'extracts'	all found
	   tags, passing the info to the given callback	function.  The
	   callback function must accept two parameters: the current tag (as a
	   scalar), and	a hash ref of all the tag's elements. For example, the
	   tag <a href="/file">	will pass 'a' as the current tag, and a	hash
	   reference which contains {'href'=>"/file"}.

           The xml_flag, when set, causes the parser to do some extra
           processing and checks to accommodate XML style tags such as <tag
           foo="bar"/>.

	   The optional	%tagmap	is a hash of lowercase tag names.  If a	tagmap
	   is supplied,	then the parser	will only call the callback function
	   if the tag name exists in the tagmap.

	   The optional	$funcref variable is passed straight to	the callback
	   function, allowing you to pass flags	or references to more complex
	   structures to your callback function.
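
           A short sketch of the callback interface, assuming LW2.pm is
           installed: it collects the href attribute of every <a> tag and
           uses the tag map so that only 'a' tags reach the callback.  The
           helper name collect_hrefs and the sample HTML are made up for
           illustration, and the demonstration is skipped when the module
           cannot be loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: return the href of every <a> tag found in $html.
sub collect_hrefs {
    my ($html) = @_;
    my @hrefs;

    # The callback takes the two documented arguments: the tag name and
    # a hash reference holding the tag's elements.
    my $callback = sub {
        my ($tag, $attrs) = @_;
        push @hrefs, $attrs->{href}
            if $tag eq 'a' && defined $attrs->{href};
    };

    # Argument order follows the Params line: data reference, callback,
    # $xml_flag, $funcref, tag map (a hash keyed by lowercase tag names).
    LW2::html_find_tags(\$html, $callback, 0, undef, { a => 1 });
    return @hrefs;
}

if (eval { require LW2; 1 }) {
    my $sample = '<a href="/file">file</a> <img src="/x.png">';
    print "$_\n" for collect_hrefs($sample);
}
else {
    print "LW2.pm not found; skipping the demonstration\n";
}
```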

       html_find_tags_rewrite
	   Params: $position, $length, $replacement

	   Return: nothing

	   html_find_tags_rewrite() is used to 'rewrite' an HTML stream	from
	   within an html_find_tags() callback function.  In general, you can
	   think of html_find_tags_rewrite working as:

	   substr(DATA,	$position, $length) = $replacement

	   Where DATA is the current HTML string the html parser is using.
           The reason you need to use this function and not substr() is
           because a few internal parser pointers and counters need to be
           adjusted to accommodate the changes.

	   If you want to remove a piece of the	string,	just set the
	   replacement to an empty string ('').	 If you	wish to	insert a
	   string instead of overwrite,	just set $length to 0; your string
	   will	be inserted at the indicated $position.
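
           The three cases (overwrite, insert, remove) behave like plain
           substr() assignments.  The sketch below uses a hypothetical
           stand-in, sketch_rewrite, that takes the same position, length,
           and replacement arguments but operates on an explicit string
           reference; inside a real callback you would call
           html_find_tags_rewrite() instead, for the reason given above.

```perl
use strict;
use warnings;

# Hypothetical stand-in: same $position, $length, $replacement order as
# html_find_tags_rewrite(), preceded by a reference to the string.
sub sketch_rewrite {
    my ($dataref, $position, $length, $replacement) = @_;
    substr($$dataref, $position, $length) = $replacement;
    return;
}

my $data = '<b>hi</b>';

sketch_rewrite(\$data, 0, 3, '<em>');     # overwrite the opening tag
sketch_rewrite(\$data, 6, 4, '</em>');    # overwrite the closing tag
sketch_rewrite(\$data, 4, 0, 'oh, ');     # $length of 0: pure insertion
print "$data\n";

sketch_rewrite(\$data, 4, 4, '');         # empty replacement: removal
print "$data\n";
```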

       html_link_extractor
	   Params: \$html_data

	   Return: @urls

	   The html_link_extractor() function uses the internal	crawl tests to
	   extract all the HTML	links from the given HTML data stream.

	   Note: html_link_extractor() does not	unique the returned array of
	   discovered links, nor does it attempt to remove javascript links or
	   make	the links absolute.  It	just extracts every raw	link from the
	   HTML	stream and returns it.	You'll have to do your own post-
	   processing.

       http_new_request
	   Params: %parameters

	   Return: \%request_hash

	   This	function basically 'objectifies' the creation of whisker
	   request hash	objects.  You would call it like:

	    $req = http_new_request( host=>'www.example.com', uri=>'/' )

	   where 'host'	and 'uri' can be any number of {whisker} hash control
	   values (see http_init_request for default list).

       http_new_response
	   Params: [none]

	   Return: \%response_hash

	   This	function basically 'objectifies' the creation of whisker
	   response hash objects.  You would call it like:

		   $resp = http_new_response()

       http_init_request
	   Params: \%request_hash_to_initialize

	   Return: Nothing (modifies input hash)

	   Sets	default	values to the input hash for use.  Sets	the host to
	   'localhost',	port 80, request URI '/', using	HTTP 1.1 with GET
	   method.  The	timeout	is set to 10 seconds, no proxies are defined,
	   and all URI formatting is set to standard HTTP syntax.  It also
	   sets	the Connection (Keep-Alive) and	User-Agent headers.

	   NOTICE!!  It's important to use http_init_request before calling
	   http_do_request, or http_do_request might puke.  Thus, a special
	   magic value is placed in the	hash to	let http_do_request know that
	   the hash has	been properly initialized.  If you really must 'roll
	   your	own' and not use http_init_request before you call
	   http_do_request, you	will at	least need to set the MAGIC value
	   (amongst other things).

       http_do_request
	   Params: \%request, \%response [, \%configs]

	   Return: >=1 if error; 0 if no error (also modifies response hash)

	   *THE* core function of libwhisker.  http_do_request actually
	   performs the	HTTP request, using the	values submitted in %request,
	   and placing result values in	%response.  This allows	you to
	   resubmit %request in	subsequent requests (%response is
	   automatically cleared upon execution).  You can submit 'runtime'
	   config directives as	%configs, which	will be	spliced	into
           $request{whisker}->{} before anything else.  That means you can do:

	   LW2::http_do_request(\%req,\%resp,{'uri'=>'/cgi-bin/'});

	   This	will set $req{whisker}->{'uri'}='/cgi-bin/' before execution,
	   and provides	a simple shortcut (note: it does modify	%req).

	   This	function will also retry any requests that bomb	out during the
	   transaction (but not	during the connecting phase).  This is
	   controlled by the {whisker}->{retry}	value.	Also note that the
           returned error message in %response is the *last* error received.  All
	   retry errors	are put	into {whisker}->{retry_errors},	which is an
	   anonymous array.

	   Also	note that all NTLM auth	logic is implemented in
	   http_do_request().  NTLM requires multiple requests in order	to
	   work	correctly, and so this function	attempts to wrap that and make
	   it all transparent, so that the final end result is what's passed
	   to the application.

	   This	function will return 0 on success, 1 on	HTTP protocol error,
	   and 2 on non-recoverable network connection error (you can retry
	   error 1, but	error 2	means that the server is totally unreachable
	   and there's no point	in retrying).
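
           Putting the request functions together, a minimal sketch might
           look like the following.  It assumes LW2.pm is installed and that
           the target host is given on the command line; the helper name
           fetch and the %RESULT table are made up for illustration, and the
           {whisker}->{error} and {whisker}->{data} keys are the ones named
           elsewhere on this page.

```perl
use strict;
use warnings;

# Labels for the documented return codes of http_do_request().
my %RESULT = (
    0 => 'success',
    1 => 'HTTP protocol error (may be worth retrying)',
    2 => 'network connection error (server unreachable)',
);

# Hypothetical helper: request $uri from $host and return the result code
# together with the response hash reference.
sub fetch {
    my ($host, $uri) = @_;
    my $req  = LW2::http_new_request(host => $host, uri => $uri);
    my $resp = LW2::http_new_response();
    LW2::http_fixup_request($req);    # add Host: and similar headers
    my $rc = LW2::http_do_request($req, $resp);
    return ($rc, $resp);
}

# Usage: perl fetch.pl www.example.com
if (@ARGV) {
    require LW2;
    my ($rc, $resp) = fetch($ARGV[0], '/');
    my $label = exists $RESULT{$rc} ? $RESULT{$rc} : "error $rc";
    print "result: $label\n";
    if ($rc) {
        my $error = $resp->{whisker}->{error};
        print "last error: $error\n" if defined $error;
    }
    else {
        my $body = $resp->{whisker}->{data};
        printf "%d body bytes\n", defined $body ? length $body : 0;
    }
}
```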

       http_req2line
	   Params: \%request, $uri_only_switch

	   Return: $request

           http_req2line is used internally by http_do_request, and also
           provides a convenient way to turn a %request configuration into an
           actual HTTP request line.  If $uri_only_switch is set to 1, then the
           returned $request will be the URI only ('/requested/page.html'),
           versus the entire HTTP request ('GET /requested/page.html
           HTTP/1.0\n\n').  Also, if the 'full_request_override' whisker config
           variable is set in %request, then it will be returned instead of the
           constructed URI.

       http_resp2line
	   Params: \%response

	   Return: $response

           http_resp2line provides a convenient way to turn a %response hash
	   back	into the original HTTP response	line.

       http_fixup_request
	   Params: $hash_ref

	   Return: Nothing

           This function takes a request hash reference and makes sure the
           proper headers exist (for example, it will add the Host: header,
           calculate the Content-Length: header for POST requests, etc).  For
           standard requests (i.e. you want the request to be HTTP
           RFC-compliant), you should call this function right before you
           call http_do_request.

       http_reset
	   Params: Nothing

	   Return: Nothing

	   The http_reset function will	walk through the %http_host_cache,
	   closing all open sockets and	freeing	SSL resources.	It also	clears
	   out the host	cache in case you need to rerun	everything fresh.

	   Note: if you	just want to close a single connection,	and you	have a
	   copy	of the %request	hash you used, you should use the http_close()
	   function instead.

       ssl_is_available
	   Params: Nothing

	   Return: $boolean [, $lib_name, $version]

           The ssl_is_available() function will inform you whether SSL
           requests are allowed, which is dependent on whether the appropriate
           SSL libraries are installed on the machine.  In scalar context, the
           function will return 1 or 0.  In array context, the second element
           will be the SSL library name that is currently being used by LW2,
           and the third element will be the SSL library version number.
	   Elements two	and three (name	and version) will be undefined if
	   called in array context and no SSL libraries	are available.

       http_read_headers
	   Params: $stream, \%in, \%out

	   Return: $result_code, $encoding, $length, $connection

	   Read	HTTP headers from the given stream, storing the	results	in
	   %out.  On success, $result_code will	be 1 and $encoding, $length,
	   and $connection will	hold the values	of the Transfer-Encoding,
           Content-Length, and Connection headers, respectively.  If any of
           those headers is not present, the corresponding value will be undef.
	   On an error,	the $result_code will be 0 and $encoding will contain
	   an error message.

	   This	function can be	used to	parse both request and response
	   headers.

	   Note: if there are multiple Transfer-Encoding, Content-Length, or
	   Connection headers, then only the last header value is the one
	   returned by the function.

       http_read_body
	   Params: $stream, \%in, \%out, $encoding, $length

           Return: 1 on success, 0 on error (and sets
           $out->{whisker}->{error})

	   Read	the body from the given	stream,	placing	it in
	   $out->{whisker}->{data}.  Handles chunked encoding.	Can be used to
	   read	HTTP (POST) request or HTTP response bodies.  $encoding
	   parameter should be lowercase encoding type.

	   NOTE: $out->{whisker}->{data} is erased/cleared when	this function
	   is called, leaving {data} to	just contain this particular HTTP
	   body.

       http_construct_headers
	   Params: \%in

	   Return: $data

	   This	function assembles the headers in the given hash into a	data
	   string.

       http_close
	   Params: \%request

	   Return: nothing

	   This	function will close any	open streams for the given request.

	   Note: in order for http_close() to find the right connection, all
	   original host/proxy/port parameters in %request must	be the exact
	   same	as when	the original request was made.

       http_do_request_timeout
	   Params: \%request, \%response, $timeout

	   Return: $result

	   This	function is identical to http_do_request(), except that	it
	   wraps the entire request in a timeout wrapper.  $timeout is the
	   number of seconds to	allow for the entire request to	be completed.

	   Note: this function uses alarm() and	signals, and thus will only
	   work	on Unix-ish platforms.	It should be safe to call on any
	   platform though.

       md5
           Params: $data

	   Return: $hex_md5_string

	   This	function takes a data scalar, and composes a MD5 hash of it,
	   and returns it in a hex ascii string.  It will use the fastest MD5
	   function available.

       md4
           Params: $data

	   Return: $hex_md4_string

	   This	function takes a data scalar, and composes a MD4 hash of it,
	   and returns it in a hex ascii string.  It will use the fastest MD4
	   function available.

       multipart_set
	   Params: \%multi_hash, $param_name, $param_value

	   Return: nothing

	   This	function sets the named	parameter to the given value within
	   the supplied	multipart hash.

       multipart_get
	   Params: \%multi_hash, $param_name

	   Return: $param_value, undef on error

           This function retrieves the value of the named parameter from
           the supplied multipart hash.  There is a special case where
	   the named parameter is actually a file--in which case the resulting
	   value will be "\0FILE".  In general,	all special values will	be
	   prefixed with a NULL	character.  In order to	get a file's info, use
	   multipart_getfile().

       multipart_setfile
	   Params: \%multi_hash, $param_name, $file_path [, $filename]

	   Return: undef on error, 1 on	success

	   NOTE: this function does not	actually add the contents of
	   $file_path into the %multi_hash; instead, multipart_write() inserts
	   the content when generating the final request.

       multipart_getfile
	   Params: \%multi_hash, $file_param_name

	   Return: $path, $name	($path=undef on	error)

	   multipart_getfile is	used to	retrieve information for a file
	   parameter contained in %multi_hash.	To use this you	would most
	   likely do:

	    ($path,$fname)=LW2::multipart_getfile(\%multi,"param_name");

       multipart_boundary
	   Params: \%multi_hash	[, $new_boundary_name]

	   Return: $current_boundary_name

	   multipart_boundary is used to retrieve, and optionally set, the
	   multipart boundary used for the request.

	   NOTE: the function does no checking on the supplied boundary, so if
	   you want things to work make	sure it's a legit boundary.
	   Libwhisker does *not* prefix	it with	any '---' characters.

       multipart_write
	   Params: \%multi_hash, \%request

	   Return: 1 if	successful, undef on error

	   multipart_write is used to parse and	construct the multipart	data
	   contained in	%multi_hash, and place it ready	to go in the given
	   whisker hash	(%request) structure, to be sent to the	server.

	   NOTE: file contents are read	into the final %request, so it's
	   possible for	the hash to get	*very* large if	you have (a) large
	   file(s).

       multipart_read
	   Params: \%multi_hash, \%hout_response [, $filepath ]

	   Return: 1 if	successful, undef on error

	   multipart_read will parse the data contents of the supplied
	   %hout_response hash,	by passing the appropriate info	to
	   multipart_read_data().  Please see multipart_read_data() for	more
	   info	on parameters and behaviour.

	   NOTE: this function will return an error if the given
	   %hout_response Content-Type is not set to "multipart/form-data".

       multipart_read_data
	   Params: \%multi_hash, \$data, $boundary [, $filepath	]

	   Return: 1 if	successful, undef on error

	   multipart_read_data parses the contents of the supplied data	using
	   the given boundary and puts the values in the supplied %multi_hash.
	   Embedded files will *not* be	saved unless a $filepath is given,
	   which should	be a directory suitable	for writing out	temporary
	   files.

           NOTE: currently application/octet-stream is the only supported
           file encoding.  All other file encodings will not be parsed/saved.

       multipart_files_list
	   Params: \%multi_hash

	   Return: @files

	   multipart_files_list	returns	an array of parameter names for	all
	   the files that are contained	in %multi_hash.

       multipart_params_list
	   Params: \%multi_hash

	   Return: @params

           multipart_params_list returns an array of parameter names for all
	   the regular parameters (non-file) that are contained	in
	   %multi_hash.

       ntlm_new
	   Params: $username, $password	[, $domain, $ntlm_only]

	   Return: $ntlm_object

           Returns a reference to an array (otherwise known as the 'ntlm
           object') which contains the various pieces of information specific
           to a user/pass combo.  If $ntlm_only is set to 1, then only the NTLM
	   hash	(and not the LanMan hash) will be generated.  This results in
	   a speed boost, and is typically fine	for using against IIS servers.

	   The array contains the following items, in order: username,
	   password, domain, lmhash(password), ntlmhash(password)

       ntlm_decode_challenge
	   Params: $challenge

	   Return: @challenge_parts

	   Splits the supplied challenge into the various parts.  The returned
	   array contains elements in the following order:

	   unicode_domain, ident, packet_type, domain_len, domain_maxlen,
	   domain_offset, flags, challenge_token, reserved, empty, raw_data

       ntlm_client
	   Params: $ntlm_obj [,	$server_challenge]

	   Return: $response

	   ntlm_client() is responsible	for generating the base64-encoded text
	   you include in the HTTP Authorization header.  If you call
	   ntlm_client() without a $server_challenge, the function will	return
	   the initial NTLM request packet (message packet #1).	 You send this
	   to the server, and take the server's	response (message packet #2)
	   and pass that as $server_challenge, causing ntlm_client() to
	   generate the	final response packet (message packet #3).

	   Note: $server_challenge is expected to be base64 encoded.

       get_page
	   Params: $url	[, \%request]

           Return: $code, $data ($code will be set to undef on error, $data
           will contain error message)

	   This	function will fetch the	page at	the given URL, and return the
	   HTTP	response code and page contents.  Use this in the form of:
	   ($code,$html)=LW2::get_page("http://host.com/page.html")

	   The optional	%request will be used if supplied.  This allows	you to
	   set headers and other parameters.

       get_page_hash
	   Params: $url	[, \%request]

	   Return: $hash_ref (undef on no URL)

	   This	function will fetch the	page at	the given URL, and return the
	   whisker HTTP	response hash.	The return code	of the function	is set
	   to $hash_ref->{whisker}->{get_page_hash}, and uses the
	   http_do_request() return values.

	   Note: undef is returned if no URL is	supplied

       get_page_to_file
	   Params: $url, $filepath [, \%request]

	   Return: $code ($code	will be	set to undef on	error)

	   This	function will fetch the	page at	the given URL, place the
	   resulting HTML in the file specified, and return the	HTTP response
	   code.  The optional %request	hash sets the default parameters to be
	   used	in the request.

	   NOTE: libwhisker does not do	any file checking; libwhisker will
	   open	the supplied filepath for writing, overwriting any previously-
	   existing files.  Libwhisker does not	differentiate between a	bad
	   request, and	a bad file open.  If you're having troubles making
	   this	function work, make sure that your $filepath is	legal and
	   valid, and that you have appropriate	write permissions to
	   create/overwrite that file.

       time_mktime
	   Params: $seconds, $minutes, $hours, $day_of_month, $month,
	   $year_minus_1900

	   Return: $seconds [ -1 on error ]

	   Performs a general mktime calculation with the given time
	   components.  Note that the input parameter values are expected to
	   be in the format output by localtime/gmtime.  Namely, $seconds is
	   0-60 (yes, there can be a leap second value of 60 occasionally),
	   $minutes is 0-59, $hours is 0-23, $day_of_month is 1-31, $month is
	   0-11, and $year_minus_1900 is 70-137.  This function is limited in
	   that it will not process dates prior to 1970 or after 2037 (that
	   way 32-bit time_t overflow calculations aren't required).

	   Additional parameters passed	to the function	are ignored, so	it is
	   safe	to use the full	localtime/gmtime output, such as:

		   $seconds = LW2::time_mktime(	localtime( time	) );

	   Note: this function does not	adjust for time	zone, daylight savings
	   time, etc.  You must	do that	yourself.
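
	   As a rough illustration of the calculation described above, the
	   following pure-Perl sketch turns localtime/gmtime-style fields into
	   seconds since 1970 with no time zone handling.  The name
	   sketch_mktime is hypothetical and this is not Libwhisker's own
	   code; it only mirrors the documented input ranges and the -1 error
	   return.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the calculation described above; not LW2's code.
sub sketch_mktime {
    my ( $sec, $min, $hour, $mday, $mon, $yr ) = @_;    # extra args are ignored
    for my $v ( $sec, $min, $hour, $mday, $mon, $yr ) {
        return -1 if !defined $v || $v < 0;
    }
    return -1
      if $sec > 60 || $min > 59 || $hour > 23 || $mday < 1 || $mday > 31
      || $mon > 11 || $yr < 70 || $yr > 137;

    # Days before the start of each month in a non-leap year.
    my @before = ( 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 );
    my $year   = 1900 + $yr;

    # Between 1970 and 2037 every fourth year is a leap year.
    my $leap_days = int( ( $year - 1969 ) / 4 );    # leap days in 1970..$year-1
    $leap_days++ if $year % 4 == 0 && $mon > 1;     # this year's Feb 29 has passed

    my $days = ( $year - 1970 ) * 365 + $leap_days + $before[$mon] + $mday - 1;
    return ( ( $days * 24 + $hour ) * 60 + $min ) * 60 + $sec;
}

# Feeding gmtime() output back in should reproduce the original timestamp.
my $round_trip = sketch_mktime( gmtime(1_700_000_000) );
```

	   Because no time zone is applied, passing localtime() output instead
	   yields local wall-clock seconds rather than a true UTC timestamp.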

       time_gmtolocal
	   Params: $seconds_gmt

	   Return: $seconds_local_timezone

	   Takes a seconds value in UTC/GMT time and adjusts it to reflect the
	   current timezone.  This function is slightly expensive; it takes
	   the gmtime() and localtime() representations of the current time,
	   calculates the difference (delta) by turning them back into seconds
	   via time_mktime, and then applies this delta to $seconds_gmt.

	   Note	that if	you give this function a time and subtract the return
	   value from the original time, you will get the delta	value.	At
	   that	point, you can just apply the delta directly and skip calling
	   this	function, which	is a massive performance boost.	 However, this
	   will	cause problems if you have a long running program which
	   crosses daylight savings time boundaries, as	the DST	adjustment
	   will	not be accounted for unless you	recalculate the	new delta.
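
	   The caching idea above can be sketched without Libwhisker, using
	   the core Time::Local module as a stand-in for time_mktime.  The
	   helper name sketch_local_delta is hypothetical, and the sign
	   convention here (local minus UTC) is an assumption of this sketch.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: seconds to add to a UTC timestamp to obtain local
# wall-clock seconds, valid for the moment $now.
sub sketch_local_delta {
    my ($now) = @_;
    my @local = localtime($now);
    $local[5] += 1900;    # pass a four-digit year so timegm() does not guess
    return timegm( @local[ 0 .. 5 ] ) - $now;
}

# Compute the delta once, then apply it to many timestamps cheaply.
my $delta        = sketch_local_delta(time);
my @stamps_gmt   = ( 1_000_000_000, 1_000_003_600 );
my @stamps_local = map { $_ + $delta } @stamps_gmt;
```

	   A long-running program would need to call the helper again after a
	   daylight saving time change, since the cached delta goes stale.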

       uri_split
	   Params: $uri_string [, \%request_hash]

	   Return: @uri_parts

	   Return an array of the following values, in order:  uri, protocol,
	   host, port, params, frag, user, password.  Values not defined are
	   given an undef value.  If a %request	hash is	passed in, then
	   uri_split() will also set the appropriate values in the hash.

	   Note:  uri_split() will only	set the	%request hash if the protocol
	   is HTTP or HTTPS!

       uri_join
	   Params: @vals

	   Return: $url

	   Takes the @vals array output from uri_split, and returns a
	   single scalar/string with them joined again, in the form of:
	   protocol://user:pass@host:port/uri?params#frag

       uri_absolute
	   Params: $uri, $base_uri [, $normalize_flag ]

	   Return: $absolute_uri

	   Double checks that the given $uri is in absolute form (that is,
	   "http://host/file"), and if not (it's in the form "/file"), then it
	   will combine it with the given $base_uri to make it absolute.  This
	   provides compatibility similar to that found in the URI subpackage.

	   If $normalize_flag is set to	1, then	the output will	be passed
	   through uri_normalize before	being returned.

       uri_normalize
	   Params: $uri	[, $fix_windows_slashes	]

	   Return: $normalized_uri [ undef on error ]

	   Takes the given $uri and does any /./ and /../ dereferencing in
	   order to come up with the correct absolute URL.  If the
	   $fix_windows_slashes parameter is set to 1, all \ (back slashes)
	   will be converted to / (forward slashes).

	   Non-http/https URIs return an error.
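
	   To make the /./ and /../ dereferencing concrete, here is a minimal
	   sketch that works on the path portion only.  The name
	   sketch_normalize_path is hypothetical, the code is not
	   Libwhisker's implementation, and it assumes the path starts with a
	   slash.

```perl
use strict;
use warnings;

# Hypothetical sketch of /./ and /../ dereferencing on an absolute URI path;
# not LW2's implementation.
sub sketch_normalize_path {
    my ( $path, $fix_windows_slashes ) = @_;
    $path =~ tr{\\}{/} if $fix_windows_slashes;

    my @segs = split m{/}, $path, -1;

    # A trailing "." or ".." names a directory, so keep the final slash.
    push @segs, '' if @segs && $segs[-1] =~ /\A\.\.?\z/;

    my @out;
    for my $seg (@segs) {
        if    ( $seg eq '.' )  { next }
        elsif ( $seg eq '..' ) { pop @out if @out > 1 }    # never climb above root
        else                   { push @out, $seg }
    }
    return join '/', @out;
}

my $clean = sketch_normalize_path('/a/./b/../c');    # '/a/c'
```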

       uri_get_dir
	   Params: $uri

	   Return: $uri_directory

	   Will take a URI and return the directory base of it, e.g.
	   /rfp/page.php will return /rfp/.

       uri_strip_path_parameters
	   Params: $uri	[, \%param_hash]

	   Return: $stripped_uri

	   This	function removes all URI path parameters of the	form

	    /blah1;foo=bar/blah2;baz

	   and returns the stripped URI	('/blah1/blah2').  If the optional
	   parameter hash reference is provided, the stripped parameters are
	   saved in the	form of	'blah1'=>'foo=bar', 'blah2'=>'baz'.

	   Note: only the last value of	a duplicate name is saved into the
	   param_hash, if provided.  So	a $uri of '/foo;A/foo;B/' will result
	   in a	single hash entry of 'foo'=>'B'.
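
	   The stripping rule and the last-value-wins note can be illustrated
	   with a short sketch.  The name sketch_strip_path_parameters is
	   hypothetical; this is not Libwhisker's code, and it assumes the
	   input is a bare path with no query string.

```perl
use strict;
use warnings;

# Hypothetical sketch of the behavior described above; not LW2's code.
sub sketch_strip_path_parameters {
    my ( $uri, $params ) = @_;
    my @clean;
    for my $seg ( split m{/}, $uri, -1 ) {
        # Everything after the first ';' in a segment is its parameter.
        my ( $name, $param ) = split /;/, $seg, 2;
        $name = '' unless defined $name;
        $params->{$name} = $param if $params && defined $param;
        push @clean, $name;
    }
    return join '/', @clean;
}

my %saved;
my $stripped = sketch_strip_path_parameters( '/blah1;foo=bar/blah2;baz', \%saved );
```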

       uri_parse_parameters
	   Params: $parameter_string [,	$decode, $multi_flag ]

	   Return: \%parameter_hash

	   This	function takes a string	in the form of:

	    foo=1&bar=2&baz=3&foo=4

	   And parses it into a	hash.  In the above example, the element 'foo'
	   has two values (1 and 4).  If $multi_flag is	set to 1, then the
	   'foo' hash entry will hold an anonymous array of both values.
	   Otherwise, the default is to	just contain the last value (in	this
	   case, '4').

	   If $decode is set to	1, then	normal hex decoding is done on the
	   characters, where needed (both the name and value are decoded).

	   Note: if a URL parameter name appears without a value, then the
	   value will be set to	undef.	E.g. for the string "foo=1&bar&baz=2",
	   the 'bar' hash element will have an undef value.
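
	   The parsing rules above ($decode, $multi_flag, and undef for a name
	   without a value) can be sketched as follows.  The name
	   sketch_parse_parameters is hypothetical and this is not
	   Libwhisker's code; in particular, leaving single-valued names as
	   plain scalars when $multi_flag is set is an assumption of the
	   sketch.

```perl
use strict;
use warnings;

# Hypothetical sketch of the parsing rules described above; not LW2's code.
sub sketch_parse_parameters {
    my ( $string, $decode, $multi_flag ) = @_;
    my %parsed;
    for my $pair ( split /&/, $string ) {
        next if $pair eq '';
        my ( $name, $value ) = split /=/, $pair, 2;    # $value is undef for "bar"
        if ($decode) {
            for ( $name, $value ) {
                next unless defined;
                s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;    # %41 becomes "A"
            }
        }
        if ( $multi_flag && exists $parsed{$name} ) {
            # Promote to an array reference on the second occurrence.
            $parsed{$name} = [ $parsed{$name} ] unless ref $parsed{$name};
            push @{ $parsed{$name} }, $value;
        }
        else {
            $parsed{$name} = $value;    # last value wins
        }
    }
    return \%parsed;
}

my $single = sketch_parse_parameters('foo=1&bar=2&baz=3&foo=4');
my $multi  = sketch_parse_parameters( 'foo=1&bar=2&baz=3&foo=4', 0, 1 );
```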

       uri_escape
	   Params: $data

	   Return: $encoded_data

	   This	function encodes the given $data so it is safe to be used in
	   URIs.

       uri_unescape
	   Params: $encoded_data

	   Return: $data

	   This function decodes the given $encoded_data out of URI format.

       utils_recperm
	   Params: $uri, $depth, \@dir_parts, \@valid, \&func, \%track,
	   \%arrays, \&cfunc

	   Return: nothing

	   This is a special function which is used to recursively permute
	   through a given directory listing.  This is really only used by
	   whisker, in order to traverse down directories, testing them as it
	   goes.  See whisker 2.0 for exact usage examples.

       utils_array_shuffle
	   Params: \@array

	   Return: nothing

	   This	function will randomize	the order of the elements in the given
	   array.

       utils_randstr
	   Params: [ $size, $chars ]

	   Return: $random_string

	   This	function generates a random string between 10 and 20
	   characters long, or of $size	if specified.  If $chars is specified,
	   then	the random function picks characters from the supplied string.
	   For example,	to have	a random string	of 10 characters, composed of
	   only	the characters 'abcdef', then you would	run:

	    utils_randstr(10,'abcdef');

	   The default character string	is alphanumeric.
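
	   The behavior described above can be approximated in a few lines.
	   The name sketch_randstr is hypothetical and this is not
	   Libwhisker's code; the exact default character set used here
	   (a-z, A-Z, 0-9) is an assumption based on the word "alphanumeric".

```perl
use strict;
use warnings;

# Hypothetical sketch of the behavior described above; not LW2's code.
sub sketch_randstr {
    my ( $size, $chars ) = @_;
    $size = 10 + int( rand(11) ) unless $size;    # 10 to 20 characters
    $chars = join '', 'a' .. 'z', 'A' .. 'Z', 0 .. 9
      unless defined $chars && length $chars;
    my @pool = split //, $chars;
    return join '', map { $pool[ rand @pool ] } 1 .. $size;
}

my $token = sketch_randstr( 10, 'abcdef' );
```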

       utils_port_open
	   Params: $host, $port

	   Return: $result

	   Quick function to attempt to	make a connection to the given host
	   and port.  If a connection was successfully made, function will
	   return true (1).  Otherwise it returns false	(0).

	   Note: this uses standard TCP connections and is extremely slow, so
	   it is not recommended for use in port-scanning type applications.

       utils_lowercase_keys
	   Params: \%hash

	   Return: $number_changed

	   Will	lowercase all the header names (but not	values)	of the given
	   hash.

       utils_find_lowercase_key
	   Params: \%hash, $key

	   Return: $value, undef on error or not exist

	   Searches the given hash for the $key (regardless of case), and
	   returns the value.  If the return value is placed into an array,
	   the function will dereference any multi-value references and
	   return an array of all values.

	   WARNING!  In	scalar context,	$value can either be a single-value
	   scalar or an	array reference	for multiple scalar values.  That
	   means you either need to check the return value and act
	   appropriately, or use an array context (even	if you only want a
	   single value).  This	is very	important, even	if you know there are
	   no multi-value hash keys.  This function may	still return an	array
	   of multiple values even if all hash keys are	single value, since
	   lowercasing the keys	could result in	multiple keys matching.	 For
	   example, a hash with	the values { 'Foo'=>'a', 'fOo'=>'b' }
	   technically has two keys with the lowercase name 'foo', and so this
	   function will either	return an array	or array reference with	both
	   'a' and 'b'.
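
	   The context-dependent return described in the warning can be seen
	   in a small stand-alone sketch.  The name sketch_find_lowercase_key
	   is hypothetical and this is not Libwhisker's code; sorting the keys
	   to get a stable order is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical sketch of a case-insensitive lookup with the context-dependent
# return described above; not LW2's code.
sub sketch_find_lowercase_key {
    my ( $hash, $key ) = @_;
    return unless ref($hash) eq 'HASH' && defined $key;

    my @found;
    for my $k ( sort keys %$hash ) {
        next unless lc($k) eq lc($key);
        my $v = $hash->{$k};
        push @found, ref($v) eq 'ARRAY' ? @$v : $v;    # flatten multi-values
    }
    return unless @found;
    return @found if wantarray;                         # list context: all values
    return @found == 1 ? $found[0] : \@found;           # scalar: value or array ref
}

my %headers = ( 'Foo' => 'a', 'fOo' => 'b', 'Bar' => 'x' );
my $scalar  = sketch_find_lowercase_key( \%headers, 'foo' );    # array reference
my ($first) = sketch_find_lowercase_key( \%headers, 'foo' );    # first value only
```

	   Calling in list context, as in the last line, avoids having to
	   test whether the result is a reference.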

       utils_find_key
	   Params: \%hash, $key

	   Return: $value, undef on error or not exist

	   Searches the given hash for the $key (case-sensitive), and returns
	   the value.  If the return value is placed into an array, the
	   function will dereference any multi-value references and return an
	   array of all values.

       utils_delete_lowercase_key
	   Params: \%hash, $key

	   Return: $number_found

	   Searches the	given hash for the $key	(regardless of case), and
	   deletes the key out of the hash if found.  The function returns the
	   number of keys found	and deleted (since multiple keys can exist
	   under the names 'Key', 'key', 'keY',	'KEY', etc.).

       utils_getline
	   Params: \$data [, $resetpos ]

	   Return: $line (undef	if no more data)

	   Fetches the next \n terminated line from the given data.  Use the
	   optional $resetpos to reset the internal position pointer.  Does
	   *NOT* return trailing \n.

       utils_getline_crlf
	   Params: \$data [, $resetpos ]

	   Return: $line (undef	if no more data)

	   Fetches the next \r\n terminated line from the given data.  Use the
	   optional $resetpos to reset the internal position pointer.  Does
	   *NOT* return trailing \r\n.

       utils_save_page
	   Params: $file, \%response

	   Return: 0 on	success, 1 on error

	   Saves the data portion of the given whisker %response hash to the
	   indicated file.  Can	technically save the data portion of a
	   %request hash too.  A file is not written if	there is no data.

	   Note: LW does not do	any special file checking; files are opened in
	   overwrite mode.

       utils_getopts
	   Params: $opt_str, \%opt_results

	   Return: 0 on	success, 1 on error

	   This function is a general implementation of Getopt::Std.  It will
	   parse @ARGV, looking for the options specified in $opt_str, and
	   will put the results in %opt_results.  Behavior/parameter values
	   are similar to Getopt::Std's getopts().

	   Note: this function does *not* support long options (--option),
	   option grouping (-opq), or options with immediate values (-ovalue).
	   If an option	is indicated as	having a value,	it will	take the next
	   argument regardless.

       utils_text_wrapper
	   Params: $long_text_string [,	$crlf, $width ]

	   Return: $formatted_text_string

	   This	is a simple function used to format a long line	of text	for
	   display on a	typical	limited-character screen, such as a unix shell
	   console.

	   $crlf defaults to "\n", and $width defaults to 76.

       utils_bruteurl
	   Params: \%req, $pre,	$post, \@values_in, \@values_out

	   Return: Nothing (adds to @values_out)

	   Bruteurl will perform a brute force against the host/server
	   specified in %req.  It makes one request per entry in @values_in,
	   taking each value and setting $req{'whisker'}->{'uri'} to
	   $pre.value.$post.  Any URI responding with an HTTP 200 or 403
	   response is pushed into @values_out.  For example, to brute force
	   usernames, put a list of common usernames in @values_in and set
	   $pre='/~' and $post='/'.

       utils_join_tag
	   Params: $tag_name, \%attributes

	   Return: $tag_string [undef on error]

	   This function takes the $tag_name (like 'A') and a hash full of
	   attributes (like {href=>'http://foo/'}) and returns the constructed
	   HTML tag string (<A href="http://foo/">).

       utils_request_clone
	   Params: \%from_request, \%to_request

	   Return: 1 on	success, 0 on error

	   This	function takes the connection/request-specific values from the
	   given from_request hash, and	copies them to the to_request hash.

       utils_request_fingerprint
	   Params: \%request [,	$hash ]

	   Return: $fingerprint	[undef on error]

	   This	function constructs a 'fingerprint' of the given request by
	   using a cryptographic hashing function on the constructed original
	   HTTP	request.

	   Note: $hash can be 'md5' (default) or 'md4'.

       utils_flatten_lwhash
	   Params: \%lwhash

	   Return: $flat_version [undef	on error]

	   This	function takes a %request or %response libwhisker hash,	and
	   creates an approximate flat data string of the original request/
	   response (i.e. before it was	parsed into components and placed into
	   the libwhisker hash).

       utils_carp
	   Params: [ $package_name ]

	   Return: nothing

	   This function acts like Carp's carp function.  It warns with the
	   file and line number of the user's code which causes a problem.  It
	   traces up the call stack and reports the first function that is not
	   in the LW2 package or the optional $package_name package.

       utils_croak
	   Params: [ $package_name ]

	   Return: nothing

	   This function acts like Carp's croak function.  It dies with the
	   file and line number of the user's code which causes a problem.  It
	   traces up the call stack and reports the first function that is not
	   in the LW2 package or the optional $package_name package.

SEE ALSO
       LWP

COPYRIGHT
       Copyright 2009 Jeff Forristal

2.5				  2021-11-05				LW2(3)
