README(3)	      User Contributed Perl Documentation	     README(3)

       URI::Fast - A fast(er) URI parser

	 use URI::Fast qw(uri);

	 my $uri = uri '';

	 if ($uri->scheme =~ /http(s)?/) {
	   my @path  = $uri->path;
	   my $fnord = $uri->param('fnord');
	   my $foo   = $uri->param('foo');
	 }

	 if ($uri->path =~ /\/login/ && $uri->scheme ne 'https') {
	   $uri->param('upgraded', 1);
	 }

       "URI::Fast" is a	faster alternative to URI. It is written in C and
       provides	basic parsing and modification of a URI.

       URI is an excellent module; it is battle-tested,	robust,	and handles
       many edge cases.	As a result, it	is rather slower than it would
       otherwise be for	more trivial cases, such as inspecting the path	or
       updating	a single query parameter.

       Subroutines are exported	on demand.

       Accepts a URI string, minimally parses it, and returns a "URI::Fast"
       object.

       Note: passing a "URI::Fast" instance to this routine will cause the
       object to be interpolated into a	string (via "to_string"), effectively
       creating	a clone	of the original	"URI::Fast" object.

       Similar to "uri", but returns a "URI::Fast::IRI"	object.	A
       "URI::Fast::IRI"	differs	from a "URI::Fast" in that UTF-8 characters
       are permitted and will not be percent-encoded when modified.

       Behaves (hopefully) identically to URI::Split, but roughly twice as
       fast.

       See "ENCODING".

       If desired, both	"URI::Fast" and	URI::Fast::IRI may be instantiated
       using the default constructor, "new".

	 my $uri = URI::Fast->new('');

       All attributes serve as full accessors, allowing	the URI	segment	to be
       both retrieved and modified.

       Each attribute defines a	"raw_*"	method,	which returns the raw, encoded
       string value for	that attribute.

       Each attribute further has a matching clearer method ("clear_*")	which
       unsets its value.

       In general, accessors accept an unencoded string and set their slot
       value to the encoded value. They return the decoded value. See
       "ENCODING" for an in-depth description of their behavior as well as an
       explanation of the more complex behavior of compound fields.

       Gets or sets the scheme portion of the URI (e.g. "http"), excluding
       "://".

       The authorization section is composed of the username, password, host
       name, and port number.

       Setting this field may be done with a string (see the note below	about
       "ENCODING") or a	hash reference of individual field names ("usr",
       "pwd", "host", and "port"). In both cases, the existing values are
       completely replaced by the new values and any values missing from the
       caller-supplied input are deleted.


       The username segment of the authorization string. Updating this value
       alters "auth".


       The password segment of the authorization string. Updating this value
       alters "auth".


       The host	name segment of	the authorization string. May be a domain
       string or an IP address.	If the host is an IPV6 address,	it must	be
       surrounded by square brackets (per spec), which are included in the
       host string. Updating this value	alters "auth".


       The port	number segment of the authorization string. Updating this
       value alters "auth".

       In scalar context, returns the entire path string. In list context,
       returns a list of path segments,	split by "/".

	 my $uri = uri '/foo/bar';
	 my $path = $uri->path;	 # "/foo/bar"
	 my @path = $uri->path;	 # ("foo", "bar")

       The path may also be updated using either a string or an array ref of
       segments:

	 $uri->path(['foo', 'bar']);

       This differs from the behavior of "path_segments" in URI, which
       considers the leading slash separating the path from the	authority
       section to be an	individual segment. If this behavior is	desired, the
       lower level "split_path_compat" is available. "split_path_compat" (and
       its partner, "split_path"), always return an array reference.

	 my $uri = uri '/foo/bar';
	 $uri->split_path;	   # ['foo', 'bar'];
	 $uri->split_path_compat;  # ['', 'foo', 'bar'];
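
       The difference between the two splitting conventions can be pictured
       with a short core-Perl sketch. This is only an illustration of the
       documented results for '/foo/bar', not the module's C implementation;
       the helper names "split_path_sketch" and "split_path_compat_sketch"
       are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the single leading slash, then split on "/".
sub split_path_sketch {
  my ($path) = @_;
  (my $trimmed = $path) =~ s{^/}{};
  return [ split m{/}, $trimmed, -1 ];
}

# Hypothetical helper: split without trimming, so the leading slash
# yields an empty first segment (the URI-style "path_segments" view).
sub split_path_compat_sketch {
  my ($path) = @_;
  return [ split m{/}, $path, -1 ];
}

my $plain  = split_path_sketch('/foo/bar');
my $compat = split_path_compat_sketch('/foo/bar');

print join(',', map { "[$_]" } @$plain),  "\n";  # [foo],[bar]
print join(',', map { "[$_]" } @$compat), "\n";  # [],[foo],[bar]
```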

       In scalar context, returns the complete query string, excluding the
       leading "?". The	query string may be set	in several ways.

	 $uri->query("foo=bar&baz=bat"); # note: no percent-encoding performed
	 $uri->query({foo => 'bar', baz	=> 'bat'}); # foo=bar&baz=bat
	 $uri->query({foo => 'bar', baz	=> 'bat'}, ';'); # foo=bar;baz=bat

       In list context,	returns	a hash ref mapping query keys to array refs of
       their values (see "query_hash").

       Both '&'	and ';'	are treated as separators for key/value	parameters.

       The fragment section of the URI,	excluding the leading "#".

       Does a fast scan	of the query string and	returns	a list of unique
       parameter names that appear in the query	string.

       Both '&'	and ';'	are treated as separators for key/value	parameters.

       Scans the query string and returns a hash ref of	key/value pairs.
       Values are returned as an array ref, as keys may	appear multiple	times.
       Both '&'	and ';'	are treated as separators for key/value	parameters.

       May optionally be called	with a new hash	of parameters to replace the
       query string with, in which case	keys may map to	scalar values or
       arrays of scalar	values.	As with	all query setter methods, a third
       parameter may be	used to	explicitly specify the separator to use	when
       generating the new query	string.
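
       The shape of the returned structure (keys mapping to array refs of
       values) can be sketched in core Perl as follows. This is a rough
       illustration only: it performs no percent-decoding, the helper name
       "query_hash_sketch" is hypothetical, and giving bare keys an empty
       string value is an assumption rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::Fast: parse a raw query string
# into a hash ref of key => [values]. No percent-decoding is attempted.
sub query_hash_sketch {
  my ($query) = @_;
  my %params;
  for my $pair (split /[&;]/, $query) {
    next unless length $pair;           # skip empty pieces, e.g. a leading "&"
    my ($key, $value) = split /=/, $pair, 2;
    $value = '' unless defined $value;  # assumption: bare keys get an empty value
    push @{ $params{$key} }, $value;
  }
  return \%params;
}

my $params = query_hash_sketch('foo=bar;foo=baz&fnord=slack');
for my $key (sort keys %$params) {
  print "$key: @{ $params->{$key} }\n";
}
```

       Collecting every value into an array ref, even for keys that appear
       once, keeps the return type uniform for callers.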

       Gets or sets a parameter	value. Setting a parameter value will replace
       existing	values completely; the "query" string will also	be updated.
       Setting a parameter to "undef" deletes the parameter from the URI.

	 $uri->param('foo', ['bar', 'baz']);
	 $uri->param('fnord', 'slack');

	 my $value_scalar = $uri->param('fnord'); # fnord appears once
	 my @value_list	  = $uri->param('foo');	  # foo	appears	twice
	 my $value_scalar = $uri->param('foo');	  # croaks; expected single value but foo has multiple

	 # Delete parameter
	 $uri->param('foo', undef); # deletes foo

	 # Ambiguous cases
	 $uri->param('foo', '');  # foo=
	 $uri->param('foo', '0'); # foo=0
	 $uri->param('foo', ' '); # foo=%20

       Both '&'	and ';'	are treated as separators for key/value	parameters
       when parsing the	query string. An optional third	parameter explicitly
       selects the character used to separate key/value	pairs.

	 $uri->param('foo', 'bar', ';'); # foo=bar
	 $uri->param('baz', 'bat', ';'); # foo=bar;baz=bat

       When unspecified, '&' is chosen as the default. In either case, all
       separators in the query string will be normalized to the chosen
       separator.

	 $uri->param('foo', 'bar', ';'); # foo=bar
	 $uri->param('baz', 'bat', ';'); # foo=bar;baz=bat
	 $uri->param('fnord', 'slack');	 # foo=bar&baz=bat&fnord=slack
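
       Separator normalization, as described here, amounts to splitting the
       query on either separator and re-joining it with the chosen one. A
       minimal core-Perl sketch, assuming a hypothetical helper named
       "normalize_separator_sketch" (not a "URI::Fast" function):

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite every "&" or ";" separator in a raw query
# string to a single chosen separator, defaulting to "&".
sub normalize_separator_sketch {
  my ($query, $separator) = @_;
  $separator = '&' unless defined $separator;
  my @pairs = grep { length } split /[&;]/, $query;
  return join $separator, @pairs;
}

print normalize_separator_sketch('foo=bar;baz=bat&fnord=slack'), "\n"; # foo=bar&baz=bat&fnord=slack
print normalize_separator_sketch('foo=bar&baz=bat', ';'), "\n";        # foo=bar;baz=bat
```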

       Updates the query string	by adding a new	value for the specified	key.
       If the key already exists in the	query string, the new value is
       appended	without	altering the original value.

	 $uri->add_param('foo',	'bar');	# foo=bar
	 $uri->add_param('foo',	'baz');	# foo=bar&foo=baz

       This method is simply sugar for calling:

	 $uri->param('key', [$uri->param('key'), 'new value']);

       As with "param",	the separator character	may be specified as the	final
       parameter. The same caveats apply with regard to	normalization of the
       query string separator.

	 $uri->add_param('foo',	'bar', ';'); # foo=bar
	 $uri->add_param('foo',	'baz', ';'); # foo=bar;foo=baz

       Allows modification of the query	string in the manner of	a set, using
       keys without "=value", e.g. "foo&bar&baz". Accepts a hash ref of	keys
       to update.  A truthy value adds the key,	a falsey value removes it. Any
       keys not	mentioned in the update	hash are left unchanged.

	 my $uri = uri '&baz&bat';
	 $uri->query_keyset({foo => 1, bar => 1}); # baz&bat&foo&bar
	 $uri->query_keyset({baz => 0, bat => 0}); # foo&bar

       If there are key-value pairs in the query string as well, the behavior
       of this method becomes a little more complex. When a key is specified
       in the update hash ref, a truthy value will leave an existing
       key/value pair untouched. A falsey value will remove the key and
       its value.

	 my $uri = uri '&foo=bar&baz&bat';
	 $uri->query_keyset({foo => 1, baz => 0}); # foo=bar&bat

       An optional second parameter may	be specified to	control	the separator
       character used when updating the	query string. The same caveats apply
       with regard to normalization of the query string	separator.
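
       The set-like update rules can be sketched in core Perl as below. This
       is an illustration under assumptions, not the module's implementation:
       the helper name "keyset_sketch" is hypothetical, and newly added keys
       are appended in sorted order here purely to make the output
       deterministic (hash ordering in Perl is otherwise unspecified).

```perl
use strict;
use warnings;

# Hypothetical helper: apply a {key => truthy/falsey} update to a raw
# query string, treating bare keys as members of a set.
sub keyset_sketch {
  my ($query, $updates, $separator) = @_;
  $separator = '&' unless defined $separator;
  my %pending = %$updates;
  my @kept;
  for my $pair (grep { length } split /[&;]/, $query) {
    my ($key) = split /=/, $pair, 2;
    if (exists $updates->{$key}) {
      delete $pending{$key};            # key already present in the query
      next unless $updates->{$key};     # falsey: drop the key (and any value)
    }
    push @kept, $pair;                  # truthy or unmentioned: keep as-is
  }
  # Truthy keys that were not already present are added as bare keys.
  push @kept, grep { $pending{$_} } sort keys %pending;
  return join $separator, @kept;
}

print keyset_sketch('&baz&bat', {foo => 1, bar => 1}), "\n";         # baz&bat&bar&foo
print keyset_sketch('&foo=bar&baz&bat', {foo => 1, baz => 0}), "\n"; # foo=bar&bat
```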

       Serially	appends	path segments, query strings, and fragments, to	the
       end of the URI. Each argument is	added in order.	If the segment begins
       with "?", it is assumed to be a query string and	it is appended using
       "add_param". If the segment begins with "#", it is treated as a
       fragment, replacing any existing	fragment. Otherwise, the segment is
       treated as a path fragment and appended to the path.

	 my $uri = uri '';
	 $uri->append('bar', 'baz/bat',	'?k=v1&k=v2', '#fnord',	'slack');
	 # ''
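
       The dispatch rule described above (leading "?" means query, leading
       "#" means fragment, anything else is a path segment) can be sketched
       as a small classifier. The helper name "classify_segment_sketch" is
       hypothetical and exists only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how an appended argument would be treated,
# based only on its first character.
sub classify_segment_sketch {
  my ($segment) = @_;
  return 'query'    if $segment =~ /^\?/;
  return 'fragment' if $segment =~ /^#/;
  return 'path';
}

for my $segment ('bar', 'baz/bat', '?k=v1&k=v2', '#fnord', 'slack') {
  printf "%-12s => %s\n", $segment, classify_segment_sketch($segment);
}
```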

       Stringifies the URI, encoding output as necessary. String interpolation
       is overloaded.

   $uri	eq $other
       Compares	the URI	to another, returning true if the URIs are equivalent.
       Overloads the "eq" operator.

       Sugar for:

	 my $uri = uri '...';
	 my $clone = uri $uri;

       Builds an absolute URI from a relative URI and a base URI string.
       Adheres as strictly as possible to the rules for resolving a target URI
       in RFC 3986 section 5.2.
       Returns a new URI::Fast object representing the absolute, merged URI.

	 my $uri = uri('some/path')->absolute('');
	 $uri->to_string; # ""

       Builds a relative URI using a second URI (either a "URI::Fast" object
       or a string) as a base. Unlike "rel" in URI, it ignores differences in
       domain and scheme and assumes the caller wishes to adopt the base
       URL's instead. Aside from that difference, its behavior should mimic
       that of "rel" in URI.

	 my $uri = uri('')->relative('');
	 $uri->to_string; # "foo/bar"

	 my $uri = uri('')->relative('');
	 $uri->to_string; # "foo/bar/"

       Similar to "canonical" in URI, performs a minimal normalization on the
       URI. Only generic normalization described in the RFC is performed; no
       scheme-specific normalization is	done. Specifically, the	scheme and
       host members are	converted to lower case, dot segments are collapsed in
       the path, and any percent-encoded characters in the URI are converted
       to upper	case.
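
       Two of the generic steps named above, collapsing dot segments and
       upper-casing percent-encoded triplets, can be sketched in core Perl.
       The dot-segment routine follows the algorithm in RFC 3986 section
       5.2.4; it is an illustration, not the module's C code, and both helper
       names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: RFC 3986 section 5.2.4 "remove_dot_segments".
sub remove_dot_segments_sketch {
  my ($in) = @_;
  my $out = '';
  while (length $in) {
    if    ($in =~ s{^\.\.?/}{})          { }                        # A: leading "../" or "./"
    elsif ($in =~ s{^/\.(?:/|\z)}{/})    { }                        # B: "/./" or trailing "/."
    elsif ($in =~ s{^/\.\.(?:/|\z)}{/})  { $out =~ s{/?[^/]*\z}{} } # C: "/../" pops a segment
    elsif ($in =~ /^\.\.?\z/)            { $in = '' }               # D: lone "." or ".."
    else  { $in =~ s{^(/?[^/]*)}{}; $out .= $1 }                    # E: move one segment over
  }
  return $out;
}

# Hypothetical helper: upper-case the hex digits of percent-encoded triplets.
sub uppercase_escapes_sketch {
  my ($text) = @_;
  (my $copy = $text) =~ s/%([0-9A-Fa-f]{2})/'%' . uc($1)/ge;
  return $copy;
}

print remove_dot_segments_sketch('/a/b/c/./../../g'), "\n";  # /a/g
print uppercase_escapes_sketch('a%2fb%c3%a9'), "\n";         # a%2Fb%C3%A9
```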

       "URI::Fast" tries to do the right thing in most cases with regard to
       reserved	and non-ASCII characters. "URI::Fast" will fully encode
       reserved	and non-ASCII characters when setting individual values	and
       return their fully decoded values. However, the "right thing" is
       somewhat	ambiguous when it comes	to setting compound fields like
       "auth", "path", and "query".

       When setting compound fields with a string value, reserved characters
       are expected to be present, and are therefore accepted as-is. Any non-
       ASCII characters	will be	percent-encoded	(since they are	unambiguous
       and there is no risk of double-encoding them). Thus,

	 print $uri->auth; # ""

       On the other hand, when setting these fields with a reference value
       (assumed	to be a	hash ref for "auth" and	"query"	or an array ref	for
       "path"; see individual methods' docs for	details), each field is	fully
       percent-encoded,	just as	if each	individual simple slot's setter	had
       been called:

	 $uri->auth({usr => 'some one',	host =>	''});
	 print $uri->auth; # ""
	 print $uri->usr; # "some one"

       The same goes for return values. For compound fields returning a
       string, non-ASCII characters are decoded but reserved characters are
       not. When returning a list or reference of the deconstructed field,
       both reserved and non-ASCII characters in the individual values are
       decoded.

   '+' vs '%20'
       Although	no longer part of the standard,	"+" is commonly	used as	the
       encoded space character (rather than %20); it is	still official to the
       "application/x-www-form-urlencoded" type, and is	treated	as a space by

       Percent-encodes a string for use in a URI. By default, both reserved
       characters ("! * ' ( ) ; : @ & = + $ , / ? # [ ] %") and UTF-8
       characters are encoded.

       A second	(optional) parameter provides a	string containing any
       characters the caller does not wish to be encoded. An empty string will
       result in the default behavior described	above.

       For example, to encode all characters in	a query-like string except for
       those used by the query:

	 my $encoded = URI::Fast::encode($some_string, '?&=');
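
       The effect of the allowed-characters argument can be pictured with a
       core-Perl sketch. It assumes that every byte outside the RFC 3986
       unreserved set is encoded unless listed as allowed; that is a
       simplification for illustration and may differ from the module's
       exact rules. The helper name "percent_encode_sketch" is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode the UTF-8 bytes of a string, leaving
# unreserved characters and any caller-allowed characters untouched.
sub percent_encode_sketch {
  my ($text, $allowed) = @_;
  $allowed = '' unless defined $allowed;
  my %keep = map { $_ => 1 } split //, $allowed;
  my $bytes = $text;
  utf8::encode($bytes);  # work on UTF-8 bytes, not characters
  $bytes =~ s/([^A-Za-z0-9\-._~])/$keep{$1} ? $1 : sprintf('%%%02X', ord $1)/ge;
  return $bytes;
}

print percent_encode_sketch('a b&c=d'), "\n";          # a%20b%26c%3Dd
print percent_encode_sketch('a b&c=d?', '?&='), "\n";  # a%20b&c=d?
```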

       Decodes a percent-encoded string.

	 my $decoded = URI::Fast::decode($some_string);
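
       For comparison, percent-decoding can be sketched in core Perl as the
       reverse operation. The helper name "percent_decode_sketch" and its
       "plus_as_space" option are hypothetical; the option only illustrates
       the "application/x-www-form-urlencoded" convention of reading "+" as
       a space and makes no claim about how the module handles it.

```perl
use strict;
use warnings;

# Hypothetical helper: decode %XX triplets to bytes, then interpret the
# result as UTF-8. Optionally treat "+" as an encoded space first.
sub percent_decode_sketch {
  my ($text, %opt) = @_;
  my $bytes = $text;
  $bytes =~ tr/+/ / if $opt{plus_as_space};
  $bytes =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
  utf8::decode($bytes);  # leaves the bytes unchanged if they are not valid UTF-8
  return $bytes;
}

print percent_decode_sketch('bar%20baz'), "\n";                  # bar baz
print percent_decode_sketch('bar+baz', plus_as_space => 1), "\n"; # bar baz
```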

       These are aliases of "encode" and "decode", respectively. They were
       added to make BLUEFEET happy after he made fun of me for naming
       "encode" and "decode" too generically.
       In fact, these were originally aliased as "url_encode" and
       "url_decode", but due to some pedantic whining on the part of BGRIMM,
       they have been renamed to "uri_encode" and "uri_decode".

       Traverses a data	structure, escaping or unescaping defined scalar
       values in place.	Accepts	a reference to be traversed. Any further
       parameters are passed unchanged to "encode" or "decode".	Croaks if the
       input to	escape/unescape	is a non-reference value.

	 my $obj = {
	   foo => ['bar baz', 'bat%fnord'],
	   bar => {baz => 'bat%bat'},
	   baz => undef,
	   bat => '',
	 };

	 URI::Fast::escape_tree($obj);

	 # $obj is now:
	 # {
	 #   foo => ['bar%20baz', 'bat%25fnord'],
	 #   bar => {baz => 'bat%25bat'},
	 #   baz => undef,
	 #   bat => '',
	 # }

	 URI::Fast::unescape_tree($obj); # $obj returned to original form

	 URI::Fast::escape_tree($obj, '%'); # escape but allow "%"

	 # $obj is now:
	 # {
	 #   foo => ['bar%20baz', 'bat%fnord'],
	 #   bar => {baz => 'bat%bat'},
	 #   baz => undef,
	 #   bat => '',
	 # }
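
       The in-place traversal can be sketched in core Perl as a recursive
       walk that rewrites defined scalar leaves and skips "undef". This is an
       illustration only: the names "transform_tree_sketch" and
       "escape_space_and_percent" are hypothetical, and the toy escaper
       handles just spaces and "%" rather than the full encoding rules.

```perl
use strict;
use warnings;

# Hypothetical helper: apply $transform to every defined scalar leaf of a
# nested hash/array structure, modifying the structure in place.
sub transform_tree_sketch {
  my ($node, $transform) = @_;
  die "expected a reference\n" unless ref $node;
  my @slots = ref($node) eq 'HASH'  ? (values %$node)  # values() aliases the elements
            : ref($node) eq 'ARRAY' ? (@$node)
            :                         ();
  # Copying into @slots would break aliasing, so loop over the containers directly.
  if (ref($node) eq 'HASH') {
    for my $value (values %$node) {
      if    (ref $value)     { transform_tree_sketch($value, $transform) }
      elsif (defined $value) { $value = $transform->($value) }
    }
  }
  elsif (ref($node) eq 'ARRAY') {
    for my $value (@$node) {
      if    (ref $value)     { transform_tree_sketch($value, $transform) }
      elsif (defined $value) { $value = $transform->($value) }
    }
  }
  return $node;
}

# Toy escaper for the demonstration: encodes only " " and "%".
sub escape_space_and_percent {
  my ($text) = @_;
  $text =~ s/([ %])/sprintf('%%%02X', ord $1)/ge;
  return $text;
}

my $tree = { foo => ['bar baz', 'bat%fnord'], bar => { baz => 'bat%bat' }, baz => undef, bat => '' };
transform_tree_sketch($tree, \&escape_space_and_percent);
print "$tree->{foo}[0] $tree->{foo}[1] $tree->{bar}{baz}\n";  # bar%20baz bat%25fnord bat%25bat
```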

       See URI::Fast::Benchmarks.

       URI The de facto standard.

       RFC 3986
	   The official standard.

       Thanks to ZipRecruiter for encouraging
       their employees to contribute back to the open source ecosystem.
       Without their dedication	to quality software development	this
       distribution would not exist.

       The following people have contributed to this module with patches, bug
       reports, API advice, identifying areas where the documentation is
       unclear, or by making fun of me for naming certain methods too
       generically:

       Andy Ruder
       Aran Deltac (BLUEFEET)
       Ben Grimm (BGRIMM)
       Dave Hubbard (DAVEH)
       James Messrie
       Martin Locklear
       Randal Schwartz (MERLYN)
       Sara Siegal (SSIEGAL)
       Tim Vroom (VROOM)
       Des Daignault (NAWGLAN)
       Josh Rosenbaum

       Jeff Ober

       This software is	copyright (c) 2018 by Jeff Ober. This is free
       software; you can redistribute it and/or	modify it under	the same terms
       as the Perl 5 programming language system itself.

perl v5.32.0			  2019-08-02			     README(3)

