XPath(3)	      User Contributed Perl Documentation	      XPath(3)

NAME
       Class::XPath - adds xpath matching to object trees

SYNOPSIS
       In your node class, use Class::XPath:

	 # generate xpath() and	match()	using Class::XPath
	 use Class::XPath

	    get_name =>	'name',	       # get the node name with	the 'name' method

	    get_parent => 'parent',    # get parent with the 'parent' method

	    get_root   => \&get_root,  # call get_root($node) to get the root

	    get_children => 'kids',    # get children with the 'kids' method

	    get_attr_names => 'param', # get names and values of attributes
	    get_attr_value => 'param', # from param

	    get_content	   => 'data';  # get content from the 'data' method


       Now your	objects	support	XPath-esque matching:

	 # find	all pages, anywhere in the tree
	 @nodes	= $node->match('//page');

	 # returns an XPath like "/page[1]/paragraph[2]"
	 $xpath	= $node->xpath();

DESCRIPTION
       This module adds	XPath-style matching to	your object trees.  This means
       that you	can find nodes using an	XPath-esque query with "match()" from
       anywhere	in the tree.  Also, the	"xpath()" method returns a unique path
       to a given node which can be used as an identifier.

       To use this module you must already have	an OO implementation of	a
       tree.  The tree must be a true tree - all nodes have a single parent
       and the tree must have a	single root node.  Also, the order of children
       within a	node must be stable.
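
       As a rough illustration of these requirements, here is a minimal
       sketch of a node class.  The class name My::Node and its methods are
       hypothetical and are not part of Class::XPath; the sketch only shows
       one way to get a single parent per node, a single root and a stable
       order of children.

```perl
use strict;
use warnings;

package My::Node;    # hypothetical class name, not part of Class::XPath

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, parent => undef, kids => [] }, $class;
}

# Attach a child. Refusing a node that already has a parent keeps the
# structure a true tree; the array keeps the order of children stable.
sub add {
    my ($self, $kid) = @_;
    die "node '$kid->{name}' already has a parent\n" if defined $kid->{parent};
    $kid->{parent} = $self;
    push @{ $self->{kids} }, $kid;
    return $kid;
}

sub name   { return $_[0]{name} }
sub parent { return $_[0]{parent} }        # undef for the root node
sub kids   { return @{ $_[0]{kids} } }     # children, in insertion order

package main;

my $root = My::Node->new('book');
my $page = $root->add(My::Node->new('page'));
$page->add(My::Node->new('paragraph')) for 1 .. 3;

my @paragraphs = $page->kids;
printf "%d paragraphs under %s, root is %s\n",
    scalar @paragraphs, $page->name, $page->parent->name;
```

       The parent link here is a plain (strong) reference, which is enough
       for a short script; a long-running program would probably want to
       weaken it.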

       NOTE: This module is not	yet a complete XPath implementation.  Over
       time I expect the subset	of XPath supported to grow.  See the SYNTAX
       documentation for details on the	current	level of support.

USAGE
       This module is used by providing	it with	information about how your
       class works.  Class::XPath uses this information	to build the "match()"
       and "xpath()" methods for your class.  The parameters passed to 'use
       Class::XPath' may be set	with strings, indicating method	names, or
       subroutine references.  They are:

       get_name	(required)
	   Returns the name of this node.  This	will be	used as	the element
	   name	when evaluating	an XPath match.	 The value returned must
	   match /^[\w:]+$/.

       get_parent (required)
	   Returns the parent of this node.  The root node must	return undef
	   from	the get_parent method.

       get_children (required)
	   Returns a list of child nodes, in order.

       get_attr_names (required)
	   Returns a list of available attribute names.	 The values returned
	   must	match /^[\w:]+$/.

       get_attr_value (required)
	   Called with a single	parameter, the name of the attribute.  Returns
	   the value associated	with that attribute.  The value	returned must
	   be "undef" if no value exists for the attribute.

       get_content (required)
	   Returns the contents	of the node.  In XML this is text between
	   start and end tags.

       get_root	(required)
	   Returns the root node of this tree.

       call_match (optional)
	   Set this to the name	of the "match()" method	to generate.  Defaults
	   to 'match'.

       call_xpath (optional)
	   Set this to the name	of the "xpath()" method	to generate.  Defaults
	   to 'xpath'.
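
       The following sketch shows how the parameters above might fit
       together in one script.  My::Node, its accessors and the sample tree
       are assumptions made for illustration; only the option names come
       from the list above.  Most options are given as method names, and
       get_root is given as a subroutine reference to show both forms.

```perl
use strict;
use warnings;

package My::Node;    # hypothetical node class

# Strings name methods of this class; get_root shows the code-reference
# form, which is called with the node as its only argument.
use Class::XPath
    get_name       => 'name',
    get_parent     => 'parent',
    get_children   => 'kids',
    get_attr_names => 'param',
    get_attr_value => 'param',
    get_content    => 'data',
    get_root       => sub {
        my $node = shift;
        $node = $node->parent while defined $node->parent;
        return $node;
    };

sub new {
    my ($class, $name, %attr) = @_;
    return bless { name => $name, attr => {%attr}, kids => [], data => '' }, $class;
}

sub add {
    my ($self, $kid) = @_;
    $kid->{parent} = $self;
    push @{ $self->{kids} }, $kid;
    return $kid;
}

sub name   { return $_[0]{name} }
sub parent { return $_[0]{parent} }
sub kids   { return @{ $_[0]{kids} } }
sub data   { return $_[0]{data} }

# With no argument, return the attribute names; with a name, return that
# attribute's value (undef if it does not exist).
sub param {
    my ($self, $key) = @_;
    return defined $key ? $self->{attr}{$key} : keys %{ $self->{attr} };
}

package main;

my $root  = My::Node->new('book');
my $page1 = $root->add(My::Node->new('page', id => 1));
my $page2 = $root->add(My::Node->new('page', id => 2));
$page2->add(My::Node->new('paragraph')) for 1 .. 3;

my @pages = $root->match('//page');
my $third = ($page2->kids)[2];
print scalar(@pages), " pages; third paragraph is at ", $third->xpath, "\n";
```

       Using one 'param' method for both get_attr_names and get_attr_value
       mirrors the example at the top of this page: it is called without
       arguments for the names and with a name for a value.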

ALTERNATE USAGE
       If you're using someone else's OO tree module, and you don't want to
       subclass	it, you	can still use Class::XPath to add XPath	matching to
       it.  This is done by calling "Class::XPath->add_methods()" with all the
       options usually passed to "use" and one extra one, "target".  For
       example,	to add xpath() and match() to HTML::Element (the node class
       for HTML::TreeBuilder):

	 # add Class::XPath routines to	HTML::Element
	 Class::XPath->add_methods(target	  => 'HTML::Element',
				   get_parent	  => 'parent',
				   get_name	  => 'tag',
				   get_attr_names =>
				     sub { my %attr = shift->all_external_attr;
					   return keys %attr; },
				   get_attr_value =>
				     sub { my %attr = shift->all_external_attr;
					   return $attr{$_[0]};	},
				   get_children	  =>
				     sub { grep	{ ref $_ } shift->content_list },
				   get_content	  =>
				     sub { grep	{ not ref $_ } shift->content_list },
				   get_root	  =>
				     sub { local $_=shift;
					   while($_->parent) { $_ = $_->parent }
					   return $_; });

       Now you can load	up an HTML file	and do XPath matching on it:

	 my $root = HTML::TreeBuilder->new;
	 $root->parse_file('foo.html');	 # 'foo.html' is a placeholder file name

	 # get a list of all paragraphs
	 my @paragraphs	= $root->match('//p');

	 # get the title element
	 my ($title) = $root->match('/head/title');

GENERATED METHODS
       This module generates two public	methods	for your class:

       "@results = $node->match('/xpath/expression')"
	   This	method performs	an XPath match against the tree	to which this
	   node	belongs.  See the SYNTAX documentation for the range of
	   supported expressions.  The return value is either a	list of	node
	   objects, a list of values (when retrieving specific attributes) or
	   an empty list if no matches could be	found.	If your	XPath
	   expression cannot be	parsed then the	method will die.

	   You can change the name of this method with the 'call_match'	option
	   described above.

       "$xpath = $node->xpath()"
	   Get an xpath	to uniquely identify this node.	 Can be	used with
	   match() to find the element later.  The xpath returned is
	   guaranteed to be unique within the element tree.  For example, the
	   third node named "paragraph"	inside node named "page" has an	xpath
	   of the form "/page[1]/paragraph[2]" (indexes count from zero).

	   You can change the name of this method with the 'call_xpath'	option
	   described above.
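
       A short sketch of how the two methods might be used together
       follows.  The My::Node class and the tree are hypothetical; the
       behaviour relied on is only what is described above: xpath() gives a
       path that match() can resolve again, a query with no matches returns
       an empty list, and an expression that cannot be parsed makes match()
       die.

```perl
use strict;
use warnings;

package My::Node;    # hypothetical node class

use Class::XPath
    get_name       => 'name',
    get_parent     => 'parent',
    get_children   => 'kids',
    get_attr_names => 'param',
    get_attr_value => 'param',
    get_content    => 'data',
    get_root       => 'root';

sub new    { my ($class, $name) = @_; return bless { name => $name, kids => [], attr => {} }, $class }
sub add    { my ($self, $kid) = @_; $kid->{parent} = $self; push @{ $self->{kids} }, $kid; return $kid }
sub name   { return $_[0]{name} }
sub parent { return $_[0]{parent} }
sub kids   { return @{ $_[0]{kids} } }
sub data   { return '' }
sub root   { my $node = shift; $node = $node->parent while defined $node->parent; return $node }
sub param  { my ($self, $key) = @_; return defined $key ? $self->{attr}{$key} : keys %{ $self->{attr} } }

package main;

my $root = My::Node->new('book');
my $page = $root->add(My::Node->new('page'));
my $node = $page->add(My::Node->new('paragraph'));

# Store the path as an identifier, then look the node up again later.
my $path    = $node->xpath;
my ($again) = $root->match($path);
print "found the same node again via $path\n" if $again == $node;

# No node is called 'chapter', so the result is an empty list.
my @none = $root->match('//chapter');

# An unparseable expression dies, so wrap untrusted input in eval.
my $ok = eval { $root->match('page['); 1 };
print "match() died: $@" unless $ok;
```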

SYNTAX
       This module supports a small subset of XPath at the moment.  Here is a
       list of the type	of expressions it knows	about:

       .   Selects and returns the current node.

       name
       ./name
	   Selects a list of nodes called 'name' in the	tree below the current
	   node.

       /name
	   Selects a list of nodes called 'name' directly below	the root of
	   the tree.

       //name
	   Selects all nodes with a matching name, anywhere in the tree.

       parent/child/grandchild
	   Selects a list of grandchildren for all children of all parents.

       parent[1]/child[2]
	   Selects a single child by indexing into the children	lists.

       parent[-1]/child[0]
	   Selects the first child of the last parent.	In the real XPath they
	   spell this 'parent[last()]/child[0]'	but supporting the Perl	syntax
	   is practically free here.  Eventually I'll support the XPath	style
	   too.

       ../child[2]
	   Selects the second child from the parent of the current node.
	   Currently ..	only works at the start	of an XPath, mostly because I
	   can't imagine using it anywhere else.

       child[@id=10]
	   Selects the child node with an 'id' attribute of 10.

       child[@id>10]
	   Selects all the child nodes with an 'id' attribute greater than 10.
	   Other supported operators are '<', '<=', '>=' and '!='.

       child[@category="sports"]
	   Selects the child with a 'category' attribute of "sports".  The
	   value must be a quoted string (single or double) and	no escaping is
	   allowed.
       child[title="Hello World"]
	   Selects the child with a 'title' child element whose	content	is
	   "Hello World".  The value must be a quoted string (single or
	   double) and no escaping is allowed.	e.g.

	      <child>
		<title>Hello World</title>
	      </child>

       //title[.="Hello	World"]
	   Selects all 'title' elements	whose content is "Hello	World".

       child/@attr
	   Returns the list of values for all attributes "attr"	within each
	   child.

       //@attr
	   Returns the list of values for all attributes "attr"	within each
	   node.
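
       To make the expression types above more concrete, here is a small
       sketch that runs several of them against a toy tree.  The My::Node
       class, the 'catalog' tree and the particular expressions are this
       example's own assumptions; the attribute and content predicates
       follow the forms described in the list.

```perl
use strict;
use warnings;

package My::Node;    # hypothetical node class

use Class::XPath
    get_name       => 'name',
    get_parent     => 'parent',
    get_children   => 'kids',
    get_attr_names => 'param',
    get_attr_value => 'param',
    get_content    => 'data',
    get_root       => 'root';

sub new {
    my ($class, $name, $data, %attr) = @_;
    return bless { name => $name, data => $data, attr => {%attr}, kids => [] }, $class;
}
sub add    { my ($self, $kid) = @_; $kid->{parent} = $self; push @{ $self->{kids} }, $kid; return $kid }
sub name   { return $_[0]{name} }
sub parent { return $_[0]{parent} }
sub kids   { return @{ $_[0]{kids} } }
sub data   { return $_[0]{data} }
sub root   { my $node = shift; $node = $node->parent while defined $node->parent; return $node }
sub param  { my ($self, $key) = @_; return defined $key ? $self->{attr}{$key} : keys %{ $self->{attr} } }

package main;

# catalog -> three items, each with id/category attributes and a title child
my $root = My::Node->new('catalog', '');
for my $row ([5, 'news', 'Budget'], [10, 'sports', 'Hello World'], [15, 'sports', 'Final Score']) {
    my ($id, $category, $title) = @$row;
    my $item = $root->add(My::Node->new('item', '', id => $id, category => $category));
    $item->add(My::Node->new('title', $title));
}

# Run each expression from the root node and record how many results it gives.
my %count;
for my $expr ('item[@id=10]', 'item[@id>5]', 'item[@category="sports"]',
              'item[title="Hello World"]', '//title[.="Hello World"]', 'item/@id') {
    my @found = $root->match($expr);
    $count{$expr} = @found;
    printf "%-28s %d result(s)\n", $expr, scalar @found;
}
```

       Note the single quotes around the Perl strings: in double quotes
       Perl would try to interpolate @id and @category as arrays.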

       NOTE: this module has no	support	for Unicode.  If this is a problem for
       you please consider sending me a	patch.	I'm certain that I don't know
       enough about Unicode to do it right myself.

BUGS
       I know of no bugs in this module.  If you find one, please file a bug
       report at:

       Alternately you can email me directly at	 Please
       include the version of the module and a complete	test case that
       demonstrates the	bug.

TODO
       Planned future work:

       o   Support more	of XPath!

       o   Do more to detect broken get_* functions.  Maybe use	Carp::Assert
	   and a special mode for use during development?
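
       Until such checks exist in the module, something similar can be done
       by hand during development.  The sketch below is only an assumption
       about what such a check could look like: check_tree() is a
       hypothetical helper, not part of Class::XPath, and it tests just the
       name pattern and the parent links described under the get_name and
       get_parent parameters.

```perl
use strict;
use warnings;

# check_tree() is a hypothetical helper, not part of Class::XPath.
# %$get maps option names (get_name, get_parent, ...) to code references.
sub check_tree {
    my ($get, $node, $expected_parent) = @_;
    my @problems;

    my $name  = $get->{get_name}->($node);
    my $label = defined $name ? "'$name'" : '(unnamed node)';
    push @problems, "$label: name must match /^[\\w:]+\$/"
        unless defined $name && $name =~ /^[\w:]+$/;

    my $parent = $get->{get_parent}->($node);
    if (defined $expected_parent) {
        push @problems, "$label: get_parent does not return the containing node"
            unless defined $parent && $parent == $expected_parent;
    }
    elsif (defined $parent) {
        push @problems, "$label: the root must return undef from get_parent";
    }

    # Recurse, telling each child which parent it is expected to report.
    push @problems, check_tree($get, $_, $node) for $get->{get_children}->($node);
    return @problems;
}

# Plain hashes stand in for node objects here.
my $root = { name => 'book', kids => [] };
push @{ $root->{kids} },
    { name => 'page',     parent => $root, kids => [] },
    { name => 'bad name', parent => $root, kids => [] },   # space is not allowed
    { name => 'orphan',   parent => undef, kids => [] };   # parent link missing

my %get = (
    get_name     => sub { $_[0]{name} },
    get_parent   => sub { $_[0]{parent} },
    get_children => sub { @{ $_[0]{kids} } },
);

my @problems = check_tree(\%get, $root);
print "$_\n" for @problems;
```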

ACKNOWLEDGMENTS
       I would like to thank the creators of XPath for their fine work and the
       W3C for supporting them in their	efforts.

       The following people have sent me patches and/or	suggestions:

	 Tim Peoples
	 Mark Addison
	 Timothy Appnel

COPYRIGHT AND LICENSE
       Copyright (C) 2002 Sam Tregar

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl 5 itself.

AUTHOR
       Sam Tregar

SEE ALSO
       The XPath W3C Recommendation:

perl v5.32.0			  2004-02-29			      XPath(3)

