XML::Smart(3)	      User Contributed Perl Documentation	 XML::Smart(3)

NAME
       XML::Smart - A smart, easy and powerful way to access or create XML
       from files, data and URLs.

VERSION
       Version 1.79

DESCRIPTION
       This module provides an easy way to access/create XML data. It's based
       on a HASH tree created from the XML data, and enables dynamic access to
       it through the standard Perl syntax for Hash and	Array, without
       necessarily caring about	which you are working with. In other words,
       each point in the tree works as a Hash and an Array at the same time!

       This module additionally	provides special resources such	as: search for
       nodes by	attribute, select an attribute value in	each multiple node,
       change the returned format, and so on.

       The module also automatically handles binary data (encoding/decoding
       to/from base64), CDATA (such as content containing <tags>) and Unicode.
       It can be used to create XML files, load XML from the Web (just by
       using a URL as the file path) and has an easy way to send XML data
       through sockets, by adding the length of the data to the <?xml?> header.

       You can use XML::Smart with XML::Parser, or with the 2 standard parsers
       of XML::Smart:

       XML::Smart::Parser
       XML::Smart::HTMLParser

       XML::Smart::HTMLParser can be used to load/parse wild/bad XML data, or
       HTML tags.

Tutorial and F.A.Q.
       You can find some extra documents about XML::Smart at:

       XML::Smart::Tutorial - Tutorial and examples for	XML::Smart.
       XML::Smart::FAQ	    - Frequently Asked Questions about XML::Smart.

USAGE
         ## Create the object and load the file:
	 my $XML = XML::Smart->new('file.xml') ;

	 ## Force the use of the parser	'XML::Smart::Parser'.
	 my $XML = XML::Smart->new('file.xml' ,	'XML::Smart::Parser') ;

	 ## Get	from the web:
	 my $XML = XML::Smart->new('') ;

	 ## Cut	the root:
	 $XML =	$XML->cut_root ;

	 ## Or change the root:
	 $XML =	$XML->{hosts} ;

	 ## Get	the address [0]	of server [0]:
	 my $srv0_addr0	= $XML->{server}[0]{address}[0]	;
	 ## ...or...
	 my $srv0_addr0	= $XML->{server}{address} ;

         ## Get the server where the attribute 'type' eq 'suse':
	 my $server = $XML->{server}('type','eq','suse') ;

	 ## Get	the address again:
	 my $addr1 = $server->{address}[1] ;
	 ## ...or...
	 my $addr1 = $XML->{server}('type','eq','suse'){address}[1] ;

	 ## Get	all the	addresses of a server:
	 my @addrs = @{$XML->{server}{address}}	;
	 ## ...or...
	 my @addrs = $XML->{server}{address}('@') ;

	 ## Get	a list of types	of all the servers:
	 my @types = $XML->{server}('[@]','type') ;

	 ## Add	a new server node:
	 my $newsrv = {
	 os	 => 'Linux' ,
	 type	 => 'Mandrake' ,
	 version => 8.9	,
         address => [qw()]
	 } ;

	 push(@{$XML->{server}}	, $newsrv) ;

	 ## Get/rebuild	the XML	data:
	 my $xmldata = $XML->data ;

	 ## Save in some file:
	 $XML->save('newfile.xml') ;

	 ## Send through a socket:
         print $socket $XML->data(length => 1) ; ## add 'length' to the XML header so the
                                                 ## reader on the socket knows how much data to read.

	 <?xml version="1.0" encoding="iso-8859-1"?>
	   <server os="linux" type="redhat" version="8.0">
	   <server os="linux" type="suse" version="7.0">
	   <server address="" os="linux" type="conectiva" version="9.0"/>

METHODS
   new (FILE|DATA|URL , PARSER , OPTIONS)
       Create an XML object.


                 The first argument can be:

                   - XML data as string.
                   - File path.
                   - File Handle (GLOB).
                   - URL (Need LWP::UserAgent).

                 If not passed, a null XML tree is started, where you should
                 create your own XML data, then build/save/send it.

       PARSER (optional)
                 Set the XML parser to use. Options:

                   XML::Parser
                   XML::Smart::Parser
                   XML::Smart::HTMLParser

                 XML::Smart::Parser can only handle basic XML data (it does
                 not support PCDATA or declarations such as ENTITY, NOTATION,
                 etc.), but it is a good choice when you don't want to install
                 big modules to parse XML, since it comes with the main
                 module. It can still handle CDATA and binary data.

		 ** See	"PARSING HTML as XML" for XML::Smart::HTMLParser.

		 Aliases for the options:

		   SMART|REGEXP	  => XML::Smart::Parser
		   HTML		  => XML::Smart::HTMLParser


		 If not	set it will look for XML::Parser and load it.  If
		 XML::Parser can't be loaded it	will use XML::Smart::Parser,
		 which is actually a clone of XML::Parser::Lite	with some

       OPTIONS   You can force upper case or lower case for tags (nodes)
                 and arguments (attributes), and set other extra options.

                 lowtag    Make the tags lower case.

                 lowarg    Make the arguments lower case.

                 upertag   Make the tags upper case.

                 uperarg   Make the arguments upper case.

                 arg_single
                           Set the value of arguments to 1 when they have an
                           undef value.

                           ** This option works only when the XML is parsed
                           by XML::Smart::HTMLParser, since it accepts
                           arguments without values:

                             my $xml = XML::Smart->new(
                             '<root><foo arg1="" flag></root>' ,
                             'XML::Smart::HTMLParser' ,
                             arg_single => 1 ,
                             ) ;

                           In this example the option "arg_single" is used,
                           which sets flag to 1, while arg1 still has a
                           null string value ("").

			   Here's the tree of the example above:

                             'root' => {
                                         'foo' => {
                                                    'flag' => 1,
                                                    'arg1' => ''
                                                  },
                                       },

                 use_spaces
                           Accept contents that have only spaces.

                 on_start (CODE) *optional
                           Code/sub to call at the start of a tag.

                           ** This will be called after XML::Smart parses the
                           tag, and should be used only if you want to change
                           the tree.

                 on_char (CODE) *optional
                           Code/sub to call on content.

                           ** This will be called after XML::Smart parses the
                           tag, and should be used only if you want to change
                           the tree.

                 on_end (CODE) *optional
                           Code/sub to call at the end of a tag.

                           ** This will be called after XML::Smart parses the
                           tag, and should be used only if you want to change
                           the tree.

                 ** These options are applied when the XML data is loaded. For
                 XML generation see data() OPTIONS.

       Examples	of use:

	 my $xml_from_url = XML::Smart->new("") ;


	 my $xml_from_str = XML::Smart->new(q`<?xml version="1.0" encoding="iso-8859-1"	?>
	   <foo	arg="xyz"/>
	 `) ;


	 my $null_xml =	XML::Smart->new() ;


         my $xml_from_html = XML::Smart->new($html_data , 'html' ,
         lowtag => 1 ,
         lowarg => 1 ,
         on_char => sub {
                      my ( $tag , $pointer , $pointer_back , $cont) = @_ ;
                      $pointer->{extra_arg} = 123 ; ## add an extra argument.
                      $pointer_back->{$tag}{extra_arg} = 123 ; ## Same, but using the previous pointer.
                      $$cont .= "\n" ; ## append data to the content.
                    } ,
         ) ;

   apply_dtd (DTD , OPTIONS)
       Apply the DTD to	the XML	tree.

       DTD can be a source, file, GLOB or URL.

       This method is useful if you need the XML generated by data() to be
       formatted according to a specific DTD: elements will automatically
       become nodes, attributes will be checked, required elements and
       attributes will be created, the element order will be set, etc.


       no_delete BOOL
                 If TRUE, elements and attributes that are not defined in the
                 DTD won't be deleted from the XML tree.

       Example of use:

         $XML->apply_dtd(q`
         <!DOCTYPE cds [
         <!ELEMENT cds (album+)>
         <!ATTLIST cds
                   creator  CDATA
                   date     CDATA #REQUIRED
                   type     (a|b|c) #REQUIRED "a"
         >
         <!ELEMENT album (#PCDATA)>
         ]>
         ` ,
         no_delete => 1 ,
         ) ;

   args()
       Return the arguments names (not nodes).

   args_values()
       Return the arguments values (not nodes).

   back()
       Move the pointer back one level in the tree.

       ** See base().

   base()
       Get back to the base of the tree.

       Each query to the XML::Smart object returns an object pointing to a
       different place in the tree (all sharing the same HASH tree). So, you
       can get the main object again (an object that points to the base):

         my $srv = $XML->{root}{host}{server} ;
         my $addr = $srv->{address} ;
	 my $XML2 = $srv->base() ;

   content()
       Return the content of a node:

	 ## Data:
	 <foo>my content</foo>

	 ## Access:

	 my $content = $XML->{foo}->content ;
	 print "<<$content>>\n"	; ## show: <<my	content>>

	 ## or just:
	 my $content = $XML->{foo} ;

       Also can	be used	with multiple contents:

       For this	XML data:

	 <tag1 arg="1"/>

       Getting all the content:

	 my $all_content = $XML->{root}->content ;
	 print "[$all_content]\n" ;




       Getting in parts:

	 my @contents =	$XML->{root}->content ;
         print "[$contents[0]]\n" ;
         print "[$contents[1]]\n" ;



       Setting multiple	contents:

	 $XML->{root}->content(0,"aaaaa") ;
	 $XML->{root}->content(1,"bbbbb") ;

       Output now will be:


       And now the XML data generated will be:

	 <root>aaaaa<tag1 arg="1"/>bbbbb</root>

   copy()
       Return a copy of the XML::Smart object (pointing to the base).

       ** This is good when you want to keep 2 versions of the same XML tree
       in memory, since one object can't change the tree of the other!

       WARNING: set_node(), set_cdata() and set_binary() changes are not
       persistent over copy. Once you create a second copy these states are
       lost.

       WARNING: do not copy after apply_dtd() unless you have checked for
       DTD errors.

   cut_root()
       Cut the root key:

         my $srv = $XML->{rootx}{host}{server} ;

         ## Or if you don't know the root name:
         $XML = $XML->cut_root() ;
         my $srv = $XML->{host}{server} ;

       ** Note that this will cut the root of the pointer in the tree. So, if
       you are in some place that has more than one key (multiple roots), the
       same object will be returned without cutting anything.

   data	(OPTIONS)
       Return the data of the XML object (rebuilding it).


       nodtd      Do not add to the XML content the DTD applied by the method
                  apply_dtd().

       noident    If set to true the data isn't indented.

       nospace    If set to true the data isn't indented and doesn't have
                  space between the tags (unless the CONTENT has).

       lowtag     Make the tags lower case.

       lowarg     Make the arguments lower case.

       upertag    Make the tags upper case.

       uperarg    Make the arguments upper case.

       length	  If set true, add the attribute 'length' with the size	of the
		  data to the xml header (<?xml	...?>).	 This is useful	when
		  you send the data through a socket, since the	socket can
		  know the total amount	of data	to read.

       noheader	  Do not add  the <?xml	...?> header.

       nometagen  Do not add the meta generator	tag: <?meta
		  generator="XML::Smart" ?>

       meta       Set the meta tags of the XML document.

                      my $meta = {
                      build_from => "wxWindows 2.4.0" ,
                      file => "wx26.htm" ,
                      } ;

                      print $XML->data( meta => $meta ) ;

                      <?meta build_from="wxWindows 2.4.0" file="wx26.htm" ?>

                  Multiple meta:

                      my $meta = [
                      {build_from => "wxWindows 2.4.0" , file => "wx26.htm" } ,
                      {script => "" , ver => "1.0" } ,
                      ] ;

                      <?meta build_from="wxWindows 2.4.0" file="wx26.htm" ?>
                      <?meta script="" ver="1.0" ?>

                  Or set the meta tag directly:

                      my $meta = '<?meta foo="bar" ?>' ;

                      ## For multiple:
                      my $meta = ['<?meta foo="bar" ?>' , '<?meta x="1" ?>'] ;

                      print $XML->data( meta => $meta ) ;

       decode     As of VERSION 1.73 there are three different base64
                  encodings that are used. They are picked based on which of
                  them supports the data provided. If you want to retrieve
                  data using the 'data' function the resultant xml will have
                  dt:dt="binary.based" contained within it. To retrieve the
                  decoded data use: $XML->data( decode => 1 )

       tree       Set the HASH tree to parse. If not set, the tree of the
                  XML::Smart object (tree()) is used.

       wild       Accept wild tags and arguments.

                  ** This won't fix wrong keys and tags.

       sortall	  Sort all the tags alphabetically. If not set will keep the
		  order	of the document	loaded,	or the order of	tag creation.
		  Default: off

   data_pointer	(OPTIONS)
       Make the tree from the current point in the XML tree (not from the
       base as data() does).

       Accepts the same OPTIONS as the method data().

   dump_tree()
       Dump the tree of the object using Data::Dumper.

   dump_tree_pointer()
       Dump the tree of the object, from the pointer, using Data::Dumper.

   dump_pointer()
       ** Same as dump_tree_pointer().

   i()
       Return the index of the value.

       ** If the value is from a hash key (not an ARRAY ref) undef is
       returned.

   is_node()
       Return true if a key is a node.

   key()
       Return the key of the value.

       If wantarray, return the index too: return(KEY , I) ;

   nodes()
       Return the nodes (objects) in the pointer (keys that aren't arguments).

   nodes_keys()
       Return the nodes names (not the object) in the pointer (keys that
       aren't arguments).

   null()
       Return true if the XML object has a null tree or if the pointer is in
       some place that doesn't exist.

   order()
       Return the order of the keys. See set_order().

   path()
       Return the path of the pointer.



       Note that the index is 0 based and 'address' can be an attribute or a
       node, which is not compatible with XPath.

       ** See path_as_xpath().

   path_as_xpath()
       Return the path of the pointer in the XPath format.

   pointer()
       Return the HASH tree from the pointer.

   pointer_ok()
       Return a copy of the tree of the object, from the pointer, but without
       internal keys added by XML::Smart.

   root()
       Return the ROOT name of the XML tree (main key).

       ** See also key() for sub nodes.

   save()
       Save the XML data inside a file.

       Accepts the same OPTIONS as the method data().

   set_auto()
       Define the key to be handled automatically. So, data() will decide
       automatically whether it's a node, content or attribute.

       ** This method is useful to remove set_node(), set_cdata() and
       set_binary() changes.

   set_auto_node()
       Define the key as a node, and data() will decide automatically whether
       it's CDATA or BINARY.

       ** This method is useful to remove set_cdata() and set_binary()
       changes.

   set_binary(BOOL)
       Define the node as BINARY content when TRUE, or force it not to be
       handled as BINARY when FALSE.

       Example of node handled as BINARY:

	 <root><foo dt:dt="binary.base64">PGgxPnRlc3QgAzwvaDE+</foo></root>

       Original	content	of foo (the base64 data):

	 <h1>test \x03</h1>
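
       The relation between the base64 text and the original content shown
       above can be checked with the core MIME::Base64 module. This is only a
       sketch of the encoding step, not XML::Smart's own code, and the
       variable names are illustrative:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Original content of <foo>: text plus one control character (0x03).
my $content = "<h1>test \x03</h1>";

# Encode without a trailing newline (the second argument is the line ending).
my $encoded = encode_base64($content, '');
print "$encoded\n";    # prints PGgxPnRlc3QgAzwvaDE+

# Decoding gives back the original bytes.
my $decoded = decode_base64($encoded);
print $decoded eq $content ? "round trip ok\n" : "mismatch\n";
```

       The control character is what makes this content unsuitable for plain
       XML text, which is why it travels as base64.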

   set_cdata(BOOL)
       Define the node as CDATA when TRUE, or force it not to be handled as
       CDATA when FALSE.

       Example of CDATA node:

         <root><foo><![CDATA[bla bla bla <tag> bla bla]]></foo></root>

   set_node(BOOL)
       Set/unset the current key as a node (tag).

       ** If BOOL is not defined, TRUE is used.

       WARNING: You cannot call set_node(), copy the object and then call
       set_node(0) to unset the node.

   set_order()
       Set the order of the keys (nodes and attributes) at this point.

   set_tag()
       Same as set_node().

   tree()
       Return the HASH tree of the XML data.

       ** Note that the real HASH tree is returned here. All the other ways
       return an object that works like a HASH/ARRAY through tie.

   tree_pointer()
       Same as pointer().

   tree_ok()
       Return a copy of the tree of the object, but without internal keys
       added by XML::Smart, like /order and /nodes.

   tree_pointer_ok()
       Return a copy of the tree of the object, from the pointer, but without
       internal keys added by XML::Smart.

   xpath() || XPath()
       Return an XML::XPath object, based on the XML root in the tree.

         ## look from the root:
         my $data = $XML->XPath->findnodes_as_string('/') ;

       ** Requires XML::XPath to be installed; it is loaded only when needed.

   xpath_pointer() || XPath_pointer()
       Return an XML::XPath object, based on the XML::Smart pointer in the
       tree.

         ## look from this point, so XPath '/' actually starts at /server/:

         my $srvs = $XML->{server} ;
         my $data = $srvs->XPath_pointer->findnodes_as_string('/') ;

       ** Requires XML::XPath to be installed; it is loaded only when needed.

   ANNIHILATE
       XML::Smart uses XML::XPath, which, for performance reasons, leaks
       memory. To ensure that this memory is freed you can explicitly call
       ANNIHILATE before the XML::Smart object goes out of scope.

ACCESS
       To access the data you use the object in a way similar to HASH and
       ARRAY:

         my $XML = XML::Smart->new('file.xml') ;

         my $server = $XML->{server} ;

       But when you get a key {server}, you are actually accessing the data
       through tie(), not directly in the HASH tree inside the object (this
       fixes wrong accesses):

	 ## {server} is	a normal key, not an ARRAY ref:

	 my $server = $XML->{server}[0]	; ## return $XML->{server}
	 my $server = $XML->{server}[1]	; ## return UNDEF

	 ## {server} has an ARRAY with 2 items:

	 my $server = $XML->{server} ;	  ## return $XML->{server}[0]
	 my $server = $XML->{server}[0]	; ## return $XML->{server}[0]
	 my $server = $XML->{server}[1]	; ## return $XML->{server}[1]

       To get all the values of	multiple elements/keys:

         ## This works with only a string inside {address}, or with an ARRAY ref:
         my @addresses = @{$XML->{server}{address}} ;

   Select search
       When you don't know the position of the nodes, you can select them by
       some attribute value:

	 my $server = $XML->{server}('type','eq','suse') ; ## return $XML->{server}[1]

       Syntax for the select search:

         (NAME , CONDITION , VALUE)

       NAME	 The attribute name in the node	(tag).

       CONDITION Can be

		   eq  ne  ==  !=  <=  >=  <  >

		 For REGEX:

		   =~  !~

		   ## Case insensitive:
		   =~i !~i

       VALUE	 The value.

		 For REGEX use like this:

		   $XML->{server}('type','=~','^s\w+$')	;
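
       The meaning of (NAME , CONDITION , VALUE) can be sketched on plain
       Perl data. The helper below is hypothetical and is not how XML::Smart
       implements the search; it covers only a few of the conditions and
       returns every matching node rather than an object:

```perl
use strict;
use warnings;

# Hypothetical table mapping a CONDITION string to a comparison.
my %TEST = (
    'eq'  => sub { my ($have, $want) = @_; $have eq $want },
    'ne'  => sub { my ($have, $want) = @_; $have ne $want },
    '=='  => sub { my ($have, $want) = @_; $have == $want },
    '=~'  => sub { my ($have, $want) = @_; $have =~ /$want/ },
    '=~i' => sub { my ($have, $want) = @_; $have =~ /$want/i },
);

# Return the nodes (hash refs) whose attribute NAME satisfies CONDITION VALUE.
sub select_nodes {
    my ($nodes, $name, $cond, $value) = @_;
    my $test = $TEST{$cond} or die "unsupported condition '$cond'";
    return grep { defined $_->{$name} && $test->($_->{$name}, $value) } @$nodes;
}

my @servers = (
    { os => 'linux', type => 'redhat', version => '8.0' },
    { os => 'linux', type => 'suse',   version => '7.0' },
);

my ($suse) = select_nodes(\@servers, 'type', 'eq', 'suse');
print "$suse->{type} $suse->{version}\n";    # prints: suse 7.0

my @s_types = select_nodes(\@servers, 'type', '=~', '^s\w+$');
print scalar(@s_types), "\n";                # prints: 1
```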

   Select attributes in	multiple nodes:
       You can get the list of values of an attribute by looking in all the
       multiple nodes:
	 ## Get	all the	server types:
	 my @types = $XML->{server}('[@]','type') ;

       Also as:

	 my @types = $XML->{server}{type}('<@')	;

       Without the resource:

         my @list ;
         my @servers = @{$XML->{server}} ;

         foreach my $servers_i ( @servers ) {
           push(@list , $servers_i->{type} ) ;
         }

   Return format
       You can change the returned format:

         (TYPE)

       Where TYPE can be:

	 $  ## the content.
	 @  ## an array	(list of multiple values).
	 %  ## a hash.
	 .  ## The exact point in the tree, not	an object.

         $@  ## an array, but with the contents, not objects.
         $%  ## a hash, but the values are the contents, not objects.

	 ## The	use of $@ and $% is good if you	don't want to keep the object
	 ## reference (and save	memory).

	 @keys	## The keys of the node. note that if you have a key with
		## multiple nodes, it will be replicated (this is the
		## difference of "keys %{$this->{node}}" ).

	 <@ ## Return the attribute in the previous node, but looking for
	    ## multiple	nodes. Example:

	 my @names = $this->{method}{wxFrame}{arg}{name}('<@') ;
	 #### @names = (parent , id , title) ;

	 <xml> ## Return a XML data from this point.

	   <wxFrame return="wxFrame">
	     <arg name="parent"	type="wxWindow"	/>
	     <arg name="id" type="wxWindowID" />
	     <arg name="title" type="wxString" />


         ## A server's content
	 my $name = $XML->{server}{name}('$') ;
	 ## ...	or:
	 my $name = $XML->{server}{name}->content ;
	 ## ...	or:
	 my $name = $XML->{server}{name} ;
	 $name = "$name" ;

	 ## All	the servers
	 my @servers = $XML->{server}('@') ;
	 ## ...	or:
	 my @servers = @{$XML->{server}} ;

	 ## It still has the object reference:
         $servers[0]->{name} ;

	 ## Without the	reference:
	 my @servers = $XML->{server}('$@') ;

	 ## A XML data,	same as	data_pointer():
	 my $xml_data =	$XML->{server}('<xml>')	;

CONTENT
       If a {key} has content you can access it directly from the variable
       or from the method:

	 my $server = $XML->{server} ;

	 print "Content: $server\n" ;
	 ## ...or...
	 print "Content: ". $server->content ."\n" ;

       So, if you use the object as a string it	works as a string, if you use
       as an object it works as	an object! ;-P

       **See the method	content() for more.

CREATING XML DATA
       Creating XML data is easy: you just use the object as a normal HASH,
       but you don't need to care about multiple nodes and ARRAY
       creation/conversion!

	 ## Create a null XML object:
	 my $XML = XML::Smart->new() ;

	 ## Add	a server to the	list:
	 $XML->{server}	= {
	 os => 'Linux' ,
	 type => 'mandrake' ,
	 version => 8.9	,
	 address => '' ,
	 } ;

	 ## The	data now:
	 <server address="" os="Linux" type="mandrake" version="8.9"/>

         ## Add a new address to the server. This creates an ARRAY and converts
         ## the previous key to an ARRAY:
	 $XML->{server}{address}[1] = '' ;

	 ## The	data now:
	 <server os="Linux" type="mandrake" version="8.9">

       After creating your XML tree you just save it or get the data:

	 ## Get	the data:
	 my $data = $XML->data ;

	 ## Or save it directly:
	 $XML->save('newfile.xml') ;

	 ## Or send to a socket:
	 print $socket $XML->data(length => 1) ;

BINARY DATA & CDATA
       From version 1.2 XML::Smart can handle binary data and CDATA blocks
       automatically.

       When parsing, binary data will be detected as:

         <code dt:dt="binary.base64">f1NPTUUgQklOQVJZIERBVEE=</code>

       This is the official format for binary data. The content will be
       decoded from base64 and saved in the object tree.
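
       To see what the payload in the <code> element above decodes to, the
       core MIME::Base64 module is enough. This is a sketch of the decoding
       step only; the variable names are illustrative and XML::Smart does
       this for you when parsing:

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# The base64 payload from the <code> element.
my $payload = 'f1NPTUUgQklOQVJZIERBVEE=';

my $bytes = decode_base64($payload);

# Show the bytes with non-printable characters escaped as \xHH.
(my $printable = $bytes) =~ s/([^\x20-\x7e])/sprintf('\\x%02X', ord $1)/ge;
print "$printable\n";    # prints: \x7FSOME BINARY DATA
```

       The leading 0x7F byte is a control character, which is why this
       content is carried as binary rather than as plain text.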

       CDATA will be parsed as any other content, since	CDATA is only a	block
       that won't be parsed.

       When creating XML data, like at $XML->data(), the binary	format and
       CDATA are detected using	these rules:

         BINARY:
         - If your data has characters that can't be in XML.

	 * Characters accepted:

	   \s \w \d
	   0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8e, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96,
	   0x97, 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9e, 0x9f, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, 0xa8, 0xa9, 0xaa,
	   0xab, 0xac, 0xad, 0xae, 0xaf, 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7, 0xb8, 0xb9, 0xba, 0xbb, 0xbc,
	   0xbd, 0xbe, 0xbf, 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce,
	   0xcf, 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf, 0xe0,
	   0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7, 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef, 0xf0, 0xf1, 0xf2,
	   0xf3, 0xf4, 0xf5, 0xf6, 0xf7, 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, 0x20

	 TODO: 0x80, 0x81, 0x8d, 0x8f, 0x90, 0xa0

         CDATA:
         - If the content has tags: <...>

         CONTENT: (<tag>content</tag>)
         - If it has \r\n\t, or ' and " at the same time.

       So, this	will be	a CDATA	content:


       If binary content is detected, it will be converted to base64 and a
       dt:dt attribute added in	the tag	to tell	the format.

	 <code dt:dt="binary.base64">f1NPTUUgQklOQVJZIERBVEE=</code>

       NOTE: As	of VERSION 1.73	there are three	different base64 encodings
       that are	used. They are picked based on which of	them support the data
       provided. If you	want to	retrieve data using the	'data' function	the
       resultant xml will have dt:dt="binary.based" contained within it. To
       retrieve	the decoded data use: $XML->data( decode => 1 )

UNICODE	and ASCII-extended (ISO-8859-1)
       XML::Smart supports only these 2 encodings, Unicode (UTF-8) and
       ASCII-extended (ISO-8859-1), which should be enough. (Note that UTF-8
       is only supported on Perl-5.8+).

       When creating XML data, if any UTF-8 character is detected the encoding
       attribute in the	<?xml ...?> header will	be set to UTF-8:

	 <?xml version="1.0" encoding="utf-8" ?>
	 <data>0x82, 0x83</data>

       If not, the iso-8859-1 is used:

	 <?xml version="1.0" encoding="iso-8859-1" ?>

       When loading XML data with UTF-8, Perl (5.8+) should do all the work.

PARSING HTML as XML
       You can use the special parser XML::Smart::HTMLParser to "use" HTML as
       XML or not well-formed XML data.

       The differences between a normal XML parser and XML::Smart::HTMLParser
       are:

	 - Accept values without quotes:
	   <foo	bar=x>

	 - Accept any data in the values, including <> and &:
	   <root><echo sample="echo \"Hello!\">out.txt"></root>

         - Accept URI values without quotes:
	   <link url= target=#_blank>

	 - Don't need to close the tags	adding the '/' before '>':
	   <root><foo bar="1"></root>

           ** Note that the parser will try hard to detect the nodes, and
              whether to auto-close them or not.

	 - Don't need to have only one root:

       So, XML::Smart::HTMLParser is a wild way to load marked-up data (like
       HTML), or to write your XML data by hand without caring about quotes,
       end tags, etc. You can write a bad XML file by hand, load it with
       XML::Smart::HTMLParser, and rewrite it well-formed by saving it
       again! ;-P

       ** Note that <SCRIPT> tags will only parse right	if the content is
       inside comments <!--...-->, since they can have tags:

         <SCRIPT LANGUAGE="JavaScript"><!--
         document.writeln("some <tag> in the string");
         --></SCRIPT>

ENTITIES
       Entities (ENTITY) are handled by the parser. So, if you use XML::Parser
       it will do the whole job fine. But if you use XML::Smart::Parser or
       XML::Smart::HTMLParser, only the basic entities (defaults) will be
       parsed:
	 &lt;	=> The less than sign (<).
	 &gt;	=> The greater than sign (>).
	 &amp;	=> The ampersand (&).
	 &apos;	=> The single quote or apostrophe (').
	 &quot;	=> The double quote (").

         &#ddd;  => An ASCII character or a Unicode character (>255), where ddd is decimal.
         &#xHHH; => A Unicode character, where HHH is hexadecimal.

       When creating XML data, already existing entities won't be changed, and
       the characters '<', '&' and '>' will be converted to the appropriate
       entities.

       ** Note that if a content has a <tag>, the characters '<' and '>'
       won't be converted to entities, and this content will be placed inside
       a CDATA block.
WHY AND HOW IT WORKS
       Everyone who has tried to use Perl HASH and ARRAY to access XML
       data, like in XML::Simple, has had some problems adding new nodes, or
       accessing a node when the user doesn't know if it's inside an ARRAY, a
       HASH or a HASH key. XML::Smart creates around it a very dynamic way to
       access the data, since at the same time any node/point in the tree can
       be a HASH and an ARRAY. You also have other extra resources, like a
       search for nodes by attribute:

	 my $server = $XML->{server}('type','eq','suse') ; ## This syntax is not wrong!	;-)

	 ## Instead of:
	 my $server = $XML->{server}[1]	;

	   <server os="linux" type="redhat" version="8.0">
	   <server os="linux" type="suse" version="7.0">

       The idea for this module came from the problem of accessing a complex
       structure in XML. You need to know what the structure looks like,
       something that is generally done by looking at the XML file (which is
       wrong). But at the same time it is hard to always check the structure
       (by code) before accessing it. XML is a good and easy format to declare
       your data, but extracting it as a tree, at least in my opinion, isn't
       easy. To fix that, a way to access the data with some query language,
       like SQL, came to my mind. The first idea was to access using
       something like:

         {arg1}

	 X =*


       And saw that this is very similar to Hashes and Arrays in Perl:

	 $XML->{foo}{bar}{baz}{arg1} ;

	 $X = $XML->{foo}{bar} ;
	 $X->{baz}{arg1} ;

	 $XML->{hosts}{server}[0]{argx}	;

       But the problem of Hash and Array, is not knowing when you have an
       Array reference or not.	For example, in	XML::Simple:

         ## This is very different
         $XML->{server}{address} ;
         ## ... from this:
         $XML->{server}{address}[0] ;

       So, why don't make both ways work? Because you need to make something

       To create XML::Smart, first I have created the module
       Object::MultiType.  With	it you can have	an object that works at	the
       same time as a HASH, ARRAY, SCALAR, CODE	& GLOB.	So you can do things
       like this with the same object:

	 $obj =	Object::MultiType->new() ;

	 $obj->{key} ;
	 $obj->[0] ;
	 $obj->method ;

	 @l = @{$obj} ;
	 %h = %{$obj} ;

	 &$obj(args) ;

	 print $obj "send data\n" ;

       Seems to	be crazy, and can be more if you use tie() inside it, and this
       is what XML::Smart does.
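
       How one object can answer to several access styles can be sketched
       with Perl's core overload pragma. The package Tiny::Multi below is
       hypothetical and much simpler than Object::MultiType; it only shows
       the dereference overloading that makes the idea possible:

```perl
use strict;
use warnings;

package Tiny::Multi;

# The object is a blessed scalar ref holding its state, so that the
# overloaded hash and array dereferences do not recurse into themselves.
use overload
    '%{}' => sub { ${ $_[0] }->{hash} },
    '@{}' => sub { ${ $_[0] }->{array} },
    '&{}' => sub { my $self = shift; return sub { "called with @_" } },
    '""'  => sub { ${ $_[0] }->{text} },
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    my $state = {
        hash  => $args{hash}  || {},
        array => $args{array} || [],
        text  => defined $args{text} ? $args{text} : '',
    };
    return bless \$state, $class;
}

package main;

my $obj = Tiny::Multi->new(
    hash  => { key => 'value' },
    array => [ 'a', 'b' ],
    text  => 'content',
);

print $obj->{key}, "\n";    # hash access:   value
print $obj->[1],   "\n";    # array access:  b
print "$obj\n";             # string access: content
print $obj->('x'), "\n";    # code access:   called with x
```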

       For XML::Smart, access in the Hash and Array way passes through
       tie(). In other words, you have a tied HASH and tied ARRAY inside it.
       This tied Hash and Array work together, so you can access a Hash key
       as the index 0 of an Array, or access an index 0 as the Hash key:

	 %hash = (
	 key =>	['a','b','c']
	 ) ;

	 $hash->{key}	 ## return $hash{key}[0]
	 $hash->{key}[0] ## return $hash{key}[0]
	 $hash->{key}[1] ## return $hash{key}[1]

	 ## Inverse:

	 %hash = ( key => 'a' )	;

	 $hash->{key}	 ## return $hash{key}
	 $hash->{key}[0] ## return $hash{key}
	 $hash->{key}[1] ## return undef
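
       The lookup rule shown in the two listings above can be written as a
       plain function. This is only a sketch of the rule, with a
       hypothetical helper name; XML::Smart applies it through tied
       structures rather than a function call:

```perl
use strict;
use warnings;

# Hypothetical helper: a plain value behaves like a one-element list,
# and a list accessed without an index means element 0.
sub fetch {
    my ($value, $index) = @_;
    $index = 0 unless defined $index;
    return $value->[$index] if ref $value eq 'ARRAY';
    return $index == 0 ? $value : undef;
}

my %hash = ( multi => [ 'a', 'b', 'c' ], single => 'a' );

print fetch($hash{multi}),     "\n";    # a  (same as index 0)
print fetch($hash{multi}, 1),  "\n";    # b
print fetch($hash{single}, 0), "\n";    # a  (the plain value itself)

my $missing = fetch($hash{single}, 1);
print defined $missing ? "defined\n" : "undef\n";    # undef
```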

       The best thing about this new resource is that it avoids wrong access
       to the data and warnings when you try to access a Hash having an Array
       (and the inverse), something that generally makes the script die().

       Once having an easy access to the data, you can use the same resource
       to create data!	For example:

	 ## Previous data:
	   <server address="" os="linux" type="conectiva" version="9.0"/>

	 ## Now	you have {address} as a	normal key with	a string inside:

         ## And to add a new address, the key {address} needs to be an ARRAY ref!
         ## So, XML::Smart makes the conversion: ;-P
	 $XML->{hosts}{server}{address}[1] = '' ;

	 ## Adding to a	list that you don't know the size:
	 push(@{$XML->{hosts}{server}{address}}	, '') ;

	 ## The	data now:
	   <server os="linux" type="conectiva" version="9.0"/>

       Then, after changing your XML tree using the Hash and Array resources,
       you just get the data remade (through the Hash tree inside the object):

	 my $xmldata = $XML->data ;

       But note that XML::Smart always returns an object! Even when you get a
       final key. So this actually returns another object, pointing (inside
       it) to the key:

         $addr = $XML->{hosts}{server}{address}[0] ;

         ## Since $addr is an object you can TRY to access more data:
         $addr->{foo}{bar} ; ## This doesn't raise warnings! It just returns UNDEF.

	 ## But	you can	use it like a normal SCALAR too:

	 print "$addr\n" ;

	 $addr .= ':80'	; ## After this	$addr isn't an object any more,	just a SCALAR!

TODO
         * Finish XPath implementation.
	 * DTD - Handle	<!DOCTYPE> gracefully.
	 * Implement a better way to declare meta tags.
	 * Add 0x80, 0x81, 0x8d, 0x8f, 0x90, 0xa0 ( multi byte characters to the list of accepted binary characters )
	 * Ensure object copy holds more in state including: ->data( wild => 1 )

SEE ALSO
       XML::Parser, XML::Parser::Lite, XML::XPath, XML.

       Object::MultiType - This is the module that makes everything possible,
       and was created specially for XML::Smart. ;-P

       ** See the script for examples of use.

AUTHOR
       Graciliano M. P. "<gm at>"

       I will appreciate any type of feedback (including your opinions and/or
       suggestions). ;-P

       Enjoy, and thanks to everyone who is enjoying this tool and has sent
       e-mails!

CURRENT MAINTAINER
       Harish Madabushi, "<harish.tmh at>"

BUGS
       Please report any bugs or feature requests to "bug-xml-smart at", or
       through the web interface at <>. Both the author and the maintainer
       will be notified, and then you'll automatically be notified of progress
       on your bug as changes are made.

SUPPORT
       You can find documentation for this module with the perldoc command.

	   perldoc XML::Smart

       You can also look for information at:

       o    RT:	CPAN's request tracker (report bugs here)


       o    AnnoCPAN: Annotated	CPAN documentation


       o    CPAN Ratings


       o    Search CPAN


       o    GitHub CPAN


THANKS
       Thanks to Rusty Allen for the extensive tests of CDATA and BINARY
       handling of XML::Smart.

       Thanks to Ted Haining for pointing out a Perl-5.8.0 bug for tied keys
       of a HASH.

       Thanks to everybody who has sent ideas, patches or pointed out bugs.

COPYRIGHT & LICENSE
       Copyright 2003 Graciliano M. P.

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.24.1			  2013-10-04			 XML::Smart(3)

