XML::DT(3)	      User Contributed Perl Documentation	    XML::DT(3)

NAME
       XML::DT - a package for down translation	of XML files

SYNOPSIS
	use XML::DT;

	%xml=( 'music'	  => sub{"Music	from: $c\n"},
	       'lyrics'	  => sub{"Lyrics from: $v{name}\n"},
	       'title'	  => sub{ uc($c) },
	       '-userdata' => { something => 'I like' },
	       '-default' => sub{"$q:$c"} );

	print dt($filename,%xml);

ABSTRACT
       This module is an XML down translation processor. It maps tag (element)
       names to functions that process each element and its contents.

DESCRIPTION
       This module processes XML files with an approach similar to OMNIMARK.
       It uses the XML::LibXML module as its XML parser, in an independent way.

       You can parse HTML files	as if they were	XML files. For this, you must
       supply an extra option to the hash:

	%handler = ( -html => 1,
		    ...
		  );

       You can also ask	the parser to recover from XML errors:

	%handler = ( -recover => 1,
		    ...
		  );

Functions
   dt
       Down translation	function "dt" receives a filename and a	set of
       expressions (functions) defining	the processing and associated values
       for each	element.

   dtstring
       "dtstring" works like "dt" but takes its input from a string instead of
       a file.
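
       A minimal sketch of "dtstring" in use. The input document and the
       element names ("doc", "title", "p") are made up for this illustration;
       only "dtstring", $c, $q and "-default" come from this manual.

```perl
use strict;
use warnings;
use XML::DT;

# Hypothetical input document, used only for this illustration.
my $xml = '<doc><title>hello</title><p>world</p></doc>';

my %handler = (
    'title'    => sub { uc($c) . "\n" },   # $c holds the element content
    'p'        => sub { "$q: $c\n" },      # $q holds the element name
    '-default' => sub { $c },              # any other element: contents only
);

# dtstring returns the value computed for the root element.
my $out = dtstring($xml, %handler);
print $out;
```

       The same handler hash could be passed to "dt" with a filename instead
       of a string.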

   dturl
       "dturl" works like "dt" but takes its input from an Internet URL
       instead of a file.

   pathdt
       The "pathdt" function is	a "dt" function	which can handle a subset of
       XPath on	handler	keys. Example:

	%handler = (
	  "article/title"	 => sub{ toxml("h1",{},$c) },
	  "section/title"	 => sub{ toxml("h2",{},$c) },
	  "title"		 => sub{ $c },
	  "//image[@type='jpg']" => sub{ "JPEG:	<img src=\"$c\">" },
	  "//image[@type='bmp']" => sub{ "BMP: sorry, no bitmaps on the	web" },
	);

	pathdt($filename, %handler);

       Here are	some examples of valid XPath expressions under XML::DT:

	/aaa
	/aaa/bbb
	//ccc				- ccc somewhere	(same as "ccc")
	/*/aaa/*
	//*				- same as "-default"
	/aaa[@id]			- aaa with an attribute	id
	/*[@*]				- root with an attribute
	/aaa[not(@name)]		- aaa with no attribute	"name"
	//bbb[@name='foo']		- ... attribute	"name" = "foo"
	/ccc[normalize-space(@name)='bbb']
	//*[name()='bbb']		- complex way of saying	"//bbb"
	//*[starts-with(name(),'aa')]	- an element named "aa.*"
	//*[contains(name(),'c')]	- an element	   ".*c.*"
	//aaa[string-length(name())=4]			   "...."
	//aaa[string-length(name())<4]			   ".{1,3}"
	//aaa[string-length(name())>5]			   ".{6,}"

       Note that XML::DT does not currently handle all of XPath. Much of XPath
       will never be added to XML::DT because it does not fit the down
       translation model. For more documentation about XPath, check the
       specification at http://www.w3c.org or the tutorials at
       http://www.zvon.org

   pathdtstring
       Like the	"dtstring" function but	supporting XPath.

   pathdturl
       Like the	"dturl"	function but supporting	XPath.

   ctxt
       Returns the context of the element currently being processed. For
       example, ctxt(1) returns the father element, and so on.

   inpath
       "inpath(pattern)" is true if the path of the current element matches
       the provided pattern. This function is meant to be used in the element
       functions in order to achieve context-dependent processing.

   inctxt
       "inctxt(pattern)" is true if the father of the current element matches
       the provided pattern.
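
       A small sketch of context-dependent processing with "inctxt". The
       document structure and the output markers are hypothetical; the
       pattern is assumed to be matched against the name of the father
       element, as described above.

```perl
use strict;
use warnings;
use XML::DT;

# Hypothetical input: a title directly under <article> and one under <section>.
my $xml = '<article><title>Top</title>'
        . '<section><title>Sub</title></section></article>';

my %handler = (
    # Same element name, different output depending on the father element.
    'title'    => sub { inctxt('section') ? "== $c ==\n" : "= $c =\n" },
    '-default' => sub { $c },
);

my $out = dtstring($xml, %handler);
print $out;
```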

   toxml
       This is the default "-default" function. It can be used to generate XML
       based on the $c, $q and %v variables. Example: add a new attribute to
       element "ele1" without otherwise changing it:

	  %handler=( ...
	    ele1 => sub	{ $v{at1} = "v1"; toxml(); },
	  )

       "toxml" can also	be used	with 3 arguments: tag, attributes and contents

	  toxml("a",{href=> "http://local/f.html"}, "example")

       returns:

	<a href='http://local/f.html'>example</a>

       Empty elements are written as empty tags. If you want an empty element
       written with opening and closing tags, use "tohtml" instead.
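
       Both calling styles can be tried together in a short script. This is
       only a sketch: the input string is made up, and the exact quoting of
       attributes in the generated XML may differ between versions, so no
       output is shown.

```perl
use strict;
use warnings;
use XML::DT;

# Three-argument form: tag, attributes (hash reference), contents.
my $link = toxml("a", { href => "http://local/f.html" }, "example");

# Zero-argument form inside a handler: rebuilds the current element
# from $q, %v and $c, here after adding the attribute "at1".
my %handler = (
    'ele1' => sub { $v{at1} = "v1"; toxml() },
);

# Hypothetical input; elements without a handler use the default function.
my $out = dtstring('<doc><ele1>text</ele1></doc>', %handler);

print "$link\n$out\n";
```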

   tohtml
       See "toxml".

   xmltree
       This simple function just makes a HASH reference:

	{ -c =>	$c, -q => $q, all_the_other_attributes }

       The function "toxml" understands	this structure and makes XML with it.

   mkdtskel
       Used by the mkdtskel script to automatically generate an XML::DT Perl
       script based on an XML file. Check the "mkdtskel" manpage for details.

   mkdtskel_fromDTD
       Used by the mkdtskel script to automatically generate an XML::DT Perl
       script based on a DTD file. Check the "mkdtskel" manpage for details.

   mkdtdskel
       Used by the mkdtdskel script to automatically generate a naive DTD
       based on example XML files. Check the "mkdtdskel" manpage for details.

Accessing parents
       With XML::DT you can access the attributes of an element's parent (or
       grandparent, and so on), up to the root of the XML document.

       If you use "$dtattributes[1]{foo} = 'bar'" in a processing function,
       you are defining the attribute "foo" for the parent of that element.

       In the same way, you can use $dtattributes[2] to access the
       grandparent. $dtattributes[-1] is, as expected, the root element of the
       XML document.

       There are some shortcuts:

       "father"
       "gfather"
       "ggfather"
	   You can use these functions to access your father ("father"),
	   grandfather ("gfather") or great-grandfather ("ggfather"):

	      father("x"); # returns value for attribute "x" on	father element
	      father("x", "value"); # sets value for attribute "x" on father
					    # element

	   You can also	use it directly	as a reference to @dtattributes:

	      father->{"x"};	       # gets the attribute
	      father->{"x"} = "value"; # sets the attribute
	      $attributes = father;	       # gets all attributes reference

       "root"
	   You can use it as a function to access the root element of your tree.

	      root("x");	  # gets attribute "x" on the root element
	      root("x", "value"); # sets attribute "x" on the root element

	   You can also	use it directly	as a reference to $dtattributes[-1]:

	      root->{"x"};	     # gets the	attribute x
	      root->{"x"} = "value"; # sets the	attribute x
	      $attributes = root;    # gets all	attributes reference

User provided element processing functions
       The user must provide a hash with a function for each element; each
       function computes the output for its element. Functions can use the
       element name $q, the element content $c and the attribute values hash
       %v.

       All those global variables are defined in $CALLER::.

       Each time an element is found, the associated function is called.

       Content is calculated by concatenating the strings in the element's
       contents and the return values of its interior elements.

   "-default" function
       When an element has no associated function, the function associated
       with "-default" is called. If no "-default" function is defined, the
       default function returns an XML-like string for the element.

       When you use "-type" definitions, you often need to set the "-default"
       function to return just the contents: "sub{$c}".

   "-outputenc" option
       "-outputenc" defines the output encoding (default is Unicode UTF8).

   "-inputenc" option
       "-inputenc" forces an input encoding. Whenever possible, define the
       input encoding in the XML file instead:

	<?xml version='1.0' encoding='ISO-8859-1'?>

   "-pcdata" function
       The "-pcdata" function defines a transformation over the text
       contents. Typically this function should look at the context (see the
       "inctxt" function).

       The default "-pcdata" function is the identity.

   "-cdata" function
       You can process CDATA sections differently from pcdata. If you define
       a "-cdata" function, it will be used. Otherwise, the "-pcdata" function
       is called.

   "-begin" function
       Function to be executed before processing the XML file.

       Example of use: initialization of side-effect variables

   "-end" function
       Function to be executed after processing the XML file. It can use the
       $c content value. The value returned by "-end" will be the "dt" return
       value.

       Example of use: post-processing of returned contents
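
       The two hooks can be combined as in the following sketch. The counter
       variable, the element names and the input string are hypothetical; the
       sketch assumes "dtstring" honours "-begin" and "-end" the same way "dt"
       does.

```perl
use strict;
use warnings;
use XML::DT;

my $count;   # side-effect variable, initialized by -begin

my %handler = (
    '-begin'   => sub { $count = 0 },                  # runs before processing
    'item'     => sub { $count++; "- $c\n" },
    '-default' => sub { $c },
    '-end'     => sub { $c . "total: $count\n" },      # $c is the result so far
);

# Hypothetical input document.
my $report = dtstring('<list><item>a</item><item>b</item></list>', %handler);
print $report;
```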

   "-recover" option
       If set, the parser will try to recover from XML errors.

   "-html" option
       If set, the parser will try to recover from errors. This differs from
       "-recover" in that it uses some knowledge of the HTML structure for the
       recovery.

   "-userdata" option
       Use this	to pass	any information	you like to your handlers. The data
       structure you pass in this option will be available as $u in your code.
       -- New in 0.62.
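
       A short sketch of reading user data from a handler. The "greeting" key
       and the input document are invented for the example; only the
       "-userdata" option and the $u variable come from the text above.

```perl
use strict;
use warnings;
use XML::DT;

my %handler = (
    '-userdata' => { greeting => 'Hello' },            # any data structure
    'name'      => sub { "$u->{greeting}, $c!" },      # available as $u
    '-default'  => sub { $c },
);

# Hypothetical input document.
my $out = dtstring('<doc><name>World</name></doc>', %handler);
print "$out\n";
```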

Elements with values other than	strings	("-type")
       By default all elements return strings, and contents ($c) is the
       concatenation of	the strings returned by	the sub-elements.

       In some situations the XML text contains	values that are	better
       processed as a structured type.

       The following types (functors) are available:

       THE_CHILD
	   Returns the result of processing the only child of the element.

       LAST_CHILD
	   Returns the result of processing the	last child of the element.

       STR concatenates the values returned by all the sub-elements (DEFAULT);
	   all the sub-elements should return strings.

       SEQ makes an ARRAY with the contents of all the sub-elements;
	   attributes are ignored (they should be processed in the
	   sub-element). Returns a ref. If you have different types of
	   sub-elements, you should use SEQH.

       SEQH
	   makes an ARRAY of HASH with all the sub elements (returns a ref);
	   for each sub-element:

	    -q	=> element name
	    -c	=> contents
	    at1	=> at value1	for each attribute

       MAP makes a HASH with the sub-elements; keys are the sub-element names,
	   values are their contents. Attributes are ignored (they should be
	   processed in the sub-element). Returns a ref.

       MULTIMAP
	   makes a HASH of ARRAY; keys are the sub-element names; values are
	   lists of contents; attributes are ignored (they should be processed
	   in the sub-element). Returns a ref.

       MMAPON(element-list)
	   makes a HASH with the sub-elements; keys are the sub-element
	   names, values are their contents; attributes are ignored (they
	   should be processed in the sub-element); for each element named in
	   the element-list, an ARRAY is created with the contents of all its
	   occurrences. Returns a ref.

       XML returns a reference to a HASH with:

	    -q	=> element name
	    -c	=> contents
	    at1	=> at value1	for each attribute

       ZERO
	   don't process the sub-elements; return ""

       When you use "-type" definitions, you often need to set the "-default"
       function to return just the contents: "sub{$c}".

   An example:
	use XML::DT;
	%handler = ( contacts => sub{ [	split(";",$c)] },
		     -default => sub{$c},
		     -type    => { institution => 'MAP',
				   degrees     =>  MMAPON('name'),
				   tels	       => 'SEQ'	}
		   );
	$a = dt	("f.xml", %handler);

       with the	following f.xml

	<degrees>
	   <institution>
	      <id>U.M.</id>
	      <name>University of Minho</name>
	      <tels>
		 <item>1111</item>
		 <item>1112</item>
		 <item>1113</item>
	      </tels>
	      <where>Portugal</where>
	      <contacts>J.Joao;	J.Rocha; J.Ramalho</contacts>
	   </institution>
	   <name>Computer science</name>
	   <name>Informatica </name>
	   <name> history </name>
	</degrees>

       would make $a

	{ 'name' => [ 'Computer	science',
		      'Informatica ',
		      '	history	' ],
	  'institution'	=> { 'tels' => [ 1111, 1112, 1113 ],
			     'name' => 'University of Minho',
			     'where' =>	'Portugal',
			     'id' => 'U.M.',
			     'contacts'	=> [ 'J.Joao',
				      '	J.Rocha',
				      '	J.Ramalho' ] } };
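
       A smaller variant of the example above that can be run without an
       external file. It is a sketch only: the element names and the input
       string are made up, and "dtstring" is used instead of "dt".

```perl
use strict;
use warnings;
use XML::DT;

# Hypothetical input, written without whitespace between elements.
my $xml = '<person><name>Ana</name>'
        . '<phones><n>111</n><n>222</n></phones></person>';

my %handler = (
    '-default' => sub { $c },            # pass the (possibly structured) contents up
    '-type'    => { person => 'MAP',     # hash: child name => child value
                    phones => 'SEQ' },   # array of child values
);

my $data = dtstring($xml, %handler);

print "name: $data->{name}\n";
print "phones: @{ $data->{phones} }\n";
```

       Because "-default" returns $c unchanged, the structure built for
       "phones" reaches the enclosing "person" hash as an array reference
       rather than as a string.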

DT Skeleton generation
       It is possible to build an initial processor program based on an
       example XML file.

       To do this, use the function "mkdtskel(filename)".

       Example:

	perl -MXML::DT -e 'mkdtskel "f.xml"' > f.pl

DTD skeleton generation
       It makes a naive DTD based on one or more examples.

       To do this use the function "mkdtdskel(filename*)".

       Example:

	perl -MXML::DT -e 'mkdtdskel "f.xml"' >	f.dtd

SEE ALSO
       mkdtskel(1) and mkdtdskel(1)

AUTHORS
       Home for XML::DT:

       http://natura.di.uminho.pt/~jj/perl/XML/

       Jose Joao Almeida, <jj@di.uminho.pt>

       Alberto Manuel Simões, <albie@alfarrabio.di.uminho.pt>

ACKNOWLEDGEMENTS
       Michel Rodriguez	   <mrodrigu@ieee.org>

       José Carlos Ramalho <jcr@di.uminho.pt>

       Mark A. Hillebrand

COPYRIGHT AND LICENSE
       Copyright 1999-2012 Project Natura.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.1			  2019-04-22			    XML::DT(3)
