Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages


home | help
XML::Compile::TranslatUsereContributed Perl XML::Compile::Translate::Reader(3)

       XML::Compile::Translate::Reader - translate XML to HASH

	  is a XML::Compile::Translate

	my $schema = XML::Compile::Schema->new(...);
	my $code   = $schema->compile(READER =>	...);

       The translator understands schemas, but does not	encode that into
       actions.	 This module implements	those actions to translate from	XML
       into a (nested) Perl HASH structure.

       Extends "DESCRIPTION" in	XML::Compile::Translate.

       Extends "METHODS" in XML::Compile::Translate.

       Extends "DETAILS" in XML::Compile::Translate.

   Translator options
       Extends "Translator options" in XML::Compile::Translate.

   Processing Wildcards
       If you want to collect information from the XML structure, which	is
       permitted by "any" and "anyAttribute" specifications in the schema, you
       have to implement that yourself.	 The problem is	"XML::Compile" has
       less knowledge than you about the possible data.

       option any_attribute

       By default, the "anyAttribute" specification is ignored.	 When
       "TAKE_ALL" is given, all	attributes which are fulfilling	the name-space
       requirement added to the	returned data-structure.  As key, the absolute
       element name will be used, with as value	the related unparsed XML

       In the current implementation, if an explicit attribute is also covered
       by the name-spaces permitted by the anyAttribute	definition, then it
       will also appear	in that	list (and hence	the handler will be called as

       Use XML::Compile::Schema::compile(any_attribute)	to write your own
       handler,	to influence the behavior.  The	handler	will be	called for
       each attribute, and you must return list	of pairs of derived
       information.  When the returned is empty, the attribute data is lost.
       The value may be	a complex structure.

       option any_element

       By default, the "any" definition	in a schema will ignore	all elements
       from the	container which	are not	used.  Also in this case "TAKE_ALL" is
       required	to produce "any" results.  "SKIP_ALL" will ignore all results,
       although	this are being processed for validation	needs.

       option any_type CODE

       By default, the elements	which have type	"xsd:anyType" will return an
       XML::LibXML::Element when there are sub-elements.  Otherwise, it	will
       return the textual content.

       If you pass your	own CODE reference, you	can change this	behavior.  It
       will get	called with the	path, the node,	and the	default	handler.  Be
       awayre the $node	may actually be	a string already.

	  $schema->compile(READER => ..., any_type => \&handle_any_type);
	  sub handle_any_type($$$)
	  { my ($path, $node, $handler)	= @_;
	    ref	$node or return	$node;

   Mixed elements
       [available since	0.86] ComplexType and ComplexContent components	can be
       declared	with the "<mixed="true""> attribute.  This implies that	text
       is not limited to the content of	containers, but	may also be used
       inbetween elements.  Usually, you will only find	ignorable white-space
       between elements.

       In this example,	the "a"	container is marked to be mixed:
	 <a id="5"> before <b>2</b> after </a>

       Often the "mixed" option	is bending one of both ways: either the
       element is needed as text, or the element should	be parsed and the text
       ignored.	 The reader has	various	options	to avoid the need of
       processing raw XML::LibXML nodes.

       [1.00] When the return is a HASH, that HASH will	also contain the
       "_MIXED_ELEMENT_MODE" key, to help people understand what happens.
       This is not possible for	all modes, only	for some.

       With XML::Compile::Schema::compile(mixed_elements) set to

       ATTRIBUTES  (the	default)
	   a HASH is returned, the attributes are processed.  The node is
	   found as XML::LibXML::Element with the key '_'.  Above example will
	     $r	= { id => 5, _ => $xmlnode };

	   Like	the previous, but now the textual representation of the
	   content is returned with key	'_'.  Above example will produce
	     $r	= { id => 5, _ => ' before 2 after '};

	   will	remove all mixed-in text, and treat the	element	as normal
	   element.  The example will be transformed into
	     $r	= { id => 5, b => 2 };

	   return the XML::LibXML::Node	itself.	 The example:
	     $r	= $xmlnode;

	   return the mixed node as XML	string,	just as	in the source.	Be
	   warned that it is rather expensive: the string was parsed and then
	   stringified again, which is costly for large	nodes.	Result:
	     $r	= '<a id="5"> before <b>2</b> after </a>';

       CODE reference
	   the reference is called with	the XML::LibXML::Node as first
	   argument.  When a value is returned (even undef), then the right
	   tag with the	value will be included in the translators result.
	   When	an empty list is returned by the code reference, then nothing
	   is returned (which may result in an error if	the element is
	   required according to the schema)

       When some of your mixed elements	need different behavior	from other
       elements, then you have to go play with the normal hooks	in specific

   Schema hooks
       hooks executed before the XML is	being processed

       The "before" hooks receives an XML::LibXML::Node	object and the path
       string.	It must	return a new (or same) XML node	which will be used
       from then on.  You probably can best modify a node clone, not the
       original	as provided by the user.  When "undef" is returned, the	whole
       node will disappear.

       This hook offers	a predefined "PRINT_PATH".

       hooks executed as replacement

       Your "replace" hook should return a list	of key-value pairs. To produce
       it, it will get the XML::LibXML::Element, the translator	settings as
       HASH, the path, and the localname.

       This hook has a predefined "SKIP", which	will not process the found
       element,	but simply return the string "SKIPPED" as value.  This way, a
       whole tree of unneeded translations can be avoided.

       [1.51] The predefined hook "XML_NODE" will not attempt to parse the
       selected	element, but returns the XML::LibXML::Element node instead.
       This may	break on some schema-contained validations.

       Sometimes, the Schema spec is such a mess, that XML::Compile cannot
       automatically translate it.  I have seen	cases where confusion over
       name-spaces is created: a choice	between	three elements with the	same
       name but	different types.  Well,	in such	case you may use
       XML::LibXML::Simple to translate	a part of your tree.  Simply

	use XML::LibXML::Simple	 qw/XMLin/;
	  ( action  => 'READER'
	  , type    => 'tns:xyz'     # or pack_type($tns,'xyz')
	 #  path    => qr!/company$! # by element name
	  , replace =>
	      sub { my ($xml, $args, $path, $type, $r) = @_;
		    ($type => XMLin($xml, ...));

       hooks for post-processing, after	the data is collected

       Your code reference gets	called with three parameters: the XML node,
       the data	collected and the path.	 Be careful that the collected data
       might be	a SCALAR (for simpleType).  Return a HASH or a SCALAR.
       "undef" may work, unless	it is the value	of a required element you
       throw awy.

       This hook also offers a predefined "PRINT_PATH".	 Besides, it has
       "ATTRIBUTE_ORDER", which	will result in additional fields in the	HASH,
       respectively containing the NODE	which was processed (an
       XML::LibXML::Element), the type_of_node,	the element names, and the
       attribute names.	 The keys start	with an	underscore "_".

       In a typemap, a relation	between	an XML element type and	a Perl class
       (or object) is made.  Each translator back-end will implement this a
       little differently.  This section is about how the reader handles

       Typemap to Class

       Usually,	an XML type will be mapped on a	Perl class.  The Perl class
       implements the "fromXML"	method as constructor.

	$schema->addTypemaps($sometype => 'My::Perl::Class');

	package	My::Perl::Class;
	sub fromXML
	{   my ($class,	$data, $xmltype) = @_;
	    my $self = $class->new($data);

       Your method returns the data which will be included in the result tree
       of the reader.  You may return an object, the unmodified	$data, or
       "undef".	 When "undef" is returned, this	may fail the schema parser
       when the	data element is	required.

       In the simpelest	implementation,	the class stores its data exactly as
       the XML structure:

	package	My::Perl::Class;
	sub fromXML
	{   my ($class,	$data, $xmltype) = @_;
	    bless $data, $class;

	# The same, even shorter:
	sub fromXML { bless $_[1], $_[0] }

       Typemap to Object

       Another option is to implement an object	factory: one object which
       creates other objects.  In this case, the $xmltype parameter can	come
       of use, to have one object spawning many	different other	objects.

	my $object = My::Perl::Class->new(...);
	$schema->typemap($sometype => $object);

	package	My::Perl::Class;
	sub fromXML
	{   my ($object, $xmltype, $data) = @_;
	    return Some::Other::Class->new($data);

       This object factory may be a very simple	solution when you map XML onto
       objects which are not under your	control; where there is	not way	to add
       the "fromXML" method.

       Typemap to CODE

       The light version of an object factory works with CODE references.

	$schema->typemap($t1 =>	\&myhandler);
	sub myhandler
	{   my ($backend, $data, $type)	= @_;
	    return My::Perl::Class->new($data)
		if $backend eq 'READER';

	# shorter
	$schema->typemap($t1 =>	sub {My::Perl::Class->new($_[1])} );

       Typemap implementation

       Internally, the typemap is simply translated into an "after" hook for
       the specific type.  After the data was processed	via the	usual
       mechanism, the hook will	call method "fromXML" on the class or object
       you specified with the data which was read.  You	may still use "before"
       and "replace" hooks, if you need	them.

       Syntactic sugar:

	 $schema->typemap($t1 => 'My::Package');
	 $schema->typemap($t2 => $object);

       is comparible to

	 $schema->typemap($t1 => sub {My::Package->fromXML(@_)});
	 $schema->typemap($t2 => sub {$object->fromXML(@_)} );

       with some extra checks.

       This module is part of XML-Compile distribution version 1.63, built on
       July 02,	2019. Website:

       Copyrights 2006-2019 by [Mark Overmeer <>]. For other
       contributors see	ChangeLog.

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.  See

perl v5.32.0			  2019-07-02XML::Compile::Translate::Reader(3)


Want to link to this manual page? Use this URL:

home | help