Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages


home | help
YAWriter(3)	      User Contributed Perl Documentation	   YAWriter(3)

       XML::Handler::YAWriter -	Yet another Perl SAX XML Writer

	 use XML::Handler::YAWriter;

	 my $ya	= new XML::Handler::YAWriter( %options );
	 my $perlsax = new XML::Parser::PerlSAX( 'Handler' => $ya );

       YAWriter	implements Yet Another XML::Handler::Writer. The reasons for
       this one	are that I needed a flexible escaping technique, and want some
       kind of pretty printing.	If an instance of YAWriter is created without
       any options, the	default	behavior is to produce an array	of strings
       containing the XML in :


       Options are given in the	usual 'key' => 'value' idiom.

       Output IO::File
	   This	option tells YAWriter to use an	already	open file for output,
	   instead of using $ya->{Strings} to store the	array of strings. It
	   should be noted that	the only thing the object needs	to implement
	   is the print	method.	So anything can	be used	to receive a stream of
	   strings from	YAWriter.

       AsFile string
	   This	option will cause start_document to open named file and
	   end_document	to close it. Use the literal dash "-" if you want to
	   print on standard output.

       AsPipe string
	   This	option will cause start_document to open a pipe	and
	   end_document	to close it. The pipe is a normal shell	command.
	   Secure shell	comes handy but	has a 2GB limit	on most	systems.

       AsArray boolean
	   This	option will force storage of the XML in	$ya->{Strings},	even
	   if the Output option	is given.

       AsString	boolean
	   This	option will cause end_document to return the complete XML
	   document in a single	string.	Most SAX drivers return	the value of
	   end_document	as a result of their parse method. As this may not
	   work	with some combinations of SAX drivers and filters, a join of
	   $ya->{Strings} in the controlling method is preferred.

       Encoding	string
	   This	will change the	default	encoding from UTF-8 to anything	you
	   like.  You should ensure that given data are	already	in this
	   encoding or provide an Escape hash, to tell YAWriter	about the

       Escape hash
	   The Escape hash defines substitutions that have to be done to any
	   string, with	the exception of the processing_instruction and
	   doctype_decl	methods, where I think that escaping of	target and
	   data	would cause more trouble than necessary.

	   The default value for Escape	is

	       $XML::Handler::YAWriter::escape = {
		       '&'  => '&',
		       '<'  => '&lt;',
		       '>'  => '&gt;',
		       '"'  => '&quot;',
		       '--' => '&#45;&#45;'

	   YAWriter will use an	evaluated sub to make the recoding based on a
	   given Escape	hash reasonably	fast. Future versions may use XS to
	   improve this	performance bottleneck.

       Pretty hash
	   Hash	of string => boolean tuples, to	define kind of prettyprinting.
	   Default to undef. Possible string values:

	   AddHiddenNewline boolean
	       Add hidden newline before ">"

	   AddHiddenAttrTab boolean
	       Add hidden tabulation for attributes

	   CatchEmptyElement boolean
	       Catch empty Elements, apply "/>"	compression

	   CatchWhiteSpace boolean
	       Catch whitespace	with comments

	       Places Attributes on the	same line as the Element

	   IsSGML boolean
	       This option will	cause start_document, processing_instruction
	       and doctype_decl	to appear as SGML. The SGML is still well-
	       formed of course, if your SAX events are	well-formed.

	   NoComments boolean
	       Supress Comments

	   NoDTD boolean
	       Supress DTD

	   NoPI	boolean
	       Supress Processing Instructions

	   NoProlog boolean
	       Supress <?xml ... ?> Prolog

	   NoWhiteSpace	boolean
	       Supress WhiteSpace to clean documents from prior	pretty

	   PrettyWhiteIndent boolean
	       Add visible indent before any eventstring

	   PrettyWhiteNewline boolean
	       Add visible newlines before any eventstring

	   SAX1	boolean	(not yet implemented)
	       Output only SAX1	compliant eventstrings

       Correct handling	of start_document and end_document is required!

       The YAWriter Object initialises its structures during start_document
       and does	its cleanup during end_document.  If you forget	to call
       start_document, any other method	will break during the run. Most	likely
       place is	the encode method, trying to eval undef	as a subroutine. If
       you forget to call end_document,	you should not use a single instance
       of YAWriter more	than once.

       For small documents AsArray may be the fastest method and AsString the
       easiest one to receive the output of YAWriter. But AsString and AsArray
       may run out of memory with infinite SAX streams.	The only method
       XML::Handler::Writer calls on a given Output object is the print
       method. So it's easy to use a self written Output object	to improve

       A single	instance of XML::Handler::YAWriter is able to produce more
       than one	file in	a single run. Be sure to provide a fresh IO::File as
       Output before you call start_document and close this File after calling
       end_document. Or	provide	a filename in AsFile, so start_document	and
       end_document can	open and close its own filehandle.

       Automatic recoding between 8bit and 16bit does not work in any Perl
       correctly !

       I have Perl-5.00563 at home and here I can specify "use utf8;" in the
       right places to make recoding work. But I dislike saying	"use 5.00555;"
       because many systems run	5.00503.

       If you use some 8bit character set internally and want use national
       characters, either state	your character as Encoding to be ISO-8859-1,
       or provide an Escape hash similar to the	following :

	   $ya->{'Escape'} = {
			   '&'	=> '&amp;',
			   '<'	=> '&lt;',
			   '>'	=> '&gt;',
			   '"'	=> '&quot;',
			   '--'	=> '&#45;&#45;'
			   'A<paragraph>' => '&ouml;'
			   'Ax'	=> '&auml;'
			   'A1/4' => '&uuml;'
			   'A' => '&Ouml;'
			   'A' => '&Auml;'
			   'A' => '&Uuml;'
			   'A' => '&szlig;'

       You may abuse YAWriter to clean whitespace from XML documents. Take a
       look at,	doing just that	with an	XML::Edifact message, without
       querying	the DTD. This may work in 99% of the cases where you want to
       get rid of ignorable whitespace caused by the various forms of pretty

	   my $ya = new	XML::Handler::YAWriter(
	       'Output'	=> new IO::File	( ">-" );
	       'Pretty'	=> {
	       } );

       XML::Handler::Writer implements any method XML::Parser::PerlSAX wants.
       This extends the	Java SAX1.0 specification. I have in mind using
       Pretty=>SAX1=>1 to disable this feature,	if abusing YAWriter for	a SAX

       Michael Koehne, Kraehe@Copyleft.De

       "Derksen, Eduard	(Enno),	CSCIO" <> helped me	with the
       Escape hash and gave quite a lot	of useful comments.

       perl and	XML::Parser::PerlSAX

       Hey! The	above document had some	coding errors, which are explained

       Around line 465:
	   Non-ASCII character seen before =encoding in	''A<paragraph>''.
	   Assuming CP1252

perl v5.32.0			  2002-01-28			   YAWriter(3)


Want to link to this manual page? Use this URL:

home | help