XML::SAX::Machine(3)  User Contributed Perl Documentation XML::SAX::Machine(3)

NAME
       XML::SAX::Machine - Manage a collection of SAX processors

VERSION
       version 0.46

SYNOPSIS
	   ## Note: See	XML::SAX::Pipeline and XML::SAX::Machines first,
	   ## this is the gory,	detailed interface.

	   use XML::SAX::Machines qw( Machine );
	   use My::SAX::Filter2;
	   use My::SAX::Filter3;

	   my $filter3 = My::SAX::Filter3->new;

	   ## A	simple pipeline.  My::SAX::Filter1 will	be autoloaded.
	   my $m = Machine(
	       #
	       # Name	=> Class/object		   => handler(s)
	       #
	       [ Intake	=> "My::SAX::Filter1"	   => "B"	 ],
	       [ B	=> My::SAX::Filter2->new() => "C"	 ],
	       [ C	=> $filter3		   => "D"	 ],
	       [ D	=> \*STDOUT				 ],
	   );

	   ## A	parser will be created unless My::SAX::Filter1 can parse_file
	   $m->parse_file( "foo.revml" );

	   my $m = Machine(
	       [ Intake	  => "My::SAX::Filter1"	 => qw(	Tee	) ],
	       [ Tee	  => "XML::Filter::SAXT" => qw(	Foo Bar	) ],
	       [ Foo	  => "My::SAX::Filter2"	 => qw(	Out1	) ],
	       [ Out1	  => \$log				  ],
	       [ Bar	  => "My::SAX::Filter3"	 => qw(	Exhaust	) ],
	   );

DESCRIPTION
       WARNING:	This API is alpha!!!  It will be changing.

       A generic SAX machine (an instance of XML::SAX::Machine)	is a container
       of SAX processors (referred to as "parts") connected in arbitrary ways.

       Each parameter to "Machine()" (or "XML::SAX::Machine->new()")
       represents one top level part of the machine.  Each part has a name, a
       processor, and one or more handlers (usually specified by name, as
       shown in the SYNOPSIS).

       Since SAX machines may be passed	in as single top level parts, you can
       also create nested, complex machines ($filter3 in the SYNOPSIS could be
       a Pipeline, for example).

       SAX machines can act as normal SAX processors by connecting them to
       other SAX processors:

	   my $w = My::Writer->new();
	   my $m = Machine( ...., { Handler => $w } );
	   my $g = My::Parser->new( Handler => $m );

   Part	Names
       Although	it's not required, each	part in	a machine can be named.	 This
       is useful for retrieving	and manipulating the parts (see	"part",	for
       instance), and for debugging, since debugging output (see "trace_parts"
       and "trace_all_parts") includes the names.

       Part names must be valid Perl subroutine names, beginning with an
       uppercase character.  This is to allow convenience part accessor
       methods like

	   $c =	$m->NameOfAFilter;

       to work without ever colliding with the name of a method	(all method
       names are completely lower case).  Only filters named like this can be
       accessed	using the magical accessor functions.
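
       The naming rule can be checked before a spec is built.  The sketch
       below is illustrative only: "looks_like_part_name" is a hypothetical
       helper, not part of XML::SAX::Machine, and the module's own check may
       be stricter or looser than this pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by XML::SAX::Machine): true if $name
# is a plain identifier that starts with an uppercase ASCII letter.
sub looks_like_part_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return $name =~ /\A[A-Z][A-Za-z0-9_]*\z/ ? 1 : 0;
}

for my $name (qw( Intake NameOfAFilter filter2 2Fast Has-Dash )) {
    printf "%-14s %s\n", $name,
        looks_like_part_name($name) ? "ok" : "not usable as an accessor";
}
```

       Names that fail the check can still be looked up with "find_part";
       only the accessor shorthand depends on the naming convention.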

   Reserved Names: Intake and Exhaust
       The names "Intake" and "Exhaust" are reserved.  "Intake" refers to the
       first part in the processing chain.  This is not necessarily the first
       part in the constructor list, just the first part to receive external
       events.

       "Exhaust" refers to the output of the machine; no part may be named
       "Exhaust", and any parts with a handler named "Exhaust" will deliver
       their output to the machine's handler.  Normally, only one part should
       deliver its output to the Exhaust port.

       Calling $m->set_handler() alters	the Exhaust port, assuming any
       processors pointing to the "Exhaust" provide a "set_handler()" method
       like XML::SAX::Base's.

       "Intake"	and "Exhaust" are usually assigned automatically by single-
       purpose machines	like XML::SAX::Pipeline	and XML::SAX::Manifold.

   SAX Processor Support
       The XML::SAX::Machine class is very agnostic about what SAX processors
       it supports; about the only constraint is that each must be a blessed
       reference (of any type) that is not a Perl IO::Handle (IO::Handle
       objects are assumed to be input or output filehandles).

       The major constraint placed on SAX processors is that they must provide
       either a "set_handler" or "set_handlers" method (depending on how many
       handlers a processor can feed) to allow the SAX::Machine to disconnect
       and reconnect them.  Luckily, this is true of almost any processor
       derived from XML::SAX::Base.  Unfortunately, many older (SAX1)
       processors do not meet this requirement; they assume that SAX
       processors will only ever be connected together using their
       constructors.

   Connections
       SAX machines allow you to connect the parts however you like; each part
       is given a name and a list of named handlers to feed.  The number of
       handlers a part is allowed depends on the part; most filters only allow
       one downstream handler, but filters like XML::Filter::SAXT and
       XML::Filter::Distributor are meant to feed multiple handlers.

       Parts may not be connected in loops ("cycles" in graph theory terms).
       The machines specified by:

	   [ A => "Foo" => "A" ],  ## Illegal!

       and

	   [ A => "Foo" => "B" ],  ## Illegal!
	   [ B => "Foo" => "A" ],

       are both illegal.  Configuring a machine this way would cause events
       to flow in an infinite loop, and/or cause the first processor in the
       cycle to start receiving events from the end of the cycle before the
       input document was complete.  Besides, it's not a very useful
       topology :).

       SAX machines detect loops at construction time.
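
       One way such a check could work is a depth-first walk over the
       name-to-handler links.  The following is a minimal sketch under that
       assumption, not the module's actual implementation; "find_cycle" and
       "_walk" are hypothetical names, and specs are taken to have the
       [ Name => processor => handlers ] shape used in the SYNOPSIS.

```perl
use strict;
use warnings;

# Depth-first walk; $state->{$name} is 1 while $name is on the current
# path and 2 once everything downstream of it has been checked.
sub _walk {
    my ( $name, $feeds, $state ) = @_;
    my $seen = $state->{$name} || 0;
    return $name if $seen == 1;    # reached a part already on this path
    return       if $seen == 2;    # already known to be loop-free
    $state->{$name} = 1;
    for my $next ( @{ $feeds->{$name} || [] } ) {
        my $found = _walk( $next, $feeds, $state );
        return $found if defined $found;
    }
    $state->{$name} = 2;
    return;
}

# Returns the name of a part that lies on a cycle, or undef if none.
sub find_cycle {
    my @specs = @_;
    my %feeds;
    for my $spec (@specs) {
        my ( $name, undef, @handlers ) = @$spec;
        $feeds{$name} = [@handlers];
    }
    my %state;
    for my $name ( sort keys %feeds ) {
        my $found = _walk( $name, \%feeds, \%state );
        return $found if defined $found;
    }
    return;
}

my $loop = find_cycle(
    [ A => "Foo" => "B" ],
    [ B => "Foo" => "A" ],
);
print defined $loop ? "cycle through $loop\n" : "no cycle\n";
```

       Handler names with no spec of their own, such as "Exhaust", simply
       end the walk, so ordinary pipelines and tees pass the check.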

API
   Public Methods
       These methods are meant to be used by users of SAX machines.

       new()
	       my $m = $self->new( @machine_spec, \%options );

	   Creates $self using %options, and compiles the machine spec.  This
	   is the longhand form of "Machine( ... )".

       find_part
	   Gets	a part contained by this machine by name, number or object
	   reference:

	       $c = $m->find_part( $name );
	       $c = $m->find_part( $number );
	       $c = $m->find_part( $obj	);    ## useful	only to	see if $obj is in $m

	   If a	machine	contains other machines, parts of the contained
	   machines may	be accessed by name using unix directory syntax:

	       $c = $m->find_part( "/Intake/Foo/Bar" );

	   (all	paths must be absolute).

	   Parts may also be accessed by number	using array indexing:

	       $c = $m->find_part(0);  ## Returns first	part or	undef if none
	       $c = $m->find_part(-1); ## Returns last part or undef if	none
	       $c = $m->find_part( "Foo/0/1/-1"	);

	   There is no way to guarantee	that a part's position number means
	   anything, since parts can be	reconnected after their	position
	   numbers are assigned, so using a part name is recommended.

	   Throws an exception if the part is not found, so doing things like

	      $m->find_part( "Foo" )->bar()

	   garners an informative message when "Foo" is not found.  If you
	   want to test a result code, do something like

	       my $p = eval { $m->find_part( "Foo" ) };
	       unless ( $p ) {
		   ...handle lookup failure...
	       }
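
	   The eval idiom can be tried without a real machine.  In the
	   sketch below, "My::ToyMachine" is a stand-in invented for this
	   example (it is not XML::SAX::Machine, represents parts by their
	   names only, and handles just top level names and indexes); it
	   is assumed to throw on every failed lookup.

```perl
use strict;
use warnings;

{
    # Toy stand-in for a machine, defined only for this example.
    package My::ToyMachine;

    sub new {
        my ( $class, @names ) = @_;
        return bless { order => [@names] }, $class;
    }

    sub find_part {
        my ( $self, $id ) = @_;
        die "find_part: no part id given\n" unless defined $id;
        my @order = @{ $self->{order} };
        if ( $id =~ /\A-?\d+\z/ ) {
            # Numeric ids use Perl array indexing, so -1 is the last part.
            my $part = $order[$id];
            return $part if defined $part;
        }
        else {
            return $id if grep { $_ eq $id } @order;
        }
        die "find_part: no part '$id' in this machine\n";
    }
}

my $m = My::ToyMachine->new(qw( Intake Foo Bar ));

my $p = eval { $m->find_part("Baz") };
unless ($p) {
    # $@ holds the exception text thrown by find_part.
    print "lookup failed: $@";
}
```

	   Reading $@ immediately after the eval keeps the message from
	   being overwritten by a later eval.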

       parts
	       for ( $m->parts ) { ... }

	   Gets an arbitrarily ordered list of top level parts in this
	   machine.  This is all of the parts directly contained by this
	   machine and none of the parts that may be inside them.  So if a
	   machine contains an XML::SAX::Pipeline as one of its parts, the
	   pipeline will be returned but not the parts inside the pipeline.

       all_parts
	       for ( $m->all_parts ) { ... }

	   Gets	all parts in this machine, not just top	level ones. This
	   includes any	machines contained by this machine and their parts.

       set_handler
	       $m->set_handler(	$handler );
	       $m->set_handler(	DTDHandler => $handler );

	   Sets	the machine's handler and sets the handlers for	all parts that
	   have	"Exhaust" specified as their handlers.	Requires that any such
	   parts provide a "set_handler" or (if	the part has multiple
	   handlers) a "set_handlers" method.

	   NOTE: handler types other than "Handler" are	only supported if they
	   are supported by whatever parts point at the	"Exhaust".  If the
	   handler type	is "Handler", then the appropriate method is called
	   as:

	       $part->set_handler( $handler );
	       $part->set_handlers( $handler0, $handler1, ... );

	   If the type is some other handler type, these are called as:

	       $part->set_handler( $type => $handler );
	       $part->set_handlers( { $type0 =>	$handler0 }, ... );

       trace_parts
	       $m->trace_parts;		 ## trace all top-level	parts
	       $m->trace_parts(	@ids );	 ## trace the indicated	parts

	   Uses	Devel::TraceSAX	to enable tracing of all events	received by
	   the parts of	this machine.  Does not	enable tracing of parts
	   contained in	machines in this machine; for that, see
	   trace_all_parts.

       trace_all_parts
	       $m->trace_all_parts;	 ## trace all parts

	   Uses	Devel::TraceSAX	to trace all events received by	the parts of
	   this	machine.

       untracify_parts
	       $m->untracify_parts( @ids );

	   Converts the indicated parts back to SAX processors with tracing
	   disabled.  This may not work with processors that use AUTOLOAD.

Events and parse routines
       XML::SAX::Machine provides all SAX1 and SAX2 events and delegates them
       to the processor indicated by $m->find_part( "Intake" ).  This adds
       some overhead, so if you are concerned about overhead, you might want
       to direct SAX events directly to the Intake instead of to the machine.

       It also provides	parse...() routines so it can whip up a	parser if need
       be.  This means:	parse(), parse_uri(), parse_string(), and parse_file()
       (see XML::SAX::EventMethodMaker for details).  There is no way to pass
       methods directly	to the parser unless you know that the Intake is a
       parser and call it directly.  This is not so important for parsing,
       because the overhead it takes to	delegate is minor compared to the
       effort needed to	parse an XML document.
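
       The delegation cost is easiest to see in miniature.  The sketch below
       uses two toy classes invented for this example ("My::ToyCounter" and
       "My::ToyContainer"); it mimics the forwarding described above and is
       not how XML::SAX::Machine is implemented.

```perl
use strict;
use warnings;

{
    # Toy handler that just counts the events it receives.
    package My::ToyCounter;
    sub new {
        my ($class) = @_;
        return bless { seen => 0 }, $class;
    }
    sub start_element { my ($self) = @_; $self->{seen}++; return }
    sub seen          { my ($self) = @_; return $self->{seen} }
}

{
    # Toy container that forwards one SAX event to its intake part.
    package My::ToyContainer;
    sub new {
        my ( $class, $intake ) = @_;
        return bless { intake => $intake }, $class;
    }
    sub find_part { my ($self) = @_; return $self->{intake} }
    sub start_element {
        my $self = shift;
        # One extra lookup and method call per event: this is the overhead.
        return $self->find_part("Intake")->start_element(@_);
    }
}

my $counter   = My::ToyCounter->new;
my $container = My::ToyContainer->new($counter);

# Routed through the container, then sent straight to the intake part.
$container->start_element( { Name => "a" } );
$container->find_part("Intake")->start_element( { Name => "b" } );

print "events seen: ", $counter->seen, "\n";
```

       Both calls reach the same handler; the direct call just skips the
       forwarding step.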

   Internal and	Helper Methods
       These methods are meant to be used/overridden by	subclasses.

       _compile_specs
	       my @comp	= $self->_compile_specs( @_ );

	   Runs	through	a list of module names,	output specifiers, etc., and
	   builds the machine.

	       $scalar	   --> "$scalar"->new
	       $ARRAY_ref  --> pipeline	@$ARRAY_ref
	       $SCALAR_ref --> XML::SAX::Writer->new( Output =>	$SCALAR_ref )
	       $GLOB_ref   --> XML::SAX::Writer->new( Output =>	$GLOB_ref )
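
	   The mapping amounts to a dispatch on the kind of value given.
	   The sketch below only classifies a spec and returns a
	   description; "describe_spec" is a hypothetical helper, not the
	   module's code, and the rule for blessed objects is an assumption
	   based on the objects passed in the SYNOPSIS.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed );

# Hypothetical classifier: reports how a processor spec would be treated
# under the mapping above, without constructing anything.
sub describe_spec {
    my ($spec) = @_;
    return "undefined spec" unless defined $spec;
    # Assumption: an existing object is used as the processor itself.
    return "use as-is (already an object)" if blessed $spec;
    my $type = ref $spec;
    return qq{"$spec"->new}                                if $type eq '';
    return "pipeline of " . scalar(@$spec) . " spec(s)"    if $type eq 'ARRAY';
    return "XML::SAX::Writer->new( Output => SCALAR ref )" if $type eq 'SCALAR';
    return "XML::SAX::Writer->new( Output => GLOB ref )"   if $type eq 'GLOB';
    return "unsupported spec type $type";
}

my $log = "";
for my $spec ( "My::SAX::Filter1", [ "A", "B" ], \$log, \*STDOUT ) {
    print describe_spec($spec), "\n";
}
```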

       generate_description
	       $m->generate_description( $h );
	       $m->generate_description( Handler => $h );
	       $m->generate_description( Pipeline ... );

	   Generates a series of SAX events to the handler of your choice.

	   See XML::Handler::Machine2GraphViz on CPAN for a way	of visualizing
	   machine innards.

TODO
       o   Separate initialization from construction time; there should be
	   something like a $m->connect( ....machine_spec... ) that new() calls
	   to allow you to delay parts specification and reconfigure existing
	   machines.

       o   Allow an XML	doc to be passed in as a machine spec.

LIMITATIONS
AUTHOR
	   Barrie Slaymaker <barries@slaysys.com>

LICENSE
       Artistic	or GPL,	any version.

AUTHORS
       o   Barrie Slaymaker

       o   Chris Prather <chris@prather.org>

COPYRIGHT AND LICENSE
       This software is copyright (c) 2013 by Barrie Slaymaker.

       This is free software; you can redistribute it and/or modify it under
       the same	terms as the Perl 5 programming	language system	itself.

perl v5.32.0			  2013-08-19		  XML::SAX::Machine(3)
