XML::Checker::Parser(3)  User Contributed Perl Documentation  XML::Checker::Parser(3)

NAME
       XML::Checker::Parser - an XML::Parser that validates at parse time

SYNOPSIS
	use XML::Checker::Parser;

	my %expat_options = (KeepCDATA => 1,
			     Handlers => { Unparsed => \&my_Unparsed_handler });
	my $parser = new XML::Checker::Parser (%expat_options);

	eval {
	    local $XML::Checker::FAIL = \&my_fail;
	    $parser->parsefile ("fail.xml");
	};
	if ($@) {
	    # Either XML::Parser (expat) threw an exception or my_fail() died.
	    ... your error handling code here ...
	}

	# Throws an exception (with die) when an error is encountered; this
	# will stop the parsing process.
	# Don't die if a warning or info message is encountered, just print a
	# message.
	sub my_fail {
	    my $code = shift;
	    die XML::Checker::error_string ($code, @_) if $code < 200;
	    XML::Checker::print_error ($code, @_);
	}

DESCRIPTION
       XML::Checker::Parser extends XML::Parser.

       I hope the example in the SYNOPSIS says it all: just use
       XML::Checker::Parser as if it were an XML::Parser.  See XML::Parser for
       the supported (expat) options.

       You can also derive your parser from XML::Checker::Parser instead of
       from XML::Parser. All you should have to do is replace:

	package MyParser;
	@ISA = qw( XML::Parser );

       with:

	package MyParser;
	@ISA = qw( XML::Checker::Parser );

XML::Checker::Parser constructor
	$parser	= new XML::Checker::Parser (SkipExternalDTD => 1, SkipInsignifWS => 1);

       The constructor takes the same parameters as XML::Parser	with the
       following additions:

       SkipExternalDTD
	   By default, it will try to load external DTDs using LWP. You can
	   disable this by setting SkipExternalDTD to 1. See External DTDs for
	   details.

       SkipInsignifWS
	   By default, it will treat insignificant whitespace as regular Char
	   data.  By setting SkipInsignifWS to 1, the user Char handler will
	   not be called if insignificant whitespace is encountered.  See
	   "INSIGNIFICANT_WHITESPACE" in XML::Checker for details.

       LWP_UserAgent
	   When calling parsefile() with a URL (instead of a filename) or when
	   loading external DTDs, we use LWP to download the remote file. By
	   default it will use a LWP::UserAgent that is created as follows:

	    use LWP::UserAgent;
	    $LWP_USER_AGENT = LWP::UserAgent->new;
	    $LWP_USER_AGENT->env_proxy;

	   Note that env_proxy reads proxy settings from your environment
	   variables, which is what I need to do to get through our firewall.
	   If you want to use a different LWP::UserAgent, you can either set
	   it globally with:

	    XML::Checker::Parser::set_LWP_UserAgent ($my_agent);

	   or, you can specify it for a	specific XML::Checker::Parser by
	   passing it to the constructor:

	    my $parser = new XML::Checker::Parser (LWP_UserAgent => $my_agent);

	   Currently, LWP is used when the filename (passed to parsefile)
	   starts with one of the following URL schemes, followed by a colon:
	   http, https, ftp, wais, gopher, or file. If I missed one, please
	   let me know.

	   The LWP modules are part of libwww-perl which is available at CPAN.
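
       The scheme test described above can be sketched in a few lines of
       plain Perl. This is only an illustration: looks_like_url() is a
       hypothetical helper, not a function of XML::Checker::Parser, and the
       case-sensitive match is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Checker::Parser): returns 1 when the
# name starts with one of the listed URL schemes followed by a colon.
# Assumption: schemes are matched case-sensitively.
sub looks_like_url {
    my ($name) = @_;
    return $name =~ /^(?:https?|ftp|wais|gopher|file):/ ? 1 : 0;
}

for my $name ("http://example.com/doc.xml", "file:/tmp/doc.xml", "doc.xml") {
    printf "%-28s %s\n", $name,
        looks_like_url($name) ? "fetched with LWP" : "opened as a local file";
}
```

       Anything that does not match, such as a plain relative file name, would
       be handed to XML::Parser as an ordinary local file.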

External DTDs
       XML::Checker::Parser will try to	load and parse external	DTDs that are
       referenced in DOCTYPE definitions unless	you set	the SkipExternalDTD
       option to 1 (the	default	setting	is 0.)	See CAVEATS for	details	on
       what is not supported by	XML::Checker::Parser.

       XML::Parser (version 2.27 and up) does a	much better job	at reading
       external	DTDs, because recently external	DTD parsing was	added to
       expat.  Make sure you set the XML::Parser option	ParseParamEnt to 1 and
       the XML::Checker::Parser	option SkipExternalDTD to 1.  (They can	both
       be set in the XML::Checker::Parser constructor.)
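
       A minimal sketch of that combination follows. It only builds the
       option hash and constructs the parser when XML::Checker::Parser happens
       to be installed; the guard around require is an illustration, not
       something the module asks for.

```perl
use strict;
use warnings;

# Let expat read the external DTD and stop XML::Checker::Parser from
# fetching it a second time.
my %options = (
    ParseParamEnt   => 1,    # XML::Parser (expat) option
    SkipExternalDTD => 1,    # XML::Checker::Parser option
);

# Only construct the parser when the module is actually available.
my $parser;
if (eval { require XML::Checker::Parser; 1 }) {
    $parser = XML::Checker::Parser->new(%options);
}
print defined $parser
    ? "parser created\n"
    : "XML::Checker::Parser is not installed\n";
```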

       When external DTDs are parsed by	XML::Checker::Parser, they are located
       in the following	order:

       o   With the %URI_MAP, which can be set using map_uri.  This hash maps
	   external resource ids (like system IDs and public IDs) to full
	   path URIs.  It was meant to aid in resolving PUBLIC IDs found in
	   DOCTYPE declarations after the PUBLIC keyword, e.g.

	     <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">

	   However, you	can also use this to force XML::Checker	to read	DTDs
	   from	a different URL	than was specified (e.g. from the local	file
	   system for performance reasons.)

       o   on the Internet, if their system identifier starts with a protocol
	   (like http://...)

       o   on the local	disk, if their system identifier starts	with a slash
	   (absolute path)

       o   in the SGML_SEARCH_PATH, if their system identifier is a relative
	   file	name. It will use @SGML_SEARCH_PATH if it was set with
	   set_sgml_search_path(), or the colon-separated
	   $ENV{SGML_SEARCH_PATH}, or (if that isn't set) the list (".",
	   "$ENV{'HOME'}/.sgml", "/usr/lib/sgml", "/usr/share/sgml"), which
	   includes the	current	directory, so it should	do the right thing in
	   most	cases.
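
       The lookup order above can be sketched as a small stand-alone function.
       This is not the module's code: %uri_map, @search_path, search_dirs()
       and dtd_candidates() are hypothetical stand-ins, and both the protocol
       pattern and the choice to try the public id before the system id in the
       map are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the module's internal state.
my %uri_map     = ("-//W3C//DTD HTML 4.0//EN" => "file:/user/html.dtd");
my @search_path = ();    # as if set_sgml_search_path() was never called

# Directories to search for relative system identifiers.
sub search_dirs {
    return @search_path if @search_path;
    return split(/:/, $ENV{SGML_SEARCH_PATH})
        if defined $ENV{SGML_SEARCH_PATH};
    my $home = defined $ENV{HOME} ? $ENV{HOME} : "";
    return (".", "$home/.sgml", "/usr/lib/sgml", "/usr/share/sgml");
}

# Candidate locations for a DTD, in the order described above.
sub dtd_candidates {
    my ($sysid, $pubid) = @_;
    for my $id (grep { defined $_ } ($pubid, $sysid)) {
        return ($uri_map{$id}) if exists $uri_map{$id};    # 1. URI map
    }
    return () unless defined $sysid;
    return ($sysid) if $sysid =~ /^[A-Za-z][A-Za-z0-9+.-]*:/;    # 2. protocol
    return ($sysid) if $sysid =~ m{^/};                    # 3. absolute path
    return map { "$_/$sysid" } search_dirs();              # 4. search path
}

print join("\n", dtd_candidates("html.dtd", "-//W3C//DTD HTML 4.0//EN")), "\n";
print join("\n", dtd_candidates("/usr/local/dtd/doc.dtd")), "\n";
```

       For a relative name the function returns one candidate per search
       directory; a real implementation would then pick the first one that
       exists on disk.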

   Static methods related to External DTDs
       set_sgml_search_path (dir1, dir2, ...)
	   External DTDs with relative file paths are looked up using the
	   @SGML_SEARCH_PATH, which can be set with this method. If
	   @SGML_SEARCH_PATH is never set, it will use the colon-separated
	   $ENV{SGML_SEARCH_PATH} instead. If neither is set, it uses the
	   list: ".", "$ENV{'HOME'}/.sgml", "/usr/lib/sgml",
	   "/usr/share/sgml".

	   set_sgml_search_path	is a static method.

       map_uri (pubid => uri, ...)
	   To define the location of PUBLIC ids, as found in DOCTYPE
	   declarations after the PUBLIC keyword, e.g.

	     <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">

	   call this method, e.g.

	     XML::Checker::Parser::map_uri (
		   "-//W3C//DTD	HTML 4.0//EN" => "file:/user/html.dtd");

	   See External	DTDs for more info.

	   XML::Checker::Parser::map_uri is a static method.

Switching user handlers	at parse time
       You should be able to use setHandlers() just as in XML::Parser.	(Using
       setHandlers has not been	tested yet.)

Error handling
       XML::Checker::Parser routes each failure through
       XML::Checker::Parser::fail_add_context() before calling your fail
       handler (i.e. the global fail handler, $XML::Checker::FAIL; see
       "ERROR_HANDLING" in XML::Checker).  It adds the (line, column, byte)
       information from XML::Parser to the error context (unless it was the
       end of the XML document).
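
       As an alternative to dying on the first error, a fail handler can
       collect what it is given. The sketch below is self-contained and does
       not load XML::Checker; collecting_fail() is a hypothetical handler, and
       the codes and context pairs passed to it are made up for illustration.
       It only assumes, as in the SYNOPSIS, that the first argument is a
       numeric code and that codes below 200 are errors.

```perl
use strict;
use warnings;

my (@errors, @notices);

# Fail handler sketch: record the code and its context instead of dying.
# Codes below 200 are treated as errors, everything else as warning/info.
sub collecting_fail {
    my ($code, @context) = @_;
    my $record = { code => $code, context => [@context] };
    if ($code < 200) {
        push @errors, $record;
    }
    else {
        push @notices, $record;
    }
    return;
}

# Simulated calls; the codes and the context key names are invented here.
collecting_fail(101, line => 12, column => 4, byte => 230);
collecting_fail(200, line => 20, column => 1, byte => 512);

printf "%d error(s), %d warning/info message(s)\n",
    scalar @errors, scalar @notices;
```

       Such a handler would be installed the same way as my_fail in the
       SYNOPSIS, by assigning a reference to it to $XML::Checker::FAIL with
       local before calling parsefile.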

Supported XML::Parser handlers
       Only the	following XML::Parser handlers are currently routed through
       XML::Checker: Init, Final, Char,	Start, End, Element, Attlist, Doctype,
       Unparsed, Notation.

CAVEATS
       When using XML::Checker::Parser to parse external DTDs (i.e. with
       SkipExternalDTD => 0), expect trouble when your external DTD contains
       parameter entities inside declarations or conditional sections. The
       external DTD should probably have the same encoding as the original XML
       document.

AUTHOR
       Send bug reports, hints, tips, and suggestions to Enno Derksen.

SEE ALSO
       XML::Checker ("SEE_ALSO" in XML::Checker), XML::Parser

perl v5.24.1			  2000-01-31	       XML::Checker::Parser(3)
