IO::Util(3)	      User Contributed Perl Documentation	   IO::Util(3)

       IO::Util - A selection of general-utility IO functions

       The latest version's changes are reported in the Changes file in this
       distribution.

	       Time::HiRes   = 0
	       Sys::Hostname = 0

	       perl -MCPAN -e 'install IO::Util'

       Standard	installation
	   From	the directory where this file is located, type:

	       perl Makefile.PL
	       make test
	       make install

	 use IO::Util qw(capture slurp Tid Lid Uid load_mml);


	 # captures the	selected filehandle
	 $output_ref = capture { any_printing_code() } ;
	 # now $$output_ref eq 'something'

         # captures FILEHANDLE
         $output_ref = capture { any_special_printing_code() } \*FILEHANDLE ;

         # append the output to $captured
         capture { any_printing_code() } \*FILEHANDLE , \$captured ;
         # now $captured eq 'something'

	 # use another class to	tie the	handler
	 use IO::Scalar	;
         $IO::Util::TIE_HANDLE_CLASS = 'IO::Scalar' ;


	 $_ = '/path/to/file' ;
	 $content_ref =	slurp ;
	 $content_ref =	slurp '/path/to/file' ;
	 $content_ref =	slurp \*FILEHANDLE ;

	 # append the file content to $content
	 $_ = '/path/to/file' ;
	 slurp \$content;
	 slurp '/path/to/file',	\$content ;
	 slurp \*FILEHANDLE, \$content ;

       Tid(), Lid(), Uid()

	 $temporarily_unique_id	= Tid ;	# 'Q9MU1N_NVRM'
	 $locally_unique_id	= Lid ;	# '2MS_Q9MU1N_P5F6'
	 $universally_unique_id	= Uid ;	# 'MGJFSBTK_2MS_Q9MU1N_PWES'

       A MML file (Minimal Markup Language)

	   <!--	a multi	line
		     <key>any key</key>


	 $struct = load_mml 'path/to/mml_file' ;
	 $struct = load_mml \ $mml_string ;
	 $struct = load_mml \ *MMLFILE ;
	 $struct = load_mml ..., %options ;

	 # $struct = {
	 #	       'parA' => {
	 #			   'optA' => [
	 #				       '01',
	 #				       '02',
	 #				       '03'
	 #				     ]
	 #			 },
	 #	       'parB' => {
	 #			   'optA' => [
	 #				       '04',
	 #				       '05',
	 #				       '06'
	 #				     ],
	 #			   'optB' => {
	 #				       'key' =>	'any key'
	 #				     }
	 #			}
	 #	     }

       This is a micro-weight module that exports a few	functions of general
       utility in IO operations.

   capture { code } [ arguments	]
       Use this function to capture the output that some code writes to any
       FILEHANDLE (usually STDOUT) through "print", "printf" and "syswrite"
       statements. The function also works with already tied handles and
       with older perl versions.

       This function expects a mandatory code reference	(usually a code	block)
       passed as the first argument and	two optional arguments:	handle_ref and
       scalar_ref. The function	executes the referenced	code and returns a
       reference to the	captured output.

       If handle_ref is	omitted	the selected filehandle	will be	used by
       default (usually	"STDOUT"). If you pass scalar_ref, the output will be
       appended	to the referenced scalar. In this case the result of the
       function	will be	the same SCALAR	reference passed as the	argument.

       You can pass the optional arguments in mixed order. All the following
       statements work:

          $output_ref = capture {...}
          $output_ref = capture {...} \*FH
          capture {...} \*FH, \$output ; # capture FH and append to $output
          capture {...} \$output, \*FH ; # capture FH and append to $output
          capture {...} \$output ;       # append to $output
          capture \&code, ...            # a classical code ref works too

       Note: This function ties the FILEHANDLE to the IO::Util::WriteHandle
       class and unties it after the execution of the code. If FILEHANDLE is
       already tied to any other class, it just temporarily re-blesses the
       tied object to the IO::Util::Handle class, re-blessing it to its
       original class after the execution of the code, thus preserving the
       original FILEHANDLE configuration.
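
       The capture idea can also be illustrated with core Perl only. The
       following is a minimal sketch, not the tie-based mechanism described
       in the note above: it assumes an in-memory filehandle and "select",
       and the name capture_selected is hypothetical. It only handles the
       currently selected filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of IO::Util: run a code block and return a
# reference to everything it printed to the currently selected filehandle.
sub capture_selected (&) {
    my ($code) = @_;
    my $buffer = '';
    open my $mem, '>', \$buffer or die "cannot open in-memory handle: $!";
    my $previous = select $mem;      # make $mem the default output handle
    my $ok    = eval { $code->(); 1 };
    my $error = $@;
    select $previous;                # always restore the original handle
    close $mem;                      # flush pending output into $buffer
    die $error unless $ok;
    return \$buffer;
}

my $output_ref = capture_selected { print "some"; printf "%s", "thing" };
print "captured: $$output_ref\n";
```

       Unlike "capture", this sketch does not intercept output sent
       explicitly to a named handle, which is what the tie approach allows.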

   slurp [ arguments ]
       The "slurp" function expects a path to a	file or	an open	handle_ref,
       and returns the reference to the	whole file|handle_ref content. If no
       argument	is passed it will use $_ as the	argument.

       As an alternative you can pass also a SCALAR reference as an argument,
       so the content will be appended to the referenced scalar. In this case
       the result of the function will be the same SCALAR reference passed as
       the argument.

       Note: You can pass the optional arguments in mixed order. All the
       following statements work:

	  $content_ref = slurp ;   # open file in $_
	  $content_ref = slurp '/path/to/file';
	  slurp	'/path/to/file', \$content ;
	  slurp	\$content, '/path/to/file' ;
	  slurp	\$content ;
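
       The reading step behind a slurp-like function can be sketched in core
       Perl as follows. This is an illustration under assumptions, not the
       module's code: the name slurp_sketch is hypothetical, the argument
       order is fixed (source first, optional SCALAR reference second) and
       the $_ default is not handled.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of IO::Util: read a whole file (path) or an
# already open filehandle and return a reference to the content.
sub slurp_sketch {
    my ($source, $target_ref) = @_;
    my $fh;
    if (ref($source) eq 'GLOB') {
        $fh = $source;               # already an open handle reference
    } else {
        open $fh, '<', $source or die "cannot open '$source': $!";
    }
    my $buffer = '';
    $target_ref = \$buffer unless defined $target_ref;
    local $/;                        # undef record separator: read everything
    my $data = <$fh>;
    $$target_ref .= $data if defined $data;   # append, never overwrite
    return $target_ref;
}

# Demonstrate with an in-memory file so the example needs no real path.
my $text = "line 1\nline 2\n";
open my $in, '<', \$text or die "cannot open in-memory file: $!";
my $content = 'prefix: ';
slurp_sketch($in, \$content);
print $content;
```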

       The "Tid", "Lid" and "Uid" functions return a unique ID string useful
       for naming temporary files, or for other purposes.

   Tid ( [options] )
       This function returns a temporary ID valid for the current process
       only. Different temporarily-unique strings are guaranteed to be unique
       within the current process only ($$).

   Lid ( [options] )
       This function returns a local ID valid for the local host only.
       Different locally-unique strings are guaranteed to be unique when
       generated by the same local host.

   Uid ( [options] )
       This function returns a universal ID. Different universally-unique
       strings are guaranteed to be unique even when generated by different
       hosts. Use this function if you have more than one machine generating
       the IDs for the same context. This function includes the host IP number
       in the id algorithm.

   *id options
       The above functions accept an optional hash of named arguments:

       chars
           You can specify the set of characters used to generate the unique
           id string. You have the following options:

           chars => 'base34'
               Uses "[1..9, 'A'..'N', 'P'..'Z']". No lowercase characters, no
               digit 0 and no capital 'O'. Useful to avoid human mistakes when
               the id may be conveyed by non-electronic means (e.g.
               communicated by voice or read from paper). This is the default
               (used if you don't specify any chars option).

           chars => 'base62'
               Uses "[0..9, 'a'..'z', 'A'..'Z']". This option tries to
               generate shorter ids.

           chars => \@chars
               Any reference to an array of arbitrary characters.

       separator
           The character used to separate groups of characters in the id.
           Default '_'.

       IP  Applies to "Uid" only. This option lets you pass the IP number
           used to generate the universally-unique id. Use this option if you
           know what you are doing.

	  $ui =	Tid			      #	Q9MU1N_NVRM
	  $ui =	Lid			      #	2MS_Q9MU1N_P5F6
	  $ui =	Uid			      #	MGJFSBTK_2MS_Q9MU1N_PWES
	  $ui =	Uid separator=>'-'	      #	MGJFSBTK-2DH-Q9MU6H-7Z1Y
	  $ui =	Tid chars=>'base62'	      #	1czScD_2h0v
	  $ui =	Lid chars=>'base62'	      #	rq_1czScD_2jC1
	  $ui =	Uid chars=>'base62'	      #	jQaB98R_rq_1czScD_2rqA
	  $ui =	Lid chars=>[ 0..9, 'A'..'F']  #	9F4_41AF2B34_62E76
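
       How a character set and a separator shape an id can be sketched as
       follows. The algorithm used by the *id functions is not described
       here, so this is only an assumption-laden illustration: the helpers
       encode_number and sketch_id are hypothetical, and the id is simply the
       current time and the process id written with the chosen digits.

```perl
use strict;
use warnings;

# The default 'base34' set described above: no 0, no 'O', no lowercase.
my @base34 = (1 .. 9, 'A' .. 'N', 'P' .. 'Z');

# Hypothetical helper: write a non-negative integer using the given digits.
sub encode_number {
    my ($number, $chars) = @_;
    my $base    = @$chars;
    my $encoded = '';
    do {
        $encoded = $chars->[ $number % $base ] . $encoded;
        $number  = int($number / $base);
    } while ($number > 0);
    return $encoded;
}

# Hypothetical id: encoded time and process id joined by the separator.
sub sketch_id {
    my (%opt) = @_;
    my $chars     = $opt{chars} || \@base34;
    my $separator = defined $opt{separator} ? $opt{separator} : '_';
    return join $separator,
        encode_number(time, $chars), encode_number($$, $chars);
}

print scalar(@base34), " characters\n";      # prints "34 characters"
print encode_number(34, \@base34), "\n";     # prints "21"
print sketch_id(separator => '-'), "\n";
```

       A larger character set needs fewer digits for the same number, which
       is why a 62-character set tends to give shorter ids than the default.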

Minimal	Markup Language	(MML)
       A lot of	programmers use	(de facto) a subset of canonical XML which is
       characterized by:

	No Attributes
	No mixed Data and Element content
	No Processing Instructions (PI)
	No Document Type Declaration (DTD)
	No non-character entity-references
	No CDATA marked	sections
	Support	for only UTF-8 character encoding
	No optional features

       That subset has no official standard, so	in this	description we will
       generically refer to it as 'Minimal Markup Language' or MML. Please,
       note that MML is	just an	unofficial and generic way to name that
       minimal XML subset, avoiding any	possible MXML, SML, MinML, /.+ML$/

   MML advantages
       If you just need to store configuration parameters and construct any
       perl data structure, MML is all you need. Using it instead of
       full-featured XML gives you a few very interesting advantages:

       o   it is really simple to use, edit and understand, even for
           unskilled people

       o   you can parse it with a very light, fast and simple RE, thus
           avoiding loading and executing the several thousand lines of perl
           code needed to parse full-featured XML

       o   any canonical XML parser will still be able to parse it as well

   About XML parsing and structure reduction
       The "load_mml" function produces perl structures exactly like other
       CPAN modules (e.g. XML::Simple, XML::Smart) but uses the opposite
       approach. Those modules usually require a canonical XML parser to
       build a full XML tree, then prune all the unwanted branches. That
       means thousands of lines of code loaded and executed, and a
       potentially big structure to reduce, which is probably a waste of
       resources when you just have to deal with simple MML.

       The "load_mml" function uses just a few lines of recursive code,
       parsing MML with a simple RE. It builds up only the branches it needs,
       optionally ignoring all the unwanted nodes. That is exactly what you
       need for MML, but it is obviously completely inappropriate for full
       XML files (e.g. HTML) which use attributes and other features
       unsupported by MML.
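
       The "few lines of recursive code with a simple RE" approach can be
       sketched as follows. This is a hypothetical illustration, not the
       module's parser: the name parse_sketch and the regular expression are
       assumptions, comments, escapes and options are not handled, and an
       element nested inside another element of the same name is not
       supported.

```perl
use strict;
use warnings;

# Hypothetical sketch, not IO::Util's code: turn attribute-free markup into
# nested hashes; repeated sibling elements fold into an array reference.
sub parse_sketch {
    my ($mml) = @_;
    my %branch;
    my $found = 0;
    # Match <id>content</id>; the backreference \1 pairs each closing tag.
    while ($mml =~ m{<(\w+)>(.*?)</\1>}gs) {
        my ($id, $content) = ($1, $2);
        $found = 1;
        my $value = parse_sketch($content);   # recurse into the content
        if (!exists $branch{$id}) {
            $branch{$id} = $value;
        } elsif (ref($branch{$id}) eq 'ARRAY') {
            push @{ $branch{$id} }, $value;
        } else {
            $branch{$id} = [ $branch{$id}, $value ];
        }
    }
    return $found ? \%branch : $mml;          # leaf nodes return their text
}

my $struct = parse_sketch(
    '<opt><a><b>Foo</b><b>Bar</b></a><c>something</c></opt>');
print "$struct->{opt}{a}{b}[1] $struct->{opt}{c}\n";   # prints "Bar something"
```

       The sketch keeps the root element; only the branches that match the
       pattern are ever built, so anything else in the input is skipped.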

   load_mml ( MML [, options] )
       This function parses the MML, optionally using the options, and
       returns a perl structure reflecting the MML structure and any custom
       logic you may need (see "options"). It accepts one MML parameter that
       can be a reference to a SCALAR content, a path to a file or a
       reference to a filehandle.

       This function also accepts a few options, which can be passed as plain
       name=>value pairs or as a HASH reference.
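
       Accepting options either as plain pairs or as a HASH reference can be
       sketched like this. The helper normalize_options is hypothetical and
       is not part of the module; the default values are the ones documented
       for the individual options in this page.

```perl
use strict;
use warnings;

# Defaults as documented for each option in this page.
my %defaults = (optional => 0, strict => 1, cache => 1,
                markers  => '<>', keep_root => 0);

# Hypothetical helper: accept name=>value pairs or a single HASH reference
# and return one options hash with the defaults filled in.
sub normalize_options {
    my %given = (@_ == 1 && ref($_[0]) eq 'HASH') ? %{ $_[0] } : @_;
    return { %defaults, %given };
}

my $from_pairs = normalize_options(strict => 0, markers => '[]');
my $from_ref   = normalize_options({ strict => 0, markers => '[]' });
print join(', ', map { "$_=$from_ref->{$_}" } sort keys %$from_ref), "\n";
```

       Both calls yield the same hash, so the rest of the code can work with
       a single options reference whichever calling style was used.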


       You can customize the process by setting a few options, which give
       you full control over the process and the resulting structure (see
       also the t/05_load_mml.t test file for a few examples):

       optional	=> 0|1
           Boolean. This option applies when the MML argument is a path to a
           file: with a true value the function will not try to load a file
           that doesn't exist; with a false value it will croak on error.
           False by default.

       strict => 1|0
           Boolean. A true value will croak when any unsupported syntax is
           found, while a false value will quietly ignore unsupported syntax.
           Default true (strict).

	      $strict_mml = '<opt><a>01</a></opt>';
	      $non_strict_mml =	<< 'EOS';
		  mixed	content	ignored
		  <elem	attr="ignored">01</elem>

	      $structA = load_mml \$non_strict_mml ; # would croak
	      $structB = load_mml \$non_strict_mml, strict=>0 ;	 # ok

       cache =>	1|0
           Boolean. If MML is a path, a true value will cache the mml
           structure in a global (persistent under mod_perl). "load_mml" will
           open and parse the file only the first time or if the file has been
           modified. If for any reason you don't want to cache the structure,
           set this option to a false value. Default true (cached).

           Note: If you need to parse the same file with different options
           (thus producing different structures) you must disable the cache.
           Also, when you have a lot of mml files with very simple structures
           the cache could slow down the parsing. Caching is convenient when
           you have complex or large structures and a few files.

       markers => '<>'|'[]'|'{}'
           Instead of using the canonical '<>' markers, you can use the "[]"
           or the "{}" non-standard markers. Your file will not be XML
           compliant anymore, but this may be very useful when the file
           content is composed of XML or HTML chunks, since you can avoid
           escaping '<' and '>'. Default: standard XML markers '<>'.

	      $mml = '[opt][a]<a href="something">something</a>[/a][/opt]';
	      $structA = load_mml \$mml, markers => '[]' ;
	      #	$structA = {
	      #		     'a' => '<a	href="something">something</a>'
	      #		   };

       keep_root => 0|1
	   Boolean. A true value will keep the root element, while a false
	   value will strip the	root. Default false (root stripped)

	      $mml = '<opt><a>01</a></opt>';
	      $structA = load_mml \$mml	;

              $$structA{a} eq '01'; # true

	      #	$structA = {
	      #		     'a' => '01'
	      #		   };

	      $structB = load_mml \$mml, keep_root=>1 ;

              $$structB{opt}{a} eq '01'; # true

	      #	$structB = {
	      #		     'opt' => {
	      #				'a' => '01'
	      #			      }
	      #		   };

       filter => { id|re => CODE|'TRIM_BLANKS'|'ONE_LINE' }
           This option lets you filter the data on its way from the MML to
           the structure. You must set it to a hash of id/filter pairs. The
           key id can be the literal id of the element whose content you want
           to filter, or any compiled RE you want to match against the
           element ids; the filter can be a CODE reference or the name of one
           of the two literal built-in filters: 'TRIM_BLANKS', 'ONE_LINE'.

           The referenced code will receive id, data_reference and
           active_options_reference as the arguments; in addition, for
           regexing convenience, the data is aliased in $_.

	      $mml = <<	'EOS';
		 <anything_else>not filtered</anything_else>

              $struct = load_mml \$mml, filter=>{ foo         => sub{uc},
                                                  qr/^b/      => sub{lc},
                                                  multi_line  => 'TRIM_BLANKS',
                                                  other_stuff => \&my_filter
                                                } ;

              sub my_filter {
                  my ($id, $data_ref, $opt) = @_ ;
                  # $_ contains the actual data
                  # so you could use it instead of $$data_ref
                  # return $_ (if you modified it with any s///)
                  # or any arbitrarily modified data
                  return 'something else';
              }

	      #	$struct	= {
	      #		    'foo' => 'AAA', # it was 'aaa'
	      #		    'bar' => 'bbb', # it was 'bBB'
	      #		    'baz' => 'zzz', # it was 'ZZz'
	      #		    'multi_line' => "other\ndata",  # it was "\n  other\n  data\n"
	      #		    'other_stuff' => 'something	else', # it was	'something'
	      #		    'anything_else' => 'not filtered'  # the same
	      #		  }

       handler => { id|re => CODE|'SPLIT_LINES'	}
	   This	option allows you to execute any code during the parsing of
	   the MML in order to change the returned structure or	do any other
	   task. It allows you to implement your own syntax, checks and
	   executions, skip any	branch,	change the options of any child	node,
	   generate nodes or even objects to add to the	returned structure.

           You must set it to a hash of id/handler pairs. The key id can be
           the literal id of the element whose content you want to handle, or
           any compiled RE you want to match against the element ids; the
           handler may be a CODE reference or the name of the literal
           built-in handler 'SPLIT_LINES' (a handler that splits the lines of
           the node into an array of elements: see the example below).

           The referenced CODE will be called instead of the standard
           "IO::Util::parse_mml" handler, and will receive id, data_reference
           and active_options_reference as the arguments.

           It is expected to return the branch to add to the returned
           structure. If the referenced CODE needs to refer to the original
           branch structure, it can retrieve it by calling
           "IO::Util::parse_mml( @_ )".
	   A few examples using	this same MML string:

	      $mml = <<	'EOS';

	   Regular parsing and structure:

	      $struct =	load_mml \$mml # no options

	      #	$struct	= {
	      #		    'a'	=> {
	      #			     'b' => [
	      #				      'Foo',
	      #				      'Bar'
	      #				    ]
	      #			   },
	      #		    'c'	=> 'something'
	      #		  } ;

	   Skip	all the	'a' elements:

	      $struct =	load_mml \$mml,	handler=>{ a =>	sub{} }	; # just for 'a' elements

	      #	$struct	= { 'c'	=> 'something' } ;

	   Folding an array:

	      $struct =	load_mml \$mml,	handler	=> { a => \&a_handler }	; # just for 'a'

              sub a_handler {
                  # get the original branch
                  my $branch = IO::Util::parse_mml( @_ );
                  $$branch{b} # ['Foo','Bar']
              }

	      #	$structB = {
	      #		     'a' => [
	      #			      'Foo',
	      #			      'Bar'
	      #			    ],
	      #		     'c' => 'something'
	      #		   } ;

	   You can also	use the	built-in handler 'SPLIT_LINES' and write a MML
	   like	this:

	      $mml = <<	'EOS';

	      $struct =	load_mml \$mml,
                         handler=>{ b => 'SPLIT_LINES' },
			 filter	=>{ b => 'TRIM_BLANKS' }

	      #	$struct	= {
	      #		    'a'	=> {
	      #			     'b' => [
	      #				      'Foo',
	      #				      'Bar'
	      #				    ]
	      #			   },
	      #		    'c'	=> 'something'
	      #		  } ;

   parse_mml (id, MML [, options])
       Used internally, and optionally by any handler, to parse any MML chunk
       and return its branch structure. It requires the element id and the
       reference to the MML chunk, and optionally accepts the options hash
       reference to use for the branch.

       Note: You can escape any character (especially < and >) by using the
       backslash '\'. XML comments can be added to the MML and will be ignored
       by the parser.

       If you need support or if you want just to send me some feedback	or
       request,	please use this	link:

       (C) 2004-2005 by Domizio Demichelis.

       All Rights Reserved. This module	is free	software. It may be used,
       redistributed and/or modified under the same terms as perl itself.


perl v5.32.1			  2005-12-31			   IO::Util(3)

