Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
Marpa::PP(3)	      User Contributed Perl Documentation	  Marpa::PP(3)

NAME
       Marpa::PP - Pure	Perl version of	Marpa

SYNOPSIS
	   use Marpa::PP;

	   my $grammar = Marpa::PP::Grammar->new(
	       {   start   => 'Expression',
		   actions => 'My_Actions',
		   default_action => 'first_arg',
		   rules   => [
		       { lhs =>	'Expression', rhs => [qw/Term/]	},
		       { lhs =>	'Term',	rhs => [qw/Factor/] },
		       { lhs =>	'Factor', rhs => [qw/Number/] },
		       { lhs =>	'Term',	rhs => [qw/Term	Add Term/], action => 'do_add' },
		       {   lhs	  => 'Factor',
			   rhs	  => [qw/Factor	Multiply Factor/],
			   action => 'do_multiply'
		       },
		   ],
	       }
	   );

	   $grammar->precompute();

	   my $recce = Marpa::PP::Recognizer->new( { grammar =>	$grammar } );

	   $recce->read( 'Number', 42 );
	   $recce->read( 'Multiply', );
	   $recce->read( 'Number', 1 );
	   $recce->read( 'Add',	);
	   $recce->read( 'Number', 7 );

	   sub My_Actions::do_add {
	       my ( undef, $t1,	undef, $t2 ) = @_;
	       return $t1 + $t2;
	   }

	   sub My_Actions::do_multiply {
	       my ( undef, $t1,	undef, $t2 ) = @_;
	       return $t1 * $t2;
	   }

	   sub My_Actions::first_arg { shift; return shift; }

	   my $value_ref = $recce->value;
	   my $value = $value_ref ? ${$value_ref} : 'No	Parse';

DESCRIPTION
   This	is ALPHA software
       This is alpha software.	There may be bugs. The interface may change.
       Please be careful.  Do not rely on it for anything mission-critical.

   Overview
       Marpa parses any	language whose grammar can be written in BNF.  That
       includes	recursive grammars, ambiguous grammars,	infinitely ambiguous
       grammars	and grammars with useless or empty productions.

       This document contains a	top-level overview of the API for the Marpa
       parse engine.  The two examples in this document	show the typical flows
       of Marpa	method calls.  This document will use these examples to
       describe	the basic features of Marpa in semi-tutorial fashion.  Marpa's
       advanced	features, and full reference details of	all features, can be
       found in	the other Marpa	API documents.

   The Three Phases
       A parser	needs to:

       o   Accept a grammar.

       o   Read	input.

       o   Return values from the parses, according to a semantics.

       In Marpa	these three tasks are, for the most part, distinct phases.
       Grammars	are "Marpa::PP::Grammar" objects.  The reading of input	and
       the evaluation of the parse according to	the semantics is performed by
       "Marpa::PP::Recognizer" objects.

EXAMPLE	1: A SIMPLE CALCULATOR
       The synopsis shows the code for a very simple calculator.  It handles
       only addition and multiplication	of integers.  This section explains,
       line by line, how it works.

   Marpa::PP::Grammar::new
	   my $grammar = Marpa::PP::Grammar->new(
	       {   start   => 'Expression',
		   actions => 'My_Actions',
		   default_action => 'first_arg',
		   rules   => [
		       { lhs =>	'Expression', rhs => [qw/Term/]	},
		       { lhs =>	'Term',	rhs => [qw/Factor/] },
		       { lhs =>	'Factor', rhs => [qw/Number/] },
		       { lhs =>	'Term',	rhs => [qw/Term	Add Term/], action => 'do_add' },
		       {   lhs	  => 'Factor',
			   rhs	  => [qw/Factor	Multiply Factor/],
			   action => 'do_multiply'
		       },
		   ],
	       }
	   );

       Marpa grammars are "Marpa::PP::Grammar" objects.	 They are created with
       the Marpa::PP::Grammar::new constructor.	 The arguments to
       Marpa::PP::Grammar::new are references to hashes	of named arguments.
       In the key/value	pairs of these hashes, the hash	key is the name	of the
       argument, and the hash value is the value of the	named argument.

       The start Named Argument

	   start => 'Expression',

       The "start" named argument is required.	Its value is a string
       containing the name of the grammar's start symbol.

       Named Arguments for the Semantics

		   actions => 'My_Actions',
		   default_action => 'first_arg',

       The "actions" and "default_action" named	arguments specify semantics.
       Their argument values are strings, which	acquire	their semantics	during
       evaluation.

       Evaluation will be described later.  Peeking ahead, the
       "default_action"	named argument will be interpreted as an action	name.
       This action name	will resolve to	an action -- a Perl closure that
       implements semantics.  The action specified by "default_action" is used
       as the action for rules with no action of their own.  "actions"
       provides	the name of a Perl package where Marpa will look for its
       actions.

       The rules Named Argument

	   rules => [
	       { lhs =>	'Expression', rhs => [qw/Term/]	},
	       { lhs =>	'Term',	      rhs => [qw/Factor/] },
	       { lhs =>	'Factor',     rhs => [qw/Number/] },
	       { lhs =>	'Term',	rhs => [qw/Term	Add Term/], action => 'do_add' },
	       {   lhs	  => 'Factor',
		   rhs	  => [qw/Factor	Multiply Factor/],
		   action => 'do_multiply'
	       },
	   ],

       The value of the	"rules"	named argument is a reference to an array of
       rule descriptors.  In this example, all the rule	descriptors are	in the
       "long" form -- they are references to hashes of rule properties.	 In
       each key/value pair of a	rule descriptor	hash, the key is the name of a
       rule property, and the hash value is the	value of that rule property.

       The lhs Property

       The value of the	"lhs" rule property must be a string containing	the
       name of the rule's left hand side symbol.  Every	Marpa rule must	have a
       left hand side symbol.

       The rhs Property

       The value of the	"rhs" property is a reference to an array of strings
       containing names	of the rule's right hand symbols, in order.  This
       array may be zero length, in which case this is an empty	rule --	a rule
       with no symbols on the right hand side.	There are no empty rules in
       this example.

       The action Property

       The value of the	"action" rule property is a string.  Peeking ahead,
       each "action" property string will be interpreted as an action name.
       This action name	will be	resolved to a Perl closure that	implements the
       rule's semantics.

   Marpa::PP::Grammar::precompute
	   $grammar->precompute();

       Before a	Marpa grammar object can be used by a Marpa recognizer,	it
       must be precomputed.  Precomputation compiles data structures that the
       recognizer will need.

   Marpa::PP::Recognizer::new
	   my $recce = Marpa::PP::Recognizer->new( { grammar =>	$grammar } );

       "Marpa::PP::Recognizer::new" creates a new recognizer.  Its arguments
       are references to hashes	of named arguments.  In	this example the only
       named argument is the required argument:	""grammar"".  The value	of the
       "grammar" named argument	must be	a precomputed Marpa grammar.

   Marpa::PP::Recognizer::read
	   $recce->read( 'Number', 42 );
	   $recce->read( 'Multiply', );
	   $recce->read( 'Number', 1 );
	   $recce->read( 'Add',	);
	   $recce->read( 'Number', 7 );

       The "Marpa::PP::Recognizer::read" method	takes two arguments, a token
       name and	a token	value.	The token name must be the name	of a valid
       terminal	symbol in the grammar.	By default all symbols are valid as
       terminal	symbols, unless	a grammar contains empty rules.

       The grammars in the examples in this document do	not contain empty
       rules, and therefore are	free to	use any	symbol in the grammar as a
       token name.  For	more on	terminals, including how to explicitly mark
       terminal	symbols, see "Terminals" in Marpa::PP::Grammar.

       The token value must be a Perl scalar, but otherwise its	form and
       semantics are entirely up to the	application.  If the token value is
       omitted,	it is a	Perl "undef".  In the calculator example, the values
       of the ""Add"" and ""Multiply"" tokens are never	used, so they are
       allowed to default to "undef".

   Marpa::PP::Recognizer::value
	   my $value_ref = $recce->value;
	   my $value = $value_ref ? ${$value_ref} : 'No	Parse';

       The "Marpa::PP::Recognizer::value" method returns a reference to	the
       parse result's value, if	there was a parse result.  If there was	no
       parse result, "Marpa::PP::Recognizer::value" returns "undef".

   Resolving the Semantics
       The first thing "Marpa::PP::Recognizer::value" needs to do is to
       resolve the semantics.  Resolving the semantics means mapping the
       action names into actions.  Actions are Perl closures which directly
       implement semantics.  In	this example, the "actions" named argument is
       specified.  "actions" is	a Perl package name.  Marpa will look for
       actions in that package.

	   actions => 'My_Actions',

	   { lhs => 'Factor', rhs => [qw/Factor	Multiply Factor/], action => 'do_multiply' },

       For example, the	"action" property for the above	rule is
       ""do_multiply"" and the "actions" named argument	to the grammar was
       ""My_Actions"".	So Marpa looks for a closure whose fully qualified
       name is "My_Actions::do_multiply", which	it finds:

	   sub My_Actions::do_multiply {
	       my ( undef, $t1,	undef, $t2 ) = @_;
	       return $t1 * $t2;
	   }

       Rules do	not always have	"action" properties.  That is the case with
       these rules in this example:

	   { lhs => 'Expression', rhs => [qw/Term/] },
	   { lhs => 'Term', rhs	=> [qw/Factor/]	},
	   { lhs => 'Factor', rhs => [qw/Number/] },

       Where there is no "action" rule property, Marpa tries to	use the	"lhs"
       property	as an action name.  When Marpa cannot resolve the "lhs"
       property	as an action name, it will fall	back to	using the default
       action for the grammar.

       For example, in the first rule in the above display, Marpa will look
       for a Perl closure with the fully qualified name
       ""My_Actions::Expression"".  When Marpa does not	find a closure by that
       name, Marpa will	fall back to trying to use the default action.

       The other two rules in the above	display	work similarly.	 Marpa will
       look for	Perl closures named ""My_Actions::Term"" and
       ""My_Actions::Factor"".	It will	not find them and will fall back to
       trying to use the default action, as described next.

	   default_action => 'first_arg',

       The "default_action" named argument is resolved in the same way as are
       the "action" properties of the rules.  In this example, default_action
       is specified as ""first_arg"" and resolves to "My_Actions::first_arg".

   Actions
	   sub My_Actions::first_arg { shift; return shift; }

	   sub My_Actions::do_add {
	       my ( undef, $t1,	undef, $t2 ) = @_;
	       return $t1 + $t2;
	   }

       Value actions are Perl closures used as callbacks.  Value actions are
       called when nodes in a parse tree are evaluated.	 A value action
       receives	one or more arguments.	The first argument to a	value action
       is always a per-parse-result object, which the callbacks	can use	as a
       scratchpad.  In these examples, the per-parse-result object is not
       used.

       For a non-empty rule, the second	and any	subsequent arguments to	the
       callback	are the	values,	in lexical order, of the symbols on the	right
       hand side of the	rule.  If the action is	for an empty rule, the per-
       parse-result object will	be its only argument.

       Every value action is expected to return	a value.  With one exception,
       this value is passed up to a parent node	as an argument.	 The exception
       is the value for	the start rule.	 The return value for the start	rule
       becomes the value of the	parse result.

       Rules with no action specified for them take their semantics from the
       "default_action"	named argument.	 If there is no	default	action for a
       grammar,	rules with no action specified for them	return a Perl "undef".

EXAMPLE	2: AN AMBIGUOUS	PARSE
       This is the same	calculator as before, rewritten	to be ambiguous.
       Rather than give	multiplication precedence over addition, the rewritten
       calculator allows any order of operations.  In this example, the
       actions ("My_Actions::do_add", etc.)  and the @tokens array remain the
       same as before.

       Eliminating precedence makes the	grammar	shorter, but it	also means
       there can be multiple parse results, and	that the different parse
       results can have	different values.  In this application we decide, for
       each input, to return the value for every one of	the parse results.

	   use Marpa::PP;

	   my $ambiguous_grammar = Marpa::PP::Grammar->new(
	       {   start   => 'E',
		   actions => 'My_Actions',
		   rules   => [
		       [ 'E', [qw/E Add	E/],	  'do_add' ],
		       [ 'E', [qw/E Multiply E/], 'do_multiply'	],
		       [ 'E', [qw/Number/],	  ],
		   ],
		   default_action => 'first_arg',
	       }
	   );

	   $ambiguous_grammar->precompute();

	   my $ambiguous_recce =
	       Marpa::PP::Recognizer->new( { grammar =>	$ambiguous_grammar } );

	   $ambiguous_recce->read( 'Number', 42	);
	   $ambiguous_recce->read( 'Multiply', );
	   $ambiguous_recce->read( 'Number', 1 );
	   $ambiguous_recce->read( 'Add', );
	   $ambiguous_recce->read( 'Number', 7 );

	   my @values =	();
	   while ( defined( my $ambiguous_value_ref = $ambiguous_recce->value()	) ) {
	       push @values, ${$ambiguous_value_ref};
	   }

   Short Form Rule Descriptors
	   rules => [
	       [ 'E', [qw/E Add	E/],	  'do_add' ],
	       [ 'E', [qw/E Multiply E/], 'do_multiply'	],
	       [ 'E', [qw/Number/], ],
	   ],

       The rule	descriptors in the ambiguous example demonstrate the "short"
       or array	form of	rule descriptors.  Array form rule descriptors are
       references to arrays.  Here the elements	are, in	order, the "lhs"
       property, the "rhs" property, and the "action" property.

   Marpa::PP::Recognizer::value
	   my @values =	();
	   while ( defined( my $ambiguous_value_ref = $ambiguous_recce->value()	) ) {
	       push @values, ${$ambiguous_value_ref};
	   }

       When called more	than once, the "Marpa::PP::Recognizer::value" method
       iterates	through	the parse results.  For	each call, it returns a
       reference to the	value of a parse result.  At the end of	the iteration,
       after all parse results have been returned,
       "Marpa::PP::Recognizer::value" returns "undef".	If there were no parse
       results,	"Marpa::PP::Recognizer::value" returns "undef" the first time
       that it is called.

ERRORS AND EXCEPTIONS
       Methods in the Marpa API	do not return errors.  When there are errors,
       Marpa API methods throw an exception.

INHERITANCE
       Classes in the Marpa API	are not	designed to be inherited.

OTHER DOCUMENTS
       This document gives a semi-tutorial overview of the entire Marpa	API.
       For full	details	on Marpa's grammar objects and their methods, see the
       Marpa::PP::Grammar document.  For full details on Marpa's recognizer
       objects and their methods, see the Marpa::PP::Recognizer	document.

       Marpa::PP::Vocabulary is	intended as a quick refresher in parsing
       terminology, emphasizing	how the	standard terms are used	in the Marpa
       context.	 Marpa's standard semantics are	fully described	in the
       Marpa::PP::Semantics document.  Techniques for tracing and for
       debugging your Marpa grammars are described in the Marpa::PP::Tracing
       document	and the	Marpa::PP::Debug document.  For	those with a
       theoretical bent, my sources, and other useful references, are
       described in Marpa::PP::Advanced::Bibliography.

AUTHOR
       Jeffrey Kegler

   Why is it Called "Marpa"?
       Marpa is	the name of the	greatest of the	Tibetan	"translators".	In his
       time (the 11th century AD) Indian Buddhism was at its height.  Marpa's
       generation of scholars was devoted to producing Tibetan versions	of
       Buddhism's Sanskrit scriptures.	Marpa became the greatest of them, and
       today is	known as Marpa Lotsawa:	"Marpa the Translator".

   Blatant Plug
       Marpa is	a character in my novel, The God Proof.	 The God Proof centers
       around Kurt Goedel's proof of God's existence.  Yes, that Kurt Goedel,
       and yes,	he really did work out a God Proof (it's in his	Collected
       Works, Vol. 3, pp. 403-404).  The God Proof is available	as a free
       download	(<http://www.lulu.com/content/933192>).	 It can	be purchased
       in print	form at	Amazon.com:
       <http://www.amazon.com/God-Proof-Jeffrey-Kegler/dp/1434807355>.

ACKNOWLEDGMENTS
       Marpa is	directly derived from two other	parsers.  The first was
       discovered by John Aycock and R.	 Nigel Horspool	and is described in
       their Aycock and	Horspool 2002.	The second was described by Joop Leo
       and is described	in Leo 1991.  Aycock, Horspool,	and Leo, in turn,
       based their algorithms on the algorithm discovered by Jay Earley.  I
       combined	the Aycock-Horspool algorithm with the Leo algorithm, and
       added significant changes of my own.

       I'm grateful to Randal Schwartz for his support over the	years that
       I've been working on Marpa.  My chats with Larry	Wall have been few and
       brief, but his openness to new ideas has	been a major encouragement and
       his insight into	the relationship between "natural language" and
       computer	language has been a major influence.  More recently, Allison
       Randal and Patrick Michaud have been generous with their	very valuable
       time.  They might have preferred	that I volunteered as a	Parrot cage-
       cleaner,	but if so, they	were too polite	to say.

       Many at perlmonks.org answered questions	for me.	 I used	answers	from
       chromatic, Corion, dragonchild, jdporter, samtregar and Juerd, among
       others, in writing this module.	I'm just as grateful to	those whose
       answers I didn't	use.  My inquiries were	made while I was thinking out
       the code	and it wasn't always 100% clear	what I was after.  If the butt
       is moved	after the round, it shouldn't count against the	archer.

       In writing the Pure Perl	version	of Marpa, I benefited from studying
       the work	of Francois Desarmenien	("Parse::Yapp"), Damian	Conway
       ("Parse::RecDescent") and Graham	Barr ("Scalar::Util").	Adam Kennedy
       patiently instructed me in module writing, both on the finer points and
       on issues about which I really should have know better.

SUPPORT
       Marpa::PP comes without warranty.  Support is provided on a volunteer
       basis through the standard mechanisms for CPAN modules.	The Support
       document	has details.

COPYRIGHT AND LICENSE
	 Copyright 2012	Jeffrey	Kegler
	 This file is part of Marpa::PP.  Marpa::PP is free software: you can
	 redistribute it and/or	modify it under	the terms of the GNU Lesser
	 General Public	License	as published by	the Free Software Foundation,
	 either	version	3 of the License, or (at your option) any later	version.

	 Marpa::PP is distributed in the hope that it will be useful,
	 but WITHOUT ANY WARRANTY; without even	the implied warranty of
	 MERCHANTABILITY or FITNESS FOR	A PARTICULAR PURPOSE.  See the GNU
	 Lesser	General	Public License for more	details.

	 You should have received a copy of the	GNU Lesser
	 General Public	License	along with Marpa::PP.  If not, see
	 http://www.gnu.org/licenses/.

perl v5.32.1			  2021-11-04			  Marpa::PP(3)

NAME | SYNOPSIS | DESCRIPTION | EXAMPLE 1: A SIMPLE CALCULATOR | EXAMPLE 2: AN AMBIGUOUS PARSE | ERRORS AND EXCEPTIONS | INHERITANCE | OTHER DOCUMENTS | AUTHOR | ACKNOWLEDGMENTS | SUPPORT | COPYRIGHT AND LICENSE

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=Marpa::PP&sektion=3&manpath=FreeBSD+13.0-RELEASE+and+Ports>

home | help