Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
Marpa::Semantics(3)   User Contributed Perl Documentation  Marpa::Semantics(3)

NAME
       Marpa::Semantics	- How Marpa Evaluates Parses

SYNOPSIS
	   my $grammar = Marpa::Grammar->new(
	       {   start	  => 'Expression',
		   actions	  => 'My_Actions',
		   default_action => 'first_arg',
		   rules	  => [
		       { lhs =>	'Expression', rhs => [qw/Term/]	},
		       { lhs =>	'Term',	      rhs => [qw/Factor/] },
		       { lhs =>	'Factor',     rhs => [qw/Number/] },
		       { lhs =>	'Term',	rhs => [qw/Term	Add Term/], action => 'do_add' },
		       {   lhs	  => 'Factor',
			   rhs	  => [qw/Factor	Multiply Factor/],
			   action => 'do_multiply'
		       },
		   ],
	       }
	   );

	   sub My_Actions::do_add {
	       my ( undef, $t1,	undef, $t2 ) = @_;
	       return $t1 + $t2;
	   }

	   sub My_Actions::do_multiply {
	       my ( undef, $t1,	undef, $t2 ) = @_;
	       return $t1 * $t2;
	   }

	   sub My_Actions::first_arg { shift; return shift; }

	   my $value_ref = $recce->value;
	   my $value = $value_ref ? ${$value_ref} : 'No	Parse';

OVERVIEW
       Marpa's semantics will be familiar to those who have used traditional
       methods to evaluate parses.  A parse is seen as a parse tree.  Nodes on
       the tree	are evaluated recursively, bottom-up.  Once the	values of all
       its child nodes are known, a parent node	is ready to be evaluated.  The
       value of	a parse	is the value of	the top	node of	the parse tree.

       Marpa's semantics see a virtual parse tree with virtual nodes.  These
       virtual structures keep ambiguity and rule rewrites invisible to	the
       semantics.  Nodes in Marpa's virtual parse trees	are of three kinds:

       o   Token Nodes

       o   Rule	Nodes

       o   Null	Nodes

   Action Names	and Semantic Closures
       During grammar generation and input recognition,	Marpa does not allow
       semantics to be specified directly as Perl closures.  This approach to
       semantics follows Perl 6, the wisdom of whose example I discovered the
       hard way.

       When a Marpa grammar is created,	all non-constant semantics take	the
       form of strings which express actions.  Actions are semantics in	an
       indirect	form.  Actions are given meaning later by the evaluator.

       At evaluation time, actions are resolved	to semantic Perl closures, and
       the semantic Perl closures are run to produce values.  (For clarity, I
       often call the Perl closures which implement semantics, semantic
       closures	or semantic Perl closures.)

NODES
   Token Nodes
       For every input token, there is an associated token node.  In the
       usual, token-stream, model of input, every token	will become a leaf
       node in the parse tree.	Tokens always have a token symbol.  At lexing
       time, they can be assigned a token value.  If no	token value is
       assigned	at lex time, the token value defaults to a Perl	"undef".

   Rule	Nodes
       Nodes which are ancestors of token nodes	are called rule	nodes.	Rule
       nodes are always	associated with	a rule.	 The value of a	rule node is
       computed	at node	evaluation time.  Applications can specify, on a per-
       rule basis, semantic Perl closures to evaluate rule nodes.

       The semantic closure's arguments	will be	a per-parse variable followed
       by the values of	its child nodes	in lexical order.  The value returned
       by the semantic Perl closure becomes the	value of the node.  If there
       is no semantic closure for a rule node, the value of the	rule node is a
       Perl "undef".

   Sequence Rule Nodes
       Some rules are sequence rules.  Everything said above about rule	nodes,
       also applies to sequence	rule nodes.  Specifically, the arguments to
       the semantic Perl closures for sequence rules are the per-parse
       variable	followed by the	values of the child nodes in lexical order.

       The difference (and it is a big one) is that in an ordinary rule, the
       right hand side is fixed	in length, and that length is known when you
       are writing the semantic	Perl closure.  In a sequence rule, the number
       of right	hand side symbols is not known until node evaluation time.  A
       semantic	Perl closure for a sequence rule node is written in the	same
       way that	you write any Perl closure that	handles	a variable number of
       arguments.

       Sequence	semantics work best for	sequences of items all of which	have
       the same	semantics.  When that is not the case, writing the sequence
       using ordinary non-sequence rules should	be considered as an
       alternative.

       By default, if a	sequence rule has separators, the separators are
       thrown away before the semantics	are applied.  They do not appear in
       the semantic Perl closure's @_ array.  If the value of the "keep" rule
       property	is a Perl true,	separators are kept, and do appear in the @_
       array.

   Null	Nodes
       A null node is a	node which derives the zero-length, or empty string.
       This means that a null node cannot be the ancestor of any token nodes.
       In Marpa, null nodes are	always leaf nodes.

       Null nodes are of two kinds.  A nulling symbol node corresponds to a
       nulling symbol.	A nulled rule node represents a	nulled rule.

       For every null node there is a null node	symbol,	which is used to the
       determine the value of the null node.  For a nulled rule	node, the null
       node symbol is the nulled rule's	left hand side symbol.	For a nulling
       symbol node, the	null node symbol is the	nulling	symbol.

       The value of a null node	is the null value of the null node symbol.
       The null	value of a symbol comes	from that symbol's "null_value"
       property, if one	is defined.  Otherwise,	the null value of the symbol
       comes from the grammar's	default	null value, as defined by the
       grammar's "default_null_value" named argument.  If neither the symbol
       "null_value" property or	the grammar's "default_null_value" named
       argument	is defined, a symbol's null value is a Perl "undef".

       A null subtree is a subtree all of whose	nodes are null.	 Marpa prunes
       all null	subtrees back to their topmost null node.  This	means that all
       null nodes that remain in Marpa's virtual parse tree will be leaf
       nodes.

       The "lost" semantics of the non-root nodes of null subtrees are usually
       not missed.  Null subtrees cannot contain token nodes, so no token
       nodes are lost when null	subtrees are pruned.  As bushy as a null
       subtree might be, all of	its nodes are null nodes.

       All null	nodes correspond to zero-length	strings, so we are literally
       dealing here with the "semantics	of nothing".  In theory	the semantics
       of nothing can be arbitrarily complex.  In practice it should be
       possible	to keep	them simple.

       If any application ever actually	needs it, Marpa	could implement	a
       complex,	and even dynamic, "semantics of	nothing".  For details see
       below.

   Null	Sequence Nodes
       Rule nodes for sequences	were mentioned above.  Sequence	nodes can also
       be null nodes.  This happens with sequence rules	which have a "min"
       rule property of	0.  Such a sequence rule can contain any number	of
       sequence	items, including zero items.  When a sequence contains zero
       items, it must derive the zero-length string, and its node is a null
       node.

       Sequence	null nodes follow the rules for	null nodes.  Their value is
       that of a symbol	-- the left hand side symbol of	the nulled sequence
       rule.

       Nullable	sequence rules do have a potential to surprise.	 When the node
       for a nullable sequence rule is a null node, its	semantics comes	from
       the null	value for its left hand	side symbol.  When the node for	a
       nullable	sequence rule is not a null node, then it is a rule node and
       its semantics come from the rule.  It's up to the application to	see
       that these two -- the null value	and the	sequence rule -- "make sense"
       together.  One way to do	this is	to create a special symbol which is
       dedicated to service as the left	hand side symbol for the sequence
       rule.  That dedicated symbol can	then be	given the correct semantics.

       The rules for nodes in null subtrees apply with equal force to nodes
       for sequence rules.  In a nulled	subtree, the only node whose semantics
       matters is the root node	of that	subtree.  If a zero-length sequence is
       in a nulled subtree, and	that zero-length sequence is not the root node
       of that subtree,	its semantics will be completely ignored.

SEMANTIC PHASES
       This section explains the order in which	the semantics specified	by an
       application are implemented.  As	a reminder, the	implementation of an
       action requires two steps.  In the first	step the action	is resolved to
       a Perl closure.	In the second step that	Perl closure is	called.

       When the	semantics are applied to a parse, it produces a	value called a
       parse result.  Because Marpa allows ambiguous parsing, each parse can
       have zero or more parse results.

       A parse series is the series of zero or more parse results for a	single
       evaluation.  The	first call to the Marpa::Recognizer::value method
       after the recognizer is created is the start of the first parse series.
       The first parse series continues	until there is a call to the
       Marpa::Recognizer::reset_evaluation method or until the recognizer is
       destroyed.  Usually, an application is only interested in a single
       parse series.

       When the	Marpa::Recognizer::reset_evaluation method is called for a
       recognizer, it begins a new parse series.  The new parse	series
       continues until there is	another	call to	the
       Marpa::Recognizer::reset_evaluation method, or until the	recognizer is
       destroyed.

   Recognizer Setup Phase
       In the Recognizer Setup Phase, the null values of the symbols are
       determined.  The	null values of symbols never change -- they are	in
       effect properties of the	grammar.

       For every Marpa Recognizer object, the Recognizer Setup Phase occurs
       before any Parse	Series Setup Phase.  The Recognizer Setup Phase	occurs
       only once in the	life of	a recognizer object.

   Parse Series	Setup Phase
       During the Parse	Series Setup Phase all semantic	action names are
       resolved	to Perl	closures.  The semantic	Perl closures are only
       resolved	in this	phase -- they will not be called until the Parse Tree
       Traversal Phase.

       A Parse Series Setup Phase occurs in the	first call of the
       Marpa::Recognizer::value	method of each parse series.  Ranking action
       names are not resolved in the Parse Series Setup	Phase.

   Ranking Phase
       In the Ranking Phase, all ranking actions are resolved to Perl
       closures, and those Perl	closures are called.  The Ranking Phase	is the
       only time during	which ranking actions are resolved or called.

       Ranking Phases do not occur if the ranking method is left as the
       default.	 If the	ranking	method is set to a value other than ""none"",
       a Ranking Phase occurs in the first call	of the
       Marpa::Recognizer::value	method of each parse series.

   Parse Result	Setup Phase
       In the Parse Result Setup Phase,	the per-parse variable is created.  If
       a constructor was found for the "action_object",	it is run at this
       point, and the per-parse	variable is its	return value.  A Parse Result
       Setup Phase occurs once for each	parse result or, in other words, once
       for each	call to	the "Marpa::Evaluator::value" method.

       In every	parse series, all the Parse Result Setup Phases	occur after
       the Parse Series	Setup Phase.  In every parse series which has a
       Ranking Phase, all the Parse Result Setup Phases	occur after the
       Ranking Phase.  For each	parse result, the Parse	Result Setup Phase
       occurs before any of the	Parse Tree Traversal Phases.

   Parse Tree Traversal	Phase
       During the Parse	Tree Traversal Phase, the semantic Perl	closures are
       called.	For each parse result, the Parse Tree Traversal	Phase follows
       the Parse Result	Setup Phase.

       During parse tree traversal, Node Evaluation Time occurs	for every node
       in the parse.  A	Parse Tree Traversal Phase occurs once for each	call
       to the "Marpa::Evaluator::value"	method.

   Node	Evaluation Time
       Node Evaluation Time is not really a separate phase.  Node Evaluation
       Time is the Parse Tree Traversal	Phase, seen from the point of view of
       the individual nodes of the parse tree.	If a node has a	semantic Perl
       closure,	it is called at	Node Evaluation	Time.

SEARCHING FOR RULE SEMANTIC CLOSURES
       Marpa finds the semantic	Perl closure for each rule based on rule and
       symbol properties and on	the grammar named arguments.  The search for a
       semantic	Perl closure is	equivalent to following	these steps:

       o   Resolve the "action"	property

	   If no "action" property is defined for a rule, the search proceeds
	   to the next step.

	   If the "action" property for	a rule is defined and if it resolves
	   to a	Perl closure, that Perl	closure	becomes	the rule's semantic
	   Perl	closure, and the search	ends.  The resolution of action	names
	   is described	below.

	   If a	defined	"action" property does not successfully	resolve	to a
	   closure, no further attempt to find the semantic Perl closure is
	   made.  The search ends, and an exception is thrown.

       o   Resolve the "lhs" property

	   If the "lhs"	property for a rule resolves to	a Perl closure,	that
	   Perl	closure	becomes	the rule's semantic Perl closure, and the
	   search ends.

	   If the "lhs"	property fails to resolve to a Perl closure, no
	   exception is	thrown.	 This is the one case where failure of an
	   action name to resolve is not treated as a fatal error.  The	search
	   proceeds to the next	step.

       o   Resolve the "default_action"	Named Argument

	   If the grammar does not have	a "default_action" defined, the	search
	   proceeds to the next	step.

	   If the grammar does have a "default_action" defined,	and if it
	   resolves to a Perl closure, that Perl closure becomes the rule's
	   semantic Perl closure, and the search ends.

	   If a	defined	"default_action" grammar named argument	does not
	   successfully	resolve	to a closure, no further attempt to find the
	   semantic Perl closure is made.  The search ends, and	an exception
	   is thrown.

       o   Evaluate to "undef"

	   If no semantic Perl closure is found, the value of the rule is
	   always a Perl "undef".  Marpa optimizes for this case, but the
	   effect is the same as if the	rule's semantic	Perl closure was the
	   following:

		sub { return }

RESOLVING ACTION NAMES
       Action names are	resolved to semantic Perl closures in the evaluator
       setup phase.  Candidates	for resolution as action names include

       o   The "default_action"	named argument of Marpa's grammar.

       o   The "action"	property of Marpa's rules.

       o   The "ranking_action"	property of Marpa's rules.

       o   The "new" constructor in the	package	specified by the
	   "action_object" named argument of the Marpa grammar;

       o   The "lhs" property of Marpa's rules.

       Resolution of the action	object constructor is explained	below as a
       special case.

   Explicit Resolution
       The "closures" named argument allows the	user to	directly control the
       mapping from action names to semantic Perl closures.  The value of the
       "closures" named	argument is a reference	to a hash whose	keys are
       action names and	whose hash values are CODE refs.

       If an action name is the	key of an entry	in the "closures" hash,	it
       resolves	to the closure referenced by the value part of that hash
       entry.  Resolution via the "closures" named argument is called explicit
       resolution.

       When explicit resolution	is the only kind of resolution that is wanted,
       it is best to pick a name that is very unlikely to be the name of a
       Perl closure.  Many of Marpa::HTML's action names are intended for
       explicit	resolution only.  In Marpa::HTML those action names begin with
       an exclamation mark ("!"), and that convention is recommended.

   Fully Qualified Action Names
       If explicit resolution fails, Marpa transforms the action name into a
       fully qualified Perl name.  An action name that contains	a double colon
       (""::"")	or a single quote (""'"") is considered	to be a	fully
       qualified name.	Any other action name is considered to be a bare
       action name.

       If the action name to be	resolved is already a fully qualified name, it
       is not further transformed.  It will be resolved	in the form it was
       received, or not	at all.

       For bare	action names, Marpa tries to qualify them by adding a package
       name.  If the "actions" grammar named argument is defined, Marpa	uses
       it as the package name.	Otherwise, if the "action_object" grammar
       named argument is defined, Marpa	uses it	as the package name.  Once
       Marpa has fully qualified the action name, Marpa	looks for a Perl
       closure with that name in the namespace.

       If Marpa	cannot fully qualify an	action name, it	will not attempt to
       resolve it.  This means that for	an action name to resolve
       successfully, one of these four things must be the case:

       o   The "actions" named argument	is defined.

       o   The "action_object" named argument is defined.

       o   The action name resolves explicitly.

       o   The action name is fully qualified to begin with.

       In all but one circumstance, failure to resolve an action name is
       thrown as an exception.	Marpa is more lenient when it attempts to use
       the "lhs" rule property as an action name.  That	is the one case	in
       which Marpa will	look at	other alternaties.  See	the section on finding
       rule semantic closures.

       Marpa's philosophy is that asking the programmer	to be specific about
       action names can	be a slight inconvenience, but silently	executing
       unintended code might be	a major	disaster.  Marpa prefers action	name
       resolution to fail when the application's intent	is not clear.

       Generally it's best practice to put the semantic	Perl closures into
       their own namespace.  But if, for example, the user wants to leave the
       semantic	closures in the	"main" namespace, she can specify "main" as
       the value of the	"actions" named	argument.

THE PER-PARSE VARIABLE
       In the Parse Result Setup Phase,	Marpa creates a	per-parse variable.
       This becomes the	first argument of the semantic Perl closures for the
       rule nodes.  If the grammar's "action_object" named argument is not
       defined,	the per-parse variable is initialized to an empty hash ref.

       Most data for the rule node semantic Perl closures will be passed up
       the parse tree.	The semantic Perl closures will	see the	values of
       their child nodes as arguments, and will	return their own value to be
       seen as an argument by their parent node.  The per-parse	variable can
       be used for data	which does not conveniently fit	this model.

       The lifetime of the per-parse variable extends into the Parse Tree
       Traversal Phase.	 During	the Parse Tree Traversal Phase,	Marpa's
       internals never alter the per-parse variable -- it is reserved for use
       by the semantic Perl closures implementing rule node semantics.

   Action Object Constructor
       If the grammar's	"action_object"	named argument has a defined value,
       that value is treated as	the name of a class.  The action object
       constructor is the "new"	method in the "action_object" class.

       The action object constructor is	run at parse setup time.  The return
       value of	the action object constructor becomes the per-parse variable.
       It is a fatal error if the grammar's "action_object" named argument is
       defined,	but does not name a class with a "new" method.

       The fully qualified name	of the action object constructor is the	value
       of the "action_object" named argument followed by the literal string
       ""::new"".  Resolution of the action object constructor is done as
       resolution of this fully	qualified action name.

       If a grammar has	both the "action" and the "action_object" named
       arguments defined, all action names except for the action object
       constructor will	be resolved in the "action" package or not at all.
       Only the	action object constructor name will be resolved	using the
       "action_object" class.

       Bypass via explicit resolution applies to the action object
       constructor.  If	the fully qualified name of the	action object
       constructor is a	hash key in the	evaluator's "closures" named argument,
       then the	Perl closure referred to by the	value of that hash entry
       becomes the action object constructor.

PARSE ORDER
   Duplicate Parses
       The same	parse result is	never returned twice.  A parse result is
       considered to be	a duplicate of another if both parse results apply the
       same rules in the same order at the same	earleme	locations.

   Default Parse Order
       By calling the Marpa::Recognizer::value method repeatedly, Marpa	can
       produce all the parse results for a given parse.	 The default is	for
       the parse results to be returned	in an arbitrary	order.	This
       corresponds to the ""none"" value of the	Recognizer's "ranking_method"
       named argument.

   A General Approach to Sorting Parses
       The most	general	way to sort Marpa parses is for	the application	to
       take control.  The application can set up the Marpa semantic actions so
       that the	value of every parse result is a "<rank, true_value>" duple.
       These duples can	be implemented as references to	arrays.	 The duples
       can then	be sorted and the rank discarded.

       In the worst case, producing and	sorting	all the	parses can take	a very
       long time.  The Marpa interface has no special features to deal with
       this general case.  Putting this	logic inside Marpa would clutter the
       interface, and there would be no	gain in	efficiency to compensate.

   The "Constant" Ranking Method
       Marpa does support a simplified approach	to sorting parses.  It is a
       compromise between generality and efficiency.  The Constant Ranking
       Method is general enough	to handle many,	perhaps	even most, practical
       applications, and simple	enough that Marpa is able to optimize it.

       The Constant Ranking Method is specified	by giving the Marpa
       Recognizer's ranking method named argument a value of ""constant"".

       The basic idea is to allow the user to specify constant values for
       rules, and to rank all other nodes according to the sum of the values
       of their	children.  Leaf	nodes default to a value of 0.

       The value of a rule must	be "constant" in the sense that	it cannot
       depend on the values of its children.  This is a	major limitation, but
       it greatly simplifies the logic for re-ranking parses as	they are
       iterated.  And it is less of a limitation than it may appear, because
       rules without ranking closures will take	into account the values	of
       their children.	By strategically mixing	rules which have no ranking
       action, but which do take into account child values, and	rules which
       can have	a user-specified rank, but which are not allowed to take into
       account child values, most real-life applications can accomplish	what
       they need to.

       The ranking action of a token leaf node is specified using the token
       symbol's	"ranking_action" property.  The	ranking	action of a nulled
       leaf node is specified using the	null node symbol's "ranking_action"
       property.  The ranking action of	a rule is specified using the rule's
       "ranking_action"	property.

       Ranking actions must return a reference to the rank.  Ranks must	be
       Perl numbers.  Negative values and non-integer values are allowed.  The
       highest numeric value is	considered to be the highest rank, and the
       lowest numeric value is considered to be	the lowest rank.

       As a special case, if a ranking action returns a	Perl "undef", Marpa
       will skip that possibility, rather than ranking it.  Note that any
       "skipped" node in a parse result	causes that whole parse	result to be
       skipped.	 A consequence of this is that any skipped node	in an
       unambiguous parse will result in	no parse results being found.

       This behavior may seem to be draconian, but in fact skipping the	entire
       tree is probably	the easiest and	most natural way to deal with skipped
       nodes.  The traditional semantics requires trees	to have	a full set of
       nodes.  It is unclear at	this point what	purposes a semantics of
       partial trees would be expected to serve.

       An instance of a	rule is	a rule,	a start	location, and an end location.
       Ranking actions are called once for each	rule instance.	While ranking
       actions return constants	in the sense that they cannot be aware of the
       ranks of	their child nodes, the rank returned can vary based on the
       rule's start and	end location.  The ranking actions can determine their
       location	using the context-aware	static methods.

       For the rank of a node to be calculated,	the ranking action must	first
       be resolved to a	ranking	Perl closure.  Ranking actions are resolved to
       ranking Perl closures in	the Ranking Phase, using the same logic	that
       resolves	semantics actions to Perl semantic closures.  The logic	that
       resolves	action names to	closures is described above.  The ranking
       actions are resolved to Perl closures, which are	also called during the
       Ranking Phase.

   Using Ranking for Side Effects
       For every parse series, ranking actions are guaranteed to be called
       once and	only once for each rule	instance.  As a	reminder, a rule
       instance	is a rule, together with a start and end location.  This
       guarantee makes ranking actions useful for their	side effects, even
       when there is no	interest in changing the order of the parse results.
       In fact,	ranking	actions	can be used in cases when there	is no interest
       in evaluating the actual	parse results.

       For example, an application may wants to	know all the instances of a
       particular rule that occur in any parse result.	In particular, an
       application might be looking for	ambiguities -- cases in	the parse
       series where two	different rules	derive the same	input string.  The
       application can create ranking actions which have the side effect of
       tracking	instances of these two rules by	location, and use this
       information to detect ambiguities.  If there is no interest in an
       actual parse, ranking actions which return "undef" can cause all	parses
       to be discarded.

       This is much faster than	the alternative	of producing all the parse
       results.	 The number of parse results grows much	faster than the	number
       of ambiguities -- it can	be exponential in the length of	the input.

CONTEXT-AWARE STATIC METHODS
	   sub rank_null_a {
	       return \( ( $MyTest::MAXIMAL ? -1 : 1 )
		   * 10**( 3 - Marpa::token_location() ) );
	   }

       Ranking Perl closures have available a set of context-aware static
       methods.	 A closure can use these methods to learn about	the context in
       which it	is called.  As of this writing,	the context-aware static
       methods are tested for the ranking Perl closures	only.  This may	change
       in the future.

   Marpa::location
       Returns the earleme location of the origin (or start) of	the rule.

   Marpa::token_location
       Returns the earleme location of the token.  Intended for	use in
       callbacks associated with empty rules and nulling symbols.

NULL SUBTREES
       In Marpa, a null	node must be leaf node.	 Because Marpa prunes every
       null subtree back to its	topmost	null node, none	of the non-root	nodes
       in a null subtree are represented in Marpa's virtual parse tree.
       Here's an example:

	   sub L {
	       shift;
	       return 'L(' . ( join q{;}, @_ ) . ')';
	   }

	   sub R {
	       shift;
	       return 'R(' . ( join q{;}, @_ ) . ')';
	   }

	   sub S {
	       shift;
	       return 'S(' . ( join q{;}, @_ ) . ')';
	   }

	   my $grammar = Marpa::Grammar->new(
	       {   start   => 'S',
		   actions => 'main',
		   rules   => [
		       [ 'S', [qw/L R/]	],
		       [ 'L', [qw/A B X/] ],
		       [ 'L', [] ],
		       [ 'R', [qw/A B Y/] ],
		       [ 'R', [] ],
		       [ 'A', [] ],
		       [ 'B', [] ],
		       [ 'X', [] ],
		       [ 'Y', [] ],
		   ],
		   symbols	  => {
		       L => { null_value => 'null L' },
		       R => { null_value => 'null R' },
		       A => { null_value => 'null A' },
		       B => { null_value => 'null B' },
		       X => { null_value => 'null X', terminal => 1 },
		       Y => { null_value => 'null Y', terminal => 1 },
		   },
	       }
	   );

	   $grammar->precompute();

	   my $recce = Marpa::Recognizer->new( { grammar => $grammar } );

	   $recce->tokens( [ [ 'X', 'x'	], ] );

       If we write the unpruned	parse tree one node per	line in	pre-order,
       depth-first, indenting children below their parents, we get something
       like this:

	       0: Rule Node, Rule: S :=	L R
		    1: Rule Node, Rule L := A B	X
			1.1: Null Node,	Symbol A
			1.2: Null Node,	Symbol B
			1.3: Token Node, Token value is	'x'
		    2: Rule Node, Rule R := A B	Y
			2.1: Null Node,	Symbol A
			2.2: Null Node,	Symbol B
			2.3: Null Node,	Symbol Y

       In this example,	six nodes are nulled.  Four of them are	in a single
       subtree:	2, 2.1,	2.2 and	2.3.  Marpa prunes every null subtree back to
       its null	root node, which in this case is the node numbered 2.

       The pruned tree looks like this

	       0: Rule Node, Rule: S :=	L R
		    1: Rule Node, Rule L := A B	X
			1.1: Null Node,	Symbol A
			1.2: Null Node,	Symbol B
			1.3: Token Node, Token value is	'x'
		    2: Null Node, Symbol R

       Here is the output:

	   S(L(null A;null B;x);null R)

       In the output we	see

       o   The null value for node 1.1:	""null A"".

       o   The null value for node 1.2:	""null B"".

       o   The token value for node 1.3: ""x"".

       o   An application of the semantic Perl closure for node	1.

       o   The null value for node 2: ""null R"".

       o   An application of the semantic Perl closure for rule	node 0.

       We do not see any output	for nodes 2.1, 2.2, or 2.3 because they	were
       non-root	nodes in the pruned subtree.  We do see	the null value for
       node 2, because after pruning it	is a leaf node.	 We do not see an
       application of the semantic Perl	closure	for node 2, because after
       pruning,	node 2 is not a	rule node.

   The Semantics of Nothing
       Rarely, your application	may call for a complex semantics of nothing.
       If the semantics	of nothing, while complex, remains constant, you can
       handle it setting every nullable	symbol's "null_value" property to the
       value which your	semantics produces when	that nullable symbol is	the
       root symbol of a	null subtree.

       If the values in	your "semantics	of nothing" are	not constants, Marpa
       can still calculate them.  Determine which of your nullable symbols
       have a dynamic semantics.  Call these your dynamic nullables.  Let the
       "null_value" property of	every dynamic nullable be a hash key.  For
       every rule with a dynamic nullable on its right hand side, write	the
       rule's semantic Perl closure so that it maps that hash key to a Perl
       closure,	which it then runs to calculate	the value of the dynamic
       nullable.

INFINITELY AMBIGUOUS GRAMMARS
       Marpa will parse	using an infinitely ambiguous grammar.	(In the
       technical literature, an	infinite ambiguity is more usually called a
       cycle.)

       An example of an	infinitely ambiguous grammar is	the following:

	   S ::= A
	   A ::= B
	   B ::= A
	   B ::	'x'

       Given the input 'x', this grammar will produce these parses

	   S ->	A -> B -> x
	   S ->	A -> B -> A -> B -> x
	   S ->	A -> B -> A -> B -> A -> B -> x
	   .
	   .
	   .

       Because of the two rules	"A ::= B" and "B ::= A", this list of parses
       could go	on forever.  The two rules "A ::= B" and "B ::=	A" form	what
       is called a cycle.

       Typically, if a user has	written	an grammar with	an infinite cycle, it
       was a mistake and he wants to rewrite it	before proceeding.  By
       default,	an infinitely ambiguous	grammar	is a fatal error.  This	is the
       behavior	most users will	want.

       To proceed to producing parse results from an infinitely	ambiguous
       grammar,	the user must set the grammar's	infinite action	named argument
       to a value other	than ""fatal"".	 The other choices are ""warn""	and
       ""quiet"".

   Cycle Length
       Obviously, Marpa	cannot list all	of an infinite number of parse
       results.	 Marpa deals with potentially infinite parses by limiting the
       cycle length.  Cycle length is the number of times a parse derivation
       goes around a potentially infinite cycle.

       Marpa limits all	cycles to a length of 1.  There	will always be a
       finite number of	these parse results.

       Above I showed a	set of parses from an example of an infinitely
       ambiguous grammar.  Here	are those parses again,	this time labeled with
       their cycle length.

	   Cycle length	1: S ->	A -> B -> x
	   Cycle length	2: S ->	A -> B -> A -> B -> x
	   Cycle length	3: S ->	A -> B -> A -> B -> A -> B -> x

       Of the parse results in the above list, Marpa would return a value only
       for the first, the one whose cycle length is 1.

   Caveats
       The precise behavior of Marpa's cycle detection is, at this point,
       defined somewhat	loosely.  In Marpa cycle is a return to	the "same
       rule", with the same start and end earlemes.  But as of this writing
       the "same rule" means the same rule after Marpa's rewriting of the
       grammar,	not the	same rule as in	the original grammar.

       Since, in all other cases, the semantics	hides Marpa's rewritings, the
       precise cutoff points of	cycles can seem	arbitrary and unnatural.  The
       CHAF rewrite, for example, breaks rules up, and a CHAF rewrite can
       cause a cycle to	end before it would end	in terms of the	rules of the
       original	grammar.

       Marpa's exact definition	of a cycle is experimental and subject to
       change.	In particular, it would	be unwise to have an application's
       semantics rely in detail	on Marpa's present behavior in determining the
       content and length of cycles.

       Marpa's cycle detection could be	more carefully defined,	and/or made to
       follow the original grammar.  But as far	as I know nobody cares about
       these details.  The present implementation I believe matches or exceeds
       the extent of present interest.

LICENSE	AND COPYRIGHT
       Copyright 2007-2010 Jeffrey Kegler, all rights reserved.	 Marpa is free
       software	under the Perl license.	 For details see the LICENSE file in
       the Marpa distribution.

perl v5.24.1			  2017-07-03		   Marpa::Semantics(3)

NAME | SYNOPSIS | OVERVIEW | NODES | SEMANTIC PHASES | SEARCHING FOR RULE SEMANTIC CLOSURES | RESOLVING ACTION NAMES | THE PER-PARSE VARIABLE | PARSE ORDER | CONTEXT-AWARE STATIC METHODS | NULL SUBTREES | INFINITELY AMBIGUOUS GRAMMARS | LICENSE AND COPYRIGHT

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=Marpa::Semantics&sektion=3&manpath=FreeBSD+12.1-RELEASE+and+Ports>

home | help