LLnextgen(1)		  LLnextgen parser generator		  LLnextgen(1)

       LLnextgen - an Extended-LL(1) parser generator

       LLnextgen [OPTIONS] [FILES]

       LLnextgen  is  a	 (partial) reimplementation of the LLgen ELL(1)	parser
       generator created by D. Grune and C.J.H.	Jacobs (note: this is not  the
       same as the LLgen parser	generator by Fischer and LeBlanc). It takes an
       EBNF-like description of	the grammar as input(s), and produces a	parser
       in C.

       Input  files  are  expected to end in .g. The output files will have .g
       removed and .c and .h added. If the input file does not end in .g,  the
       extensions  .c  and  .h	will  simply be	added to the name of the input
       file. Output files can also be given a different	base  name  using  the
       option --base-name (see below).
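
The naming rule above can be sketched in C. This is only an illustration of the rule as described here, not code taken from LLnextgen; the function name output_name and its parameters are hypothetical, and the sketch ignores the handling of directory components and of --base-name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of LLnextgen): build an output file name
 * from an input file name. A trailing ".g" is removed if present, then
 * "." and the extension ext ("c" or "h") are appended. */
void output_name(const char *input, const char *ext, char *out, size_t size)
{
    size_t len;

    assert(input != NULL && ext != NULL && out != NULL);
    len = strlen(input);
    /* Drop a trailing ".g"; any other name is kept whole. */
    if (len >= 2 && strcmp(input + len - 2, ".g") == 0)
        len -= 2;
    snprintf(out, size, "%.*s.%s", (int) len, input, ext);
}
```

Under these assumptions, calc.g maps to calc.c and calc.h, while an input named grammar.txt maps to grammar.txt.c and grammar.txt.h.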

       LLnextgen accepts the following options:

       -c, --max-compatibility
	      Set  options  required  for  maximum source-level	compatibility.
	      This is different	from running as	LLgen, as all  extensions  are
	      still  allowed.  LLreissue and the prototypes in the header file
	      are still	generated. This	option turns on	the --llgen-arg-style,
	      --llgen-escapes-only and --llgen-output-style options.

       -e, --warnings-as-errors
	      Treat warnings as	errors.

       -Enum, --error-limit=num
	      Set the maximum number of errors before LLnextgen aborts. If num
	      is set to 0, the error limit is set to infinity. This option
	      overrides the error limit option specified in the grammar file.

       -h[which], --help[=which]
	      Print  out  a help message, describing the options. The optional
	      which argument allows selection of which options to print. which
	      can be set to all, depend, error,	and extra.

       -V, --version
	      Print the	program	version	and copyright information, and exit.

       -v[level], --verbose[=level]
	      Increase	(without  explicit level) or set (with explicit	level)
	      the verbosity level. LLnextgen uses this option differently than
	      LLgen. At	level 1, LLnextgen will	output traces of the conflicts
	      to standard error. At level 2, LLnextgen will also write a  file
	      named LL.output with the rules containing	conflicts. At level 3,
	      LLnextgen	will include the entire	grammar	in LL.output.
	      LLgen will write the LL.output file from	level  1,  but	cannot
	      generate	conflict  traces.  It also has an intermediate setting
	      between LLnextgen	levels 2 and 3.

       -w[warnings], --suppress-warnings[=warnings]
	      Suppress all or selected warnings. Available warnings are:  arg-
	      separator,   option-override,   unbalanced-c,   multiple-parser,
	      eofile, unused[:<identifier>], datatype and  unused-retval.  The
	      unused warning can suppress all warnings about unused tokens and
	      non-terminals, or	can be used to suppress	 warnings  about  spe-
	      cific  tokens or non-terminals by	adding a colon and a name. For
	      example, to suppress warning messages about FOO not being	 used,
	      use -wunused:FOO. Several comma-separated warnings can be
	      specified with one option on the command line.
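
The meaning of the unused entries in such a list can be sketched in C. This is an illustration of the rules described above, not LLnextgen's own option parser; the function name suppresses_unused is hypothetical, and the sketch does not model -w given without any argument.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of LLnextgen): return 1 if the
 * comma-separated warning list spec suppresses the "unused" warning
 * for the identifier name, and 0 otherwise. */
int suppresses_unused(const char *spec, const char *name)
{
    const char *p = spec;

    assert(spec != NULL && name != NULL);
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current item */

        /* A bare "unused" suppresses the warning for every identifier. */
        if (len == 6 && strncmp(p, "unused", 6) == 0)
            return 1;
        /* "unused:<identifier>" suppresses it for that identifier only. */
        if (len > 7 && strncmp(p, "unused:", 7) == 0 &&
            strlen(name) == len - 7 && strncmp(p + 7, name, len - 7) == 0)
            return 1;
        p += len;
        if (*p == ',')
            p++;
    }
    return 0;
}
```

With this sketch, the list eofile,unused:FOO suppresses the warning for FOO but not for BAR, while eofile,unused suppresses it for both.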

	      Generate the LLabort function.

	      Set the base name	for the	output files. Normally LLnextgen  uses
	      the  name	of the first input file	without	any trailing .g	as the
	      base name. This option can be used to override the default.  The
	      files  created will be name.c and	name.h.	 This option cannot be
	      used in combination with --llgen-output-style.

	      Generate dependency information to be used by the	 make(1)  pro-
	      gram. The	modifiers can be used to change	the make targets (tar-
	      gets:<targets>,  and  extra-targets:<targets>)  and  the	output
	      (file:<file>). The defaults are to use the output names as they
	      would be created by running with the same	arguments as  targets,
	      and  to  output  to standard output. Using the targets modifier,
	      the list of targets can be specified manually. The extra-targets
	      modifier	allows targets to be added to the default list of tar-
	      gets. Finally, the phony modifier	will add phony targets for all
	      dependencies to avoid make(1) problems when removing or renaming
	      dependencies. This is like the gcc(1) -MP	option.

	      Dump all top-level C-code	to standard out. This can be  used  to
	      generate	dependency information for the generated files by pip-
	      ing the output from LLnextgen through the	 C  preprocessor  with
	      the appropriate options.

	      Write the	lexer wrapper function to standard output, and exit.

	      Write the default LLmessage function to standard output, and
	      exit.

	      Dump %token directives for unknown identifiers  that  match  the
	      --token-pattern  pattern.	 The  default  is to generate a	single
	      %token directive with all	the unknown identifiers	 separated  by
	      commas. This default can be overridden by a modifier. The modi-
	      fier separate produces a separate %token directive for each
	      identifier, while	label produces a %label	directive. The text of
	      the label	will be	the name of the	identifier.  If	the label mod-
	      ifier  and the --lowercase-symbols option	are both specified the
	      label will contain only lowercase	characters.
	      Note: this option	is not always available. It requires the POSIX
	      regex API. If the	POSIX regex API	is not available on your plat-
	      form, or the LLnextgen binary was	compiled without  support  for
	      the API, you will	not be able to use this	option.

	      Specify  the  extensions to be used for the generated files. The
	      list must	be comma separated, and	should not contain the	.  be-
	      fore  the	 extension. The	first item in the list is the C	source
	      file and the second item is the header file. You	can  omit  the
	      extension	 for  the C source file	and only specify the extension
	      for the header file.

	      Indicate whether to generate a wrapper for the lexical analyser.
	      As  LLnextgen requires a lexical analyser	to return the last to-
	      ken returned after detecting an error which requires inserting a
	      token to repair, most lexical analysers require a	wrapper	to ac-
	      commodate	LLnextgen. As it is identical for almost each grammar,
	      LLnextgen	 can  provide one. Use --dump-lexer-wrapper to see the
	      code. If you do not specify this option, LLnextgen will generate a
	      warning, to help remind you that a wrapper is required.
	      If you do not want the automatically generated wrapper, you should
	      specify this option followed by =no.
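
The idea behind such a wrapper can be sketched in C. This is not the code LLnextgen generates (use --dump-lexer-wrapper to see that); the names raw_lexer, lexer_wrapper, reissue and NEW_TOKEN are hypothetical stand-ins, and the token stream is hard-coded so that the sketch is self-contained.

```c
#include <assert.h>

#define NEW_TOKEN (-2)  /* hypothetical sentinel: read a fresh token */

/* Hypothetical stand-in for the variable through which the parser asks
 * for a token to be returned again. */
int reissue = NEW_TOKEN;

static const int input[] = { 10, 11, 12, 0 };  /* 0 marks end of file */
static int pos = 0;
static int last_token = 0;

/* Stand-in for the real lexical analyser, e.g. one generated by flex. */
int raw_lexer(void)
{
    int tok;

    assert(pos < (int) (sizeof input / sizeof input[0]));
    tok = input[pos];
    if (tok != 0)
        pos++;          /* stay on the end-of-file token once reached */
    return tok;
}

/* Return a fresh token, unless the parser requested that a token be
 * reissued, in which case return that token once more. */
int lexer_wrapper(void)
{
    if (reissue == NEW_TOKEN) {
        last_token = raw_lexer();
    } else {
        last_token = reissue;
        reissue = NEW_TOKEN;
    }
    return last_token;
}
```

In this sketch, setting reissue to 11 after token 11 has been read makes the next call return 11 again, after which the wrapper continues with token 12 from the underlying lexer.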

	      Generate an LLmessage function. LLnextgen	requires  programs  to
	      provide  a  function  for	informing the user about errors	in the
	      input. When developing a parser, it is often desirable to	have a
	      default  LLmessage.   The	 provided LLmessage is very simple and
	      should be	replaced by a more elaborate one, once the  parser  is
	      beyond  the first	testing	phase. Use --dump-llmessage to see the
	      code. This option	automatically turns  on	 --generate-symbol-ta-

	      Generate	a  symbol table. The symbol table will contain strings
	      for all tokens and character literals. By	 default,  the	symbol
	      table  contains  the  token name as specified in the grammar. To
	      change the string, for both tokens and character	literals,  use
	      the %label directive.

	      Add  gettext  support. A macro call is added around symbol table
	      entries generated	from %label directives.	The macro will	expand
	      to the string itself.  This is meant to allow xgettext(1)	to ex-
	      tract the	strings. The default is	N_, because that is what  most
	      people use. A guard will be included such	that compilation with-
	      out gettext is possible by not defining the guard. The guard  is
	      set  to  USE_NLS by default. Translations	will be	done automati-
	      cally in LLgetSymbol in the generated parser through a  call  to

	      Do  not  remove  directory component of the input	file-name when
	      creating the output file-name. By	default, outputs  are  created
	      in  the current directory.  This option will generate the	output
	      in the directory of the input.

	      Use semicolons as argument separators in rule headers. LLnextgen
	      uses commas by default, as this is what ANSI C does.

	      Only  allow  the	escape sequences defined by LLgen in character
	      literals.	 By default LLnextgen also allows \a, \v, \?, \",  and
	      hexadecimal constants with \x.
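
Assuming these additional escape sequences have the same meaning as in C (this page does not say so explicitly), their values can be sketched as follows. The function name extra_escape_value is hypothetical, and the hexadecimal case relies on strtol, which is more permissive than a real scanner would be.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of LLnextgen): given the text after the
 * backslash, return the character value of one of the additional escape
 * sequences, or -1 if the text does not start with one of them. */
int extra_escape_value(const char *s)
{
    assert(s != NULL);
    switch (s[0]) {
    case 'a': return '\a';      /* alert (bell) */
    case 'v': return '\v';      /* vertical tab */
    case '?': return '\?';      /* question mark */
    case '"': return '\"';      /* double quote */
    case 'x': {                 /* hexadecimal constant, e.g. x41 */
        char *end;
        long value = strtol(s + 1, &end, 16);

        return (end == s + 1) ? -1 : (int) (value & 0xff);
    }
    default:
        return -1;
    }
}
```

For example, on a system using ASCII, the text x41 yields 65, the code for the letter A.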

	      Generate	one  .c	 output	 per  input, and the files Lpars.c and
	      Lpars.h, instead of one .c and one .h file based on the name  of
	      the first	input.

	      Convert  the token names used for	generating the symbol table to
	      lower case.  This	only applies to	tokens for which no %label di-
	      rective has been specified.

	      Do  not  allow  the  %label directive to create new tokens. Note
	      that this	requires that the token	being  labelled	 is  either  a
	      character	literal	or a %token directive creating the named token
	      has preceded the %label directive.

	      Do not check argument counts for rules. LLnextgen checks whether
	      a rule is used with the same number of arguments as it was
	      defined with. LLnextgen also checks that the number of arguments
	      is 0 for any rule for which a %start directive is specified.

	      Do  not use 0 as end-of-file token. (f)lex(1) uses 0 as the end-
	      of-file token. Other lexical-analyser generators may use -1, and
	      may use 0	for something else (e.g. the nul character).

	      Do  not  initialise LLretval with	0 bytes. Note that you have to
	      take care of initialisation of LLretval yourself when using this
	      option.

	      Do  not  generate	#line directives in the	output.	This means all
	      errors will be reported relative to the output file. By  default
	      LLnextgen	generates #line	directives to make the C compiler gen-
	      erate errors relative to the LLnextgen input file.

	      Do not generate the LLreissue variable, which is used  to	 indi-
	      cate when	a token	should be reissued by the lexical analyser.

	      Do not generate prototypes for the parser	and other functions in
	      the header file.

	      Do not only analyse reachable rules. LLnextgen by	 default  does
	      not  take	 unreachable  rules  into  account when	doing conflict
	      analysis,	as these can cause spurious conflicts. However,	if the
	      unreachable  rules will be used in the future, one might already
	      want to be notified of problems with these rules.	 LLgen by  de-
	      fault does analyse unreachable rules.
	      Note:  in	 the case where	a rule is unreachable because the only
	      alternative of another reachable rule that mentions it is	 never
	      chosen (because of a %avoid directive), the rule is still	deemed
	      reachable	for the	analysis. The only way to avoid	this behaviour
	      is  by  doing the	complete analysis twice, which is an excessive
	      amount of	work to	do for a very rare case.

	      Generate a reentrant parser.  By	default,  LLnextgen  generates
	      non-reentrant parsers. A reentrant parser	can be called from it-
	      self, but	not from another thread. Use --thread-safe to generate
	      a	thread-safe parser.
	      Note  that  when	multiple  parsers are specified	in one grammar
	      (using multiple %start directives), and  one  of	these  parsers
	      calls  another,  either  the --reentrant option or the --thread-
	      safe option is also required. If these parsers are  only	called
	      when none	of the others is running, the option is	not necessary.
	      Use only in combination with a reentrant lexical analyser.

	      Show  directory  names of	source files in	error and warning mes-
	      sages. These are usually omitted for readability,	but may	 some-
	      times be necessary for tracing errors.

	      Generate a thread-safe parser. Thread-safe parsers can be	run in
	      parallel in different threads of the same	program. The interface
	      of  a thread-safe	parser is different from the regular (and then
	      reentrant) version. See the detailed manual for more details.

	      Specify a	regular	expression to match with  unknown  identifiers
	      used in the grammar. If an unknown identifier matches, LLnextgen
	      will generate a token declaration	for the	identifier.  This  op-
	      tion  is primarily implemented to	aid in the first stages	of de-
	      velopment, to allow for quick testing for	conflicts without hav-
	      ing  to specify all the tokens yet. A list of tokens can be gen-
	      erated with the --dump-tokens option.
	      Note: this option	is not always available. It requires the POSIX
	      regex API. If the	POSIX regex API	is not available on your plat-
	      form, or the LLnextgen binary was	compiled without  support  for
	      the API, you will	not be able to use this	option.

       By  running  LLnextgen using the	name LLgen, LLnextgen goes into	LLgen-
       mode. This is implemented by turning off	all default extra  functional-
       ity  like  LLreissue,  and disallowing all extensions to	the LLgen lan-
       guage. When running as LLgen, LLnextgen accepts the  following  options
       from LLgen:

       -a     Ignored. LLnextgen only generates	ANSI C.

       -hnum  Ignored.	LLnextgen  leaves optimisation of jump tables entirely
	      up to the	C-compiler.

	      Ignored. LLnextgen leaves	optimisation of	jump  tables  entirely
	      up to the	C-compiler.

	      Ignored.	LLnextgen  leaves optimisation of jump tables entirely
	      up to the	C-compiler.

       -v     Increase the verbosity level. See	the description	of the -v  op-
	      tion above for details.

       -w     Suppress all warnings.

       -x     Ignored.	LLnextgen  will	only generate token sets in LL.output.
	      The extensive error-reporting mechanisms in LLnextgen make  this
	      feature obsolete.

       LLnextgen  cannot  create  parsers  with	non-correcting error-recovery.
       Therefore, using	the -n or -s options will cause	LLnextgen to print  an
       error message and exit.

       At  this	 time  the  basic LLgen	functionality is implemented. This in-
       cludes everything apart from the	extended user error-handling with  the
       %onerror	directive and the non-correcting error-recovery.

       Although	 I've  tried to	copy the behaviour of LLgen accurately,	I have
       implemented some	aspects	slightly differently. The following is a  list
       of the differences in behaviour between LLgen and LLnextgen:

       +o      LLgen generated both K&R style C code and	ANSI C code. LLnextgen
	      only supports generation of ANSI C code.

       +o      There is a minor difference in the determination of the  default
	      choices.	LLnextgen simply chooses the first production with the
	      shortest possible	terminal production, while  LLgen  also	 takes
	      the complexity in	terms of non-terminals and terms into account.
	      There is also a minor difference when there  is  more  than  one
	      shortest	alternative  and  some of them are marked with %avoid.
	      Neither difference is very important, as the user can specify
	      which  alternative  should be the	default, thereby circumventing
	      the differences in the algorithms.

       +o      The default behaviour of generating one output C file per	 input
	      and Lpars.c and Lpars.h has been changed in favour of generating
	      one .c file and one .h file. The rationale  given	 for  creating
	      multiple	output	files in the first place was that it would re-
	      duce the compilation time	for the	generated parser. As  computa-
	      tion  power  has	become	much  more abundant this feature is no
	      longer necessary,	and the	difficult interaction  with  the  make
	      program  makes it	undesirable. The LLgen behaviour is still sup-
	      ported through a command-line switch.

       +o      In LLgen one could have a parser and a %first macro with the
	      same  name.   LLnextgen forbids this, as it leads	to name	colli-
	      sions in the new file naming scheme. For the old LLgen file nam-
	      ing  scheme  it  could  also easily lead to name collisions, al-
	      though they could	be circumvented	by not mentioning  the	parser
	      in any of	the C code in the .g files.

       +o      LLgen  names  the	 labels	it generates L_X, where	X is a number.
	      LLnextgen	names these LL_X.

       +o      LLgen parsers are	always reentrant. As this feature is not  used
	      very  often,  LLnextgen parsers are non-reentrant	unless the op-
	      tion --reentrant is used.

       Furthermore, LLnextgen has many extended features, for easier
       development.

       If  you think you have found a bug, please check	that you are using the
       latest version of LLnextgen []. When reporting bugs, please include a
       minimal grammar that demonstrates the problem.

       G.P. Halkes <>

       Copyright (C) 2005-2008 G.P. Halkes
       LLnextgen is licensed under the GNU General Public License version 3.
       For more	details	on the license,	see the	file COPYING in	the documenta-
       tion	directory.     On     Un*x    systems	 this	 is    usually

       LLgen(1), bison(1), yacc(1), lex(1), flex(1).

       A detailed manual for LLnextgen is available as part of	the  distribu-
       tion.   It includes the syntax for the grammar files, details on	how to
       use the generated parser	in your	programs, and details on the  workings
       of the generated	parsers. This manual can be found in the documentation
       directory.     On      Un*x	systems	     this      is      usually

Version	0.5.5			  31-12-2011			  LLnextgen(1)

