ANTLR(1)		      PCCTS Manual Pages		      ANTLR(1)

       antlr - ANother Tool for	Language Recognition

       antlr [options] grammar_files

       Antlr converts an extended form of context-free grammar into a set of C
       functions which directly	implement an efficient form  of	 deterministic
       recursive-descent LL(k) parser.	Context-free grammars may be augmented
       with predicates to allow	semantics to influence parsing;	this allows  a
       form  of	 context-sensitive  parsing.   Selective  backtracking is also
       available to handle non-LL(k) and even non-LALR(k)  constructs.	 Antlr
       also produces a definition of a lexer which can be automatically
       converted into C code for a DFA-based lexer by dlg.  Hence, antlr
       serves a function much like that of yacc; however, it is notably
       more flexible and is more integrated with a lexer generator (antlr
       directly generates dlg code, whereas yacc and lex are given
       independent descriptions).  Unlike yacc, which accepts LALR(1)
       grammars, antlr accepts LL(k) grammars in an extended BNF notation,
       which eliminates the need for precedence rules.

       Like yacc grammars, antlr  grammars  can	 use  automatically-maintained
       symbol  attribute  values referenced as dollar variables.  Further, be-
       cause antlr generates top-down parsers, arbitrary values	may be	inher-
       ited  from  parent rules	(passed	like function parameters).  Antlr also
       has a mechanism for creating and	manipulating abstract-syntax-trees.
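
       As a rough illustration of inherited values, the hand-written C
       sketch below shows a parent rule passing a value down to a child
       rule as an ordinary parameter and receiving a result back.  It is
       not antlr output; the names rule_decl, rule_idlist and Input, and
       the shape of the token stream, are hypothetical.

```c
#include <stddef.h>

/* Hypothetical token stream: a NULL-terminated list of identifier names. */
typedef struct {
    const char **names;
    size_t pos;
} Input;

/* Child rule "idlist". The declared type is inherited from the parent
 * rule as a plain parameter; the number of identifiers consumed is
 * returned upward, much like a synthesized attribute. */
static int rule_idlist(Input *in, const char *inherited_type,
                       const char **types_out)
{
    int count = 0;
    while (in->names[in->pos] != NULL) {
        types_out[count++] = inherited_type; /* tag each id with the type */
        in->pos++;
    }
    return count;
}

/* Parent rule "decl". It knows the type and passes it down. */
static int rule_decl(Input *in, const char *type, const char **types_out)
{
    return rule_idlist(in, type, types_out);
}
```

       Calling rule_decl with the type "int" and three identifiers tags
       all three with "int", which a bottom-up parser could only do after
       the whole list had been reduced.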

       There are various other niceties	in antlr,  including  the  ability  to
       spread  one  grammar over multiple files	or even	multiple grammars in a
       single file, the	ability	to generate a version of the grammar with  ac-
       tions stripped out (for documentation purposes),	and lots more.

       -ck n  Use  up  to n symbols of lookahead when using compressed (linear
	      approximation) lookahead.	 This type of lookahead	is very	 cheap
	      to  compute  and is attempted before full	LL(k) lookahead, which
	      is of exponential	complexity in the worst	case.  In general, the
              compressed lookahead can be much deeper (e.g., -ck 10) than the
	      full lookahead (which usually must be less than 4).
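
              One way to picture the difference is the C sketch below,
              which assumes k=2 and four hypothetical token types.  Full
              lookahead keeps the exact set of token pairs; the linear
              approximation keeps one token set per depth.  This is an
              illustration only, not antlr's analysis code.

```c
#include <stdbool.h>

enum { TOK_A, TOK_B, TOK_C, TOK_D, NUM_TOKENS };  /* hypothetical tokens */

/* Full LL(2) lookahead: the exact set of token pairs that predict an
 * alternative, here {(A,B), (C,D)}. */
static bool full_ll2_match(int t1, int t2)
{
    return (t1 == TOK_A && t2 == TOK_B) ||
           (t1 == TOK_C && t2 == TOK_D);
}

/* Compressed (linear approximation) lookahead: one set per depth,
 * stored as bit masks. Depth 1 = {A,C}, depth 2 = {B,D}. */
static const unsigned depth_set[2] = {
    (1u << TOK_A) | (1u << TOK_C),
    (1u << TOK_B) | (1u << TOK_D)
};

static bool compressed_match(int t1, int t2)
{
    return ((depth_set[0] >> t1) & 1u) && ((depth_set[1] >> t2) & 1u);
}
```

              The approximation also accepts the pair (A,D), which the
              exact set rejects.  It needs only k sets instead of a set of
              k-tuples, so it is cheaper but less precise.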

       -CC    Generate C++ output from both ANTLR and DLG.

       -cr    Generate a cross-reference for all rules.	 For each rule,	 print
	      a	list of	all other rules	that reference it.

       -e1    Ambiguities/errors shown in low detail (default).

       -e2    Ambiguities/errors shown in more detail.

       -e3    Ambiguities/errors shown in excruciating detail.

       -fe file
	      Rename err.c to file.

       -fh file
	      Rename stdpccts.h	header (turns on -gh) to file.

       -fl file
	      Rename lexical output, parser.dlg, to file.

       -fm file
	      Rename file with lexical mode definitions, mode.h, to file.

       -fr file
              Rename the file which remaps globally visible symbols, remap.h,
              to file.

       -ft file
	      Rename tokens.h to file.

       -ga    Generate ANSI-compatible code (default case).  This has not been
              rigorously tested to be ANSI X3J11 C compliant, but it is close.
              The normal output of antlr is currently compilable under K&R C,
              ANSI C, and C++; this option does nothing because antlr
              generates #ifdefs to do the right thing depending on the
              language.

       -gc    Indicates	 that antlr should generate no C code, i.e., only per-
	      form analysis on the grammar.

       -gd    C	code is	inserted in each of the	antlr generated	parsing	 func-
	      tions  to	 provide for user-defined handling of a	detailed parse
	      trace.  The inserted code	consists of calls to the user-supplied
	      macros  or  functions called zzTRACEIN and zzTRACEOUT.  The only
	      argument is a char * pointing to a C-style string	which  is  the
	      grammar  rule recognized by the current parsing function.	 If no
	      definition is given for the trace	functions, upon	rule entry and
	      exit,  a	message	 will  be printed indicating that a particular
              rule has been entered or exited.
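
              A minimal sketch of user-supplied trace functions follows.
              The names zzTRACEIN and zzTRACEOUT and the char * argument
              come from the description above; the indentation scheme and
              the trace_depth counter are assumptions made for the
              example, and how the generated parser picks up these
              definitions is not covered here.

```c
#include <stdio.h>

static int trace_depth = 0;  /* hypothetical nesting counter */

/* Called on rule entry; rule names the grammar rule being recognized. */
void zzTRACEIN(char *rule)
{
    fprintf(stderr, "%*senter %s\n", trace_depth * 2, "", rule);
    trace_depth++;
}

/* Called on rule exit with the same rule name. */
void zzTRACEOUT(char *rule)
{
    if (trace_depth > 0)
        trace_depth--;
    fprintf(stderr, "%*sexit  %s\n", trace_depth * 2, "", rule);
}
```

              Indenting by nesting depth makes the trace read like an
              outline of the parse.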

       -ge    Generate an error	class for each non-terminal.

       -gh    Generate stdpccts.h for non-ANTLR-generated  files  to  include.
	      This  file  contains  all	defines	needed to describe the type of
	      parser generated by antlr	(e.g. how much lookahead is  used  and
	      whether  or  not	trees are constructed) and contains the	header
	      action specified by the user.

       -gk    Generate parsers that  delay  lookahead  fetches	until  needed.
	      Without this option, antlr generates parsers which always	have k
	      tokens of	lookahead available.
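
              The difference between the two styles can be sketched in C
              as below.  The Lookahead structure, the fetch counter, and
              the fixed depth K are hypothetical and serve only to show
              when tokens are pulled from the input; this is not the code
              antlr emits.

```c
#include <stddef.h>

#define K 2   /* hypothetical lookahead depth */

typedef struct {
    const int *src;   /* hypothetical token source: array ended by 0 */
    size_t next;      /* index of the next unread token in src */
    int buf[K];       /* lookahead buffer */
    int filled;       /* number of valid entries in buf */
    int fetches;      /* tokens pulled from src so far */
} Lookahead;

/* Pull one token from the source; 0 (end of input) is returned repeatedly. */
static int fetch(Lookahead *la)
{
    int t = la->src[la->next];
    if (t != 0)
        la->next++;
    la->fetches++;
    return t;
}

/* Delayed style: fetch only as far as the requested LA(i), 1 <= i <= K. */
static int la_lazy(Lookahead *la, int i)
{
    while (la->filled < i)
        la->buf[la->filled++] = fetch(la);
    return la->buf[i - 1];
}

/* Eager style: always fill the buffer to K tokens before answering. */
static int la_eager(Lookahead *la, int i)
{
    (void)la_lazy(la, K);
    return la->buf[i - 1];
}
```

              Asking only for LA(1) costs one fetch in the delayed style
              and K fetches in the eager style, which matters when reading
              a token may block, for example on interactive input.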

       -gl    Generate line info about grammar actions in the C parser, of the
              form # line "file".  This makes error messages from the C/C++
              compiler more meaningful, as they point into the grammar file
              rather than the resulting C file.  Debugging is easier as well,
              because you step through the grammar rather than the C file.
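
              The effect of such a directive can be seen in the small C
              sketch below, which uses the standard form
              #line number "file".  The file name expr.g and the line
              number 42 are made up for the example.

```c
/* Hypothetical excerpt of a generated parser file. The directive below
 * tells the compiler that the next source line is line 42 of "expr.g". */
static int action_line(void)
{
#line 42 "expr.g"
    return __LINE__;   /* evaluates to 42, not this file's own line number */
}

static const char *action_file(void)
{
    return __FILE__;   /* "expr.g", as set by the #line directive above */
}
```

              A compiler diagnostic or debugger line entry for the return
              statement in action_line would therefore refer to expr.g,
              line 42.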

       -gs    Do not generate sets for token expression	lists; instead	gener-
	      ate a ||-separated sequence of LA(1)==token_number.  The default
	      is to generate sets.
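
              The two forms of test can be compared in the C sketch below.
              The token numbers and the bit-set layout are assumptions
              chosen for the example and do not reproduce antlr's
              generated code.

```c
#include <stdbool.h>

/* Hypothetical token numbers. */
enum { T_PLUS = 1, T_MINUS = 2, T_STAR = 3, T_SLASH = 4 };

/* Set form: one bit per token number, tested with a shift and a mask. */
static const unsigned char op_set[1] = {
    (1u << T_PLUS) | (1u << T_MINUS) | (1u << T_STAR) | (1u << T_SLASH)
};

static bool in_set(int la1)
{
    return la1 >= 0 && la1 < 8 && ((op_set[la1 / 8] >> (la1 % 8)) & 1u);
}

/* Comparison form: a ||-separated chain of equality tests on LA(1). */
static bool in_chain(int la1)
{
    return la1 == T_PLUS || la1 == T_MINUS ||
           la1 == T_STAR || la1 == T_SLASH;
}
```

              Both functions accept the same tokens.  The set test costs
              the same for any number of tokens, while the chain grows
              with the list but is easier to read in the generated source.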

       -gt    Generate code for	Abstract-Syntax	Trees.

       -gx    Do not create the	lexical	analyzer  files	 (dlg-related).	  This
	      option should be given when the user wishes to provide a custom-
	      ized lexical analyzer.  It may also be used in make  scripts  to
	      cause  only the parser to	be rebuilt when	a change not affecting
	      the lexical structure is made to the input grammars.

       -k n   Set k of LL(k) to n; i.e., set the number of tokens of lookahead
              (default is 1).

       -o dir Directory	where output files should go (default=".").   This  is
	      very  nice  for  keeping the source directory clear of ANTLR and
	      DLG spawn.

       -p     The complete grammar, collected from all input grammar files and
	      stripped of all comments and embedded actions, is	listed to std-
	      out.  This is intended to	aid in viewing the entire grammar as a
	      whole and	to eliminate the need to keep actions concisely	stated
	      so that the grammar is easier to read.  Hence, it	is  preferable
	      to  embed	 even  complex actions directly	in the grammar,	rather
	      than to call them	as  subroutines,  since	 the  subroutine  call
	      overhead will be saved.

       -pa    This  option  is	the same as -p except that the output is anno-
	      tated with the first sets	determined from	grammar	analysis.

       -prc on
	      Turn on the computation and hoisting of predicate	context.

       -prc off
	      Turn off the computation	and  hoisting  of  predicate  context.
	      This  option makes 1.10 behave like the 1.06 release with	option
	      -pr on.  Context computation is off by default.

       -rl n  Limit the	maximum	number of tree nodes used by grammar  analysis
	      to  n.   Occasionally, antlr is unable to	analyze	a grammar sub-
	      mitted by	the user.  This	rare situation can only	occur when the
	      grammar  is  large  and  the amount of lookahead is greater than
	      one.  A nonlinear	analysis algorithm is used by PCCTS to	handle
	      the  general  case  of LL(k) parsing.  The average complexity of
	      analysis,	however, is near linear	due to some fancy footwork  in
	      the implementation which reduces the number of calls to the full
	      LL(k) algorithm.	An error message will be  displayed,  if  this
	      limit  is	 reached,  which indicates the grammar construct being
	      analyzed when antlr hit a	non-linearity.	 Use  this  option  if
              antlr seems to go out to lunch and your disk starts thrashing;
	      try n=10000 to start.  Once the  offending  construct  has  been
	      identified, try to remove	the ambiguity that antlr was trying to
	      overcome with large lookahead  analysis.	 The  introduction  of
	      (...)?  backtracking blocks eliminates some of these problems --
	      antlr does not analyze alternatives that begin with  (...)?  (it
	      simply backtracks, if necessary, at run time).
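
              The run-time behaviour of a backtracking block can be
              pictured with the hand-written C sketch below.  It
              recognizes either any number of 'a' followed by 'b', or any
              number of 'a' followed by 'c', which no fixed k can
              separate.  The Stream type and all function names are
              hypothetical; this is not antlr-generated code.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { const char *text; size_t pos; } Stream;

/* Consume one character if it matches. */
static bool match(Stream *s, char c)
{
    if (s->text[s->pos] == c) { s->pos++; return true; }
    return false;
}

/* Alternative 1, tried speculatively: 'a'* 'b' */
static bool alt_ab(Stream *s)
{
    while (match(s, 'a')) { }
    return match(s, 'b');
}

/* Alternative 2: 'a'* 'c' */
static bool alt_ac(Stream *s)
{
    while (match(s, 'a')) { }
    return match(s, 'c');
}

/* Guess alternative 1; on failure rewind the input and take alternative 2.
 * Returns the number of the alternative matched, or 0 on a syntax error. */
static int rule(Stream *s)
{
    size_t mark = s->pos;
    if (alt_ab(s)) return 1;
    s->pos = mark;
    if (alt_ac(s)) return 2;
    s->pos = mark;
    return 0;
}
```

              No lookahead analysis is needed to choose between the two
              alternatives; the cost is paid at run time, when the input
              is scanned again after a failed guess.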

       -w1    Set low warning level.  Do not warn if semantic predicates
              and/or (...)? blocks are assumed to cover ambiguous
              alternatives.

       -w2    Ambiguous parsing decisions yield warnings even if semantic
              predicates or (...)? blocks are used.  Warn if predicate context
              is computed and semantic predicates incompletely disambiguate
              alternative productions.

       -      Read grammar from	standard input and  generate  stdin.c  as  the
	      parser file.

       Antlr  works...	we think.  There is no implicit	guarantee of anything.
       We reserve no legal rights to the software known	as the Purdue Compiler
       Construction Tool Set (PCCTS) --	PCCTS is in the	public domain.	An in-
       dividual	or company may do whatever they	wish with source code distrib-
       uted  with PCCTS	or the code generated by PCCTS,	including the incorpo-
       ration of PCCTS,	or its output, into commercial software.  We encourage
       users  to  develop software with	PCCTS.	However, we do ask that	credit
       is given	to us for developing PCCTS.  By	"credit", we mean that if  you
       incorporate our source code into	one of your programs (commercial prod-
       uct, research project, or otherwise) that  you  acknowledge  this  fact
       somewhere  in  the  documentation, research report, etc...  If you like
       PCCTS and have developed	a nice tool with the  output,  please  mention
       that  you  developed  it	 using PCCTS.  As long as these	guidelines are
       followed, we expect to continue enhancing this  system  and  expect  to
       make other tools	available as they are completed.

       *.c    output C parser.

       *.cpp  output C++ parser	when C++ mode is used.

       parser.dlg
              output dlg lexical analyzer.

       err.c  token  string array, error sets and error	support	routines.  Not
	      used in C++ mode.

       remap.h
              file that redefines all globally visible parser symbols.  The
              use of the #parser directive creates this file.  Not used in
              C++ mode.

       stdpccts.h
              list of definitions needed by C files, not generated by PCCTS,
              that reference PCCTS objects.  This is not generated by default.
              Not used in C++ mode.

       tokens.h
              output #defines for tokens used and function prototypes for
              functions generated for rules.

       dlg(1), pccts(1)

ANTLR				September 1995			      ANTLR(1)

