AFLEX(1)		    General Commands Manual		      AFLEX(1)

       aflex - fast lexical analyzer generator for Ada

       aflex [ -bdfipstvEILT -Sskeleton_file ] [ filename ]

       aflex is a version of the Unix tool lex, but it is written in Ada and
       generates scanners in Ada.  It is upwardly compatible with the UCI tool
       alex, but is much faster	and generates smaller scanners.

       Command	line  options  are given in a different	format than in the old
       UCI alex.  Aflex options are as follows:

       -t     Write the	scanner	output to the standard output rather than to a
	      file.  The default name of the scanner file for base.l is base.a.
	      Note that	this option is not as useful with aflex	because	in ad-
	      dition  to  the  scanner file there are files for	the externally
	      visible dfa functions (base_dfa.a) and the external IO functions

       -b     Generate backtracking information	to aflex.backtrack.  This is a
	      list of scanner states which require backtracking	and the	 input
	      characters  on which they	do so.	By adding rules	one can	remove
	      backtracking states.  If all backtracking	states are  eliminated
	      and  -f  is used,	the generated scanner will run faster (see the
	      -p flag).	 Only users who	wish to	squeeze	every last  cycle  out
	      of their scanners	need worry about this option.

       -d     makes  the generated scanner run in debug	mode.  Whenever	a pat-
	      tern is recognized the scanner will write	to stderr  a  line  of
	      the form:

		  --accepting rule #n

	      Rules  are  numbered  sequentially  with	the first one being 1.
	      Rule #0 is executed when the  scanner  backtracks;  Rule	#(n+1)
	      (where  n	 is the	number of rules) indicates the default action;
	      Rule #(n+2) indicates that the input buffer is empty  and	 needs
	      to  be refilled and then the scan	restarted.  Rules beyond (n+2)
	      are end-of-file actions.
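
	      The numbering scheme above can be sketched as a small lookup.
	      This is an illustrative Python helper, not part of aflex; the
	      function name describe_rule and the returned labels are
	      assumptions made for the example.

```python
def describe_rule(rule: int, n: int) -> str:
    """Interpret a rule number printed by a -d scanner with n user rules.

    Hypothetical helper: it only encodes the numbering described in the text.
    """
    if rule < 0:
        raise ValueError("rule numbers are non-negative")
    if rule == 0:
        return "backtrack"
    if rule <= n:
        return f"user rule {rule}"
    if rule == n + 1:
        return "default action"
    if rule == n + 2:
        return "refill input buffer and restart scan"
    # Anything beyond n + 2 is an end-of-file action.
    return "end-of-file action"
```

	      For a scanner with three rules, "--accepting rule #4" would
	      then be read as the default action.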

       -f     has the same effect as lex's -f flag (do not compress the	 scan-
	      ner tables); the mnemonic	changes	from fast compilation to (take
	      your pick) full table or fast scanner.  The  actual  compilation
	      takes  longer,  since aflex is I/O bound writing out the big ta-
	      ble.  The	compilation of the Ada file containing the scanner  is
	      also likely to take a long time because of the large arrays
	      generated.

       -i     instructs	aflex to generate  a  case-insensitive	scanner.   The
	      case  of	letters	 given in the aflex input patterns will	be ig-
	      nored, and the rules will	be matched regardless  of  case.   The
	      matched text given in yytext will	have the preserved case	(i.e.,
	      it will not be folded).
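
	      The same behaviour can be illustrated with an ordinary regular
	      expression library.  The Python sketch below is only an
	      analogy for what -i does; the keyword "begin" and the function
	      name match_keyword are assumptions for the example.

```python
import re

# Case is ignored while matching, as with a scanner generated using -i.
KEYWORD = re.compile(r"begin", re.IGNORECASE)

def match_keyword(text: str):
    """Return the matched text with its original case, or None."""
    m = KEYWORD.match(text)
    return m.group() if m else None
```

	      Here match_keyword("BeGiN x") yields "BeGiN": the pattern
	      matches regardless of case, but the matched text is not folded.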

       -p     generates	a performance report to	stderr.	 The  report  consists
	      of  comments  regarding  features	 of the	aflex input file which
	      will cause a loss	of performance in the resulting	scanner.  Note
	      that the use of the ^ operator and the -I	flag entail minor per-
	      formance penalties.

       -s     causes the default rule (that unmatched scanner input is	echoed
	      to  stdout)  to  be suppressed.  If the scanner encounters input
	      that does	not match any of its rules, it aborts with  an	error.
	      This option is useful for	finding	holes in a scanner's rule set.
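
	      The difference between the default rule and -s can be sketched
	      with a toy longest-match scanner.  This is a simplified Python
	      model, not aflex's implementation; the names scan,
	      ScannerJammed, and suppress_default are assumptions.

```python
import re

class ScannerJammed(Exception):
    """Raised when no rule matches and the default rule is suppressed."""

def scan(rules, text, suppress_default=False):
    """Tokenize text with (name, pattern) rules using longest match.

    Unmatched characters are echoed (collected here instead of printed)
    unless suppress_default is set, which models the -s option.
    """
    tokens, echoed, pos = [], [], 0
    while pos < len(text):
        best = None
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Prefer the longest match; earlier rules win ties.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best:
            tokens.append((best[0], text[pos:best[1]]))
            pos = best[1]
        elif suppress_default:
            raise ScannerJammed(f"aflex scanner jammed at offset {pos}")
        else:
            echoed.append(text[pos])  # default rule: echo the character
            pos += 1
    return tokens, "".join(echoed)
```

	      With rules for digits and white space only, the input "12 ab"
	      echoes "ab" by default and raises ScannerJammed when
	      suppress_default is set, exposing the hole in the rule set.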

       -v     has  the	same  meaning as for lex (print	to stderr a summary of
	      statistics of the	generated scanner).  Many more statistics  are
	      printed,	though,	 and the summary spans several lines.  Most of
	      the statistics are meaningless to	the casual aflex user, but the
	      first  line identifies the version of aflex, which is useful for
	      figuring out where you stand with respect to patches and new
	      releases.

       -E     instructs	 aflex	to  generate additional	information about each
	      token, including line and	column numbers.	 This  is  needed  for
	      the advanced automatic error correction option in ayacc.

       -I     instructs	 aflex	to generate an interactive scanner.  Normally,
	      scanners generated by aflex always look ahead one	character  be-
	      fore deciding that a rule	has been matched.  At the cost of some
	      scanning overhead, aflex will  generate  a  scanner  which  only
	      looks  ahead  when needed.  Such scanners	are called interactive
	      because if you want to write a scanner for an interactive	system
	      such as a	command	shell, you will	probably want the user's input
	      to be terminated with a newline, and without -I  the  user  will
	      have  to type a character	in addition to the newline in order to
	      have the newline recognized.  This leads to dreadful interactive
	      performance.

	      If all this seems too confusing, here's the general rule: if a
	      human will be typing in input to your scanner, use -I, otherwise
	      don't;  if  you  don't care about	how fast your scanners run and
	      don't want to make any assumptions about the input to your scan-
	      ner, always use -I.

	      Note that -I cannot be used in conjunction with full tables,
	      i.e., the -f flag.

       -L     instructs	aflex to not generate #line directives (see below).

       -T     makes aflex run in trace mode.  It will generate a lot  of  mes-
	      sages  to	stdout concerning the form of the input	and the	resul-
	      tant  non-deterministic  and  deterministic  finite  automatons.
	      This option is mostly for	use in maintaining aflex.

       -Sskeleton_file
	      overrides the default internal skeleton from which aflex
	      constructs its scanners.  You'll probably never need this option
	      unless you are doing aflex maintenance or development.

       aflex is	fully compatible with lex with the following exceptions:

       -      Source file format:

	      The input specification file for aflex must use the following
	      format:

			definitions section
			rules section
			user defined section
			user defined section

       -      lex's %r (Ratfor scanners) and %t	 (translation  table)  options
	      are not supported.

       -      The do-nothing -n	flag is	not supported.

       -      When  definitions	are expanded, aflex encloses them in parenthe-
	      ses.  With lex, the following

		  NAME	  [A-Z][A-Z0-9]*
		  foo{NAME}?	  text_io.put_line( "Found it" );

	      will not match the string	"foo" because when the	macro  is  ex-
	      panded  the  rule	is equivalent to "foo[A-Z][A-Z0-9]*?"  and the
	      precedence is such that the '?' is associated with  "[A-Z0-9]*".
	      With aflex, the rule will be expanded to "foo([A-Z][A-Z0-9]*)?"
	      and so the string	"foo" will match.  Note	that because of	 this,
	      the ^, $,	<s>, and / operators cannot be used in a definition.
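
	      The effect of the added parentheses can be checked with any
	      regular expression engine.  The Python sketch below builds
	      both expansions by string substitution; note that Python
	      reads "*?" as a non-greedy star and not as an optional star,
	      but under either reading the leading [A-Z] stays mandatory,
	      so the outcome for "foo" is the same.  The helper name
	      matches is an assumption.

```python
import re

NAME = "[A-Z][A-Z0-9]*"

# lex-style: plain textual substitution of the definition.
lex_style = "foo" + NAME + "?"
# aflex-style: the definition is wrapped in parentheses first.
aflex_style = "foo(" + NAME + ")?"

def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None
```

	      With these definitions, matches(lex_style, "foo") is false
	      while matches(aflex_style, "foo") is true; both patterns
	      accept "fooAB1".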

       -      Input  can  be  controlled  by redefining	the YY_INPUT function.
	      YY_INPUT's calling sequence is  "YY_INPUT(buf,result,max_size)".
	      Its  action is to	place up to max_size characters	in the charac-
	      ter buffer "buf" and return in the integer variable "result" ei-
	      ther  the	 number	 of characters read or the constant YY_NULL to
	      indicate EOF.  The default YY_INPUT reads	from Standard_Input.

	      You can also add things like keeping track of the input line
	      number this way, but don't expect your scanner to go very fast.

       -      Yytext is	a function returning a vstring.

       -      aflex reads only one input file, while lex's input is made up of
	      the concatenation	of its input files.

       -      The following lex constructs are not supported:

	      - REJECT

	      - %T -- character set tables

	      - %x -- changes to internal array sizes (see below)

       -      Exclusive start-conditions can be declared by using %x instead
	      of %s.  These start-conditions have the property that when they
	      are active, no other rules are active.  Thus a set of rules
	      governed by the same exclusive start condition describes a
	      scanner which is independent of any of the other rules in the
	      aflex input.  This feature makes it easy to specify
	      "mini-scanners" which scan portions of the input that are
	      syntactically different from the rest (e.g., comments).

       -      End-of-file rules.  The special rule "<<EOF>>" indicates
	      actions which are to be taken when an end-of-file is encountered
	      and yywrap() returns non-zero (i.e., indicates no further files
	      to process).  The action can either call text_io.set_input() to
	      switch to a new file to process, in which case the action should
	      finish with YY_NEW_FILE (this is a branch, so subsequent code in
	      the action won't be executed), or it should finish with a return
	      statement.  <<EOF>> rules may not be used with other patterns;
	      they may only be qualified with a list of start conditions.  If
	      an unqualified <<EOF>> rule is given, it applies only to the
	      INITIAL start condition, and not to %s start conditions.  These
	      rules are useful for catching things like unclosed comments.  An
	      example:

		  %x quote
		  <quote><<EOF>>   {
			error( "unterminated quote" );
			}
		  <<EOF>>	   {
			set_input( next_file );
			YY_NEW_FILE;
			}
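
	      How an exclusive start condition isolates a "mini-scanner",
	      and how an end-of-file check catches an unclosed construct,
	      can be modelled in a few lines.  This Python sketch is a
	      simplified analogy (first matching rule wins instead of
	      longest match); the rule table, token names, and the scan
	      function are assumptions for the example.

```python
import re

# (start condition, pattern, token name or None, next condition or None)
RULES = [
    ("INITIAL", r'"',      None,     "QUOTE"),
    ("INITIAL", r"[a-z]+", "WORD",   None),
    ("INITIAL", r"\s+",    None,     None),
    ("QUOTE",   r'[^"]+',  "STRING", None),
    ("QUOTE",   r'"',      None,     "INITIAL"),
]

def scan(text: str):
    """Tokenize text; only rules of the active start condition apply."""
    state, pos, tokens = "INITIAL", 0, []
    while pos < len(text):
        for cond, pattern, token, next_state in RULES:
            if cond != state:
                continue  # exclusive: rules of other conditions are inactive
            m = re.compile(pattern).match(text, pos)
            if m:
                if token:
                    tokens.append((token, m.group()))
                pos = m.end()
                state = next_state or state
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")
    # Models a <quote><<EOF>> rule: end of input inside the quote state.
    if state == "QUOTE":
        raise ValueError("unterminated quote")
    return tokens
```

	      Scanning 'say "hi there" now' produces two WORD tokens around
	      one STRING token, while 'say "oops' fails with "unterminated
	      quote".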

       -      aflex dynamically	resizes	its  internal  tables,	so  directives
	      like "%a 3000" are not needed when specifying large scanners.

       -      aflex  generates --#line comments	mapping	lines in the output to
	      their origin in the input	file.

       -      All actions must be enclosed by curly braces.

       -      Comments may be put in the first section of the input by preced-
	      ing them with '#'.

       -      Ada style	comments are supported instead of C style comments.

       -      All template files are internalized.

       -      The input	source file must end with a ".l" extension.

       The names of the files containing the generated scanner, IO, and DFA
       packages are based on the basename of the input file.  For example, if
       the input file is called scan.l then the scanner file is called scan.a,
       the DFA package is in scan_dfa.a, and scan_io.a is the IO package file.
       All of these file names may be changed by modifying the
       external_file_manager package (see the porting notes for more
       information).
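
       The default naming convention can be expressed as a short function.
       This Python sketch only derives the names described above; the
       function name generated_files and the dictionary keys are assumptions,
       and it does not model changes made through external_file_manager.

```python
from pathlib import Path

def generated_files(input_file: str) -> dict:
    """Return the default output file names for an aflex input file."""
    path = Path(input_file)
    if path.suffix != ".l":
        raise ValueError("the input source file must end with a .l extension")
    base = path.stem  # basename without directory or extension
    return {
        "scanner": f"{base}.a",
        "dfa": f"{base}_dfa.a",
        "io": f"{base}_io.a",
    }
```

       For scan.l this yields scan.a, scan_dfa.a, and scan_io.a.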

       aflex.backtrack
	      backtracking information for -b


       M. E. Lesk and E. Schmidt, LEX - Lexical Analyzer Generator, Computing
       Science Technical Report 39, Bell Telephone Laboratories, Murray Hill,
       NJ, 1975.

       Military Standard Ada Programming Language (ANSI/MIL-STD-1815A-1983),
       American National Standards Institute, January 1983.

       T. Nguyen and K. Forester, Alex - An Ada Lexical Analysis Generator,
       Arcadia Document UCI-88-17, University of California, Irvine, 1988.

       D. Taback and D. Tolani, Ayacc User's Manual, Arcadia Document
       UCI-85-10, University of California, Irvine, 1986.

       John Self.  Based on the	tool flex written and designed by Vern Paxson.
       It reimplements the functionality of the	tool alex designed by Thieu Q.

       Send requests for aflex information to
       Send bug	reports	for aflex to

       aflex  scanner  jammed  - a scanner compiled with -s has	encountered an
       input string which wasn't matched by any	of its rules.

       old-style lex command ignored - the aflex input contains	a lex  command
       (e.g., "%n 1000") which is being	ignored.

       Some  trailing context patterns cannot be properly matched and generate
       warning messages	("Dangerous trailing context").	  These	 are  patterns
       where the ending	of the first part of the rule matches the beginning of
       the second part,	such as	"zx*/xy*", where the 'x*' matches the  'x'  at
       the beginning of	the trailing context.  (Lex doesn't get	these patterns
       right either.)
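
       The ambiguity behind this warning can be made visible by listing every
       position at which the input could be divided between the leading part
       and the trailing context.  This Python sketch is an illustration only
       and says nothing about how aflex itself chooses; the function name
       candidate_splits is an assumption.

```python
import re

def candidate_splits(head: str, trail: str, text: str) -> list:
    """Return every index i where text[:i] fully matches head and
    trail matches at the start of text[i:]."""
    return [
        i
        for i in range(len(text) + 1)
        if re.fullmatch(head, text[:i]) and re.match(trail, text[i:])
    ]
```

       For the pattern "zx*/xy*" and the input "zxxy" there are two
       candidate positions (after "z" and after "zx"), which is what makes
       the pattern dangerous; for "zx*/y" on the same input there is only
       one.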

       Variable trailing context (where both the leading and trailing parts do
       not have	a fixed	length)	entails	a substantial performance loss.

       For  some trailing context rules, parts which are actually fixed-length
       are not recognized as such, leading to the  abovementioned  performance
       loss.  In particular, parts using '|' or {n} are always considered
       variable-length.

       Nulls are not allowed in	aflex inputs or	in the inputs to scanners gen-
       erated by aflex.	 Their presence	generates fatal	errors.

       Pushing	back  definitions enclosed in ()'s can result in nasty,	diffi-
       cult-to-understand problems like:

	    {DIG}  [0-9] -- a digit

       In which	the pushed-back	text is	"([0-9]	-- a digit)".

       Due to both buffering of input and read-ahead, you cannot intermix
       calls to text_io routines, such as text_io.get(), with aflex rules and
       expect it to work.  Call input() instead.

       There are still more features that could be implemented (especially
       REJECT).  Also, the speed of the compressed scanners could be improved.

       The utility needs more complete documentation.

Version	1.4			 10 March 1994			      AFLEX(1)

