RAGEL(1)		 Ragel State Machine Compiler		      RAGEL(1)

NAME
       ragel - compile regular languages into executable state machines

SYNOPSIS
       ragel [options] file

DESCRIPTION
       Ragel compiles executable finite	state machines from regular languages.
       Ragel can generate C, C++, Objective-C, D,  Go,	or  Java  code.	 Ragel
       state machines can not only recognize byte sequences as regular expres-
       sion machines do, but can also execute code at arbitrary	points in  the
       recognition  of a regular language.  User code is embedded using	inline
       operators that do not disrupt the regular language syntax.

       The core	language consists of standard  regular	expression  operators,
       such as union, concatenation and	kleene star, accompanied by action em-
       bedding operators. Ragel	also provides operators	that let  you  control
       any non-determinism that	you create, construct scanners using the long-
       est match paradigm, and	build  state  machines	using  the  statechart
       model.  It  is  also possible to	influence the execution	of a state ma-
       chine from inside an embedded action by jumping	or  calling  to	 other
       parts of	the machine and	reprocessing input.

       Ragel provides a very flexible interface to the host language that
       attempts to place minimal restrictions on how the generated code is
       used and integrated into the application. The generated code has no
       dependencies.

OPTIONS
       -h, -H, -?, --help
	      Display help and exit.

       -v     Print version information	and exit.

       -o  file
	      Write output to file. If -o is not given, a default file name is
	      chosen by replacing the file extension of the input file. For
	      source files ending in .rh the suffix .h is used. For all other
	      source files a suffix based on the output language is used (.c,
	      .cpp, .m, etc.). If -o is not given for Graphviz output, the
	      generated dot file is written to standard output.

       -s     Print some statistics on standard	error.

       --error-format=gnu
	      Print  error  messages using the format "file:line:column:" (de-
	      fault)

       --error-format=msvc
	      Print error messages using the format "file(line,column):"

       -d     Do not remove duplicate actions from action lists.

       -I  dir
	      Add dir to the list of directories to search  for	 included  and
	      imported files

       -n     Do not perform state minimization.

       -m     Perform  minimization once, at the end of	the state machine com-
	      pilation.

       -l     Minimize after nearly every operation. Lists of like  operations
	      such  as	unions	are minimized once at the end. This is the de-
	      fault minimization option.

       -e     Minimize after every operation.

       -x     Compile the state	machines and emit an XML representation	of the
	      host data	and the	machines.

       -V     Generate a dot file for Graphviz.

       -p     Display printable	characters on labels.

       -S <spec>
	      FSM specification	to output.

       -M <machine>
	      Machine definition/instantiation to output.

       -C     The  host	 language is C,	C++, Obj-C or Obj-C++. This is the de-
	      fault host language option.

       -D     The host language	is D.

       -J     The host language	is Java.

       -Z     The host language	is Go.

       -R     The host language	is Ruby.

       -L     Inhibit writing of #line directives.

       -T0    (C/D/Java/Ruby/C#/Go) Generate a table driven FSM. This is the
	      default code style. The table driven FSM represents the state
	      machine as static data. There are tables of states, transitions,
	      indices and actions. The current state is stored in a variable.
	      Execution is a loop that, given the current state and the
	      current character to process, looks up the transition to take
	      using a binary search, executes any actions and moves to the
	      target state. In general, the table driven FSM produces a
	      smaller binary and requires a less expensive host language
	      compile, but results in slower running code. The table driven
	      FSM is suitable for any FSM.

       -T1    (C/D/Ruby/C#/Go) Generate	a faster table driven FSM by expanding
	      action lists in the action execute code.

       -F0    (C/D/Ruby/C#/Go) Generate a flat table driven FSM. Transitions
	      are represented as an array indexed by the current alphabet
	      character. This eliminates the need for a binary search to
	      locate transitions and produces faster code; however, it is
	      only suitable for small alphabets.

       -F1    (C/D/Ruby/C#/Go)	Generate a faster flat table driven FSM	by ex-
	      panding action lists in the action execute code.

       -G0    (C/D/C#/Go) Generate a goto driven FSM. The goto driven FSM rep-
	      resents  the state machine as a series of	goto statements. While
	      in the machine, the current state	is stored by  the  processor's
	      instruction pointer. The execution is a flat function where con-
	      trol is passed from state	to state using gotos. In general,  the
	      goto FSM produces	faster code but	results	in a larger binary and
	      a	more expensive host language compile.

       -G1    (C/D/C#/Go) Generate a faster goto driven	FSM by	expanding  ac-
	      tion lists in the	action execute code.

       -G2    (C/D/Go) Generate	a really fast goto driven FSM by embedding ac-
	      tion lists in the	state machine control code.

       -P<N>  (C/D) N-Way Split	really fast goto-driven	FSM.
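
       The difference between the table driven (-T0) and flat (-F0) styles
       can be sketched outside of Ragel. The Python model below is an
       illustration only and is not Ragel output: the three-state machine
       (roughly /[a-z]+[0-9]*/), the table layout and every name in it are
       assumptions made for this example. It contrasts a binary search over
       sorted key ranges with direct indexing by character.

```python
from bisect import bisect_right

# Hypothetical machine for /[a-z]+[0-9]*/.
# Per-state transition ranges, sorted by lower bound: (low, high, target).
RANGES = {
    0: [(ord("a"), ord("z"), 1)],
    1: [(ord("0"), ord("9"), 2), (ord("a"), ord("z"), 1)],
    2: [(ord("0"), ord("9"), 2)],
}
LOWS = {s: [lo for lo, _, _ in r] for s, r in RANGES.items()}
FINAL = {1, 2}
ERROR = -1

def next_state_table(cs, ch):
    """Table style: binary search the current state's sorted ranges."""
    i = bisect_right(LOWS[cs], ch) - 1
    if i >= 0:
        lo, hi, target = RANGES[cs][i]
        if ch <= hi:
            return target
    return ERROR

# Flat style: one row per state, one slot per alphabet character (128 here).
FLAT = [[ERROR] * 128 for _ in RANGES]
for s, ranges in RANGES.items():
    for lo, hi, target in ranges:
        for ch in range(lo, hi + 1):
            FLAT[s][ch] = target

def next_state_flat(cs, ch):
    """Flat style: index directly by character, no search."""
    return FLAT[cs][ch] if ch < 128 else ERROR

def accepts(data, step):
    """Run the machine over a bytes object using the given step function."""
    cs = 0
    for ch in data:
        cs = step(cs, ch)
        if cs == ERROR:
            return False
    return cs in FINAL
```

       Both step functions give the same answers, for example for
       accepts(b"abc12", next_state_table) and accepts(b"abc12",
       next_state_flat). The flat table spends one slot per alphabet
       character per state, which is why that style suits small alphabets.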

RAGEL INPUT
       NOTE: This is a very brief description of Ragel	input.	Ragel  is  de-
       scribed	in  more  detail in the	user guide available from the homepage
       (see below).

       Ragel normally passes input files straight to the output. When it  sees
       an  FSM	specification that contains machine instantiations it stops to
       generate	the state machine. If there  are  write	 statements  (such  as
       "write exec") then ragel	emits the corresponding	code. There can	be any
       number of FSM specifications in an input	file. A	multi-line FSM	speci-
       fication	starts with '%%{' and ends with	'}%%'. A single	line FSM spec-
       ification starts	with %%	and ends at the	first newline.
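
       The delimiter rules above can be modelled with a few lines of Python.
       This is a simplified sketch and not Ragel's actual scanner: the
       function name split_sections is hypothetical, and the sketch assumes
       that the delimiters never occur inside host-language strings or
       comments, which a real scanner has to account for.

```python
def split_sections(text):
    """Split input into ("host", str) and ("spec", str) pieces."""
    sections = []
    i, n = 0, len(text)
    while i < n:
        j = text.find("%%", i)
        if j == -1:
            sections.append(("host", text[i:]))  # rest is passed through
            break
        if j > i:
            sections.append(("host", text[i:j]))
        if text.startswith("%%{", j):
            # Multi-line specification: runs until the closing }%%.
            end = text.find("}%%", j + 3)
            if end == -1:
                raise ValueError("unterminated %%{ block")
            sections.append(("spec", text[j + 3:end]))
            i = end + 3
        else:
            # Single-line specification: runs until the first newline.
            end = text.find("\n", j + 2)
            if end == -1:
                end = n
            sections.append(("spec", text[j + 2:end]))
            i = end
    return sections
```

       Host pieces correspond to the text that is passed straight to the
       output; spec pieces are what would be handed to the state machine
       compiler.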

FSM STATEMENTS
       Machine Name:
	      Set the name of the machine. If given, it must be the first
	      statement.

       Alphabet	Type:
	      Set the data type	of the alphabet.

       GetKey:
	      Specify  how to retrieve the alphabet character from the element
	      type.

       Include:
	      Include a machine of the same name as the current one, or of a
	      different name, from either the current file or some other file.

       Action Definition:
	      Define an	action that can	be invoked by the FSM.

       Fsm Definition, Instantiation and Longest Match Instantiation:
	      Used to build FSMs. The syntax is described in the next few
	      sections.

       Access:
	      Specify how to access the	persistent state machine variables.

       Write: Write some component of the machine.

       Variable:
	      Override the default variable names (p, pe, cs, act, etc).

BASIC MACHINES
       The  basic  machines  are the base operands of the regular language ex-
       pressions.

       'hello'
	      Concat literal. Produces a concatenation of  the	characters  in
	      the  string.   Supports  escape  sequences with '\'.  The	result
	      will have	a start	state and a transition to a new	state for each
	      character	 in the	string.	The last state in the sequence will be
	      made final. To make the string case-insensitive, append  an  'i'
	      to the string, as	in 'cmd'i.

       "hello"
	      Identical	to single quote	version.

       [hello]
	      Or  literal. Produces a union of characters.  Supports character
	      ranges with '-', negating	the sense of the union with an initial
	      '^'  and	escape	sequences  with	 '\'. The result will have two
	      states with a transition between	them  for  each	 character  or
	      range.

       NOTE:  '',  "",	and [] produce null FSMs. Null machines	have one state
       that is both a start state and a	final state and	match the zero	length
       string. A null machine may be created with the null builtin machine.

       integer
	      Makes a two state	machine	with one transition on the given inte-
	      ger number.

       hex    Makes a two state machine with one transition on the given
	      hexadecimal number.

       /simple_regex/
	      A simple regular expression. Supports the notation '.', '*' and
	      '[]', character ranges with '-', negating the sense of an OR
	      expression with an initial '^' and escape sequences with '\'.
	      Also supports one trailing flag: i. Use it to produce a
	      case-insensitive regular expression, as in /GET/i.

       lit .. lit
	      Specifies	a range. The allowable upper and lower bounds are con-
	      cat literals of length one and number  machines.	 For  example,
	      0x10..0x20,  0..63, and 'a'..'z' are valid ranges.

       variable_name
	      References  the machine definition assigned to the variable name
	      given.

       builtin_machine
	      There are	several	builtin	machines available. They are  all  two
	      state  machines  for  the	 purpose of matching common classes of
	      characters. They are:

	      any    Any character in the alphabet.

	      ascii  Ascii characters 0..127.

	      extend Ascii extended characters.	This is	 the  range  -128..127
		     for  signed  alphabets  and the range 0..255 for unsigned
		     alphabets.

	      alpha  Alphabetic	characters /[A-Za-z]/.

	      digit  Digits /[0-9]/.

	      alnum  Alpha numerics /[0-9A-Za-z]/.

	      lower  Lowercase characters /[a-z]/.

	      upper  Uppercase characters /[A-Z]/.

	      xdigit Hexadecimal digits /[0-9A-Fa-f]/.

	      cntrl  Control characters	0..31.

	      graph  Graphical characters /[!-~]/.

	      print  Printable characters /[ -~]/.

	      punct  Punctuation. Graphical characters that are	not  alpha-nu-
		     merics /[!-/:-@\[-`{-~]/.

	      space  Whitespace	/[\t\v\f\n\r ]/.

	      null   Zero length string. Equivalent to '', "" and [].

	      empty  Empty set.	Matches	nothing.
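
       As a cross-check of the character classes listed above, they can be
       written out as sets of character codes in Python. This is only an
       illustration of the listed ranges: the names CLASSES and in_class are
       hypothetical, and "any" and "extend" are omitted because they depend
       on the alphabet type.

```python
def rng(lo, hi):
    """All character codes from lo to hi, inclusive."""
    return set(range(ord(lo), ord(hi) + 1))

ALPHA = rng("A", "Z") | rng("a", "z")
DIGIT = rng("0", "9")
ALNUM = ALPHA | DIGIT
GRAPH = rng("!", "~")
PRINT = rng(" ", "~")
# punct as listed: /[!-/:-@\[-`{-~]/
PUNCT = rng("!", "/") | rng(":", "@") | rng("[", "`") | rng("{", "~")
XDIGIT = DIGIT | rng("A", "F") | rng("a", "f")

CLASSES = {
    "ascii": set(range(0, 128)),
    "alpha": ALPHA,
    "digit": DIGIT,
    "alnum": ALNUM,
    "lower": rng("a", "z"),
    "upper": rng("A", "Z"),
    "xdigit": XDIGIT,
    "cntrl": set(range(0, 32)),
    "graph": GRAPH,
    "print": PRINT,
    "punct": PUNCT,
    "space": {ord(c) for c in "\t\v\f\n\r "},
}

def in_class(name, ch):
    """True if the single character ch belongs to the named class."""
    return ord(ch) in CLASSES[name]
```

       With these definitions PUNCT equals GRAPH - ALNUM, which matches the
       description of punct as the graphical characters that are not
       alpha-numerics.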

BRIEF OPERATOR REFERENCE
       Operators are grouped by precedence, group 1 being the lowest and group
       9 the highest.

       GROUP 1:

       expr , expr
	      Join machines together without drawing any transitions, setting
	      up a start state or any final states. The start state must be
	      explicitly specified with the "start" label. Final states may be
	      specified with epsilon transitions to the implicitly created
	      "final" state.

       GROUP 2:

       expr | expr
	      Produces a machine that matches any string in machine one	or ma-
	      chine two.

       expr & expr
	      Produces	a  machine that	matches	any string that	is in both ma-
	      chine one	and machine two.

       expr - expr
	      Produces a machine that matches any string that  is  in  machine
	      one but not in machine two.

       expr -- expr
	      Strong  Subtraction. Matches any string in machine one that does
	      not have any string in machine two as a substring.
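
       The four group 2 operators can be read as operations on the sets of
       strings that the machines match. The Python sketch below models each
       machine as a finite set of strings, which is an assumption made for
       illustration (real machines may match infinitely many strings); the
       function names are hypothetical.

```python
def union(one, two):
    """expr | expr: strings in machine one or machine two."""
    return one | two

def intersection(one, two):
    """expr & expr: strings in both machine one and machine two."""
    return one & two

def difference(one, two):
    """expr - expr: strings in machine one but not in machine two."""
    return one - two

def strong_difference(one, two):
    """expr -- expr: strings in one with no string of two as a substring."""
    return {s for s in one if not any(t in s for t in two)}
```

       For one = {"cat", "concat", "dog"} and two = {"cat"}, plain
       subtraction keeps "concat" because it is not itself in machine two,
       while strong subtraction drops it because it contains "cat".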

       GROUP 3:

       expr . expr
	      Produces a machine that matches all the strings in  machine  one
	      followed by all the strings in machine two.

       expr :> expr
	      Entry-Guarded  Concatenation:  terminates	machine	one upon entry
	      to machine two.

       expr :>> expr
	      Finish-Guarded Concatenation: terminates machine	one  when  ma-
	      chine two	finishes.

       expr <: expr
	      Left-Guarded  Concatenation:  gives a higher priority to machine
	      one.

       NOTE: Concatenation is the default operator. Two	machines next to  each
       other with no operator between them results in the concatenation	opera-
       tion.

       GROUP 4:

       label: expr
	      Attaches a label to an expression. Labels can be used by epsilon
	      transitions and fgoto and fcall statements in actions. Also note
	      that referencing a machine definition causes the implicit
	      creation of a label by the same name.

       GROUP 5:

       expr -> label
	      Draws an epsilon transition to the state defined by label. Label
	      must be a	name in	the current scope. Epsilon transitions are re-
	      solved when comma	operators are evaluated	and at the root	of the
	      expression tree of machine assignment/instantiation.

       GROUP 6:	Actions

       An action may be	a name predefined with an action statement or  may  be
       specified directly with '{' and '}' in the expression.

       expr > action
	      Embeds action into starting transitions.

       expr @ action
	      Embeds action into transitions that go into a final state.

       expr $ action
	      Embeds action into all transitions. Does not include pending out
	      transitions.

       expr % action
	      Embeds action into pending out transitions from final states.

       GROUP 6:	EOF Actions

       When a machine's	finish routine is called the current state's  EOF  ac-
       tions are executed.

       expr >/ action
	      Embed an EOF action into the start state.

       expr </ action
	      Embed an EOF action into all states except the start state.

       expr $/ action
	      Embed an EOF action into all states.

       expr %/ action
	      Embed an EOF action into final states.

       expr @/ action
	      Embed an EOF action into all states that are not final.

       expr <>/ action
	      Embed an EOF action into all states that are not the start state
	      and that are not final (middle states).

       GROUP 6:	Global Error Actions

       Global error actions are	stored in states until the final state machine
       has  been fully constructed. They are then transferred to error transi-
       tions, giving the effect	of a default action.

       expr >! action
	      Embed a global error action into the start state.

       expr <! action
	      Embed a global error action into all  states  except  the	 start
	      state.

       expr $! action
	      Embed a global error action into all states.

       expr %! action
	      Embed a global error action into the final states.

       expr @! action
	      Embed a global error action into all states which	are not	final.

       expr <>! action
	      Embed  a	global	error action into all states which are not the
	      start state and are not final (middle states).

       GROUP 6:	Local Error Actions

       Local error actions are stored in states	until  the  named  machine  is
       fully constructed. They are then	transferred to error transitions, giv-
       ing the effect of a default action for a	section	of the total  machine.
       Note  that  the	name  may be omitted, in which case the	action will be
       transferred to error actions upon construction of the current machine.

       expr >^ action
	      Embed a local error action into the start state.

       expr <^ action
	      Embed a local error action into  all  states  except  the	 start
	      state.

       expr $^ action
	      Embed a local error action into all states.

       expr %^ action
	      Embed a local error action into the final	states.

       expr @^ action
	      Embed a local error action into all states which are not final.

       expr <>^ action
	      Embed  a	local  error  action into all states which are not the
	      start state and are not final (middle states).

       GROUP 6:	To-State Actions

       To-state actions are stored in states and executed any time the machine
       moves into a state. This	includes regular transitions, and transfers of
       control such as fgoto. Note that	setting	the current state from outside
       the  machine  (for  example  during initialization) does	not count as a
       transition into a state.

       expr >~ action
	      Embed a to-state action into the start state.

       expr <~ action
	      Embed a to-state action into all states except the start state.

       expr $~ action
	      Embed a to-state action into all states.

       expr %~ action
	      Embed a to-state action into the final states.

       expr @~ action
	      Embed a to-state action into all states which are	not final.

       expr <>~ action
	      Embed a to-state action into all states which are	not the	 start
	      state and	are not	final (middle states).

       GROUP 6:	From-State Actions

       From-state actions are executed whenever a state takes a transition on
       a character.  This includes the error transition	and  a	transition  to
       self.

       expr >* action
	      Embed a from-state action into the start state.

       expr <* action
	      Embed  a	from-state  action  into  every	state except the start
	      state.

       expr $* action
	      Embed a from-state action	into all states.

       expr %* action
	      Embed a from-state action	into the final states.

       expr @* action
	      Embed a from-state action	into all states	which are not final.

       expr <>* action
	      Embed a from-state action	into all  states  which	 are  not  the
	      start state and are not final (middle states).

       GROUP 6:	Priority Assignment

       Priorities are assigned to names	within transitions. Only priorities on
       the same	name are allowed to interact. In the first form	of  priorities
       the name	defaults to the	name of	the machine definition the priority is
       assigned	in.  Transitions do not	have default priorities.

       expr > int
	      Assigns the priority int in all transitions  leaving  the	 start
	      state.

       expr @ int
	      Assigns the priority int in all transitions that go into a final
	      state.

       expr $ int
	      Assigns the priority int in all existing transitions.

       expr % int
	      Assigns the priority int in all pending out transitions.

       A second	form of	priority assignment allows the programmer  to  specify
       the  name  to  which the	priority is assigned, allowing interactions to
       cross machine definition	boundaries.

       expr > (name, int)
	      Assigns the priority int to name in all transitions leaving  the
	      start state.

       expr @ (name, int)
	      Assigns the priority int to name in all transitions that go into
	      a	final state.

       expr $ (name, int)
	      Assigns the priority int to name in all existing transitions.

       expr % (name, int)
	      Assigns the priority int to name in all pending out transitions.

       GROUP 7:

       expr * Produces the kleene star of a machine. Matches zero or more rep-
	      etitions of the machine.

       expr **
	      Longest-Match  Kleene  Star.  This version of kleene star	puts a
	      higher priority on staying in the	machine	over  wrapping	around
	      and  starting over. This operator	is equivalent to ( ( expr ) $0
	      %1 )*.

       expr ? Produces a machine that accepts the machine given	 or  the  null
	      string. This operator is equivalent to  (	expr | '' ).

       expr + Produces the machine concatenated with the kleene star of itself.
	      Matches one or more repetitions of the machine. This operator
	      is equivalent to ( expr . expr* ).

       expr {n}
	      Produces a machine that matches exactly n	repetitions of expr.

       expr {,n}
	      Produces	a machine that matches anywhere	from zero to n repeti-
	      tions of expr.

       expr {n,}
	      Produces a machine that matches n	or more	repetitions of expr.

       expr {n,m}
	      Produces a machine that matches n	to m repetitions of expr.
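
       The repetition operators and the equivalences stated for them can be
       checked on a small model in which a machine is a finite set of
       strings. This Python sketch is an illustration only: the function
       names are hypothetical, and because the kleene star and {n,} match
       unboundedly many repetitions, the model caps the repetition count at
       an explicit upper bound m.

```python
def concat(one, two):
    """expr . expr: every string of one followed by every string of two."""
    return {x + y for x in one for y in two}

def repeat(lang, n, m):
    """expr {n,m}: strings made of n to m repetitions (both inclusive)."""
    result = set()
    power = {""}  # zero repetitions: only the zero length string
    for count in range(m + 1):
        if count >= n:
            result |= power
        power = concat(power, lang)
    return result
```

       In this model expr? is repeat(lang, 0, 1), expr {n} is repeat(lang,
       n, n) and expr {,n} is repeat(lang, 0, n). Up to a bound k, the
       statement that expr + equals ( expr . expr* ) becomes repeat(lang, 1,
       k) == concat(lang, repeat(lang, 0, k - 1)).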

       GROUP 8:

       ! expr Produces a machine that matches any string not matched by the
	      given machine. This operator is equivalent to ( any* - expr ).

       ^ expr Character-Level  Negation.  Matches  any	single	character  not
	      matched by the single character machine expr.

       GROUP 9:

       ( expr )
	      Forces precedence	on operators.

VALUES AVAILABLE IN CODE BLOCKS
       fc     The current character. Equivalent	to *p.

       fpc    A	pointer	to the current character. Equivalent to	p.

       fcurs  An integer value representing the	current	state.

       ftargs An integer value representing the	target state.

       fentry(<label>)
	      An integer value representing the	entry point <label>.

STATEMENTS AVAILABLE IN	CODE BLOCKS
       fhold; Do not advance over the current character. Equivalent to --p;.

       fexec <expr>;
	      Sets  the	current	character to something else. Equivalent	to p =
	      (<expr>)-1;

       fgoto <label>;
	      Jump to the machine defined by <label>.

       fgoto *<expr>;
	      Jump to the entry	point given by	<expr>.	 The  expression  must
	      evaluate to an integer value representing	a state.

       fnext <label>;
	      Set  the	next  state  to	be the entry point defined by <label>.
	      The fnext	statement does not immediately jump to	the  specified
	      state. Any action	code following the statement is	executed.

       fnext *<expr>;
	      Set  the	next  state to be the entry point given	by <expr>. The
	      expression must evaluate to  an  integer	value  representing  a
	      state.

       fcall <label>;
	      Call  the	machine	defined	by <label>. The	next fret will jump to
	      the target of the	transition on which the	action is invoked.

       fcall *<expr>;
	      Call the entry point given by <expr>. The	next fret will jump to
	      the target of the	transition on which the	action is invoked.

       fret;  Return  to  the target state of the transition on	which the last
	      fcall was	made.

       fbreak;
	      Save the current state and immediately break out of the machine.
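
       The effect of fhold, fcall and fret on the execution loop can be
       pictured with a hand-written Python model. It is a sketch of the
       behaviour described above and is not generated by Ragel: the state
       names, the run function and the toy input language (digit runs, with
       text inside parentheses skipped) are assumptions made for the
       example.

```python
DIGITS = "0123456789"

def run(data):
    """Collect digit runs from data, skipping anything inside ( ... )."""
    out = []
    stack = []              # return states saved by the fcall model
    cs = "main"             # current state
    start = 0               # where the current number began
    p, pe = 0, len(data)
    while p < pe:
        fc = data[p]        # fc: the current character
        if cs == "main":
            if fc in DIGITS:
                start, cs = p, "number"
            elif fc == "(":
                stack.append("main")    # fcall: save the transition's target
                cs = "comment"
        elif cs == "number":
            if fc not in DIGITS:
                out.append(data[start:p])
                cs = "main"
                p -= 1                  # fhold: same as --p
        else:                           # cs == "comment"
            if fc == ")":
                cs = stack.pop()        # fret: resume the saved state
        p += 1                          # the driver advances p each step
    if cs == "number":                  # plays the role of an EOF action
        out.append(data[start:pe])
    return out
```

       Because of the fhold step, the character that ends a number is seen
       again by the main state, so in run("12(ab)3") the "(" both ends the
       number "12" and starts the skipped section.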

CREDITS
       Ragel was written by Adrian Thurston  <thurston@complang.org>.	Objec-
       tive-C  output contributed by Erich Ocean. D output contributed by Alan
       West. Ruby output contributed by	Victor Hugo Borja. C Sharp code	gener-
       ation contributed by Daniel Tang. Contributions to Java code generation
       by Colin	Fleming.  Go code generation contributed by Justine Tunney.

SEE ALSO
       re2c(1),	flex(1)

       Homepage: http://www.complang.org/ragel/

Ragel 6.10			  March	2017			      RAGEL(1)
