PATTERNS(7)        FreeBSD Miscellaneous Information Manual        PATTERNS(7)

NAME
     patterns -- Lua's pattern matching rules

DESCRIPTION
     Pattern matching in httpd(8) is based on the implementation of the Lua
     scripting language and provides a simple and fast alternative to the
     regular expressions (REs) that are described in re_format(7).  Patterns
     are described by regular strings, which are interpreted as patterns by
     the pattern-matching "find" and "match" functions.  This document
     describes the syntax and the meaning (that is, what they match) of these
     strings.

CHARACTER CLASS
     A character class is used to represent a set of characters.  The
     following combinations are allowed in describing a character class:

     x       (where x is not one of the magic characters `^$()%.[]*+-?')
             represents the character x itself.

     .       (a dot) represents all characters.

     %a      represents all letters.

     %c      represents all control characters.

     %d      represents all digits.

     %g      represents all printable characters except space.

     %l      represents all lowercase letters.

     %p      represents all punctuation characters.

     %s      represents all space characters.

     %u      represents all uppercase letters.

     %w      represents all alphanumeric characters.

     %x      represents all hexadecimal digits.

     %x      (where x is any non-alphanumeric character) represents the
             character x.  This is the standard way to escape the magic
             characters.  Any non-alphanumeric character (including all
             punctuation characters, even the non-magical) can be preceded by
             a `%' when used to represent itself in a pattern.

     [set]   represents the class which is the union of all characters in
             set.  A range of characters can be specified by separating the
             end characters of the range, in ascending order, with a `-'.
             All classes `%x' described above can also be used as components
             in set.  All other characters in set represent themselves.  For
             example, `[%w_]' (or `[_%w]') represents all alphanumeric
             characters plus the underscore, `[0-7]' represents the octal
             digits, and `[0-7%l%-]' represents the octal digits plus the
             lowercase letters plus the `-' character.

             The interaction between ranges and classes is not defined.
             Therefore, patterns like `[%a-z]' or `[a-%%]' have no meaning.

     [^set]  represents the complement of set, where set is interpreted as
             above.

     For all classes represented by single letters (`%a', `%c', etc.), the
     corresponding uppercase letter represents the complement of the class.
     For instance, `%S' represents all non-space characters.

     The definitions of letter, space, and other character groups depend on
     the current locale.  In particular, the class `[a-z]' may not be
     equivalent to `%l'.
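
     The class and set rules above can be sketched in a few lines of Python.
     This is an illustration only, not the httpd(8) implementation: the
     helper names class_matches and set_matches are hypothetical, and the
     class tables are ASCII-only approximations that ignore the locale
     dependence noted above.

```python
import string

# ASCII-only approximations of the single-letter classes (assumption: "C" locale).
_CLASSES = {
    "a": set(string.ascii_letters),
    "c": {chr(i) for i in range(32)} | {chr(127)},
    "d": set(string.digits),
    "g": {chr(i) for i in range(33, 127)},
    "l": set(string.ascii_lowercase),
    "p": set(string.punctuation),
    "s": set(string.whitespace),
    "u": set(string.ascii_uppercase),
    "w": set(string.ascii_letters + string.digits),
    "x": set(string.hexdigits),
}


def class_matches(letter, ch):
    """True if ch belongs to %letter; an uppercase letter complements the class."""
    members = _CLASSES[letter.lower()]
    return (ch in members) != letter.isupper()


def set_matches(body, ch):
    """True if ch is in the set written as [body]; body excludes the brackets."""
    negate = body.startswith("^")
    i = 1 if negate else 0
    found = False
    while i < len(body):
        if body[i] == "%" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt.lower() in _CLASSES:
                found = found or class_matches(nxt, ch)   # %a, %S, ...
            else:
                found = found or ch == nxt                # escaped literal, e.g. %-
            i += 2
        elif i + 2 < len(body) and body[i + 1] == "-":
            found = found or body[i] <= ch <= body[i + 2]  # range such as 0-7
            i += 3
        else:
            found = found or ch == body[i]                # plain character
            i += 1
    return found != negate
```

     For example, set_matches("0-7%l%-", "-") is true because `%-' is an
     escaped literal, while set_matches("^%s", " ") is false because the
     leading `^' complements the set.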

PATTERN ITEM
     A pattern item can be

     o   a single character class, which matches any single character in the
         class;

     o   a single character class followed by `*', which matches zero or more
         repetitions of characters in the class.  These repetition items will
         always match the longest possible sequence;

     o   a single character class followed by `+', which matches one or more
         repetitions of characters in the class.  These repetition items will
         always match the longest possible sequence;

     o   a single character class followed by `-', which also matches zero or
         more repetitions of characters in the class.  Unlike `*', these
         repetition items will always match the shortest possible sequence;

     o   a single character class followed by `?', which matches zero or one
         occurrence of a character in the class.  It always matches one
         occurrence if possible;

     o   `%n', for n between 1 and 9; such an item matches a substring equal
         to the n-th captured string (see CAPTURES);

     o   `%bxy', where x and y are two distinct characters; such an item
         matches strings that start with x, end with y, and where the x and y
         are balanced.  This means that if one reads the string from left to
         right, counting +1 for an x and -1 for a y, the ending y is the
         first y where the count reaches 0.  For instance, the item `%b()'
         matches expressions with balanced parentheses;

     o   `%f[set]', a frontier pattern; such an item matches an empty string
         at any position such that the next character belongs to set and the
         previous character does not belong to set.  The set set is
         interpreted as previously described.  The beginning and the end of
         the subject are handled as if they were the character `\0'.
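
     The counting rule for `%bxy' and the neighbor test for `%f[set]' can be
     written out directly.  The following Python sketch is an illustration
     under stated assumptions, not the httpd(8) code: match_balanced and
     frontier_positions are hypothetical helper names, and set membership is
     passed in as a plain predicate rather than parsed from a pattern.

```python
def match_balanced(subject, start, open_ch, close_ch):
    """Return the end index (exclusive) of a %bxy match at start, or None."""
    if start >= len(subject) or subject[start] != open_ch:
        return None
    depth = 1  # the opening character at start counts +1
    for i in range(start + 1, len(subject)):
        if subject[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i + 1  # first closing character where the count hits 0
        elif subject[i] == open_ch:
            depth += 1
    return None  # ran out of input before the count reached 0


def frontier_positions(subject, in_set):
    """Return every position where %f[set] would match the empty string."""
    # Both ends of the subject are treated as the character "\0".
    padded = "\0" + subject + "\0"
    return [
        i
        for i in range(len(subject) + 1)
        if not in_set(padded[i]) and in_set(padded[i + 1])
    ]


end = match_balanced("f(a(b)c)d", 1, "(", ")")
print("f(a(b)c)d"[1:end])                                  # (a(b)c)
print(frontier_positions("the (quick) fox", str.isalpha))  # [0, 5, 12]
```

     With the complemented predicate (the equivalent of `%f[%A]') the same
     subject yields the positions 3, 10, and 15; the last one exists only
     because the end of the subject is treated as `\0'.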

PATTERN
     A pattern is a sequence of pattern items.  A caret `^' at the beginning
     of a pattern anchors the match at the beginning of the subject string.
     A `$' at the end of a pattern anchors the match at the end of the
     subject string.  At other positions, `^' and `$' have no special meaning
     and represent themselves.
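
     The effect of anchoring, and the difference between the longest-match
     `*' and the shortest-match `-' items, can be illustrated with rough
     equivalents in Python's re module.  This is an analogy only: re is a
     different engine with different syntax, and the translations in the
     comments are assumptions made for this sketch.

```python
import re

subject = "key=a=b"

# Lua "^(.*)=": '*' takes the longest run, so the match ends at the last '='.
greedy = re.match(r"(.*)=", subject, re.DOTALL).group(1)

# Lua "^(.-)=": '-' takes the shortest run, so the match ends at the first '='.
lazy = re.match(r"(.*?)=", subject, re.DOTALL).group(1)

# Lua "^%d+$": anchored at both ends, so the whole subject must be digits.
all_digits = re.fullmatch(r"[0-9]+", "2017") is not None

print(greedy)      # key=a
print(lazy)        # key
print(all_digits)  # True
```

     re.match anchors at the beginning of the subject like a leading `^', and
     re.fullmatch additionally anchors at the end like a trailing `$'.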

CAPTURES
     A pattern can contain sub-patterns enclosed in parentheses; they
     describe captures.  When a match succeeds, the substrings of the subject
     string that match captures are stored (captured) for future use.
     Captures are numbered according to their left parentheses.  For
     instance, in the pattern "(a*(.)%w(%s*))", the part of the string
     matching "a*(.)%w(%s*)" is stored as the first capture (and therefore
     has number 1); the character matching "." is captured with number 2,
     and the part matching "%s*" has number 3.

     As a special case, the empty capture `()' captures the current string
     position (a number).  For instance, if we apply the pattern "()aa()" on
     the string "flaaap", there will be two captures: 2 and 4 (positions
     count from 0; see CAVEATS).
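
     Capture numbering by left parenthesis and position captures can be
     checked against Python's re module, which numbers groups the same way
     and reports 0-based offsets.  This is an analogy rather than the
     httpd(8) API; the subject string "aaxy  z" and the rewriting of `%w' as
     [A-Za-z0-9] are assumptions made for this sketch.

```python
import re

# Lua "(a*(.)%w(%s*))" rewritten for Python's re.
m = re.match(r"(a*(.)[A-Za-z0-9](\s*))", "aaxy  z")
numbered = m.groups()

# Lua "()aa()": the empty groups record where they matched.
p = re.search(r"()aa()", "flaaap")
positions = (p.start(1), p.start(2))

print(numbered)   # ('aaxy  ', 'x', '  ')
print(positions)  # (2, 4)
```

     Group 1 is the outermost pair because its left parenthesis comes first;
     the two positions agree with the 0-based values given in the example
     above.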

SEE ALSO
     fnmatch(3), re_format(7), httpd(8)

     Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes,
     Patterns, Lua 5.3 Reference Manual,
     http://www.lua.org/manual/5.3/manual.html#6.4.1, Lua.org PUC-Rio, June
     2015.

HISTORY
     The first implementation of the pattern rules was introduced with Lua
     2.5.  Almost twenty years later, an implementation based on Lua 5.3.1
     appeared in OpenBSD 5.8.

AUTHORS
     The pattern matching is derived from the original implementation of the
     Lua scripting language written by Roberto Ierusalimschy, Waldemar Celes,
     and Luiz Henrique de Figueiredo at PUC-Rio.  It was turned into a native
     C API for httpd(8) by Reyk Floeter <reyk@openbsd.org>.

CAVEATS
     A notable difference from the Lua implementation is the position in the
     string returned by captures.  It follows C-style indexing (positions
     start from 0) instead of Lua-style indexing (positions start from 1).

FreeBSD 13.0                     June 10, 2017                    FreeBSD 13.0
