FBB::Pattern(3bobcat)		Pattern	matcher		 FBB::Pattern(3bobcat)

       FBB::Pattern - Performs RE pattern matching

       #include	<bobcat/pattern>
       Linking option: -lbobcat

       Pattern	objects	may be used for	Regular	Expression (RE)	pattern	match-
       ing. The	class is a wrapper around the regcomp(3) family	of  functions.
       By default it uses `extended regular expressions', requiring you	to es-
       cape multipliers	and bounding-characters	when  they  should  be	inter-
       preted as ordinary characters (i.e., *, +, ?, ^,	$, |, (, ), [, ], {, }
       should be escaped when used as literal characters).

       The Pattern class supports the use of the following (Perl-like) special
       escape sequences:
       \b - indicating a word-boundary
       \d - indicating a digit ([[:digit:]]) character
       \s - indicating a white-space ([:space:]) character
       \w - indicating a word ([:alnum:]) character

       The  corresponding capitals (e.g., \W) define the complementary charac-
       ter sets. The capitalized character set shorthands are not expanded in-
       side  explicit character-classes	(i.e., [ ... ] constructions). So [\W]
       represents a set	of two characters: \ and W.

       As the backslash	(\) is treated as a special  character	it  should  be
       handled	carefully. Pattern converts the	escape sequences \d \s \w (and
       outside of explicit character classes the sequences \D \S \W) to	 their
       respective  character  classes.	All other escape sequences are kept as
       is, and the resulting regular expression	 is  offered  to  the  pattern
       matching	 compilation function regcomp(3). This function	will again in-
       terpret escape sequences. Consequently some care	 should	 be  exercised
       when defining patterns containing escape	sequences. Here	are the	rules:

       o      Special  escape  sequences  (like	\d) are	converted to character
	      classes. E.g.,

		  Specify:    Converts to:    regcomp uses:	 Matches:
		  \d	      [[:digit:]]     [[:digit:]]	 3

       o      Ordinary escape sequences	(like \x) are kept as-is. E.g.,

		  Specify:    Converts to:    regcomp uses:	 Matches:
		  \x	      \x	      x			 x

       o      To specify a literal escape sequence, it must be written	twice.

		  Specify:    Converts to:    regcomp uses:	 Matches:
		  \\x	      \\x	      \x		 \x
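
       The conversion rules above can be sketched with regcomp(3) directly.
       This is a minimal illustration, not Bobcat's implementation: the
       helpers expandEscapes() and matches() are hypothetical names, only
       \d, \s and \w are handled (\b and the capitals are left out), and no
       special treatment of explicit character classes is attempted.

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical helper (not Bobcat's code): rewrite \d, \s and \w to POSIX
// bracket expressions and copy every other escape sequence unchanged.
std::string expandEscapes(std::string const &pattern)
{
    std::string out;

    for (std::size_t idx = 0; idx != pattern.size(); ++idx)
    {
        char ch = pattern[idx];

        if (ch != '\\' || idx + 1 == pattern.size())
        {
            out += ch;                      // ordinary character
            continue;
        }

        char next = pattern[++idx];         // character following the backslash
        switch (next)
        {
            case 'd': out += "[[:digit:]]"; break;
            case 's': out += "[[:space:]]"; break;
            case 'w': out += "[[:alnum:]]"; break;
            default:                        // kept as-is, e.g. \x or "\\"
                out += '\\';
                out += next;
            break;
        }
    }
    return out;
}

// Compile the expanded pattern as an extended RE and look for it in text.
bool matches(std::string const &pattern, std::string const &text)
{
    regex_t re;

    if (regcomp(&re, expandEscapes(pattern).c_str(),
                REG_EXTENDED | REG_NOSUB) != 0)
        throw std::runtime_error("regcomp failed");

    bool found = regexec(&re, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&re);
    return found;
}
```

       Note that inside C++ source each backslash of a pattern must itself
       be doubled in a string literal, so the pattern \d+ is written as
       "\\d+" and is handed to regcomp(3) as [[:digit:]]+.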

       All  constructors,  members,  operators	and manipulators, mentioned in
       this man-page, are defined in the namespace FBB.


       o      Pattern::Position:
              A nested type representing the offset of the first character
              and the offset beyond the last character of the matched text or
              indexed subexpression, defined as
              std::pair<std::string::size_type, std::string::size_type>.

       o      Pattern():
              The default constructor defines no pattern, but is available as
              a placeholder for, e.g., containers requiring default construc-
              tors. A Pattern object thus constructed cannot be used to match
              patterns until it has received one, either by assignment from
              another Pattern object or through the member setPattern() (see
              below). An FBB::Exception object is thrown if the object could
              not be constructed.

       o      Pattern(std::string  const  &pattern, bool caseSensitive = true,
	      size_t nSub = 10,	int options = REG_EXTENDED | REG_NEWLINE):
	      This constructor compiles	pattern, preparing the Pattern	object
	      for  pattern  matches.  The  second parameter determines whether
	      case sensitive matching will be used (the	default) or not.  Sub-
	      expressions are defined by parentheses pairs. Each matching pair
	      defines a	subexpression, where the order-number of their opening
	      parentheses  determines the subexpression's index. By default at
              most 10 subexpressions are recognized. The options flags may
              be:

              REG_EXTENDED:
              Use POSIX Extended Regular Expression syntax when interpreting
              regex. If not set, POSIX Basic Regular Expression syntax is
              used.

              REG_NOSUB:
              Support for substring addressing of matches is not required.
              The nmatch and pmatch parameters to regexec are ignored if the
              pattern buffer supplied was compiled with this flag set.

              REG_NEWLINE:
              Match-any-character operators don't match a newline.

              A non-matching list ([^...]) not containing a newline does not
              match a newline.

              Match-beginning-of-line operator (^) matches the empty string
              immediately after a newline, regardless of whether eflags, the
              execution flags of regexec, contains REG_NOTBOL.

              Match-end-of-line operator ($) matches the empty string im-
              mediately before a newline, regardless of whether eflags con-
              tains REG_NOTEOL.
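
       The effect of the compilation flags can be observed with regcomp(3)
       itself. The following is a small sketch under that assumption: it
       does not use Pattern, and found() is a hypothetical helper that
       compiles a pattern with the given flags and runs it once.

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: compile `pattern` using `cflags` and report whether
// it is found anywhere in `text`. REG_NOSUB is added because no offsets
// are requested here.
bool found(std::string const &pattern, std::string const &text, int cflags)
{
    regex_t re;

    if (regcomp(&re, pattern.c_str(), cflags | REG_NOSUB) != 0)
        throw std::runtime_error("regcomp failed");

    bool ok = regexec(&re, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&re);
    return ok;
}
```

       With the text "one\ntwo" the pattern ^two is found when REG_EXTENDED |
       REG_NEWLINE is used, but not with REG_EXTENDED alone. Conversely,
       one.two is found with REG_EXTENDED alone, but not once REG_NEWLINE is
       added, as the dot then no longer matches the newline.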

       Pattern offers  copy and	move constructors.

       All members of std::ostringstream and   std::exception  are  available,
       as Pattern inherits from	these classes.

       o      std::string before() const:
	      Following	 a  successful match, before() returns the text	before
	      the matched text.

       o      std::string beyond() const:
	      Following	a successful match, beyond() returns the  text	beyond
	      the matched text.

       o      size_t end() const:
	      Returns  the  number  of	matched	 elements (text	and subexpres-
	      sions). end() is the lowest index	value for which	position() re-
	      turns  two  std::string::npos  values (see the position()	member
	      function,	below).

       o      void match(std::string const &text, int options =	0):
              Match a string with a pattern. If the text could not be
              matched, an Exception exception is thrown, using
              Pattern::match() as its prefix-text.

              Options may be:

              REG_NOTBOL:
              The match-beginning-of-line operator always fails to match (but
              see the compilation flag REG_NEWLINE above). This flag may be
              used when different portions of a string are passed to regexec
              and the beginning of the string should not be interpreted as the
              beginning of the line.

              REG_NOTEOL:
              The match-end-of-line operator always fails to match (but see
              the compilation flag REG_NEWLINE above).

       o      std::string matched() const:
              Following a successful match, this function returns the matched
              text.

       o      std::string const	&pattern() const:
	      This member function returns the pattern that is offered to reg-
	      comp(3).	It  returns  the  contents  of a static	string that is
	      overwritten at each construction of a Pattern object and at each
	      call of the setPattern() member function.

       o      Pattern::Position	position(size_t	index) const:
              With index == 0 the offsets of the fully matched text are re-
              turned (the text itself is returned by matched()). Other index
              values return the offsets of the corresponding subexpressions.
              The pair (std::string::npos, std::string::npos) is returned if
              index is at least end() (which may happen at index value 0).

       o      void setPattern(std::string const	&pattern, bool caseSensitive =
              true, size_t nSub = 10,
              int options = REG_EXTENDED | REG_NEWLINE):
              This member function installs a new compiled pattern in its
              Pattern object. This member's parameters are identical to the
              second constructor's parameters. Refer to that constructor for
              details about the parameters. Like the constructor, an
              FBB::Exception exception is thrown if the new pattern could not
              be compiled.

       o      void swap(Pattern	&other):
              The contents of the current object and the other object are
              swapped.
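
       The offsets reported through Pattern::Position, and the way before(),
       matched() and beyond() relate to them, can be illustrated with
       regexec(3) directly. The following is a minimal sketch, not Bobcat's
       implementation: positions() is a hypothetical helper, and it merely
       assumes the Pattern constructor's defaults (REG_EXTENDED |
       REG_NEWLINE and room for 10 elements).

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Position = std::pair<std::string::size_type, std::string::size_type>;

// Hypothetical helper: element 0 holds the offsets of the matched text, the
// next elements those of the subexpressions. Elements that did not take part
// in the match are (npos, npos). An empty vector means: no match.
std::vector<Position> positions(std::string const &pattern,
                                std::string const &text,
                                std::size_t nSub = 10)
{
    regex_t re;

    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NEWLINE) != 0)
        throw std::runtime_error("regcomp failed");

    std::vector<regmatch_t> raw(nSub);
    std::vector<Position> result;

    if (regexec(&re, text.c_str(), raw.size(), raw.data(), 0) == 0)
    {
        for (regmatch_t const &sub: raw)
        {
            if (sub.rm_so == -1)            // unused element
                result.push_back(
                    Position(std::string::npos, std::string::npos));
            else                            // [first, beyond last) offsets
                result.push_back(
                    Position(static_cast<std::string::size_type>(sub.rm_so),
                             static_cast<std::string::size_type>(sub.rm_eo)));
        }
    }

    regfree(&re);
    return result;
}
```

       For the pattern ([0-9]+)-([0-9]+) and the text "id 12-345 end" element
       0 is (3, 9), the two subexpressions are (3, 5) and (6, 9), and element
       3 is the first one holding two std::string::npos values. The text
       before the match is then text.substr(0, 3), the matched text is
       text.substr(3, 9 - 3), and the text beyond it is text.substr(9).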

       o      Pattern &operator=(Pattern &other):
	      A	standard overloaded assignment operator.

       o      std::string operator[](size_t index) const:
	      Returns the matched text (for index 0) or	the text of  a	subex-
	      pression.	An empty string	is returned for	index values which are
	      at least end().

       o      Pattern &operator<<(int matchOptions):
              Defines match-options to be used with the following overloaded
              operator.

       o      bool operator<<(std::string const	&text):
	      Performs	a  match(text, matchOptions) call, catching any	excep-
	      tion that	might be thrown. If no matchOptions were set using the
	      above  overloaded	 operator, none	are used. The options set this
	      way are not `sticky': when necessary, they  have	to  be	re-in-
	      serted  before  each  new	pattern	matching. The function returns
	      true if the matching was successful, false otherwise.
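
       What the match options REG_NOTBOL and REG_NOTEOL do, and how a match
       can be reported as a bool instead of through an exception, can be
       sketched with regexec(3). This sketch does not use Pattern itself:
       foundWith() is a hypothetical helper that compiles its pattern with
       REG_EXTENDED | REG_NEWLINE (assumed here because these are the
       documented defaults) and passes eflags on to regexec(3).

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if `pattern` is found in `text` when regexec()
// is called with the execution flags `eflags` (0, REG_NOTBOL, REG_NOTEOL).
bool foundWith(std::string const &pattern, std::string const &text,
               int eflags)
{
    regex_t re;

    if (regcomp(&re, pattern.c_str(),
                REG_EXTENDED | REG_NEWLINE | REG_NOSUB) != 0)
        throw std::runtime_error("regcomp failed");

    bool ok = regexec(&re, text.c_str(), 0, nullptr, eflags) == 0;
    regfree(&re);
    return ok;
}
```

       The pattern ^abc is found in "abc" when eflags is 0, but not when it
       is REG_NOTBOL. In "x\nabc" it is found even with REG_NOTBOL, as
       REG_NEWLINE lets ^ match immediately after a newline.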


       #include <exception>
       #include <iostream>
       #include <string>

       #include <bobcat/pattern>

       using namespace std;
       using namespace FBB;

       void showSubstr(string const &str)
       {
           static int
               count = 1;

           cout << "String " << count++ << " is '" << str << "'\n";
       }

       int main(int argc, char **argv)
       {
           {                                   // construction and assignment
               Pattern one("one");
               Pattern two(one);
               Pattern three("a");
               Pattern four;
               three = two;
           }

           try
           {
               Pattern pattern("aap|noot|mies");

               {                               // braces: parentheses would
                                               // declare a function instead
                   Pattern extra{ Pattern(pattern) };
               }

               if (pattern << "noot")
                   cout << "noot matches\n";
               else
                   cout << ": noot doesn't match\n";
           }
           catch (exception const &e)
           {
               cout << e.what() << ": compilation failed" << endl;
           }

           string pat = "\\d+";

           while (true)
           {
               cout << "Pattern: '" << pat << "'\n";

               try
               {
                   Pattern patt(pat, argc == 1);   // case sensitive by default,
                                                   // any arg for case insensitive

                   cout << "Compiled pattern: " << patt.pattern() << endl;

                   Pattern pattern;
                   pattern = patt;                 // assignment operator

                   while (true)
                   {
                       cout << "string to match : ";

                       string st;
                       getline(cin, st);
                       if (st == "")               // empty line: next pattern
                           break;
                       cout << "String: '" << st << "'\n";

                       try
                       {
                           pattern.match(st);      // throws if st doesn't match

                           Pattern p3(pattern);

                           cout << "before:  " << p3.before() << "\n"
                                   "matched: " << p3.matched() << "\n"
                                   "beyond:  " << pattern.beyond() << "\n"
                                   "end() = " << pattern.end() << endl;

                           for (size_t idx = 0; idx < pattern.end(); ++idx)
                           {
                               string str = pattern[idx];

                               if (str == "")
                                   cout << "part " << idx << " not present\n";
                               else
                               {
                                   Pattern::Position pos = pattern.position(idx);

                                   cout << "part " << idx << ": '" << str << "' (" <<
                                           pos.first << "-" << pos.second << ")\n";
                               }
                           }
                       }
                       catch (exception const &e)
                       {
                           cout << e.what() << ": " << st << " doesn't match" << endl;
                       }
                   }
               }
               catch (exception const &e)
               {
                   cout << e.what() << ": compilation failed" << endl;
               }

               cout << "New pattern: ";

               if (!getline(cin, pat) || !pat.length())
                   return 0;
           }
       }

       bobcat/pattern -	defines	the class interface

       bobcat(7), regcomp(3), regex(3),	regex(7)

       Using Pattern objects as static data members of classes (or as global
       objects) is potentially dangerous. If the object files defining these
       static data members are stored in a dynamic library they may not be
       initialized properly or timely, and their eventual destruction may re-
       sult in a segmentation fault. This is a well-known problem with static
       data. In situations like this prefer the use of a (shared, unique)
       pointer to a Pattern, initializing the pointer when, e.g., first used.
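
       The pointer-based approach may look as follows. This is only a sketch
       under stated assumptions: std::regex is used as a stand-in for
       FBB::Pattern so that the fragment builds without Bobcat, the names
       Matcher, Scanner and isNumber() are hypothetical, and the
       initialization shown is not thread-safe.

```cpp
#include <memory>
#include <regex>
#include <string>

// Stand-in for FBB::Pattern (assumption), so no Bobcat is required here.
using Matcher = std::regex;

class Scanner
{
    // Only the (empty) pointer is static: no Matcher object exists until
    // isNumber() is called for the first time.
    static std::unique_ptr<Matcher> s_number;

    public:
        static bool isNumber(std::string const &text)
        {
            if (!s_number)                  // construct at first use
                s_number.reset(new Matcher("[0-9]+"));

            return std::regex_match(text, *s_number);
        }
};

std::unique_ptr<Matcher> Scanner::s_number;
```

       As an empty std::unique_ptr is initialized without running any code,
       the order in which static data are initialized no longer matters for
       the matcher itself.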

       o      bobcat_3.25.01-x.dsc: detached signature;

       o      bobcat_3.25.01-x.tar.gz: source archive;

       o      bobcat_3.25.01-x_i386.changes: change log;

       o      libbobcat1_3.25.01-x_*.deb: debian package holding the
              libraries;

       o      libbobcat1-dev_3.25.01-x_*.deb:  debian  package holding the li-
	      braries, headers and manual pages;

       o public archive location;

       Bobcat is an acronym of `Brokken's Own Base Classes And Templates'.

       This is free software, distributed under	the terms of the  GNU  General
       Public License (GPL).

       Frank B.	Brokken	(

libbobcat-dev_3.25.01-x.tar.gz	   2005-2015		 FBB::Pattern(3bobcat)

