Regexp::Common::time(3)  User Contributed Perl Documentation  Regexp::Common::time(3)

NAME
       Regexp::Common::time - Date and time regexps.

SYNOPSIS
	use Regexp::Common qw(time);

	# Piecemeal, Time::Format-like patterns
	$RE{time}{tf}{-pat => 'pattern'}

	# Piecemeal, strftime-like patterns
	$RE{time}{strftime}{-pat => 'pattern'}

	# Match	ISO8601-style date/time	strings
	$RE{time}{iso}

	# Match	RFC2822-style date/time	strings
	$RE{time}{mail}
	$RE{time}{MAIL}	   # more-strict matching

	# Match	informal American date strings
	$RE{time}{american}

	# Fuzzy	date patterns
	#		YEAR/MONTH/DAY
	$RE{time}{ymd}	       # Most flexible
	$RE{time}{YMD}	       # Strictest (equivalent to y4m2d2)
			# Other	available patterns: y2md, y4md,	y2m2d2,	y4m2d2

	#		MONTH/DAY/YEAR	(American style)
	$RE{time}{mdy}	       # Most flexible
	$RE{time}{MDY}	       # Strictest (equivalent to m2d2y4)
			# Other	available patterns: mdy2, mdy4,	m2d2y2,	m2d2y4

	#		DAY/MONTH/YEAR	(European style)
	$RE{time}{dmy}	       # Most flexible
	$RE{time}{DMY}	       # Strictest (equivalent to d2m2y4)
			# Other	available patterns: dmy2, dmy4,	d2m2y2,	d2m2y4

	# Fuzzy	time pattern
	#		HOUR/MINUTE/SECOND
	$RE{time}{hms}	  # H: matches 1 or 2 digits; 12 or 24 hours
			  # M: matches 2 digits.
			  # S: matches 2 digits; may be	omitted
			  # May	be followed by "a", "am", "p.m.", etc.

DESCRIPTION
       This module creates regular expressions that can	be used	for parsing
       dates and times.	 See Regexp::Common for	a general description of how
       to use this interface.

       Parsing dates is	a dirty	business. Dates	are generally specified	in one
       of three	possible orders: year/month/day, month/day/year, or
       day/month/year.	Years can be specified with four digits	or with	two
       digits (with assumptions	made about the century).  Months can be
       specified as one	digit, two digits, as a	spelled-out name, or as	a
       three-letter abbreviation.  Day numbers can be one digit	or two digits,
       with limits depending on	the month (and,	in the case of February, even
       the year).  Also, different people use different	punctuation for
       separating the various elements.

       A human can easily recognize that "October 21, 2005" and	"21.10.05"
       refer to	the same date, but it's	tricky to get a	program	to come	to the
       same conclusion.	 This module attempts to make it possible to do	so,
       with a minimum of difficulty.

       o   If you know the exact format	of the data to be matched, use one of
	   the specific, piecemeal pattern builders: "tf" or "strftime".

       o   If you are parsing RFC-2822 mail headers, use the "mail" pattern.

       o   If you are parsing informal American	dates, use the "american"
	   pattern.

       o   If there is some variability	in your	input data, use	one of the
	   fuzzy-matching patterns in the "dmy", "mdy",	or "ymd" families.

       o   If the data are wildly variable, such as raw	user input, you	should
	   probably give up and	use Date::Manip	or Date::Parse.

       Time values are generally much simpler to parse than date values.  Only
       one fuzzy pattern is provided, and it should suffice for	most needs.

Time::Format PATTERNS
       The Time::Format	module uses simple, intuitive strings for specifying
       date and	time formats.  You can use these patterns here as well.	 See
       Time::Format for	details	about its format specifiers.

       Example:

	   $str = 'Thu November 2, 2005';
	   $str	=~ $RE{time}{tf}{-pat => 'Day Month d, yyyy'};

       The patterns can	contain	more complex regexp expressions	as well:

	   $str =~ $RE{time}{tf}{-pat => '(Weekday|Day) (Month|Mon) d, yyyy'};

       Time zone matching (the "tz" format code) attempts to adhere to RFC2822
       and ISO8601 as much as possible.	 The following time zones are matched:

	   Z
	   UT	     UTC
	   +hh:mm    -hh:mm
	   +hhmm     -hhmm
	   +hh	     -hh
	   GMT	 EST EDT   CST CDT   MST MDT   PST PDT

strftime PATTERNS
       The POSIX "strftime" function is	a long-recognized standard for
       formatting dates and times.  This module supports most of "strftime"'s
       codes for matching; specifically, the
       "aAbBcCDdeHIjmMnprRSTtuUVwWyxXYZ%" codes.  The %Z format	matches	time
       zones in	the same manner	as described above under "Time::Format
       PATTERNS".

       Also, this module provides the following nonstandard codes:

           %_d - 1- or 2-digit day number (1-31)
           %_H - 1- or 2-digit hour (0-23)
           %_I - 1- or 2-digit hour (1-12)
           %_m - 1- or 2-digit month number (1-12)
           %_M - 1- or 2-digit minute (0-59)

       Example:

	   $str = 'Thu November 2, 2005';
	   $str	=~ $RE{time}{strftime}{-pat => '%a %B %_d, %Y'};

       The patterns can	contain	more complex regexp expressions	as well:

	   $str	=~ $RE{time}{strftime}{-pat => '(%A|%a)? (%B|%b) ?%_d, %Y'};

ISO-8601 DATE/TIME MATCHING
       The $RE{time}{iso} pattern will match most (all?) strings formatted as
       recommended by ISO-8601.	 The canonical ISO-8601	form is:

	   YYYY-MM-DDTHH:MM:SS

       (where "T" is a literal T character).  The $RE{time}{iso} pattern
       will match this form, and some variants:

       o   The date separator character	may be a hyphen, slash ("/"), period,
	   or empty string (omitted).  The two date separators must match.

       o   The time separator character	may be a colon,	a period, a space, or
	   empty string	(omitted).  The	two time separators must match.

       o   The date-time separator may be a "T", an underscore,	a space, or
	   empty string	(omitted).

       o   Either the date or the time may be omitted.	But at least one must
	   be there.

       o   If the date is not omitted, all three of its	components must	be
	   present.

       o   If the time is not omitted, all three of its	components must	be
	   present.
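
       The separator rules above can be exercised with a short script.
       This is only a sketch: the candidate strings and the %is_iso hash
       are made up for illustration, and the pattern is anchored with ^
       and \z (an assumption of this example, not a requirement of the
       module) so that the whole string has to be consumed.

```perl
use strict;
use warnings;
use Regexp::Common qw(time);

# Illustrative inputs; the last one mixes two different date separators.
my @candidates = (
    '2005-01-23T12:34:56',   # canonical form
    '2005/01/23 12:34:56',   # slash date separators, space before the time
    '2005.01.23_12.34.56',   # periods throughout, underscore before the time
    '2005-01-23',            # date only
    '12:34:56',              # time only
    '2005-01/23T12:34:56',   # mismatched date separators
);

# Record, for each candidate, whether the anchored pattern matched.
my %is_iso;
for my $str (@candidates) {
    $is_iso{$str} = $str =~ /^$RE{time}{iso}\z/ ? 1 : 0;
}

printf "%-22s %s\n", $_, ($is_iso{$_} ? 'match' : 'no match')
    for @candidates;
```

       Going by the rules listed above, the last candidate should be the
       one that is rejected, since its two date separators differ.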

RFC 2822 MATCHING
       RFC 2822	specifies the format of	date/time values in e-mail message
       headers.	 In a nutshell,	the format is:

	   dd Mon yyyy hh:mm:ss	+zzzz

       where "dd" is the day of	the month; "Mon" is the	abbreviated month name
       (apparently always in English); "yyyy" is the year; "hh:mm:ss" is the
       time; and "+zzzz" is the	time zone, generally specified as an offset
       from GMT.

       RFC 2822 also allows an optional weekday before the date, but this
       module ignores the weekday, as it is redundant and only supplied for
       human readability.

       RFC 2822 requires that older, obsolete date forms be allowed as well;
       for example, alphanumeric time zone codes (e.g. EDT).  This module's
       "mail" pattern allows for these obsolete date forms.  If you want to
       match only the proper date forms recommended by RFC 2822, you can use
       the "MAIL" pattern instead.

       In either case, "mail" or "MAIL", the pattern generated is very
       flexible	about whitespace.  The main differences	are: with "MAIL", two-
       digit years are not permitted, and the time zone	must be	four digits
       preceded	by a + or - sign.
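
       The difference between the two patterns can be seen by running both
       against a current-style date and an obsolete-style one.  This is a
       sketch under the description above; the two sample strings and the
       %result hash are invented for the example.

```perl
use strict;
use warnings;
use Regexp::Common qw(time);

my $modern   = '21 Oct 2005 14:30:00 -0400';   # four-digit year, numeric zone
my $obsolete = '21 Oct 05 14:30:00 EDT';       # two-digit year, named zone

# For each sample, note whether the lenient and the strict pattern match.
my %result;
for my $date ($modern, $obsolete) {
    $result{$date}{mail} = $date =~ /$RE{time}{mail}/ ? 1 : 0;
    $result{$date}{MAIL} = $date =~ /$RE{time}{MAIL}/ ? 1 : 0;
    print "$date  mail=$result{$date}{mail}  MAIL=$result{$date}{MAIL}\n";
}
```

       If the patterns behave as described, only "mail" should accept the
       second string, because of its two-digit year and alphabetic zone.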

INFORMAL AMERICAN MATCHING
       People in North America,	particularly in	the United States, are fond of
       specifying dates	as "Month dd, yyyy", or	sometimes with a two-digit
       year and	apostrophe: "Month dd, 'yy".  The "american" pattern matches
       this style of date.  It allows either a month name or abbreviation, and
       is flexible with	respect	to commas and whitespace.

FUZZY PATTERN OVERVIEW
       Fuzzy date patterns have	the following properties in common:

       o   The pattern names consist of	the letters "y", "m", and "d", each
	   optionally followed by a digit (2 for "m" and "d"; 2	or 4 for "y").

       o   If a	"y" is followed	by a 2 or a 4, it must match that many digits.

       o   If a	"y" has	no trailing digit, it can match	either 2 or 4 digits,
	   trying 4 first.

       o   If an "m" is	followed by a 2, then only two-digit matches for the
	   month are considered, and month names are not matched.

       o   If an "m" is	not followed by	a 2, then the month may	be 1 or	2
	   digits, or a	spelled-out name.

       o   Just	like for months, if a "d" is followed by a 2, then only	two-
	   digit matches for the day are considered.

       o   Just	like for months, if a "d" has no trailing digit, then the day
	   may be 1 or 2 digits, and a 1-digit match may not have any adjacent
	   digits.

       o   The uppercase "DMY",	"MDY", and "YMD" patterns are synonyms for the
	   strict "d2m2y4", "m2d2y4", and "y4m2d2" patterns, respectively.

       o   If a	one-digit match	is considered for the month, then no adjacent
	   digits are allowed.	(e.g.: "1/23/45" in M/D/Y format has a valid
	   one-digit month match, but "12345" does not.	 Nor does "91/23/45").

       o   If a pattern begins with a digitless "d", "m", or "y", then, in
	   the string to be matched, any leading digits	will cause the pattern
	   to fail.  For example: "012/23/45" will not match $RE{time}{mdy}.
	   However, it will match $RE{time}{m2d2y2}.  If you specify an	exact
	   pattern by using "m2" instead of "m", this module assumes you know
	   what	you're doing.

       o   Likewise, a pattern ending with a digitless "d" or "y" will not
	   match if there are trailing digits in the string.
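
       The adjacent-digit rules can be checked directly.  The sketch below
       reuses the sample strings quoted in the list above; the @cases and
       @results arrays are names chosen for this example only.

```perl
use strict;
use warnings;
use Regexp::Common qw(time);

# [ pattern name, input ] pairs based on the strings quoted above.
my @cases = (
    [ mdy    => '1/23/45'   ],  # one-digit month, no adjacent digits
    [ mdy    => '91/23/45'  ],  # a digit directly before the one-digit month
    [ mdy    => '012/23/45' ],  # leading digit before a digitless "m"
    [ m2d2y2 => '012/23/45' ],  # explicit "m2" pattern on the same input
);

# Try each pattern against its input and remember the outcome.
my @results;
for my $case (@cases) {
    my ($name, $input) = @$case;
    my $matched = $input =~ /$RE{time}{$name}/ ? 1 : 0;
    push @results, $matched;
    printf "%-7s %-10s %s\n", $name, $input, $matched ? 'match' : 'no match';
}
```

       Note that the patterns are used unanchored here, so the "m2d2y2"
       case is free to match the "12/23/45" part of its input.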

FUZZY PATTERN DETAILS
   Year-Month-Day order
       $RE{time}{ymd}
	    "05/4/2"	  =~ $RE{time}{ymd};
	    "2005-APR-02" =~ $RE{time}{ymd};

	   This is the most flexible of the year/month/day formats.  It
	   matches a date of the form "year/month/day", where the
	   year	may be 2 or 4 digits; the month	may be 1 or 2 digits or	a
	   spelled-out name or name abbreviation, and the day may be 1 or 2
	   digits.  The	year/month/day elements	may be directly	adjacent to
	   each	other, or may be separated by a	space, period, slash ("/"), or
	   hyphen.

       $RE{time}{y4md}
	    "2005/4/2"	  =~ $RE{time}{y4md};
	    "2005 APR 02" =~ $RE{time}{y4md};

	   This	works as $RE{time}{ymd}, except	that the year is restricted to
	   be exactly 4	digits.

       $RE{time}{y4m2d2}
	    "2005/04/02" =~ $RE{time}{y4m2d2};

	   This	works as $RE{time}{ymd}, except	that the year is restricted to
	   be exactly 4	digits,	and the	month and day must be exactly 2	digits
	   each.

       $RE{time}{y2md}
	    "05/4/2"	=~ $RE{time}{y2md};
	    "05.APR.02"	=~ $RE{time}{y2md};

	   This	works as $RE{time}{ymd}, except	that the year is restricted to
	   be exactly 2	digits.

       $RE{time}{y2m2d2}
	    "05/04/02" =~ $RE{time}{y2m2d2};

	   This	works as $RE{time}{ymd}, except	that the year is restricted to
	   be exactly 2	digits,	and the	month and day must be exactly 2	digits
	   each.

       $RE{time}{YMD}
	    "2005/04/02" =~ $RE{time}{YMD};

	   This	is a shorthand for the "canonical" year/month/day format,
	   "y4m2d2".

   Month-Day-Year (American) order
       $RE{time}{mdy}
       $RE{time}{mdy4}
       $RE{time}{m2d2y4}
       $RE{time}{mdy2}
       $RE{time}{m2d2y2}
       $RE{time}{MDY}
	   These patterns function as the equivalent year/month/day patterns,
	   above; the only difference is the order of the elements.  "MDY" is
	   a synonym for "m2d2y4".

   Day-Month-Year (European) order
       $RE{time}{dmy}
       $RE{time}{dmy4}
       $RE{time}{d2m2y4}
       $RE{time}{dmy2}
       $RE{time}{d2m2y2}
       $RE{time}{DMY}
	   These patterns function as the equivalent year/month/day patterns,
	   above; the only difference is the order of the elements.  "DMY" is
	   a synonym for "d2m2y4".

Time pattern (Hour-minute-second)
       $RE{time}{hms}
	    "10:06:12a"	=~ /$RE{time}{hms}/;
	    "9:00 p.m."	=~ /$RE{time}{hms}/;

	   Matches a time value	in a string.

	   The hour must be in the range 0 to 24.  The minute and second
	   values must be in the range 0 to 59,	and must be two	digits (i.e.,
	   they	must have leading zeroes if less than 10).

	   The hour, minute, and second	components may be separated by colons
	   (":"), periods, or spaces.

	   The "seconds" value may be omitted.

	   The time may	be followed by an "am/pm" indicator; that is, one of
	   the following values:

	     a	 am   a.m.  p	pm   p.m.   A	AM   A.M.  P   PM   P.M.

	   There may be	a space	between	the time and the am/pm indicator.

CAPTURES (-keep)
       Under "-keep", the "tf" and "strftime" patterns capture the entire
       match as	$1, plus one capture variable for each format specifier.
       However,	if your	pattern	contains any parentheses, "tf" and "strftime"
       will not	capture	anything additional beyond what	you specify, "-keep"
       or not.	In other words:	if you use parentheses,	you are	responsible
       for all capturing.
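
       As a sketch of the first case (no parentheses in the pattern), the
       following reuses the sample string from the sections above.  The
       variable names are illustrative, and the patterns are built first
       and interpolated afterwards purely for readability.

```perl
use strict;
use warnings;
use Regexp::Common qw(time);

my $str = 'Thu November 2, 2005';

# Neither -pat string contains parentheses, so -keep supplies the
# captures: the whole match first, then one value per format specifier.
my $tf_re = $RE{time}{tf}{-pat => 'Day Month d, yyyy'}{-keep};
my $sf_re = $RE{time}{strftime}{-pat => '%a %B %_d, %Y'}{-keep};

my @tf = $str =~ /$tf_re/;
my @sf = $str =~ /$sf_re/;

print join('|', @tf), "\n";
print join('|', @sf), "\n";
```

       Each list should hold five elements here: the entire match followed
       by the weekday, month, day, and year.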

       The "iso" pattern captures:

           $1 - the entire match
           $2 - the year
           $3 - the month
           $4 - the day
           $5 - the hour
           $6 - the minute
           $7 - the second

       The year, month,	and day	($2, $3, and $4) will be "undef" if the
       matched string contains only a time value (e.g.,	"12:34:56").  The
       hour, minute, and second	($5, $6, and $7) will be "undef" if the
       matched string contains only a date value (e.g.,	"2005-01-23").
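
       A small wrapper makes the "undef" behavior easier to see.  The
       iso_parts() helper below is hypothetical (it is not part of this
       module); it anchors the pattern and maps $1 through $7 to named
       fields.

```perl
use strict;
use warnings;
use Regexp::Common qw(time);

# Hypothetical helper: return the named parts of an ISO-style string,
# or an empty list if the string does not match.
sub iso_parts {
    my ($str) = @_;
    my @c = $str =~ /^$RE{time}{iso}{-keep}\z/ or return;
    my %parts;
    @parts{qw(match year month day hour minute second)} = @c;
    return %parts;
}

my %full = iso_parts('2005-01-23T12:34:56');
my %date = iso_parts('2005-01-23');

# Show each field side by side; missing values print as "undef".
for my $field (qw(year month day hour minute second)) {
    printf "%-6s full=%-5s date-only=%s\n",
        $field, $full{$field} // 'undef', $date{$field} // 'undef';
}
```

       The "//" operator requires Perl 5.10 or later.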

       The "mail" and "MAIL" patterns capture:

           $1 - the entire match
           $2 - the day
           $3 - the month
           $4 - the year
           $5 - the hour
           $6 - the minute
           $7 - the second
           $8 - the time zone

       The "american" pattern captures:

           $1 - the entire match
           $2 - the month
           $3 - the day
           $4 - the year

       The fuzzy y/m/d patterns capture:

           $1 - the entire match
           $2 - the year
           $3 - the month
           $4 - the day

       The fuzzy m/d/y patterns capture:

           $1 - the entire match
           $2 - the month
           $3 - the day
           $4 - the year

       The fuzzy d/m/y patterns capture:

           $1 - the entire match
           $2 - the day
           $3 - the month
           $4 - the year

       The fuzzy h/m/s pattern captures:

           $1 - the entire match
           $2 - the hour
           $3 - the minute
           $4 - the second ("undef" if omitted)
           $5 - the am/pm indicator ("undef" if omitted)

EXAMPLES
	# Typical usage: parsing a data	record.
	#
	$rec = "blah blah 2005/10/21 blah blarrrrrgh";
	@date = $rec =~ m{^blah blah $RE{time}{YMD}{-keep}};
	# or
	@date = $rec =~ m{^blah blah $RE{time}{tf}{-pat=>'yyyy/mm/dd'}{-keep}};
	# or
	@date = $rec =~ m{^blah blah $RE{time}{strftime}{-pat=>'%Y/%m/%d'}{-keep}};

	# Typical usage: parsing variable-format data.
	#
	use Time::Normalize;

	$record	= "10-SEP-2005";

	# This block tries M-D-Y first, then D-M-Y, then Y-M-D,
	# and stops at the first pattern that matches.
	my $matched;
	foreach my $pattern (qw(mdy dmy ymd))
	{
	    @values = $record =~ /^$RE{time}{$pattern}{-keep}/
		or next;

	    $matched = $pattern;
	    last;
	}
	if ($matched)
	{
	    eval{ ($year, $month, $day)	= normalize_rct($matched, @values) };
	    if ($@)
	    {
		.... # handle erroneous	data
	    }
	}
	else
	{
	    .... # no match
	}
	#
	# $day is now 10; $month is now	09; $year is now 2005.

	# Time examples

	$time =	'9:10pm';

	@time_data = $time =~ /$RE{time}{hms}{-keep}/;
	# captures '9:10pm', '9', '10',	undef, 'pm'

	@time_data = $time =~ /$RE{time}{tf}{-pat => '(h):(mm)(:ss)?(am)?'}{-keep}/;
	# captures '9',	'10', undef, 'pm'

EXPORTS
       This module exports no symbols to the caller's namespace.

SEE ALSO
       It's not	enough that the	date regexps can match various formats.	 You
       then have to parse those	matched	data values and	translate them into
       useful values.  The Time::Normalize module is highly recommended	for
       performing this repetitive, error-prone task.

REQUIREMENTS
       Requires	Regexp::Common,	of course.

       If POSIX	and I18N::Langinfo are available, this module will use them;
       otherwise, it will use hardcoded	English	values for month and weekday
       names.

       Test::More is required for the test suite.

AUTHOR
       Eric J. Roode, ROODE -at- cpan -dot- org

LICENSE	AND COPYRIGHT
       Copyright (c) 2005-2008 by Eric J. Roode, ROODE -at- cpan -dot- org

       All rights reserved.

       To avoid	my spam	filter,	please include "Perl", "module", or this
       module's	name in	the message's subject line, and/or GPG-sign your
       message.

       This module is copyrighted only to ensure proper	attribution of
       authorship and to ensure	that it	remains	available to all.  This	module
       is free,	open-source software.  This module may be freely used for any
       purpose,	commercial, public, or private,	provided that proper credit is
       given, and that no more-restrictive license is applied to derivative
       (not dependent) works.

       Substantial efforts have	been made to ensure that this software meets
       high quality standards; however,	no guarantee can be made that there
       are no undiscovered bugs, and no	warranty is made as to suitability to
       any given use, including	merchantability.  Should this module cause
       your house to burn down,	your dog to collapse, your heart-lung machine
       to fail,	your spouse to desert you, or George Bush to be	re-elected, I
       can offer only my sincere sympathy and apologies, and promise to
       endeavor	to improve the software.

perl v5.32.1			  2018-02-28	       Regexp::Common::time(3)
