localedef(4)		   Kernel Interfaces Manual		  localedef(4)

NAME
       localedef - format and semantics	of locale definition file

DESCRIPTION
       This is a description of the syntax and meaning of the locale
       definition that is provided as input to the localedef command to
       create a locale (see localedef(1M)).

       The following is a list of category tags, keywords and subsequent
       expressions which are recognized by localedef.  The order of keywords
       within a category is irrelevant, with the exception of the copy
       keyword and other exceptions noted under the LC_COLLATE description.
       (Note that, as a convention, the category tags are composed of
       uppercase characters, while the keywords are composed of lowercase
       characters.)

   Category Tags and Keywords
       The following keywords do not belong to any category and	should	appear
       in the beginning	of the locale definition file:

	      comment_char
		     Single character indicating the character to be
		     interpreted as starting a comment line within the locale
		     definition file.  This character should be in the first
		     column of a comment line.  The default comment_char is
		     the pound sign (#).  All lines with a comment_char in the
		     first column are ignored.

	      escape_char
		     A single character indicating the character to be
		     interpreted as an escape character within the script.
		     The default escape_char is the backslash (\).
		     escape_char is used to escape localedef metacharacters to
		     remove their special meaning, and in the character
		     constant decimal, octal, and hexadecimal formats.  It is
		     also used to continue a line onto the next, if
		     escape_char is the last character on the line (before the
		     new-line character).

       The following keywords can be used in any category:

	      copy   A string naming another valid locale available on the
		     system.  This causes the category in the locale being
		     created to be a copy of the same category in the named
		     locale.  Since the copy keyword defines the entire
		     category, if used, it must be the only keyword in the
		     category.

       The following six categories are	recognized:

       LC_CTYPE
	      This category defines character classification, case conversion
	      and other character attributes.  The following predefined
	      character classifications are recognized:

		 upper		Character codes classified as uppercase
				letters.  Characters specified in the cntrl,
				digit, punct, or space classifications cannot
				be specified in this category.

		 lower		Character codes classified as lowercase
				letters.  The same restrictions applicable to
				the upper category apply to this
				classification.

		 digit		Character codes classified as numeric.  Only
				ten characters in contiguous ascending
				sequence by numerical value can be specified.
				Alternative digits cannot be specified here.

		 space		Character codes classified as white-space.  No
				character specified for the upper, lower,
				alpha, digit, graph, or xdigit categories can
				be included in this classification.

		 punct		Character codes classified as punctuation
				characters.  No character included in the
				upper, lower, alpha, digit, cntrl, or xdigit
				categories can be specified.

		 cntrl		Character codes classified as control
				characters.  No character included in the
				upper, lower, alpha, digit, punct, graph,
				print, or xdigit categories can be included
				here.

		 blank		Character codes classified as blank
				characters.  The <space> and <tab> characters
				are automatically included.

		 xdigit		Character codes classified as hexadecimal
				digits.  Only the characters defined for the
				digit class can be specified, followed by one
				or more sets of six characters, with each set
				in ascending order.

		 alpha		Character codes classified as letters.
				Characters classified as cntrl, digit, punct,
				or space cannot be specified.  Characters
				specified for the upper and lower classes are
				automatically included in this class.

		 print		Character codes classified as printable
				characters.  Characters specified for the
				upper, lower, alpha, digit, xdigit, and punct
				classes and the <space> character are
				automatically included.  No character from the
				cntrl category can be specified.

		 graph		Character codes classified as printable
				characters, except the <space> character.  In
				all other respects this classification is
				similar to the print category.

	      The following two are special classifications, used to designate
	      valid first-of-two and second-of-two bytes of two-byte
	      characters.  Note that these are byte classifications and not
	      character classifications; hence, they cannot be used with the
	      iswctype interface (see wctype(3C)) in the same manner as the
	      other classifications.

		 Valid first bytes of two-byte characters.

		 Valid second bytes of two-byte	characters.

	      Character	case conversion	definitions:

		 toupper	Lowercase to uppercase character relationships.

		 tolower	Uppercase to lowercase character relationships.

	      Miscellaneous character attribute	and classifications:

		 String mapped into the ASCII equivalent string
				``b!"#$%&'()*+,-./:;<=>?@[\]^_`{}~'', where b
				is a blank (a langinfo(5) item).

		 charclass	Defines one or more locale-specific character
				class names as strings separated by
				semicolons.  Each named character class can
				then be defined subsequently in the
				definition.  The first character of a
				character class name must be a letter, and the
				class name cannot match any of the predefined
				classifications listed above.

		 String	operand	indicates text direction (a
				langinfo(5)  item).  String  operand "1" indi-
				cates right-to-left text direction.

		 String	operand	indicates character context  analysis.	String
		 "1"
				indicates Arabic context analysis is required.

       LC_COLLATE
	      The LC_COLLATE category provides the collation sequence
	      definition for relative ordering between collating elements
	      (single- and multi-character collating elements) in the locale.
	      The following keywords belong to this category and should come
	      between the category tag LC_COLLATE and END LC_COLLATE.  The
	      first two keywords can be in any order, but must come before the
	      order_start keyword.  Any number of the first two keywords can
	      be specified.

		 collating-element symbol from string
				Defines a multi-character collating element,
				symbol, composed of the characters in string.
				string is limited to two characters.

		 collating-symbol symbol
				Makes symbol a collating symbol which can be
				used to define a place in the collating
				sequence.  symbol does not represent any
				actual character.

		 order_start	Denotes the start of the collation sequence.
				The directives have an effect on string
				collation.

				The lines following the order_start keyword
				and before the order_end keyword contain
				collating element entries, one per line.

				Operands can optionally appear after the
				order_start keyword to define rules for string
				comparison using a multiple-weight scheme (if
				no operands are specified, a single forward
				operand is assumed).  The possible operands
				are:

			   forward	  Specifies that comparison operations
					  proceed from the start of the string
					  towards the end of it.

			   backward	  Specifies that comparison operations
					  proceed from the end of the string
					  towards the beginning of it.

		 order_end	Marks the end of the list of collating element
				entries.

       LC_MONETARY
	      The LC_MONETARY category defines the rules and symbols used to
	      format monetary numeric information.  The following keywords
	      belong to this category and should come between the category tag
	      LC_MONETARY and END LC_MONETARY.

		 int_curr_symbol
				The operand is a four-character string used to
				designate the international currency symbol.

		 currency_symbol
				The operand is a string used as the local
				currency symbol.

		 mon_decimal_point
				The operand is a string containing the symbol
				used as the decimal delimiter (radix
				character).

		 mon_thousands_sep
				The operand is a string containing the symbol
				used as a separator for groups of digits to
				the left of the decimal delimiter.

		 mon_grouping	The operand is a semicolon-separated list of
				integers.  The initial integer defines the
				size of the group immediately preceding the
				decimal delimiter, and the following integers
				define the preceding groups.  If the last
				integer is not -1, then the size of the
				previous group (if any) will be repeatedly
				used for the remainder of the digits.  If the
				last integer is -1, then no further grouping
				will be performed.

		 positive_sign	The operand is a string to indicate a
				non-negative monetary quantity.

		 negative_sign	The operand is a string to indicate a negative
				monetary quantity.

		 int_frac_digits
				The operand is an integer representing the
				number of fractional digits used in formatted
				monetary values using int_curr_symbol.

		 frac_digits	The operand is an integer representing the
				number of fractional digits used in formatted
				monetary values using currency_symbol.

		 p_cs_precedes	The operand is an integer which, if set to 1,
				indicates that the currency_symbol or
				int_curr_symbol precedes a monetary quantity,
				and if set to 0, that the symbol succeeds the
				value.

		 p_sep_by_space The operand is an integer which, if set to 1,
				indicates that a space separates the
				currency_symbol or int_curr_symbol from the
				value, and if set to 0, that no space
				separates them.

		 n_cs_precedes	The operand is an integer which, if set to 1,
				indicates that the currency_symbol or
				int_curr_symbol precedes a negative monetary
				quantity, and if set to 0, that the symbol
				succeeds the negative value.

		 n_sep_by_space The operand is an integer which, if set to 1,
				indicates that a space separates the
				currency_symbol or int_curr_symbol from a
				negative monetary value, and if set to 0, that
				no space separates them.

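		 p_sign_posn	The operand is an integer whose setting
				indicates the positioning of the positive_sign
				for a non-negative monetary quantity.  The
				possible values are:

				     0	    Parentheses surround the quantity
					    and the currency_symbol or
					    int_curr_symbol.

				     1	    The sign string precedes the
					    quantity and the currency_symbol
					    or int_curr_symbol.

				     2	    The sign string succeeds the
					    quantity and the currency_symbol
					    or int_curr_symbol.

				     3	    The sign string precedes the
					    currency_symbol or
					    int_curr_symbol.

				     4	    The sign string succeeds the
					    currency_symbol or
					    int_curr_symbol.

		 n_sign_posn	The operand is an integer whose setting
				parallels that of p_sign_posn, but for
				negative monetary quantities.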
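
       The interaction of the symbol-position, space-separation and
       sign-position settings described above can be sketched in Python.
       This is only an illustration, not the formatting code of any real
       library: the function name format_money and its parameters are
       hypothetical, and it assumes the five sign positions are numbered 0
       through 4 in the order listed above, as in POSIX.

```python
def format_money(quantity, symbol, sign, cs_precedes, sep_by_space, sign_posn):
    """Arrange an already formatted quantity, a currency symbol and a sign.

    quantity      formatted digits, for example "1,234.56"
    cs_precedes   1 if the symbol precedes the quantity, 0 if it follows
    sep_by_space  1 if a space separates symbol and quantity, 0 if not
    sign_posn     0..4, assumed to follow the order of the list above
    """
    if sign_posn not in (0, 1, 2, 3, 4):
        raise ValueError("sign_posn must be in the range 0..4")
    sep = " " if sep_by_space == 1 else ""
    if sign_posn == 3:
        symbol = sign + symbol      # sign immediately precedes the symbol
    elif sign_posn == 4:
        symbol = symbol + sign      # sign immediately succeeds the symbol
    if cs_precedes == 1:
        body = symbol + sep + quantity
    else:
        body = quantity + sep + symbol
    if sign_posn == 0:
        return "(" + body + ")"     # parentheses replace the sign string
    if sign_posn == 1:
        return sign + body          # sign precedes quantity and symbol
    if sign_posn == 2:
        return body + sign          # sign succeeds quantity and symbol
    return body
```

       For example, format_money("1,234.56", "$", "-", 1, 0, 1) places the
       sign before both the symbol and the quantity.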

       LC_NUMERIC
	      The LC_NUMERIC category defines rules and symbols used to format
	      non-monetary numeric information.  The following keywords belong
	      to this category and should come between the category tag
	      LC_NUMERIC and END LC_NUMERIC.

		 decimal_point	The operand is a string containing the symbol
				used as the decimal delimiter (radix
				character) in numeric, non-monetary formatted
				quantities.  This keyword cannot be omitted
				and cannot be set to the empty string.

		 thousands_sep	The operand is a string containing the symbol
				used as a separator for groups of digits to
				the left of the decimal delimiter.

		 grouping	The operand is a semicolon-separated list of
				integers.  The initial integer defines the
				size of the group immediately preceding the
				decimal delimiter, and the following integers
				define the preceding groups.  If the last
				integer is not -1, then the size of the
				previous group (if any) will be repeatedly
				used for the remainder of the digits.  If the
				last integer is -1, then no further grouping
				will be performed.

		 String mapped into the ASCII equivalent string
				``0123456789b+-.,eE'', where b is a blank (a
				langinfo(5) item).  This keyword is an HP
				extension to the POSIX standards, and its
				meaning differs from that of the alt_digits
				keyword defined in the POSIX standards.
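
       The digit-grouping rule described above (a semicolon-separated list
       of group sizes, with -1 ending all further grouping) can be sketched
       as follows.  The function name group_digits is hypothetical; the
       sketch assumes the integer part is supplied as a string of digits and
       the group sizes as a Python list.

```python
def group_digits(digits, grouping, sep):
    """Insert sep into digits according to a list of group sizes.

    grouping[0] is the size of the group nearest the decimal delimiter.
    When the list runs out, the last size is reused; -1 stops grouping.
    """
    groups = []
    end = len(digits)       # digits[:end] is the part not yet grouped
    index = 0
    size = None
    while end > 0:
        if index < len(grouping):
            if grouping[index] == -1:
                break       # no further grouping is performed
            size = grouping[index]
            index += 1
        if size is None or size <= 0:
            break           # empty or unusable list: leave the rest ungrouped
        start = max(0, end - size)
        groups.append(digits[start:end])
        end = start
    if end > 0:
        groups.append(digits[:end])
    return sep.join(reversed(groups))
```

       With a list of [3] every group has three digits, with [3, 2] the
       size 2 is repeated after the first group, and with [3, -1] only the
       first group is separated.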

       LC_TIME
	      The LC_TIME category defines the rules for generating
	      locale-specific formatted date strings.  The following mandatory
	      keywords belong to this category and should come between the
	      category tag LC_TIME and END LC_TIME.

		 abday		Seven semicolon-separated strings giving
				abbreviated names for the days of the week,
				beginning with Sunday.

		 day		Seven semicolon-separated strings giving full
				names for the days of the week, beginning with
				Sunday.

		 abmon		Twelve semicolon-separated strings giving
				abbreviated names for the months, beginning
				with January.

		 mon		Twelve semicolon-separated strings giving full
				names for the months, beginning with January.

		 d_t_fmt	The operand is a string defining the
				appropriate date and time representation.

		 d_fmt		The operand is a string defining the
				appropriate date representation.

		 t_fmt		The operand is a string defining the
				appropriate time representation.

		 am_pm		The operand is two semicolon-separated strings
				giving the representations for AM and PM.

		 t_fmt_ampm	The operand is a string defining the
				appropriate time representation in the 12-hour
				clock format with am_pm.

		 era		The operand is a semicolon-separated list of
				strings.  Each string defines the name and
				date of an era or emperor for a locale.  Each
				string should conform to the following format:

				direction:offset:start_date:end_date:name:format

				where:

				     direction	 Either a + or a - character.
						 The + character indicates
						 that the time axis should be
						 such that the years count in
						 the positive direction when
						 moving from the starting date
						 towards the ending date.  The
						 - character indicates that
						 the time axis should be such
						 that the years count in the
						 negative direction when
						 moving from the starting date
						 towards the ending date.

				     offset	 A number in the  range	 indi-
						 cating	  the  number  of  the
						 first year of the era.

				     start_date	 A date in the form yyyy/mm/dd,
						 where yyyy, mm, and dd are
						 the year, month and day
						 numbers, respectively, of the
						 start of the era.  Years
						 prior to the year 0 A.D. are
						 represented as negative
						 numbers.  For example, an era
						 beginning March 5th in the
						 year 100 B.C. would be
						 represented as -100/03/05.
						 Years in the range are
						 supported.

				     end_date	 The ending date of the era in
						 the same form as the
						 start_date above, or one of
						 the two special values -* or
						 +*.  A value of -* indicates
						 that the ending date of the
						 era extends to the beginning
						 of time, while +* indicates
						 that it extends to the end of
						 time.  The ending date can be
						 chronologically either before
						 or after the starting date of
						 an era.  For example, the
						 expressions for the Christian
						 eras A.D. and B.C. would be:

				     name	 A  string  representing   the
						 name of the era which is sub-
						 stituted for the directive of
						 and  (see  date(1)  and strf-
						 time(3C)).

				     format	 A string for  formatting  the
						 directive   of	  date(1)  and
						 strftime(3C).	This string is
						 usually a function of the and
						 directives.  If format	is not
						 specified,  the string	speci-
						 fied for the category keyword
						 (see  below) is used as a de-
						 fault.

		 era_d_fmt	The operand is a string defining the format of
				the date in era notation.

		 era_t_fmt	The operand is a string defining the format of
				the time in era notation.

		 era_d_t_fmt	The operand is a string defining the format of
				the date and time in era notation.

		 alt_digits	The operand is a semicolon-separated list of
				strings.  The first string is the alternative
				symbol corresponding to zero, the second
				string is the alternative symbol corresponding
				to one, and so on.  Note that if the
				HP-UX-proprietary keyword has been specified
				in the same locale, the first ten symbols
				should be identical for these two keywords.
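
       The colon-separated era string described above can be taken apart
       with a few lines of Python.  This is a minimal sketch: the names Era,
       parse_era and parse_era_date are hypothetical, the sample strings in
       the usage note are illustrative only, and the sketch assumes that
       direction is written as + or -, that dates use the yyyy/mm/dd form,
       and that the special end dates are written -* and +*, as in POSIX
       locale sources.

```python
from typing import NamedTuple


class Era(NamedTuple):
    direction: str
    offset: int
    start_date: str
    end_date: str
    name: str
    format: str


def parse_era(text):
    """Split direction:offset:start_date:end_date:name:format into fields."""
    parts = text.split(":", 5)      # the format field may itself contain ':'
    if len(parts) < 5:
        raise ValueError("era string needs at least five fields")
    direction, offset, start_date, end_date, name = parts[:5]
    fmt = parts[5] if len(parts) == 6 else ""   # format is optional
    if direction not in ("+", "-"):
        raise ValueError("direction must be '+' or '-'")
    return Era(direction, int(offset), start_date, end_date, name, fmt)


def parse_era_date(text):
    """Return (year, month, day), or the special value unchanged."""
    if text in ("-*", "+*"):
        return text
    year, month, day = text.rsplit("/", 2)
    return (int(year), int(month), int(day))
```

       For instance, parse_era("+:1:0001/01/01:+*:AD") yields an Era whose
       format field is empty, and parse_era_date("-100/03/05") yields a
       negative year for a date before the year 0.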

	      In  addition  to the above, the following	HP-UX-proprietary key-
	      words are	recognized (these are provided for  backward  compati-
	      bility and their use is otherwise	not recommended):

       LC_MESSAGES
	      The LC_MESSAGES category defines the format and values for
	      affirmative and negative responses.  The following keywords
	      belong to this category and should come between the category tag
	      LC_MESSAGES and END LC_MESSAGES.

		 yesexpr	The string operand is an Extended Regular
				Expression matching acceptable affirmative
				responses to yes/no queries.

		 noexpr		The string operand is an Extended Regular
				Expression matching acceptable negative
				responses to yes/no queries.

		 yesstr		The string operand identifies the affirmative
				response for yes/no questions.  This keyword
				is now obsolete; yesexpr should be used
				instead.

		 nostr		The string operand identifies the negative
				response for yes/no questions.  This keyword
				is now obsolete; noexpr should be used
				instead.
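
       How an application might use a pair of affirmative and negative
       response expressions can be sketched as follows.  The function name
       classify_response and the two default patterns are hypothetical, and
       Python's re module is used as a stand-in: it is not a POSIX Extended
       Regular Expression engine, although simple patterns such as these
       behave the same way in both.

```python
import re


def classify_response(reply, yes_pattern=r"^[yY]", no_pattern=r"^[nN]"):
    """Return True for yes, False for no, None if neither pattern matches."""
    if re.search(yes_pattern, reply):
        return True
    if re.search(no_pattern, reply):
        return False
    return None
```

       A caller would typically ask the question again when the result is
       None.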

   Keyword Operands
       Keyword operands consist of character-code constants and symbols,
       strings, and metacharacters.  The types of legal expressions are
       described below.

	      operands	consist	of single character-code constants or symbolic
	      names
			separated by semicolons,  or  a	 character-code	 range
			consisting  of a constant or symbolic name followed by
			an ellipsis followed by	another	constant  or  symbolic
			name.  The constant preceding the ellipsis must	have a
			smaller	code value than	the constant following the el-
			lipsis.	 A range represents a set of consecutive char-
			acter codes.  If the list  is  longer  than  a	single
			line,  the escape character must be used at the	end of
			each line as a continuation character.	It is an error
			to use any symbolic name that is not defined in	an ac-
			companying charmap file	(see charmap(4)).

	      operands	consist	of strings separated by	semicolons.  If	longer
			than  one  line, the escape character must be used for
			continuation.

	      operands consist of a sequence of	zero or	more characters
			surrounded by double quotes (").  Within a string, the
			double-quote  character	 must be preceded by an	escape
			character.  The	following escape sequences also	can be
			used:

			\n	newline

			\t	horizontal tab

			\b	backspace

			\r	carriage return

			\f	form feed

			\\	backslash

			\'	single quote

			\ddd	bit pattern

				The  escape  consists  of the escape character
				followed by 1, 2, or 3 octal digits specifying
				the  value of the desired character (for other
				possible bit pattern  specification,  see  be-
				low).	Also,  an  escape character (\)	and an
				immediately-following newline are ignored.

			Although the backslash (\) has been used for
			illustration, another escape character can be
			substituted by the escape_char keyword.

	      Constants	represent character codes in the operands.
			They can be used in the	following forms:

			decimal constants      An escape character followed by
					       a d followed by up to three
					       decimal digits.

			octal constants	       An escape character followed by
					       up to three octal digits.

			hexadecimal constants  An escape character followed by
					       an x followed by two
					       hexadecimal digits.

			character constants    A single character (e.g., A)
					       having the numerical value of
					       the character in the machine's
					       character set.

			symbolic names	       A string enclosed between < and
					       > is a symbolic name.  It is
					       recommended that localedef
					       input files be written entirely
					       in symbolic names, using a
					       user-defined or system-supplied
					       charmap file.  This aids
					       portability of input files
					       between different encoded
					       character sets (see
					       charmap(4)).

					       Symbolic names can be defined
					       within a locale definition file
					       by the collating-element and
					       collating-symbol keywords.
					       These are not character
					       constants.  It is an error if
					       such an internally defined
					       symbolic name collides with one
					       defined in a charmap file.
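
			The constant forms listed above can be decoded with a
			short Python routine.  This is a sketch under stated
			assumptions: the name parse_constant is hypothetical,
			the decimal and hexadecimal forms are assumed to be
			introduced by the letters d and x as in POSIX locale
			sources, and ord() stands in for "the machine's
			character set".  Symbolic names are not handled, since
			they require a charmap file.

```python
import re


def parse_constant(token, escape="\\"):
    """Return the character code denoted by one constant token."""
    esc = re.escape(escape)
    match = re.fullmatch(esc + r"d([0-9]{1,3})", token)
    if match:
        return int(match.group(1), 10)      # decimal constant
    match = re.fullmatch(esc + r"x([0-9A-Fa-f]{2})", token)
    if match:
        return int(match.group(1), 16)      # hexadecimal constant
    match = re.fullmatch(esc + r"([0-7]{1,3})", token)
    if match:
        return int(match.group(1), 8)       # octal constant
    if len(token) == 1:
        return ord(token)                   # character constant
    raise ValueError("not a character-code constant: %r" % token)
```

			All four spellings of the same code yield the same
			value, and passing a different escape argument models
			a redefined escape character.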

	      operands consist of one or more decimal digits separated by
			semicolons.

	      operands follow the keywords toupper and tolower
			and must consist of two character-code constants
			enclosed by left and right parentheses and separated
			by a comma.  Each such character pair is separated
			from the next by a semicolon.  For tolower, the first
			constant represents an uppercase character and the
			second the corresponding lowercase character.  For
			toupper, the first constant represents a lowercase
			character and the second the corresponding uppercase
			character.

	      The order_start keyword is followed by collating element
			entries, one per line, in ascending order by collating
			position.  The collating element entries have the
			form:

			     collation_element[weight[;weight]]

			collation_element can be a character, a collating
			symbol enclosed in angle brackets representing a
			character or collating element, the special symbol
			UNDEFINED, or an ellipsis (...).

			A character stands for itself; a collating symbol can
			be a symbolic name for a character that is interpreted
			by the charmap file, a multi-character collating
			element defined by a collating-element keyword, or a
			collating symbol defined by the collating-symbol
			keyword.

			The special symbol UNDEFINED specifies the collating
			position of any characters not explicitly defined by
			collating element entries.  For example, if some group
			of characters is to be omitted from the collation
			sequence and just collate after all defined
			characters, a collating symbol might be defined before
			the order_start keyword:

			Then somewhere in the list of  collating  element  en-
			tries:

			Notice	that  there  is	 no second weight.  This means
			that on	a second pass all characters collate by	 their
			encoded	value.

			An ellipsis is interpreted as a list of characters
			with an encoded value higher than that of the
			character on the preceding line and lower than that on
			the following line.  Because it is tied to the encoded
			value of characters, the ellipsis is inherently
			non-portable.  If it is used, a warning is issued and
			no output is generated unless the -c option was given.

			The  weight operands provide information about how the
			collating element is to	be collated on first and  sub-
			sequent	passes.	 Weight	can be a two-character string,
			the special symbol or a	collating element  of  any  of
			the  forms  specified  for collating_element except If
			there are no weights, the character collates strictly
			by its position in the list.  If there is only one
			weight given, the character sorts by its relative
			position in the list on the second collation pass.

			An equivalence class is	defined	by a series of collat-
			ing  element  entries all having the same character or
			symbol in the first weight position.  For example,  in
			many  locales  all  forms of the character 'A' collate
			equal on the first pass.  This is represented  in  the
			collating element entries as:

			Two-to-one collating elements are specified by
			collating elements defined before the order_start
			keyword.  For example, the two-to-one collating
			element in Spanish would be defined before the
			order_start keyword as

			It would then be used in a collating element entry as

			A one-to-two collating element is defined by having  a
			two-character  string  in one of the weight positions.
			For example, if	the character collates	equal  to  the
			pair "AE", the collating element entry would be:

			A don't-care character is defined by the special
			symbol IGNORE.  For example, the dash character may be
			a don't-care on the first collation pass.  The
			collating element entry is:

			Symbols defined by the collating-symbol keyword can be
			used to indicate that a given character collates
			higher or lower than some position in the sequence.
			For example, if all characters with an encoded value
			less than that of '0' are to collate lower than all
			other characters on the first pass, and in relative
			order on the second pass, define a collating symbol
			before the order_start keyword:

			The first two collating	element	entries	are then:

			This also illustrates the use of the ellipsis to indi-
			cate  a	 range.	  The first ellipsis is	interpreted as
			"all characters	in the encoded character  set  with  a
			value  lower than '0'";	the second ellipsis means that
			all characters in the range defined by the first  col-
			late in	relative order.
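
			The effect of first-pass and second-pass weights can
			be illustrated with a small Python model.  This is a
			conceptual sketch only, not the algorithm used by
			localedef or by the string collation routines: the
			weight table and the name collation_key are
			hypothetical, both passes run forward, and every
			element is a single character.

```python
# Hypothetical table: element -> (first-pass weight, second-pass weight).
# 'A' and 'a' share a first-pass weight, so they form an equivalence class.
WEIGHTS = {
    "A": (1, 1),
    "a": (1, 2),
    "B": (2, 1),
    "b": (2, 2),
}


def collation_key(text, weights=WEIGHTS):
    """Build a sort key: all first-pass weights, then all second-pass ones."""
    first_pass = tuple(weights[ch][0] for ch in text)
    second_pass = tuple(weights[ch][1] for ch in text)
    return (first_pass, second_pass)


words = ["b", "a", "B", "A", "ab", "Ab"]
ordered = sorted(words, key=collation_key)
```

			Strings that differ only in case compare equal on the
			first pass and are separated only by the second-pass
			weights.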

	      operands conform to
			the Extended Regular Expressions specifications	as de-
			scribed	in regexp(5).

   Metacharacters
       Metacharacters are characters having a special meaning to localedef  in
       operands.   To escape the special meaning of these characters, surround
       them with single	quotes or precede them by an  escape  character.   lo-
       caledef meta-characters include:

	      <	      Indicates the beginning of a symbolic name.

	      >	      Indicates the end of a symbolic name.

	      (	      Indicates the beginning of a character shift pair
		      following the toupper and tolower keywords.

	      )	      Indicates the end of a character shift pair.

	      ,	      Used to separate the characters of a character shift
		      pair.

	      "	      Used to quote strings.

	      ;	      Used as a separator in list operands.

	      escape character
		      Used to escape special meaning from other metacharacters
		      and itself.  It is backslash (\) by default, but can be
		      redefined by the escape_char keyword.

   Comments
       Comments are lines beginning with a comment character.  The comment
       character is pound sign (#) by default, but can be redefined by the
       comment_char keyword.  Comments and blank lines are ignored.
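
       The handling of comment lines, blank lines and line continuation can
       be sketched as a small preprocessing step.  The function name
       logical_lines and the sample input in the usage note are
       hypothetical.  The sketch simplifies in two ways that are
       assumptions, not documented behavior: a line that continues a
       previous one is never treated as a comment, and an escaped escape
       character at the end of a line is not recognized.

```python
def logical_lines(text, comment_char="#", escape_char="\\"):
    """Drop comments and blank lines, and join continued lines."""
    result = []
    pending = ""                    # text carried over from continued lines
    for line in text.splitlines():
        if not pending:
            # Comment character in the first column, or a blank line.
            if line.startswith(comment_char) or not line.strip():
                continue
        if line.endswith(escape_char):
            pending += line[:-1]    # drop the escape and keep collecting
            continue
        result.append(pending + line)
        pending = ""
    if pending:
        result.append(pending)      # input ended in the middle of a line
    return result
```

       Given a comment line, a blank line, and a two-line list joined by a
       trailing backslash, the function returns the joined list as a single
       logical line.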

   Separators
       Separator characters include blanks and tabs.  Any number of separators
       can be used to delimit the keywords, metacharacters, constants and
       strings that comprise a localedef script, except that all characters
       between < and > are considered to be part of the symbolic name even if
       they are <blank>s.

EXAMPLE
       Please  see  the	 files under for examples of locale description	files.
       These files were	used to	create the various locales which are delivered
       with HP-UX.
