Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
GENDICT(1)			ICU 67.1 Manual			    GENDICT(1)

NAME
       gendict - Compiles word list into ICU string trie dictionary

SYNOPSIS
       gendict [ --uchars | --bytes --transform	transform ] [ -h, -?, --help ]
       [ -V, --version ] [ -c, --copyright ] [ -v, --verbose ] [  -i,  --icud-
       atadir directory	]  input-file  output-file

DESCRIPTION
       gendict	reads  the word	list from dictionary-file and creates a	string
       trie dictionary file. Normally this data	file has the .dict extension.

       Words begin at the beginning of a line and are terminated by the	 first
       whitespace.  Lines that begin with whitespace are ignored.

OPTIONS
       -h, -?, --help
	      Print help about usage and exit.

       -V, --version
	      Print the	version	of gendict and exit.

       -c, --copyright
	      Embeds the standard ICU copyright	into the output-file.

       -v, --verbose
	      Display extra informative	messages during	execution.

       -i, --icudatadir	directory
	      Look  for	 any necessary ICU data	files in directory.  For exam-
	      ple, the file pnames.icu must be located when ICU's data is  not
	      built  as	 a  shared library.  The default ICU data directory is
	      specified	by the environment variable ICU_DATA.  Most configura-
	      tions of ICU do not require this argument.

       --uchars
	      Set  the	output	trie  type  to	UChar. Mutually	exclusive with
	      --bytes.

       --bytes
	      Set the output trie  type	 to  Bytes.  Mutually  exclusive  with
	      --uchars.

       --transform
	      Set  the	transform type.	Should only be specified with --bytes.
	      Currently	supported transforms are:  offset-<hex-number>,	 which
	      specifies	 an  offset to subtract	from all input characters.  It
	      should be	noted that the offset transform	also  maps  U+200D  to
	      0xFF and U+200C to 0xFE, in order	to offer compatibility to lan-
	      guages that require these	characters.  A transform must be spec-
	      ified  for a bytes trie, and when	applied	to the non-value char-
	      acters in	the input-file must produce output  between  0x00  and
	      0xFF.

	input-file
	      The source file to read.

	output-file
	      The file to write	the output dictionary to.

CAVEATS
       The  input-file is assumed to be	encoded	in UTF-8.  The integers	in the
       input-file that are used	as values must be made	up  of	ASCII  digits.
       They  may be specified either in	hex, by	using a	0x prefix, or in deci-
       mal.  Either --bytes or --uchars	must be	specified.

ENVIRONMENT
       ICU_DATA	 Specifies the directory  containing  ICU  data.  Defaults  to
		 ${prefix}/share/icu/67.1/.   Some  tools in ICU depend	on the
		 presence of the trailing slash. It is thus important to  make
		 sure that it is present if ICU_DATA is	set.

AUTHORS
       Maxime Serrano

VERSION
       1.0

COPYRIGHT
       Copyright (C) 2012 International	Business Machines Corporation and oth-
       ers

SEE ALSO
       http://www.icu-project.org/userguide/boundaryAnalysis.html

ICU MANPAGE			  1 June 2012			    GENDICT(1)

NAME | SYNOPSIS | DESCRIPTION | OPTIONS | CAVEATS | ENVIRONMENT | AUTHORS | VERSION | COPYRIGHT | SEE ALSO

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=gendict&manpath=FreeBSD+12.1-RELEASE+and+Ports>

home | help