HXINDEX(1)			HTML-XML-utils			    HXINDEX(1)

       hxindex - insert	an index into an HTML document

       hxindex [-t] [-x] [-n|-N] [-f] [-r] [-c class[,class...]] [-b base]
       [-i indexdb] [-s template] [-u phrase] [-O element[,element...]]
       [-X element[,element...]] [--] [file-or-URL]

       The hxindex command looks for terms to be indexed in a document, collects them,
       turns them into target anchors and creates a sorted index  as  an  HTML
       list,  which is inserted	at the place of	a placeholder in the document.
       The resulting document is written to standard output.

       The index is inserted at the place of a comment of the form

	   <!--index-->

       or between two comments of the form

	   <!--begin-index-->
	   <!--end-index-->

       In the latter case, all existing	content	between	the  two  comments  is
       removed first.

       Index  terms are	either elements	of type	_dfn_ or elements with a class
       attribute of "index".  (For  backward  compatibility,  also  class  at-
       tributes	 "index-inst"  and "index-def" are recognized.)	_dfn_ elements
       (and class "index-def") are considered  more  important	than  elements
       with class "index" and will appear in bold in the generated index.

       The -c option adds further classes that act as aliases for "index".

       By  default,  the  contents of the element are taken as the index term.
       Here are	two examples of	occurrences of the index term "shoe":

	   A <dfn>shoe</dfn> is	a piece	of clothing that...
	   completed by	a leather <span	class="index">shoe</span>...

       If the term to be indexed is not	equal to the contents of the  element,
       the title attribute can be used to give the correct term:

	   ... <dfn title="shoe">Shoes</dfn> are pieces	of clothing that...
	   ... with two	leather	<span class="index" title="shoe">shoes</span>...

       The  title attribute must also be used when the index term is a subterm
       of another. Subterms appear indented in the  index,  under  their  head
       term.  To  define a subterm, use	a title	attribute with two exclamation
       marks ("!!") between the	term and the subterm, like this:

	   <dfn	title="shoe!!leather">...</dfn>
	   <dfn	title="shoe!!invention of">...</dfn>
	   <em class="index" title="shoe!!protective!!steel nosed">...</em>

       As the last example above shows, there can be multiple levels of subterms.

       The  title  attribute also allows multiple index	terms to be associated
       with a single occurrence. The multiple terms are	separated with a  ver-
       tical bar ("|").	Compare	the following examples with the	ones above:

	   <dfn	title="shoe|boot">...</dfn>
	   <dfn	title="shoe!!invention of|inventions!!shoe">...</dfn>

       These  two elements both	insert two terms into the index. Note that the
       second example above combines subterms and multiple terms.
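
       The title syntax described above can be modelled in a few lines. The
       following Python sketch is illustrative only and is not taken from
       hxindex itself; the function name parse_title is hypothetical, and
       stripping whitespace around each level is an assumption.

```python
def parse_title(title):
    """Split a title attribute into index terms.

    Each term is returned as a list of levels: head term first, then
    subterms. "|" separates terms, "!!" separates levels within a term.
    """
    terms = []
    for entry in title.split("|"):
        # Assumption: surrounding whitespace is not significant.
        levels = [level.strip() for level in entry.split("!!")]
        terms.append(levels)
    return terms


if __name__ == "__main__":
    for term in parse_title("shoe!!invention of|inventions!!shoe"):
        print(" > ".join(term))
```

       Under these assumptions, one occurrence with the title
       "shoe!!invention of|inventions!!shoe" yields two index entries, each
       with one head term and one subterm.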

       It is possible to run hxindex on a file that already has an index. The
       old target anchors and the old index will be removed before being
       regenerated.

       The following options are supported:

       -t	 By default, hxindex adds an ID	attribute to the element  that
		 contains the occurrence of a term and also inserts an _a_ el-
		 ement inside it with a	name attribute equal to	the  ID.  This
		 is  to	 allow old browsers that ignore	ID attributes, such as
		 Netscape 4, to	find the target	as well. The  -t  option  sup-
		 presses the _a_ element.

       -x	 This option turns on XML syntax conventions: empty elements
		 will end in /> instead of > as in HTML.  -x implies -t.

       -i indexdb
		 hxindex can read an initial index from	a file and  write  the
		 merged	 collection of index terms back	to that	file. This al-
		 lows an index to span several documents.  The	-i  option  is
		 used to give the name of the file that	contains the index.

       -b base	 This option is	useful in combination with -i to give the base
		 URL reference of the document.	By default, hxindex will store
		 links to occurrences in the indexdb file in the form #anchor,
		 but when -b is given, the links will look like base#anchor.

		 When used in combination with -n, the title attributes	of the
		 links will contain the	title of the  document	that  contains
		 the  term. The	title is inserted before the template (see op-
		 tion -s) and separated	from it	with  a	 comma	and  a	space.
		 E.g., if hxindex is called with

		     hxindex -i termdb -n -b myfile.html myfile.html

		 and the termdb already contains an entry for "foo" in section
		 "3.1" of a document called "file2.html" with title "The
		 foos", then the generated index will contain an entry such as

		     foo, <a href="file2.html#foo"
		       title="The foos,	section	3.1">3.1</a>
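
       How such a link is put together can be sketched as follows. This is
       a hedged illustration in Python, not the code hxindex uses: the
       function index_link and its parameters are hypothetical, and the
       default template "section %s" is the one documented under option -s.

```python
from html import escape


def index_link(anchor, secno, base="", doc_title=None, template="section %s"):
    """Build one index link in the style shown in this manual page.

    base      -- base URL given with -b (empty means a local "#anchor" link)
    doc_title -- title of the document that contains the term, if known
    template  -- title template; "%s" stands for the section number
    """
    href = f"{base}#{anchor}"
    title = template.replace("%s", secno)
    if doc_title:
        # The document title comes first, then a comma and a space.
        title = f"{doc_title}, {title}"
    return f'<a href="{escape(href)}" title="{escape(title)}">{escape(secno)}</a>'


if __name__ == "__main__":
    print("foo, " + index_link("foo", "3.1", base="file2.html",
                               doc_title="The foos"))
```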

       -c class,class,...
		 Normal	index terms are	recognized because they	have  a	 class
		 of  "index".  The  -c option adds additional, comma-separated
		 class names that will	be  considered	aliases	 for  "index".
		 E.g.,	-c  instance  will  make  sure	that  <span class="in-
		 stance">term</span> is	recognized as a	term for the index.

       -n	 By default, the index consists	of links with "#" as  the  an-
		 chor  text.  Option -n	causes the link	text to	consist	of the
		 section numbers of the	sections in  which  the	 terms	occur,
		 falling  back to "without number" (see	option -u below) if no
		 section number	could be found.	Section	numbers	are  found  by
		 looking  for  the nearest preceding start tag with a class of
		 "secno" or "no-num". In the case of "secno", the contents  of
		 that  element are taken as the	section	number.	In the case of
		 "no-num" the section is assumed to have no number and hxindex
		 will  print  "without number" instead.	These classes are also
		 used by hxnum(1), so it is useful to run hxindex after hxnum, e.g.:

		     hxnum myfile.html | hxindex -n >mynewfile.html
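
       The lookup of the nearest preceding section number can be sketched
       with Python's standard HTML parser. This is a simplified model and
       not the algorithm hxindex implements: the class SectionTracker is
       hypothetical, it assumes that "secno" elements and index terms
       contain plain text only, and it uses "??" (the documented default
       of option -u) when no number is known.

```python
from html.parser import HTMLParser


class SectionTracker(HTMLParser):
    """Pair each index term with the section number in effect at that point."""

    def __init__(self, unknown="??"):
        super().__init__()
        self.unknown = unknown   # phrase used when no number is known
        self.current = unknown   # section number currently in effect
        self.capture = None      # "secno" or "term" while collecting text
        self.buffer = []
        self.terms = []          # list of (term, section number) pairs

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "secno" in classes:
            self.capture, self.buffer = "secno", []
        elif "no-num" in classes:
            self.current = self.unknown
        elif tag == "dfn" or "index" in classes:
            self.capture, self.buffer = "term", []

    def handle_data(self, data):
        if self.capture:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        # Simplification: any end tag closes the element being captured.
        text = "".join(self.buffer).strip()
        if self.capture == "secno":
            self.current = text
        elif self.capture == "term":
            self.terms.append((text, self.current))
        self.capture = None


if __name__ == "__main__":
    tracker = SectionTracker()
    tracker.feed('<h2><span class="secno">3.1</span> Shoes</h2>'
                 '<p>A <dfn>shoe</dfn> is a piece of clothing.</p>')
    tracker.close()
    print(tracker.terms)
```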

       -N	 With  this  option, the anchor	text of	the links in the index
		 is the	full title of the section in which  the	 term  occurs.
		 The title of the section is the nearest preceding H1, H2, H3,
		 H4, H5	or H6 element, or the document's title if there	is  no
		 preceding  H*	element.  This	option cannot be used together
		 with -n.  If both are used, the last one specified wins.

       -s template
		 When option -n	is used, the link will have a title  attribute
		 and  the template determines what it contains.	The default is
		 "section %s", where the %s is a placeholder for  the  section
		 number.  In  other words, the index will contain entries like

		     term, <a href="#term" title="section 7.8">7.8</a>

		 Some examples:

		     hxindex -n	-s 'chapter %s'
		     hxindex -n	-s 'part %s'
		     hxindex -n	-s 'hoofdstuk %s' -u 'zonder nummer'

		 This option is only useful in combination with -n.

       -u phrase When option -n	is used	to display section numbers, references
		 for  which no section number can be found are shown as	phrase
		 instead. The default is "??".

		 This option is only useful in combination with -n.

       -f	 Remove	title attributes that were used	for the	index as  well
		 as the comments that delimit the inserted index. This keeps
		 browsers from displaying these attributes. Note that hxindex
		 cannot	be run again on	its own	output if this option is used.
		 (Mnemonic: "freeze" or	"final".)

       -r	 Do not	ignore trailing	punctuation when sorting index	terms.
		 E.g., if two terms are	written	as

		     <dfn>foo,</dfn>...	<span class=index>foo</span>

		 hxindex  will normally	ignore the comma and treat them	as the
		 same term, but	with -r, they are treated as  different.  This
		 affects trailing commas (,), semicolons (;), colons (:),
		 exclamation marks (!), question marks (?) and full stops (.).
		 A final full stop is never ignored if there are two or more
		 in the term, to protect abbreviations ("B.C.") and ellipses
		 ("more..."). This does not affect how the index term is
		 printed (it is	always printed as it  appears  in  the	text),
		 only how it is	compared to similar terms. (Mnemonic: "raw".)
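
       The comparison rule can be written down as a small function. The
       Python sketch below is an interpretation of the description above,
       not the hxindex source: the name compare_key is hypothetical, and
       it assumes that at most one trailing punctuation character is
       ignored.

```python
TRAILING = ",;:!?."


def compare_key(term, raw=False):
    """Return the form of a term used for comparison, not for printing.

    With raw=True (modelled on option -r) the term is compared as written.
    """
    if raw or not term or term[-1] not in TRAILING:
        return term
    # A final full stop is kept when the term has two or more full stops,
    # which protects abbreviations and ellipses.
    if term[-1] == "." and term.count(".") >= 2:
        return term
    return term[:-1]


if __name__ == "__main__":
    for term in ["foo,", "foo", "B.C.", "more..."]:
        print(repr(term), "->", repr(compare_key(term)))
```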

       -O element,element,...
		 If  -O	is present, only elements with the given names will be
		 indexed. E.g.,

		     hxindex -O	span,i,em

		 means that hxindex will  only	look  for  class="index"  (and
		 other	classes,  according to -c) on the elements span, i and
		 em.  The argument of -O must be a comma-separated list	of el-
		 ement names.  Note that this does not affect the element dfn.
		 It will always	be indexed as a	defining instance.

       -X element,element,...
		 The option -X excludes the given elements from being indexed.
		 E.g.,

		     hxindex -X	ul,ol

		 makes	sure  that ul and ol elements are not indexed, even if
		 they have a class="index" attribute. This  does  not  exclude
		 their children	from being indexed. E.g.,

		     <ul class=index>
		      <li class=index>foo
		      <li class=index>bar

		 will add foo and bar to the index, but not the whole content
		 of the ul element (foo bar).  If both -O and -X are given
		 and an element occurs in both options, it will be excluded. E.g.,

		     hxindex -X	p,h1,ul	-O em,span,h1,h2

		 will cause hxindex to only look for class attributes  on  em,
		 span and h2, because h1 is excluded.
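
       The interaction of -O and -X for class-based terms can be
       summarised in a short predicate. This Python sketch only restates
       the rules given above; the function name looks_at is hypothetical,
       and the sketch deliberately leaves out dfn elements, whose
       interaction with -X is not described here.

```python
def looks_at(element, only=None, exclude=()):
    """Decide whether class attributes on this element name are examined.

    only    -- element names given with -O, or None if -O is absent
    exclude -- element names given with -X
    """
    if element in exclude:
        # -X wins when an element is listed in both options.
        return False
    return only is None or element in only


if __name__ == "__main__":
    only = {"em", "span", "h1", "h2"}
    exclude = {"p", "h1", "ul"}
    for name in ["em", "span", "h1", "h2", "p", "i"]:
        print(name, looks_at(name, only, exclude))
```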

       The following operand is	supported:

       file-or-URL
		 The name of an HTML or XML file or the URL of one. If absent,
		 or if the file	is "-",	standard input is read instead.

       The following exit values are returned:

       0	 Successful completion.

       >0	 An error occurred in parsing the HTML file.

       The input is assumed to be in UTF-8, but	the current locale is used  to
       determine  the sorting order of the index terms.	I.e., hxindex looks at
       the LANG, LC_ALL and/or LC_COLLATE environment variables. See
       locale(1).

       To  use a proxy to retrieve remote files, set the environment variables
       http_proxy or ftp_proxy.	 E.g., http_proxy="http://localhost:8080/"

       Assumes UTF-8 as	input. Doesn't expand character	entities  (apart  from
       the standard ones: "&amp;", "&lt;", "&gt;" and "&quot;"). Instead, pipe
       the input through hxunent(1) and, if needed, asc2xml(1) to  convert  it
       to UTF-8.

       Remote  files  (specified  with a URL) are currently only supported for
       HTTP. Password-protected	files or files that depend on  HTTP  "cookies"
       are  not	 handled. (You can use tools such as curl(1) or	wget(1)	to re-
       trieve such files.)

       The accessibility of an index, even when	generated with option  -n,  is

       asc2xml(1), hxnormalize(1), hxnum(1), hxprune(1), hxtoc(1), hxunent(1),
       xml2asc(1), locale(1), UTF-8 (RFC 2279)

7.x				  10 Jul 2011			    HXINDEX(1)

