DJVUXML(1)		      DjVuLibre	XML Tools		    DJVUXML(1)

NAME
       djvutoxml, djvuxmlparser - DjVuLibre XML Tools.

SYNOPSIS
       djvutoxml [options] inputdjvufile [outputxmlfile]
       djvuxmlparser [ -o djvufile ] inputxmlfile

DESCRIPTION
       The DjVuLibre XML Tools allow editing the metadata, hyperlinks, and
       hidden text associated with DjVu files.  Unlike djvused(1), the
       DjVuLibre XML Tools rely on XML technology and can take advantage of
       XML editors and verifiers.

       Program djvutoxml creates an XML file outputxmlfile containing a
       reference to the original DjVu document inputdjvufile as well as tags
       describing the metadata, hyperlinks, and hidden text associated with
       the DjVu file.

       The following options are supported:

       --page pagenum
	      Select a page in a multi-page document.  Without this option,
	      djvutoxml outputs the XML corresponding to all pages of the
	      document.

       --with-text
	      Specifies that the HIDDENTEXT element for each page should be
	      included in the output.  If specified without the --with-anno
	      flag, then the --without-anno flag is implied.  If none of the
	      --with-text, --without-text, --with-anno, or --without-anno
	      flags are specified, then the --with-text and --with-anno flags
	      are implied.

       --without-text
	      Specifies that the HIDDENTEXT element for each page should not
	      be included in the output.  If specified without the
	      --without-anno flag, then the --with-anno flag is implied.

       --with-anno
	      Specifies that the area MAP element for each page should be
	      included in the output.  If specified without the --with-text
	      flag, then the --without-text flag is implied.

       --without-anno
	      Specifies that the area MAP element for each page should not be
	      included in the output.  If specified without the --without-text
	      flag, then the --with-text flag is implied.
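
       The defaulting rules for the four --with/--without flags can be
       summarized in a short Python sketch.  The function name
       resolve_output_flags is hypothetical and not part of DjVuLibre, and
       treating contradictory flags as an error is an assumption, since the
       text does not say how djvutoxml handles them.

```python
def resolve_output_flags(flags):
    """Return (include_text, include_anno) for a list of djvutoxml flags.

    Hypothetical helper that mirrors the rules stated in this manual page;
    it does not call djvutoxml.
    """
    flags = set(flags)
    text = anno = None  # None means "not given on the command line"

    if "--with-text" in flags:
        text = True
    if "--without-text" in flags:
        if text:
            # Assumption: the manual does not define this combination.
            raise ValueError("contradictory text flags")
        text = False
    if "--with-anno" in flags:
        anno = True
    if "--without-anno" in flags:
        if anno:
            # Assumption: the manual does not define this combination.
            raise ValueError("contradictory annotation flags")
        anno = False

    # No flag at all: both HIDDENTEXT and MAP elements are output.
    if text is None and anno is None:
        return True, True
    # Only one kind of flag given: the other kind takes the opposite value.
    if text is None:
        text = not anno
    if anno is None:
        anno = not text
    return text, anno
```

       For instance, passing only --with-text yields text without
       annotations, while passing both --with-text and --with-anno matches
       the default behavior.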

       Files produced by djvutoxml can then be modified using either a text
       editor or an XML editor.  Program djvuxmlparser parses the XML file
       inputxmlfile in order to modify the metadata of the corresponding DjVu
       file.

       -o djvufile
	      In principle the target DjVu file	is the file referenced by  the
	      OBJECT  element of the XML file.	This option provides the means
	      to override the filename specified in the	OBJECT element.
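
       The export, edit, and re-import cycle can be sketched in Python by
       building the two command lines from the synopsis.  The helper name
       build_commands and the file names in the usage note are hypothetical;
       the sketch only assembles argument lists and does not run the tools.

```python
def build_commands(djvu_file, xml_file, page=None, target=None):
    """Build argument lists for djvutoxml and djvuxmlparser.

    Hypothetical helper: it follows the synopsis of this manual page and
    returns the commands instead of executing them.
    """
    export_cmd = ["djvutoxml"]
    if page is not None:
        # Restrict the export to a single page of a multi-page document.
        export_cmd += ["--page", str(page)]
    export_cmd += [djvu_file, xml_file]

    import_cmd = ["djvuxmlparser"]
    if target is not None:
        # Override the file named in the OBJECT element of the XML file.
        import_cmd += ["-o", target]
    import_cmd.append(xml_file)
    return export_cmd, import_cmd
```

       Each returned list could be handed to subprocess.run(cmd, check=True)
       once the XML file has been edited, for example with
       build_commands("book.djvu", "book.xml", target="copy.djvu").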

       The DjVuXML-s document type definition (DTD) file defines the input
       and output of the DjVu XML tools.

       The DjVuXML-s DTD is a simplification of the HTML DTD, with a few new
       attributes added that are specific to DjVu.  Each of the specified
       pages of a DjVu document is represented as an OBJECT element within
       the BODY element of the XML file.  Each OBJECT element may contain
       multiple PARAM elements to specify attributes like page name,
       resolution, and gamma factor.  Each OBJECT element may also contain
       one HIDDENTEXT element to specify the hidden text (usually generated
       with an OCR engine) within the DjVu page.  In addition, each OBJECT
       element may reference a single area MAP element which contains
       multiple AREA elements to represent all the hyperlink and highlight
       areas within the DjVu document.

   PARAM Elements
       Legal PARAM elements of a DjVu OBJECT include but are not limited to
       PAGE for specifying the page name, GAMMA for specifying the gamma
       correction factor (normally 2.2), and DPI for specifying the page
       resolution.
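
       As an illustration, the PARAM elements of one OBJECT can be read with
       Python's standard XML parser.  The fragment below is a hypothetical,
       simplified example rather than real djvutoxml output; the name and
       value attributes follow the HTML PARAM convention, which is assumed
       to carry over to the DjVuXML-s DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified OBJECT fragment (not real djvutoxml output).
SAMPLE_OBJECT = """
<OBJECT data="example.djvu">
  <PARAM name="PAGE" value="p0001.djvu" />
  <PARAM name="GAMMA" value="2.2" />
  <PARAM name="DPI" value="300" />
</OBJECT>
"""

def object_params(object_element):
    """Map each PARAM name to its value for one OBJECT element."""
    return {p.get("name"): p.get("value")
            for p in object_element.iter("PARAM")}

params = object_params(ET.fromstring(SAMPLE_OBJECT))
print(params["PAGE"], params["GAMMA"], params["DPI"])
```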

   HIDDENTEXT Elements
       The HIDDENTEXT element consists of nested PAGECOLUMN, REGION,
       PARAGRAPH, LINE, and WORD elements.  The most deeply nested element
       specified should give the bounding coordinates of the element in
       top-down orientation.  The body of the most deeply nested element
       should contain the text.  Most DjVu documents use either LINE or WORD
       as the lowest level element, but any element is legal as the lowest
       level element.  A white space is always added between WORD elements
       and a line feed is always added between LINE elements.  Since
       languages such as Japanese do not use spaces between words, it is
       quite common for Asian OCR engines to use WORD elements for individual
       characters instead.
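
       The joining rule (a space between WORD elements, a line feed between
       LINE elements) can be sketched in Python as follows.  The fragment is
       a hypothetical, simplified example, the coords attribute name and its
       values are assumptions, and the helper only handles LINE or WORD as
       the lowest level element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified HIDDENTEXT fragment; the coords attribute name
# and its values are assumptions made for illustration only.
SAMPLE_TEXT = """
<HIDDENTEXT>
  <PARAGRAPH>
    <LINE>
      <WORD coords="10,40,60,20">Hello</WORD>
      <WORD coords="70,40,120,20">DjVu</WORD>
    </LINE>
    <LINE>
      <WORD coords="10,80,70,60">world</WORD>
    </LINE>
  </PARAGRAPH>
</HIDDENTEXT>
"""

def hidden_text(hiddentext_element):
    """Rebuild plain text from the LINE and WORD elements."""
    lines = []
    for line in hiddentext_element.iter("LINE"):
        words = [w.text.strip() for w in line.iter("WORD") if w.text]
        # A LINE without WORD children carries its text in its own body.
        lines.append(" ".join(words) if words else (line.text or "").strip())
    return "\n".join(lines)

text = hidden_text(ET.fromstring(SAMPLE_TEXT))
print(text)
```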

   MAP Elements
       The body of a MAP element consists of AREA elements.  In addition to
       the standard attributes of the HTML AREA element, the attributes
       bordertype, bordercolor, border, and highlight have been added to
       specify border type, border color, border width, and highlight color
       respectively.  Legal values for each of these attributes are listed in
       the DjVuXML-s DTD.  In addition, the shape oval has been added to the
       legal list of shapes.  An oval uses a rectangular bounding box.
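
       As a sketch of how an oval AREA might be interpreted, the Python
       example below derives the center and semi-axes of the ellipse from
       its rectangular bounding box.  The fragment and its attribute values
       are hypothetical, and the coordinate order (left, top, right, bottom,
       as for an HTML rect) is an assumption; the DjVuXML-s DTD remains the
       reference for legal values.

```python
import xml.etree.ElementTree as ET

# Hypothetical MAP fragment; attribute values are illustrative only.
SAMPLE_MAP = """
<MAP name="example">
  <AREA shape="oval" coords="10,10,110,60" href="http://example.org/"
        border="2" highlight="#FFFF00" />
</MAP>
"""

def oval_geometry(coords):
    """Return ((cx, cy), (rx, ry)) for an oval given its bounding box.

    Assumes coords is "left,top,right,bottom", as for an HTML rect.
    """
    left, top, right, bottom = (float(v) for v in coords.split(","))
    center = ((left + right) / 2, (top + bottom) / 2)
    semi_axes = (abs(right - left) / 2, abs(bottom - top) / 2)
    return center, semi_axes

area = ET.fromstring(SAMPLE_MAP).find("AREA")
geometry = oval_geometry(area.get("coords"))
print(area.get("shape"), geometry)
```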

       Perhaps it would have been better to use CSS2 style sheets with
       standard HTML elements instead of defining the HIDDENTEXT element.

       The  DjVu  XML  tools  and  DTD	were  written by Bill C. Riemers <doc-> and Fred Crary.

SEE ALSO
       djvu(1), djvused(1), and utf8(7).

DjVuLibre XML Tools		  11/15/2002			    DJVUXML(1)

