dcm2xml(1)			  OFFIS	DCMTK			    dcm2xml(1)

NAME
       dcm2xml - Convert DICOM file and data set to XML

SYNOPSIS
       dcm2xml [options] dcmfile-in [xmlfile-out]

DESCRIPTION
       The dcm2xml utility converts the contents of a DICOM file (file format
       or raw data set)	to XML (Extensible Markup  Language).  There  are  two
       output  formats.	 The  first  one  is  specific	to  DCMTK with its DTD
       (Document Type Definition)  described  in  the  file  dcm2xml.dtd.  The
       second  one  refers  to the 'Native DICOM Model'	which is specified for
       the DICOM Application Hosting service found in DICOM part 19.

       If dcm2xml reads a raw data set (DICOM data without a file format
       meta-header), it will attempt to guess the transfer syntax by examining
       the first few bytes of the file. The transfer syntax cannot always be
       guessed correctly, so it is better to convert a data set to a file
       format whenever possible (using the dcmconv utility). It is also
       possible to use the -f and -t[ieb] options to force dcm2xml to read a
       data set with a particular transfer syntax.
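
       As an illustration of forcing the input format, the following Python
       sketch assembles a dcm2xml command line from the options named above
       (-f, -ti, -te, -tb). The helper name build_dcm2xml_command and the file
       names are hypothetical; the sketch only builds the argument list and
       does not run the tool.

```python
def build_dcm2xml_command(infile, outfile=None, dataset_only=False, xfer=None):
    """Return an argv list for dcm2xml (hypothetical helper, not part of DCMTK)."""
    # Short options as listed under "input transfer syntax" in this page.
    xfer_flags = {"implicit": "-ti", "little": "-te", "big": "-tb"}
    cmd = ["dcm2xml"]
    if dataset_only:
        cmd.append("-f")  # --read-dataset: no file meta information expected
    if xfer is not None:
        cmd.append(xfer_flags[xfer])  # force a particular transfer syntax
    cmd.append(infile)
    if outfile is not None:
        cmd.append(outfile)  # omitted: dcm2xml writes to stdout
    return cmd


# Example file names are placeholders.
print(build_dcm2xml_command("raw.dcm", "raw.xml", dataset_only=True, xfer="implicit"))
```

       The resulting list could be passed to subprocess.run() on a system
       where dcm2xml is installed.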

PARAMETERS
       dcmfile-in   DICOM input filename to be converted

       xmlfile-out  XML	output filename	(default: stdout)

OPTIONS
   general options
	 -h    --help
		 print this help text and exit

	       --version
		 print version information and exit

	       --arguments
		 print expanded command line arguments

	 -q    --quiet
		 quiet mode, print no warnings and errors

	 -v    --verbose
		 verbose mode, print processing	details

	 -d    --debug
		 debug mode, print debug information

	 -ll   --log-level  [l]evel: string constant
		 (fatal, error,	warn, info, debug, trace)
		 use level l for the logger

	 -lc   --log-config  [f]ilename: string
		 use config file f for the logger

   input options
       input file format:

	 +f    --read-file
		 read file format or data set (default)

	 +fo   --read-file-only
		 read file format only

	 -f    --read-dataset
		 read data set without file meta information

       input transfer syntax:

	 -t=   --read-xfer-auto
		 use TS	recognition (default)

	 -td   --read-xfer-detect
		 ignore	TS specified in	the file meta header

	 -te   --read-xfer-little
		 read with explicit VR little endian TS

	 -tb   --read-xfer-big
		 read with explicit VR big endian TS

	 -ti   --read-xfer-implicit
		 read with implicit VR little endian TS

       long tag	values:

	 +M    --load-all
		 load very long	tag values (e.g. pixel data)

	 -M    --load-short
		 do not	load very long values (default)

	 +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
		 set threshold for long	values to k kbytes

   processing options
       specific	character set:

	 +Cr   --charset-require
		 require declaration of	extended charset (default)

	 +Ca   --charset-assume	 [c]harset: string
		 assume	charset	c if no	extended charset declared

	 +Cc   --charset-check-all
		 check all data	elements with string values
		 (default: only	PN, LO,	LT, SH,	ST, UC and UT)

		 # this	option is only used for	the mapping to an appropriate
		 # XML character encoding, but not for the conversion to UTF-8

	 +U8   --convert-to-utf8
		 convert all element values that are affected
		 by Specific Character Set (0008,0005) to UTF-8

		 # requires support from an underlying character encoding library
		 # (see	output of --version on which one is available)

   output options
       general XML format:

	 -dtk  --dcmtk-format
		 output	in DCMTK-specific format (default)

	 -nat  --native-format
		 output	in Native DICOM	Model format (part 19)

	 +Xn   --use-xml-namespace
		 add XML namespace declaration to root element

       DCMTK-specific format (not with --native-format):

	 +Xd   --add-dtd-reference
		 add reference to document type	definition (DTD)

	 +Xe   --embed-dtd-content
		 embed document	type definition	into XML document

	 +Xf   --use-dtd-file  [f]ilename: string
		 use specified DTD file	(only with +Xe)
		 (default: /usr/local/share/dcmtk/dcm2xml.dtd)

	 +Wn   --write-element-name
		 write name of the DICOM data elements (default)

	 -Wn   --no-element-name
		 do not	write name of the DICOM	data elements

	 +Wb   --write-binary-data
		 write binary data of OB and OW	elements
		 (default: off,	be careful with	--load-all)

       encoding	of binary data:

	 +Eh   --encode-hex
		 encode	binary data as hex numbers
		 (default for DCMTK-specific format)

	 +Eu   --encode-uuid
		 encode	binary data as a UUID reference
		 (default for Native DICOM Model)

	 +Eb   --encode-base64
		 encode	binary data as Base64 (RFC 2045, MIME)

DCMTK Format
       The basic structure of the DCMTK-specific XML output created from a
       DICOM file looks like the following:

       <?xml version="1.0" encoding="ISO-8859-1"?>
       <!DOCTYPE file-format SYSTEM "dcm2xml.dtd">
       <file-format xmlns="">
         <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
           <element tag="0002,0000" vr="UL" vm="1" len="4"
                    name="...">...</element>
           <element tag="0002,0013" vr="SH" vm="1" len="16"
                    name="...">...</element>
         </meta-header>
         <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
           <element tag="0008,0005" vr="CS" vm="1" len="10"
                    name="...">
             ISO_IR 100
           </element>
           <sequence tag="0028,3010" vr="SQ" card="2" name="VOILUTSequence">
             <item card="3">
               <element tag="0028,3002" vr="xs" vm="3" len="6"
                        name="...">
                 256\0\8
               </element>
               ...
             </item>
             ...
           </sequence>
           <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
                    name="PixelData" loaded="no" binary="hidden">
           </element>
         </data-set>
       </file-format>

       The 'file-format' and 'meta-header' tags are absent for DICOM data
       sets.
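
       To show how such output might be consumed, here is a minimal Python
       sketch that reads a small DCMTK-format document with the standard
       library XML parser. The sample document, the 'name' values in it and
       the function name summarize are assumptions made for illustration; the
       sample carries no XML namespace, so tag lookups would need adjusting
       for output written with --use-xml-namespace.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of the DCMTK-specific format (assumed content).
SAMPLE = """<?xml version="1.0" encoding="ISO-8859-1"?>
<file-format>
  <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
  </meta-header>
  <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
    <element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">ISO_IR 100</element>
    <element tag="7fe0,0010" vr="OW" vm="1" len="262144" name="PixelData" loaded="no" binary="hidden"></element>
  </data-set>
</file-format>
"""


def summarize(xml_bytes):
    """Return (tag, vr, value, loaded) for every <element> in the data set."""
    root = ET.fromstring(xml_bytes)
    # For a raw data set the root is <data-set>; for a file it is <file-format>.
    dataset = root if root.tag == "data-set" else root.find("data-set")
    rows = []
    for elem in dataset.iter("element"):
        value = (elem.text or "").strip()
        loaded = elem.get("loaded") != "no"  # 'loaded="no"' marks skipped long values
        rows.append((elem.get("tag"), elem.get("vr"), value, loaded))
    return rows


rows = summarize(SAMPLE.encode("iso-8859-1"))
for row in rows:
    print(row)
```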

   XML Encoding
       Attributes with very large value fields (e.g. pixel data) are not
       loaded by default. They can be identified by the additional attribute
       'loaded' with a value of 'no' (see example above). The command line
       option --load-all forces all value fields to be loaded, including the
       very long ones.

       Furthermore, binary data of OB and OW attributes are not written to the
       XML output file by default. These elements can be identified by the
       additional attribute 'binary' with a value of 'hidden' (default is
       'no'). The command line option --write-binary-data causes binary value
       fields to be printed as well (attribute value is 'yes' or 'base64').
       However, be careful when using this option together with --load-all
       because of the large amounts of pixel data that might be printed to the
       output. Please note that in this context element values with a VR of
       OD, OF, OL and OV are not regarded as 'binary data'.

       Multiple	values (i.e. where the DICOM  value  multiplicity  is  greater
       than  1)	 are  separated	 by a backslash	'\' (except for	Base64 encoded
       data). The 'len'	attribute  indicates  the  number  of  bytes  for  the
       particular  value  field	as stored in the DICOM data set, i.e. it might
       deviate from  the  XML  encoded	value  length  e.g.  because  of  non-
       significant padding that	has been removed. If this attribute is missing
       in 'sequence' or	'item' start tags, the corresponding DICOM element has
       been stored with	undefined length.
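
       The rules for multiple values and for the 'len' attribute can be
       sketched in Python as follows. The sample fragment and the helper names
       element_values and has_undefined_length are hypothetical and only
       mirror the description above.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment (assumed content); note the backslash-separated values.
SAMPLE = r"""<sequence tag="0028,3010" vr="SQ" card="1" name="VOILUTSequence">
  <item card="1">
    <element tag="0028,3002" vr="xs" vm="3" len="6" name="LUTDescriptor">256\0\8</element>
  </item>
</sequence>"""


def element_values(elem):
    """Split an element value on backslashes, except for Base64 encoded data."""
    text = (elem.text or "").strip()
    if elem.get("binary") == "base64":
        return [text]
    return text.split("\\") if text else []


def has_undefined_length(node):
    """A 'sequence' or 'item' start tag without 'len' means undefined length."""
    return node.tag in ("sequence", "item") and node.get("len") is None


sequence = ET.fromstring(SAMPLE)
descriptor = sequence.find("item/element")
print(element_values(descriptor), has_undefined_length(sequence))
```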

Native DICOM Model Format
       The  description	 of  the Native	DICOM Model format can be found	in the
       DICOM standard, part 19 ('Application Hosting').

   Bulk	Data
       Binary data, i.e. DICOM element values with Value Representations (VR)
       of OB or OW, as well as OD, OF, OL, OV and UN values, are by default
       not written to the XML output because of their size. Instead, for each
       element, a new Universally Unique Identifier (UUID) is generated and
       written as an attribute of a <BulkData> XML element. So far, it is not
       possible to write an additional file to hold the binary data for each
       of the binary data chunks. This is not required by the standard;
       however, it might be useful for implementing an Application Hosting
       interface, so this feature may be available in future versions of
       dcm2xml.

       In  addition,  Supplement  163  (Store Over the Web by Representational
       State Transfer Services)	introduces a new  <InlineBinary>  XML  element
       that  allows for	encoding binary	data as	Base64.	Currently, the command
       line option --encode-base64 enables this	 encoding  for	the  following
       VRs: OB,	OD, OF,	OL, OV,	OW and UN.
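
       A reader of Native DICOM Model output has to handle both variants. The
       following Python sketch distinguishes <BulkData> references from
       <InlineBinary> content. The sample document (including its UUID and
       the chosen attributes) is made up for illustration, and the function
       name read_binary_attributes is hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Hand-written sample in the Native DICOM Model style (assumed content).
SAMPLE = """<NativeDicomModel>
  <DicomAttribute tag="00080005" vr="CS" keyword="SpecificCharacterSet">
    <Value number="1">ISO_IR 100</Value>
  </DicomAttribute>
  <DicomAttribute tag="00420011" vr="OB" keyword="EncapsulatedDocument">
    <InlineBinary>AAECAw==</InlineBinary>
  </DicomAttribute>
  <DicomAttribute tag="7FE00010" vr="OW" keyword="PixelData">
    <BulkData uuid="00000000-0000-0000-0000-000000000000"/>
  </DicomAttribute>
</NativeDicomModel>"""


def read_binary_attributes(xml_text):
    """Map tag -> decoded bytes (InlineBinary) or ('bulk', uuid) (BulkData)."""
    result = {}
    for attr in ET.fromstring(xml_text).iter("DicomAttribute"):
        inline = attr.find("InlineBinary")
        bulk = attr.find("BulkData")
        if inline is not None:
            result[attr.get("tag")] = base64.b64decode((inline.text or "").strip())
        elif bulk is not None:
            # Only a reference is available; the binary data itself is not in the XML.
            result[attr.get("tag")] = ("bulk", bulk.get("uuid"))
    return result


binary = read_binary_attributes(SAMPLE)
print(binary)
```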

   Known Issues
       In  addition  to	 what  is written in the above section on 'Bulk	Data',
       there are further known issues with the current implementation  of  the
       Native  DICOM Model format. For example,	large element values with a VR
       other than OB, OD, OF, OL, OV, OW or UN are currently never written  as
       bulk  data,  although  it  might	 be  useful,  e.g.  for	very long text
       elements	(especially UT)	or very	long numeric fields (of	various	VRs).

   Character Encoding
       The XML encoding	is determined automatically from the  DICOM  attribute
       (0008,0005) 'Specific Character Set' using the following	mapping:

       ASCII         "ISO_IR 6"    =>  "UTF-8"
       UTF-8	     "ISO_IR 192"  =>  "UTF-8"
       ISO Latin 1   "ISO_IR 100"  =>  "ISO-8859-1"
       ISO Latin 2   "ISO_IR 101"  =>  "ISO-8859-2"
       ISO Latin 3   "ISO_IR 109"  =>  "ISO-8859-3"
       ISO Latin 4   "ISO_IR 110"  =>  "ISO-8859-4"
       ISO Latin 5   "ISO_IR 148"  =>  "ISO-8859-9"
       Cyrillic	     "ISO_IR 144"  =>  "ISO-8859-5"
       Arabic	     "ISO_IR 127"  =>  "ISO-8859-6"
       Greek	     "ISO_IR 126"  =>  "ISO-8859-7"
       Hebrew	     "ISO_IR 138"  =>  "ISO-8859-8"

       If  this	DICOM attribute	is missing in the input	file, although needed,
       option --charset-assume can be used to specify an appropriate character
       set  manually  (using  one  of the DICOM	defined	terms).	For reasons of
       backward	 compatibility	with  previous	versions  of  this  tool,  the
       following  terms	 are  also  supported  and mapped automatically	to the
       associated DICOM	defined	terms:	latin-1,  latin-2,  latin-3,  latin-4,
       latin-5,	cyrillic, arabic, greek, hebrew.
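
       The mapping table and the legacy terms can be expressed as two lookup
       tables. This Python sketch is only a restatement of the table above;
       the function name xml_encoding_for is hypothetical, and pairing each
       legacy term with the defined term of the same name is an assumption
       drawn from that table.

```python
# DICOM defined term -> XML encoding, as listed in the table above.
XML_ENCODING = {
    "ISO_IR 6": "UTF-8",
    "ISO_IR 192": "UTF-8",
    "ISO_IR 100": "ISO-8859-1",
    "ISO_IR 101": "ISO-8859-2",
    "ISO_IR 109": "ISO-8859-3",
    "ISO_IR 110": "ISO-8859-4",
    "ISO_IR 148": "ISO-8859-9",
    "ISO_IR 144": "ISO-8859-5",
    "ISO_IR 127": "ISO-8859-6",
    "ISO_IR 126": "ISO-8859-7",
    "ISO_IR 138": "ISO-8859-8",
}

# Legacy terms accepted for backward compatibility (assumed pairing by name).
LEGACY_TERMS = {
    "latin-1": "ISO_IR 100",
    "latin-2": "ISO_IR 101",
    "latin-3": "ISO_IR 109",
    "latin-4": "ISO_IR 110",
    "latin-5": "ISO_IR 148",
    "cyrillic": "ISO_IR 144",
    "arabic": "ISO_IR 127",
    "greek": "ISO_IR 126",
    "hebrew": "ISO_IR 138",
}


def xml_encoding_for(charset):
    """Return the XML encoding for a charset term, or None if no mapping exists."""
    term = LEGACY_TERMS.get(charset, charset)
    return XML_ENCODING.get(term)


print(xml_encoding_for("ISO_IR 148"), xml_encoding_for("greek"))
```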

       Multiple	 character  sets  using	 code  extension  techniques  are  not
       supported. If needed, option --convert-to-utf8 can be used  to  convert
       the DICOM file or data set to UTF-8 encoding prior to the conversion to
       XML format. This	is also	useful for DICOMDIR files where	each directory
       record can have a different character set.

       If no mapping is	defined	and option --convert-to-utf8 is	not used, non-
       ASCII characters	and those below	#32 are	stored as '&#nnn;' where 'nnn'
       refers  to  the	numeric	 character  code.  This	 might lead to invalid
       character entity	references (such as '&#27;' for	ESC)  and  will	 cause
       most XML	parsers	to reject the document.
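
       Why such output is rejected can be demonstrated with a short Python
       sketch. The function to_char_refs is a hypothetical, simplified
       stand-in for the replacement described above (it does not escape
       markup characters such as '<' or '&'); the check uses the standard
       library XML parser.

```python
import xml.etree.ElementTree as ET


def to_char_refs(text):
    """Replace characters below #32 and non-ASCII ones by '&#nnn;' (illustrative only)."""
    return "".join(c if 32 <= ord(c) < 128 else "&#%d;" % ord(c) for c in text)


def parses(xml_text):
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


accented = "<v>" + to_char_refs("caf\u00e9") + "</v>"   # non-ASCII character
escape = "<v>" + to_char_refs("a\x1bb") + "</v>"        # ESC control character
print(accented, parses(accented))
print(escape, parses(escape))
```

       The reference to the accented character is accepted, whereas the
       reference to ESC is not a legal XML 1.0 character and makes the
       parser fail.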

LOGGING
       The level of logging output of the various command line tools and
       underlying libraries can be specified by the user. By default, only
       errors and warnings are written to the standard error stream. With
       option --verbose, informational messages such as processing details
       are also reported. Option --debug can be used to get more details on
       the internal activity, e.g. for debugging purposes. Other logging
       levels can be selected using option --log-level. In --quiet mode only
       fatal errors are reported. In such very severe error events, the
       application will usually terminate. For more details on the different
       logging levels, see documentation of module 'oflog'.

       If the logging output should be written to file (optionally with
       logfile rotation), to syslog (Unix) or the event log (Windows), option
       --log-config can be used. This configuration file also allows for
       directing  only	certain	messages to a particular output	stream and for
       filtering certain messages based	on the	module	or  application	 where
       they  are  generated.  An  example  configuration  file	is provided in

COMMAND LINE
       All command line tools use the following notation for parameters:
       square  brackets	 enclose  optional  values  (0-1), three trailing dots
       indicate	that multiple values are allowed (1-n),	a combination of  both
       means 0 to n values.

       Command line options are distinguished from parameters by a leading '+'
       or '-' sign, respectively. Usually, order and position of command line
       options are arbitrary (i.e. they can appear anywhere). However, if
       options are mutually exclusive the rightmost appearance is used. This
       behavior conforms to the standard evaluation rules of common Unix
       shells.
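
       The "rightmost appearance is used" rule can be pictured with a few
       lines of Python. This is a sketch of the rule as stated, not the
       parser used by the DCMTK tools; the function name effective_option is
       hypothetical, and the example group is the set of binary encoding
       options listed earlier.

```python
def effective_option(args, exclusive_group):
    """Return the last option from a mutually exclusive group, or None."""
    chosen = None
    for arg in args:
        if arg in exclusive_group:
            chosen = arg  # a later appearance overrides an earlier one
    return chosen


ENCODING_OPTIONS = {"+Eh", "+Eu", "+Eb"}
print(effective_option(["+Eh", "in.dcm", "+Eb"], ENCODING_OPTIONS))
```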

       In addition, one	or more	command	files can be specified	using  an  '@'
       sign  as	 a  prefix to the filename (e.g. @command.txt).	Such a command
       argument	is replaced by the content  of	the  corresponding  text  file
       (multiple  whitespaces  are  treated  as	a single separator unless they
       appear between two quotation marks) prior to  any  further  evaluation.
       Please  note  that  a command file cannot contain another command file.
       This simple but effective  approach  allows  one	 to  summarize	common
       combinations  of	 options/parameters  and  avoids longish and confusing
       command lines (an example is provided in	file _datadir_/dumppat.txt).
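
       The expansion of command files can be approximated in Python as shown
       below. The function name expand_command_files and the file content are
       hypothetical, and shlex.split() is used as a stand-in for the
       tokenizer: it honors quotation marks and collapses whitespace, but it
       also interprets backslashes, which the real tools may not do.

```python
import os
import shlex
import tempfile


def expand_command_files(args):
    """Replace each '@file' argument by the whitespace-separated content of that file."""
    expanded = []
    for arg in args:
        if arg.startswith("@"):
            with open(arg[1:], encoding="utf-8") as handle:
                # Tokens read from the file are not expanded again (no nesting).
                expanded.extend(shlex.split(handle.read()))
        else:
            expanded.append(arg)
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "command.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('+Wb   "my file.dcm"\n-nat\n')
    result = expand_command_files(["@" + path, "out.xml"])

print(result)
```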

ENVIRONMENT
       The dcm2xml utility will attempt to load DICOM data dictionaries
       specified  in the DCMDICTPATH environment variable. By default, i.e. if
       the  DCMDICTPATH	 environment   variable	  is   not   set,   the	  file
       _datadir_/dicom.dic  will be loaded unless the dictionary is built into
       the application (default	for Windows).

       The  default  behavior  should  be  preferred   and   the   DCMDICTPATH
       environment  variable  only used	when alternative data dictionaries are
       required. The DCMDICTPATH environment variable has the same  format  as
       the  Unix  shell	PATH variable in that a	colon (':') separates entries.
       On Windows systems, a semicolon (';') is	used as	a separator. The  data
       dictionary  code	 will  attempt	to  load  each	file  specified	in the
       DCMDICTPATH environment variable. It is an error	if no data  dictionary
       can be loaded.
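
       The format of DCMDICTPATH can be illustrated with a small Python
       sketch that splits the value into its entries. The function name
       dictionary_paths and the example paths are hypothetical; the sketch
       only shows the separator rule and does not load any dictionary.

```python
def dictionary_paths(value, windows=False):
    """Split a DCMDICTPATH value into entries (';' on Windows, ':' elsewhere)."""
    separator = ";" if windows else ":"
    return [entry for entry in value.split(separator) if entry]


# Example values are placeholders, not default locations.
print(dictionary_paths("/opt/dicts/dicom.dic:/opt/dicts/private.dic"))
print(dictionary_paths("C:\\dicts\\dicom.dic;D:\\dicts\\private.dic", windows=True))
```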

FILES
       _datadir_/dcm2xml.dtd - Document Type Definition (DTD) file

SEE ALSO
       xml2dcm(1), dcmconv(1)

COPYRIGHT
       Copyright (C) 2002-2021 by OFFIS e.V., Escherweg 2, 26121 Oldenburg,
       Germany.

Version	3.6.6			Thu Jan	14 2021			    dcm2xml(1)

