bgzip(1)		     Bioinformatics tools		      bgzip(1)

       bgzip - Block compression/decompression utility

       bgzip [-cdfhir] [-b virtualOffset] [-I index_name] [-l compression_level]
       [-s size] [-@ threads] [file]

       Bgzip compresses files in a similar manner to, and compatible with,
       gzip(1). The file is compressed into a series of small (less than 64K)
       'BGZF' blocks. This allows indexes to be built against the compressed
       file and used to retrieve portions of the data without having to
       decompress the entire file.

       If no files are specified on the command line, bgzip will compress (or
       decompress if the -d option is used) standard input to standard output.
       If a file is specified, it will be compressed (or decompressed with
       -d). If the -c option is used, the result will be written to standard
       output; otherwise, when compressing, bgzip will write to a new file
       with a .gz suffix and remove the original. When decompressing, the
       input file must have a .gz suffix, which will be removed to make the
       output name. Again, after decompression completes, the input file will
       be removed.

       -b, --offset INT
		 Decompress to standard	 output	 from  virtual	file  position
		 (0-based uncompressed offset).	 Implies -c and	-d.

       -c, --stdout
		 Write to standard output, keep	original files unchanged.

       -d, --decompress
		 Decompress.

       -f, --force
		 Overwrite  files  without  asking,  or	 decompress files that
		 don't have a known compression	filename extension (e.g., .gz)
		 without asking.  Use --force twice to do both without asking.

       -h, --help
		 Displays a help message.

       -i, --index
		 Create	 a BGZF	index while compressing.  Unless the -I	option
		 is used, this will have the name of the compressed file  with
		 .gzi appended to it.

       -I, --index-name	FILE
		 Index file name.

       -l, --compress-level INT
		 Compression  level  to	use when compressing.  From 0 to 9, or
		 -1 for	the default level set by the compression library. [-1]

       -r, --reindex
		 Rebuild the index on an existing compressed file.

       -g, --rebgzip
		 Try to use an existing index to create a compressed file with
		 matching block offsets. Note that this assumes that the same
		 compression library and level are in use as when making the
		 original file. Don't use it unless you know what you're doing.

       -s, --size INT
		 Decompress INT	bytes (uncompressed size) to standard  output.
		 Implies -c.

       -@, --threads INT
		 Number	of threads to use [1].

       The  BGZF format	written	by bgzip is described in the SAM format	speci-
       fication	available from

       It makes use of a gzip feature which allows compressed files to be
       concatenated. The input data is divided into blocks which are no larger
       than 64 kilobytes both before and after compression (including
       compression headers). Each block is compressed into a gzip file. The
       gzip header includes an extra sub-field with identifier 'BC' and the
       length of the compressed block, including all headers.
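
       The block layout described above can be sketched in Python. This is an
       illustration only, not htslib's implementation: the function names
       make_bgzf_block and bgzf_block_size are hypothetical, and the byte
       layout is taken from the SAM specification, which stores the 'BC' value
       as the total block size minus one.

```python
import gzip
import struct
import zlib


def make_bgzf_block(data: bytes, level: int = 6) -> bytes:
    """Wrap `data` in one gzip member carrying a 'BC' extra sub-field."""
    # Raw deflate stream (negative wbits means no zlib/gzip wrapper).
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    cdata = comp.compress(data) + comp.flush()
    total = 12 + 6 + len(cdata) + 8  # header + extra field + data + trailer
    if len(data) > 65536 or total > 65536:
        raise ValueError("block would exceed 64 kilobytes")
    # ID1, ID2, CM=deflate, FLG=FEXTRA, MTIME, XFL, OS=unknown, XLEN
    header = struct.pack("<BBBBIBBH", 31, 139, 8, 4, 0, 0, 255, 6)
    # Sub-field 'B','C' with a 2-byte payload: total block size minus one.
    extra = struct.pack("<BBHH", ord("B"), ord("C"), 2, total - 1)
    # Standard gzip trailer: CRC32 and length of the uncompressed data.
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + extra + cdata + trailer


def bgzf_block_size(buf: bytes, pos: int = 0) -> int:
    """Return the total size of the block starting at `pos`."""
    id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack_from(
        "<BBBBIBBH", buf, pos
    )
    if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
        raise ValueError("not a gzip member with an extra field")
    p = pos + 12
    end = p + xlen
    # Walk the extra sub-fields until the 'BC' entry is found.
    while p + 4 <= end:
        si1, si2, slen = struct.unpack_from("<BBH", buf, p)
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            (bsize,) = struct.unpack_from("<H", buf, p + 4)
            return bsize + 1
        p += 4 + slen
    raise ValueError("no 'BC' sub-field found")


# Two blocks back to back still form a valid gzip stream.
stream = make_bgzf_block(b"hello ") + make_bgzf_block(b"world\n")
print(gzip.decompress(stream))
print(bgzf_block_size(stream))
```

       Because each block is an ordinary gzip member, a standard gzip reader
       can decode the concatenation, while a BGZF-aware reader can use the
       'BC' value to jump from one block to the next without inflating it.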

       The index format is a binary file listing pairs of compressed and
       uncompressed offsets in a BGZF file. Each compressed offset points to
       the start of a BGZF block. The uncompressed offset is the corresponding
       location in the uncompressed data stream.

       All values are stored as	little-endian 64-bit unsigned integers.

       The file	contents are:

	   uint64_t number_entries

       followed	by number_entries pairs	of:

	   uint64_t compressed_offset
	   uint64_t uncompressed_offset
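
       The index layout above could be read with a short Python sketch such
       as the following. The names read_gzi and find_block are hypothetical.
       The lookup also assumes that the first block, at compressed and
       uncompressed offset 0, may be left out of the file and is implied;
       that is an assumption of this sketch, not something stated above.

```python
import bisect
import io
import struct


def read_gzi(stream):
    """Parse an index stream into (compressed, uncompressed) offset pairs."""
    (count,) = struct.unpack("<Q", stream.read(8))
    payload = stream.read(16 * count)
    if len(payload) != 16 * count:
        raise ValueError("truncated index")
    # Each entry is two little-endian 64-bit unsigned integers.
    return [struct.unpack_from("<QQ", payload, 16 * i) for i in range(count)]


def find_block(entries, uoffset):
    """Map an uncompressed offset to (block start, offset inside block)."""
    if uoffset < 0:
        raise ValueError("offset must be non-negative")
    # Assumption: add the implicit first block if the index omits it.
    table = list(entries)
    if not table or table[0] != (0, 0):
        table.insert(0, (0, 0))
    starts = [u for _, u in table]
    # Last entry whose uncompressed offset is <= the requested offset.
    i = bisect.bisect_right(starts, uoffset) - 1
    coffset, ustart = table[i]
    return coffset, uoffset - ustart


# A made-up two-entry index, built in memory for illustration.
raw = struct.pack("<Q", 2)
raw += struct.pack("<QQ", 18000, 65280)
raw += struct.pack("<QQ", 35000, 130560)
entries = read_gzi(io.BytesIO(raw))
print(find_block(entries, 70000))
```

       A reader would then seek to the returned compressed offset, inflate
       that one block, and skip the returned number of bytes inside it.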

	   # Compress stdin to stdout
	   bgzip < /usr/share/dict/words > /tmp/words.gz

	   # Make a .gzi index
	   bgzip -r /tmp/words.gz

	   # Extract part of the data using the	index
	   bgzip -b 367635 -s 4	/tmp/words.gz

	   # Uncompress	the whole file,	removing the compressed	copy
	   bgzip -d /tmp/words.gz

       The BGZF library was originally implemented by Bob Handsaker and
       modified by Heng Li for remote file access and in-memory caching.

       gzip(1),	tabix(1)

htslib-1.11		       22 September 2020		      bgzip(1)

