samtools-view(1)	     Bioinformatics tools	      samtools-view(1)

       samtools	view - views and converts SAM/BAM/CRAM files

       samtools view [options] in.sam|in.bam|in.cram [region...]

       With  no	 options  or  regions  specified, prints all alignments	in the
       specified input alignment file (in SAM, BAM, or CRAM format)  to	 stan-
       dard output in SAM format (with no header).

       You may specify one or more space-separated region specifications after
       the input filename to restrict output to	only  those  alignments	 which
       overlap	the specified region(s). Use of	region specifications requires
       a coordinate-sorted and indexed input file (in BAM or CRAM format).

       The -b, -C, -1, -u, -h, -H, and -c options  change  the	output	format
       from  the  default of headerless	SAM, and the -o	and -U options set the
       output file name(s).

       The -t and -T options provide additional	reference data.	One  of	 these
       two  options  is	 required when SAM input does not contain @SQ headers,
       and the -T option is required whenever writing CRAM output.

       The -L, -M, -r, -R, -d, -D, -s, -q, -l, -m, -f, -F, and -G options fil-
       ter  the	 alignments  that will be included in the output to only those
       alignments that match certain criteria.

       The -x and -B options modify the data which is contained in each
       alignment.

       The -X option lets the user specify the location of customized index
       file(s), for use when the data folder does not contain an index file.
       See the EXAMPLES section for sample usage.

       Finally,	the -@ option can be used to allocate additional threads to be
       used for	compression, and the -?	 option	requests a long	help message.

	      Regions can be specified as: RNAME[:STARTPOS[-ENDPOS]]  and  all
	      position coordinates are 1-based.

	      Important	note: when multiple regions are	given, some alignments
	      may be output multiple times if they overlap more	 than  one  of
	      the specified regions.

	      Examples of region specifications:

	      chr1	Output all alignments mapped to	the reference sequence
			named `chr1' (i.e. @SQ SN:chr1).

              chr2:1000000
                        The region on chr2 beginning at base position
                        1,000,000 and ending at the end of the chromosome.

              chr3:1000-2000
                        The 1001bp region on chr3 beginning at base position
                        1,000 and ending at base position 2,000 (including
                        both end positions).

	      '*'	Output	the  unmapped  reads  at  the end of the file.
			(This does not include any unmapped reads placed on  a
			reference sequence alongside their mapped mates.)

	      .		Output	all  alignments.   (Mostly  unnecessary	as not
			specifying a region at all has the same	effect.)
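
       The region syntax above can be illustrated with a small Python
       sketch. It is not the parser samtools uses: parse_region and
       region_length are hypothetical helpers, and the sketch assumes
       reference names that contain no ':' character.

```python
from typing import Optional, Tuple


def parse_region(spec: str) -> Tuple[str, Optional[int], Optional[int]]:
    """Split RNAME[:STARTPOS[-ENDPOS]] into (rname, start, end).

    Positions are 1-based and inclusive; None means "not given".
    Assumption: the reference name itself contains no ':'.
    """
    rname, colon, span = spec.partition(":")
    if not colon:
        return rname, None, None          # whole reference, e.g. "chr1"
    start_text, dash, end_text = span.partition("-")
    start = int(start_text)
    end = int(end_text) if dash else None  # open-ended, e.g. "chr2:1000000"
    return rname, start, end


def region_length(start: int, end: int) -> int:
    """Number of bases in a 1-based region that includes both ends."""
    return end - start + 1
```

       Because both end positions are included, a region from 1000 to
       2000 spans 1001 bases, which is why the chr3 example above is
       described as a 1001bp region.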

       -b	 Output	in the BAM format.

       -C	 Output	in the CRAM format (requires -T).

       -1	 Enable	fast BAM compression (implies -b).

       -u	 Output	uncompressed BAM. This option saves time spent on com-
		 pression/decompression	 and is	thus preferred when the	output
		 is piped to another samtools command.

       -h	 Include the header in the output.

       -H	 Output	the header only.

       -c	 Instead of printing the alignments, only count	them and print
		 the total number. All filter options, such as -f, -F, and -q,
		 are taken into	account.

       -?	 Output	long help and exit immediately.

       -o FILE	 Output	to FILE	[stdout].

       -U FILE   Write alignments that are not selected by the various filter
                 options to FILE. When this option is used, all alignments
                 (or all alignments intersecting the regions specified) are
                 written to either the output file or this file, but never
                 both.

       -t FILE	 A tab-delimited FILE.	Each line must contain	the  reference
		 name  in  the first column and	the length of the reference in
		 the second column, with one line for each distinct reference.
		 Any  additional  fields beyond	the second column are ignored.
		 This file also	defines	the order of the  reference  sequences
                 in sorting. If you run `samtools faidx <ref.fa>', the
                 resulting index file <ref.fa>.fai can be used as this FILE.
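
                 As a rough illustration of the file layout, the Python
                 sketch below turns such a table into the @SQ header lines
                 (SN and LN fields) that it stands in for. The function
                 name sq_lines is hypothetical and this is not samtools'
                 own code.

```python
from typing import Iterable, List


def sq_lines(table: Iterable[str]) -> List[str]:
    """Build @SQ header lines from a tab-delimited name/length table.

    Only the first two columns are used; any further columns (such as the
    extra fields of a .fai index) are ignored. Input order is preserved.
    """
    lines = []
    for row in table:
        row = row.rstrip("\n")
        if not row:
            continue                      # skip blank lines
        fields = row.split("\t")
        name, length = fields[0], int(fields[1])
        lines.append(f"@SQ\tSN:{name}\tLN:{length}")
    return lines
```

                 The output order follows the input order, which matches
                 the statement above that this file also defines the order
                 of the reference sequences.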

       -T FILE	 A FASTA format	reference FILE,	optionally compressed by bgzip
		 and  ideally  indexed	by samtools faidx.  If an index	is not
		 present one will be generated for you,	if the reference  file
		 is local.

		 If  the  reference file is not	local, but is accessed instead
		 via an	https://, s3://	or other URL, the index	file will need
		 to  be	supplied by the	server alongside the reference.	 It is
                 possible to have the reference and index files in different
                 locations by supplying both to this option separated by the
                 string "##idx##", with the reference location first and the
                 index location second.

		 However, note that only the location of the reference will be
		 stored	 in the	output file header.  If	this method is used to
		 make CRAM files, the cram reader may not be able to find  the
		 index,	 and  may not be able to decode	the file unless	it can
		 get the references it needs using a different method.
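
                 The "##idx##" convention can be pictured as a simple
                 string split. The sketch below is illustrative only:
                 split_reference_arg is a hypothetical helper and the file
                 names in the usage note are placeholders.

```python
from typing import Optional, Tuple

SEPARATOR = "##idx##"


def split_reference_arg(arg: str) -> Tuple[str, Optional[str]]:
    """Split a -T style argument into (reference, index).

    The index is None when no separator is present, in which case the
    index is expected alongside the reference.
    """
    reference, sep, index = arg.partition(SEPARATOR)
    return reference, (index if sep else None)
```

                 For instance, split_reference_arg("ref.fa##idx##idx/ref.fa.fai")
                 gives the reference "ref.fa" and the index
                 "idx/ref.fa.fai".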

       -L FILE	 Only output alignments	overlapping the	input BED FILE [null].

       -M	 Use the multi-region iterator on the union of a BED file  and
		 command-line  region  arguments.   This avoids	re-reading the
		 same regions of files so can sometimes	be much	faster.	  Note
		 this  also  removes  duplicate	sequences.  Without this a se-
		 quence	that overlaps multiple regions specified on  the  com-
		 mand  line  will  be reported multiple	times.	The usage of a
                 BED file is optional and its path has to be preceded by the
                 -L option.

       -r STR	 Output	 alignments  in	 read  group  STR  [null].   Note that
		 records with no RG tag	will also be output  when  using  this
		 option.  This behaviour may change in a future	release.

       -R FILE	 Output	alignments in read groups listed in FILE [null].  Note
		 that records with no RG tag will also be  output  when	 using
		 this option.  This behaviour may change in a future release.

       -d STR:STR
                 Only output alignments with tag STR and associated value STR
                 [null].

       -D STR:FILE
		 Only output alignments	with tag  STR  and  associated	values
		 listed	in FILE	[null].

       -q INT	 Skip alignments with MAPQ smaller than	INT [0].

       -l STR	 Only output alignments	in library STR [null].

       -m INT	 Only  output  alignments with number of CIGAR bases consuming
		 query sequence	>= INT [0]
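
                 To make "CIGAR bases consuming query sequence" concrete,
                 here is a minimal Python sketch. It relies on the SAM
                 specification's rule that the M, I, S, = and X operations
                 consume query bases; the helper names are hypothetical
                 and this is not the samtools implementation.

```python
import re

# CIGAR operations that consume bases of the query (read) sequence.
QUERY_CONSUMING = set("MIS=X")
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def query_bases(cigar: str) -> int:
    """Sum the lengths of all query-consuming operations in a CIGAR string."""
    return sum(int(length) for length, op in _CIGAR_OP.findall(cigar)
               if op in QUERY_CONSUMING)


def passes_min_query_length(cigar: str, min_len: int) -> bool:
    """Model of the -m test: keep the record if query bases >= min_len."""
    return query_bases(cigar) >= min_len
```

                 In this sketch hard clips (H) and deletions (D) do not
                 count, so "10H40M" contributes 40 bases, while soft clips
                 (S) and insertions (I) do count.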

       -f INT	 Only output alignments	with all bits set in  INT  present  in
		 the  FLAG  field.   INT  can be specified in hex by beginning
		 with `0x' (i.e. /^0x[0-9A-F]+/) or in octal by	beginning with
		 `0' (i.e. /^0[0-7]+/) [0].

       -F INT	 Do  not output	alignments with	any bits set in	INT present in
		 the FLAG field.  INT can be specified	in  hex	 by  beginning
		 with `0x' (i.e. /^0x[0-9A-F]+/) or in octal by	beginning with
		 `0' (i.e. /^0[0-7]+/) [0].

       -G INT    Do not output alignments with all bits set in INT present in
                 the FLAG field. This is the complement of -f: each alignment
                 is selected by exactly one of -f12 and -G12, so the two
                 outputs taken together cover the whole input. INT can be
                 specified in hex by beginning with `0x' (i.e.
                 /^0x[0-9A-F]+/) or in octal by beginning with `0' (i.e.
                 /^0[0-7]+/) [0].
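
                 The three FLAG filters can be summarised as bit tests.
                 The Python sketch below is a model of the behaviour
                 described for -f, -F and -G, not samtools source code;
                 parse_flag and keep are hypothetical names.

```python
def parse_flag(text: str) -> int:
    """Parse INT as decimal, hex (leading 0x) or octal (leading 0)."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text)


def keep(flag: int, require_all: int = 0, exclude_any: int = 0,
         exclude_all: int = 0) -> bool:
    """Return True if an alignment with this FLAG passes the filters.

    require_all models -f, exclude_any models -F, exclude_all models -G.
    """
    if (flag & require_all) != require_all:
        return False                      # -f: some required bit is missing
    if flag & exclude_any:
        return False                      # -F: at least one excluded bit set
    if exclude_all and (flag & exclude_all) == exclude_all:
        return False                      # -G: every listed bit is set
    return True
```

                 With a mask of 12 (read unmapped plus mate unmapped),
                 keep(flag, require_all=12) and keep(flag, exclude_all=12)
                 never agree, which is the sense in which -G is the
                 opposite of -f.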

       -x STR	 Read tag to exclude from output (repeatable) [null]

       -B	 Collapse the backward CIGAR operation.

       -s FLOAT	 Output	only a proportion of the input alignments.  This  sub-
		 sampling acts in the same way on all of the alignment records
		 in the	same template or read pair, so it never	keeps  a  read
		 but not its mate.

		 The  integer  and  fractional parts of	the -s INT.FRAC	option
		 are used separately: the part after the  decimal  point  sets
		 the fraction of templates/pairs to be kept, while the integer
		 part is used as a seed	that influences	which subset of	 reads
		 is kept.

		 When subsampling data that has	previously been	subsampled, be
		 sure to use a different seed value  from  those  used	previ-
		 ously;	otherwise more reads will be retained than expected.
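
                 The split of INT.FRAC into a seed and a fraction, and the
                 reason mates stay together, can be sketched in Python as
                 follows. This is only a model: the CRC32 hash is a
                 stand-in chosen for the example, not the hash samtools
                 uses, and both function names are hypothetical.

```python
import zlib
from typing import Tuple


def split_subsample_arg(arg: str) -> Tuple[int, float]:
    """Split an INT.FRAC string into (seed, fraction)."""
    int_part, _, frac_part = arg.partition(".")
    seed = int(int_part) if int_part else 0
    fraction = float("0." + frac_part) if frac_part else 0.0
    return seed, fraction


def keep_template(qname: str, seed: int, fraction: float) -> bool:
    """Decide from the read name and seed alone whether to keep a template.

    Every record of a template shares its name, so all of them get the
    same decision. Stand-in hash: CRC32 of "seed:qname".
    """
    digest = zlib.crc32(f"{seed}:{qname}".encode())
    return (digest & 0xFFFFFF) / 0x1000000 < fraction
```

                 In this model the decision depends only on the name and
                 the seed, so reusing a seed on already subsampled data
                 keeps every read that survived the first pass, which is
                 why a different seed is recommended above.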

       -@ INT	 Number	 of BAM	compression threads to use in addition to main
		 thread	[0].

       -S	 Ignored for compatibility with	 previous  samtools  versions.
		 Previously  this option was required if input was in SAM for-
		 mat, but now the correct format is automatically detected  by
		 examining the first few characters of input.

       -X	 Include customized index file as a part of arguments. See EX-
		 AMPLES	section	for sample of usage.

       --no-PG	 Do not	add a @PG line to the header of	the output file.

       o Import	SAM to BAM when	@SQ lines are present in the header:

	   samtools view -bS aln.sam > aln.bam

	 If @SQ	lines are absent:

	   samtools faidx ref.fa
	   samtools view -bt ref.fa.fai	aln.sam	> aln.bam

	 where ref.fa.fai is generated automatically by	the faidx command.

       o Convert a BAM file to a CRAM file using a local reference sequence.

	   samtools view -C -T ref.fa aln.bam >	aln.cram

       o Convert a BAM file to a CRAM with NM  and  MD	tags  stored  verbatim
	 rather	 than calculating on the fly during CRAM decode, so that mixed
	 data sets with	MD/NM only on some records,  or	 NM  calculated	 using
	 different  definitions	 of  mismatch,	can be decoded without change.
	 The second command demonstrates how to	decode such a file.   The  re-
	 quest to not decode MD	here is	turning	off auto-generation of both MD
	 and NM; it will still emit the	MD/NM tags on records that  had	 these
	 stored	verbatim.

	   samtools view -C --output-fmt-option	store_md=1 --output-fmt-option store_nm=1 -o aln.cram aln.bam
	   samtools view --input-fmt-option decode_md=0 -o aln.new.bam aln.cram

       o An alternative	way of achieving the above is listing multiple options
	 after the --output-fmt	or -O option.  The commands below are  equiva-
	 lent to the two above.

	   samtools view -O cram,store_md=1,store_nm=1 -o aln.cram aln.bam
	   samtools view --input-fmt cram,decode_md=0 -o aln.new.bam aln.cram

       o Include customized index file as a part of arguments.

	   samtools view [options] -X /data_folder/data.bam /index_folder/data.bai chrM:1-10

       o Output	 alignments  in	 read  group grp2 (records with	no RG tag will
	 also be in the	output).

	   samtools view -r grp2 -o /data_folder/data.rg2.bam /data_folder/data.bam

       o Only keep reads that have tag BC and whose barcode matches one of
	 the barcodes listed in the barcode file.

	   samtools view -D BC:barcodes.txt -o /data_folder/data.barcodes.bam /data_folder/data.bam

       o Only keep reads with tag RG and read group grp2. This does almost
	 the same as -r grp2 but will not keep records without the RG tag.

	   samtools view -d RG:grp2 -o /data_folder/data.rg2_only.bam /data_folder/data.bam

       Written by Heng Li from the Sanger Institute.

       samtools(1), samtools-tview(1), sam(5)

       Samtools	website: <>

samtools-1.11		       22 September 2020	      samtools-view(1)

