samtools-view(1)	     Bioinformatics tools	      samtools-view(1)

       samtools	view - views and converts SAM/BAM/CRAM files

       samtools view [options] in.sam|in.bam|in.cram [region...]

       With  no	 options  or  regions  specified, prints all alignments	in the
       specified input alignment file (in SAM, BAM, or CRAM format)  to	 stan-
       dard output in SAM format (with no header).

       You may specify one or more space-separated region specifications after
       the input filename to restrict output to	only  those  alignments	 which
       overlap	the specified region(s). Use of	region specifications requires
       a coordinate-sorted and indexed input file (in BAM or CRAM format).

       The -b, -C, -1, -u, -h, -H, and -c options  change  the	output	format
       from  the  default of headerless	SAM, and the -o	and -U options set the
       output file name(s).

       The -t and -T options provide additional	reference data.	One  of	 these
       two  options  is	 required when SAM input does not contain @SQ headers,
       and the -T option is required whenever writing CRAM output.

       The -L, -M, -N, -r, -R, -d, -D, -s, -q, -l, -m, -f, -F, and -G  options
       filter the alignments that will be included in the output to only those
       alignments that match certain criteria.

       The -x and -B options modify the data contained in each alignment.

       The -X option lets the user specify the location of one or more index
       files when the data folder does not contain an index file.  See the
       EXAMPLES section for sample usage.

       Finally,	the -@ option can be used to allocate additional threads to be
       used for	compression, and the -?	 option	requests a long	help message.

	      Regions can be specified as: RNAME[:STARTPOS[-ENDPOS]]  and  all
	      position coordinates are 1-based.

	      Important	note: when multiple regions are	given, some alignments
	      may be output multiple times if they overlap more	 than  one  of
	      the specified regions.

	      Examples of region specifications:

	      chr1	Output all alignments mapped to	the reference sequence
			named `chr1' (i.e. @SQ SN:chr1).

	      chr2:1000000
			The region on chr2 beginning at base position
			1,000,000 and ending at the end of the chromosome.

	      chr3:1000-2000
			The 1001bp region on chr3 beginning at base position
			1,000 and ending at base position 2,000 (including
			both end positions).

	      '*'	Output	the  unmapped  reads  at  the end of the file.
			(This does not include any unmapped reads placed on  a
			reference sequence alongside their mapped mates.)

	      .		Output	all  alignments.   (Mostly  unnecessary	as not
			specifying a region at all has the same	effect.)
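
       As a rough illustration of the inclusive, 1-based coordinates described
       above, the following shell sketch computes how many bases a region of
       the form RNAME:STARTPOS-ENDPOS spans.  The helper name region_length is
       hypothetical and not part of samtools; it assumes that the reference
       name contains no colon and that both positions are given.

```shell
# region_length REGION
# Prints the number of bases covered by RNAME:STARTPOS-ENDPOS, counting
# both end positions (coordinates are 1-based and inclusive).
# Assumption: RNAME contains no ':' and both positions are present.
region_length() {
    range=${1#*:}        # strip "RNAME:"      -> "1000-2000"
    start=${range%-*}    # text before the '-' -> "1000"
    end=${range#*-}      # text after the '-'  -> "2000"
    echo $(( end - start + 1 ))
}

region_length chr3:1000-2000   # prints 1001
```

       The "+ 1" is what makes chr3:1000-2000 a 1001bp region rather than a
       1000bp one.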

       -b	 Output	in the BAM format.

       -C	 Output	in the CRAM format (requires -T).

       -1	 Enable	fast BAM compression (implies -b).

       -u	 Output	uncompressed BAM. This option saves time spent on com-
		 pression/decompression	 and is	thus preferred when the	output
		 is piped to another samtools command.

       -h	 Include the header in the output.

       -H	 Output	the header only.

       -c	 Instead of printing the alignments, only count	them and print
		 the total number. All filter options, such as -f, -F, and -q,
		 are taken into	account.

       -?	 Output	long help and exit immediately.

       -o FILE	 Output	to FILE	[stdout].

       -U FILE	 Write alignments that are not selected	by the various	filter
		 options  to  FILE.   When this	option is used,	all alignments
		 (or all alignments intersecting the  regions  specified)  are
		 written to either the output file or this file, but never
		 both.

       -t FILE	 A tab-delimited FILE.	Each line must contain	the  reference
		 name  in  the first column and	the length of the reference in
		 the second column, with one line for each distinct reference.
		 Any  additional  fields beyond	the second column are ignored.
		 This file also	defines	the order of the  reference  sequences
		 in sorting.  If you run: `samtools faidx <ref.fa>', the
		 resulting index file <ref.fa>.fai can be used as this FILE.
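
		 A minimal sketch of a sanity check for such a file, written
		 as a shell function around awk.  The name check_ref_list is
		 hypothetical, and the requirement that the length be a
		 positive integer is an assumption made for this sketch rather
		 than something stated above.

```shell
# check_ref_list FILE
# Succeeds if every line of FILE has a non-empty reference name in
# column 1 and a positive integer in column 2 (tab-delimited).
# Extra columns are accepted, since they are ignored by -t.
check_ref_list() {
    awk -F'\t' '
        $1 == "" || $2 !~ /^[0-9]+$/ || $2 + 0 <= 0 { bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

		 For example, `check_ref_list ref.fa.fai && samtools view -bt
		 ref.fa.fai aln.sam > aln.bam' would only attempt the
		 conversion when the list looks well formed.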

       -T FILE	 A FASTA format	reference FILE,	optionally compressed by bgzip
		 and  ideally  indexed	by samtools faidx.  If an index	is not
		 present one will be generated for you,	if the reference  file
		 is local.

		 If  the  reference file is not	local, but is accessed instead
		 via an	https://, s3://	or other URL, the index	file will need
		 to  be	supplied by the	server alongside the reference.	 It is
		 possible to have the reference	and index files	 in  different
		 locations  by	supplying both to this option separated	by the
		 string	"##idx##", for example:


		 However, note that only the location of the reference will be
		 stored	 in the	output file header.  If	this method is used to
		 make CRAM files, the cram reader may not be able to find  the
		 index,	 and  may not be able to decode	the file unless	it can
		 get the references it needs using a different method.

       -L FILE	 Only output alignments	overlapping the	input BED FILE [null].

       -M	 Use the multi-region iterator on the union of a BED file  and
		 command-line  region  arguments.   This avoids	re-reading the
		 same regions of files so can sometimes	be much	faster.	  Note
		 this  also  removes  duplicate	sequences.  Without this a se-
		 quence	that overlaps multiple regions specified on  the  com-
		 mand  line  will  be reported multiple	times.	The usage of a
		 BED file is optional and its path has to be preceded by the
		 -L option.

       -N FILE	 Output	only alignments	with read names	listed in FILE.

       -r STR	 Output	 alignments  in	 read  group  STR  [null].   Note that
		 records with no RG tag	will also be output  when  using  this
		 option.  This behaviour may change in a future	release.

       -R FILE	 Output	alignments in read groups listed in FILE [null].  Note
		 that records with no RG tag will also be  output  when	 using
		 this option.  This behaviour may change in a future release.

       -d STR1[:STR2]
		 Only  output  alignments  with	 tag STR1 and associated value
		 STR2, which can be a string or	an integer [null].  The	 value
		 can be	omitted, in which case only the	tag is considered.

       -D STR:FILE
		 Only  output  alignments  with	 tag STR and associated	values
		 listed	in FILE	[null].

       -q INT	 Skip alignments with MAPQ smaller than	INT [0].

       -l STR	 Only output alignments	in library STR [null].

       -m INT	 Only output alignments	with number of CIGAR  bases  consuming
		 query sequence	>= INT [0]
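
		 To make the quantity tested by -m concrete, the sketch below
		 sums the CIGAR operations that consume query sequence, which
		 in the SAM specification are M, I, S, = and X.  The function
		 name cigar_query_len is hypothetical; this is an
		 illustration of the arithmetic, not the code samtools runs.

```shell
# cigar_query_len CIGAR
# Prints the number of query bases implied by a CIGAR string by summing
# the lengths of M, I, S, = and X operations; D, N, H and P are skipped
# because they do not consume query sequence.
cigar_query_len() {
    echo "$1" | awk '{
        s = $0
        n = 0
        while (match(s, /^[0-9]+[MIDNSHP=X]/)) {
            len = substr(s, 1, RLENGTH - 1) + 0   # numeric part
            op  = substr(s, RLENGTH, 1)           # operation letter
            if (index("MIS=X", op) > 0) n += len
            s = substr(s, RLENGTH + 1)            # drop the consumed op
        }
        print n
    }'
}

cigar_query_len 10S90M   # prints 100
```

		 Under this reading, `-m 100' would keep a 10S90M alignment
		 but skip a 5M2D5M one, whose query-consuming length is 10.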

       -e STR	 Only include alignments that match the	filter expression STR.
		 The syntax for these expressions is described in the main
		 samtools(1) man page under the FILTER EXPRESSIONS heading.

       -f INT	 Only  output  alignments  with	all bits set in	INT present in
		 the FLAG field.  INT can be specified	in  hex	 by  beginning
		 with `0x' (i.e. /^0x[0-9A-F]+/) or in octal by	beginning with
		 `0' (i.e. /^0[0-7]+/) [0].

       -F INT	 Do not	output alignments with any bits	set in INT present  in
		 the  FLAG  field.   INT  can be specified in hex by beginning
		 with `0x' (i.e. /^0x[0-9A-F]+/) or in octal by	beginning with
		 `0' (i.e. /^0[0-7]+/) [0].

       -G INT	 Do  not output	alignments with	all bits set in	INT present in
		 the FLAG field.  This is the opposite of -f  such  that  -f12
		 -G12  is  the same as no filtering at all.  INT can be	speci-
		 fied in hex by	beginning with `0x' (i.e.  /^0x[0-9A-F]+/)  or
		 in octal by beginning with `0'	(i.e. /^0[0-7]+/) [0].
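
		 The three FLAG tests can be sketched with shell integer
		 arithmetic, which also accepts the `0x' and leading-`0'
		 notations mentioned above.  The function keep_record is a
		 hypothetical helper for illustration only; it treats a -G
		 value of 0 as "no filter", which is an assumption of this
		 sketch.

```shell
# keep_record FLAG REQUIRE EXCLUDE_ANY EXCLUDE_ALL
# Returns 0 (keep) or 1 (drop) for one FLAG value, where
#   REQUIRE     plays the role of -f (all of these bits must be set),
#   EXCLUDE_ANY plays the role of -F (none of these bits may be set),
#   EXCLUDE_ALL plays the role of -G (drop only if all bits are set).
keep_record() {
    flag=$(( $1 )); req=$(( $2 )); excl_any=$(( $3 )); excl_all=$(( $4 ))
    # -f: every bit in req must be present in flag
    [ $(( flag & req )) -eq "$req" ] || return 1
    # -F: no bit in excl_any may be present in flag
    [ $(( flag & excl_any )) -eq 0 ] || return 1
    # -G: drop only when excl_all is non-zero and all its bits are present
    if [ "$excl_all" -ne 0 ] && [ $(( flag & excl_all )) -eq "$excl_all" ]; then
        return 1
    fi
    return 0
}
```

		 With these definitions, a record with FLAG 69 (bits 1, 4 and
		 64) is dropped by `-F 12' because bit 4 is set, but kept by
		 `-G 12' because bit 8 is not.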

       -x STR	 Read tag to exclude from output (repeatable) [null]

       -B	 Collapse the backward CIGAR operation.

       -s FLOAT	 Output	 only a	proportion of the input	alignments.  This sub-
		 sampling acts in the same way on all of the alignment records
		 in  the  same template	or read	pair, so it never keeps	a read
		 but not its mate.

		 The integer and fractional parts of the  -s  INT.FRAC	option
		 are  used  separately:	 the part after	the decimal point sets
		 the fraction of templates/pairs to be kept, while the integer
		 part  is used as a seed that influences which subset of reads
		 is kept.

		 When subsampling data that has	previously been	subsampled, be
		 sure  to  use	a  different seed value	from those used	previ-
		 ously;	otherwise more reads will be retained than expected.
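
		 A small sketch of how an INT.FRAC value separates into its
		 two parts, using shell parameter expansion.  The function
		 name split_subsample is hypothetical, and the sketch assumes
		 the argument always contains a decimal point.

```shell
# split_subsample INT.FRAC
# Prints the seed (integer part) and the kept fraction (the part after
# the decimal point) of a -s argument.  An empty integer part, as in
# ".25", is reported as seed 0.
split_subsample() {
    seed=${1%%.*}    # text before the first '.'
    frac=${1#*.}     # text after the first '.'
    echo "seed=${seed:-0} fraction=0.${frac}"
}

split_subsample 42.1   # prints: seed=42 fraction=0.1
```

		 So `-s 42.1' and `-s 7.1' both keep about 10% of templates,
		 but with different seeds, and therefore different subsets.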

       -@ INT	 Number	of BAM compression threads to use in addition to  main
		 thread	[0].

       -S	 Ignored  for  compatibility  with previous samtools versions.
		 Previously this option	was required if	input was in SAM  for-
		 mat,  but now the correct format is automatically detected by
		 examining the first few characters of input.

       -X	 Include customized index file as a part of arguments. See EX-
		 AMPLES	section	for sample of usage.

       --no-PG	 Do not	add a @PG line to the header of	the output file.

       o Import	SAM to BAM when	@SQ lines are present in the header:

	   samtools view -bS aln.sam > aln.bam

	 If @SQ	lines are absent:

	   samtools faidx ref.fa
	   samtools view -bt ref.fa.fai	aln.sam	> aln.bam

	 where ref.fa.fai is generated automatically by	the faidx command.

       o Convert a BAM file to a CRAM file using a local reference sequence.

	   samtools view -C -T ref.fa aln.bam >	aln.cram

       o Convert  a  BAM  file	to  a CRAM with	NM and MD tags stored verbatim
	 rather	than calculating on the	fly during CRAM	decode,	so that	 mixed
	 data  sets  with  MD/NM  only on some records,	or NM calculated using
	 different definitions of mismatch, can	 be  decoded  without  change.
	 The  second  command demonstrates how to decode such a	file.  The re-
	 quest to not decode MD	here is	turning	off auto-generation of both MD
	 and  NM;  it will still emit the MD/NM	tags on	records	that had these
	 stored	verbatim.

	   samtools view -C --output-fmt-option	store_md=1 --output-fmt-option store_nm=1 -o aln.cram aln.bam
	   samtools view --input-fmt-option decode_md=0 -o aln.new.bam aln.cram

       o An alternative	way of achieving the above is listing multiple options
	 after	the --output-fmt or -O option.	The commands below are equiva-
	 lent to the two above.

	   samtools view -O cram,store_md=1,store_nm=1 -o aln.cram aln.bam
	   samtools view --input-fmt cram,decode_md=0 -o aln.new.bam aln.cram

       o Include customized index file as a part of arguments.

	   samtools view [options] -X /data_folder/data.bam /index_folder/data.bai chrM:1-10

       o Output	alignments in read group grp2 (records with  no	 RG  tag  will
	 also be in the	output).

	   samtools view -r grp2 -o /data_folder/data.rg2.bam /data_folder/data.bam

       o Only keep reads that have tag BC and whose barcode matches one of the
	 barcodes listed in the barcode file.

	   samtools view -D BC:barcodes.txt -o /data_folder/data.barcodes.bam /data_folder/data.bam

       o Only keep reads with tag RG and read group grp2.  This does almost
	 the same as -r grp2 but will not keep records without the RG tag.

	   samtools view -d RG:grp2 -o /data_folder/data.rg2_only.bam /data_folder/data.bam

       Written by Heng Li from the Sanger Institute.

       samtools(1), samtools-tview(1), sam(5)

       Samtools	website: <>

samtools-1.12			 17 March 2021		      samtools-view(1)

