samtools-sort(1)	     Bioinformatics tools	      samtools-sort(1)

NAME
       samtools	sort - sorts SAM/BAM/CRAM files

SYNOPSIS
       samtools sort [-l level] [-u] [-m maxMem] [-o out.bam] [-O format] [-M]
       [-K kmerLen] [-n] [-t tag] [-T tmpprefix] [-@ threads] [--no-PG]
       [in.sam|in.bam|in.cram]

DESCRIPTION
       Sort alignments by leftmost coordinate, or by read name when -n is
       used.  The SO (sort order) tag of the @HD header line is added, or
       updated if one already exists, to match the resulting order.

       The sorted output is written to standard output by default, or to the
       specified file (out.bam) when -o is used.  When the alignment data
       cannot all fit into memory (as controlled by the -m option), this
       command also creates temporary files named tmpprefix.%d.bam as needed.

       Consider	using samtools collate instead if you need name	collated  data
       without a full lexicographical sort.

       Note that if the sorted output file is to be indexed with
       samtools index, the default coordinate sort must be used.  Thus the
       -n and -t options are incompatible with samtools index.
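
       The sort-then-index workflow described above can be sketched as a
       pair of argument-list builders.  This is only an illustration: the
       helper names sort_argv and index_argv and the file names are
       hypothetical, and only options documented in this page are used.

```python
def sort_argv(in_path, out_path, by_name=False, tag=None):
    """Build a `samtools sort` argument list (hypothetical helper)."""
    argv = ["samtools", "sort"]
    if tag is not None:
        argv += ["-t", tag]      # sort first by this alignment tag
    if by_name:
        argv.append("-n")        # sort by read name instead of coordinate
    argv += ["-o", out_path, in_path]
    return argv


def index_argv(sorted_path, by_name=False, tag=None):
    """Build a `samtools index` argument list (hypothetical helper).

    Refuses when the file was sorted with -n or -t, since indexing
    needs the default coordinate sort.
    """
    if by_name or tag is not None:
        raise ValueError("samtools index needs the default coordinate sort")
    return ["samtools", "index", sorted_path]


if __name__ == "__main__":
    # Print the two commands rather than running them.
    for argv in (sort_argv("in.bam", "out.bam"), index_argv("out.bam")):
        print(" ".join(argv))
```

       Each returned list can be passed to subprocess.run(argv, check=True)
       on a system where samtools is installed.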

OPTIONS
       -K INT	  Sets the kmer	size to	be used	in the -M option. [20]

       -l INT     Set the desired compression level for the final output file,
                  from 0 (uncompressed) or 1 (fastest, least compression) to
                  9 (best compression, slowest to write), similar to the
                  compression level setting of gzip(1).

		  If -l	is not used, the default compression level will	apply.

       -u	  Set the compression level to	0,  for	 uncompressed  output.
		  This is a synonym for	-l 0.

       -m INT	  Approximately	the maximum required memory per	thread,	speci-
		  fied either in bytes or with a K, M, or G suffix.  [768 MiB]

		  To prevent sort from creating	a  huge	 number	 of  temporary
		  files, it enforces a minimum value of	1M for this setting.
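
                  Because -m is a per-thread figure, the overall memory
                  budget grows with the thread count given to -@.  A
                  rough sketch of that arithmetic follows.  It assumes
                  the K, M and G suffixes are powers of 1024 (suggested
                  by the 768 MiB default but not stated here); the
                  function names are hypothetical.

```python
# Assumption: suffixes are binary multiples (K = 2**10, M = 2**20, G = 2**30).
SUFFIX_BYTES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
MIN_PER_THREAD = 1 << 20  # the documented 1M minimum, under the same assumption


def parse_mem(text):
    """Convert a -m style value such as '768M' or '4096' to bytes."""
    text = text.strip()
    if not text:
        raise ValueError("empty memory value")
    multiplier = SUFFIX_BYTES.get(text[-1])
    if multiplier is None:
        return int(text)              # plain byte count
    return int(text[:-1]) * multiplier


def meets_minimum(text):
    """True if the per-thread value is at least the documented 1M floor."""
    return parse_mem(text) >= MIN_PER_THREAD


def approx_total(per_thread, threads=1):
    """Approximate total sort memory: per-thread limit times thread count."""
    return parse_mem(per_thread) * threads


if __name__ == "__main__":
    print(approx_total("768M", threads=4))
```

                  The result is only an estimate, since the option itself
                  describes an approximate per-thread maximum.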

       -M	  Sort	unmapped  reads	(those in chromosome "*") by their se-
		  quence minimiser (Schleimer et al., 2003;  Roberts  et  al.,
		  2004),  also reverse complementing as	appropriate.  This has
		  the effect of	collating some similar data together,  improv-
		  ing  the compressibility of the unmapped sequence.  The min-
		  imiser kmer size is adjusted using the -K option.  Note data
		  compressed in	this manner may	need to	be name	collated prior
		  to conversion	back to	fastq.

		  Mapped sequences are sorted by chromosome and	position.
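
                  The idea of a strand-independent minimiser can be
                  illustrated as below.  This is a sketch, not the
                  samtools implementation: it assumes plain
                  lexicographic ordering of k-mers, whereas the ordering
                  samtools actually uses is not described in this page.
                  The function names are hypothetical; the default k of
                  20 matches the -K default.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (other letters are left as-is)."""
    return seq.translate(_COMPLEMENT)[::-1]


def smallest_kmer(seq, k):
    """Lexicographically smallest k-mer in seq."""
    return min(seq[i:i + k] for i in range(len(seq) - k + 1))


def minimiser(seq, k=20):
    """Return (kmer, reversed) for the smallest k-mer on either strand.

    `reversed` is True when the k-mer came from the reverse complement,
    i.e. when the read would be reverse complemented before storing.
    """
    if len(seq) < k:
        raise ValueError("sequence shorter than k")
    forward = smallest_kmer(seq, k)
    reverse = smallest_kmer(revcomp(seq), k)
    return (reverse, True) if reverse < forward else (forward, False)


if __name__ == "__main__":
    reads = ["GATTACA", "TTTGGG", "ACGTAC"]
    # Reads sharing a minimiser end up adjacent after sorting.
    for read in sorted(reads, key=lambda s: minimiser(s, k=3)[0]):
        print(read, minimiser(read, k=3))
```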

       -n	  Sort by read names (i.e., the	QNAME field)  rather  than  by
		  chromosomal coordinates.

       -t TAG	  Sort	first  by  the value in	the alignment tag TAG, then by
		  position or name (if also using -n).

       -o FILE	  Write	the final sorted output	to FILE, rather	than to	 stan-
		  dard output.

       -O FORMAT  Write	the final output as sam, bam, or cram.

		  By  default,	samtools tries to select a format based	on the
		  -o filename extension; if output is to standard output or no
		  format can be	deduced, bam is	selected.
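
                  The selection rule above can be sketched as follows.
                  The function name output_format is hypothetical, and
                  the sketch only recognises the three extensions named
                  in this entry, matched exactly; samtools itself may
                  accept other spellings.

```python
import os

KNOWN_FORMATS = ("sam", "bam", "cram")


def output_format(out_path=None, explicit=None):
    """Pick an output format following the documented fallbacks.

    explicit -- value given to -O, if any
    out_path -- value given to -o, or None for standard output
    """
    if explicit is not None:
        return explicit                  # -O was given explicitly
    if out_path is None:
        return "bam"                     # writing to standard output
    extension = os.path.splitext(out_path)[1].lstrip(".")
    # Fall back to bam when nothing can be deduced from the name.
    return extension if extension in KNOWN_FORMATS else "bam"


if __name__ == "__main__":
    for name in ("out.cram", "out.sam", "out.data", None):
        print(name, "->", output_format(name))
```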

       -T PREFIX  Write temporary files to PREFIX.nnnn.bam, or if the
                  specified PREFIX is an existing directory, to
                  PREFIX/samtools.mmm.mmm.tmp.nnnn.bam, where mmm is unique
                  to this invocation of the sort command.

                  By default, any temporary files are written alongside the
                  output file, as out.bam.tmp.nnnn.bam, or if output is to
                  standard output, in the current directory as
                  samtools.mmm.mmm.tmp.nnnn.bam.

       -@ INT	  Set  number of sorting and compression threads.  By default,
		  operation is single-threaded.

       --no-PG	  Do not add a @PG line	to the header of the output file.

       Ordering	Rules

       The following rules are used for	ordering records.

       If option -t is in use, records are first sorted	by the	value  of  the
       given  alignment	 tag, and then by position or name (if using -n).  For
       example,	"-t RG"	will make read group the primary sort key.  The	 rules
       for ordering by tag are:

       o   Records that	do not have the	tag are	sorted before ones that	do.

       o   If the types	of the tags are	different, they	will be	sorted so that
	   single character tags (type A) come before  array  tags  (type  B),
	   then	 string	 tags  (types H	and Z),	then numeric tags (types f and
	   i).

       o   Numeric tags	(types f and i)	are compared by	value.	Note that com-
	   parisons of floating-point values are subject to issues of rounding
	   and precision.

       o   String tags (types H	and Z) are compared based on the  binary  con-
	   tents of the	tag using the C	strcmp(3) function.

       o   Character tags (type	A) are compared	by binary character value.

       o   No attempt is made to compare tags of other types --	notably	type B
	   array values	will not be compared.
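
       The tag rules above can be modelled as a sort key.  The sketch below
       is an illustration rather than the samtools code: tags are
       represented as a hypothetical (type, value) pair, or None when the
       tag is absent, and the secondary ordering by position or name is
       left out.

```python
# Rank of each tag type: A, then B, then H/Z, then f/i.
TYPE_RANK = {"A": 0, "B": 1, "H": 2, "Z": 2, "f": 3, "i": 3}


def tag_key(tag):
    """Sort key for one tag, given as None or a (type, value) pair."""
    if tag is None:
        return (0,)                        # records without the tag come first
    tag_type, value = tag
    rank = TYPE_RANK[tag_type]
    if tag_type == "A":
        return (1, rank, ord(value))       # binary character value
    if tag_type == "B":
        return (1, rank, 0)                # array contents are not compared
    if tag_type in ("H", "Z"):
        return (1, rank, value.encode())   # bytewise, like strcmp(3)
    return (1, rank, value)                # f and i: compare by value


if __name__ == "__main__":
    records = [
        ("r1", ("i", 5)),
        ("r2", None),
        ("r3", ("Z", "grp")),
        ("r4", ("A", "x")),
        ("r5", ("f", 2.5)),
        ("r6", ("Z", "abc")),
        ("r7", ("B", [1, 2])),
    ]
    for name, tag in sorted(records, key=lambda rec: tag_key(rec[1])):
        print(name, tag)
```

       Python's sort is stable, so records whose keys tie (for example two
       type B tags) keep their input order in this sketch.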

       When the -n option is present, records are sorted by name.  Names are
       compared so as to give a "natural" ordering, i.e. sections consisting
       of digits are compared numerically while all other sections are
       compared based on their binary representation.  This means "a1" will
       come before "b1" and "a9" will come before "a10".  Records with the
       same name are ordered according to the values of the READ1 and READ2
       flags (see samtools-flags(1)).
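
       A minimal sketch of such a "natural" comparison is shown below.  It
       is not the samtools routine: the function name natural_cmp is
       hypothetical, names are assumed to be ASCII, digit runs that are
       numerically equal (such as "01" and "1") are treated as ties, and
       the READ1/READ2 tie-break is not modelled.

```python
from functools import cmp_to_key


def _is_digit(ch):
    """ASCII digit test (avoids non-ASCII digits that int() rejects)."""
    return "0" <= ch <= "9"


def natural_cmp(a, b):
    """Compare two names: digit runs numerically, other characters by code."""
    i = j = 0
    while i < len(a) and j < len(b):
        if _is_digit(a[i]) and _is_digit(b[j]):
            start_i, start_j = i, j
            while i < len(a) and _is_digit(a[i]):
                i += 1
            while j < len(b) and _is_digit(b[j]):
                j += 1
            num_a, num_b = int(a[start_i:i]), int(b[start_j:j])
            if num_a != num_b:
                return -1 if num_a < num_b else 1
        else:
            if a[i] != b[j]:
                return -1 if a[i] < b[j] else 1
            i += 1
            j += 1
    # A name that is a prefix of the other sorts first.
    return (i < len(a)) - (j < len(b))


if __name__ == "__main__":
    names = ["a10", "b1", "a9", "a1"]
    print(sorted(names, key=cmp_to_key(natural_cmp)))
```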

       When the	-n option is not present, reads	are sorted by  reference  (ac-
       cording	to  the	 order of the @SQ header records), then	by position in
       the reference, and then by the REVERSE flag.
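
       The default ordering can likewise be written as a sort key.  The
       sketch below represents a record as a hypothetical
       (rname, pos, flag) tuple.  Two details are assumptions not stated
       in this page: unmapped reads (reference "*") are placed after all
       references, and forward-strand reads come before reverse-strand
       reads at the same position.

```python
REVERSE = 0x10  # SAM FLAG bit: read is reverse complemented


def make_coordinate_key(sq_names):
    """Return a key function for records given the @SQ name order."""
    rank = {name: index for index, name in enumerate(sq_names)}
    unmapped_rank = len(sq_names)          # assumption: "*" sorts last

    def key(record):
        rname, pos, flag = record
        ref_rank = unmapped_rank if rname == "*" else rank[rname]
        is_reverse = 1 if flag & REVERSE else 0   # assumption: forward first
        return (ref_rank, pos, is_reverse)

    return key


if __name__ == "__main__":
    # The @SQ order, not alphabetical order, decides reference ranking.
    key = make_coordinate_key(["chr2", "chr1"])
    records = [
        ("chr1", 100, 0),
        ("chr2", 500, 16),
        ("chr2", 500, 0),
        ("*", 0, 4),
        ("chr2", 20, 0),
    ]
    for record in sorted(records, key=key):
        print(record)
```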

       Note

       Historically samtools sort also accepted	a less flexible	way of	speci-
       fying the final and temporary output filenames:

	      samtools sort [-f] [-o] in.bam out.prefix

       This  has  now  been removed.  The previous out.prefix argument (and -f
       option, if any) should be changed to an appropriate combination	of  -T
       PREFIX  and -o FILE.  The previous -o option should be removed, as out-
       put defaults to standard	output.

AUTHOR
       Written by Heng Li from the Sanger Institute with  numerous  subsequent
       modifications.

SEE ALSO
       samtools(1), samtools-collate(1), samtools-merge(1)

       Samtools	website: <http://www.htslib.org/>

samtools-1.12			 17 March 2021		      samtools-sort(1)
