LALIGN/PLALIGN(1)	    General Commands Manual	     LALIGN/PLALIGN(1)

       lalign  - compare two protein or	DNA sequences for local	similarity and
       show the	local sequence alignments

       plalign, flalign - compare two sequences for local similarity and plot
       the local sequence alignments

       lalign [-EKfgiImnNOQqrRswxZ] sequence-file-1 sequence-file-2
       plalign [-EKfgiImnNQqrRsvwxZ] sequence-file-1 sequence-file-2

       The lalign and plalign programs compare two sequences looking for local
       sequence similarities.  lalign/plalign use code developed by X. Huang
       and W. Miller (Adv. Appl. Math. (1991) 12:337-357) for the "sim"
       program.  (Version 2.1 uses sim2 code.)  While ssearch reports only the
       best alignment between the query sequence and the library sequence,
       lalign and plalign will report all the alignments with pairwise
       probabilities < 0.05 (default, modified with -E #) between the two
       sequences.  lalign shows the actual local alignments between the two
       sequences and their scores, while plalign produces a plot of the
       alignments that looks similar to a `dot-matrix' homology plot.  On Unix
       systems, plalign generates PostScript output.  flalign generates
       graphic commands for the GCG "figure" program.

       Probability estimates for the lalign/plalign/flalign programs are based
       on  the	parameters provided by Altschul	and Gish (1996)	Meth. Enzymol.
       266:460-480.  These parameters are available  for  BLOSUM50,  BLOSUM62,
       and  PAM250  scoring matrices with specific gap penalties, and also for
       DNA comparison with a gap penalty of -16,  -4.	Probability  estimates
       are not available for other scoring matrices and	gap penalties.

       The E(10,000) values reported with the alignments are the
       pairwise-alignment probabilities multiplied by 10,000.  These estimates
       approximate the significance from a search of a 10,000 entry database.
       They differ from the -E 0.05 initial threshold by the same factor of
       10,000.  This is an unfortunate inconsistency, but I believe that it is
       helpful to provide the perspective of a database search.
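
       The scaling described above can be sketched in a few lines of Python.
       This is only an illustration of the arithmetic, not code from the
       programs themselves; the function name e_10000 and the constant
       DB_SIZE are hypothetical.

```python
DB_SIZE = 10_000  # database size implied by the E(10,000) label


def e_10000(pairwise_p: float, db_size: int = DB_SIZE) -> float:
    """Scale a pairwise-alignment probability to an expectation value.

    Hypothetical helper: it only restates the multiplication described
    in the text and is not taken from the lalign source.
    """
    return pairwise_p * db_size


# The default -E 0.05 cutoff expressed on the E(10,000) scale.
print(e_10000(0.05))
```

       Read the other way, an alignment reported with E(10,000) = 1 has a
       pairwise probability of about 0.0001.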

       The lalign/plalign/fasta programs use a standard text format sequence
       file.  Lines beginning with '>' or ';' are considered comments and
       ignored; sequences can be upper or lower case; blanks, tabs, and
       unrecognizable characters are ignored.  lalign/plalign expect sequences
       to use the single letter amino acid codes; see protcodes(5).
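
       As a rough illustration of these file rules, the following Python
       sketch reads a sequence from text.  It is not the parser used by the
       programs.  The name read_sequence is hypothetical, and two choices are
       assumptions: only ASCII letters are treated as recognizable, and the
       result is folded to upper case.

```python
def read_sequence(text: str) -> str:
    """Return the residues in a sequence file as one string.

    Sketch only: skips lines beginning with '>' or ';' and drops
    blanks, tabs, and anything that is not an ASCII letter.
    """
    residues = []
    for line in text.splitlines():
        if line.startswith((">", ";")):
            continue  # comment line, ignored
        # Assumption: only ASCII letters count as recognizable characters.
        residues.extend(ch for ch in line if ch.isascii() and ch.isalpha())
    # Assumption: case is not significant, so normalize to upper case.
    return "".join(residues).upper()


example = ">MCHU - Calmodulin - Human\nadqlteeq ia\tEF 12\n;note\nkeaf"
print(read_sequence(example))
```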

       lalign and the other programs can be directed to change the scoring
       matrix, search parameters, output format, and default search
       directories by entering options on the command line (preceded by a
       `-').  All of the options should precede the file name and ktup
       arguments.  Alternatively, these options can be changed by setting
       environment variables.  The options and environment variables are:

       -E #   Pairwise-probability limit (default -E 0.05).

       -K #   maximum number of	alignments to be shown (default	-K 50).

       -f #   Penalty for the first residue in a gap (-14 by default).

       -g #   Penalty for each additional residue in a gap (-4 by default).
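
       Taken together, -f and -g appear to describe a gap cost that depends
       on gap length.  The Python sketch below shows that reading of the two
       options, using the defaults listed above.  The function gap_cost is
       hypothetical and is not part of the programs.

```python
def gap_cost(length: int, first: int = -14, extend: int = -4) -> int:
    """Total penalty for a gap of `length` residues.

    Sketch of one reading of -f and -g: `first` is charged for the
    first residue in the gap and `extend` for each additional residue.
    """
    if length < 1:
        raise ValueError("a gap contains at least one residue")
    return first + (length - 1) * extend


# Cost of gaps of one, two, and three residues with the default penalties.
for n in (1, 2, 3):
    print(n, gap_cost(n))
```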

       -i     Compare the reverse complement (DNA only).

       -I     Show alignment between identical sequences.  Normally, the iden-
	      tity alignment is	not shown.

       -m #   (MARKX) =1,2,3. Alternate display of matches and mismatches in
              alignments.  MARKX=1 uses ":", ".", " " for identities,
              conservative replacements, and non-conservative replacements,
              respectively.  MARKX=2 uses " ", "x", and "X".  MARKX=3 does not
              show the second sequence, but uses the second alignment line to
              display matches with a "." for identity, or with the mismatched
              residue for mismatches.  MARKX=3 is useful for aligning large
              numbers of similar sequences.
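
       The MARKX=3 display can be imitated with a short Python sketch.  It
       assumes the two aligned sequences already have equal length (gaps
       included) and compares characters exactly.  The name markx3_line is
       hypothetical; the real output layout may differ.

```python
def markx3_line(aligned1: str, aligned2: str) -> str:
    """Build the second display line in the style described for MARKX=3.

    Sketch only: '.' marks an identity, otherwise the residue from the
    second sequence is shown.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have the same length")
    return "".join("." if a == b else b for a, b in zip(aligned1, aligned2))


first = "ADQLTEEQ"
print(first)
print(markx3_line(first, "ADKLTDEQ"))
```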

       -n     Treat the sequences as DNA, rather than inferring the type from
              the sequence.

       -N #   limit first and second sequences to '#' residues.

       -s str (SMATRIX)	 the  filename	of an alternative scoring matrix file.
	      For protein sequences, BLOSUM50 is used by default;  PAM250  can
	      be  used with the	command	line option -s P250, BLOSUM62 with "-s

       -v str (LINEVAL)	(plalign only) plalign can use up to 4 different  line
	      styles  to  denote  the  scores of local alignments.  The	scores
	      that correspond to these line styles can be specified  with  the
              environment variable LINEVAL, or with the -v option.  In either
	      case, a string with three	numbers	separated by spaces should  be
	      given.   This  string  must  be  surrounded  by double quotation
	      marks.  For example, LINEVAL="200	100 50"	tells plalign  to  use
	      solid  lines  for	local alignments with scores greater than 200,
	      long dashed lines	for scores between 100 and 200,	 short	dashed
	      lines for	scores between 50 and 100, and dotted lines for	scores
	      less than	50.
		   plalign -v "200 100 50"
	      Normally,	the values are 200, 100, and 50	for  protein  sequence
	      comparisons and 400, 200,	and 100	for DNA	sequence comparisons.
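
       The mapping from score to line style described for -v can be sketched
       in Python as follows.  The function line_style is hypothetical.  The
       text does not say which style a score exactly equal to a threshold
       receives, so the strict comparisons below are an assumption.

```python
def line_style(score: int, lineval: str = "200 100 50") -> str:
    """Pick a plot line style for an alignment score.

    Sketch only: `lineval` holds three descending thresholds separated
    by spaces, as in the LINEVAL example in the text.
    """
    high, mid, low = (int(v) for v in lineval.split())
    # Assumption: a score equal to a threshold falls into the lower band.
    if score > high:
        return "solid"
    if score > mid:
        return "long dashed"
    if score > low:
        return "short dashed"
    return "dotted"


for s in (250, 150, 75, 10):
    print(s, line_style(s))
```

       Passing "400 200 100" as the second argument applies the thresholds
       listed above for DNA comparisons.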

       -w #   (LINLEN)	output line length for sequence	alignments.  (normally
	      60, can be set up	to 200).

       (1)    lalign mchu.aa mchu.aa

       Compare the amino acid sequence in the file mchu.aa with	itself and re-
       port  the  ten  best  local alignments.	Sequence files should have the

	    >MCHU - Calmodulin - Human ...

       (2)    plalign -K 100 -E	0.01 qrhuld.aa egmsmg.aa

       Display up to 100 local alignments of the LDL receptor (qrhuld.aa) with
       epidermal  growth factor	precursor (egmsmg.aa) with pairwise probabili-
       ties better than	0.01.  Plot the	results	on the screen.

       (3)    lalign

       Run the lalign program in interactive mode.  The	 program  will	prompt
       for  the	 name  of  two	sequence files and the number of alignments to

       ssearch(1), prss(1), fasta(1), protcodes(5), dnacodes(5)

       Bill Pearson

				     local		     LALIGN/PLALIGN(1)

