awk(1)				 User Commands				awk(1)

       awk - pattern scanning and processing language

       /usr/bin/awk [-f progfile] [-Fc] ['prog'] [parameters] [filename...]

       /usr/xpg4/bin/awk [-F ERE] [-v assignment...] 'program'	-f progfile...

       The  /usr/xpg4/bin/awk utility is described on the nawk(1) manual page.

       The /usr/bin/awk	utility	scans each input filename for lines that match
       any  of	a  set	of patterns specified in prog. The prog	string must be
       enclosed	in single quotes ( ') to protect it from the shell.  For  each
       pattern in prog there may be an associated action performed when	a line
       of a filename matches the pattern. The set of pattern-action statements
       may  appear  literally as prog or in a file specified with the -f prog-
       file option. Input files	are read in order; if there are	no files,  the
       standard	input is read. The file	name '-' means the standard input.

       The following options are supported:

       -f progfile
	     awk uses the set of patterns it reads from	progfile.

       -Fc   Uses  the character c as the field	separator (FS) character.  See
	     the discussion of FS below.
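
       A minimal sketch of both options, assuming a POSIX-style awk is
       installed as awk (the historical /usr/bin/awk may differ in detail).
       The sample records and the temporary program file are hypothetical.

```shell
# Hypothetical sample data: records with ':' between fields.
sample='root:x:0:0
daemon:x:1:1'

# -F: makes ':' the field separator, so $1 is the first colon-delimited field.
names=$(printf '%s\n' "$sample" | awk -F: '{ print $1 }')
printf '%s\n' "$names"

# The same program, read from a file with -f (temporary file for illustration).
progfile=$(mktemp)
printf '%s\n' '{ print $1 }' > "$progfile"
names_from_file=$(printf '%s\n' "$sample" | awk -F: -f "$progfile")
rm -f "$progfile"
printf '%s\n' "$names_from_file"
```

       Both invocations should print the same two names.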

   Input Lines
       Each input line is matched against the pattern portion  of  every  pat-
       tern-action  statement;	the  associated	 action	 is performed for each
       matched pattern.	Any filename of	the form var=value is  treated	as  an
       assignment,  not	 a filename, and is executed at	the time it would have
       been opened if it were a	filename. Variables assigned  in  this	manner
       are  not	 available  inside a BEGIN rule, and are assigned after	previ-
       ously specified files have been read.
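
       The timing of var=value operands can be sketched as follows, assuming
       a POSIX-style awk. The variable name tag and the two temporary files
       are made up for the illustration.

```shell
dir=$(mktemp -d)
printf 'a\n' > "$dir/first"
printf 'b\n' > "$dir/second"

# Each assignment takes effect just before the file that follows it is opened.
tagged=$(awk '{ print tag ": " $0 }' tag=one "$dir/first" tag=two "$dir/second")
printf '%s\n' "$tagged"

# In a BEGIN rule no operand has been processed yet, so tag is still empty.
in_begin=$(awk 'BEGIN { print "[" tag "]" }' tag=one "$dir/first")
printf '%s\n' "$in_begin"

rm -rf "$dir"
```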

       An input	line is	normally made up of fields separated by	white  spaces.
       (This  default  can be changed by using the FS built-in variable	or the
       -Fc option.) The	default	is to ignore leading blanks  and  to  separate
       fields  by  blanks  and/or tab characters. However, if FS is assigned a
       value that does not include any	of  the	 white	spaces,	 then  leading
       blanks  are  not	ignored. The fields are	denoted	$1, $2,	...; $0	refers
       to the entire line.
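
       The effect of FS on leading blanks can be seen in a short sketch
       (POSIX-style awk assumed; the input lines are hypothetical):

```shell
# Default splitting: leading blanks are ignored, runs of blanks separate fields.
default_first=$(printf '   alpha   beta\n' | awk '{ print "[" $1 "]", NF }')
printf '%s\n' "$default_first"

# FS set to a comma: the leading blank stays part of $1.
comma_first=$(printf ' alpha,beta\n' | awk -F, '{ print "[" $1 "]", NF }')
printf '%s\n' "$comma_first"
```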

   Pattern-action Statements
       A pattern-action	statement has the form:

       pattern { action	}

       Either pattern or action	may be omitted.	If there  is  no  action,  the
       matching	 line  is  printed. If there is	no pattern, the	action is per-
       formed on every input line. Pattern-action statements are separated  by
       newlines	or semicolons.
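
       A brief sketch of the three cases, assuming a POSIX-style awk and a
       hypothetical three-line input:

```shell
input='1 apple
2 banana
3 cherry'

# Pattern only: matching lines are printed unchanged.
pattern_only=$(printf '%s\n' "$input" | awk '$1 >= 2')
printf '%s\n' "$pattern_only"

# Action only: the action runs on every input line.
action_only=$(printf '%s\n' "$input" | awk '{ print $2 }')
printf '%s\n' "$action_only"

# Two pattern-action statements separated by a semicolon.
both=$(printf '%s\n' "$input" |
    awk '/apple/ { print "first" }; /cherry/ { print "last" }')
printf '%s\n' "$both"
```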

       Patterns	 are arbitrary Boolean combinations ( !, ||, &&, and parenthe-
       ses) of relational expressions and regular  expressions.	 A  relational
       expression is one of the	following:

       expression relop	expression
       expression matchop regular_expression

       where  a	 relop	is  any	 of  the  six relational operators in C, and a
       matchop is either ~ (contains) or !~ (does not contain).	An  expression
       is an arithmetic expression, a relational expression, the special
       expression

       var in array

       or a Boolean combination	of these.
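
       The following sketch combines a relop, a matchop, and the in
       expression in patterns. It assumes a POSIX-style awk; the data and the
       array name seen are hypothetical.

```shell
people='alice 30
bob 17
carol 45'

# Numeric comparison combined with a negated match on the first field.
adults_not_a=$(printf '%s\n' "$people" | awk '$2 >= 18 && $1 !~ /^a/')
printf '%s\n' "$adults_not_a"

# "in" tests whether a subscript already exists in an array.
dups=$(printf 'a\nb\na\n' |
    awk '($1 in seen) { print "dup", $1 } { seen[$1] = 1 }')
printf '%s\n' "$dups"
```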

       Regular expressions are as in egrep(1). In patterns they	must  be  sur-
       rounded	by slashes. Isolated regular expressions in a pattern apply to
       the entire line.	Regular	 expressions  may  also	 occur	in  relational
       expressions. A pattern may consist of two patterns separated by a
       comma; in this case, the action is performed for all lines from an
       occurrence of the first pattern through the next occurrence of the
       second pattern.

       The special patterns BEGIN and END  may	be  used  to  capture  control
       before the first	input line has been read and after the last input line
       has been	read respectively. These keywords  do  not  combine  with  any
       other patterns.
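
       A sketch of a range pattern together with BEGIN and END, assuming a
       POSIX-style awk. The input and the counter n are illustrative only.

```shell
input='header
start
one
stop
footer'

ranged=$(printf '%s\n' "$input" | awk '
BEGIN          { print "begin" }            # runs before any input is read
/start/,/stop/ { n++; print n ": " $0 }     # runs from "start" through "stop"
END            { print "lines in range: " n }')
printf '%s\n' "$ranged"
```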

   Built-in Variables
       Built-in	variables include:

       FILENAME
             name of the current input file

       FS    input field separator regular expression (default blank and tab)

       NF    number of fields in the current record

       NR    ordinal number of the current record

       OFMT  output format for numbers (default	%.6g)

       OFS   output field separator (default blank)

       ORS   output record separator (default new-line)

       RS    input record separator (default new-line)
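
       The separators and counters can be exercised with a small sketch
       (POSIX-style awk assumed; the inputs are hypothetical):

```shell
# OFS goes between print arguments, ORS after each printed record.
report=$(printf 'a b c\nd e\n' |
    awk 'BEGIN { OFS = "-"; ORS = ";\n" } { print NR, NF, $1 }')
printf '%s\n' "$report"

# RS set to a comma: each comma-delimited chunk becomes one record.
records=$(printf 'x,y,z' | awk 'BEGIN { RS = "," } { print NR ": " $0 }')
printf '%s\n' "$records"
```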

       An action is a sequence of statements. A statement may be one of the
       following:

       if ( expression ) statement [ else statement ]
       while ( expression ) statement
       do statement while ( expression )
       for ( expression	; expression ; expression ) statement
       for ( var in array ) statement
       { [ statement ] ... }
       expression      # commonly variable = expression
       print [ expression-list ] [ >expression ]
       printf format [ ,expression-list	] [ >expression	]
       next	       # skip remaining	patterns on this input line
       exit [expr]     # skip the rest of the input; exit status is expr

       Statements are terminated by semicolons,	newlines, or right braces.  An
       empty expression-list stands for	the whole input	line. Expressions take
       on string or numeric values as appropriate, and	are  built  using  the
       operators  +,  -,  *, /,	%, ^ and concatenation (indicated by a blank).
       The operators ++, --, +=, -=, *=, /=, %=, ^=, >,	>=, <, <=, ==, !=, and
       ?:  are	also available in expressions. Variables may be	scalars, array
       elements	(denoted x[i]),	or fields. Variables are  initialized  to  the
       null  string or zero. Array subscripts may be any string, not necessar-
       ily numeric; this allows	for a form of associative memory. String  con-
       stants are quoted (""), with the	usual C	escapes	recognized within.
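
       As an illustration of associative arrays and the expression
       operators, here is a sketch assuming a POSIX-style awk. The array name
       count and the sample words are hypothetical; the output is piped
       through sort(1) because the order of a for-in loop is not specified.

```shell
counts=$(printf 'red\nblue\nred\ngreen\nred\n' | awk '
{ count[$1]++ }                  # uninitialized elements start at zero
END {
    for (color in count)         # visits every subscript, in no fixed order
        print color, count[color]
}' | sort)
printf '%s\n' "$counts"

# Exponentiation, remainder and the ?: operator inside an expression.
expr_demo=$(awk 'BEGIN { x = 2 ^ 10; print (x > 1000 ? "big" : "small"), x % 7 }')
printf '%s\n' "$expr_demo"
```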

       The  print statement prints its arguments on the	standard output, or on
       a file if >expression is	present, or on a pipe if  '|cmd'  is  present.
       The output of the print statement is terminated by the output record
       separator, with each argument separated by the current output
       field  separator.  The  printf  statement  formats  its expression list
       according to the	format (see printf(3C)).
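
       The output forms can be sketched as follows, assuming a POSIX-style
       awk. The variable out and the temporary file are hypothetical; out is
       set with a var=value operand, and '-' names the standard input.

```shell
outfile=$(mktemp)

formatted=$(printf 'pear 3\nfig 12\n' | awk '
{ printf "%-6s|%3d\n", $1, $2 }      # printf writes exactly what the format says
{ print $1 > out }                   # > sends print output to the named file
' out="$outfile" -)
printf '%s\n' "$formatted"

saved=$(cat "$outfile")
rm -f "$outfile"
printf '%s\n' "$saved"

# print ... | "cmd" sends the output through a command instead.
sorted=$(printf 'pear\nfig\n' | awk '{ print $1 | "sort" }')
printf '%s\n' "$sorted"
```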

   Built-in Functions
       The arithmetic functions	are as follows:

       cos(x)
             Return cosine of x, where x is in radians. (In /usr/xpg4/bin/awk
             only. See nawk(1).)

       sin(x)
             Return sine of x, where x is in radians. (In /usr/xpg4/bin/awk
             only. See nawk(1).)

       exp(x)
             Return the exponential function of x.

       log(x)
             Return the natural logarithm of x.

       sqrt(x)
             Return the square root of x.

       int(x)
             Truncate its argument to an integer. It will be truncated toward
             0 when x > 0.
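
       A quick check of the functions available in both versions, assuming a
       POSIX-style awk; the arguments are chosen so that every result is a
       whole number:

```shell
# A BEGIN-only program needs no input.
values=$(awk 'BEGIN { print int(3.9), sqrt(16), exp(0), log(1) }')
printf '%s\n' "$values"
```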

       The string functions are	as follows:

       index(s,	t)
	     Return the	position in string s where string t first occurs, or 0
	     if	it does	not occur at all.

       int(s)
             Truncate s to an integer value. If s is not specified, $0 is
             used.

       length(s)
             Return the length of its argument taken as a string, or of the
             whole line if there is no argument.

       split(s,	a, fs)
             Split the string s into array elements a[1], a[2], ... a[n], and
             return n. The separation is done with the regular expression fs
             or with the field separator FS if fs is not given.

       sprintf(fmt, expr, expr,...)
             Format the expressions according to the printf(3C) format given
             by fmt and return the resulting string.

       substr(s, m, n)
             Return the n-character substring of s that begins at position m.
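
       A sketch that calls several of the string functions, assuming a
       POSIX-style awk. The strings and the names parts and label are
       hypothetical.

```shell
result=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "wor"), length(s), substr(s, 1, 5)
    n = split("a:b:c", parts, ":")       # n is the number of pieces
    print n, parts[1], parts[3]
    label = sprintf("%s-%03d", "id", 7)  # format into a string, no output yet
    print label
}')
printf '%s\n' "$result"
```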

       The input/output	function is as follows:

       getline
             Set $0 to the next input record from the current input file.
             getline returns 1 for successful input, 0 for end of file, and -1
             for an error.
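
       One possible use of getline is to join input lines in pairs. This is
       a sketch assuming a POSIX-style awk; the names first and status are
       hypothetical.

```shell
pairs=$(printf '1\n2\n3\n4\n5\n' | awk '{
    first = $0
    status = getline          # replaces $0 with the next record, if any
    if (status == 1)
        print first, $0
    else
        print first, "(no partner)"
}')
printf '%s\n' "$pairs"
```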

   Large File Behavior
       See largefile(5)	for the	 description  of  the  behavior	 of  awk  when
       encountering files greater than or equal	to 2 Gbyte ( 2**31 bytes).

       Example 1: Print	lines longer than 72 characters:

       length >	72

       Example 2: Print	first two fields in opposite order:

       { print $2, $1 }

       Example 3: Same, with input fields separated by comma and/or blanks and
       tabs:

       BEGIN { FS = ",[ \t]*|[ \t]+" }
             { print $2, $1 }

       Example 4: Add up first column, print sum and average:

	    { s	+= $1 }
       END  { print "sum is", s, " average is",	s/NR }

       Example 5: Print fields in reverse order:

       { for (i	= NF; i	> 0; --i) print	$i }

       Example 6: Print	all lines between start/stop pairs:

       /start/,	/stop/

       Example 7: Print	all lines whose	first field is different from the pre-
       vious one:

       $1 != prev { print; prev	= $1 }

       Example 8: Print	a file,	filling	in page	numbers	starting at 5:

	 /Page/	 { $2 =	n++; }
		    { print }

       Example 9: Print	a file and number its pages starting at	5:

       Assuming	 this  program	is in a	file named prog, the following command
       line prints the file input numbering its	pages starting at 5:

       awk -f prog n=5 input
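
       The page-numbering program can be tried end to end with a sketch like
       the one below. It assumes a POSIX-style awk; the temporary directory
       and the three-line input file are invented for the demonstration.

```shell
dir=$(mktemp -d)

# The program from Example 8, stored in a file named prog.
printf '%s\n' '/Page/ { $2 = n++ }' '{ print }' > "$dir/prog"

# Hypothetical input with placeholder page numbers.
printf 'Page 0\nbody text\nPage 0\n' > "$dir/input"

# n=5 is assigned before input is opened, so numbering starts at 5.
numbered=$(awk -f "$dir/prog" n=5 "$dir/input")
printf '%s\n' "$numbered"

rm -rf "$dir"
```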

       See environ(5) for descriptions of the following	environment  variables
       that affect the execution of awk: LC_CTYPE and LC_MESSAGES.

       LC_NUMERIC
             Determine the radix character used when interpreting numeric
	     input, performing conversions between numeric and	string	values
	     and  formatting  numeric output. Regardless of locale, the	period
	     character (the decimal-point character of the  POSIX  locale)  is
	     the decimal-point character recognized in processing awk programs
	     (including	assignments in command-line arguments).

       See attributes(5) for descriptions of the following attributes:

   /usr/bin/awk
       |      ATTRIBUTE TYPE         |      ATTRIBUTE VALUE        |
       |Availability                 |SUNWesu                      |
       |CSI                          |Enabled                      |

   /usr/xpg4/bin/awk
       |      ATTRIBUTE TYPE         |      ATTRIBUTE VALUE        |
       |Availability                 |SUNWxcu4                     |
       |CSI                          |Enabled                      |

       egrep(1), grep(1), nawk(1), sed(1),  printf(3C),	 attributes(5),	 envi-
       ron(5), largefile(5), XPG4(5)

       Input white space is not	preserved on output if fields are involved.

       There are no explicit conversions between numbers and strings. To force
       an expression to	be treated as a	number add 0 to	it; to force it	to  be
       treated as a string concatenate the null	string ("") to it.
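
       Both notes can be demonstrated with a short sketch, assuming a
       POSIX-style awk; the variable names and values are hypothetical.

```shell
# String constants compare as strings until 0 is added to them.
as_strings=$(awk 'BEGIN { a = "10"; b = "9"; print (a < b), (a + 0 < b + 0) }')
printf '%s\n' "$as_strings"

# Numbers compare numerically until "" is concatenated to them.
as_numbers=$(awk 'BEGIN { n = 10; m = 9; print (n < m), ((n "") < (m "")) }')
printf '%s\n' "$as_numbers"

# Assigning to a field rebuilds $0, so the original spacing is lost.
squeezed=$(printf 'a    b\n' | awk '{ $1 = $1; print }')
untouched=$(printf 'a    b\n' | awk '{ print }')
printf '%s\n%s\n' "$squeezed" "$untouched"
```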

SunOS 5.9			  7 Jul	2000				awk(1)

