AWK(1)			  BSD General Commands Manual			AWK(1)

NAME
     awk -- pattern-directed scanning and processing language

SYNOPSIS
     awk [-F fs] [-v var=value] [-safe] [-d[N]] [prog | -f filename] file ...
     awk -version

DESCRIPTION
     awk is the Bell Labs implementation of the AWK programming language as
     described in The AWK Programming Language by A. V. Aho, B. W.
     Kernighan, and P. J. Weinberger.

     awk scans each input file for lines that match any	of a set of patterns
     specified literally in prog or in one or more files specified as -f
     filename.	With each pattern there	can be an associated action that will
     be	performed when a line of a file	matches	the pattern.  Each line	is
     matched against the pattern portion of every pattern-action statement;
     the associated action is performed	for each matched pattern.  The file
     name - means the standard input.  Any file	of the form var=value is
     treated as	an assignment, not a filename, and is executed at the time it
     would have	been opened if it were a filename.

     The options are as	follows:

     -d[N]   Set debug level to	specified number N.  If	the number is omitted,
	     debug level is set	to 1.

     -f	filename
             Read the AWK program source from specified file filename, instead
             of the first command line argument.  Multiple -f options may be
             specified.

     -F	fs   Set the input field separator FS to the regular expression	fs.

     -mr NNN, -mf NNN
	     Obsolete, no longer needed	options.  Set limit on maximum record
	     or	fields number.

     -safe   Potentially unsafe	functions such as system() make	the program
	     abort (with a warning message).

     -v	var=value
	     Assign the	value value to the variable var	before prog is exe-
	     cuted.  Any number	of -v options may be present.

     -version
             Print awk version on standard output and exit.
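
     As a minimal sketch of the -F and -v options used together, the
     following defines a hypothetical shell helper, field_of (not part of
     awk), that prints one field of colon-separated input.  The sample data
     is made up for illustration.

```shell
# Hypothetical helper: print field number "n" of colon-separated input.
# -F: sets FS to ":"; -v n=... assigns n before the program starts running.
field_of() {
    awk -F: -v n="$1" '{ print $n }'
}

# Prints the third field of each line: 0, then 1.
printf 'root:x:0\nbin:x:1\n' | field_of 3
```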

     An	input line is normally made up of fields separated by white space, or
     by	regular	expression FS.	The fields are denoted $1, $2, ..., while $0
     refers to the entire line.	 If FS is null,	the input line is split	into
     one field per character.

     A pattern-action statement	has the	form

	   pattern { action }

     A missing { action } means print the line; a missing pattern always
     matches.  Pattern-action statements are separated by newlines or
     semicolons.

     An	action is a sequence of	statements.  Statements	are terminated by
     semicolons, newlines or right braces.  An empty expression-list stands
     for $0.  String constants are quoted " ", with the	usual C	escapes	recog-
     nized within.  Expressions	take on	string or numeric values as appropri-
     ate, and are built	using the Operators (see next subsection).  Variables
     may be scalars, array elements (denoted x[i]) or fields.  Variables are
     initialized to the	null string.  Array subscripts may be any string, not
     necessarily numeric; this allows for a form of associative	memory.	 Mul-
     tiple subscripts such as [i,j,k] are permitted; the constituents are con-
     catenated,	separated by the value of SUBSEP.
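
     The associative arrays and multiple subscripts described above can be
     sketched as follows.  The helper name count_pairs and the input are
     hypothetical; the output goes through sort(1) because the order in which
     for (var in array) visits subscripts is not specified.

```shell
# Hypothetical helper: count occurrences of each (first, second) field pair.
# The subscript [$1, $2] is stored as $1 SUBSEP $2, so split() on SUBSEP
# recovers the two parts.
count_pairs() {
    awk '
    { seen[$1, $2]++ }
    END {
        for (key in seen) {
            split(key, part, SUBSEP)
            print part[1], part[2], seen[key]
        }
    }' | sort
}

# Prints "ann read 2" and "bob write 1".
printf 'ann read\nbob write\nann read\n' | count_pairs
```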

   Operators
     awk operators, in order of decreasing precedence, are:

     (...)  Grouping
     $	    Field reference
     ++	--  Increment and decrement, can be used either	as postfix or prefix.
     ^	    Exponentiation (the	** form	is also	supported, and **= for the as-
	    signment operator).
     + - !  Unary plus,	unary minus and	logical	negation.
     * / %  Multiplication, division and modulus.
     + -    Addition and subtraction.
     space  String concatenation.
     < >
     <=	>=
     !=	==  Regular relational operators
     ~ !~   Regular expression match and not match
     in	    Array membership
     &&	    Logical AND
     ||	    Logical OR
     ?:	    C conditional expression.  This is used as expr1 ? expr2 : expr3 .
	    If expr1 is	true, the result value is expr2, otherwise it is
	    expr3.  Only one of	expr2 and expr3	is evaluated.
     = += -=
     *=	/= %= ^=
	    Assignment and Operator-Assignment
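
     A few of the precedence rules in the table can be checked with a short
     sketch.  The helper name precedence_demo is hypothetical; each line of
     the program exercises one row of the table.

```shell
# Hypothetical helper: each print statement exercises one precedence rule.
precedence_demo() {
    awk 'BEGIN {
        print -2 ^ 2                    # ^ binds tighter than unary minus: -4
        print 1 " " 2 + 3               # + binds tighter than concatenation: 1 5
        print 2 + 3 * 4                 # * binds tighter than +: 14
        print (2 < 3 ? "yes" : "no")    # conditional expression: yes
    }'
}

precedence_demo
```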

   Control Statements
     The control statements are	as follows:

	   if (	expression ) statement [else statement]
	   while ( expression )	statement
	   for ( expression ; expression ; expression )	statement
	   for ( var in	array )	statement
	   do statement	while (	expression )
	   delete array	[expression]
	   delete array
           exit [expression]
           expression
	   return [expression]
	   { [statement	...] }
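
     As an illustration of if, for, and exit, the sketch below reverses the
     fields of each input line and stops with exit status 1 at the first
     empty line.  The helper name reverse_fields is hypothetical.

```shell
# Hypothetical helper: print the fields of each line in reverse order,
# exiting with status 1 as soon as an empty line is seen.
reverse_fields() {
    awk '{
        if (NF == 0)
            exit 1
        line = $NF
        for (i = NF - 1; i >= 1; i--)
            line = line " " $i
        print line
    }'
}

# Prints "c b a".
printf 'a b c\n' | reverse_fields
```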

   I/O Statements
     The input/output statements are as	follows:

     close(expr)
             Closes the file or pipe expr.  Returns zero on success; otherwise
             nonzero.

     fflush(expr)
             Flushes any buffered output for the file or pipe expr.  Returns
             zero on success; otherwise nonzero.

     getline [var]
	     Set var (or $0 if var is not specified) to	the next input record
	     from the current input file.  getline returns 1 for a successful
	     input, 0 for end of file, and -1 for an error.

     getline [var] < file
	     Set var (or $0 if var is not specified) to	the next input record
	     from the specified	file file.

     expr | getline
	     Pipes the output of expr into getline; each call of getline re-
	     turns the next line of output from	expr.

     next    Skip remaining patterns on	this input line.

     nextfile
             Skip rest of this file, open next, start at top.

     print [expr-list] [> file]
	     The print statement prints	its arguments on the standard output
	     (or to a file if >	file or	to a pipe if | expr is present), sepa-
	     rated by the current output field separator OFS, and terminated
	     by	the output record separator ORS.  Both file and	expr may be
	     literal names or parenthesized expressions; identical string val-
	     ues in different statements denote	the same open file.

     printf format [, expr-list] [> file]
	     Format and	print its expression list according to format.	See
	     printf(3) for list	of supported formats and their meaning.
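
     The following sketch combines getline from a file, print to a pipe, and
     close().  The helper name sorted_copy and the temporary file are
     hypothetical.  The loop tests for a return value greater than zero so
     that an error (-1) does not cause an endless loop.

```shell
# Hypothetical helper: read lines from the file named by the first argument
# with getline, and send them through a sort(1) pipe.
sorted_copy() {
    awk -v f="$1" 'BEGIN {
        while ((getline line < f) > 0)   # 1 = record read, 0 = EOF, -1 = error
            print line | "sort"
        close(f)
        close("sort")                    # finish the pipe before awk exits
    }'
}

tmp=$(mktemp)
printf 'pear\napple\n' > "$tmp"
sorted_copy "$tmp"                       # prints apple, then pear
rm -f "$tmp"
```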

   Mathematical	and Numeric Functions
     AWK has the following mathematical	and numerical functions	built-in:

     atan2(x, y)
	     Returns the arctangent of x / y in	radians.  See also atan2(3).

     cos(expr)
             Computes the cosine of expr, measured in radians.  See also
             cos(3).

     exp(expr)
             Computes the exponential value of the given argument expr.  See
             also exp(3).

     int(expr)
             Truncates expr to integer.

     log(expr)
             Computes the value of the natural logarithm of argument expr.
             See also log(3).

     rand()  Returns random number between 0 and 1.

     sin(expr)
             Computes the sine of expr, measured in radians.  See also sin(3).

     sqrt(expr)
             Computes the non-negative square root of expr.  See also sqrt(3).

     srand(expr)
             Sets seed for random number generator (rand()) and returns the
             previous seed.

   String Functions
     AWK has the following string functions built-in:

     gensub(r, s, h, [t])
	     Search the	target string t	for matches of the regular expression
	     r.	 If h is a string beginning with g or G, then replace all
	     matches of	r with s.  Otherwise, h	is a number indicating which
	     match of r	to replace.  If	no t is	supplied, $0 is	used instead.
	     Unlike sub() and gsub(), the modified string is returned as the
	     result of the function, and the original target is	not changed.
	     Note that the \n sequences	within replacement string s supported
	     by	GNU awk	are not	supported at this moment.

     gsub(r, t,	[s])
             same as sub() except that all occurrences of the regular
             expression are replaced; sub() and gsub() return the number of
             replacements.

     index(s, t)
	     the position in s where the string	t occurs, or 0 if it does not.

     length([s])
             the length of its argument taken as a string, or of $0 if no
             argument.

     match(s, r)
	     the position in s where the regular expression r occurs, or 0 if
	     it	does not.  The variables RSTART	and RLENGTH are	set to the po-
	     sition and	length of the matched string.

     split(s, a, [fs])
	     splits the	string s into array elements a[1], a[2], ..., a[n],
	     and returns n.  The separation is done with the regular expres-
	     sion fs or	with the field separator FS if fs is not given.	 An
	     empty string as field separator splits the	string into one	array
	     element per character.

     sprintf(fmt, expr,	...)
	     Returns the string	resulting from formatting expr according to
	     the printf(3) format fmt.

     sub(r, t, [s])
	     substitutes t for the first occurrence of the regular expression
	     r in the string s.	 If s is not given, $0 is used.

     substr(s, m, [n])
             Returns the at most n-character substring of s starting at
             position m, counted from 1.  If n is omitted, the rest of s is
             returned.

     tolower(str)
             returns a copy of str with all upper-case characters translated
             to their corresponding lower-case equivalents.

     toupper(str)
             returns a copy of str with all lower-case characters translated
             to their corresponding upper-case equivalents.
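
     A short sketch of several string functions working on one value.  The
     helper name string_demo and the sample strings are hypothetical;
     gensub() is left out because it is not available in every awk.

```shell
# Hypothetical helper: exercise gsub, substr, index, match, split, toupper.
string_demo() {
    awk 'BEGIN {
        s = "hello world"
        n = gsub(/o/, "0", s)            # replaces every "o"; n is the count
        print n, s                       # 2 hell0 w0rld
        print substr(s, 7, 3)            # w0r
        print index(s, "w0")             # 7
        if (match(s, /l+/))
            print RSTART, RLENGTH        # 3 2
        count = split("a:b:c", parts, ":")
        print count, parts[3]            # 3 c
        print toupper(substr(s, 1, 1)) substr(s, 2)   # Hell0 w0rld
    }'
}

string_demo
```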

   Time	Functions
     This awk provides the following two functions for obtaining time stamps
     and formatting them:

     systime()
             Returns the value of time in seconds since the start of Unix
             Epoch (Midnight, January 1, 1970, Coordinated Universal Time).
             See also time(3).

     strftime([format [, timestamp]])
	     Formats the time timestamp	according to the string	format.
	     timestamp should be in same form as value returned	by systime().
	     If	timestamp is missing, current time is used.  If	format is
	     missing, a	default	format equivalent to the output	of date(1)
	     would be used.  See the specification of ANSI C strftime(3) for
	     the format	conversions which are supported.

   Other built-in functions
     system(cmd)
             executes cmd and returns its exit status

   Patterns
     Patterns are arbitrary Boolean combinations (with ! || &&) of regular
     expressions and relational expressions.  Regular expressions are as in
     egrep(1).	Isolated regular expressions in	a pattern apply	to the entire
     line.  Regular expressions	may also occur in relational expressions, us-
     ing the operators ~ and !~.  / re / is a constant regular expression; any
     string (constant or variable) may be used as a regular expression,	except
     in	the position of	an isolated regular expression in a pattern.

     A pattern may consist of two patterns separated by	a comma; in this case,
     the action	is performed for all lines from	an occurrence of the first
     pattern through an occurrence of the second.

     A relational expression is	one of the following:
	   expression matchop regular-expression
	   expression relop expression
	   expression in array-name
	   (expr, expr,... ) in	array-name

     where a relop is any of the six relational operators in C, and a matchop
     is either ~ (matches) or !~ (does not match).  A conditional is an
     arithmetic expression, a relational expression, or a Boolean combination
     of these.

     The special patterns BEGIN	and END	may be used to capture control before
     the first input line is read and after the	last.  BEGIN and END do	not
     combine with other	patterns.
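
     The range pattern and the special patterns BEGIN and END can be sketched
     together as follows.  The helper name between and the input lines are
     hypothetical.

```shell
# Hypothetical helper: number and print the lines of each start..stop block.
between() {
    awk '
    BEGIN           { print "begin" }
    /start/, /stop/ { print NR ": " $0 }
    END             { print "lines read:", NR }
    '
}

# Prints: begin, 2: start, 3: b, 4: stop, lines read: 5
printf 'a\nstart\nb\nstop\nc\n' | between
```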

   Built-in Variables
     Variable names with special meanings:

     ARGC	argument count,	assignable

     ARGV       argument array, assignable; non-null members are taken as
                filenames

     CONVFMT    conversion format used when converting numbers (default
                "%.6g")

     ENVIRON	array of environment variables;	subscripts are names.

     FILENAME	the name of the	current	input file

     FNR	ordinal	number of the current record in	the current file

     FS		regular	expression used	to separate fields; also settable by
		option -F fs.

     NF		number of fields in the	current	record

     NR		ordinal	number of the current record

     OFMT	output format for numbers (default "%.6g" )

     OFS	output field separator (default	blank)

     ORS	output record separator	(default newline)

     RS		input record separator (default	newline)

     RSTART     Position of the first character matched by match(); 0 if no
                match.

     RLENGTH	Length of the string matched by	match(); -1 if no match.

     SUBSEP	separates multiple subscripts (default 034)

   Functions
     Functions may be defined (at the position of a pattern-action statement)
     as follows:

	   function foo(a, b, c) { ...;	return x }

     Parameters	are passed by value if scalar and by reference if array	name;
     functions may be called recursively.  Parameters are local	to the func-
     tion; all other variables are global.  Thus local variables may be	cre-
     ated by providing excess parameters in the	function definition.
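
     A minimal sketch of these rules: an array passed by reference, a
     recursive function, and an excess parameter used as a local variable.
     The names fill, fact, and func_demo are hypothetical.

```shell
# Hypothetical helper: shows by-reference arrays, recursion, and locals.
func_demo() {
    awk '
    # "i" is an excess parameter, so it is local to fill().
    function fill(arr, n,    i) {
        for (i = 1; i <= n; i++)
            arr[i] = i * i
    }
    function fact(n) { return n <= 1 ? 1 : n * fact(n - 1) }
    BEGIN {
        i = 99                     # global i, untouched by fill()
        fill(sq, 3)                # sq is modified through the reference
        print sq[3], fact(5), i    # 9 120 99
    }'
}

func_demo
```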

EXAMPLES
     length($0) > 72
	     Print lines longer	than 72	characters.

     { print $2, $1 }
	     Print first two fields in opposite	order.

     BEGIN { FS = ",[ \t]*|[ \t]+" }
           { print $2, $1 }
             Same, with input fields separated by comma and/or blanks and
             tabs.

         { s += $1 }
     END { print "sum is", s, " average is ", s/NR }
             Add up first column, print sum and average.

     /start/, /stop/
	     Print all lines between start/stop	pairs.

     BEGIN { # Simulate echo(1)
          for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
          printf "\n"
          exit }

SEE ALSO
     egrep(1), lex(1), sed(1), atan2(3), cos(3), exp(3), log(3), sin(3),
     sqrt(3), strftime(3), time(3)

     A.	V. Aho,	B. W. Kernighan, P. J. Weinberger, The AWK Programming
     Language, Addison-Wesley, 1988.  ISBN 0-201-07981-X

     AWK Language Programming, Edition 1.0, published by the Free Software
     Foundation, 1995

HISTORY
     nawk has been the default system awk since NetBSD 2.0, replacing the
     previously used GNU awk.

BUGS
     There are no explicit conversions between numbers and strings.  To force
     an	expression to be treated as a number add 0 to it; to force it to be
     treated as	a string concatenate ""	to it.
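
     The two idioms can be sketched as follows.  The helper name
     convert_demo and the values are hypothetical.  Comparisons are used
     because they behave differently for strings and numbers.

```shell
# Hypothetical helper: force numeric or string treatment before comparing.
convert_demo() {
    awk 'BEGIN {
        a = "10"; b = "9"          # string values
        print (a < b)              # string comparison, "1" sorts before "9": 1
        print (a + 0 < b + 0)      # adding 0 forces numbers, 10 < 9 is false: 0
        x = 10; y = 9              # numeric values
        print (x < y)              # numeric comparison: 0
        print (x "" < y "")        # concatenating "" forces strings: 1
    }'
}

convert_demo
```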

     The scope rules for variables in functions are a botch; the syntax is
     worse.

BSD				 May 25, 2008				   BSD

