proxy70

AWK syntax:
 awk [-Fs] "program" [file1 file2...]   # commands come from DOS cmdline
 awk 'program{print "foo"}' file1       # single quotes around double quotes
  # NB: Don't use single quotes alone if the embedded info will contain the
  # vertical bar or redirection arrows! Either use double quotes, or (if
  # using 4DOS) use backticks around the single quotes:  `'NF>1'`

  # NB: since awk will accept single quotes around arguments from the
  # DOS command line, this means that DOS filenames which contain a
  # single quote cannot be found by awk, even though they are legal names
  # under MS-DOS. To get awk to find a file named foo'bar, the name must
  # be entered as foo"'"bar.

 awk [-Fs] -f pgmfile [file1 file2...]   # commands come from DOS file

 If file1 is omitted, input comes from stdin (console).
 Option -Fz sets the field separator FS to letter "z".

AWK notes:
 "pattern {action}"
    if {action} is omitted, {print $0} is assumed
    if "pattern" is omitted, each line is selected for {action}.

 Fields are separated by 1 or more spaces or tabs: "field1   field2"
 If the commands come from a file, the quotes below can be omitted.

 Basic AWK commands:
 -------------------
 "NR == 5" file             show rec. no. (line) 5.  NB: "==" is equals.
 {FOO = 5}                  single = assigns "5" to the variable FOO
 "$2 == 0 {print $1}"       if 2d field is 0, print 1st field
 "$3 < 10"                  if 3d field < 10, numeric comparison; print line
 '$3 < "10" '               use single quotes for string comparison!, or
 -f pgmfile [$3 < "10"]     use "-f pgmfile"  for string comparison
 "$3 ~ /regexp/"            if /regexp/ matches 3d field, print the line
 '$3 ~ "regexp" '           regexp can appear in double-quoted string*
                 # * If double-quoted, 2 backslashes for every 1 in regexps
                 # * Double-quoted strings require the match (~) character.
 "NF > 4"                   print all lines with 5 or more fields
 "$NF > 4"                  print lines where the last field is 5 or more
 "{print NF}"               tell us how many fields (words) are on each line
 "{print $NF}"              print last field of each line

 "/regexp/"                 Only print lines containing "regexp"
 "/text|file/"              Lines containing "text" or "file" (CASE SENSITIVE!)

 "/foo/{print "za", NR}"    FAILS on DOS/4DOS command line!!
 '/foo/{print "za", NR}'    WORKS on DOS/4DOS command line!!
                            If lines matches "foo", print word and line no.
 `"/foo/{print \"za\",NR}"` WORKS on 4DOS cmd line: escape internal quotes with
                            slash and backticks; for historical interest only.

 "$3 ~ /B/ {print $2,$3}"   If 3d field contains "B", print 2d + 3d fields
 "$4 !~ /R/"                Print lines where 4th field does NOT contain "R"

 '$1=$1'                    Del extra white space between fields & blank lines
 '{$1=$1;print}'            Del extra white space between fields, keep blanks
 'NF'                       Del all blank lines

 AND(&&), OR(||), NOT(!)
 -----------------------
 "$2 >= 4 || $3 <= 20"      lines where 2d field >= 4 .OR. 3d field <= 20
 "NR > 5 && /with/"         lines containing "with" for lines 6 or beyond
 "/x/ && NF > 2"            lines containing "x" with more than 2 fields

 "$3/$2 != 5"               not equal to "value" or "string"
 "$3 !~ /regexp/"           regexp does not match in 3d field
 "!($3 == 2 && $1 ~ /foo/)" print lines that do NOT match condition

 "{print NF, $1, $NF}"      print no. of fields, 1st field, last field
 "{print NR, $0}"           prefix a line number to each line
 '{print NR ": " $0}'       prefix a line number, colon, space to each line

 "NR == 10, NR == 20"       print records (lines) 10 - 20, inclusive
 "/start/, /stop/"          print lines between "start" and "stop"

 "length($0) > 72"          print all lines longer than 72 chars
 "{print $2, $1}"           invert first 2 fields, delete all others
 "{print substr($0,index($0,$3))}"  print field #3 to end of the line


 END{...} usage
 ---------------     END reads all input first.

1) END { print NR }               # same output as "wc -l"

2) {s = s + $1 }                  # print sum, ave. of all figures in col. 1
  END {print "sum is", s, "average is", s/NR}

3) {names=names $1 " " }          # converts all fields in col 1 to
  END { print names }             # concatenated fields in 1 line, e.g.
        +---Beth   4.00 0         #
  input |   Mary   3.75 0         # infile is converted to:
   file |   Kathy  4.00  10       #   "Beth Mary Kathy Mark" on output
        +---Mark   5.00  30       #

4)  { field = $NF }               # print the last field of the last line
  END { print field }


PRINT, PRINTF:   print expressions, print formatted
print expr1, expr2, ..., exprn    # parens() needed if the expression contains
print(expr1, expr2, ..., exprn)   # any relational operator: <, <=, ==, >, >=

print                             # an abbreviation for {print $0}
print ""                          # print only a blank line
printf(expr1,expr2,expr3,\n}      # add newline to printf statements

 FORMAT CONVERSION:
 ------------------
 BEGIN{ RS="";    FS="\n";        # takes records sep. by blank lines, fields
       ORS="\n"; OFS="," }        # sep. by newlines, and converts to records
 {$1=$1; print }                  # sep. by newlines, fields sep. by commas.

 PARAGRAPHS:
 -----------
 'BEGIN{RS="";ORS="\n\n"};/foo/'  # print paragraph if 'foo' is there.
 'BEGIN{RS="";ORS="\n\n"};/foo/&&/bar/'  # need both
                         ;/foo|bar/'     # need either

 PASSING VARIABLES:
 ------------------
 gawk -v var="/regexp/" 'var{print "Here it is"}'   # var is a regexp
 gawk -v var="regexp" '$0~var{print "Here it is"}'  # var is a quoted string
 gawk -v num=50 '$5 == num'                         # var is a numeric value

Built-in variables:
 ARGC       number of command-line arguments
 ARGV       array of command-line arguments (ARGV[0...ARVC-1])
 FILENAME   name of current input file
 FNR        input record number in current file
 FS         input field separator (default blank)
 NF         number of fields in current input record
 NR         input record number since beginning
 OFMT       output format for numbers (default "%.6g")
 OFS        output field separator (default blank)
 ORS        output record separator (default newline)
 RLENGTH    length of string matched by regular expression in match
 RS         input record separator (default newline)
 RSTART     beginning position of string matched by match
 SUBSEP     separator for array subscripts of form [i,j,...] (default ^\)

Escape sequences:
 \b       backspace (^H)
 \f       formfeed (^L)
 \n       newline (DOS, CR/LF; Unix, LF)
 \r       carriage return
 \t       tab (^I)
 \ddd     octal value `ddd', where `ddd' is 1-3 digits, from 0 to 7
 \c       any other character is a literal, eg, \" for " and \\ for \

Awk string functions:
 `r' is a regexp, `s' and `t' are strings, `i' and `n' are integers
 `&' in replacement string in SUB or GSUB is replaced by the matched string

 gsub(r,s,t)    globally replace regex r with string s, applied to data t;
                return no. of substitutions; if t is omitted, $0 is used.
 gensub(r,s,h,t) replace regex r with string s, on match number h, applied
                to data t; if h is 'g', do globally; if t is omitted, $0 is
                used. Return the converted pattern, not the no. of changes.
 index(s,t)     return the index of t in s, or 0 if s does not contain t
 length(s)      return the length of s
 match(s,r)     return index of where s matches r, or 0 if there is no
                  match; set RSTART and RLENGTH
 split(s,a,fs)  split s into array a on fs, return no. of fields; if fs is
                  omitted, FS is used in its place
 sprintf(fmt,expr-list)     return expr-list formatted according to fmt
 sub(r,s,t)     like gsub but only the first matched substring is replaced
 substr(s,i,n)  return the n-character substring of s starting at i; if n
                  is omitted, return the suffix of s starting at i

Arithmetic functions:
 atan2(y,x)     arctangent of y/x in radians in the range of -ã to ã
 cos(x)         cosine (angle in radians)
 exp(n)         exponential eü (n need not be an integer)
 int(x)         truncate to integer
 log(x)         natural logarithm
 rand()         pseudo-random number r, 0 ó r ó 1
 sin(x)         sine (angle in radians)
 sqrt(x)        square root
 srand(x)       set new seed for random number generator; uses time of day
                  if no x given

[end-of-file]