AWK with Examples 8

Example 30: Using Regular Expressions

Regular expressions are written between slashes (/.../) for matching.
We have seen this earlier.

The following 3 commands are equivalent:

awk '/foo/ { print $0}' test1
awk '$0 ~ /foo/ {print $0}' test1
awk '{ if ($0 ~ /foo/) print $0 }' test1


Within // we can write any regular expression.
To test whether a string matches a regular expression we use the operator ~ .
To test that it does not match we use !~ .
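
To see !~ in action, here is a small sketch. It assumes sample records copied from the test1 listing used in this post, fed to awk on standard input so the snippet runs on its own. The shell variable names data and result are only for illustration.

```shell
# Sample records copied from the test1 listing used in this post.
data='sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300'

# Print the first field of every record whose name does NOT contain "sh".
result=$(printf '%s\n' "$data" | awk '$1 !~ /sh/ { print $1 }')
printf '%s\n' "$result"
# prints:
# sukul
# uma
# bhanu
```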


Details of regular expressions:

^ --> matches the beginning of the string or beginning of line.

awk '$1 ~ /^s/ {print $1} ' test1
sukul
shushant

____________________________________________

$ --> matches the end of string or end of line.

awk '$1 ~ /u$/ {print $1}' test1
bhanu
himanshu

____________________________________________

[...] --> character set. Matches any one of the characters in the brackets.
We can use - to provide ranges, e.g. [0-9].
To include the characters \, ], - or ^ in the character set, we should put a '\' in front of them.


awk '$0 ~ /n[en]/ { print $0} ' test1

uma 8149122222 chennai 100/800/300
shushant 7798977047 nepal 200/9000/100


Searches for n followed by either n or e.
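
Ranges work the same way as plain sets. The sketch below is only an illustration, assuming the same records as test1 supplied on standard input. The variable names by_range and by_set are made up for this example.

```shell
# Sample records copied from the test1 listing used in this post.
data='sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300'

# [2-9] is a range: the fourth field must start with a digit from 2 to 9.
by_range=$(printf '%s\n' "$data" | awk '$4 ~ /^[2-9]/ { print $1 }')

# [bh] is a plain set: the name must start with b or with h.
by_set=$(printf '%s\n' "$data" | awk '$1 ~ /^[bh]/ { print $1 }')

printf 'range:\n%s\n' "$by_range"   # bhanu, shushant
printf 'set:\n%s\n' "$by_set"       # bhanu, himanshu
```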
____________________________________________

[^...] --> complemented character set.
This matches any one character except those in the square brackets:
awk '$0 ~ /n[^en]/ {print $0}' test1

uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300
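
The character ^ means two different things depending on where it appears, which is easy to mix up. A short sketch, assuming the test1 records on standard input (the variable names anchored and negated are hypothetical):

```shell
# Sample records copied from the test1 listing used in this post.
data='sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300'

# ^ outside the brackets anchors the match: names that start with s.
anchored=$(printf '%s\n' "$data" | awk '$1 ~ /^s/ { print $1 }')

# ^ as the first character inside the brackets negates the set:
# names that start with any character other than s.
negated=$(printf '%s\n' "$data" | awk '$1 ~ /^[^s]/ { print $1 }')

printf 'starts with s:\n%s\n' "$anchored"      # sukul, shushant
printf 'does not start with s:\n%s\n' "$negated"   # uma, bhanu, himanshu
```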

____________________________________________

| --> alternation operator, used to specify alternatives.

awk '$0 ~ /uma|sukul/ { print $0}' test1
sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
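
Alternation binds more loosely than anchors, so /^uma|sukul/ means "starts with uma" OR "contains sukul". Parentheses are not covered in the list here, but they are part of awk's extended regular expressions and can be used to group the alternatives. A minimal sketch with four made-up input lines:

```shell
# Made-up input lines, chosen to show the difference.
names='uma
puma
sukul
xsukul'

# Without grouping: ^ applies only to "uma"; "sukul" may appear anywhere.
loose=$(printf '%s\n' "$names" | awk '/^uma|sukul/')

# With grouping: ^ applies to both alternatives.
grouped=$(printf '%s\n' "$names" | awk '/^(uma|sukul)/')

printf 'without grouping:\n%s\n' "$loose"    # uma, sukul, xsukul
printf 'with grouping:\n%s\n' "$grouped"     # uma, sukul
```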

____________________________________________

* --> matches zero or more occurrences of the preceding regular expression.

awk ' $0 ~ /900*/ {print $0}' test1
sukul 8149158828 mumbai 100/900/200
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300

____________________________________________
+ --> matches one or more occurrences of the preceding regular expression.

awk '$0 ~ /90+/ {print $0}' test1
sukul 8149158828 mumbai 100/900/200
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300

____________________________________________
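? --> matches zero or one occurrence (but not more than one) of the preceding regular expression.
awk ' $0 ~ /900?/ {print $0}' test1
On test1 this prints the same three lines as the * and + examples above, because the regular expression only has to match somewhere in the line.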
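
The difference between *, + and ? is easier to see when the whole line has to match. A minimal sketch, assuming four made-up input lines and anchoring each pattern with ^ and $. The matching lines are joined with spaces only to keep the output short.

```shell
# Made-up input: a 9 followed by zero, one, two and three 0s.
nums='9
90
900
9000'

# Join every matching line into one space-separated string.
join_prog='{ s = s sep $0; sep = " " } END { print s }'

star=$(printf '%s\n' "$nums" | awk "/^90*\$/ $join_prog")   # zero or more 0s
plus=$(printf '%s\n' "$nums" | awk "/^90+\$/ $join_prog")   # one or more 0s
opt=$(printf '%s\n' "$nums" | awk "/^90?\$/ $join_prog")    # zero or one 0

printf 'star: %s\n' "$star"   # star: 9 90 900 9000
printf 'plus: %s\n' "$plus"   # plus: 90 900 9000
printf 'opt:  %s\n' "$opt"    # opt:  9 90
```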
____________________________________________
\ --> used to escape the special meaning of special characters such as $.
To search for a literal $ we use \$
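
A short sketch of the difference between $ as an anchor and \$ as a literal dollar sign. The three input lines are made up for this illustration.

```shell
# Made-up input lines; single quotes keep the shell from expanding $5.
lines='price $5
price 5
total $50'

# Unescaped: $ is the end-of-line anchor, so this finds lines ending in 5.
unescaped=$(printf '%s\n' "$lines" | awk '/5$/')

# Escaped: \$ is a literal dollar sign, so this finds lines containing "$5".
escaped=$(printf '%s\n' "$lines" | awk '/\$5/')

printf 'ends in 5:\n%s\n' "$unescaped"   # price $5, price 5
printf 'contains $5:\n%s\n' "$escaped"   # price $5, total $50
```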


Example 31: Using boolean operators

&& - AND
|| - OR
!  - NOT


awk '{if($1 ~ /u/ && $2=="8149158828") print $1 " IS THE LUCKY PERSON"}' test1
sukul IS THE LUCKY PERSON


awk '{if($1 ~ /hu/ || $2 ~ /81491/) print $1 " IS THE LUCKY PERSON"}' test1
sukul IS THE LUCKY PERSON
uma IS THE LUCKY PERSON
shushant IS THE LUCKY PERSON
himanshu IS THE LUCKY PERSON
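
The ! operator is not used in the examples above, so here is a small sketch of it. It negates the OR condition from the previous command, assuming the test1 records on standard input. The message text is made up.

```shell
# Sample records copied from the test1 listing used in this post.
data='sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300'

# !(A || B) is true only when neither A nor B holds.
result=$(printf '%s\n' "$data" |
  awk '{ if (!($1 ~ /hu/ || $2 ~ /81491/)) print $1 " IS NOT LUCKY" }')

printf '%s\n' "$result"
# prints:
# bhanu IS NOT LUCKY
```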



Example 32: Using BEGIN and END pattern.

The BEGIN pattern is executed before the first record is read.
The END pattern is executed after the last record has been read.


awk 'BEGIN { print "DATAFILE       HEADER" }
     { n++; print $0 }
     END { print "DATAFILE   TRAILER  COUNT:" n }' test1
 
DATAFILE       HEADER
sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300
DATAFILE   TRAILER  COUNT:5


Note that we can have multiple BEGIN and END patterns;
they are executed in the order in which they are written.
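
A minimal sketch of that ordering, assuming the test1 records on standard input. The messages printed are made up for this illustration.

```shell
# Sample records copied from the test1 listing used in this post.
data='sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300'

# Two BEGIN blocks and two END blocks; each pair runs in written order.
result=$(printf '%s\n' "$data" | awk '
  BEGIN { print "first BEGIN" }
  BEGIN { print "second BEGIN" }
        { n++ }                               # count the records
  END   { print "first END, records: " n }
  END   { print "second END" }')

printf '%s\n' "$result"
# prints:
# first BEGIN
# second BEGIN
# first END, records: 5
# second END
```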
 
Example 33: Setting variable values on command line


The -v option can be used to set variables on the command line.
Variables set this way are assigned even before the BEGIN pattern is executed.
The -v option is written before the program text as well as
the filename arguments, in the form -v name=value.

(This did not work on my installation. Some older awk implementations do not support -v.)
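
For reference, this is roughly how -v looks on an awk that supports it (it is part of POSIX awk). It is only a sketch: the variable name col is made up, and the records are the test1 contents fed on standard input.

```shell
# Sample records copied from the test1 listing used in this post.
data='sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300'

# -v col=3 comes before the program text; col is already set inside BEGIN.
result=$(printf '%s\n' "$data" |
  awk -v col=3 'BEGIN { print "printing column " col } { print $col }')

printf '%s\n' "$result"
# prints:
# printing column 3
# mumbai
# chennai
# Jhansi
# nepal
# bokharo
```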

Variables can also be set on the command line among the filenames.
The same variable can be given a different value before each file name;
each assignment takes effect just before the file that follows it is read.
awk '{ print $n}' n=2 test1 n=3 test2

8149158828
8149122222
8097123451
7798977047
9090909090
mumbai
chennai
Jhansi
nepal
bokharo
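
One consequence worth checking is that assignments written among the filenames are not yet visible in BEGIN. The sketch below assumes two small throwaway files created with mktemp. The file names a and b and their contents are made up for this illustration.

```shell
# Create two throwaway input files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'one two' > "$tmp/a"
printf '%s\n' 'three four' > "$tmp/b"

# n is still unset in BEGIN; n=1 applies to file a, n=2 to file b.
result=$(awk 'BEGIN { print "in BEGIN n=[" n "]" } { print $n }' \
  n=1 "$tmp/a" n=2 "$tmp/b")

printf '%s\n' "$result"
# prints:
# in BEGIN n=[]
# one
# four

rm -r "$tmp"   # clean up the temporary files
```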
