AWK with Examples 10

Example 41: Arrays

Arrays in awk are associative.

Each array element is identified by its index.

Awk arrays differ from arrays in many other languages:
1) there is no need to declare the size of an array before using it
2) any number or string can be an index.


array1["CAT"]="meoww"
array2["DOG"]="barks"
Both assignments above are valid even though the indices are not numeric.


Also, we can add elements at any position.
a[1]="Sukul"
a[2]="uma"
a[20]="shushant"
Note that we can add an element at position 20 regardless of whether elements 3, 4, 5, ... have been added.



Compare the two for loops below to understand why for (i in array) is used with awk arrays.

awk '{ a[1]="sukul";a[2]="uma";a[5]="bhanu";
       for (i=1;i<=5;i++)
       { print a[i]
       }
     }' testx
  

sukul
uma


bhanu

Note that since we had not assigned values to a[3] and a[4], the above for loop printed blank lines for them.
Ideally it should not have printed anything, because those elements do not exist.
Thus the above for loop is not intelligent enough to tell whether
an element exists or not. (Merely referencing a[3] also creates it as an empty element.)


The for loop below makes more sense:

awk '{ a[1]="sukul";a[2]="uma";a[5]="bhanu";
       for (i in a)
       { print a[i]
       }
     }' testx

uma
bhanu
sukul


Note that this for loop visits only the elements that exist
and prints just those. The order in which they are visited is not specified, which is why the output above is not in index order.
This is the reason we use the for (i in array) syntax when working with arrays in awk.
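
The same membership test can also be used on its own inside a counting loop. Below is a minimal sketch, assuming a POSIX awk; it uses a BEGIN block so that no input file is needed, and the shell variable name result is only for illustration.

```shell
result=$(awk 'BEGIN {
    a[1] = "sukul"; a[2] = "uma"; a[5] = "bhanu"
    for (i = 1; i <= 5; i++) {
        if (i in a)            # true only for indices that were assigned
            print i, a[i]
    }
}')
printf '%s\n' "$result"
# 1 sukul
# 2 uma
# 5 bhanu
```

Unlike print a[i], the test (i in a) does not create the missing elements a[3] and a[4], and the output stays in index order.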



Example 42: numeric built in functions

awk ' {
  print int(17.23)   # integer part (truncates)
  print sqrt(900)    # square root
  print exp(2)       # e raised to the power 2
  print log(10)      # natural log
  print sin(30)      # sine (x in radians)
  print cos(30)      # cosine (x in radians)
  } ' testx
 
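17
30
7.38906
2.30259
-0.988032
0.154251

Note that the argument is in radians, so the last two lines are the sine and cosine of 30 radians, not of 30 degrees.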
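
If we want the sine and cosine of an angle given in degrees, we can convert it to radians first. The sketch below assumes a POSIX awk; the variable names pi, deg and result are made up for this illustration.

```shell
result=$(awk 'BEGIN {
    pi = atan2(0, -1)                 # atan2(0, -1) evaluates to pi
    deg = 30
    # convert degrees to radians before calling sin and cos
    printf "%.2f %.2f\n", sin(deg * pi / 180), cos(deg * pi / 180)
}')
printf '%s\n' "$result"
# 0.50 0.87
```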


Example 43: String built in function- index

index(string1,string2): searches string1 for the 1st occurrence of string2 and returns the position where string2 begins.
If string2 is not found, it returns zero.


The command below shows the position of the 1st "u" in each line of the data file:

awk '{ print index($0,"u")}'  test1
2
1
5
3
8
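
To see both return values without a data file, we can try index in a BEGIN block. This is only a small sketch with made-up strings; the search string can be longer than one character.

```shell
result=$(awk 'BEGIN {
    print index("mumbai", "ba")   # "ba" begins at position 4
    print index("mumbai", "z")    # not found, so 0
}')
printf '%s\n' "$result"
# 4
# 0
```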


Example 44: String built in function- length

Returns the length of the input string.

#prints the lengths of names
awk '{ print length($1)}'  test1
5
3
5
8
8
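
length can be applied to the whole record as well as to a single field. A small sketch, assuming one line shaped like the test1 data is fed on standard input:

```shell
result=$(echo "uma 8149122222" | awk '{
    print length($0), length($1)   # whole record, then the 1st field
}')
printf '%s\n' "$result"
# 14 3
```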


Example 45: String built in function- match

match(string,regexp): searches for regexp in the string
and returns the position where the matching substring begins;
if no match is found, it returns 0.


It also sets two built-in variables:
1) RSTART: the index where the matching substring begins
2) RLENGTH: the length of the matched substring


Note: match did not work on my installation; older awk versions may not provide it.
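
On an awk that does provide match (it is part of POSIX awk), a minimal sketch would look like the one below. The sample string is taken from the test1 data, and the variable names s and result are only for illustration.

```shell
result=$(awk 'BEGIN {
    s = "shushant 7798977047"
    if (match(s, /[0-9]+/))                     # first run of digits
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
    print match(s, /xyz/), RSTART, RLENGTH      # no match: 0, 0 and -1
}')
printf '%s\n' "$result"
# 10 10 7798977047
# 0 0 -1
```

RSTART and RLENGTH fit directly into substr, which is the usual way to pull out the matched text.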

Example 46: String built in function- split
split(string,arrayname,separator)
awk splits the string 'string' into the array 'arrayname' based on the separator we provide.
split returns the number of array elements that the split created.

If we omit the separator, the value of FS is used.

awk '{ numberofelements=split($0,array1,"u")
       print "Record no:" NR
       print "Number of array elements created:" numberofelements
       print array1[1],"|",array1[2],"|",array1[3]}' test1
   

Record no:1
Number of array elements created:4
s | k | l 8149158828 m
Record no:2
Number of array elements created:2
 | ma 8149122222 chennai 100/800/300 |
Record no:3
Number of array elements created:2
bhan |  8097123451 Jhansi 200/1000/500 |
Record no:4
Number of array elements created:2
sh | shant 7798977047 nepal 200/9000/100 |
Record no:5
Number of array elements created:2
himansh |  9090909090 bokharo 100/800/300 |
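
A more typical use of split is to break up one field. The sketch below assumes that the 4th field of test1 holds three numbers separated by "/", and adds them up; the names parts, total and result are made up for this example.

```shell
result=$(echo "sukul 8149158828 mumbai 100/900/200" | awk '{
    n = split($4, parts, "/")      # n is the number of pieces
    total = 0
    for (i = 1; i <= n; i++)       # split fills parts[1] .. parts[n]
        total += parts[i]
    print n, total
}')
printf '%s\n' "$result"
# 3 1200
```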


Example 47: String built in function- sub
sub stands for substitute.
sub(regexp,replacement,target)

sub replaces the 1st occurrence of regexp with the replacement text
in the target. If the target is omitted, $0 is used.


It returns the number of replacements made, which for sub is 0 or 1.

awk '{str = "water, water, everywhere"
sub(/at/, "ith", str);
print str}' test1

wither, water, everywhere
wither, water, everywhere
wither, water, everywhere
wither, water, everywhere
wither, water, everywhere


pht022e2:/home/nemo_dev/sm017r> awk '{ sub(/uma/,"shri",$0);print $0}' test1
sukul 8149158828 mumbai 100/900/200
shri 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300


Note that the 1st occurrence of "uma" is replaced by "shri".

Another variant of this uses &.
In the replacement text, & stands for the matched text, so "& shri" keeps the original string intact and just appends the new data.

awk '{ noofrep=sub(/uma/,"& shri",$0);print "Replace cnt:" noofrep, "|", $0}' test1

Replace cnt:0 | sukul 8149158828 mumbai 100/900/200
Replace cnt:1 | uma shri 8149122222 chennai 100/800/300
Replace cnt:0 | bhanu 8097123451 Jhansi 200/1000/500
Replace cnt:0 | shushant 7798977047 nepal 200/9000/100
Replace cnt:0 | himanshu 9090909090 bokharo 100/800/300


Example 48: String built in function- global sub

Same as sub, but it replaces all the occurrences in the target (here the input record) and returns the number of replacements made.
awk '{ noofrep=gsub(/u/,"A",$0);print "Replace cnt:" noofrep, "|", $0}' test1

Replace cnt:3 | sAkAl 8149158828 mAmbai 100/900/200
Replace cnt:1 | Ama 8149122222 chennai 100/800/300
Replace cnt:1 | bhanA 8097123451 Jhansi 200/1000/500
Replace cnt:1 | shAshant 7798977047 nepal 200/9000/100
Replace cnt:1 | himanshA 9090909090 bokharo 100/800/300
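
The target does not have to be $0. As a sketch, assuming a line shaped like the test1 data, gsub can be limited to the 4th field so that the other fields are left alone:

```shell
result=$(echo "sukul 8149158828 mumbai 100/900/200" | awk '{
    n = gsub("/", "-", $4)     # only the 4th field is changed
    print n, $0
}')
printf '%s\n' "$result"
# 2 sukul 8149158828 mumbai 100-900-200
```

Changing a field makes awk rebuild $0 using OFS, which is a single space by default, so the rest of the record looks the same here.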


Example 49: String built in function- substr
substr is used to extract a part of a string.
substr(string,start,length)
Positions are counted from 1. If length is omitted, the rest of the string is returned.


pht022e2:/home/nemo_dev/sm017r> awk '{ s1=substr($0,5,10);print s1}' test1
l 81491588
8149122222
u 80971234
hant 77989
nshu 90909
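
Both forms of substr, with and without the length, can be tried on a single name. This is only a small sketch; the variable names name and result are made up.

```shell
result=$(awk 'BEGIN {
    name = "shushant"
    print substr(name, 1, 3)   # 3 characters starting at position 1
    print substr(name, 4)      # from position 4 to the end
}')
printf '%s\n' "$result"
# shu
# shant
```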


Example 50: String built in function-toupper, tolower
Used to convert case: toupper converts lower case to upper case, and tolower converts upper case to lower case.

pht022e2:/home/nemo_dev/sm017r> awk '{ record=toupper($0);print record}' test1
SUKUL 8149158828 MUMBAI 100/900/200
UMA 8149122222 CHENNAI 100/800/300
BHANU 8097123451 JHANSI 200/1000/500
SHUSHANT 7798977047 NEPAL 200/9000/100
HIMANSHU 9090909090 BOKHARO 100/800/300
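
toupper and tolower combine well with substr. The sketch below capitalizes only the 1st letter of a name and also shows tolower; the strings are taken from the test1 data and the variable names are only for illustration.

```shell
result=$(awk 'BEGIN {
    name = "sukul"
    # upper-case the 1st character and append the rest unchanged
    print toupper(substr(name, 1, 1)) substr(name, 2)
    print tolower("JHANSI")
}')
printf '%s\n' "$result"
# Sukul
# jhansi
```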


Example 51: system builtin function- system

Used to execute any system command from awk itself.
The system command is run and control comes back to awk.

pht022e2:/home/nemo_dev/sm017r> awk '{ record=toupper($0);print record}
    END { system("ls -lrt test*")}' test1
SUKUL 8149158828 MUMBAI 100/900/200
UMA 8149122222 CHENNAI 100/800/300
BHANU 8097123451 JHANSI 200/1000/500
SHUSHANT 7798977047 NEPAL 200/9000/100
HIMANSHU 9090909090 BOKHARO 100/800/300
-rw-r-----   1 sm017r     nemo_dev       187 Aug  8 05:31 test1


Note the last line of the output. It contains the result of ls -lrt test*, which was run
from within awk.
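
system also returns the exit status of the command, which we can store and test. A minimal sketch, assuming the standard true and false commands are available on the system; the variable names ok, bad and result are made up.

```shell
result=$(awk 'BEGIN {
    ok  = system("true")     # a command that succeeds, status 0
    bad = system("false")    # a command that fails, non-zero status
    print ok, bad
}')
printf '%s\n' "$result"
# 0 1
```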


Example 52: Understanding ARGV and ARGC
The command line arguments that we pass to an awk program are stored in an array called ARGV.
ARGC contains the number of command line arguments. The program text and the awk options are not counted, but the command name in ARGV[0] is.

ARGV is indexed from 0 to ARGC-1.

awk '{print ARGC;
      print ARGV[0]
      print ARGV[1]}' test1
This prints all 3 values for each line in the input file.
Note that ARGV[0] is the command name and ARGV[1] is the name of the input file.

2
awk
test1
2
awk
test1
2
awk
test1
2
awk
test1
2
awk
test1
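
To print the arguments only once, instead of once per input line, we can loop over ARGV in a BEGIN block. This is a sketch under the assumption that the program has only a BEGIN block, in which case awk does not open the named files; test2 is a made-up second file name.

```shell
result=$(awk 'BEGIN {
    print ARGC                      # command name plus 2 file names
    for (i = 1; i < ARGC; i++)      # skip ARGV[0], the command name
        print i, ARGV[i]
}' test1 test2)
printf '%s\n' "$result"
# 3
# 1 test1
# 2 test2
```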


Example 53: Built-in variables ENVIRON and FILENAME

awk also has an array ENVIRON which contains the values of the environment variables.

The index for this array is the name of the variable.


The FILENAME variable gives the name of the current input file.
If the data is read from standard input, the value depends on the implementation: some awks set it to "-", others leave it empty.

awk '{print ENVIRON["HOME"], ENVIRON["SHELL"], FILENAME }' test1
/home/nemo_dev/sm017r /usr/bin/ksh test1
/home/nemo_dev/sm017r /usr/bin/ksh test1
/home/nemo_dev/sm017r /usr/bin/ksh test1
/home/nemo_dev/sm017r /usr/bin/ksh test1
/home/nemo_dev/sm017r /usr/bin/ksh test1


We can see that ENVIRON["HOME"] prints the value of the HOME
environment variable, and the same applies to ENVIRON["SHELL"].
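
ENVIRON is also a simple way to pass a value from the shell into awk. In the sketch below, GREETING is a made-up variable name that is set only for this one awk command.

```shell
result=$(GREETING=hello awk 'BEGIN {
    print ENVIRON["GREETING"]    # value set on the command line
}')
printf '%s\n' "$result"
# hello
```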
