AWK with Examples 5

Example 18: Built-in Variable FS (Field Separator)

The field separator is a single character or a regex that determines
how awk splits each record into fields.


Field separator is represented by FS.

We can change the value of FS in the BEGIN pattern so that it affects all the records.

The default value of FS is " " (a single space). With this default, awk splits
fields on runs of spaces and tabs.

Note that with the default FS, two consecutive spaces (or tabs) do not create an empty field.
But if FS=";" then two consecutive ; will create an empty field.
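
As a quick check of the rule above, the sketch below counts the fields (NF) of a made-up line under both settings. The sample strings and the shell variable names are only for illustration.

```shell
# Default FS: the two spaces between a and b act as one separator.
default_nf=$(printf 'a  b\n' | awk '{ print NF }')

# FS=";": every ; is a separator, so ";;" encloses an empty field.
semi_nf=$(printf 'a;;b\n' | awk 'BEGIN { FS = ";" } { print NF }')

echo "default FS: $default_nf fields, FS=\";\": $semi_nf fields"
```

The first count should be 2 and the second 3, the third field being the empty one between the two semicolons.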


In the example below we assume the field separator is /

bos90631:sm017r awk 'BEGIN{ FS="/"}{ print $1}' test1
sukul 8149158828 mumbai 100
uma 8149122222 chennai 100
bhanu 8097123451 Jhansi 200
shushant 7798977047 nepal 200
himanshu 9090909090 bokharo 100


We can set FS="[ ]" if we want to force a single space as the delimiter.
This will cause two consecutive spaces to be counted as an empty field.
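
A small sketch of the difference, using a made-up input line with two spaces between the words (the variable names are only for illustration):

```shell
# FS="[ ]" is a regex matching exactly one space, so "a  b" splits
# into three fields: "a", "" and "b".
single_nf=$(printf 'a  b\n' | awk 'BEGIN { FS = "[ ]" } { print NF }')

# The same line with the default FS has only two fields.
default_nf=$(printf 'a  b\n' | awk '{ print NF }')

echo "FS=\"[ ]\": $single_nf fields, default FS: $default_nf fields"
```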


We can also set the field separator on the command line using the -F option.

awk -F/ '{ print $2 }' test1
900
800
1000
9000
800



Example 19: getline command

awk implicitly reads the input file one record at a time.
We can also read the next record explicitly with the getline command (with no arguments).

The getline command returns a number indicating whether it succeeded:
1) 1 if a record was read.
2) 0 if end of file was encountered.
3) -1 if an error occurred (for example, the file cannot be opened).
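
The three return values can be observed with a sketch like the one below. The input lines are made up, and the path /no/such/file/hypothetical is assumed not to exist. The -1 case uses the file form of getline (getline var < "file"), since a missing file is the easiest error to provoke.

```shell
# Plain getline on three records a, b, c:
#   record a: getline reads b and returns 1
#   record c: getline hits end of input and returns 0, $0 stays c
plain=$(printf 'a\nb\nc\n' | awk '{ rc = getline; print rc, $0 }')

# Reading from a file that cannot be opened returns -1.
missing=$(awk 'BEGIN { rc = (getline line < "/no/such/file/hypothetical"); print rc }')

echo "$plain"
echo "$missing"
```

This is why loops are usually written as while ((getline ...) > 0): a bare while (getline ...) would loop forever on -1.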

When we execute getline without arguments, the next record is read into $0.
The original record that was already in $0 is overwritten.
So we should use getline only after we are done working with
the current record, because it is discarded the moment we read the next record using getline.

The values of NF, NR, FNR and $0 are also set as per the new record.
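
A minimal sketch of this, with two made-up input lines: after getline runs on the first record, all of these variables describe the second one.

```shell
# The action runs on record 1 ("a b"); getline then pulls in
# record 2 ("c d e"), so NR, NF and $0 now describe record 2.
out=$(printf 'a b\nc d e\n' | awk 'NR == 1 { getline; print NR, NF, $0 }')
echo "$out"
```

This should print 2 3 c d e.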

Below is an example of a file having stray carriage returns (line breaks in the middle of a record) and code using getline to fix it.

[test file]
test3:
sukul 8149158828
mumbai 100/900/200
uma 8149122222 chennai
100/800/300
bhanu
8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300


fixing carriage return:

awk '{record=$0;noofflds=split(record,a)
# keep appending the next line until the record has all 4 fields
while (noofflds!=4 && (getline > 0))
{
record=record " " $0
noofflds=split(record,a)
}
print record
}' test3


bos90631:sm017r awk '{record=$0;noofflds=split(record,a)
while (noofflds!=4 && (getline > 0))
{
record=record " " $0
noofflds=split(record,a)
}
print record
}' test3
sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100
himanshu 9090909090 bokharo 100/800/300


Note: split is used to split a string into an array. The number returned by
split is the number of array elements it creates.
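
A short sketch of split and its return value, using one of the amount strings from the test file (the array name parts is only for illustration):

```shell
# With "/" as the separator the string yields three elements.
out=$(awk 'BEGIN { n = split("100/900/200", parts, "/"); print n, parts[1], parts[3] }')

# With no third argument split uses FS, as in the getline example above.
out_default=$(awk 'BEGIN { n = split("sukul 8149158828", a); print n }')

echo "$out"
echo "$out_default"
```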

Example 20: getline with a variable

getline var

This is used to read the next record explicitly into a variable.

This does not change the value of $0.

This changes the value of variables NR and FNR, but not NF and $0
because the record is not split into fields.
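
The sketch below shows this on two made-up input lines: after getline var, NR has advanced but $0 and NF still belong to the first record.

```shell
# getline var stores record 2 in var; $0 and NF still describe record 1.
out=$(printf 'a b\nc d e\n' | awk 'NR == 1 { getline var; print NR, NF, $0; print var }')
echo "$out"
```

The first output line should be 2 2 a b and the second c d e.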

pht022e2:/home/nemo_dev/sm017r> awk '{print $0;(rc=getline var);print var,rc}' test1
sukul 8149158828 mumbai 100/900/200
uma 8149122222 chennai 100/800/300 1
bhanu 8097123451 Jhansi 200/1000/500
shushant 7798977047 nepal 200/9000/100 1
himanshu 9090909090 bokharo 100/800/300
 0




Example 21: getline with a file
getline < "file1"

This is used to take input from a file other than awk's main input.
We may use this when we want a file to be used as a lookup.

A literal filename should be specified in double quotes (" ").

Since the main input stream is not used, this does not change the value of NR or FNR.
But the record is split into fields in the normal manner, so the values of $0 and NF are changed.
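
A small sketch of this behaviour. The temporary file and its contents are made up for the illustration, and the filename is passed in through the hypothetical awk variable f rather than written as a literal.

```shell
# A throwaway lookup file with one three-field line.
lookup=$(mktemp)
printf 'x y z\n' > "$lookup"

# Main input has one record "a b". After getline < f, NR is still 1,
# but $0 and NF now come from the lookup file.
out=$(printf 'a b\n' | awk -v f="$lookup" '{ getline < f; print NR, NF, $0 }')
echo "$out"

rm -f "$lookup"
```

This should print 1 3 x y z.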

 [test file]
 vi lookup
 sukul
 himanshu
 uma
 shushant

 awk '{ name=$1; flag=0;
 while ((getline < "lookup") > 0)
 {
 if (name==$1)
 { flag=1 }
 };
 if (flag==0)
 { print name " is the black sheep"};
 close("lookup")
 }' test1

pht022e2:/home/nemo_dev/sm017r>  awk '{ name=$1; flag=0;
 while ((getline < "lookup") > 0)
 {
 if (name==$1)
 { flag=1 }
 };
 if (flag==0)
 { print name " is the black sheep"};
 close("lookup")
 }' test1
bhanu is the black sheep


Note that we need to close the lookup file before we can start reading it
from the top again.
Also note that a plain getline < "filename" overwrites the value of $0
that came from the main input. So make sure you save the value to another variable
before running the getline command.
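
One possible way to avoid both issues, shown here only as a sketch and not as the method used above, is to combine the two forms as getline var < "file" and load the lookup file once in BEGIN. This form sets only the variable, so $0, NF and NR are left alone. The file contents, the array name known and the variable f are made up for the illustration.

```shell
dir=$(mktemp -d)
printf 'sukul\numa\n' > "$dir/lookup"
printf 'sukul 1\nbhanu 2\numa 3\n' > "$dir/data"

out=$(awk -v f="$dir/lookup" '
BEGIN {
    # Read every lookup name once into an array; only "name" is set.
    while ((getline name < f) > 0) known[name] = 1
    close(f)
}
# Print the records whose first field is not in the lookup.
!($1 in known) { print $1 " is the black sheep" }
' "$dir/data")
echo "$out"

rm -rf "$dir"
```

With this sample data only bhanu is reported. The lookup file is read once instead of once per input record.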
