Unix Shell Scripting For DBA’s (PART – 08)

Continuing from the previous article:

Unix Shell Scripting For DBA’s (PART – 07)

3. AWK programming

awk is among the most powerful filters and is a scripting language in its own right. It was created by three authors, Alfred Aho, Peter Weinberger, and Brian Kernighan, whose initials form its name.


The main advantage of awk is that it can search and process input field by field, which most other filters do not offer. Many advanced shell scripts are built around awk. Pay close attention to the syntax, because a small mistake can produce junk output. awk provides relational operators, logical operators, and variables.

What is gawk?

Gawk is the GNU version of the commonly available UNIX awk program, another popular stream editor.

Since the awk program is often just a link to gawk, we will refer to it as awk.

The basic function of awk is to search files for lines or other text units containing one or more patterns.

When a line matches one of the patterns, special actions are performed on that line.

Programs in awk are different from programs in most other languages, because awk programs are “data-driven”: you describe the data you want to work with and then what to do when you find it. Most other languages are “procedural.” You have to describe, in great detail, every step the program is to take.

When working with procedural languages, it is usually much harder to clearly describe the data your program will process. For this reason, awk programs are often refreshingly easy to read and write.

Gawk commands

When you run awk, you specify an awk program that tells awk what to do. The program consists of a series of rules. (It may also contain function definitions, loops, conditions and other programming constructs, advanced features that we will ignore for now.) Each rule specifies one pattern to search for and one action to perform upon finding the pattern.

There are several ways to run awk. If the program is short, it is easiest to run it on the command line:
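The general form is awk 'PROGRAM' inputfile(s). As a minimal sketch, the one-liner below reads two made-up sample lines from a pipe instead of a file and prints the first field of each:

```shell
# General form:  awk 'PROGRAM' inputfile(s)
# The input comes from a pipe here; the two sample lines are made up.
printf 'alpha 1\nbeta 2\n' | awk '{ print $1 }'
```

The program is enclosed in single quotes so that the shell does not interpret $1 itself.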

If multiple changes have to be made, possibly regularly and on multiple files, it is easier to put the awk commands in a script file. awk reads the script like this:
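The general form is awk -f PROGRAM-FILE inputfile(s). A small sketch follows; the file name first_field.awk and the sample input are assumptions made only for this illustration:

```shell
# General form:  awk -f PROGRAM-FILE inputfile(s)
# first_field.awk is a hypothetical file name used only in this sketch.
cat > first_field.awk <<'EOF'
# Print the first field of every record
{ print $1 }
EOF

# Run the script on two made-up input lines
printf 'alpha 1\nbeta 2\n' | awk -f first_field.awk
```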


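We have three kinds of building blocks. They are

Relational operators

==  --  equal to

!=  --  not equal to

Logical operators

&&  --  AND

||  --  OR

Variables

NR  --  number of records read so far

NF  --  number of fields in the current record

$   --  specifies the field, as in $1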

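Now let us start using this wonderful filter and its syntax.

To print all records

Create a file whose fields are separated by colons, then print all of its records:

$ awk -F ":" '{print}' <filename>

To print only the first field of each record:

$ awk -F ":" '{print $1}' <filename>

To print only the first record:

$ awk -F ":" 'NR==1 {print}' <filename>

To combine two conditions with AND. Here the second condition is redundant, because record 1 can never be record 7, so only the first record is printed:

$ awk -F ":" 'NR==1 && NR!=7 {print}' <filename>

To combine two conditions with OR. This prints the first and the seventh records:

$ awk -F ":" 'NR==1 || NR==7 {print}' <filename>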
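To try these commands without touching a real file, the sketch below creates a small colon-separated file first. The file name emp.txt and its contents are made up for illustration:

```shell
# emp.txt is a hypothetical sample file with colon-separated fields
cat > emp.txt <<'EOF'
101:scott:dba
102:tiger:developer
103:adams:dba
104:blake:analyst
105:clark:dba
106:ford:manager
107:king:dba
EOF

awk -F ":" '{ print $1 }' emp.txt               # first field of every record
awk -F ":" 'NR==1 { print }' emp.txt            # first record only
awk -F ":" 'NR==1 || NR==7 { print }' emp.txt   # first and seventh records
awk -F ":" 'NR==3 { print $2 }' emp.txt         # second field of the third record
awk -F ":" 'END { print NR }' emp.txt           # total number of records
```

Because NR still holds the last record number when the END block runs, the final command is a simple way to count the records in a file.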

4. AWK print program

Printing selected fields

The print command in awk outputs selected data from the input file.

When awk reads a line of a file, it divides the line into fields based on the specified input field separator, FS, which is an awk variable. This variable is predefined to be one or more spaces or tabs.

The variables $1, $2, $3, …, $N hold the values of the first, second, third, and so on up to the last field of an input line. The variable $0 (zero) holds the value of the entire line. A data line in the output of the df command, for instance, has six columns, so $1 through $6 address each of them.

To display a particular column, give its number after the $ sign.
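As a small sketch of field addressing, the line below is a made-up data line in df format, passed to awk through a pipe:

```shell
# One made-up line in the format of "df -h" output
line='/dev/sda1 50G 42G 5.5G 89% /'

# $1 is the first column, $5 the fifth, NF the number of fields, $0 the whole line
printf '%s\n' "$line" | awk '{ print $1; print $5; print NF; print $0 }'
```

This prints /dev/sda1, then 89%, then 6, and finally the complete input line.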

Use awk programming to print specified columns.
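A minimal sketch of printing two columns: the two lines below imitate ls -l output (the file names and sizes are made up), and awk prints the 5th and 9th columns of each:

```shell
# Two made-up lines in "ls -l" format stand in for a real listing
listing='-rw-r--r-- 1 oracle dba 2048 Jan 10 09:15 init.ora
-rw-r----- 1 oracle dba 10485760 Jan 11 18:40 users01.dbf'

# There is no comma between $5 and $9, so the two values are glued together
printf '%s\n' "$listing" | awk '{ print $5 $9 }'
```

The first output line is 2048init.ora, which shows why some formatting is needed.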

In the output above, the 5th and 9th columns are printed, but the output is not structured.

Formatting gives the output a structure.


Formatting fields

Without formatting, using only the output separator, the output looks rather poor. Inserting a couple of tabs and a string to indicate what output this is will make it look a lot better:

The command ls -ldh * produces nine columns of output.

Print selected fields with formatted output.
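One way this could look is sketched below. The two listing lines are made-up stand-ins for real ls -ldh output, and the label texts are arbitrary:

```shell
# Two made-up lines in "ls -ldh" format stand in for a real listing
listing='-rw-r--r-- 1 oracle dba 2.0K Jan 10 09:15 init.ora
-rw-r----- 1 oracle dba  10M Jan 11 18:40 users01.dbf'

# Literal strings and a tab (\t) around the fields make the output readable
printf '%s\n' "$listing" | awk '{ print "File: " $9 "\tSize: " $5 }'
```

On a live system the same awk program can be fed by ls -ldh * directly.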

The same approach formats the output of the df command.

First, the sort command puts the lines in reverse numeric order of the usage column.

Then the head command keeps only the three fullest mount points.

Finally, awk prints the required columns of df in a structured format.
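The steps above can be sketched as a single pipeline. The df output here is canned sample data (device names and sizes are made up) so that the result is predictable:

```shell
# Made-up sample in the format of "df -h" output
df_sample='Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   42G  5.5G  89% /
/dev/sda2       100G   45G   50G  48% /u01
/dev/sdb1       200G   30G  160G  16% /u02
nfshost:/export 500G  350G  150G  70% /backup
tmpfs           7.8G     0  7.8G   0% /dev/shm'

# sort: reverse numeric order on column 5; head: top 3; awk: formatted report
printf '%s\n' "$df_sample" |
  sort -rn -k 5 |
  head -3 |
  awk '{ print "Partition " $6 "\t: " $5 " full!" }'
```

With this sample the report lists /, /backup and /u01, in that order. Replacing the printf line with df -h runs the same pipeline against the real mount points.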

Formatting characters for gawk

Sequence    Meaning

\a          Bell character

\n          Newline character

\t          Tab

The print command and regular expressions

A regular expression can be used as a pattern by enclosing it in slashes. The regular expression is then tested against the entire text of each record. The syntax is as follows:
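The general form is awk '/REGEX/ { ACTION }' file(s). In the sketch below, the three input lines are made-up sample messages, and only the lines that start with ORA- are printed:

```shell
# General form:  awk '/REGEX/ { ACTION }' file(s)
# The three input lines are made-up sample messages.
printf '%s\n' \
  'ORA-00942: table or view does not exist' \
  'Statement processed' \
  'ORA-01017: invalid username/password' |
  awk '/^ORA-/ { print }'
```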

The following example displays only local disk device information; networked file systems are not shown:
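A sketch of such a filter, assuming that local disks appear under /dev/ in the first column. The df output is canned sample data with made-up device names:

```shell
# Made-up sample in the format of "df -h" output
df_sample='Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   42G  5.5G  89% /
/dev/sda2       100G   45G   50G  48% /u01
/dev/sdb1       200G   30G  160G  16% /u02
nfshost:/export 500G  350G  150G  70% /backup'

# Keep only the lines that start with /dev/ and print mount point and usage
printf '%s\n' "$df_sample" | awk '/^\/dev\// { print $6 "\t: " $5 }'
```

The slashes inside the pattern are escaped with a backslash because a slash also marks the end of the regular expression. The header line and the nfshost line do not match, so they are left out.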

Below is another example, where we search the /etc directory for files ending in “.conf” and starting with either “a” or “x”, using extended regular expressions:
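A minimal sketch of this search, using a plain list of file names rather than a long listing so that the pattern stays portable across awk versions. The names below are made up; on a live system the list would come from ls /etc:

```shell
# Made-up file names stand in for the output of "ls /etc"
names='adduser.conf
fstab
xattr.conf
host.conf
asound.conf.bak'

# ^(a|x) : name starts with a or x;  \.conf$ : name ends in .conf
printf '%s\n' "$names" | awk '/^(a|x).*\.conf$/ { print }'
```

Only adduser.conf and xattr.conf are printed. host.conf starts with the wrong letter and asound.conf.bak does not end in .conf.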

Special patterns

In order to precede output with comments, use the BEGIN statement.
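A short sketch of a BEGIN block follows; the heading text and the two input lines are made up:

```shell
# The BEGIN block runs once, before the first input line is read
printf 'adduser.conf\nxattr.conf\n' |
  awk 'BEGIN { print "Files found:" } { print }'
```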

As the example above shows, BEGIN lets us add text at the beginning of the output.

The END statement can be added for inserting text after the entire input is processed.
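A matching sketch for END, again with made-up input. Because NR holds the number of records read, it can be used for a closing summary line:

```shell
# The END block runs once, after the last input line has been processed
printf 'adduser.conf\nxattr.conf\n' |
  awk '{ print } END { print "Total files: " NR }'
```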

Gawk scripts

As commands tend to get a little longer, you might want to put them in a script, so they are reusable. An awk script contains awk statements defining patterns and actions.

As an illustration, we will build a report that displays our most loaded partitions.

Write an awk script:

This script reports the partitions that are between 40% and 90% full.

Now use the script on the output of the df command.
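The original script is not reproduced here, so the following is only a sketch of how such a report could be written. The file name diskrep.awk, the message texts and the sample df output are all assumptions:

```shell
# diskrep.awk is a hypothetical script name used only in this sketch
cat > diskrep.awk <<'EOF'
BEGIN { print "*** Partitions between 40% and 90% full ***" }
# $5 is the Use% column; adding 0 turns a string such as "48%" into the number 48
NR > 1 && $5+0 >= 40 && $5+0 <= 90 { print "Partition " $6 "\t: " $5 " full" }
END { print "*** End of report ***" }
EOF

# Made-up sample in the format of "df -h" output
df_sample='Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   42G  5.5G  89% /
/dev/sda2       100G   45G   50G  48% /u01
/dev/sdb1       200G   30G  160G  16% /u02
nfshost:/export 500G  350G  150G  70% /backup'

printf '%s\n' "$df_sample" | awk -f diskrep.awk

# On a live system:  df -h | awk -f diskrep.awk
```

With the sample data the report lists /, /u01 and /backup, and skips /u02 because it is only 16% full. The condition NR > 1 skips the header line.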

5. AWK variables

As awk is processing the input file, it uses several variables. Some are editable, some are read-only.

The input field separator

The field separator, which is either a single character or a regular expression, controls the way awk splits up an input record into fields. The input record is scanned for character sequences that match the separator definition; the fields themselves are the text between the matches.
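Both kinds of separator can be sketched on a made-up input line. The first command splits on a single character, the second on a regular expression that matches either a colon or a comma:

```shell
# FS as a single character: split on ":"
printf 'scott:tiger:dba\n' | awk 'BEGIN { FS=":" } { print $2 }'

# FS as a regular expression: split on either ":" or ","
printf 'scott:tiger,dba\n' | awk 'BEGIN { FS="[:,]" } { print $3, NF }'
```

The first command prints tiger. The second prints dba 3, because the line is split into three fields.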

The field separator is represented by the built-in variable FS. Note that this is something different from the IFS variable used by POSIX-compliant shells.

The value of the field separator variable can be changed in the awk program with the assignment operator =. Often the right time to do this is at the beginning of execution before any input has been processed, so that the very first record is read with the proper separator. To do this, use the special BEGIN pattern.

In the example below, we build a command that displays all the users on your system with a description:
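A sketch of such a command, assuming the usual /etc/passwd layout in which the first field is the user name and the fifth is the description. The two input lines are made up so that the output is predictable:

```shell
# Two made-up lines in /etc/passwd format
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'oracle:x:54321:54321:Oracle software owner:/home/oracle:/bin/bash' |
  awk 'BEGIN { FS=":" } { print $1 "\t" $5 }'

# On a live system:  awk 'BEGIN { FS=":" } { print $1 "\t" $5 }' /etc/passwd
```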

The default input field separator is one or more white spaces or tabs.

The output field separator

Fields are normally separated by spaces in the output. This becomes apparent when you use the correct syntax for the print command, where arguments are separated by commas:

Let us create a file with content as below.
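The original file contents are not shown, so the two lines below are an assumption chosen to keep the examples short. The file is named test to match the commands used later in this article:

```shell
# Create a two-line sample file named "test" (contents are made up)
cat > test <<'EOF'
record1 data1
record2 data2
EOF

cat test
```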

Let us print the fields without a comma between them, so that no separator appears in the output.
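A minimal sketch, with the two made-up sample lines supplied through a pipe:

```shell
# No comma between $1 and $2: the fields are concatenated
printf 'record1 data1\nrecord2 data2\n' | awk '{ print $1 $2 }'
```

The first line of output is record1data1.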

Let us print the fields with a comma between them, so that the output field separator is inserted.
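The same made-up input, this time with a comma in the print statement:

```shell
# A comma between $1 and $2: awk inserts OFS, which is a single space by default
printf 'record1 data1\nrecord2 data2\n' | awk '{ print $1, $2 }'
```

The first line of output is now record1 data1.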

The output record separator

The output from an entire print statement is called an output record. Each print command results in one output record, and then outputs a string called the output record separator, ORS. The default value for this variable is “\n”, a newline character. Thus, each print statement generates a separate line.

To change the way output fields and records are separated, assign new values to OFS and ORS:

[oracle@dba15 ~]$ awk 'BEGIN { OFS=";" ; ORS="\n-->\n" } { print $1,$2 }' test

If the value of ORS does not contain a newline, the program’s output is run together on a single line.
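Both effects can be sketched on two made-up input lines supplied through a pipe:

```shell
# OFS=";" joins the fields; ORS ends every record with a newline, "-->" and another newline
printf 'record1 data1\nrecord2 data2\n' |
  awk 'BEGIN { OFS=";" ; ORS="\n-->\n" } { print $1, $2 }'

# ORS without a newline: all records end up on one line
printf 'record1 data1\nrecord2 data2\n' |
  awk 'BEGIN { ORS=" " } { print $1 }'
echo    # finish the line, since awk printed no newline
```

The first command prints record1;data1 followed by a line containing -->, then the same for the second record. The second command prints record1 record2 on a single line.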

The number of records

The built-in NR holds the number of records that are processed. It is incremented after reading a new input line. You can use it at the end to count the total number of records, or in each output record.

Write an awk script that numbers each record and reports the total at the end.

Then run it on the test file.
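The original script is not shown, so this is a sketch of what it could look like. The file name processed.awk, the separator values and the two input lines are assumptions:

```shell
# processed.awk is a hypothetical script name used only in this sketch
cat > processed.awk <<'EOF'
BEGIN { OFS="-" ; ORS="\n--> done\n" }
{ print "Record number " NR ":\t" $1, $2 }
END { print "Number of records processed: " NR }
EOF

# Two made-up input lines; a file name can be given instead of the pipe
printf 'record1 data1\nrecord2 data2\n' | awk -f processed.awk
```

Each record is printed with its number, and every print statement is followed by a line reading --> done. The END block prints the total, 2 for this input, because NR still holds the number of the last record.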

User defined variables

Apart from the built-in variables, you can define your own. When awk encounters a reference to a variable which does not exist (which is not predefined), the variable is created and initialized to a null string. For all subsequent references, the value of the variable is whatever value was assigned last. Variables can be a string or a numeric value. Content of input fields can also be assigned to variables.

Values can be assigned directly using the = operator, or you can use the current value of the variable in combination with other operators.

Now write an awk script that uses a variable of its own.

Then run the script on an input file.
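As a sketch under stated assumptions, the script below keeps a running total in a user-defined variable named total. The file names revenues.txt and total.awk, and the three data lines, are made up for illustration:

```shell
# Made-up data: start date, end date, activity, customer, amount
cat > revenues.txt <<'EOF'
20240109 20240113 consultancy BigComp 2500
20240209 20240210 training EduComp 2000
20240301 20240315 appdev SmartComp 10000
EOF

# total is created on first use and starts out empty, which counts as 0 in arithmetic
cat > total.awk <<'EOF'
{ total = total + $5 }
{ print "Send bill for " $5 " dollar to " $4 }
END { print "---------------------------------\nTotal revenue: " total }
EOF

awk -f total.awk revenues.txt
```

The last line of output is Total revenue: 14500, the sum of the fifth column.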


Note: Please test these scripts in a non-production environment before trying them in production.
