Unix Shell Scripting For DBA’s (PART – 06)

Reference for this article: www.orskl.com

Refer to the previous article in this series.



Regular expressions

What are regular expressions?

A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions by using various operators to combine smaller expressions.

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any meta character with special meaning may be quoted by preceding it with a backslash.

The following operators (meta characters) have special meaning in a regular expression; the repetition operators apply to the item that precedes them:

Operator          Effect

  • .                      Matches any single character.
  • ?                     The preceding item is optional and will be matched, at most, once.
  • *                     The preceding item will be matched zero or more times.
  • +                     The preceding item will be matched one or more times.
  • {N}                  The preceding item is matched exactly N times.
  • {N,}                 The preceding item is matched N or more times.
  • {N,M}              The preceding item is matched at least N times, but not more than M times.
  • -                      Inside a bracket list, represents a range, unless it is the first or last character in the list or the ending point of a range.
  • ^                     Matches the empty string at the beginning of a line; as the first character of a bracket list, it matches the characters not in the list.
  • $                    Matches the empty string at the end of a line.
  • \b                   Matches the empty string at the edge of a word.
  • \B                   Matches the empty string provided it’s not at the edge of a word.
  • \<                   Matches the empty string at the beginning of a word.
  • \>                   Matches the empty string at the end of a word.
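As a minimal sketch of a few of these operators (assuming GNU grep; the word list is invented for illustration), the -E option is used here so that ?, + and {N} can be written without backslashes:

```shell
# Hypothetical word list, used only for this illustration.
words=$(printf '%s\n' color colour colouur ct cat caat dog)

# ? : the preceding "u" is optional (zero or one time).
optional=$(printf '%s\n' "$words" | grep -E '^colou?r$')

# + : one or more "a" between c and t.
one_or_more=$(printf '%s\n' "$words" | grep -E '^ca+t$')

# {2} : exactly two "a" between c and t.
exactly_two=$(printf '%s\n' "$words" | grep -E '^ca{2}t$')

# * : zero or more "a" between c and t.
zero_or_more=$(printf '%s\n' "$words" | grep -E '^ca*t$')

printf '%s\n' "$optional" "$one_or_more" "$exactly_two" "$zero_or_more"
```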

Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator “|”; the resulting regular expression matches any string matching either subexpression.

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
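The precedence rules can be sketched as follows, assuming GNU grep with -E; the sample strings are made up for the illustration:

```shell
# Hypothetical sample strings.
samples=$(printf '%s\n' abbb abab ORA-00600 TNS-12541 RMAN-03002)

# Repetition binds tighter than concatenation: + applies only to "b".
rep_b=$(printf '%s\n' "$samples" | grep -E '^ab+$')

# Parentheses make + apply to the whole group "ab".
rep_ab=$(printf '%s\n' "$samples" | grep -E '^(ab)+$')

# Alternation inside a group, followed by a shared suffix.
errors=$(printf '%s\n' "$samples" | grep -E '^(ORA|TNS)-[0-9]+$')

printf '%s\n' "$rep_b" "$rep_ab" "$errors"
```

Without the parentheses in the last pattern, the alternation would split the whole expression into `^ORA` and `TNS-[0-9]+$`, because alternation has the lowest precedence.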

The regular expressions above are discussed in the following sections along with the grep command.


What is grep?

Grep searches the input files for lines containing a match to a given pattern list. When it finds a match in a line, it copies the line to standard output (by default), or whatever other sort of output you have requested with options.

Though grep expects to do the matching on text, it has no limits on input line length other than available memory, and it can match arbitrary characters within a line. If the final byte of an input file is not a newline, grep silently supplies one. Since newline is also a separator for the list of patterns, there is no way to match newline characters in a text.

Syntax :
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE…]
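A small sketch of the two ways of supplying patterns shown in the syntax line; the file names alert.log and patterns.txt and their contents are hypothetical:

```shell
tmp=$(mktemp -d)
# Hypothetical log file and pattern file.
printf '%s\n' 'ORA-00600 internal error' 'listener started' 'TNS-12541 no listener' > "$tmp/alert.log"
printf '%s\n' 'ORA-' 'TNS-' > "$tmp/patterns.txt"

# -e gives a pattern on the command line; it can be repeated.
by_e=$(grep -e 'ORA-' -e 'TNS-' "$tmp/alert.log")

# -f reads the patterns from a file, one per line.
by_f=$(grep -f "$tmp/patterns.txt" "$tmp/alert.log")

echo "$by_e"
rm -rf "$tmp"
```

Both commands select the same lines here, since a line is printed when it matches any of the patterns.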

Grep and regular expressions

Note: If you are not on Linux

We use GNU grep in these examples, which supports extended regular expressions. GNU grep is the default on Linux systems. If you are working on proprietary systems, use the -V option to check which version you are using. GNU grep can be downloaded from http://gnu.org/directory/.

Examples:

To find the expression root inside the /etc/passwd file, pass the pattern and the file name to grep.
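A minimal sketch of this search. To keep the output predictable, it runs against a small made-up file standing in for /etc/passwd; on a real system the command would be `grep root /etc/passwd`.

```shell
tmp=$(mktemp -d)
# Hypothetical stand-in for /etc/passwd.
cat > "$tmp/passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
oracle:x:1001:1001:Oracle owner:/home/oracle:/bin/ksh
EOF

# Print every line that contains "root".
found=$(grep root "$tmp/passwd")
echo "$found"
rm -rf "$tmp"
```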

To see the line number at which root appears, use the -n option.
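A short sketch of -n, again on a made-up stand-in for /etc/passwd:

```shell
tmp=$(mktemp -d)
# Hypothetical stand-in for /etc/passwd.
cat > "$tmp/passwd" <<'EOF'
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
root:x:0:0:root:/root:/bin/bash
oracle:x:1001:1001:Oracle owner:/home/oracle:/bin/ksh
EOF

# -n prefixes each matching line with its line number and a colon.
numbered=$(grep -n root "$tmp/passwd")
echo "$numbered"
rm -rf "$tmp"
```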

Inverse search: with the -v option, grep prints everything from a file except the lines that match the expression.

To see the numbers of the lines in which the root pattern does not appear, combine -v with -n.
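A sketch of the inverse search, using the same kind of made-up passwd file:

```shell
tmp=$(mktemp -d)
# Hypothetical stand-in for /etc/passwd.
cat > "$tmp/passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
oracle:x:1001:1001:Oracle owner:/home/oracle:/bin/ksh
EOF

# -v selects the lines that do NOT contain "root".
inverse=$(grep -v root "$tmp/passwd")

# -v with -n also shows the line numbers of those lines.
inverse_numbered=$(grep -vn root "$tmp/passwd")

echo "$inverse_numbered"
rm -rf "$tmp"
```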

Grep for an expression in a group of files in a directory.
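A sketch of searching several files at once. The directory and the two files in it are made up; when more than one file is searched, grep prefixes each match with the file name.

```shell
tmp=$(mktemp -d)
# Hypothetical files standing in for /etc/passwd and /etc/group.
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' 'oracle:x:1001:1001::/home/oracle:/bin/ksh' > "$tmp/passwd"
printf '%s\n' 'root:x:0:' 'dba:x:1001:oracle' > "$tmp/group"

# The shell expands * to every file in the directory.
hits=$(cd "$tmp" && grep root *)
echo "$hits"
rm -rf "$tmp"
```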

Exclude an expression when grepping a group of files.

If every line contains the pattern bin, excluding bin returns empty output.

If we exclude ksh instead, grep returns every line that does not contain ksh.
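The two cases can be sketched like this; the files shells and more_shells are hypothetical and only loosely modelled on /etc/shells:

```shell
tmp=$(mktemp -d)
# Hypothetical files; every line contains "bin".
printf '%s\n' /bin/sh /bin/bash /bin/ksh > "$tmp/shells"
printf '%s\n' /usr/bin/ksh /usr/bin/zsh > "$tmp/more_shells"

# Excluding "bin" leaves nothing; grep exits with status 1 when no
# line is selected, hence the "|| true".
no_bin=$(cd "$tmp" && grep -v bin *shells || true)

# Excluding "ksh" keeps the other lines, prefixed with the file name.
no_ksh=$(cd "$tmp" && grep -v ksh *shells)

echo "$no_ksh"
rm -rf "$tmp"
```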

To grep for lines in a file that start with a regular expression, use ^.
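A sketch of the ^ anchor on a made-up passwd file, in which the word oracle also appears in the middle of another line:

```shell
tmp=$(mktemp -d)
# Hypothetical stand-in for /etc/passwd.
cat > "$tmp/passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
oracle:x:1001:1001:Oracle owner:/home/oracle:/bin/ksh
grid:x:1002:1001:managed by oracle:/home/grid:/bin/bash
EOF

# Only lines that BEGIN with "oracle" are selected.
starts=$(grep '^oracle' "$tmp/passwd")
echo "$starts"
rm -rf "$tmp"
```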

To grep for lines in a file that end with a regular expression, use $.
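A sketch of the $ anchor; the file contents are made up so that ksh also appears at the start of a line that does not end with it:

```shell
tmp=$(mktemp -d)
# Hypothetical stand-in for /etc/passwd.
cat > "$tmp/passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
oracle:x:1001:1001:Oracle owner:/home/oracle:/bin/ksh
ksh_admin:x:1003:1003::/home/ksh_admin:/bin/bash
EOF

# Only lines that END with "ksh" are selected. Single quotes keep
# the shell from treating $ specially.
ends=$(grep 'ksh$' "$tmp/passwd")
echo "$ends"
rm -rf "$tmp"
```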

To grep for an expression only where it has blank spaces or special symbols on either side, match it as a whole word.
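One way to do this, assuming this is what the original example showed, is the -w option of grep, which selects only lines where the match forms a whole word. The sample lines are made up:

```shell
tmp=$(mktemp -d)
# Hypothetical stand-in for /etc/passwd.
cat > "$tmp/passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
rootkit:x:1004:1004::/home/rootkit:/bin/bash
chroot:x:1005:1005::/home/chroot:/bin/bash
EOF

# -w: "root" must not be preceded or followed by a letter, digit
# or underscore, so rootkit and chroot are skipped.
whole_word=$(grep -w root "$tmp/passwd")
echo "$whole_word"
rm -rf "$tmp"
```

The same effect can be had with the word anchors from the operator list, as in `grep '\<root\>'`.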

Character classes

A bracket expression is a list of characters enclosed by “[” and “]”. It matches any single character in that list; if the first character of the list is the caret, “^”, then it matches any character NOT in the list. For example, the regular expression “[0123456789]” matches any single digit.

Grep for individual characters listed in a bracket expression.
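A small sketch of a bracket expression and its negated form; the word list is invented for the illustration:

```shell
# Hypothetical word list.
words=$(printf '%s\n' gray grey griy)

# [ae] matches exactly one character, either "a" or "e".
listed=$(printf '%s\n' "$words" | grep 'gr[ae]y')

# [^ae] matches exactly one character that is neither "a" nor "e".
not_listed=$(printf '%s\n' "$words" | grep 'gr[^ae]y')

printf '%s\n' "$listed" "$not_listed"
```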

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale’s collating sequence and character set. For example, in the default C locale, “[a-d]” is equivalent to “[abcd]”. Many locales sort characters in dictionary order, and in these locales “[a-d]” is typically not equivalent to “[abcd]”; it might be equivalent to “[aBbCcDd]”, for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value “C”.
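A sketch of a range expression evaluated in the C locale; LC_ALL is set for the grep command only, and the input letters are made up:

```shell
# Hypothetical input: mixed-case single letters.
letters=$(printf '%s\n' a B c D e)

# In the C locale [a-d] means exactly a, b, c or d, so the
# upper-case letters are not selected.
in_range=$(printf '%s\n' "$letters" | LC_ALL=C grep '[a-d]')
echo "$in_range"
```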

Finally, certain named classes of characters are predefined within bracket expressions. See the grep man or info pages for more information about these predefined expressions.
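As a sketch of the named classes, here are three of them used inside bracket expressions; the input names are made up:

```shell
# Hypothetical input names.
names=$(printf '%s\n' oracle ORCL db01 'tns names')

# [[:digit:]] matches any single digit.
with_digit=$(printf '%s\n' "$names" | grep '[[:digit:]]')

# [[:upper:]] matches any single upper-case letter.
upper_start=$(printf '%s\n' "$names" | grep '^[[:upper:]]')

# [[:space:]] matches a blank or other white-space character.
with_space=$(printf '%s\n' "$names" | grep '[[:space:]]')

printf '%s\n' "$with_digit" "$upper_start" "$with_space"
```

Note the double brackets: the class name, including its own brackets, sits inside the bracket expression.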


Use the “.” for a single character match. If you want to get a list of all five-character English dictionary words starting with “c” and ending in “h” (handy for solving crosswords):
cathy ~> grep '\<c...h\>' /usr/share/dict/words

If you want to display lines containing the literal dot character, use the -F option to grep.
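A sketch of the difference between the dot as a meta character and the literal dot matched with -F; the two file names are made up:

```shell
# Hypothetical input names.
names=$(printf '%s\n' listener.ora listener_ora)

# Without -F the dot matches any single character, so both lines match.
any_char=$(printf '%s\n' "$names" | grep 'listener.ora')

# With -F the pattern is a fixed string and the dot is literal.
literal_dot=$(printf '%s\n' "$names" | grep -F 'listener.ora')

echo "$literal_dot"
```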

For matching multiple characters, use the asterisk. This example selects all words starting with “c” and ending in “h” from the system’s dictionary:
cathy ~> grep '\<c.*h\>' /usr/share/dict/words

Grep for a regular expression that starts and ends with given characters and has a fixed number of characters in between.
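The three patterns can be compared on a small made-up word list, which avoids depending on /usr/share/dict/words being installed. This sketch assumes GNU grep for the \< and \> word anchors:

```shell
# Hypothetical word list.
words=$(printf '%s\n' catch cloth couch coach cash church each)

# Three dots: exactly three characters between c and h.
five_letters=$(printf '%s\n' "$words" | grep '\<c...h\>')

# .* : any number of characters between c and h.
any_length=$(printf '%s\n' "$words" | grep '\<c.*h\>')

# .\{3\} : the same fixed count written as a repetition.
fixed_count=$(printf '%s\n' "$words" | grep '\<c.\{3\}h\>')

echo "$five_letters"
```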


Character ranges

Apart from grep and regular expressions, there’s a good deal of pattern matching that you can do directly in the shell, without having to use an external program.

As you already know, the asterisk (*) and the question mark (?) match any string or any single character, respectively. Quote these special characters to match them literally.

Filter the files in a location with a range of characters.
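A sketch of a character range used directly in the shell, with no grep involved; the directory and file names are made up:

```shell
tmp=$(mktemp -d)
# Hypothetical files.
touch "$tmp/alert.log" "$tmp/backup.log" "$tmp/control.ctl" "$tmp/x_trace.trc"

# [a-c]* expands to the names whose first character is a, b or c.
range_files=$(cd "$tmp" && ls -d [a-c]*)

# A quoted pattern is not expanded; it is passed on literally.
literal=$(cd "$tmp" && echo '[a-c]*')

echo "$range_files"
rm -rf "$tmp"
```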

Character classes

Character classes can be specified within the square brackets, using the syntax [:CLASS:], where CLASS is defined in the POSIX standard and has one of the values “alnum”, “alpha”, “ascii”, “blank”, “cntrl”, “digit”, “graph”, “lower”, “print”, “punct”, “space”, “upper”, “word” or “xdigit”.

Filter files with digits in the name or an upper-case character in the name.

A pattern that ends in [[:digit:]] finds the file names that have a digit at the end.

For upper case, use [[:upper:]] in the same way.
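A sketch of both filters; the directory and file names are hypothetical:

```shell
tmp=$(mktemp -d)
# Hypothetical files.
touch "$tmp/export01" "$tmp/export_full" "$tmp/Backup" "$tmp/notes"

# Names whose last character is a digit.
ends_with_digit=$(cd "$tmp" && ls -d *[[:digit:]])

# Names that contain an upper-case character anywhere.
has_upper=$(cd "$tmp" && ls -d *[[:upper:]]*)

printf '%s\n' "$ends_with_digit" "$has_upper"
rm -rf "$tmp"
```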


Thank you….


Note: Please test scripts in Non Prod before trying in Production.