Nifty Computer Tricks: Grep

Tuesday, October 14, 2008

I like either for grep'ing

There are times when you want to grep a file for several different kinds of lines at once. For example, suppose the text file below is a log file and I'd like to see all the CmdStat entries, along with the Value line that followed each of them.

log.txt:
CmdStat=InFlow
Info=More Data and Values
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
Notes=end

Well, the first guess of running "grep CmdStat log.txt" followed by "grep Value log.txt" will certainly generate all the right lines, but it will not tell me where the Value lines are compared to the CmdStat lines. There could be CmdStat lines without Value lines and vice versa.

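A better solution is to use the very useful -e option. Strictly speaking, -e just introduces a pattern, but you can give it more than once and grep will print every line that matches any of the patterns, which is why I like to think of it as "either". Doing a "grep -eCmdStat= -eValue= log.txt" will yield: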
CmdStat=InFlow
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42

which shows me all the CmdStat and Value entries in their original order. Note that I used Value= instead of Value so we do not accidentally get lines that merely contain the word Value, such as the Info line above. It is always a good idea to grep for a pattern that is as specific as possible.
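
Here is a small sketch of the above that you can paste into a shell. It recreates the sample log in a scratch directory (the workdir, with_e and with_alt variable names are just my own for illustration) and also shows that, as far as I know, several -e patterns behave the same as one extended regular expression with alternation:

```shell
# Recreate the sample log from the post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/log.txt" <<'EOF'
CmdStat=InFlow
Info=More Data and Values
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
Notes=end
EOF

# Two -e patterns: a line is printed if it matches either one.
with_e=$(grep -e 'CmdStat=' -e 'Value=' "$workdir/log.txt")

# The same selection written as a single extended regex with alternation.
with_alt=$(grep -E 'CmdStat=|Value=' "$workdir/log.txt")

printf '%s\n' "$with_e"
```

Both variables should end up holding the same five lines shown above. Which spelling you use is a matter of taste; I find the -e form easier to extend one pattern at a time.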

You may of course also combine -e with other useful options, such as -i (case insensitive) and -v (invert the match, i.e. print everything except the matching lines). So if I want to weed out all the Info and Notes lines from the file, I could do a "grep -v -eNotes= -eInfo= log.txt".
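
A quick sketch of both combinations, again on a scratch copy of the sample log (the variable names kept and ci are only for illustration):

```shell
# Recreate the sample log from the post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/log.txt" <<'EOF'
CmdStat=InFlow
Info=More Data and Values
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
Notes=end
EOF

# -v with two -e patterns: keep only lines that match NEITHER pattern.
kept=$(grep -v -e 'Notes=' -e 'Info=' "$workdir/log.txt")

# -i: the lowercase pattern still finds the mixed-case CmdStat= lines.
ci=$(grep -i -e 'cmdstat=' "$workdir/log.txt")

printf '%s\n' "$kept"
```

On this particular file, weeding out Notes= and Info= leaves exactly the CmdStat and Value lines, but in a log with other kinds of entries the two approaches would of course give different results.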

Nifty.

Saturday, August 2, 2008

Grep'ing for a column value

A few days ago I needed to get only those lines from a file that contained a certain value in a certain field. To simplify the example, let's say that I have a file containing id, first name, last name, state, and phone number, and that I am only interested in those entries containing a state of California (e.g. ca or CA).

Here is the sample file named in.txt:
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543

I cannot simply do a "grep -i ca in.txt" as I would get all lines containing "ca", even if "ca" occurred in the name fields (e.g. in carl or carlson). In the above case I would get all lines except line 4, which is of course incorrect.

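If I did "cut -f4 in.txt | grep -i ca" I would get the correct number of results (i.e. 3), but I would only get the 3 "ca" values and not the whole line, as I cut everything else away. (Note that cut splits on tabs by default; if the fields are separated by single spaces, add -d' ' to the cut command.)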
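
To see the problem for yourself, here is a small sketch. I am assuming the fields are separated by single spaces, hence the -d ' '; the workdir and states names are just for illustration:

```shell
# Recreate the sample file from the post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/in.txt" <<'EOF'
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
EOF

# Assumption: single-space separators, so cut needs an explicit delimiter.
# Only the fourth field survives the cut, so grep never sees the rest.
states=$(cut -d ' ' -f4 "$workdir/in.txt" | grep -i ca)

printf '%s\n' "$states"
```

The right three entries are found, but all that is left of them is the state value itself.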

So one way to solve the problem (there may be other more clever ones I'm not aware of), is to use the "awk" command instead of the "grep" command.
awk '{if ($4=="ca") print $0;}' in.txt

I think the above is self-explanatory even for non-awk folks; the only trick is that $0 stands for the entire line. If you only wanted to get certain columns in the output, you could instead print $2,$3,$5, for example, to get first name, last name and phone number.
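
As a sketch of the column variant, this prints only first name, last name and phone number for the lowercase "ca" rows. It uses awk's shorter pattern { action } form instead of an explicit if, which should behave the same way; the names variable is just for illustration:

```shell
# Recreate the sample file from the post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/in.txt" <<'EOF'
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
EOF

# Print fields 2, 3 and 5 for rows whose fourth field is exactly "ca".
# The commas make awk join the fields with a single space.
names=$(awk '$4 == "ca" { print $2, $3, $5 }' "$workdir/in.txt")

printf '%s\n' "$names"
```

This comparison is case sensitive, so line6 with its uppercase CA is not included.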

If you want to make the solution case insensitive, simply use the awk function "tolower()" in your if statement:
awk '{if (tolower($4)=="ca") print $0;}' in.txt
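
Here is the case-insensitive version as a self-contained sketch you can paste into a shell. It relies on the fact that an awk pattern without an action prints the whole line, which should be equivalent to the if/print $0 form; the matches variable is just for illustration:

```shell
# Recreate the sample file from the post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/in.txt" <<'EOF'
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
EOF

# Lowercase the fourth field before comparing, so ca and CA both match.
# With no action given, awk prints the entire matching line.
matches=$(awk 'tolower($4) == "ca"' "$workdir/in.txt")

printf '%s\n' "$matches"
```

This should print line1, line5 and line6 in full, and leave out carl and carlson because only the state field is compared.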

Nifty.
