There are times when you want to grep for several different patterns in one file. For example, suppose the text file below is a log file, and I'd like to see all the CmdStat entries along with the Value line that followed each of them.
log.txt:
CmdStat=InFlow
Info=More Data and Values
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
Notes=end
Well, the first guess of running "grep CmdStat log.txt" followed by "grep Value log.txt" will certainly generate all the right lines, but it will not tell me where the Value lines are compared to the CmdStat lines. There could be CmdStat lines without Value lines and vice versa.
A better solution is to use the very useful -e option. Each -e specifies one pattern, it may be given more than once, and grep then prints every line that matches any of the patterns. Doing a "grep -eCmdStat= -eValue= log.txt" will yield:
CmdStat=InFlow
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
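which shows me all the CmdStat and Value entries in the order they appear in the file. Note that I used Value= instead of Value so we do not accidentally get lines that merely contain the word Value, such as the Info line with "Values" in it. It is always a good idea to grep for a pattern that is as specific as possible.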
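To make the position of each match explicit, the same command can be combined with -n, which prefixes every matching line with its line number in the file. The following is only a sketch: it recreates the sample log.txt in a scratch directory, and the variable name matches is purely illustrative.

```shell
# Recreate the sample log file in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
cat > log.txt <<'EOF'
CmdStat=InFlow
Info=More Data and Values
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
Notes=end
EOF

# -n prefixes each matching line with its line number in log.txt,
# so a CmdStat line without a following Value line is easy to spot.
matches=$(grep -n -e 'CmdStat=' -e 'Value=' log.txt)
printf '%s\n' "$matches"
```

Here the first CmdStat entry (line 1) is followed by another CmdStat entry on line 3, so it has no Value line of its own.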
You may of course also combine -e with other useful options, such as -i (ignore case) and -v (invert the match, i.e. print only the lines that do not match). So if I want to weed out all the Info and Notes lines from the file, I could do a "grep -v -eNotes= -eInfo= log.txt".
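As a quick sketch of both options on the sample log (again recreated in a scratch directory; the variable names kept and cmd_count are made up for the example, and -c is used only to count the matching lines):

```shell
# Recreate the sample log file in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
cat > log.txt <<'EOF'
CmdStat=InFlow
Info=More Data and Values
CmdStat=OutStat
Value=-3
CmdStat=Done
Value=42
Notes=end
EOF

# -v inverts the match: print every line that matches none of the -e patterns.
kept=$(grep -v -e 'Notes=' -e 'Info=' log.txt)
printf '%s\n' "$kept"

# -i ignores case, so a lowercase pattern still finds the CmdStat lines.
# -c prints the number of matching lines instead of the lines themselves.
cmd_count=$(grep -i -c -e 'cmdstat=' log.txt)
printf 'CmdStat lines: %s\n' "$cmd_count"
```

For this particular file the -v command leaves exactly the CmdStat and Value lines, the same five lines as the earlier -e example.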
Nifty.
Tuesday, October 14, 2008
Saturday, August 2, 2008
Grep'ing for a column value
A few days ago I needed to get only those lines from a file that contained a certain value in a certain field. To simplify the example, let's say that I have a file containing id, first name, last name, state, and phone number, and that I am only interested in those entries with a state of California (i.e. ca or CA).
Here is the sample file named in.txt:
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
I cannot simply do a "grep -i ca in.txt", as I would get every line containing "ca", even when "ca" occurs in one of the name fields (as in carl or carlson). In the above case I would get all lines except line 4, which is of course incorrect.
If I did "cut -f4 in.txt | grep -i ca" I would get the correct number of results (i.e. 3), but I would only get the 3 "ca" values and not the whole lines, as I cut everything else away. (cut splits on tabs by default; for a space-separated file, add -d' '.)
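Both dead ends can be reproduced with a short sketch. It assumes that in.txt is separated by single spaces, which is why cut is given -d ' ' here; the scratch directory and the variable names too_many and only_field are just for illustration.

```shell
# Recreate the sample file in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
cat > in.txt <<'EOF'
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
EOF

# Plain grep matches "ca" anywhere, including inside "carl" and "carlson".
too_many=$(grep -i -c 'ca' in.txt)
printf 'grep -i ca matched %s lines\n' "$too_many"

# cut keeps only field 4, so the rest of each line is lost.
# -d ' ' is needed because this copy of the file is space-separated.
only_field=$(cut -d ' ' -f4 in.txt | grep -i 'ca')
printf '%s\n' "$only_field"
```

The first command reports five matching lines instead of three, and the second prints only the bare state values.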
So one way to solve the problem (there may be other more clever ones I'm not aware of), is to use the "awk" command instead of the "grep" command.
awk '{ if ($4 == "ca") print $0 }' in.txt
I think the above is self-explanatory even for non-awk folks; the only trick is that $4 is the fourth whitespace-separated field and $0 is the entire line. If you only wanted certain columns in the output, you could print $2,$3,$5 instead, for example, to get first name, last name and phone number.
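A minimal sketch of the column selection, using the sample file recreated in a scratch directory. It uses awk's shorter pattern-action form (condition first, then the action in braces), which behaves the same as the if statement; the variable name selected is only for illustration.

```shell
# Recreate the sample file in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
cat > in.txt <<'EOF'
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
EOF

# For rows whose 4th field is exactly "ca", print first name, last name
# and phone number. The commas make awk join the fields with a space.
selected=$(awk '$4 == "ca" { print $2, $3, $5 }' in.txt)
printf '%s\n' "$selected"
```

Because the comparison is case sensitive, line6 (with CA in uppercase) is not included.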
If you want to make the solution case insensitive, simply use the awk function "tolower()" in your if statement:
awk '{ if (tolower($4) == "ca") print $0 }' in.txt
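If the state should not be hard-coded, one possible variation is to pass it in with awk's -v option and lowercase both sides of the comparison. This is a sketch rather than part of the original solution: the shell variable state, the awk variable of the same name and the variable hits are all hypothetical names.

```shell
# Recreate the sample file in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
cat > in.txt <<'EOF'
line1 joe smith ca 4085551212
line2 joe carlson az 3334445555
line3 carl smith ny 2049998888
line4 joe smith or 5035551234
line5 mike erwin ca 4159876543
line6 mike erwin CA 4159876543
EOF

# The state to look for; any mix of upper and lower case works.
state='CA'

# -v hands the shell value to awk as the awk variable "state".
# A condition with no action prints the whole matching line ($0).
hits=$(awk -v state="$state" 'tolower($4) == tolower(state)' in.txt)
printf '%s\n' "$hits"
```

With the sample file this prints line1, line5 and line6 in full.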
Nifty.