Thursday, November 3, 2011

Searching in Excel and Word

Searching and replacing text and special characters in Microsoft Excel and Word can get, uhmm, a tad complicated. Even for people very familiar with standard regular expressions (RegExp) since Microsoft unfortunately decided to build their own standard rather than using the normal regular expressions. And to complicate things, Microsoft Word and Excel don't use the same wildcards.

So here is a quick recap on how to search and replace in Microsoft Excel and Word. Please ignore all the double quotes as they are only to make the terms stand out.

For both Word and Excel:
  • Use "?" (without quotes) to find a single character (a letter, a digit, etc). So to find the words "sat" or "sit", you can search for "s?t". Note that it will not find "shoot" as "?" can only be a single character. The "?" is the same as "." (period) in regular expressions.
  • Use "*" (without quotes) to find a any number of characters (letters, digits, spaces, tabs, newlines, etc). So to find the words "shot" or "shoot", you can search for "sh*t". Note that it will also find the string "shot" in the word "gunshot", and it will even find the phrase "show up at" since it starts with "sh" and ends with "t". The is the same as ".*" in regular expressions.

The following only works in Word (not Excel)
  • Use "<" and ">" (without quotes) to mark the beginning and ending of a word. So if we only wanted to find the word "show" and not "gunshow" then we can use "<show" to look for a starting word boundary, but of course it would also fine "showing". To only find the word "show" simply use "<show>"  Note that "<sh*t>" may find phrases too as * includes spaces, so it may find "shot" or even "show up at".
  • Use "^t" (without quotes) to search for the tab character.  The is the same as "\t" in regular expressions.
  • Use "^p" (without quotes) to search for a paragraph mark (i.e. newline).  The is the same as "\n" in regular expressions.
  • Use "@" (without quotes) to search for one or more occurrences of the previous character.  So "f@" will find both "of" and "off". The is the same as "+" in regular expressions. 

To search for the wildcard characters themselves (e.g. wanting to find a *), please refer to earlier post.

So thanks to Microsoft for another straight forward implementation :-)

(Somewhat) Nifty


1 comment:

Warnerhill said...

Very Helpful and informative for me. Find and Replace in Excel tool with additional features.