Using \t in regex doesn't work with all tabs

Some lines of the file do not seem to match \t in the regular expression. Does anyone have an idea why?

Take an example file that you can download from http://download.geonames.org/export/dump/countryInfo.txt .

 $ wget http://download.geonames.org/export/dump/countryInfo.txt
 --2011-02-03 16:24:08-- http://download.geonames.org/export/dump/countryInfo.txt
 Resolving download.geonames.org... 178.63.52.141
 Connecting to download.geonames.org|178.63.52.141|:80... connected.
 HTTP request sent, awaiting response... 200 OK
 Length: 31204 (30K) [text/plain]
 Saving to: `countryInfo.txt'

 100%[==============================>] 31,204 75.0K/s in 0.4s

 2011-02-03 16:24:10 (75.0 KB/s) - `countryInfo.txt' saved [31204/31204]

 $ cat countryInfo.txt | grep -E 'AD.AND'
 AD AND 200 AN Andorra Andorra la Vella 468 84000 EU .ad EUR Euro 376 AD### ^(?:AD)*(\d{3})$ ca 3041565 ES,FR

 sdalouche@samxps:/tmp$ cat countryInfo.txt | grep -E 'AD\tAND'
 (no result)

Output of vi with :set list:

 AD^IAND^I200^IAN^IAndorra^IAndorra la Vella^I468^I84000^IEU^I.ad^IEUR^IEuro^I376^IAD###^I^(?:AD)*(\d{3})$^Ica^I3041565^IES,FR^I$
+7
4 answers

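Try the -P option (a GNU grep extension) instead of -E:

 cat countryInfo.txt | grep -P 'AD\tAND' 

This switches grep to Perl-compatible regular expressions, which interpret \t as a tab character. With -E, the \t escape is not treated as a tab, so the pattern never matches.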

 $ echo -e '-\t-' | grep -E '\t'
 (no result)
 $ echo -e '-\t-' | grep -P '\t'
 - -
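Because -P is not available in every grep build, a script may want a fallback. The following is only a sketch: `grep_tab` is a hypothetical helper name, the sample input is made up, and the fallback branch uses basic regular expressions, so it is not equivalent to -P for patterns that rely on other Perl syntax.

```shell
# Hypothetical helper: match a pattern containing \t.
# Uses -P when this grep supports it; otherwise replaces each \t escape
# with a real tab character and falls back to plain grep.
grep_tab() {
  pattern=$1
  shift
  if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
    grep -P "$pattern" "$@"
  else
    tab=$(printf '\t')
    # sed sees s/\\t/<TAB>/g: a backslash followed by t becomes a tab.
    literal=$(printf '%s' "$pattern" | sed "s/\\\\t/${tab}/g")
    grep "$literal" "$@"
  fi
}

# Made-up sample: the first line is tab-separated, the second uses a space.
result=$(printf 'AD\tAND\nAD AND\n' | grep_tab 'AD\tAND')
printf '%s\n' "$result"
```

Only the tab-separated line should be printed, whichever branch is taken.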
+10

Reading the documentation for grep, I don't see any mention that \t represents a tab. Remember that not all regex engines support the same escapes.

0

The \t escape is not part of POSIX regular expressions (the default syntax for grep). But you can produce a literal tab character:

 echo -ne "\\t" 

So, grepping for a tab works like this:

 grep "AD$(echo -ne "\\t")AND" 

or

 t=$(echo -ne "\\t")
 grep "AD${t}AND" 
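A variation on the same idea, sketched with printf, which interprets \t consistently across shells, whereas the behavior of echo -ne varies. The sample file here is hypothetical and stands in for countryInfo.txt.

```shell
# Hypothetical two-line sample standing in for countryInfo.txt:
# line 1 is tab-separated, line 2 is space-separated.
sample=$(mktemp)
printf 'AD\tAND\t200\nAD AND 200\n' > "$sample"

# Command substitution strips trailing newlines only, so the tab survives.
tab=$(printf '\t')

# Count the lines that contain AD, a literal tab, then AND.
match_count=$(grep -c "AD${tab}AND" "$sample")
echo "matching lines: $match_count"

rm -f "$sample"
```

In bash specifically, ANSI-C quoting gives the same result without a variable: `grep $'AD\tAND' countryInfo.txt`.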
0

You can just use a literal tab. In the terminal, press CTRL + V, and then press the Tab key. This inserts a literal tab character at the cursor position, which you can use in your regular expression.

 ls | grep -E "[0-9]<CTRL+V><TAB>" 

This searches for any digit from 0 to 9 with a tab character immediately after it.

0
