How to remove a word prefix using grep?

How to remove the beginning of a word using grep? Example: I have a file that contains:

www.abc.com 

I only need this part:

 abc.com 

Sorry for the basic question, but I do not have experience with Linux.

7 answers

You do not edit lines with grep in a Unix shell; grep is usually used to find lines in a text or to filter them out. Use sed instead:

 $ echo www.example.com | sed 's/^[^.]\+\.//'
 example.com

You need to learn regular expressions in order to use it effectively.

sed can also edit the file in place (modifying the file itself) if you pass the -i option, but be careful: you can easily lose data if you write the wrong sed command and use the -i flag.
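One cautious way to use -i is to give it a backup suffix, so that the original survives a wrong command. The following is a minimal sketch, assuming a scratch file created with mktemp (the file and its contents are made up for illustration). It uses -E so that + needs no backslash:

```shell
# Work on a scratch file so nothing real is at risk (hypothetical contents).
tmpfile=$(mktemp)
printf '%s\n' 'www.abc.com' 'www.example.com' > "$tmpfile"

# -i.bak edits the file in place and keeps the original as "$tmpfile.bak".
sed -E -i.bak 's/^[^.]+\.//' "$tmpfile"

edited=$(cat "$tmpfile")        # the file after the substitution
backup=$(cat "$tmpfile.bak")    # the untouched original lines
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

# Clean up the scratch files.
rm -f "$tmpfile" "$tmpfile.bak"
```

If the result looks wrong, the .bak copy can simply be moved back over the edited file.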

Example

From your comments, I guess that you have a TeX document and you want to delete the first part of all .com domain names. If this is your test.tex document:

 \documentclass{article}
 \begin{document}
 www.example.com
 example.com
 www.another.domain.com
 \end{document}

then you can convert it using this sed command (redirect the output to a file or edit in-place using -i ):

 $ sed 's/\([a-z0-9-]\+\.\)\(\([a-z0-9-]\+\.\)\+com\)/\2/gi' test.tex
 \documentclass{article}
 \begin{document}
 example.com
 example.com
 another.domain.com
 \end{document}

Note:

  • A run of allowed characters followed by a dot is matched by [a-z0-9-]\+\.
  • I used groups in the regex (the parts within \( and \) ) to mark the first and second parts of the URL, and I replaced the whole match with the second group ( \2 in the replacement pattern)
  • The domain must be at least a third-level .com domain, because each \+ repetition requires at least one match
  • The search is not case sensitive ( i flag at the end)
  • It can match more than once per line ( g flag at the end)
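The same substitution can also be written with extended regular expressions, which removes most of the backslashes described in the notes above. This is only a sketch, assuming GNU sed (the I flag for case-insensitive matching is a GNU extension); strip_first_label is a hypothetical helper name chosen for illustration:

```shell
# strip_first_label is a hypothetical helper, not part of sed.
strip_first_label() {
    # Group 1: the first label plus its dot. Group 2: at least one more
    # label plus dot, followed by "com". Only group 2 is kept.
    sed -E 's/([a-z0-9-]+\.)(([a-z0-9-]+\.)+com)/\2/gI'
}

result=$(printf '%s\n' 'www.example.com' 'example.com' 'WWW.Another.Domain.COM' | strip_first_label)
printf '%s\n' "$result"
```

The second input line has only two labels, so the pattern does not match it and it passes through unchanged.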
+7
source

You can do this with grep easily:

 $ echo www.google.com | grep -o '[^.]*\.com'
 google.com

Instead of echo, pass your file to grep:

 $ grep -o '[^.]*\.com$' < file 

I used the regular expression '[^.]*\.com'. It means: find a run of characters that contains no dot ( [^.]* ) followed by .com ( \.com in the regex). The -o option tells grep to print only the matched part.
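Note that this pattern keeps only the last label before .com, so it behaves differently from a "strip the first label" approach on deeper names. A small sketch with made-up host names shows the difference:

```shell
# Two labels before the end: the result equals "everything after www.".
one=$(echo 'www.abc.com' | grep -o '[^.]*\.com$')

# Three labels before .com: only the last one survives.
deep=$(echo 'www.another.domain.com' | grep -o '[^.]*\.com$')

echo "$one"     # abc.com
echo "$deep"    # domain.com
```

Whether that is acceptable depends on whether the input can contain names with more than one subdomain level.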

+5
source

grep is not used to manipulate or change text, only to search for text or patterns in text.

You should look at something like sed, awk or cut if you want a command-line tool to do it. Or write a script in Python / Perl / Ruby / whatever.
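As an illustration of the awk route, here is a minimal sketch. The helper name strip_prefix and the sample input are assumptions made for the example:

```shell
# strip_prefix is a hypothetical helper name, not part of awk.
strip_prefix() {
    # sub() replaces the first match on each line: everything from the
    # start of the line up to and including the first dot.
    awk '{ sub(/^[^.]+\./, ""); print }'
}

awk_out=$(printf '%s\n' 'www.abc.com' 'localhost' | strip_prefix)
printf '%s\n' "$awk_out"
```

Lines that contain no dot, such as localhost here, are printed unchanged.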

+3
source

As others have noted, grep is not suitable for this task. sed is a good option, or, if the text has a regular structure, a simple cut might be easier to type:

 echo www.abc.com | cut -d. -f2- 
  • -d. tells cut to use . as the delimiter.
  • -f2- tells cut to return everything from field 2 to the end of the line.
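One detail worth knowing when running cut over a whole file: lines that contain no delimiter at all are printed unchanged unless -s is given. A short sketch with made-up input:

```shell
# Sample input: one dotted name and one line without any dot.
sample=$(printf '%s\n' 'www.abc.com' 'localhost')

# Without -s, the line that has no "." is passed through as-is.
plain=$(printf '%s\n' "$sample" | cut -d. -f2-)

# With -s, lines that lack the delimiter are suppressed.
strict=$(printf '%s\n' "$sample" | cut -s -d. -f2-)

printf 'plain:\n%s\nstrict:\n%s\n' "$plain" "$strict"
```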
+2
source

Although sed, awk, cut and even grep may solve the problem, I think grep is not a good choice.

  • grep is a command-line utility for searching plain-text data for lines matching a regular expression.
  • For editing text line by line, there are utilities like sed and awk.
0
source

You can do this without any other programs, using the built-in parameter expansion in bash:

 while IFS= read -r line; do echo "${line#*.}"; done < file 

Here #*. tells the shell to remove the shortest prefix that matches zero or more characters followed by a dot.
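The difference between a single # (shortest matching prefix) and a double ## (longest matching prefix) is easy to see on a name with several dots. A small sketch, using a made-up host name:

```shell
host='www.another.domain.com'

# A single # removes the shortest prefix matching "*.", i.e. "www.".
shortest=${host#*.}

# A double ## removes the longest such prefix, leaving only the last label.
longest=${host##*.}

echo "$shortest"    # another.domain.com
echo "$longest"     # com
```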

You can find a cheatsheet with the various bash parameter expansions here:

https://devhints.io/bash

0
source

You can do this with a positive lookbehind and grep's --only-matching flag:

 echo "www.abc.com" | grep --perl-regexp --only-matching '(?<=www\.).*' 

which can be shortened to

 echo "www.abc.com" | grep -Po '(?<=www\.).*' 

Both produce

abc.com

using grep (GNU grep) 3.3.

0
source
