How to remove a word prefix using grep?

Question

How can I remove the beginning of a word using grep? For example, I have a file that contains:

www.abc.com

I only need this part:

 abc.com

Sorry for the basic question, but I do not have experience with Linux.

+8

linux regex shell sed

Jury a Jul 26 '12 at 15:56

7 answers

You can do this with grep easily:

 $ echo www.google.com | grep -o '[^.]*\.com'
 google.com

Instead of echo, you would specify your file:

 $ grep -o '[^.]*\.com$' < file

I used the regular expression '[^.]*\.com'. This means: find a word with no . in it ( [^.]* ) followed by .com ( \.com in the regex). The -o option tells grep to show only the part that matched.
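
As a rough illustration of how this pattern behaves on names with more than three labels (the sample host names are made up here, not taken from the question): it keeps only the last label before .com rather than removing just the first label.

```shell
# Hypothetical sample input, one host name per line.
hosts=$(printf '%s\n' www.abc.com www.another.domain.com)

# -o prints only the matched text: a run of non-dot characters
# followed by a literal ".com" at the end of the line.
matched=$(printf '%s\n' "$hosts" | grep -o '[^.]*\.com$')

printf '%s\n' "$matched"
```

This should print abc.com and domain.com. If another.domain.com is what you want for the second line, a prefix-stripping tool is probably a better fit than this pattern.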

+5

Igor Chubin Jul 26 '12 at 18:42

grep is not used to manipulate / change text, only to search for text / patterns in text.

You should look at something like sed, awk, or cut if you want a command-line tool to do this. Or write a script in Python / Perl / Ruby / whatever.

+3

Daniel DiPaolo Jul 26 '12 at 16:00

As others have noted, grep is not suitable for this task. sed is a good option, or, if the text has a regular structure, a simple cut might be easier to type:

 echo www.abc.com | cut -d. -f2-

-d. tells cut to use . as the field separator.
-f2- tells cut to return every field from field 2 through the end of the line.
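
A small sketch of the same cut command run over several lines at once. The sample names, including localhost, are hypothetical and only there to show what happens to a line that contains no separator.

```shell
# Hypothetical sample input, one host name per line.
hosts=$(printf '%s\n' www.abc.com www.another.domain.com localhost)

# -d. splits each line on dots; -f2- keeps field 2 and everything after it.
# Lines that contain no dot at all are passed through unchanged
# (cut only drops them when -s is given).
stripped=$(printf '%s\n' "$hosts" | cut -d. -f2-)

printf '%s\n' "$stripped"
```

This should print abc.com, another.domain.com and localhost, one per line.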

+2

Thor Jul 26 '12 at 16:34

Although sed, awk, cut and even grep can solve the problem, I think grep is not a good choice.

grep is a command-line utility for searching plain-text data sets for lines matching a regular expression.
For editing text line by line, there are utilities like sed and awk.

0

Neoh Jul 27 '12 at 2:36

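You can do this without any other programs, using the parameter expansion built into bash:

 while IFS= read -r line; do echo "${line#*.}"; done < file

Here #*. tells the shell to remove the shortest prefix that consists of 0 or more characters followed by a . (IFS= and -r make read keep each line exactly as written, and the quotes stop the shell from word-splitting the result).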
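
To make the "shortest prefix" point concrete, here is a short sketch comparing # with ## on one made-up host name (the variable names are hypothetical, chosen only for this illustration):

```shell
# Hypothetical host name with several labels.
host=www.another.domain.com

# A single # removes the shortest prefix matching the pattern "*.".
shortest=${host#*.}

# A double ## removes the longest prefix matching the same pattern.
longest=${host##*.}

printf '%s\n' "$shortest" "$longest"
```

This should print another.domain.com and then com, so the single # form is the one that strips only the leading www. part.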

You can find a cheatsheet covering the various bash parameter expansions here:

https://devhints.io/bash

0

Fahd ahmed Jan 2 '18 at 7:35

You can do this with a positive lookbehind and grep's --only-matching flag:

 echo "www.abc.com" | grep --perl-regexp --only-matching '(?<=www\.).*'

which can be shortened to

 echo "www.abc.com" | grep -Po '(?<=www\.).*'

Both produce

abc.com

using grep (GNU grep) 3.3.

0

Matthias braun May 21 '19 at 10:27

sastanin · Accepted Answer · 2012-07-26T16:01:34+0000

You do not edit lines with grep in a Unix shell; grep is usually used to find lines in text or to filter them out. Instead, you would use sed:

 $ echo www.example.com | sed 's/^[^\.]\+\.//'
 example.com

You need to learn regular expressions in order to use it effectively.

sed can also edit the file in place (modify the file itself) if you pass the -i argument, but be careful: you can easily lose data if you write the wrong sed command and use the -i flag.
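
One way to lower that risk is to have sed keep a backup copy. The sketch below assumes a sed that accepts a backup suffix attached directly to -i (GNU and BSD sed both do); the file name hosts.txt and the temporary directory are hypothetical. It also uses * instead of \+ in the pattern, because \+ is a GNU extension, so it is a slight variation on the command above rather than the same one.

```shell
# Work in a throwaway directory so nothing real is touched.
tmpdir=$(mktemp -d)
printf '%s\n' www.example.com > "$tmpdir/hosts.txt"

# -i.bak edits hosts.txt in place and saves the original as hosts.txt.bak.
# The pattern removes everything up to and including the first dot.
sed -i.bak 's/^[^.]*\.//' "$tmpdir/hosts.txt"

edited=$(cat "$tmpdir/hosts.txt")
backup=$(cat "$tmpdir/hosts.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

# Clean up the throwaway directory.
rm -rf "$tmpdir"
```

If the command turns out to be wrong, the untouched original is still available in the .bak file.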

Example

From your comments, I guess that you have a TeX document and you want to delete the first part of all .com domain names. If this is your test.tex document:

 \documentclass{article}
 \begin{document}
 www.example.com
 example.com
 www.another.domain.com
 \end{document}

then you can convert it using this sed command (redirect the output to a file or edit in place using -i ):

 $ sed 's/\([a-z0-9-]\+\.\)\(\([a-z0-9-]\+\.\)\+com\)/\2/gi' test.tex
 \documentclass{article}
 \begin{document}
 example.com
 example.com
 another.domain.com
 \end{document}

Note:

A run of allowed characters followed by a period is matched by [a-z0-9-]\+\.
I used groups in the regex (the parts within \( and \) ) to mark the first and second parts of the URL, and I replaced the whole match with the second group ( \2 in the replacement pattern).
The domain must be at least a third-level .com domain (each \+ repetition means at least one match).
The search is case-insensitive ( i flag at the end).
There may be more than one match per line ( g flag at the end).
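
For comparison, here is a sketch of the same idea in extended regular expression syntax, which some people find easier to read. It is a variation, not the command above: it assumes a sed that supports -E (GNU and BSD sed do), it replaces the i flag with explicit upper- and lower-case ranges, and the sample lines are hypothetical stand-ins for the host names in test.tex.

```shell
# Hypothetical sample lines, mirroring the host names in test.tex.
sample=$(printf '%s\n' www.example.com example.com www.another.domain.com)

# -E selects extended regular expressions, so ( ) and + need no backslashes.
# Group 1 is the first label plus its dot; group 2 is the rest, which must
# contain at least one more "label." before "com". Only group 2 is kept.
# Note: the letters in "com" itself are matched case-sensitively here.
converted=$(printf '%s\n' "$sample" |
  sed -E 's/([A-Za-z0-9-]+\.)(([A-Za-z0-9-]+\.)+com)/\2/g')

printf '%s\n' "$converted"
```

This should print example.com, example.com and another.domain.com: the second-level name example.com is left alone because group 2 requires at least one label before com.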
