How to delete text before and after a certain character?

I am trying to remove the text before and after a specific character (a keyword) in each line of a text file. Doing this manually would be very difficult, since the file contains 5000 lines, and I need to delete the text before this keyword in each line. Any software that could do this would be great, or a Perl script that runs on Windows. I run Perl scripts with ActivePerl, so a script that works there would be useful.

thanks

+3
5 answers

I would use this:

$text =~ s/ .*? (keyword) .* /$1/gx; 
+3

You do not need separate software; you can make this part of an existing script. A regular-expression substitution that matches each line as /a(b)c/ lets you refer back to b in the replacement with $1. Without knowing more about the text you're working with, it's hard to guess what the actual pattern will be.
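As a minimal sketch of that capture-and-backreference idea: the sub name keep_keyword and the literal "keyword" are placeholders for illustration, not anything from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: match a line as a(b)c and keep only b.
# "keyword" is a placeholder for the real text to keep.
sub keep_keyword {
    my ($line) = @_;
    # .*? = text before the keyword, (keyword) = capture group $1,
    # .*  = text after the keyword (stops at a newline).
    $line =~ s/.*?(keyword).*/$1/;
    return $line;
}

print keep_keyword("text1 text2 keyword text3"), "\n";   # prints "keyword"
```

A line that does not contain the keyword is returned unchanged, because the substitution simply fails to match.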

+2

Assuming you have the following:

text1 text2 keyword text3 text4 text5 keyword text6 text7

and you want to keep only the part from the first keyword through the second, use:

 s/.*?keyword(.*?)keyword.*/keyword$1keyword/; 

Otherwise, you can just replace the whole line with the keyword.
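Applied to the sample line above, the substitution can be checked with a short script. This is only a sketch; the sub name between_keywords is made up for illustration, and "keyword" stands in for the real delimiter text.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first keyword, the second keyword,
# and whatever lies between them; drop everything outside.
sub between_keywords {
    my ($line) = @_;
    $line =~ s/.*?keyword(.*?)keyword.*/keyword$1keyword/;
    return $line;
}

my $sample = "text1 text2 keyword text3 text4 text5 keyword text6 text7";
print between_keywords($sample), "\n";
# prints: keyword text3 text4 text5 keyword
```

Both quantifiers are non-greedy, so the match uses the first two occurrences of the keyword on the line.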

Sample data may help us to be clearer.

+2

If the variable $text contains all of the text, you can do:

 $text =~ s/^.*(keyword1|keyword2).*$/$1/mg; 

The m modifier makes ^ and $ match at the beginning and end of each line, not just at the beginning and end of the whole string.
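To see what the modifiers do on a multi-line string, here is a small sketch with made-up sample data. With m alone, only the first matching line is rewritten; adding g repeats the substitution for every line.

```perl
use strict;
use warnings;

# Made-up two-line sample; keyword1 and keyword2 are placeholders.
my $text = "aaa keyword1 bbb\nccc keyword2 ddd\n";

# /m only: ^ and $ anchor at line boundaries, but s/// stops after one match.
(my $first_only = $text) =~ s/^.*(keyword1|keyword2).*$/$1/m;

# /mg: the same substitution, repeated for every line that matches.
(my $every_line = $text) =~ s/^.*(keyword1|keyword2).*$/$1/mg;

print $first_only;   # keyword1, then "ccc keyword2 ddd" unchanged
print $every_line;   # keyword1, then keyword2
```

Because the leading .* is greedy, a line that contains several keywords is reduced to the last one on that line.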

+1

Assuming you want to remove all text to the left of keyword1 and all text to the right of keyword2:

 while (<>) {
     s/.*(keyword1)/$1/;   # drop everything before keyword1
     s/(keyword2).*/$1/;   # drop everything after keyword2
     print;
 }

Put this in a perl script and run it like this:

 perl fix.pl original.txt > new.txt 

Or if you just want to do this in place, perhaps on multiple files at once:

 perl -i.bak -pe 's/.*(keyword1)/$1/; s/(keyword2).*/$1/;' original.txt original2.txt 

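The -i.bak option edits the files in place, renaming each original to have a .bak extension; -p wraps the code in an implicit while loop that prints each line; and -e supplies the search-and-replace code that runs on each line before it is printed. On Windows (cmd.exe), wrap the code in double quotes instead of single quotes.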

To be safe, first check it without the -i option or at least on just one file ...
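One way to check the two substitutions before touching any files is to run them on a sample string. This is a sketch only: trim_outside is a hypothetical name, and the sample line is made up.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the same two substitutions used above.
sub trim_outside {
    my ($line) = @_;
    $line =~ s/.*(keyword1)/$1/;   # greedy: cuts up to the LAST keyword1
    $line =~ s/(keyword2).*/$1/;   # cuts after the FIRST keyword2
    return $line;
}

print trim_outside("aa keyword1 bb keyword2 cc"), "\n";
# prints: keyword1 bb keyword2
```

Since . does not match a newline, the line ending is left in place when the same substitutions run on lines read from a file.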

0
