Replace newlines inside double-quoted strings with \n

I need to write a quick filter script (by tomorrow) that replaces line breaks (LF or CRLF) found inside double-quoted strings with an escaped newline, \n. The content is a (broken) JavaScript program, so I need to allow for escape sequences inside the strings, such as "ab\"cd" and "ab\\"cd"ef".

I understand that sed is not well suited to the job, since it works line by line, so I turned to Perl, which I know nothing about :)

I wrote this regular expression: "(((\\.)|[^"\\\n])*\n?)*" and tested it with http://regex.powertoy.org . It does match quoted strings that contain line breaks; however, perl -p -e 's/"(((\\.)|[^"\\\n])*(\n)?)*"/TEST/g' does not work.

So my questions are:

  • How do I make Perl match across line breaks?
  • How do I write the replacement part so that the original string is kept and only the newlines are replaced?

There is a similar question with an awk solution, but it is not quite what I need.

NOTE: I don't usually ask "please do it for me", but I really don't want to have to learn Perl/awk by tomorrow... :)

EDIT: example data

    "abc\"def" - matches as one string
    "abc\\"def"xy" - matches "abc\\" and "xy"
    "ab
    cd
    ef" - is replaced by "ab\ncd\nef"
4 answers

Here is a simple Perl solution:

    s§
        \G                      # match from the beginning of the string or the last match
        ([^"]*+)                # till we get to a quote
        "((?:[^"\\]++|\\.)*+)"  # match the whole quote
    §
        $a = $1;
        $b = $2;
        $b =~ s/\r?\n/\\n/g;    # replace what you want inside the quote
        "$a\"$b\"";
    §gex;
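As a rough sketch, the same substitution can be wrapped in a function and tried on an in-memory string. The name escape_quoted_newlines and the sample input are made up for illustration, and brace delimiters are used instead of § purely for convenience. The pattern has to see the opening and closing quote in the same string, so the function expects the whole input at once.

```perl
use strict;
use warnings;

# Hypothetical helper: escape LF/CRLF inside double-quoted strings.
# Expects the whole input in one string, not one line at a time.
sub escape_quoted_newlines {
    my ($text) = @_;
    $text =~ s{
        \G                       # continue from the start or the previous match
        ([^"]*+)                 # text outside quotes, kept unchanged
        "((?:[^"\\]++|\\.)*+)"   # one quoted string, honoring backslash escapes
    }{
        my ($outside, $inside) = ($1, $2);
        $inside =~ s/\r?\n/\\n/g;            # only touch the quoted part
        $outside . '"' . $inside . '"';
    }gex;
    return $text;
}

# Made-up sample: one string literal broken across three lines.
my $js = qq{var s = "ab\ncd\nef";\nvar t = 1;\n};
print escape_quoted_newlines($js);
```

When used as a command-line filter, the -0777 switch makes perl read each file in one piece instead of line by line, which is what a pattern spanning several lines needs.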

Here is another solution if you do not want to use /e and just do it with one regex:

    use strict;

    $_ = <<'_quote_';
    hai xtest "aa xx aax" baix "xx" x "axa\"x\\" xa "x\\\\\"x" ax xbai!x
    _quote_

    print "Original:\n", $_, "\n";

    s/
        (
            (?:
                # at the beginning of the string match till inside the quotes
                ^(?&outside_quote) "
                # or continue from last match which always stops inside quotes
                | (?!^)\G
            )
            (?&inside_quote)  # eat things up till we find what we want
        )
        x  # the thing we want to replace
        (
            (?&inside_quote)  # eat more possibly till end of quote
            # if going out of quote make sure the match stops inside them
            # or at the end of string
            (?:
                " (?&outside_quote) (?:"|\z)
            )?
        )

        (?(DEFINE)
            (?<outside_quote> [^"]*+ )  # just eat everything till quoting starts
            (?<inside_quote> (?:[^"\\x]++|\\.)*+ )  # handle escapes
        )
    /$1Y$2/xg;

    print "Replaced:\n", $_, "\n";

Output:

    Original:
    hai xtest "aa xx aax" baix "xx" x "axa\"x\\" xa "x\\\\\"x" ax xbai!x

    Replaced:
    hai xtest "aa YY aaY" baix "YY" x "aYa\"Y\\" xa "Y\\\\\"Y" ax xbai!x

To make it work on line breaks instead of x, replace the x in the regular expression (and in the inside_quote character class) like this:

    s/
        (
            (?:
                # at the beginning of the string match till inside the quotes
                ^(?&outside_quote) "
                # or continue from last match which always stops inside quotes
                | (?!^)\G
            )
            (?&inside_quote)  # eat things up till we find what we want
        )
        \r?\n  # the thing we want to replace
        (
            (?&inside_quote)  # eat more possibly till end of quote
            # if going out of quote make sure the match stops inside them
            # or at the end of string
            (?:
                " (?&outside_quote) (?:"|\z)
            )?
        )

        (?(DEFINE)
            (?<outside_quote> [^"]*+ )  # just eat everything till quoting starts
            (?<inside_quote> (?:[^"\\\r\n]++|\\.)*+ )  # handle escapes
        )
    /$1\\n$2/xg;

Until the OP posts some sample content to test against, try adding the "m" flag (and possibly "s") to the end of your regular expression; from perldoc perlreref:

    m  Multiline mode - ^ and $ match internal lines
    s  match as a Single line - . matches \n
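A small sketch of what the two flags change, using a made-up two-line string. Note that neither flag can help while the pattern is handed a single line at a time, as happens with perl -p in its default mode; the -0777 switch makes perl read the whole file into one string first.

```perl
use strict;
use warnings;

my $text = "first\nsecond";

# Without /m, ^ matches only at the very start of the string.
my @plain = $text =~ /^(\w+)/g;     # one match: "first"

# With /m, ^ also matches right after each embedded newline.
my @multi = $text =~ /^(\w+)/mg;    # two matches: "first", "second"

# Without /s, . does not match a newline, so this match fails.
my $dot_plain = $text =~ /first.second/ ? 1 : 0;

# With /s, . matches the newline as well, so this one succeeds.
my $dot_single = $text =~ /first.second/s ? 1 : 0;

print "without /m: @plain\n";
print "with /m: @multi\n";
print "dot without /s: $dot_plain, with /s: $dot_single\n";
```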

For testing, you may also find it useful to add the command-line argument "-i.bak", which edits the file in place and saves a backup copy of the original (with the extension ".bak").

Note that if you want to group something without capturing it, you can use (?:PATTERN) instead of (PATTERN). After a match, use $1 through $9 to access the text captured by the corresponding groups.

For more information, see perldoc perlreref, as well as perldoc perlretut (the tutorial) and perldoc perlre (the full reference documentation).
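To illustrate the difference, here is a short sketch on a made-up key=value string: with plain parentheses both parts are captured, while (?:...) groups without capturing, so the first remaining capture becomes $1.

```perl
use strict;
use warnings;

my $line = 'key=value';

# Both groups capture: in list context the match returns ($1, $2).
my @captured = $line =~ /(\w+)=(\w+)/;

# The first group only groups, so the value is the sole capture ($1).
my @grouped = $line =~ /(?:\w+)=(\w+)/;

print "capturing: @captured\n";
print "non-capturing: @grouped\n";
```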

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Regexp::Common;

    $_ = '"abc\"def"' . '"abc\\\\"def"xy"' . qq("ab\ncd\nef");
    print "before: {{$_}}\n";
    s{($RE{quoted})}
     { (my $x=$1) =~ s/\n/\\n/g;
       $x
     }ge;
    print "after: {{$_}}\n";

Using Perl 5.14.0 (install with perlbrew ), you can do this:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use 5.14.0;
    use Regexp::Common qw/delimited/;

    my $data = <<'END';
    "abc\"def"
    "abc\\"def"xy"
    "ab
    cd
    ef"
    END

    my $output = $data =~ s/$RE{delimited}{-delim=>'"'}{-keep}/$1=~s!\n!\\n!rg/egr;
    print $output;

Perl 5.14.0 is required for the /r flag on the inner substitution. If anyone knows how to avoid this, please let me know.
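One possible way around /r on older Perls is the copy-then-modify idiom: assign $1 to a lexical and run the substitution on that copy. The sketch below assumes a hand-written quoted-string pattern in place of $RE{delimited}, so that it runs without Regexp::Common; the function name escape_newlines_no_r is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper. The quoted-string pattern is a hand-written
# stand-in for the Regexp::Common pattern, not the module's own regex.
sub escape_newlines_no_r {
    my ($data) = @_;
    $data =~ s{("(?:[^"\\]|\\.)*")}{
        (my $quoted = $1) =~ s/\n/\\n/g;   # copy the match, then edit the copy
        $quoted;                           # the edited copy is the replacement
    }ge;
    return $data;
}

# Made-up sample: one string with an escaped quote, one spanning three lines.
my $data = qq{"abc\\"def"\n"ab\ncd\nef"\n};
print escape_newlines_no_r($data);
```

Because the function works on its own copy of the argument, the caller's string is left untouched, which is roughly what the outer /r provides as well.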
