How to convert escaped characters to special characters in Perl?

Possible duplicate:
How can I manually interpolate string escape strings in a Perl string?

I am reading a line from a specific file. The problem is that it contains escaped characters, for example:

Hello!\nI\'d like to tell you a little \"secret\"... 

I would like it to be printed without escape sequences, for example:

    Hello!
    I'd like to tell you a little "secret".

I was thinking about removing single backslashes and replacing double backslashes with a single one (since \ is represented as \\), but that does not help with \n, \t and so on. Before messing around with ugly, complex string replacements, I thought I'd ask: does Perl have a built-in mechanism for this kind of conversion?

2 answers

For single-character Perl backslash escapes, you can do this using a double eval (the ee modifier) as part of the regex substitution. Put the characters that are acceptable to interpret into a character class after the \ ; the backslash plus the single character that follows is then eval'd and inserted into the string.

Consider:

    #!/usr/bin/perl
    use warnings;
    use strict;

    print "\n\n\n\n";
    while (my $data = <DATA>) {
        $data =~ s/\\([rnt'"\\])/"qq|\\$1|"/gee;
        print $data;
    }
    __DATA__
    Hello!\nI\'d like to tell you a little \"secret\".
    A backslask:\\
    Tab'\t'stop
    line 1\rline 2 (on Unix, "line 1" will get overwritten)
    line 3\\nline 4 (should result in "line 3\\nline 4")
    line 5\r\nline 6

Output:

    Hello!
    I'd like to tell you a little "secret".
    A backslask:\
    Tab'    'stop
    line 2 (on Unix, "line 1" will get overwritten)
    line 3\nline 4 (should result in "line 3\nline 4")
    line 5
    line 6

The line s/\\([rnt'"\\])/"qq|\\$1|"/gee does the job.

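  • \\([rnt'"\\]) lists the characters that are acceptable to eval inside the square brackets.

  • The gee part performs a global substitution with a double evaluation of the replacement string.

  • The "qq|\\$1|" part is eval'd twice. The first eval substitutes the captured character for $1, producing the Perl source text qq|\X| (where X is that character), and the second eval interpolates that text into the actual special character.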
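The two evaluation stages can be reproduced by hand. The following is a minimal sketch, not part of the original answer: the variable names are made up for illustration, and $captured stands in for what $1 would hold after matching a backslash followed by n.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for $1: the character that followed the backslash.
my $captured = 'n';

# Stage 1: "qq|\\$1|" is an ordinary double-quoted string. Interpolating it
# yields six characters of Perl source text: q, q, |, backslash, n, |.
my $stage1 = "qq|\\$captured|";

# Stage 2: that source text is evaluated as Perl code, which produces a
# single real newline character.
my $stage2 = eval $stage1;
die $@ if $@;

printf "stage 1: %s (length %d)\n", $stage1, length($stage1);
printf "stage 2: ord %d (length %d)\n", ord($stage2), length($stage2);
# prints:
# stage 1: qq|\n| (length 6)
# stage 2: ord 10 (length 1)
```

Because the character class only admits a fixed set of single characters, the text that reaches the second eval is always of the form qq|\X| for one of those characters.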

I can’t come up with a two-character combination that would be a security breach ...

This method does not handle the following correctly:

  • Quoted strings. For example, Perl itself would not unescape the string 'line 1\nline 2' because of the single quotes.

  • Escape sequences that are longer than a single character, such as hex \x1b, Unicode such as \N{U+...}, or control sequences such as \cD.

  • Anchored escapes such as \LMAKE LOWER CASE\E or \Umake upper case\E.

If you want a more complete escape replacement, you can use this regex:

    #!/usr/bin/perl
    use warnings;
    use strict;

    print "\n\n\n\n";
    binmode STDOUT, ":utf8";
    while (my $data = <DATA>) {
        $data =~ s/\\(
            (?:[arnt'"\\]) |               # Single char escapes
            (?:[ul].) |                    # uc or lc next char
            (?:x[0-9a-fA-F]{2}) |          # 2 digit hex escape
            (?:x\{[0-9a-fA-F]+\}) |        # more than 2 digit hex
            (?:\d{2,3}) |                  # octal
            (?:N\{U\+[0-9a-fA-F]{2,4}\})   # unicode by hex
            )/"qq|\\$1|"/geex;
        print $data;
    }
    __DATA__
    Hello!\nI\'d like to tell you a little \"secret\".
    Here is octal: \120
    Here is UNICODE: \N{U+0041} and \N{U+41} and \N{U+263D}
    Here is a little hex:\x50 \x5fa \x{5fa} \x{263B}
    lower case next char \lU \lA
    upper case next char \ua \uu
    A backslask:\\
    Tab'\t'stop
    line 1\rline 2 (on Unix, "line 1" will get overwritten)
    line 3\\nline 4 (should result in "line 3\\nline 4")
    line 5\r\nline 6

This handles all Perl escapes except:

  • Anchored types (\Q, \U, \L, terminated by \E)

  • Quoted forms such as 'don't \n escape in single quotes' or [not \n in here]

  • Unicode named characters such as \N{THAI CHARACTER SO SO}

  • Control characters like \cD (this is easy to add ...)

But that was not part of your question, as I understood it ...
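For the control-character case noted above as easy to add, one possible extension is a further alternative in the same pattern. This is only a sketch under assumptions: the helper name unescape_with_control is invented here, only ASCII letters after \c are accepted, and the other alternatives are trimmed to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# unescape_with_control() is a hypothetical helper, not from the answer.
sub unescape_with_control {
    my ($text) = @_;
    $text =~ s/\\(
        (?:[arnt'"\\]) |    # single char escapes, as in the answer
        (?:c[A-Za-z])       # assumed addition: control chars such as ctrl-D
    )/"qq|\\$1|"/geex;
    return $text;
}

# Single quotes keep the backslashes literal, so this mimics text read
# from a file.
my $out = unescape_with_control('EOT:\cD tab:\t');
print $out eq "EOT:\cD tab:\t" ? "ok\n" : "not ok\n";   # prints "ok"
```

Restricting the class to letters keeps the text handed to the second eval limited to qq|\cX| with X a letter.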


I hate to suggest this, but a string eval will solve the problem. Be aware, though, that string eval brings a whole host of security and maintenance problems. Where does this data come from? Is there any contract between the producers of the data and you about what the string will contain?

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (my $input = <DATA>) {
        #note: this only works if # is not allowed as a character in the string
        my $string = eval "qq#$input#" or die $@;
        print $string;
    }
    __DATA__
    Hello!\nI\'d like to tell you a little \"secret\".
    This is bad @{[print "I have pwned you\n"]}.

Another solution is to create a hash that defines all the escape sequences that you want to implement and do the substitution.
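A minimal sketch of that hash-based approach might look like the following. The names %ESCAPES and unescape are chosen here for illustration, and the set of supported escapes is an assumption; extend the table with whatever sequences your data actually uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# %ESCAPES and unescape() are hypothetical names chosen for this sketch.
# Keys are the character after the backslash; values are its replacement.
my %ESCAPES = (
    n    => "\n",
    t    => "\t",
    r    => "\r",
    "'"  => "'",
    '"'  => '"',
    '\\' => '\\',
);

sub unescape {
    my ($text) = @_;
    # Replace a known escape with its value; leave unknown ones untouched.
    $text =~ s/\\(.)/exists $ESCAPES{$1} ? $ESCAPES{$1} : "\\$1"/ge;
    return $text;
}

# q{} keeps every backslash literal here, mimicking a line read from a file.
print unescape(q{Hello!\nI\'d like to tell you a little \"secret\".}), "\n";
# prints:
# Hello!
# I'd like to tell you a little "secret".
```

Since nothing from the input is ever evaluated as code, a line such as the "I have pwned you" example passes through as plain text.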
