
How can I make + stop at the first instance of a character, and not the last, with regular expressions in Perl?

I want to replace:

'''<font size="3"><font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font></font>'''

WITH

='''<font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font>'''=

Now my existing code is:

$html =~ s/\n(.+)<font size=\".+?\">(.+)<\/font>(.+)\n/\n=$1$2$3=\n/gm

However, this ends up producing:

=''' SUMMER/WINTER CONFIGURATION FILES</font>'''=

Now I see what is happening: it is matching from <font size="... all the way up to the end of <font color="blue">, which is not what I want. I want it to stop at the first instance of ", not the last. I thought that was what adding the ? did, but I have tried .+, .+?, .* and .*? with the same result every time.

Anyone have any ideas what I'm doing wrong?

3 answers

As Mark said, just use CPAN for this.

#!/usr/bin/env perl

use strict; use warnings;
use HTML::TreeBuilder;

my $s = q{<font size="3"><font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font></font>};

my $tree = HTML::TreeBuilder->new;
$tree->parse( $s ); 
print $tree->find_by_attribute( color => 'blue' )->as_HTML;

# => <font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font>

This works for your specific case:

#!/usr/bin/env perl

use strict; use warnings;

my $s = q{<font size="3"><font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font></font>};

print $s =~ m{
                 < .+? >
                 (.+)?
                 </.+? >                
             }mx;

# => <font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font>

Use the non-greedy .+? everywhere, not just in one place:

$html =~ s/\n(.+?)<font size=\".+?\">(.+?)<\/font>(.+?)\n/\n=$1$2$3=\n/gm
                ^                ^      ^            ^
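As a minimal sketch, here is the substitution above applied to the sample line from the question. It assumes the line is surrounded by newlines, as the original pattern requires; the test string is built by hand for illustration.

```perl
#!/usr/bin/env perl

use strict; use warnings;

# Sample line from the question, wrapped in newlines so the \n anchors can match.
my $html = qq{\n'''<font size="3"><font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font></font>'''\n};

# Every quantifier is non-greedy, so each group stops at the first place
# where the rest of the pattern can match.
$html =~ s/\n(.+?)<font size=".+?">(.+?)<\/font>(.+?)\n/\n=$1$2$3=\n/g;

print $html;

# => ='''<font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font>'''=
```

The second group ends at the first </font>, so the closing tag of the inner font element ends up in the third group and survives the substitution.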

Better still, don't parse HTML with regular expressions. For HTML, use an HTML parser.


Change .+ to [^"]+ (instead of "match anything", match anything that is not a ")...
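A small sketch of the difference, run against the string from the question. The variable names are only for illustration.

```perl
#!/usr/bin/env perl

use strict; use warnings;

my $s = q{<font size="3"><font color="blue"> SUMMER/WINTER CONFIGURATION FILES</font></font>};

# Greedy .+ runs to the end of the string, then backs up to the LAST double quote.
my ($greedy) = $s =~ /size="(.+)"/;

# [^"]+ cannot cross a double quote, so it stops at the FIRST one.
my ($negated) = $s =~ /size="([^"]+)"/;

print "greedy:  $greedy\n";
print "negated: $negated\n";

# => greedy:  3"><font color="blue
# => negated: 3
```

The negated class does not depend on backtracking to find the right quote, so it gives the same result however the rest of the pattern is written.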
