Recommended Regular Expression Validation Method?

Question

I am new to regular expressions. I have managed to write a few by trial and error, so I tried several programs that are meant to help with writing expressions, but the programs were harder to understand than the regular expressions themselves. Are there any recommended programs? I run most of my programs under Linux.

+6

linux regex

Roberto rosario Oct 6 '09 at 10:55

11 answers

Try YAPE::Regex::Explain for Perl:

 #!/usr/bin/perl
 use strict;
 use warnings;
 use YAPE::Regex::Explain;

 print YAPE::Regex::Explain->new(
     qr/^\A\w{2,5}0{2}\S \n?\z/i
 )->explain;

Output:

 The regular expression:

 (?i-msx:^\A\w{2,5}0{2}\S \n?\z)

 matches as follows:

 NODE                     EXPLANATION
 ----------------------------------------------------------------------
 (?i-msx:                 group, but do not capture (case-insensitive)
                          (with ^ and $ matching normally) (with . not
                          matching \n) (matching whitespace and #
                          normally):
 ----------------------------------------------------------------------
   ^                        the beginning of the string
 ----------------------------------------------------------------------
   \A                       the beginning of the string
 ----------------------------------------------------------------------
   \w{2,5}                  word characters (a-z, A-Z, 0-9, _)
                            (between 2 and 5 times (matching the most
                            amount possible))
 ----------------------------------------------------------------------
   0{2}                     '0' (2 times)
 ----------------------------------------------------------------------
   \S                       non-whitespace (all but \n, \r, \t, \f,
                            and " ")
 ----------------------------------------------------------------------
                            ' '
 ----------------------------------------------------------------------
   \n?                      '\n' (newline) (optional (matching the
                            most amount possible))
 ----------------------------------------------------------------------
   \z                       the end of the string
 ----------------------------------------------------------------------
 )                        end of grouping
 ----------------------------------------------------------------------

+7

Sinan Ünür Oct 7 '09 at 2:51

RegexPal is a great free JavaScript regular expression editor. Since it uses the JavaScript regex engine, it lacks some of the more advanced regular expression features, but it works very well for many regular expressions. The feature I miss the most is lookbehind assertions.

+5

Shawn Oct 6 '09 at 23:29

Most regular expression errors fall into three categories:

Subtle omissions: leaving out "^" at the beginning or "$" at the end, or using "*" where you should have used "+". These are beginner mistakes, but it is common for the buggy regular expression to still pass all the automated tests.
Accidental success: part of the regular expression is simply wrong and destined to fail in 99% of real-world use, but by sheer luck it manages to pass the half-dozen automated tests you wrote.
Too much success: one part of the regular expression matches a lot more than you thought. For example, the token [^., ]* will also match \r and \n, which means that your regular expression can now match multiple lines of text, even though you wrapped it in ^ and $.
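The first and third categories are easy to reproduce. The following is only a minimal sketch: the patterns, the inputs, and the matches helper are made up for illustration and do not come from the answer above. Each row pairs a loose pattern with a stricter one and feeds both the same tricky input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if $string matches $regex, otherwise 0.
sub matches {
    my ($regex, $string) = @_;
    return $string =~ $regex ? 1 : 0;
}

# Each row: label, loose pattern, stricter pattern, tricky input.
# All patterns and inputs are hypothetical examples.
my @cases = (
    [ 'missing anchors', qr/\d{4}/,     qr/^\d{4}$/,        'abc12345' ],
    [ '* instead of +',  qr/^\d*$/,     qr/^\d+$/,          ''         ],
    [ 'negated class',   qr/^[^., ]*$/, qr/^[^., \r\n]*\z/, "one\ntwo" ],
);

for my $case (@cases) {
    my ($label, $loose, $tight, $input) = @$case;
    printf "%-16s loose: %d  tight: %d\n",
        $label, matches($loose, $input), matches($tight, $input);
}
```

In every row the loose pattern reports a match and the stricter one does not, so a test suite that only contains "obviously valid" inputs would pass with either version.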

There is really no substitute for learning regular expressions properly. Read the reference manual for your regex engine and use a tool like RegexBuddy to experiment and get familiar with all the features, paying special attention to any unusual behaviors they may exhibit. If you learn regular expressions properly, you will avoid most of the errors mentioned above, and you will know how to write just a small number of automated tests that cover all the edge cases without over-testing the obvious things (does [A-Z] really match every letter between A and Z? I'd better write 26 unit tests to make sure!).

If you do not learn regular expressions properly, you will need to write an absurd number of automated tests to prove that your magic regular expression is correct.

+4

too much php Oct 7 '09 at 3:20

A great program for helping you write regular expressions is Perl itself; trying out a regex is very easy:

 perl -e 'print "yes!\n" if "string" =~ /regex to test/'

See this SO question about regular expression modules for more information on testing regular expressions in general.
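Once a one-liner is no longer enough, the same check can be written as a small test script. This is a rough sketch using the core Test::More module; the pattern $year_re and all of the sample inputs are hypothetical and only stand in for whatever regex you are actually testing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;

# Hypothetical pattern under test: exactly four digits and nothing else.
my $year_re = qr/\A\d{4}\z/;

# Inputs the pattern should accept.
like($_, $year_re, "accepts '$_'") for qw(1999 2009);

# Inputs the pattern should reject, each with a readable label.
my @rejects = (
    [ ''       => 'empty string'     ],
    [ '99'     => 'too short'        ],
    [ '20091'  => 'too long'         ],
    [ "2009\n" => 'trailing newline' ],
    [ 'abcd'   => 'letters'          ],
);
unlike($_->[0], $year_re, "rejects $_->[1]") for @rejects;

done_testing();
```

Listing the inputs that must be rejected alongside the ones that must be accepted is what catches a missing anchor or an over-eager character class.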

+2

Robert P Oct 6 '09 at 10:58

You can try websites that give you hints and instant feedback, like this one. Combining that with a simple Perl script that you can easily modify also makes a great testing ground. Something like the following:

 #!/usr/bin/perl
 use strict;
 use warnings;

 my $mystring = "My cat likes to eat tomatoes.";
 $mystring =~ s/cat/dog/g;    # replace every "cat" with "dog"
 print $mystring;

+1

akf Oct 6 '09 at 23:02

Also check out the re pragma, which shows how regular expressions are compiled and how they are executed:

 $ perl -Mre=debugcolor -e '"huzza" =~ /^(hu)?z{1,2}za$/'

Output:

  Compiling REx "^(hu)?z{1,2}za$"
     Final program:
        1: BOL (2)
        2: CURLYM[1] {0,1} (12)
        6:   EXACT <hu> (10)
       10:   SUCCEED (0)
       11: NOTHING (12)
       12: CURLY {1,2} (16)
       14:   EXACT <z> (0)
       16: EXACT <za> (18)
       18: EOL (19)
       19: END (0)
     floating "zza"$ at 0..3 (checking floating) anchored(BOL) minlen 3
     Guessing start of match in sv for REx "^(hu)?z{1,2}za$" against "huzza"
     Found floating substr "zza"$ at offset 2...
     Guessed: match at offset 0
     Matching REx "^(hu)?z{1,2}za$" against "huzza"
        0 |  1: BOL (2)
        0 |  2: CURLYM[1] {0,1} (12)
        0 |  6: EXACT <hu> (10)
        2 | 10: SUCCEED (0)
                                         subpattern success...
                                       CURLYM now matched 1 times, len=2...
                                       CURLYM trying tail with matches=1...
        2 | 12: CURLY {1,2} (16)
                                         EXACT <z> can match 2 times out of 2...
        3 | 16: EXACT <za> (18)
        5 | 18: EOL (19)
        5 | 19: END (0)
     Match successful!
     Freeing REx: "^(hu)?z{1,2}za$"
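The same pragma can also be enabled inside a script rather than on the command line. The sketch below assumes Perl 5.10 or later, where use re 'debug' is lexically scoped; the debug_match helper is a made-up name for illustration. The debugging trace is written to STDERR, so normal program output stays clean.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: match one fixed pattern with regex debugging enabled.
sub debug_match {
    my ($string) = @_;
    use re 'debug';    # lexically scoped: only regexes in this block are traced
    return $string =~ /^(hu)?z{1,2}za$/ ? 1 : 0;
}

print debug_match('huzza') ? "matched\n" : "no match\n";
```

Keeping the pragma inside one sub limits the trace to the single pattern you are investigating instead of every regex in the program.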

+1

Inshallah Oct 7 '09 at 2:37

http://regex-test.com is a really good, professional website that lets you test many different flavors of regular expressions.

+1

Thys Dec 10 '09 at 13:25

My personal favorite is Rubular (disclaimer: online tool).

It is good and simple, and pretty fast.

+1

Adam eberlin Sep 22 '11 at 0:17

If you are willing to buy a tool, Komodo by ActiveState is a great editor for scripting languages and comes with a powerful regular expression helper. It is cross-platform, but not free. It has helped me out of several difficult situations where I did not quite understand why something was not parsing, and it supports several flavors of regular expressions.

0

Robert P Oct 6 '09 at 23:06

Kodos is a great free cross-platform regular expression debugger.

0

Wogan Oct 7 '09 at 4:27

Robert P · Accepted Answer · 2009-10-06T23:00:43+0000

Unfortunately, if you use Linux, you will not have access to one of the best: RegexBuddy.

RegexBuddy is your ideal regex companion. Easily create regular expressions that exactly match your needs. Clearly understand complex regular expressions written by others. Quickly test any regular expression on sample strings and files to prevent mistakes on real data. Debug without guesswork by stepping through the actual matching process. Use the regular expression with source code snippets that are automatically adjusted to your particular programming language. Build and document regex libraries for future reuse. GREP (search and replace) through files and folders. Integrate RegexBuddy with your favorite search and editing tools for instant access. (from their website)
