Can my regex be improved?

Yes, another regex question. You're welcome ;-P

This is the first time I have written my own regex, for some simple string validation in C#. I think it works, but as a learning exercise I was wondering whether it could be improved and whether I have made any mistakes.

The lines will look something like this:

T20160307.0001

Rules:

  • Start with the letter T.
  • Date in YYYYMMDD format.
  • Full stop.
  • The last 4 characters are always numeric. Must be exactly 4.

Here is my regex (fiddle):

^(?i)[T]20[0-9]{2}[0-1][0-9][0-3][0-9].\d{4}$

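  • ^ Assert the beginning of the string.
  • (?i)[T] Check that we have a case-insensitive letter T.
  • 20 Check that the next two characters are 20 (this will need changing by 2100 :-P).
  • [0-9]{2} Check for 0 to 99 for the rest of YYYY.
  • [0-1][0-9] Check for 0 or 1, then 0-9, for the month.
  • [0-3][0-9] Check for 0-3, then 0-9, for the day.
  • . Check for the full stop.
  • \d{4} Check for 4 digits.
  • $ Assert the end of the string.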

I know this does not fully validate the date: 20161935 (the 35th day of the 19th month) would pass.

If you suggest changes, please explain them ELI5-style, as I am new to this.

Edit: I know the date could be checked with DateTime.TryParse and so on, but I am using this as a chance to learn regex.

+4
3

Instead of a regex, you could use DateTime.TryParseExact. For example:

bool IsCorrectFormat(string input)
{
    //14 is a magic number for the length of the expected format
    if (input.Length == 14 && input.StartsWith("T", StringComparison.OrdinalIgnoreCase))
    {
        DateTime dt;
        if (DateTime.TryParseExact(input.Substring(1), "yyyyMMdd.ffff", CultureInfo.InvariantCulture, DateTimeStyles.None, out dt))
        {
            return true;
        }
    }

    return false;
}

, , 1 6, yyyyMMdd, 5 .

EDIT:

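A regex that also validates the day and month could look like this (note that it expects the date in ddMMyyyy order):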

@"^(((0[1-9]{1}|[1-2][0-9]{1}|3[01]{1})(0[13578]{1}|1[12]{1}))" //For a 31 day month
+ @"|"
+ @"((0[1-9]{1}|[1-2][0-9]{1}|30)(0[469]{1}|11))" //For a 30 day month
+ @"|"
+ @"((0[1-9]{1}|1[0-9]{1}|2[0-8]{1})(02)))" //For a 28 day month(feb)
+ @"([0-9]{4})$"; //For the year
+3

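A few things could be improved:

  • \d matches any Unicode digit (not just ASCII), so use [0-9] instead.
  • [0-1] can be written as [01].
  • The dot must be escaped, otherwise it matches any character (not only a full stop).
  • T does not need to be in a character class; a plain T is enough.
  • Alternatively, use [Tt] instead of a case-insensitive T.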

^(?i)T20[0-9]{2}[01][0-9][0-3][0-9]\.[0-9]{4}$

^[Tt]20[0-9]{2}[01][0-9][0-3][0-9]\.[0-9]{4}$
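
To see what the escaped dot and the explicit [0-9] change in practice, here is a minimal C# sketch that compares the pattern from the question with the first corrected pattern above. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not from the original posts.
static class PatternComparison
{
    // Pattern from the question: the dot is unescaped and \d accepts any Unicode digit.
    public const string Original = @"^(?i)[T]20[0-9]{2}[0-1][0-9][0-3][0-9].\d{4}$";

    // Corrected pattern: literal dot, ASCII digits only.
    public const string Corrected = @"^(?i)T20[0-9]{2}[01][0-9][0-3][0-9]\.[0-9]{4}$";

    public static bool MatchesOriginal(string input) => Regex.IsMatch(input, Original);

    public static bool MatchesCorrected(string input) => Regex.IsMatch(input, Corrected);
}
```

With this, MatchesOriginal("T20160307X0001") is true while MatchesCorrected("T20160307X0001") is false. Both still accept T20161935.0001, because neither pattern validates the date itself.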

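Also: if the regex is not going to validate the date properly anyway, why try? A simpler version (capturing the date) would be: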

^(?i)T(20[0-9]{6})\.[0-9]{4}$

Then take the captured date and validate it with something like DateTime.TryParse.
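
A minimal sketch of that two-step idea, assuming the capturing pattern above: the regex checks the overall shape and group 1 is handed to the date parser. This sketch uses DateTime.TryParseExact with the "yyyyMMdd" format rather than plain DateTime.TryParse, since the captured text has no separators. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
static class TwoStepCheck
{
    // Shape check only; group 1 captures the eight date digits.
    static readonly Regex Shape = new Regex(@"^(?i)T(20[0-9]{6})\.[0-9]{4}$");

    public static bool IsValid(string input)
    {
        if (input == null) return false;

        Match m = Shape.Match(input);
        if (!m.Success) return false;

        // Let the framework decide whether the digits form a real calendar date.
        return DateTime.TryParseExact(
            m.Groups[1].Value,
            "yyyyMMdd",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out _);
    }
}
```

Here the regex stays short and readable, and month lengths and leap years are left to DateTime.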

+4

, Regex , .

, , , . .

Here is a regex that validates a date in YYYYMMdd format:

(?=\p{IsBasicLatin}{8}) # ensures \d matches only 0-9
(?!0000)\d{4} # year: any 4-digit year, except 0000
(?:0[1-9]|1[012]) # month 01-12
(?: 
   # day logic for leap years
   (?:
      (?!00)[012]\d # Days 01-29 (we exclude 2/29 later)
      | (?<!02)30  # Day 30 valid for all months except Feb
      | (?<=0[13578]|1[02])31 # Day 31 valid for some months
   )
   # Non-Leap-year logic.  Do not allow 2/29 if the provided year
   # is not a leap year.
   (?<!
     (?:
        [13579] # years ending with odd number are not leap years
        | [02468][26]|[13579][048] # years not divisible by 4
                                     # are not leap years (02, 06, 10, ...)
        | (?:[02468][\d-[048]]|[13579][\d-[26]])00 # years divisible by
                                                 # 100 are not leap years,
                                                 # unless divisible by 400

     )0229)
)

Compile with RegexOptions.IgnorePatternWhitespace. You can use ^T~\.\d{4}$ to match the full string in your example by replacing ~ with the above expression.
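
As a rough sketch of how that could be wired up in C#, the following builds the full pattern and compiles it with RegexOptions.IgnorePatternWhitespace. In this sketch the month group is written as 0[1-9]|1[012] and the day lookahead as (?!00). The class and member names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
static class DateRegexSketch
{
    // Date part (YYYYMMdd); whitespace and # comments are ignored by the regex engine.
    const string DatePattern = @"
        (?=\p{IsBasicLatin}{8})            # next 8 chars are Basic Latin, so \d means 0-9
        (?!0000)\d{4}                      # year, anything except 0000
        (?:0[1-9]|1[012])                  # month 01-12
        (?:
           (?:
              (?!00)[012]\d                # days 01-29
              | (?<!02)30                  # day 30, not in February
              | (?<=0[13578]|1[02])31      # day 31, only in 31-day months
           )
           (?<!                            # reject 02-29 when the year is not a leap year
             (?:
                [13579]                    # year ends in an odd digit
                | [02468][26] | [13579][048]                 # year not divisible by 4
                | (?:[02468][\d-[048]]|[13579][\d-[26]])00   # century not divisible by 400
             )0229)
        )";

    static readonly Regex Full = new Regex(
        @"^T" + DatePattern + @"\.\d{4}$",
        RegexOptions.IgnorePatternWhitespace);

    public static bool IsValid(string input) => input != null && Full.IsMatch(input);
}
```

For example, DateRegexSketch.IsValid("T20160229.0001") is true because 2016 is a leap year, while "T20150229.0001" and "T20160230.0001" are rejected.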

+1