For this kind of potentially complex regular expression, I try to break it into simple parts that can be individually tested, maintained and developed.
I use REL, a DSL (in Scala) that lets you reassemble and reuse your regex pieces. That way, you can define your regular expression part by part and unit test what each part matches.
Additionally, your unit / spec tests can double as the documentation for this bit of regular expression, indicating what matches and what doesn't (which tends to be important when using regular expressions).
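To give a rough idea of the approach without relying on REL itself, here is a minimal sketch in plain Python `re`. The names `DAY`, `MONTH`, `YEAR` and `DAY_MONTH_YEAR` are hypothetical, and the sub-patterns are deliberately simplified (only three months, no atomic groups), so this is an illustration of the idea rather than REL's API or the full expression.

```python
import re

# Small pieces that can each be tested on their own (simplified for illustration).
DAY = r"(?P<day>3[01]|[12][0-9]|0?[1-9])(?:st|nd|rd|th)?"
MONTH = r"(?P<month>jan(?:uary|\.)?|feb(?:ruary|\.)?|mar(?:ch|\.)?)"
YEAR = r"(?P<year>(?:1[7-9]|20)\d\d)"

# Reassembled pattern: "<day> [of] <month>[,] [<year>]", not glued to other digits.
DAY_MONTH_YEAR = re.compile(
    rf"(?<!\d){DAY} ?(?:of )?{MONTH}(?:,? ?{YEAR})?(?!\d)",
    re.IGNORECASE,
)


def matches_fully(piece: str, text: str) -> bool:
    """True if `text` as a whole is matched by the sub-pattern `piece`."""
    return re.fullmatch(piece, text, re.IGNORECASE) is not None


# Unit tests per piece; they also document what each piece accepts or rejects.
assert matches_fully(MONTH, "feb")
assert matches_fully(MONTH, "feb.")
assert matches_fully(MONTH, "February")
assert not matches_fully(MONTH, "febr")
assert matches_fully(DAY, "21st")
assert not matches_fully(DAY, "32")

# Test of the assembled expression.
m = DAY_MONTH_YEAR.search("due on the 21st of february, 2013")
assert m is not None
assert (m.group("day"), m.group("month"), m.group("year")) == ("21", "february", "2013")
```

Because each piece is an ordinary string, a failing test points directly at the part that needs fixing, and the same pieces can be reused in other assembled patterns (month-first, for instance).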
In the next version of REL (0.3), you will be able to export the regex directly to, for example, the PCRE flavor (and thus PHP) and use it independently. So far, only the JavaScript and .NET translations are implemented in the GitHub repository. Using the latest (not yet publicly available) snapshot, the PCRE flavor of the English alphanumeric date regular expression is as follows:
/(?:(?:(?<!\d)(?<a_d1>(?>(?:(?:[23]?1)st|(?:2?2)nd|(?:2?3)rd|(?:[12]?[4-9]|[123]0)th)\b|0[1-9]|[12][0-9]|3[01]|[1-9]|[12][0-9]|3[01]))(?: ?+(?:of )?+))(?>(?<a_m1>jan(?>uary|\.)?|feb(?>ruary|r?\.?)?|mar(?>ch|\.)?|apr(?>il|\.)?|may|jun(?>e|\.)?|jul(?>y|\.)?|aug(?>ust|\.)?|sep(?>tember|t?\.?)?|oct(?>ober|\.)?|nov(?>ember|\.)?|dec(?>ember|\.)?))|(?:\b(?>(?<a_m2>jan(?>uary|\.)?|feb(?>ruary|r?\.?)?|mar(?>ch|\.)?|apr(?>il|\.)?|may|jun(?>e|\.)?|jul(?>y|\.)?|aug(?>ust|\.)?|sep(?>tember|t?\.?)?|oct(?>ober|\.)?|nov(?>ember|\.)?|dec(?>ember|\.)?)))(?:(?:(?: ?+)(?<a_d2>(?>(?:(?:[23]?1)st|(?:2?2)nd|(?:2?3)rd|(?:[12]?[4-9]|[123]0)th)\b|0[1-9]|[12][0-9]|3[01]|[1-9]|[12][0-9]|3[01]))(?!\d))?))(?:(?:,?+)(?:(?:(?: ?)(?<a_y>(?:1[7-9]|20)\d\d|'?+\d\d))(?!\d))|(?<=\b|\.))/i
This was obtained from the expression fr.splayce.rel.matchers.en.Date.ALPHA using PCREFlavor (not yet in the GitHub repository). It will only match if there is a month expressed in alphabetical form (feb, feb. or february). The regular expression ….Date.ALL, which also matches numerical forms such as 2/21/2013, is even more complicated.
In addition, this particular regular expression matches your examples, but may be slightly limited for your needs:
- It does not include days of the week.
- It will not match date ranges (for example, a range starting on March 9th).
- It does not match year-first dates (2013, jan. 14th).