I need to transfer a very large dataset from one system to another. One of the source columns contains a date, but it is actually an unconstrained string, while the destination system requires a date in the format yyyy-mm-dd.
Many, but not all, source dates are formatted as yyyymmdd. Therefore, to force them to the expected format, I do (in Perl):
return "$1-$2-$3" if ($val =~ /(\d{4})[-\/]*(\d{2})[-\/]*(\d{2})/);
The problem occurs when the source dates deviate from the "typical" yyyymmdd. The goal is to salvage as many dates as possible before giving up. Examples of source values:
3/21/1998, March 2004, 2001, 3/4/97
I can try to match as many of these variants as I can find with a sequence of regular expressions like the one above.
But is there something smarter? Am I reinventing the wheel? Is there a library for this somewhere? I could not find anything suitable by searching for "forgiving date parser". (Any language is fine.)
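To make that concrete, here is a rough sketch of what such a cascade might look like. The function name normalize_date is made up for illustration, and the sketch bakes in assumptions the data may not support: slash-separated dates are read as US-style month/day/year, two-digit years below 50 are treated as 20xx and the rest as 19xx, and a missing month or day is filled in as 01.

```perl
use strict;
use warnings;

# Month lookup keyed on the first three letters of an English month name.
my %MONTH = (jan => 1, feb => 2, mar => 3, apr => 4,  may => 5,  jun => 6,
             jul => 7, aug => 8, sep => 9, oct => 10, nov => 11, dec => 12);

# Hypothetical helper: try progressively looser patterns and return
# yyyy-mm-dd, or undef when nothing matches.
sub normalize_date {
    my ($val) = @_;
    return unless defined $val;
    $val =~ s/^\s+|\s+$//g;    # trim surrounding whitespace

    my ($y, $m, $d);
    if ($val =~ /^(\d{4})[-\/]?(\d{2})[-\/]?(\d{2})$/) {
        # yyyymmdd, yyyy-mm-dd, yyyy/mm/dd
        ($y, $m, $d) = ($1, $2, $3);
    }
    elsif ($val =~ m{^(\d{1,2})/(\d{1,2})/(\d{4}|\d{2})$}) {
        # m/d/yyyy or m/d/yy (ASSUMPTION: month comes first)
        ($m, $d, $y) = ($1, $2, $3);
        # ASSUMPTION: two-digit years pivot at 50
        $y += ($y < 50 ? 2000 : 1900) if length($3) == 2;
    }
    elsif ($val =~ /^([A-Za-z]{3})[A-Za-z]*\.?\s+(\d{4})$/ && $MONTH{lc $1}) {
        # "March 2004" (ASSUMPTION: missing day becomes 01)
        ($y, $m, $d) = ($2, $MONTH{lc $1}, 1);
    }
    elsif ($val =~ /^(\d{4})$/) {
        # bare year (ASSUMPTION: missing month and day become 01)
        ($y, $m, $d) = ($1, 1, 1);
    }
    else {
        return;    # give up
    }

    # Coarse range check only; does not reject e.g. February 31.
    return if $m < 1 || $m > 12 || $d < 1 || $d > 31;
    return sprintf('%04d-%02d-%02d', $y, $m, $d);
}

for my $src ('3/21/1998', 'March 2004', '2001', '3/4/97') {
    my $out = normalize_date($src);
    print "$src => ", (defined $out ? $out : '(no match)'), "\n";
}
```

Every new variant means another branch, and filling in 01 for a missing month or day invents precision that the source never had, which is part of why I am hoping something smarter exists.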