Best way to extract a datetime from a string in Python

I have a script that processes fields in email headers that represent dates and times. Here are some examples of these lines:

    Fri, 10 Jun 2011 11:04:17 +0200 (CEST)
    Tue, 1 Jun 2011 11:04:17 +0200
    Wed, 8 Jul 1992 4:23:11 -0200
    Wed, 8 Jul 1992 4:23:11 -0200 EST

Before I came across the CEST / EST suffixes at the ends of some lines, everything worked fine using just datetime.datetime.strptime, as follows:

    msg['date'] = 'Wed, 8 Jul 1992 4:23:11 -0200'
    mail_date = datetime.datetime.strptime(msg['date'][:-6], '%a, %d %b %Y %H:%M:%S')

I tried to put together a regular expression that matches the date parts of the string while excluding the time zone information at the end, but I had trouble with it (I could not match the colons).

Is a regex the best way to parse all of the above examples? If so, can anyone share a regex that matches them? In the end, I want to have a datetime object.

2 answers

From "python time to age 2, timezone", using email.utils.parsedate_tz from the standard library:

    from email import utils

    utils.parsedate_tz('Fri, 10 Jun 2011 11:04:17 +0200 (CEST)')
    utils.parsedate_tz('Fri, 10 Jun 2011 11:04:17 +0200')
    utils.parsedate_tz('Fri, 10 Jun 2011 11:04:17')

Output:

    (2011, 6, 10, 11, 4, 17, 0, 1, -1, 7200)
    (2011, 6, 10, 11, 4, 17, 0, 1, -1, 7200)
    (2011, 6, 10, 11, 4, 17, 0, 1, -1, None)
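
Since the question asks for a datetime object in the end, here is a minimal sketch of turning that 10-tuple into one. It assumes Python 3; the helper name header_to_datetime is made up for illustration. The last tuple element is the UTC offset in seconds. As far as I know, Python 3 reports a missing zone as 0 rather than None, so a header without an offset would come back as UTC there.

```python
from datetime import datetime, timedelta, timezone
from email import utils


def header_to_datetime(value):
    """Hypothetical helper: convert an email date header to a datetime."""
    parsed = utils.parsedate_tz(value)
    if parsed is None:
        # parsedate_tz returns None when it cannot parse the string
        raise ValueError("unparseable date header: %r" % (value,))
    naive = datetime(*parsed[:6])   # year, month, day, hour, minute, second
    offset = parsed[9]              # UTC offset in seconds, or None
    if offset is None:
        return naive                # no zone information available
    return naive.replace(tzinfo=timezone(timedelta(seconds=offset)))


for header in [
    "Fri, 10 Jun 2011 11:04:17 +0200 (CEST)",
    "Wed, 8 Jul 1992 4:23:11 -0200 EST",
]:
    print(header_to_datetime(header).isoformat())
```

On Python 3.3 and later, email.utils.parsedate_to_datetime should do much the same in a single call.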

Perhaps I misunderstood your question, but wouldn't a simple split be enough?

    #!/usr/bin/python

    d = ["Fri, 10 Jun 2011 11:04:17 +0200 (CEST)",
         "Tue, 1 Jun 2011 11:04:17 +0200",
         "Wed, 8 Jul 1992 4:23:11 -0200",
         "Wed, 8 Jul 1992 4:23:11 -0200 EST"]

    # Keep only the first five whitespace-separated fields,
    # which drops the offset and any trailing zone name.
    for i in d:
        print(" ".join(i.split()[0:5]))

Output:

    Fri, 10 Jun 2011 11:04:17
    Tue, 1 Jun 2011 11:04:17
    Wed, 8 Jul 1992 4:23:11
    Wed, 8 Jul 1992 4:23:11
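
To get from the trimmed string to a datetime, the split can be combined with the strptime format from the question. This is only a sketch: the function name parse_naive is made up for illustration, the result is naive (the UTC offset is thrown away), and %a / %b depend on the current locale providing English day and month names.

```python
import datetime


def parse_naive(value):
    """Hypothetical helper: parse the date part and ignore the time zone."""
    # First five fields: weekday, day, month, year, time
    trimmed = " ".join(value.split()[:5])
    return datetime.datetime.strptime(trimmed, "%a, %d %b %Y %H:%M:%S")


print(parse_naive("Wed, 8 Jul 1992 4:23:11 -0200 EST"))
```

If the offset matters, parsing it rather than discarding it is the safer choice.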