Parse a French date in Python

Can someone tell me how I can parse a French date in Python? Sorry if this is a duplicate, but I could not find an existing question.

Here is what I tried using the dateutil parser:

    import locale
    from dateutil.parser import parse as parse_dt

    locale.setlocale(locale.LC_TIME, 'fr_FR.UTF-8')  # first, set the locale
    parse_dt('3 juillet', fuzzy=True)  # doesn't work: returns the default (current) month
    # Out[29]: datetime.datetime(2014, 10, 3, 0, 0)
    parse_dt(u'4 Août ', fuzzy=True)  # same thing with another month

Edit, to add some context:

I am parsing dates, and I don't know the format of my strings in advance. The idea is to parse many kinds of dates on the fly:

    parse_dt(u"Aujourd'hui ", fuzzy=True)
    parse_dt(u'Hier', fuzzy=True)

Edit, using another library:

Using the parsedatetime library and a regular expression to translate the French words, I can get the following:

    import parsedatetime
    import re

    cal = parsedatetime.Calendar()
    cal.parse(re.sub('juil.*', 'jul', '20 juillet'))
    # ((2015, 7, 20, 10, 25, 47, 4, 283, 1), 1)

Perhaps I should generalize this to all the French months?
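One possible way to generalize the single `re.sub` call to all twelve months is sketched below. The table `FRENCH_MONTHS` and the helper `translate_french_months` are hypothetical names introduced only for illustration; the translated string would then be handed to whichever English-language parser is in use (for example `cal.parse` above). Unaccented spellings are included on the assumption that the input may lack accents.

```python
import re

# Hypothetical lookup table: French month names (with and without accents)
# mapped to English month names that English-only parsers understand.
FRENCH_MONTHS = {
    "janvier": "January",
    "février": "February", "fevrier": "February",
    "mars": "March",
    "avril": "April",
    "mai": "May",
    "juin": "June",
    "juillet": "July",
    "août": "August", "aout": "August",
    "septembre": "September",
    "octobre": "October",
    "novembre": "November",
    "décembre": "December", "decembre": "December",
}

# One alternation over all names; \b keeps "mai" from matching inside "mais".
_MONTH_RE = re.compile(
    r"\b(" + "|".join(sorted(FRENCH_MONTHS, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def translate_french_months(text):
    """Replace every French month name in text with its English name."""
    return _MONTH_RE.sub(lambda m: FRENCH_MONTHS[m.group(1).lower()], text)
```

This only covers full month names; abbreviations and weekday names would need their own entries.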

+7
python date parsing internationalization localization
3 answers

The dateparser module can parse the dates in the question:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import dateparser  # $ pip install dateparser

    for date_string in [u"Aujourd'hui", "3 juillet", u"4 Août", u"Hier"]:
        print(dateparser.parse(date_string).date())

It translates the dates into English using a simple YAML configuration and passes the resulting date strings to dateutil.parser.

Output

    2015-09-09
    2015-07-03
    2015-08-04
    2015-09-08
+6
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import parsedatetime as pdt  # $ pip install parsedatetime pyicu

    calendar = pdt.Calendar(pdt.Constants(localeID='fr', usePyICU=True))
    for date_string in [u"Aujourd'hui", "3 juillet", u"4 Août", u"Hier"]:
        dt, success = calendar.parseDT(date_string)
        if success:
            print(date_string, dt.date())

Output

    3 juillet 2015-07-03
    4 Août 2015-08-04

Aujourd'hui and Hier are not recognized (parsedatetime 1.4).

The current version on GitHub (the future 1.5) supports customizing the day offsets. This can be used to parse Aujourd'hui and Hier:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import parsedatetime as pdt

    class pdtLocale_fr(pdt.pdt_locales.pdtLocale_icu):
        def __init__(self):
            super(pdtLocale_fr, self).__init__(localeID='fr_FR')
            self.dayOffsets.update({u"aujourd'hui": 0, u'demain': 1, u'hier': -1})

    pdt.pdtLocales['fr_FR'] = pdtLocale_fr

    calendar = pdt.Calendar(pdt.Constants(localeID='fr_FR', usePyICU=False))
    for date_string in [u"Aujourd'hui", "3 juillet", u"4 Août", u"Hier",
                        u"au jour de hui", u"aujour-d'hui", u"au-jour-d'hui",
                        "demain", "hier", u"today", "tomorrow", "yesterday"]:
        dt, rc = calendar.parseDT(date_string)
        if rc > 0:
            print(date_string, dt.date())

Output (latest version)

    Aujourd'hui 2014-10-11
    3 juillet 2015-07-03
    4 Août 2015-08-04
    Hier 2014-10-10
    demain 2014-10-12
    hier 2014-10-10
    today 2014-10-11
    tomorrow 2014-10-12
    yesterday 2014-10-10

To install it, run:

 $ pip install git+https://github.com/bear/parsedatetime 
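The day-offset idea used in this answer does not depend on parsedatetime. As a rough standard-library sketch (the names `DAY_OFFSETS` and `parse_relative_day` are hypothetical, and the three words are the same ones added to `dayOffsets` above), relative words could be resolved before falling back to a full parser:

```python
import datetime

# Hypothetical table mirroring the dayOffsets entries used above.
DAY_OFFSETS = {u"aujourd'hui": 0, u"demain": 1, u"hier": -1}


def parse_relative_day(text, today=None):
    """Return a date for a French relative-day word, or None if unknown.

    today can be passed explicitly to make the result reproducible;
    by default the current local date is used.
    """
    if today is None:
        today = datetime.date.today()
    # Normalize case, surrounding whitespace, and the typographic apostrophe.
    key = text.strip().lower().replace(u"\u2019", u"'")
    offset = DAY_OFFSETS.get(key)
    if offset is None:
        return None
    return today + datetime.timedelta(days=offset)
```

A caller would try `parse_relative_day` first and only pass the string to a general parser when it returns `None`.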
+4

First check whether the French locale is installed on your system:

    $ locale -a
    C
    C.UTF-8
    de_AT.utf8
    de_BE.utf8
    de_CH.utf8
    de_DE.utf8
    de_LI.utf8
    de_LU.utf8
    en_AG
    en_AG.utf8
    en_AU.utf8
    en_BW.utf8
    en_CA.utf8
    en_DK.utf8
    en_GB.utf8
    en_HK.utf8
    en_IE.utf8
    en_IN
    en_IN.utf8
    en_NG
    en_NG.utf8
    en_NZ.utf8
    en_PH.utf8
    en_SG.utf8
    en_US.utf8
    en_ZA.utf8
    en_ZM
    en_ZM.utf8
    en_ZW.utf8
    POSIX

If it is not listed, generate it:

    $ sudo locale-gen fr_FR.UTF-8
    Generating locales...
      fr_FR.UTF-8... done
    Generation complete.
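The availability check can also be made from Python itself. The helper below is a sketch only (`locale_available` is a hypothetical name); it relies on `locale.setlocale` raising `locale.Error` for a locale the system does not provide, and it restores the previous setting afterwards.

```python
import locale


def locale_available(name, category=locale.LC_TIME):
    """Return True if the named locale can be activated for category."""
    previous = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        # Leave the process locale as it was before the check.
        locale.setlocale(category, previous)
```

For example, `locale_available('fr_FR.UTF-8')` would be expected to return `False` on the system listed above until `locale-gen` has been run.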

Then go back to Python:

    $ python
    >>> import locale
    >>> import datetime
    >>> locale.setlocale(locale.LC_ALL, 'fr_FR.UTF-8')
    'fr_FR.UTF-8'
    >>>
    >>> date_txt = "Dimanche 3 Juin 2012"
    >>> DATE_FORMAT = "%A %d %B %Y"
    >>> datetime.datetime.strptime(date_txt, DATE_FORMAT)
    datetime.datetime(2012, 6, 3, 0, 0)
    >>>

With the date format from the question:

    >>> date_txt = "3 juillet"
    >>> DATE_FORMAT = "%d %B"
    >>> datetime.datetime.strptime(date_txt, DATE_FORMAT)
    datetime.datetime(1900, 7, 3, 0, 0)

Note that if the year is not given, it defaults to 1900.
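If 1900 is not the wanted default, one option is to append a year to the string before parsing. The helper below is a minimal sketch under that assumption; `parse_with_default_year` is a hypothetical name, and the check for a year directive is deliberately simple (`%Y` or `%y` only).

```python
import datetime


def parse_with_default_year(date_txt, date_format, default_year=None):
    """Parse date_txt, supplying a year when date_format has none.

    default_year falls back to the current year when not given.
    """
    if "%Y" in date_format or "%y" in date_format:
        # The format already carries a year; parse as-is.
        return datetime.datetime.strptime(date_txt, date_format)
    if default_year is None:
        default_year = datetime.date.today().year
    # Append the year to both the text and the format so that strptime
    # never has to fall back to 1900.
    return datetime.datetime.strptime(
        "{} {}".format(date_txt, default_year), date_format + " %Y"
    )
```

With the French locale active as above, `parse_with_default_year("3 juillet", "%d %B", 2015)` would be the corresponding call; a numeric format such as `"%d/%m"` works without any locale setup.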

+1