How to sort the messages in an mbox file chronologically?

I have a single mbox file, created with Evolution, containing a selection of emails that I want to print. My problem is that the emails are not stored in the mbox file in chronological order. I would like to know the best way to order the messages from first to last using bash, Perl, or Python. I would like to order by time received for messages addressed to me, and by time sent for messages sent by me. Would it perhaps be easier to use maildir files or similar?

Currently, emails exist in the format:

    From x@blah.com Fri Aug 12 09:34:09 2005
    Message-ID: <42FBEE81.9090701@blah.com>
    Date: Fri, 12 Aug 2005 09:34:09 +0900
    From: me <x@blah.com>
    User-Agent: Mozilla Thunderbird 1.0.6 (Windows/20050716)
    X-Accept-Language: en-us, en
    MIME-Version: 1.0
    To: someone <someone@hotmail.com>
    Subject: Re: (no subject)
    References: <BAY101-F9353854000A4758A7E2CCA9BD0@phx.gbl>
    In-Reply-To: <BAY101-F9353854000A4758A7E2CCA9BD0@phx.gbl>
    Content-Type: text/plain; charset=ISO-8859-1; format=flowed
    Content-Transfer-Encoding: 8bit
    Status: RO
    X-Status:
    X-Keywords:
    X-UID: 371
    X-Evolution-Source: imap://x+blah.com@blah.com/
    X-Evolution: 00000002-0010

    Hey

    the actual content of the email

    someone wrote:

    > lines of quoted text

I am wondering if there is a way to use this information to easily reorder the file, possibly with Perl or similar.

+3
3 answers

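Here is how you could do it in Python:

    #!/usr/bin/env python3
    import mailbox
    from email.utils import parsedate

    def extract_date(email):
        # parsedate returns a time tuple, or None if the header is missing
        # or malformed; fall back to () so such messages sort first
        date = email.get('Date')
        return parsedate(date) or ()

    the_mailbox = mailbox.mbox('/path/to/mbox')
    # Sort by the Date header, then write the messages back under keys 0..n-1
    sorted_mails = sorted(the_mailbox, key=extract_date)
    the_mailbox.update(enumerate(sorted_mails))
    the_mailbox.flush()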
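The script above compares the time tuples returned by `parsedate`, which ignore the UTC offset, and it rewrites the mailbox in place. Below is a minimal sketch of a variant that sorts on a UTC timestamp and writes to a separate file, leaving the original untouched. The names `utc_timestamp` and `write_sorted_copy` are hypothetical, and the sketch assumes the destination file does not already contain messages.

```python
import mailbox
from email.utils import mktime_tz, parsedate_tz


def utc_timestamp(message):
    """Date header as seconds since the epoch (UTC); 0 if missing or unparseable."""
    parsed = parsedate_tz(str(message.get("Date", "")))
    return mktime_tz(parsed) if parsed else 0


def write_sorted_copy(src_path, dst_path):
    """Write the messages of src_path to dst_path, oldest first (hypothetical helper)."""
    source = mailbox.mbox(src_path, create=False)  # fail if the source is missing
    target = mailbox.mbox(dst_path)                # created if it does not exist
    try:
        # sorted() loads every message into memory before writing
        for message in sorted(source, key=utc_timestamp):
            target.add(message)
        target.flush()
    finally:
        target.close()
        source.close()
```

It could be called as `write_sorted_copy('/path/to/mbox', '/path/to/sorted.mbox')`. Messages without a usable `Date` header get timestamp 0 and end up at the front.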
+9

The Python solution will not work if the messages were imported into the mbox using the Thunderbird ImportExportTools add-on. The add-on has a bug: each message must begin with a 'From ' line in the format:

    From - Tue Apr 27 19:42:22 2010

but ImportExportTools writes the 'From ' line as:

    From - Sat May 01 2010 15:07:31 GMT+0400 (Russian Daylight Time)

So there are two errors:

  • the "time year" sequence is reversed into "year time"
  • there is extra trailing text with the GMT offset and the time zone name

Since Python's mailbox.py/UnixMailbox uses a hardcoded regexp to match 'From ' lines, some of the messages cannot be parsed.

I reported the bug to the author, but many messages have already been imported incorrectly :(.
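One possible workaround is to rewrite the malformed separator lines before parsing the file. The sketch below assumes every bad line follows the single example shown above; the pattern `BAD_FROM` and the helper `fix_from_line` are hypothetical names, not part of any library. It keeps the local time as written and drops the GMT offset and zone name.

```python
import re

# Matches the malformed separator from the example above, e.g.
# "From - Sat May 01 2010 15:07:31 GMT+0400 (Russian Daylight Time)"
BAD_FROM = re.compile(
    r"^From (?P<sender>\S+) (?P<dow>\w{3}) (?P<mon>\w{3}) (?P<day>\d{1,2}) "
    r"(?P<year>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) GMT[+-]\d{4}.*$"
)


def fix_from_line(line):
    """Return line with a malformed 'From ' separator put into asctime order."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # preserve the original line ending
    match = BAD_FROM.match(body)
    if match is None:
        return line                    # well-formed separator or ordinary line
    fixed = "From {sender} {dow} {mon} {day} {time} {year}".format(**match.groupdict())
    return fixed + ending
```

Applying `fix_from_line` to each line of a copy of the file, and writing the result to a new file, leaves every other line unchanged.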

+1

What is the point of rewriting the mbox when you can reorder the messages in memory after loading the mailbox? Which time do you want to order by? The date received? The date sent? In any case, all the Ruby / Python / Perl modules for working with mboxes can do this.
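As an illustration of the in-memory approach in Python, here is a sketch that separates sent from received mail and sorts each list by a different time. It assumes "date sent" means the `Date` header and "date received" means the date after the last `;` in the topmost `Received` header. The helper names are hypothetical, and the addresses are taken from the question's sample header.

```python
from email import message_from_string
from email.utils import mktime_tz, parseaddr, parsedate_tz


def _timestamp(date_text):
    """UTC seconds since the epoch for an RFC 2822 date string; 0 on failure."""
    try:
        parsed = parsedate_tz(date_text) if date_text else None
        return mktime_tz(parsed) if parsed else 0
    except (TypeError, ValueError, OverflowError, IndexError):
        return 0


def sent_time(message):
    """Time stamped by the sender's mail client (Date header)."""
    return _timestamp(str(message.get("Date", "")))


def received_time(message):
    """Time of the last hop, read from the topmost Received header."""
    hops = message.get_all("Received") or []
    if not hops:
        return sent_time(message)  # fall back when no Received header exists
    return _timestamp(str(hops[0]).rpartition(";")[2].strip())


def split_by_sender(messages, my_address):
    """Return (sent, received) lists, judged by the From address."""
    sent, received = [], []
    for message in messages:
        sender = parseaddr(str(message.get("From", "")))[1].lower()
        (sent if sender == my_address.lower() else received).append(message)
    return sent, received


# Small in-memory demo instead of a real mbox file
raw_messages = [
    "From: someone <someone@hotmail.com>\n"
    "Received: from a by b; Fri, 12 Aug 2005 10:00:00 +0900\n"
    "Date: Fri, 12 Aug 2005 09:00:00 +0900\nSubject: second\n\nbody\n",
    "From: me <x@blah.com>\n"
    "Date: Fri, 12 Aug 2005 09:34:09 +0900\nSubject: reply\n\nbody\n",
    "From: someone <someone@hotmail.com>\n"
    "Received: from a by b; Thu, 11 Aug 2005 23:00:00 +0000\n"
    "Date: Thu, 11 Aug 2005 22:00:00 +0000\nSubject: first\n\nbody\n",
]
messages = [message_from_string(text) for text in raw_messages]
sent, received = split_by_sender(messages, "x@blah.com")
sent.sort(key=sent_time)
received.sort(key=received_time)
```

With a real file, `messages` could instead be `list(mailbox.mbox('/path/to/mbox'))`, since iterating over a `mailbox.mbox` yields message objects; nothing is written back to disk.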

-3
