Before you guys tell me that regex is the embodiment of all evil... I already know. If I had more hair, I would have torn it out already.
So, to the question. I wrote a parser that uses a regular expression to pull out the parts I want from an HTML email. Why regex? Because I'm still a beginner programmer, so if you can offer a better way, then by all means, do. The parser works great with regular HTML emails; however, if someone sends me an email with one or more attachments...
ALL HELL BREAKS LOOSE!
Instead of a regular HTML email, I get a plain-text version with the HTML version concatenated at the end, like this:
--_1b4078c9-04f5-4cca-a220-e5b30eddef46_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable To: ****@****=3B ****@**** | Emmanuel Smith=3B= Jonny Barnes cc: |bcc: |Ref: Test123 --- Lorem ipsum dolor sit amet=2C consectetur adipiscing elit. Praesent in augu= e nec justo tempor varius eu et tellus. Nunc id massa tortor=2C ut lobortis= sem. Class aptent taciti sociosqu ad litora torquent per conubia nostra=2C= per inceptos himenaeos. Maecenas quis nisl nec quam tristique posuere sed = at nibh. Cras fringilla vestibulum metus vel porttitor. 2 + 2 =3D 7 Cras ia= culis=2C erat nec gravida accumsan=2C metus felis vestibulum risus=2C quis = venenatis nisl nulla sed diam. Aenean quis viverra velit. Etiam quis massa = lectus=2C faucibus facilisis sem. Curabitur non eros tellus. Sed at ligula = neque. Donec elementum rhoncus volutpat. Curabitur eu accumsan erat. Phasel= lus auctor odio dolor=2C ut ornare augue. Suspendisse vel est nibh. Vivamus= facilisis placerat augue sit amet aliquam. Maecenas viverra=2C ipsum a tin= cidunt elementum=2C arcu tellus rutrum ipsum=2C et dignissim urna orci ac m= i. Vivamus non odio massa. Nulla congue massa eu leo pretium non consequat = urna molestie. Integer neque odio=2C scelerisque at molestie quis=2C congue sed arcu. Prae= sent a arcu odio. Donec sollicitudin=2C quam vel tincidunt lobortis=2C urna= augue cursus lorem=2C in eleifend nunc risus nec neque. Donec euismod maur= is non nibh blandit sollicitudin. Vivamus sed tincidunt augue. Suspendisse = iaculis massa ut tellus rutrum auctor. Cras venenatis consequat urna in viv= erra. Ut blandit imperdiet dolor non scelerisque. Suspendisse potenti. Sed = vitae lacus ac odio euismod tempus. Aenean ut sem odio. Curabitur auctor pu= rus a diam iaculis facilisis. Integer molestie commodo mauris a imperdiet. = Nunc aliquet tempus orci sit amet viverra. =20 Hotmail is redefining busy with tools for the New Busy. Get more from your = inbox. See how. 
=20 _________________________________________________________________ The New Busy is not the old busy. Search=2C chat and e-mail from your inbox= .. http://www.windowslive.com/campaign/thenewbusy?ocid=3DPID28326::T:WLMTAGL:O= N:WL:en-US:WM_HMP:042010_3= --_1b4078c9-04f5-4cca-a220-e5b30eddef46_ Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable <html> <head> <style><!-- ..hmmessage P { margin:0px=3B padding:0px } body.hmmessage { font-size: 10pt=3B font-family:Verdana } --></style> </head> <body class=3D'hmmessage'> To: ****@**** ****@**** | Emmanuel Smith=3B= Jonny Barnes<br><div>cc: |</div><div>bcc: |</div><div>Ref: Test123</div><d= iv><br><span class=3D"ecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxec= xecxApple-style-span" style=3D"font-family:Tahoma=2C Verdana=2C Arial=2C sa= ns-serif=3Bcolor:rgb(68=2C 68=2C 68)"><font class=3D"ecxecxecxecxecxecxecxe= cxecxecxecxecxecxecxecxApple-style-span" color=3D"#000000"><font class=3D"e= cxecxecxecxecxecxApple-style-span" face=3D"Verdana">---<br></font></font><d= iv><font class=3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana"><br>= </font></div><div><span class=3D"ecxecxecxecxecxecxecxecxecxecxecxecxecxecx= ecxecxecxecxecxecxecxecxecxecxecxecxecxecxApple-style-span" style=3D"font-s= ize:11px=3Bline-height:14px"><font class=3D"ecxecxecxecxecxecxApple-style-s= pan" face=3D"Verdana">Lorem ipsum dolor sit amet=2C consectetur adipiscing = elit. Praesent in augue nec justo tempor varius eu et tellus. Nunc id massa= tortor=2C ut lobortis sem. Class aptent taciti sociosqu ad litora torquent= per conubia nostra=2C per inceptos himenaeos. Maecenas quis nisl nec quam = tristique posuere sed at nibh. Cras fringilla vestibulum metus vel porttito= r. 2 + 2 =3D 7 Cras iaculis=2C erat nec gravida accumsan=2C metus felis ves= tibulum risus=2C quis venenatis nisl nulla sed diam. Aenean quis viverra ve= lit. Etiam quis massa lectus=2C faucibus facilisis sem. Curabitur non eros = tellus. 
Sed at ligula neque. Donec elementum rhoncus volutpat. Curabitur eu= accumsan erat. Phasellus auctor odio dolor=2C ut ornare augue. Suspendisse= vel est nibh. Vivamus facilisis placerat augue sit amet aliquam. Maecenas = viverra=2C ipsum a tincidunt elementum=2C arcu tellus rutrum ipsum=2C et di= gnissim urna orci ac mi. Vivamus non odio massa. Nulla congue massa eu leo = pretium non consequat urna molestie.</font></span></div><div><span class=3D= "ecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxec= xecxecxecxApple-style-span" style=3D"font-size:11px=3Bline-height:14px"><fo= nt class=3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana"><br></font= ></span></div><div><span class=3D"ecxecxecxecxecxecxecxecxecxecxecxecxecxec= xecxecxecxecxecxecxecxecxecxecxecxecxecxecxApple-style-span" style=3D"font-= size:11px=3Bline-height:14px"><font class=3D"ecxecxecxecxecxecxApple-style-= span" face=3D"Verdana"><br></font></span></div><div><span class=3D"ecxecxec= xecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxec= xApple-style-span" style=3D"font-size:11px=3Bline-height:14px"><font class= =3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana"><br></font></span>= </div><div><font class=3D"Apple-style-span" face=3D"Verdana" size=3D"3"><sp= an class=3D"Apple-style-span" style=3D"font-size: 11px=3B line-height: 14px= =3B"><br></span></font></div><span class=3D"ecxecxecxecxecxecxecxecxecxecxe= cxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxe= cxecxecxecxecxApple-style-span" style=3D"font-family:Arial=2C Helvetica=2C = sans=3Bfont-size:11px"><p style=3D"margin-right:0px=3Bmargin-bottom:14px=3B= margin-left:0px=3Btext-align:justify=3Bfont-size:11px=3Bline-height:14px=3B= padding-top:0px=3Bpadding-right:0px=3Bpadding-bottom:0px=3Bpadding-left:0px= "><font class=3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana">Integ= er neque odio=2C scelerisque at molestie quis=2C congue sed arcu. 
Praesent = a arcu odio. Donec sollicitudin=2C quam vel tincidunt lobortis=2C urna augu= e cursus lorem=2C in eleifend nunc risus nec neque. Donec euismod mauris no= n nibh blandit sollicitudin. Vivamus sed tincidunt augue. Suspendisse iacul= is massa ut tellus rutrum auctor. Cras venenatis consequat urna in viverra.= Ut blandit imperdiet dolor non scelerisque. Suspendisse potenti. Sed vitae= lacus ac odio euismod tempus. Aenean ut sem odio. Curabitur auctor purus a= diam iaculis facilisis. Integer molestie commodo mauris a imperdiet. Nunc = aliquet tempus orci sit amet viverra.</font></p><p style=3D"margin-right:0p= x=3Bmargin-bottom:14px=3Bmargin-left:0px=3Btext-align:justify=3Bfont-size:1= 1px=3Bline-height:14px=3Bpadding-top:0px=3Bpadding-right:0px=3Bpadding-bott= om:0px=3Bpadding-left:0px"><font class=3D"ecxecxecxecxecxecxApple-style-spa= n" face=3D"Verdana"><br></font></p><p style=3D"margin-right:0px=3Bmargin-bo= ttom:14px=3Bmargin-left:0px=3Btext-align:justify=3Bfont-size:11px=3Bline-he= ight:14px=3Bpadding-top:0px=3Bpadding-right:0px=3Bpadding-bottom:0px=3Bpadd= ing-left:0px"><font class=3D"Apple-style-span" face=3D"Verdana"><br></font>= </p></span></span></div> <br><hr>Hotmail is redefining busy with= tools for the New Busy. Get more from your inbox. <a href=3D"http://www.wi= ndowslive.com/campaign/thenewbusy?ocid=3DPID28326::T:WLMTAGL:ON:WL:en-US:WM= _HMP:042010_2">See how.</a> <br /><hr />The New Busy is not the = old busy. Search=2C chat and e-mail from your inbox. <a href=3D'http://www.= windowslive.com/campaign/thenewbusy?ocid=3DPID28326::T:WLMTAGL:ON:WL:en-US:= WM_HMP:042010_3' target=3D'_new'>Get started.</a></body> </html>= --_1b4078c9-04f5-4cca-a220-e5b30eddef46_--
So my question is: how can I separate the HTML version from the text version using regex (or easier means)?
Several open source C# MIME parsers are available:
The last two are a bit outdated. If they are not easy to compile, their source may point you in the right direction.
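To show what a MIME parser buys you over regex, here is a minimal sketch using Python's standard `email` package (not one of the C# libraries mentioned here, just an illustration of the same approach). The `RAW` message is a hypothetical, shortened stand-in for the email in the question, and `html_and_text` is a helper name made up for this example.

```python
from email import message_from_string, policy

# Hypothetical, shortened message with the same layout as the one in the question.
RAW = (
    'MIME-Version: 1.0\r\n'
    'Content-Type: multipart/alternative; boundary="_b1_"\r\n'
    '\r\n'
    '--_b1_\r\n'
    'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    'Content-Transfer-Encoding: quoted-printable\r\n'
    '\r\n'
    '2 + 2 =3D 7\r\n'
    '--_b1_\r\n'
    'Content-Type: text/html; charset="iso-8859-1"\r\n'
    'Content-Transfer-Encoding: quoted-printable\r\n'
    '\r\n'
    '<html><body class=3D"hmmessage">2 + 2 =3D 7</body></html>\r\n'
    '--_b1_--\r\n'
)


def html_and_text(raw):
    """Return (html, text) bodies of a raw message, or None for a missing one."""
    # policy.default gives an EmailMessage, which provides get_body().
    msg = message_from_string(raw, policy=policy.default)
    html_part = msg.get_body(preferencelist=("html",))
    text_part = msg.get_body(preferencelist=("plain",))
    # get_content() undoes the quoted-printable encoding and the charset.
    html = html_part.get_content() if html_part is not None else None
    text = text_part.get_content() if text_part is not None else None
    return html, text
```

Calling `html_and_text(RAW)` gives the two bodies already decoded, so the `=3D` sequences come back as plain `=` without any extra work.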
, , , .. .. - Regex .
, Google MIME, nitty gritty. , RegEx ( , , RegEx , ).
, :
--_1b4078c9-04f5-4cca-a220-e5b30eddef46_
MIME, MIME. MIME , HTML- , , . , , , , MIME.
, "Content-Type", MIME "boundary=". ( ) , .
, RegEx, , , . RegEx , , , , - :
myMessage.Split(myBoundary)
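The split-on-boundary idea above can be sketched roughly like this, in Python rather than C#. `SAMPLE`, `split_parts` and `find_html` are names invented for the example, and the sketch assumes the boundary string never appears inside a body, which a real MIME parser would not rely on.

```python
# Hypothetical, shortened message body; the boundary here is "_b1_".
SAMPLE = "\n".join([
    "--_b1_",
    'Content-Type: text/plain; charset="iso-8859-1"',
    "",
    "plain body",
    "--_b1_",
    'Content-Type: text/html; charset="iso-8859-1"',
    "",
    "<html><body>html body</body></html>",
    "--_b1_--",
    "",
])


def split_parts(message, boundary):
    """Split a multipart body into its parts (headers plus body, still raw)."""
    message = message.replace("\r\n", "\n")  # normalise line endings first
    chunks = message.split("--" + boundary)
    parts = []
    for chunk in chunks[1:]:          # chunks[0] is whatever precedes the first part
        if chunk.startswith("--"):    # "--boundary--" marks the end of the message
            break
        parts.append(chunk.strip("\n"))
    return parts


def find_html(parts):
    """Return the body of the first text/html part, or None if there is none."""
    for part in parts:
        headers, _, body = part.partition("\n\n")  # blank line ends the headers
        if "content-type: text/html" in headers.lower():
            return body
    return None
```

With the sample above, `find_html(split_parts(SAMPLE, "_b1_"))` returns only the HTML markup. The body is still quoted-printable encoded if the part says so.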
, HTML Content-Type: text/html;, , , , HTML. , , (.+)Content-Type: text/html; charset="iso-8859-1"(.+). 1, HTML 2. , . \n, .
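For what it's worth, here is a small sketch of how the pattern quoted above behaves, written in Python with a made-up `SAMPLE`. By default `.` does not match a newline, so the pattern finds nothing when the header sits on its own line; with `re.DOTALL` (the counterpart of `RegexOptions.Singleline` in .NET) it does match, and group 2 holds everything after the header.

```python
import re

# Hypothetical, shortened message body for illustration only.
SAMPLE = (
    '--_b1_\n'
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    '\n'
    'plain body\n'
    '--_b1_\n'
    'Content-Type: text/html; charset="iso-8859-1"\n'
    '\n'
    '<html><body>html body</body></html>\n'
    '--_b1_--\n'
)

PATTERN = r'(.+)Content-Type: text/html; charset="iso-8859-1"(.+)'

# Without DOTALL, "." stops at every newline, so nothing matches here.
single_line = re.search(PATTERN, SAMPLE)

# With DOTALL, "." also matches "\n" and both groups can span many lines.
match = re.search(PATTERN, SAMPLE, re.DOTALL)
before, after = match.group(1), match.group(2)
```

Note that `after` still ends with the closing boundary line, so it needs trimming, and the pattern breaks as soon as the charset differs. That fragility is a good argument for splitting on the boundary or using a parser instead.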
, , .
: "--1b4078c9-04f5-4cca-a220-e5b30eddef46" - , MIME . .
, MIME, , : " " , . , - MIME, , .:) ( "" ) (/html). , , , , MIME. html- , .
, . . :
(1) MIME part separator unknown
(2) MIME part separator known, MIME part not yet seen
(3) MIME part seen, type is not HTML
(4) MIME part seen, type is HTML, blank line not yet seen
(5) Blank line seen, MIME part separator not yet seen
(6) MIME part separator seen (end of processing)
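A rough sketch of those six states as a line-by-line scanner, in Python. The state names, `SAMPLE` and `extract_html` are invented for the example, and it assumes the top-level `Content-Type` header carrying `boundary=` is part of the input. It is also fooled by a body line that happens to start with `Content-Type:`, so treat it as an outline rather than a finished parser.

```python
import re
from enum import Enum, auto


class State(Enum):
    BOUNDARY_UNKNOWN = auto()   # (1) part separator unknown
    AWAITING_PART = auto()      # (2) separator known, no part seen yet
    IN_OTHER_PART = auto()      # (3) part seen, type is not HTML
    IN_HTML_HEADERS = auto()    # (4) HTML part seen, blank line not yet seen
    IN_HTML_BODY = auto()       # (5) blank line seen, next separator not yet seen
    DONE = auto()               # (6) separator seen, end of processing


BOUNDARY_RE = re.compile(r'boundary="?([^";\s]+)"?', re.IGNORECASE)

# Hypothetical, shortened message for illustration only.
SAMPLE = "\n".join([
    'Content-Type: multipart/alternative; boundary="_b1_"',
    "",
    "--_b1_",
    'Content-Type: text/plain; charset="iso-8859-1"',
    "",
    "plain body",
    "--_b1_",
    'Content-Type: text/html; charset="iso-8859-1"',
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "<html><body>",
    "html body",
    "</body></html>",
    "--_b1_--",
])


def extract_html(raw):
    """Return the raw (still encoded) HTML body, or None if none was found."""
    state = State.BOUNDARY_UNKNOWN
    delimiter = None
    html_lines = []
    for line in raw.splitlines():
        if state is State.BOUNDARY_UNKNOWN:
            found = BOUNDARY_RE.search(line)
            if found:
                delimiter = "--" + found.group(1)
                state = State.AWAITING_PART
        elif state is State.AWAITING_PART:
            if line.startswith(delimiter):
                state = State.IN_OTHER_PART
        elif state is State.IN_OTHER_PART:
            lowered = line.lower()
            if lowered.startswith("content-type:") and "text/html" in lowered:
                state = State.IN_HTML_HEADERS
        elif state is State.IN_HTML_HEADERS:
            if line == "":
                state = State.IN_HTML_BODY
        elif state is State.IN_HTML_BODY:
            if line.startswith(delimiter):
                state = State.DONE
                break
            html_lines.append(line)
    return "\n".join(html_lines) if state is State.DONE else None
```

The result is still quoted-printable encoded when the part says so. In Python, `quopri.decodestring` from the standard library can undo that.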
This is very quick, loose, and sloppy. Hope it helps.
John.