Regarding character decoding and mime decoding - mime-types

I have developed a program in Java which fetches the subject, sender, from address and date/time of emails from an email account. I have done that using an HTML parser and HttpClient. I have two problems.
When I parse the subject string of an email I sometimes get weird characters. For example, if the subject is "Hi Mr. müller", I receive the subject string as "Hi Mr. mÃ¼ller". As you can see, the ü character does not come through properly. Any idea which encoding this is? Is it UTF-8? How do I decode it to get the original string?
I have also fetched email information such as subject, sender, receiver and date/time from a Yahoo account over POP3. There I noticed that when the sender contains ü or ue (e.g. reva.müller@gmx.de), it is encoded like this: ('=?iso-8859-1?Q?=22Reva_M=FCller=22?= '). Any idea which encoding this is? Is it MIME encoding? How do I decode it in Java to get the correct sender string?
I would really appreciate any help.

You need to read the RFC: http://www.ietf.org/rfc/rfc2045.txt. It will tell you how to interpret those = signs.
See "6.7. Quoted-Printable Content-Transfer-Encoding".
Also look for a Content-Type header to clue you in on the encoding.
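As a rough illustration of both decodings, here is a minimal sketch in Python rather than Java, chosen only because its standard library covers both cases. The `=?iso-8859-1?Q?...?=` form is a MIME encoded-word (RFC 2047) whose payload uses a quoted-printable-style "Q" encoding. For the subject, the sketch assumes the garbling comes from UTF-8 bytes being read as ISO-8859-1; if the cause is different, that repair does not apply. The helper names are made up for this example.

```python
from email.header import decode_header, make_header


def decode_mime_header(value: str) -> str:
    """Decode RFC 2047 encoded-words such as =?iso-8859-1?Q?...?= into text."""
    return str(make_header(decode_header(value)))


def repair_utf8_read_as_latin1(text: str) -> str:
    """Undo mojibake, ASSUMING UTF-8 bytes were wrongly decoded as ISO-8859-1."""
    return text.encode("iso-8859-1").decode("utf-8")


# Sender value from the question: Q-encoded, charset iso-8859-1.
sender = decode_mime_header("=?iso-8859-1?Q?=22Reva_M=FCller=22?=")

# Garbled subject written with escapes: "m", U+00C3, U+00BC, "ller".
subject = repair_utf8_read_as_latin1("Hi Mr. m\u00c3\u00bcller")
```

In Java the same two steps would need a mail library's header decoder and an explicit re-decoding of the bytes with the right charset; the exact calls are not shown here.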

Related

Unable to retrieve attachments from a signed mail having ContentType as "application/pkcs7-mime; name=smime.p7m"

I am trying to read a digitally signed mail from Java code using multipart and MIME messaging, and to fetch the attachments (xml, pdf, txt, etc.) and message details.
My code works fine for mails whose Content-Type is: multipart/signed; protocol="application/x-pkcs7-signature";
But for a few mails whose Content-Type is: application/pkcs7-mime; smime-type=signed-data; name=smime.p7m, it does not fetch the attachments and message details. Can anyone explain the difference between the two and how to resolve this?
I recently came across this issue myself, and although this question is three months old, I am leaving an answer with my findings, just in case.
Both kinds of messages are instances of S/MIME signed messages as specified in RFC2633 (https://www.rfc-editor.org/rfc/rfc2633).
The multipart/signed; protocol="application/x-pkcs7-signature" indicates a clear-signed message (section 3.4.3.3 of the RFC), meaning you can read the original message content without having S/MIME capabilities in your client code. Hence no problem with these.
The application/pkcs7-mime; smime-type=signed-data; name=smime.p7m indicates an S/MIME signedData email (section 3.4.2). Your client code needs S/MIME capability in order to read the original message (even if you don't care about the signature).
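To decide which code path a message needs, the two forms can be told apart from the Content-Type header alone. The following is a minimal sketch in Python (not the Java code from the question); the function name and return labels are invented for illustration, and it only inspects headers without verifying or unwrapping any signature.

```python
from email.parser import HeaderParser


def smime_signature_kind(raw_headers: str) -> str:
    """Return 'clear-signed', 'signed-data' or 'other' based on Content-Type."""
    msg = HeaderParser().parsestr(raw_headers)
    content_type = msg.get_content_type()
    if content_type == "multipart/signed":
        # Original content is readable as an ordinary MIME part.
        return "clear-signed"
    if (content_type == "application/pkcs7-mime"
            and msg.get_param("smime-type") == "signed-data"):
        # Original content is wrapped inside the signed blob.
        return "signed-data"
    return "other"


clear = smime_signature_kind(
    'Content-Type: multipart/signed; '
    'protocol="application/x-pkcs7-signature"; boundary="b"\n\n'
)
wrapped = smime_signature_kind(
    "Content-Type: application/pkcs7-mime; "
    "smime-type=signed-data; name=smime.p7m\n\n"
)
```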
Easiest way (worked for me) is to use bouncycastle's SMIMESigned class (from the S/MIME API, https://mvnrepository.com/artifact/org.bouncycastle/bcmail-jdk15on), like this:
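import javax.mail.internet.MimeBodyPart;
import javax.mail.internet.MimeMultipart;
import javax.mail.util.ByteArrayDataSource;
import org.bouncycastle.mail.smime.SMIMESigned;
byte[] content = <the signed data's content as byte[]>;
ByteArrayDataSource dataSource = new ByteArrayDataSource(content, "multipart/signed");
SMIMESigned signedData = new SMIMESigned(new MimeMultipart(dataSource));
MimeBodyPart bodyPart = signedData.getContent(); // the original message content
// You can process the body part as normal from here.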

libcurl imaps doesn't return Header Field "Content-Transfer-Encoding" or "Content-Type"

I have been trying to detect the encoding of an email manually, to see whether the email I have just fetched needs to be base64 decoded or not, but that wasn't a complete solution.
So now I'm trying to download the header fields of an email first, check what kind of email it is, and then decode it with base64 if the header says it is base64 encoded, or skip decoding if it is plain text or HTML.
The problem is that the libcurl commands for fetching these fields don't really work. Most of the time "Content-Type" is either an empty string or says multipart/alternative, but when I check the email manually it is just plain text, which obviously doesn't need to be decoded.
A message whose "Content-Type" is multipart/alternative usually has different parts such as text, HTML, base64 encoded text, attachments, etc.
For multipart/alternative messages the "Content-Transfer-Encoding" field isn't returned at all, and this is the most important header field for me to know what the message contains.
My imaps request to Gmail account looks like this:
curl_easy_setopt(curl, CURLOPT_URL,"imaps://imap.gmail.com:993/INBOX;UID=33;SECTION=HEADER.FIELDS%%20(Content-Transfer-Encoding%%20Subject%%20From%%20Content-Type)");
Which returns this:
Subject: my subject
From: myname
Content-Type: multipart/alternative; boundary="001a114930a049c6da05dskdls9"
As you can see, it doesn't return the "Content-Transfer-Encoding" field.
This email actually contains only the word "Yup", so it is plain text with no attachments, as confirmed by clicking "show original" on the message in the browser.
Sys info:
Linux, C , libcurl, gmail
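One possible explanation for the missing field, offered as an assumption rather than a confirmed diagnosis: in a multipart message each part carries its own Content-Type and Content-Transfer-Encoding headers, so the top-level header block need not contain a Content-Transfer-Encoding line at all. The sketch below uses Python's email package (not libcurl or C) on a made-up message to show where those per-part headers sit; the boundary, part bodies and variable names are invented for illustration.

```python
import base64
from email import message_from_string

# Hypothetical message: a plain part with no encoding header and a
# base64-encoded HTML alternative.
html_b64 = base64.b64encode(b"<div>Yup</div>").decode("ascii")
raw = (
    "Subject: my subject\n"
    "From: myname\n"
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    'Content-Type: text/plain; charset="UTF-8"\n'
    "\n"
    "Yup\n"
    "--b1\n"
    'Content-Type: text/html; charset="UTF-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + html_b64 + "\n"
    "--b1--\n"
)

msg = message_from_string(raw)
top_level_cte = msg["Content-Transfer-Encoding"]  # absent at the top level

parts = []
for part in msg.walk():
    if part.is_multipart():
        continue  # the container itself has no body of its own
    parts.append((
        part.get_content_type(),
        part["Content-Transfer-Encoding"],  # per-part header, may be None
        part.get_payload(decode=True),      # bytes, decoded if base64/QP
    ))
```

Under this reading, the decision to base64-decode would be made per part, after looking at that part's own headers, rather than once for the whole message.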

CakePHP Email Transport Encoding

I've created a custom transport for CakeEmail (to allow me to use Mandrill to send email). However, whenever I access the content of the message (which is driven by a CakeEmail template), the characters are not encoded correctly (it changes 'é' to 'Ã©', etc). If I use CakeEmail and bypass the transport, the characters display correctly in the email. I've narrowed this down to $email->message('html') in the transport code. If I output $email->message('html'), the characters are already incorrect.
App::uses('AbstractTransport', 'Network/Email');
App::uses('HttpSocket', 'Network/Http');
class MandrillTransport extends AbstractTransport {
public function send(CakeEmail $email) {
debug($email->message('html'));exit;
}
}
Thoughts?
You most probably have an encoding mismatch somewhere. If, for example, your App.encoding doesn't match your CakeEmail::$charset, CakeEmail will try to convert the content from App.encoding to CakeEmail::$charset.
https://github.com/cakephp/.../2.6.2/lib/Cake/Network/Email/CakeEmail.php#L1338
If for example the former were iso-8859-1, and the latter utf-8, just like the content, you would end up with the result you are showing here.
// outputs Ã© when displayed as utf-8/unicode
echo mb_convert_encoding('é', 'utf-8', 'iso-8859-1');
You'll have to do some further debugging to trace down where exactly things are going wrong.
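The byte-level effect of such a mismatch can be reproduced outside PHP. This is a minimal sketch in Python, assuming the content is already UTF-8 and is wrongly treated as ISO-8859-1 before being converted to UTF-8 again; the variable names are illustrative only.

```python
original = "\u00e9"                           # the single character e-acute

utf8_bytes = original.encode("utf-8")         # b'\xc3\xa9', the real content
misread = utf8_bytes.decode("iso-8859-1")     # each byte becomes one character
double_encoded = misread.encode("utf-8")      # b'\xc3\x83\xc2\xa9'

# Displayed as UTF-8, the double-encoded bytes show two characters
# (U+00C3 and U+00A9) instead of one.
shown = double_encoded.decode("utf-8")

# Reversing the wrong step recovers the original text.
repaired = shown.encode("iso-8859-1").decode("utf-8")
```

Comparing the byte length of the content before and after the conversion (2 bytes versus 4 for this character) is one quick way to spot where the extra conversion happens.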

How can I send a file, for example "exe", with Telnet?

With SMTP, I know the commands "HELO", "MAIL FROM", "RCPT TO" and "QUIT", but I don't know how to attach a file. Can anyone help me?
telnet smtp.xxxx.xxxx 25
helo xxxx.xxxx
mail from: yyy@xxx.xxx
rcpt to: yy2@xxx.xxx
data
subject: hi
hello
.
quit
This is not something that can easily be done by hand. You probably want to use a library (e.g. in Python) that will take care of formatting your email according to your needs.
Very briefly:
Sending an attachment requires the email to be formatted according to the MIME RFCs.
A MIME-formatted message uses delimiters to separate the different parts of the message (e.g. a plain text part, an HTML part, an attachment part, etc.).
Each MIME part is prefixed by a header describing the part's content.
An attachment part is identified by a "Content-Disposition" header, as detailed in RFC 2183.
The representation of your file has to be specified using the "Content-Transfer-Encoding" header, described in RFC 2045. A common way to encode files for mail transfer is base64.
If you want to get an idea of how complex it is to generate an email with a valid attachment, you can use your email client to check the source of an email with an attachment (most email clients have this feature). That should eventually convince you to avoid doing this manually :)
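To see what a library produces, here is a minimal sketch using Python's standard email package. The addresses, file name and payload bytes are placeholders invented for this example. It only builds the message text; it does not send anything.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"        # placeholder address
msg["To"] = "recipient@example.org"       # placeholder address
msg["Subject"] = "hi"
msg.set_content("hello")                  # the plain text part

# Placeholder bytes standing in for the contents of a real file.
payload = b"MZ\x90\x00 not a real executable"
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="demo.exe",
)

# The full MIME text: boundaries, per-part headers and base64 body.
wire_format = msg.as_string()
attachment = next(msg.iter_attachments())
```

Printing `wire_format` shows everything that would have to be typed after the DATA command in a Telnet session. In practice the message would be handed to an SMTP client such as `smtplib.SMTP(...).send_message(msg)` instead.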

How to read body of mail using javax.mail.internet.MimeMultipart

I am trying to read the body contents of a mail and I keep getting the following "Message" string instead of the actual body of the mail:
SentDate : Mon May 21 14:56:47 CAT 2012
From : {FROM}
Subject : TEST
Message :javax.mail.internet.MimeMultipart@320e7a
Any help or advice on what exactly I might be missing?
Thanking you in advance
Faheem
A mail could be plain text, HTML, multipart (text + attachments), multipart/alternative (text + HTML), etc.
You have to iterate through each BodyPart to determine its type and then get the content accordingly. This JavaMail FAQ entry could help you.
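The same idea, looking inside the multipart container instead of printing the container object, can be sketched with Python's email package. This is an analogue of the JavaMail approach, not JavaMail code, and the sample message is invented for illustration.

```python
from email import message_from_string, policy

RAW = """\
From: sender@example.org
Subject: TEST
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="sep"

--sep
Content-Type: text/plain; charset="utf-8"

Hello from the plain part
--sep
Content-Type: text/html; charset="utf-8"

<p>Hello from the HTML part</p>
--sep--
"""

msg = message_from_string(RAW, policy=policy.default)

# The top-level content is a container, not text: list what it holds.
part_types = [part.get_content_type() for part in msg.iter_parts()]

# Pick the preferred textual part and read its decoded content.
body_part = msg.get_body(preferencelist=("plain", "html"))
text = body_part.get_content()
```

Swapping the order in `preferencelist` would return the HTML alternative instead.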
