I'm piping incoming mail into a PHP script, immediately storing the RAW email in a MySQL db. It works very well, except ~0.7% of emails arrive with a truncated message body.
I found someone whose emails were failing and had them send the same email both to my Gmail account and to the server. Gmail had no problems and I saw the whole message there. But my server cropped the message; the raw message as Gmail received it looks like this:
Delivered-To: asdasd@gmail.com
Received: by 10.152.1.193 with SMTP id 1csp3490lao;
Mon, 20 Oct 2014 05:33:31 -0700 (PDT)
Return-Path:
Received: from vps123.blahblah.com (vps123.blahblah.com. [74.124.111.111])
by mx.google.com with ESMTPS id fb7si7786786pab.30.2014.10.20.05.33.30
for
(version=TLSv1 cipher=RC4-SHA bits=128/128);
Mon, 20 Oct 2014 05:33:30 -0700 (PDT)
Message-ID: <14FBD481E1074C79AF3D@acerDator>
From: =?utf-8?Q?sende=C3=A4r?=
To: "test"
References:
Subject: Message body will contain only Det h
Date: Mon, 20 Oct 2014 14:33:24 +0200
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0018_01CFEC72.CE424470"
X-Priority: 3
X-MSMail-Priority: Normal
Importance: Normal
X-Mailer: Microsoft Windows Live Mail 14.0.8117.416
X-MimeOLE: Produced By Microsoft MimeOLE V14.0.8117.416
X-Source:
X-Source-Args:
X-Source-Dir:
Det här är ett flerdelat meddelande i MIME-format.
------=_NextPart_000_0018_01CFEC72.CE424470
Content-Type: text/plain;
charset="utf-8"
Content-Transfer-Encoding: quoted-printable
This email will not be received correctly. EXIM may not handle =
some poorly formed emails. For example ...
Det h=E4r =E4r ett flerdelat meddelande i MIME-format.
... is directly above this quoted-printable wrapper, thanks to the =
Swedish email client Microsoft Windows Live (circa 2009), adding UTF-8 =
chars where there should only be ascii. At least, that's what I think =
the problem is.
------=_NextPart_000_0018_01CFEC72.CE424470--
My server crops the message immediately before the first foreign character. The stored raw data contains the headers, a blank line, "Det h", and nothing else.
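One way to check that the cut really falls at the first non-ASCII byte is to compare the stored length with that byte's offset in the original file. The following is a minimal Python sketch, not part of my setup: `first_non_ascii_offset` is a hypothetical helper name, and the sample bytes are a shortened stand-in for the message above, assuming the client wrote "ä" as a single raw 0xE4 byte in the MIME preamble.

```python
def first_non_ascii_offset(raw: bytes) -> int:
    """Return the index of the first byte >= 0x80, or -1 if every byte is ASCII."""
    for i, b in enumerate(raw):
        if b >= 0x80:
            return i
    return -1


# Shortened stand-in for the message above. The 0xE4 byte is an assumption
# about how the mail client encoded "ä" in the preamble line.
sample = b"X-Source-Dir:\r\n\r\nDet h\xe4r \xe4r ett flerdelat meddelande"

offset = first_non_ascii_offset(sample)
stored = sample[:offset]  # what a stream cut at that byte would leave behind
print(offset, stored[-5:])
```

If the length of the row stored in MySQL matches this offset for every failing message, the truncation is tied to the 8-bit byte rather than to message size or timing.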
When I pipe the above email into the PHP script from the shell (/blah/email_in.php < bademail.txt), it stores the message perfectly. So I don't think my script is at fault; it stores the raw STDIN correctly.
I used cPanel to "Set Default Address" to "Pipe to a program". I don't know whether or not this setting bypasses EXIM entirely, but I read somewhere that EXIM handles the pipe transport, so my first guess is that EXIM is mangling a poorly formatted message, and choking the stream at the first unicode character ä.
To confirm this, I need a way to pipe email INTO EXIM, basically tricking EXIM into thinking it just received an email when actually it just received a txt file. I've found several tutorials on how to telnet to port 25 and so on, but none that would preserve the headers and multipart boundaries, and none that made sense to a unix n00b like me who relies on cPanel.
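For what it's worth, one possible way to replay a saved message through SMTP without retyping it over telnet is Python's standard `smtplib`, which sends a bytes payload without re-encoding it. This is only a sketch under assumptions: `to_crlf` and `inject` are hypothetical helper names, the server is assumed to accept unauthenticated SMTP on localhost port 25, and whether EXIM passes the 8-bit body through unchanged is exactly what the test would show.

```python
import re
import smtplib


def to_crlf(raw: bytes) -> bytes:
    """Normalise bare CR or LF line endings to CRLF, as SMTP expects."""
    return re.sub(rb"\r\n|\r|\n", b"\r\n", raw)


def inject(path: str, sender: str, recipient: str,
           host: str = "localhost", port: int = 25) -> None:
    """Send the file at `path` over SMTP, unchanged apart from line endings."""
    with open(path, "rb") as fh:  # binary mode, so no charset decoding happens
        raw = to_crlf(fh.read())
    with smtplib.SMTP(host, port) as smtp:
        # smtplib leaves a bytes message as-is, so the original headers and
        # multipart boundaries go out exactly as they appear in the file.
        smtp.sendmail(sender, [recipient], raw)
```

Calling `inject("bademail.txt", "tester@example.com", "anything@example.com")` with placeholder addresses (the recipient being one that falls through to the piped default address) should make EXIM treat the file as freshly received mail, and the stored row can then be compared with the file.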
Am I correct about EXIM being the likely culprit?
Can anyone suggest a way to test this, or an alternative approach?
My server runs EXIM + Dovecot on CentOS 6.5.
p.s. My only other thought is to let the server store mail normally, and if these messages are magically stored correctly, to use IMAP to retrieve/delete the messages rather than going directly into the pipe... seems less efficient to add the IMAP middleman, though this approach is probably more robust.
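A rough idea of what that IMAP fallback could look like, using Python's standard `imaplib`. This is a sketch, not a tested replacement for the pipe: `drain_inbox` and `handle` are hypothetical names, the mailbox is assumed to be reachable over IMAP with SSL, and `handle` stands in for whatever stores the raw message in the database.

```python
import imaplib


def drain_inbox(host: str, user: str, password: str, handle) -> int:
    """Fetch each INBOX message as raw bytes, pass it to `handle`, then delete it."""
    count = 0
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        status, data = imap.search(None, "ALL")
        for num in data[0].split():
            status, parts = imap.fetch(num, "(RFC822)")
            raw = parts[0][1]  # full message as bytes, headers included
            handle(raw)        # if this raises, the message is not flagged
            imap.store(num, "+FLAGS", "\\Deleted")
            count += 1
        imap.expunge()         # actually remove the flagged messages
    return count
```

Flagging a message for deletion only after `handle` returns means a failed database insert leaves the message on the server for the next run, which is where the extra robustness would come from.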