[WIP] Issue with emails attached to an email #43

Open
wants to merge 35 commits into base: master
35 commits
87ffae7
Fixes tedivm/Fetch#43 by creating a name for nameless attached emails…
AdrianTP Apr 2, 2014
39eba97
Removed error_log statements I mistakenly left in while debugging; co…
AdrianTP Apr 7, 2014
3ac1f49
Abstracted messageBody processing from processStructure and enabled e…
AdrianTP Apr 17, 2014
73c26e9
Forgot to document the new method.
AdrianTP Apr 17, 2014
8281c08
Some more changes to support pulling the Subject line from a .eml and…
AdrianTP May 12, 2014
b225230
Fixed issue with the Subject line parsing regex, which would cause it…
AdrianTP May 12, 2014
45741af
syntax error
AdrianTP May 12, 2014
85241d7
Fixing bugs reported by scrutinizer ('The class Fetch\Exception does …
AdrianTP May 12, 2014
1a613d4
Converted tabs to 4 spaces. Moved opening braces to the line after th…
AdrianTP May 13, 2014
881a3ff
Ran PHP-CS-Fixer.
AdrianTP May 13, 2014
5a1150e
Subject-to-Filename parsing got confused by DKIM-Signature section of…
AdrianTP May 13, 2014
e585f16
My IDE lost its tabs --> spaces setting, and ended up putting tabs in…
AdrianTP May 13, 2014
ece5dfd
Made some changes to the processing of email contents and encoded Sub…
AdrianTP May 20, 2014
7078427
Simplified encoded Subject-line processing into filename. Skip proces…
AdrianTP May 23, 2014
a01932d
The code was still causing attached HTML files to be inlined. Now any…
AdrianTP May 27, 2014
766db0a
Merge pull request #75 from Aeolun/master
tedivm Jul 28, 2014
5ee7fc8
Fixes tedivm/Fetch#43 by creating a name for nameless attached emails…
AdrianTP Apr 2, 2014
c9c5435
Removed error_log statements I mistakenly left in while debugging; co…
AdrianTP Apr 7, 2014
d2fd08e
Abstracted messageBody processing from processStructure and enabled e…
AdrianTP Apr 17, 2014
14ace38
Forgot to document the new method.
AdrianTP Apr 17, 2014
827dea1
Some more changes to support pulling the Subject line from a .eml and…
AdrianTP May 12, 2014
7572fa4
Fixed issue with the Subject line parsing regex, which would cause it…
AdrianTP May 12, 2014
ebce9b8
syntax error
AdrianTP May 12, 2014
3b7398e
Fixing bugs reported by scrutinizer ('The class Fetch\Exception does …
AdrianTP May 12, 2014
1672250
Converted tabs to 4 spaces. Moved opening braces to the line after th…
AdrianTP May 13, 2014
3540360
Ran PHP-CS-Fixer.
AdrianTP May 13, 2014
a4c98f6
Subject-to-Filename parsing got confused by DKIM-Signature section of…
AdrianTP May 13, 2014
a44a856
My IDE lost its tabs --> spaces setting, and ended up putting tabs in…
AdrianTP May 13, 2014
1143d4c
Made some changes to the processing of email contents and encoded Sub…
AdrianTP May 20, 2014
79d96ff
Simplified encoded Subject-line processing into filename. Skip proces…
AdrianTP May 23, 2014
15e5160
The code was still causing attached HTML files to be inlined. Now any…
AdrianTP May 27, 2014
dc98c0f
local merge
AdrianTP Jul 30, 2014
27c1c88
Travis build failed on bootstrap.php because of a change my push/merg…
AdrianTP Jul 30, 2014
b6fa0f4
Fixing another error which I don't recall causing.
AdrianTP Jul 30, 2014
0c3f751
Ran PHP-CS-Fixer again.
AdrianTP Jul 30, 2014
18 changes: 13 additions & 5 deletions src/Fetch/Attachment.php
@@ -109,20 +109,28 @@ public function __construct(Message $message, $structure, $partIdentifier = null
}

/**
* This function returns the data of the attachment. Combined with getMimeType() it can be used to directly output
* data to a browser.
* This function returns the data of the attachment. Combined with
* getMimeType() it can be used to directly output data to a browser.
*
* If the attachment file is message/rfc822, skip processing/decoding the
* contents in order to avoid mangling the file. Otherwise, decode as
* normal to ensure other files are handled correctly.
*
* @return string
*/
public function getData()
{
if (!isset($this->data)) {
$messageBody = isset($this->partId) ?
$rawBody = isset($this->partId) ?
imap_fetchbody($this->imapStream, $this->messageId, $this->partId, FT_UID)
: imap_body($this->imapStream, $this->messageId, FT_UID);

$messageBody = Message::decode($messageBody, $this->encoding);
$this->data = $messageBody;
if (strpos(strtolower($this->mimeType), "rfc822") !== false) {
$this->data = $rawBody;
} else {
$decodedBody = Message::decode($rawBody, $this->encoding);
$this->data = $decodedBody;
}
}

return $this->data;
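
The new branch in getData() can be illustrated outside the class. This is only a minimal sketch, not the PR's code: `decodeBody()` is a hypothetical stand-in for `Message::decode()` (whose body is not part of this diff), and the encoding numbers are assumed to follow the values documented for `imap_fetchstructure` (3 for base64, 4 for quoted-printable).

```php
<?php
// Encoding identifiers as documented for imap_fetchstructure();
// redefined locally so the sketch runs without the imap extension.
const SKETCH_ENC_BASE64 = 3;
const SKETCH_ENC_QUOTED_PRINTABLE = 4;

// Hypothetical stand-in for Message::decode(), which this diff does not show.
function decodeBody(string $raw, int $encoding): string
{
    switch ($encoding) {
        case SKETCH_ENC_BASE64:
            return base64_decode($raw);
        case SKETCH_ENC_QUOTED_PRINTABLE:
            return quoted_printable_decode($raw);
        default:
            return $raw;
    }
}

// Mirrors the new branch: message/rfc822 parts are returned untouched,
// everything else is decoded according to its transfer encoding.
function attachmentData(string $rawBody, string $mimeType, int $encoding): string
{
    if (stripos($mimeType, 'rfc822') !== false) {
        return $rawBody;
    }

    return decodeBody($rawBody, $encoding);
}
```

Returning the raw body for an attached email keeps the nested message's own headers and encoded parts intact, so the saved .eml can still be opened by a mail client.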
218 changes: 167 additions & 51 deletions src/Fetch/Message.php
@@ -20,6 +20,19 @@
*/
class Message
{
/**
* Primary Body Types
* According to http://www.php.net/manual/en/function.imap-fetchstructure.php
*/
const TYPE_TEXT = 0;
const TYPE_MULTIPART = 1;
const TYPE_MESSAGE = 2;
const TYPE_APPLICATION = 3;
const TYPE_AUDIO = 4;
const TYPE_IMAGE = 5;
const TYPE_VIDEO = 6;
const TYPE_OTHER = 7;

/**
* This is the connection/mailbox class that the email came from.
*
@@ -167,25 +180,7 @@ class Message
*
* @var string
*/
public static $charset = 'UTF-8';

/**
* This value defines the flag set for encoding if the mb_convert_encoding
* function can't be found, and in this case iconv encoding will be used.
*
* @var string
*/
public static $charsetFlag = '//TRANSLIT';

/**
* These constants can be used to easily access available flags
*/
const FLAG_RECENT = 'recent';
const FLAG_FLAGGED = 'flagged';
const FLAG_ANSWERED = 'answered';
const FLAG_DELETED = 'deleted';
const FLAG_SEEN = 'seen';
const FLAG_DRAFT = 'draft';
public static $charset = 'UTF-8//TRANSLIT';

/**
* This constructor takes in the uid for the message and the Imap class representing the mailbox the
@@ -212,7 +207,6 @@ public function __construct($messageUniqueId, Server $connection)
*/
protected function loadMessage()
{

/* First load the message overview information */

if(!is_object($messageOverview = $this->getOverview()))
@@ -251,8 +245,24 @@ protected function loadMessage()
$this->processStructure($structure);
} else {
// multipart
foreach ($structure->parts as $id => $part)
foreach ($structure->parts as $id => $part) {
if (!empty($part->description)) {
$cleanFilename = $this->makeFilenameSafe($part->description);
$part->description = $cleanFilename;
foreach ($part->parameters as $key => $parameter) {
if ($parameter->attribute === "name") {
$part->parameters[$key]->value = $cleanFilename;
}
}
foreach ($part->dparameters as $key => $dparameter) {
if ($dparameter->attribute === "filename") {
$part->dparameters[$key]->value = $cleanFilename;
}
}
}

$this->processStructure($part, $id + 1);
}
}

return true;
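
The sanitizing loop added to loadMessage() can be sketched as a standalone helper. The names `sanitizeName()` and `sanitizePartNames()` are hypothetical; the character class is copied from makeFilenameSafe() in this PR. Unlike the diff, the sketch checks each list with isset() before iterating, on the assumption that a part may arrive without `dparameters`.

```php
<?php
// Same character class as makeFilenameSafe() in this PR.
function sanitizeName(string $name): string
{
    return preg_replace('/[<>"{}|\\\^\[\]`;\/\?:@&=$,]/', '_', $name);
}

// Rewrites the description plus any name/filename parameter in place.
function sanitizePartNames(stdClass $part): stdClass
{
    if (empty($part->description)) {
        return $part;
    }

    $clean = sanitizeName($part->description);
    $part->description = $clean;

    // Assumption of this sketch: either list may be missing, so guard first.
    $targets = array('parameters' => 'name', 'dparameters' => 'filename');
    foreach ($targets as $list => $attribute) {
        if (!isset($part->$list) || !is_array($part->$list)) {
            continue;
        }
        foreach ($part->$list as $parameter) {
            if (strtolower($parameter->attribute) === $attribute) {
                // Objects are handles, so this updates the original entry.
                $parameter->value = $clean;
            }
        }
    }

    return $part;
}
```

Guarding both lists this way would also avoid a warning from iterating a missing property, which may be worth considering for the loop in the diff itself.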
@@ -434,35 +444,138 @@ public function getImapBox()
}

/**
* This function takes in a structure and identifier and processes that part of the message. If that portion of the
* message has its own subparts, those are recursively processed using this function.
* Adds an attachment
*
* @param \stdClass $structure
* @param string $partIdentifier
* If a filename is not provided and the attachment is a message/rfc822
* email, parse the Subject line and use it as the filename. If the Subject
* line is blank or illegible, use a default filename (like Gmail and some
* desktop clients do)
*
* @param array $parameters
* @param \stdClass $structure
* @param string $partIdentifier
* @return boolean Successful attachment of file
*/
protected function processStructure($structure, $partIdentifier = null)
protected function addAttachment($parameters, $structure, $partIdentifier)
{
$parameters = self::getParametersFromStructure($structure);
if (!(isset($parameters["name"]) || isset($parameters["filename"])) && $structure->type == self::TYPE_MESSAGE) {
$body = isset($partIdentifier) ?
imap_fetchbody($this->imapStream, $this->uid, $partIdentifier, FT_UID)
: imap_body($this->imapStream, $this->uid, FT_UID);

$headers = iconv_mime_decode_headers($body, 0, self::$charset);
$filename = !empty($headers["Subject"]) ? $this->makeFilenameSafe($headers["Subject"]) : "email";

if (isset($parameters['name']) || isset($parameters['filename'])) {
$dpar = new \stdClass();
$dpar->attribute = "filename";
$dpar->value = str_replace(array("\r", "\n"), '', $filename) . ".eml";
$structure->dparameters[] = $dpar;
}

try {
$attachment = new Attachment($this, $structure, $partIdentifier);
$this->attachments[] = $attachment;
} elseif ($structure->type == 0 || $structure->type == 1) {
$messageBody = isset($partIdentifier) ?

return true;
} catch (\Exception $e) {
return false;
}
}

/**
* This function extracts the body of an email part, strips the
* Outlook-specific "Thread-Index:" line from it, decodes it, converts it
* to the charset in Message::$charset when a different charset is
* detected, and returns the result.
*
* @param \stdClass $structure
* @param string $partIdentifier
* @return string
*/
protected function processBody($structure, $partIdentifier)
{
$rawBody = isset($partIdentifier) ?
imap_fetchbody($this->imapStream, $this->uid, $partIdentifier, FT_UID)
: imap_body($this->imapStream, $this->uid, FT_UID);

$messageBody = self::decode($messageBody, $structure->encoding);
$bodyNoOutlook = $this->stripOutlookSpecificStrings($rawBody);

$decodedBody = self::decode($bodyNoOutlook, $structure->encoding);

$inCharset = mb_detect_encoding($decodedBody, array(
"US-ASCII",
"ISO-8859-1",
"UTF-8",
"UTF-7",
"ASCII",
"EUC-JP",
"SJIS",
"eucJP-win",
"SJIS-win",
"JIS",
"ISO-2022-JP",
"UTF-16",
"UTF-32",
"UCS2",
"UCS4")
);

if ($inCharset && $inCharset !== self::$charset) {
$decodedBody = iconv($inCharset, self::$charset, $decodedBody);
}

return $decodedBody;
}

if (!empty($parameters['charset']) && $parameters['charset'] !== self::$charset) {
if (function_exists('mb_convert_encoding')) {
$messageBody = mb_convert_encoding($messageBody, self::$charset, $parameters['charset']);
} else {
$messageBody = iconv($parameters['charset'], self::$charset . self::$charsetFlag, $messageBody);
}
}
/**
* Removes "Thread-Index:" line from the message body which is placed there
* by Outlook and messes up the other processing steps.
*
* @param string $bodyBefore
* @return string
*/
protected function stripOutlookSpecificStrings($bodyBefore)
{
$bodyAfter = preg_replace('/Thread-Index:.*$/m', "", $bodyBefore);

if (strtolower($structure->subtype) === 'plain' || ($structure->type == 1 && strtolower($structure->subtype) !== 'alternative')) {
return $bodyAfter;
}

/**
* This function takes in a string to be used as a filename and replaces
* any dangerous characters with underscores to ensure compatibility with
* various file systems
*
* @param string $oldName
* @return string
*/
protected function makeFilenameSafe($oldName)
{
return preg_replace('/[<>"{}|\\\^\[\]`;\/\?:@&=$,]/',"_", $oldName);
}

/**
* This function takes in a structure and identifier and processes that part of the message. If that portion of the
* message has its own subparts, those are recursively processed using this function.
*
* @param \stdClass $structure
* @param string $partIdentifier
*/
protected function processStructure($structure, $partIdentifier = null)
{
$attached = false;

// TODO: Get HTML attachments working, too!
if (isset($structure->disposition) && $structure->disposition == "attachment") {
$parameters = self::getParametersFromStructure($structure);
$attached = $this->addAttachment($parameters, $structure, $partIdentifier);
}

if (!$attached && ($structure->type == self::TYPE_TEXT || $structure->type == self::TYPE_MULTIPART)) {
$messageBody = $this->processBody($structure, $partIdentifier);

if (strtolower($structure->subtype) === 'plain' || ($structure->type == self::TYPE_MULTIPART && strtolower($structure->subtype) !== 'alternative')) {
if (isset($this->plaintextMessage)) {
$this->plaintextMessage .= PHP_EOL . PHP_EOL;
} else {
@@ -479,17 +592,16 @@ protected function processStructure($structure, $partIdentifier = null)

$this->htmlMessage .= $messageBody;
}
}

if (isset($structure->parts)) { // multipart: iterate through each part
if (isset($structure->parts)) { // multipart: iterate through each part
foreach ($structure->parts as $partIndex => $part) {
$partId = $partIndex + 1;

foreach ($structure->parts as $partIndex => $part) {
$partId = $partIndex + 1;
if (isset($partIdentifier))
$partId = $partIdentifier . '.' . $partId;

if (isset($partIdentifier))
$partId = $partIdentifier . '.' . $partId;

$this->processStructure($part, $partId);
$this->processStructure($part, $partId);
}
}
}
}
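
The Subject-to-filename fallback described in the addAttachment() docblock can be sketched on its own. `emlFilenameFromRaw()` is a hypothetical name, and splitting off the header block before calling `iconv_mime_decode_headers()` is this sketch's own choice rather than something the diff does. The character class is the one from makeFilenameSafe(), and the iconv extension is assumed to be available.

```php
<?php
// Builds a ".eml" filename from the Subject header of a raw RFC 822 message.
// $default is used when no usable Subject is found.
function emlFilenameFromRaw(string $rawMessage, string $default = 'email'): string
{
    // Only parse the header block (everything before the first blank line),
    // so header-like text in the body cannot be mistaken for a Subject.
    $parts = preg_split('/\r?\n\r?\n/', $rawMessage, 2);
    $headerBlock = $parts[0] . "\r\n";

    $headers = iconv_mime_decode_headers($headerBlock, 0, 'UTF-8');
    $headers = is_array($headers) ? array_change_key_case($headers, CASE_LOWER) : array();
    $subject = isset($headers['subject']) ? $headers['subject'] : '';

    // A repeated Subject header comes back as an array; keep the first one.
    if (is_array($subject)) {
        $subject = reset($subject);
    }

    $subject = trim(str_replace(array("\r", "\n"), '', (string) $subject));
    if ($subject === '') {
        return $default . '.eml';
    }

    // Same character class as makeFilenameSafe() in this PR.
    return preg_replace('/[<>"{}|\\\^\[\]`;\/\?:@&=$,]/', '_', $subject) . '.eml';
}
```

Falling back to a fixed default name when the Subject is empty follows the behaviour the docblock describes for nameless attached emails.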
@@ -566,13 +678,17 @@ public static function typeIdToString($id)
public static function getParametersFromStructure($structure)
{
$parameters = array();
if (isset($structure->parameters))
foreach ($structure->parameters as $parameter)
if (isset($structure->parameters)) {
foreach ($structure->parameters as $parameter) {
$parameters[strtolower($parameter->attribute)] = $parameter->value;
}
}

if (isset($structure->dparameters))
foreach ($structure->dparameters as $parameter)
if (isset($structure->dparameters)) {
foreach ($structure->dparameters as $parameter) {
$parameters[strtolower($parameter->attribute)] = $parameter->value;
}
}

return $parameters;
}
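
For reference, the flattening that getParametersFromStructure() performs, and that the `name`/`filename` checks rely on, can be sketched as below. `flattenParameters()` is a hypothetical name, and any structure passed to it is assumed to be shaped like a part returned by `imap_fetchstructure`.

```php
<?php
// Flattens parameters and dparameters into one array keyed by the
// lowercased attribute name; dparameters win when both define a key.
function flattenParameters(stdClass $structure): array
{
    $flat = array();

    foreach (array('parameters', 'dparameters') as $list) {
        if (!isset($structure->$list) || !is_array($structure->$list)) {
            continue;
        }
        foreach ($structure->$list as $parameter) {
            $flat[strtolower($parameter->attribute)] = $parameter->value;
        }
    }

    return $flat;
}
```

Because the keys are lowercased, a part that sends `NAME` or `FILENAME` in upper case is still found by `isset($parameters['name'])` and `isset($parameters['filename'])`.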
2 changes: 0 additions & 2 deletions src/Fetch/Server.php
@@ -159,8 +159,6 @@ public function setMailBox($mailbox = '')
return false;
}



$this->mailbox = $mailbox;
if (isset($this->imapStream)) {
$this->setImapStream();