Regression in multipart parsing #1354

@tlghacc

After upgrading some of our company servers from Debian Stretch to Buster (i.e. from 2.6.4 to 2.7.1 of the mail gem), we noticed that one of our applications corrupts the encoding of multipart messages. A git bisect pointed to commit 21222e1 (Eliminate attachment corruption caused by line ending conversions) as the culprit.

Consider this example message:

To: me@email
Subject: email
From: someone@email
Content-Type: multipart/alternative;
  boundary="b1_7d9a9542dac6553ec140c72503fdec38"
Content-Transfer-Encoding: 8bit

--b1_7d9a9542dac6553ec140c72503fdec38
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

UTF-8 text ě

--b1_7d9a9542dac6553ec140c72503fdec38
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 8bit

<body>UTF-8 html ě</body>

--b1_7d9a9542dac6553ec140c72503fdec38--

Example script:

require 'mail'

# Read the message as raw bytes, the same way it arrives on a binary stdin.
raw = File.binread('mail-multipart.txt')
mail = Mail.read_from_string(raw)
puts mail.parts.length # Output is "1", although the message has two parts

Our application serves as a bridge between PHP apps (which think they are talking to sendmail) and the actual sendmail. It parses each e-mail with the mail gem and performs some checks before forwarding it. The message is therefore received on standard input, which has to be set to binary encoding: the PHP apps may pass data in whichever encoding their authors choose, so anything other than binary encoding on the Ruby side is potentially invalid.
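For illustration, reading a message as raw bytes can be sketched as below. The helper name read_raw_message is hypothetical, and a StringIO stands in for standard input so the snippet runs on its own; this is not our application's actual code.

```ruby
require 'stringio'

# Hypothetical helper: read a whole message from an IO as raw bytes,
# without assuming anything about the sender's character encoding.
def read_raw_message(io)
  io.binmode                                  # no newline or encoding conversion
  io.read.force_encoding(Encoding::BINARY)    # tag the data as plain bytes
end

# StringIO stands in for $stdin here (assumption for the sake of a runnable example).
raw = read_raw_message(StringIO.new("Subject: email\n\nUTF-8 text \u011b\n"))
puts raw.encoding
puts raw.bytesize
```

In the real bridge the argument would be $stdin instead of a StringIO.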

The mail gem defaults to quoted-printable and converts the message body accordingly. However, since both parts are parsed as one (i.e. the second part is treated as the body of the first part), the second part's headers are converted as well, which results in this:

Content-Type: text/html; charset=3Dutf-8 (note the "=" character converted into "=3D")
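The "=3D" is simply what quoted-printable encoding does to a literal "=". A minimal sketch using Ruby's built-in Array#pack with the "M" directive (this shows the encoding itself, and is not necessarily the exact code path the gem takes):

```ruby
# A part header that was mistaken for body text.
header_line = "Content-Type: text/html; charset=utf-8\n"

# The 'M' directive applies quoted-printable encoding, which escapes "=" as "=3D".
encoded = [header_line].pack('M')
puts encoded
# prints: Content-Type: text/html; charset=3Dutf-8
```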

Unsurprisingly, the HTML version of the message is not displayed correctly in mail clients after that.

Since this regression appeared once the LF-to-CRLF conversion was removed, my quick fix was to alter the regular expression "parts_regex" in the extract_parts method in mail/body.rb, changing "\r\n" to "\r?\n". To my knowledge this should not break anything, so I am posting this change for consideration.
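To show why the optional "\r" matters, here is a standalone sketch. The pattern built by delimiter_regex is a simplified stand-in that I wrote for this example, not the gem's actual parts_regex, and the helper names are hypothetical. It counts how many boundary delimiters are found in an LF-only body versus a CRLF body.

```ruby
BOUNDARY = 'b1_7d9a9542dac6553ec140c72503fdec38'

# Hypothetical, simplified boundary pattern: a delimiter must sit at the very
# start of the body or directly after the given line break.
def delimiter_regex(boundary, line_break)
  /(?:\A|#{line_break})--#{Regexp.escape(boundary)}(?:--)?(?=\s*$)/
end

# Number of boundary delimiters the pattern finds in the body.
def count_delimiters(body, regex)
  body.scan(regex).length
end

LF_BODY = [
  "--#{BOUNDARY}",
  'Content-Type: text/plain; charset=utf-8',
  '',
  'plain text',
  '',
  "--#{BOUNDARY}",
  'Content-Type: text/html; charset=utf-8',
  '',
  '<body>html</body>',
  '',
  "--#{BOUNDARY}--",
  ''
].join("\n")
CRLF_BODY = LF_BODY.gsub("\n", "\r\n")

strict  = delimiter_regex(BOUNDARY, '\r\n')   # requires CRLF before a delimiter
lenient = delimiter_regex(BOUNDARY, '\r?\n')  # accepts LF or CRLF

puts count_delimiters(LF_BODY, strict)    # prints 1 (only the delimiter at the start)
puts count_delimiters(LF_BODY, lenient)   # prints 3
puts count_delimiters(CRLF_BODY, strict)  # prints 3
puts count_delimiters(CRLF_BODY, lenient) # prints 3
```

With the strict pattern and LF-only input, only the first delimiter is recognised, so everything after it ends up in a single part. That matches the parts.length of 1 reported above.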

(Sorry for absence of formatting, the editor decided to not cooperate for some reason.)
