
Commit 9e0b30d

tiran and gpshead authored and committed
[3.11] pythongh-61460: Stronger HMAC in multiprocessing (pythonGH-20380)
bpo-17258: `multiprocessing` now supports stronger HMAC algorithms for inter-process connection authentication rather than only HMAC-MD5.

Signed-off-by: Christian Heimes <[email protected]>

gpshead: I reworked it to be more robust while keeping the idea. The protocol modification idea remains, but we now take advantage of the message length as an indicator of legacy vs modern protocol version. No more regular expression usage. We now default to HMAC-SHA256, but do so in a way that will be compatible when communicating with older clients or older servers. No protocol transition period is needed.

More integration tests to verify these claims remain true are required. I'm unaware of anyone depending on multiprocessing connections between different Python versions.

---------

(cherry picked from commit 3ed57e4)

Co-authored-by: Christian Heimes <[email protected]>
Signed-off-by: Christian Heimes <[email protected]>
Co-authored-by: Gregory P. Smith [Google] <[email protected]>
1 parent f0895aa commit 9e0b30d
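
The length-based version signal described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: `looks_legacy` and the two constants are hypothetical names chosen for illustration. The values 20 and 16 are the legacy challenge length and the HMAC-MD5 digest length that appear in the diff.

```python
import os

# Assumed names for illustration only; the values come from the diff:
# a legacy server sends a 20-byte challenge, and a legacy client replies
# with a raw 16-byte HMAC-MD5 digest.
LEGACY_CHALLENGE_LEN = 20
MD5_DIGEST_LEN = 16


def looks_legacy(message: bytes) -> bool:
    """Return True if the message length matches a legacy-protocol message."""
    return len(message) in (LEGACY_CHALLENGE_LEN, MD5_DIGEST_LEN)


legacy_challenge = os.urandom(20)                # what an old server sends
modern_challenge = b"{sha256}" + os.urandom(40)  # digest prefix + 40 random bytes

print(looks_legacy(legacy_challenge))  # True
print(looks_legacy(modern_challenge))  # False
```

Because every modern message is longer than both legacy lengths, neither side needs a version field or a transition period to tell the two protocols apart.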

3 files changed: +254 -27 lines changed

Lib/multiprocessing/connection.py

Lines changed: 205 additions & 17 deletions
@@ -740,39 +740,227 @@ def PipeClient(address):
 # Authentication stuff
 #
 
-MESSAGE_LENGTH = 20
+MESSAGE_LENGTH = 40  # MUST be > 20
 
-CHALLENGE = b'#CHALLENGE#'
-WELCOME = b'#WELCOME#'
-FAILURE = b'#FAILURE#'
+_CHALLENGE = b'#CHALLENGE#'
+_WELCOME = b'#WELCOME#'
+_FAILURE = b'#FAILURE#'
 
-def deliver_challenge(connection, authkey):
+# multiprocessing.connection Authentication Handshake Protocol Description
+# (as documented for reference after reading the existing code)
+# =============================================================================
+#
+# On Windows: native pipes with "overlapped IO" are used to send the bytes,
+# instead of the length prefix SIZE scheme described below. (ie: the OS deals
+# with message sizes for us)
+#
+# Protocol error behaviors:
+#
+# On POSIX, any failure to receive the length prefix into SIZE, for SIZE greater
+# than the requested maxsize to receive, or receiving fewer than SIZE bytes
+# results in the connection being closed and auth to fail.
+#
+# On Windows, receiving too few bytes is never a low level _recv_bytes read
+# error, receiving too many will trigger an error only if receive maxsize
+# value was larger than 128 OR the if the data arrived in smaller pieces.
+#
+#      Serving side                           Client side
+#  ------------------------------  ---------------------------------------
+# 0.                                     Open a connection on the pipe.
+# 1.  Accept connection.
+# 2.  Random 20+ bytes -> MESSAGE
+#     Modern servers always send
+#     more than 20 bytes and include
+#     a {digest} prefix on it with
+#     their preferred HMAC digest.
+#     Legacy ones send ==20 bytes.
+# 3.  send 4 byte length (net order)
+#     prefix followed by:
+#       b'#CHALLENGE#' + MESSAGE
+# 4.                                     Receive 4 bytes, parse as network byte
+#                                        order integer. If it is -1, receive an
+#                                        additional 8 bytes, parse that as network
+#                                        byte order. The result is the length of
+#                                        the data that follows -> SIZE.
+# 5.                                     Receive min(SIZE, 256) bytes -> M1
+# 6.                                     Assert that M1 starts with:
+#                                          b'#CHALLENGE#'
+# 7.                                     Strip that prefix from M1 into -> M2
+# 7.1.                                   Parse M2: if it is exactly 20 bytes in
+#                                        length this indicates a legacy server
+#                                        supporting only HMAC-MD5. Otherwise the
+# 7.2.                                   preferred digest is looked up from an
+#                                        expected "{digest}" prefix on M2. No prefix
+#                                        or unsupported digest? <- AuthenticationError
+# 7.3.                                   Put divined algorithm name in -> D_NAME
+# 8.                                     Compute HMAC-D_NAME of AUTHKEY, M2 -> C_DIGEST
+# 9.                                     Send 4 byte length prefix (net order)
+#                                        followed by C_DIGEST bytes.
+# 10. Receive 4 or 4+8 byte length
+#     prefix (#4 dance) -> SIZE.
+# 11. Receive min(SIZE, 256) -> C_D.
+# 11.1. Parse C_D: legacy servers
+#       accept it as is, "md5" -> D_NAME
+# 11.2. modern servers check the length
+#       of C_D, IF it is 16 bytes?
+# 11.2.1. "md5" -> D_NAME
+#         and skip to step 12.
+# 11.3. longer? expect and parse a "{digest}"
+#       prefix into -> D_NAME.
+#       Strip the prefix and store remaining
+#       bytes in -> C_D.
+# 11.4. Don't like D_NAME? <- AuthenticationError
+# 12. Compute HMAC-D_NAME of AUTHKEY,
+#     MESSAGE into -> M_DIGEST.
+# 13. Compare M_DIGEST == C_D:
+# 14a: Match? Send length prefix &
+#        b'#WELCOME#'
+#    <- RETURN
+# 14b: Mismatch? Send len prefix &
+#        b'#FAILURE#'
+#    <- CLOSE & AuthenticationError
+# 15.                                    Receive 4 or 4+8 byte length prefix (net
+#                                        order) again as in #4 into -> SIZE.
+# 16.                                    Receive min(SIZE, 256) bytes -> M3.
+# 17.                                    Compare M3 == b'#WELCOME#':
+# 17a.                                   Match? <- RETURN
+# 17b.                                   Mismatch? <- CLOSE & AuthenticationError
+#
+# If this RETURNed, the connection remains open: it has been authenticated.
+#
+# Length prefixes are used consistently. Even on the legacy protocol, this
+# was good fortune and allowed us to evolve the protocol by using the length
+# of the opening challenge or length of the returned digest as a signal as
+# to which protocol the other end supports.
+
+_ALLOWED_DIGESTS = frozenset(
+        {b'md5', b'sha256', b'sha384', b'sha3_256', b'sha3_384'})
+_MAX_DIGEST_LEN = max(len(_) for _ in _ALLOWED_DIGESTS)
+
+# Old hmac-md5 only server versions from Python <=3.11 sent a message of this
+# length. It happens to not match the length of any supported digest so we can
+# use a message of this length to indicate that we should work in backwards
+# compatible md5-only mode without a {digest_name} prefix on our response.
+_MD5ONLY_MESSAGE_LENGTH = 20
+_MD5_DIGEST_LEN = 16
+_LEGACY_LENGTHS = (_MD5ONLY_MESSAGE_LENGTH, _MD5_DIGEST_LEN)
+
+
+def _get_digest_name_and_payload(message: bytes) -> (str, bytes):
+    """Returns a digest name and the payload for a response hash.
+
+    If a legacy protocol is detected based on the message length
+    or contents the digest name returned will be empty to indicate
+    legacy mode where MD5 and no digest prefix should be sent.
+    """
+    # modern message format: b"{digest}payload" longer than 20 bytes
+    # legacy message format: 16 or 20 byte b"payload"
+    if len(message) in _LEGACY_LENGTHS:
+        # Either this was a legacy server challenge, or we're processing
+        # a reply from a legacy client that sent an unprefixed 16-byte
+        # HMAC-MD5 response. All messages using the modern protocol will
+        # be longer than either of these lengths.
+        return '', message
+    if (message.startswith(b'{') and
+        (curly := message.find(b'}', 1, _MAX_DIGEST_LEN+2)) > 0):
+        digest = message[1:curly]
+        if digest in _ALLOWED_DIGESTS:
+            payload = message[curly+1:]
+            return digest.decode('ascii'), payload
+    raise AuthenticationError(
+            'unsupported message length, missing digest prefix, '
+            f'or unsupported digest: {message=}')
+
+
+def _create_response(authkey, message):
+    """Create a MAC based on authkey and message
+
+    The MAC algorithm defaults to HMAC-MD5, unless MD5 is not available or
+    the message has a '{digest_name}' prefix. For legacy HMAC-MD5, the response
+    is the raw MAC, otherwise the response is prefixed with '{digest_name}',
+    e.g. b'{sha256}abcdefg...'
+
+    Note: The MAC protects the entire message including the digest_name prefix.
+    """
     import hmac
+    digest_name = _get_digest_name_and_payload(message)[0]
+    # The MAC protects the entire message: digest header and payload.
+    if not digest_name:
+        # Legacy server without a {digest} prefix on message.
+        # Generate a legacy non-prefixed HMAC-MD5 reply.
+        try:
+            return hmac.new(authkey, message, 'md5').digest()
+        except ValueError:
+            # HMAC-MD5 is not available (FIPS mode?), fall back to
+            # HMAC-SHA2-256 modern protocol. The legacy server probably
+            # doesn't support it and will reject us anyways. :shrug:
+            digest_name = 'sha256'
+    # Modern protocol, indicate the digest used in the reply.
+    response = hmac.new(authkey, message, digest_name).digest()
+    return b'{%s}%s' % (digest_name.encode('ascii'), response)
+
+
+def _verify_challenge(authkey, message, response):
+    """Verify MAC challenge
+
+    If our message did not include a digest_name prefix, the client is allowed
+    to select a stronger digest_name from _ALLOWED_DIGESTS.
+
+    In case our message is prefixed, a client cannot downgrade to a weaker
+    algorithm, because the MAC is calculated over the entire message
+    including the '{digest_name}' prefix.
+    """
+    import hmac
+    response_digest, response_mac = _get_digest_name_and_payload(response)
+    response_digest = response_digest or 'md5'
+    try:
+        expected = hmac.new(authkey, message, response_digest).digest()
+    except ValueError:
+        raise AuthenticationError(f'{response_digest=} unsupported')
+    if len(expected) != len(response_mac):
+        raise AuthenticationError(
+                f'expected {response_digest!r} of length {len(expected)} '
+                f'got {len(response_mac)}')
+    if not hmac.compare_digest(expected, response_mac):
+        raise AuthenticationError('digest received was wrong')
+
+
+def deliver_challenge(connection, authkey: bytes, digest_name='sha256'):
     if not isinstance(authkey, bytes):
         raise ValueError(
             "Authkey must be bytes, not {0!s}".format(type(authkey)))
+    assert MESSAGE_LENGTH > _MD5ONLY_MESSAGE_LENGTH, "protocol constraint"
     message = os.urandom(MESSAGE_LENGTH)
-    connection.send_bytes(CHALLENGE + message)
-    digest = hmac.new(authkey, message, 'md5').digest()
+    message = b'{%s}%s' % (digest_name.encode('ascii'), message)
+    # Even when sending a challenge to a legacy client that does not support
+    # digest prefixes, they'll take the entire thing as a challenge and
+    # respond to it with a raw HMAC-MD5.
+    connection.send_bytes(_CHALLENGE + message)
     response = connection.recv_bytes(256)  # reject large message
-    if response == digest:
-        connection.send_bytes(WELCOME)
+    try:
+        _verify_challenge(authkey, message, response)
+    except AuthenticationError:
+        connection.send_bytes(_FAILURE)
+        raise
     else:
-        connection.send_bytes(FAILURE)
-        raise AuthenticationError('digest received was wrong')
+        connection.send_bytes(_WELCOME)
 
-def answer_challenge(connection, authkey):
-    import hmac
+
+def answer_challenge(connection, authkey: bytes):
     if not isinstance(authkey, bytes):
         raise ValueError(
             "Authkey must be bytes, not {0!s}".format(type(authkey)))
     message = connection.recv_bytes(256)  # reject large message
-    assert message[:len(CHALLENGE)] == CHALLENGE, 'message = %r' % message
-    message = message[len(CHALLENGE):]
-    digest = hmac.new(authkey, message, 'md5').digest()
+    if not message.startswith(_CHALLENGE):
+        raise AuthenticationError(
+                f'Protocol error, expected challenge: {message=}')
+    message = message[len(_CHALLENGE):]
+    if len(message) < _MD5ONLY_MESSAGE_LENGTH:
+        raise AuthenticationError('challenge too short: {len(message)} bytes')
+    digest = _create_response(authkey, message)
     connection.send_bytes(digest)
     response = connection.recv_bytes(256)  # reject large message
-    if response != _WELCOME:
+    if response != _WELCOME:
     raise AuthenticationError('digest sent was rejected')
 
 #
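
The prefixed challenge and response format in this file can be sketched in isolation. The block below is a simplified stand-in, not the module's implementation: `make_challenge`, `split_prefix`, `respond`, and `verify` are hypothetical names, legacy (unprefixed) messages are deliberately not handled, and the allowed digest set is copied from `_ALLOWED_DIGESTS` in the diff. It shows the one property the docstrings stress: the MAC covers the whole challenge, including the `{digest}` prefix.

```python
import hmac
import os

# Digest names copied from _ALLOWED_DIGESTS in the diff.
ALLOWED = frozenset({"md5", "sha256", "sha384", "sha3_256", "sha3_384"})


def make_challenge(digest_name: str = "sha256", nbytes: int = 40) -> bytes:
    """Build a modern-style challenge: b'{digest}' followed by random bytes."""
    return b"{%s}%s" % (digest_name.encode("ascii"), os.urandom(nbytes))


def split_prefix(message: bytes) -> tuple[str, bytes]:
    """Split b'{digest}payload' into (digest name, payload)."""
    end = message.find(b"}")
    if not message.startswith(b"{") or end < 0:
        raise ValueError("missing digest prefix")
    name = message[1:end].decode("ascii")
    if name not in ALLOWED:
        raise ValueError(f"unsupported digest: {name!r}")
    return name, message[end + 1:]


def respond(authkey: bytes, challenge: bytes) -> bytes:
    """Client side: MAC the entire challenge, prefix included."""
    name, _ = split_prefix(challenge)
    mac = hmac.new(authkey, challenge, name).digest()
    return b"{%s}%s" % (name.encode("ascii"), mac)


def verify(authkey: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the MAC and compare in constant time."""
    name, mac = split_prefix(response)
    expected = hmac.new(authkey, challenge, name).digest()
    return hmac.compare_digest(expected, mac)


challenge = make_challenge()
reply = respond(b"secret", challenge)
print(verify(b"secret", challenge, reply))  # True
print(verify(b"wrong", challenge, reply))   # False
```

Since the prefix is part of the MAC input, changing the digest name in the challenge changes the expected MAC, which is why the diff's docstring says a prefixed challenge cannot be downgraded.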

Lib/test/_test_multiprocessing.py

Lines changed: 47 additions & 10 deletions
@@ -50,6 +50,7 @@
 import multiprocessing.managers
 import multiprocessing.pool
 import multiprocessing.queues
+from multiprocessing.connection import wait, AuthenticationError
 
 from multiprocessing import util
 
@@ -136,8 +137,6 @@ def _resource_unlink(name, rtype):
 
 WIN32 = (sys.platform == "win32")
 
-from multiprocessing.connection import wait
-
 def wait_for_handle(handle, timeout):
     if timeout is not None and timeout < 0.0:
         timeout = None
@@ -3120,7 +3119,7 @@ def test_remote(self):
         del queue
 
 
-@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
 class _TestManagerRestart(BaseTestCase):
 
     @classmethod
@@ -3633,7 +3632,7 @@ def test_dont_merge(self):
 #
 
 @unittest.skipUnless(HAS_REDUCTION, "test needs multiprocessing.reduction")
-@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
 class _TestPicklingConnections(BaseTestCase):
 
     ALLOWED_TYPES = ('processes',)
@@ -3936,7 +3935,7 @@ def test_copy(self):
 
 
 @unittest.skipUnless(HAS_SHMEM, "requires multiprocessing.shared_memory")
-@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
 class _TestSharedMemory(BaseTestCase):
 
     ALLOWED_TYPES = ('processes',)
@@ -4777,7 +4776,7 @@ def test_invalid_handles(self):
 
 
 
-@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
 class OtherTest(unittest.TestCase):
     # TODO: add more tests for deliver/answer challenge.
     def test_deliver_challenge_auth_failure(self):
@@ -4797,7 +4796,7 @@ def __init__(self):
             def recv_bytes(self, size):
                 self.count += 1
                 if self.count == 1:
-                    return multiprocessing.connection.CHALLENGE
+                    return multiprocessing.connection._CHALLENGE
                 elif self.count == 2:
                     return b'something bogus'
                 return b''
@@ -4807,14 +4806,52 @@ def send_bytes(self, data):
                           multiprocessing.connection.answer_challenge,
                           _FakeConnection(), b'abc')
 
+
+@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
+class ChallengeResponseTest(unittest.TestCase):
+    authkey = b'supadupasecretkey'
+
+    def create_response(self, message):
+        return multiprocessing.connection._create_response(
+            self.authkey, message
+        )
+
+    def verify_challenge(self, message, response):
+        return multiprocessing.connection._verify_challenge(
+            self.authkey, message, response
+        )
+
+    def test_challengeresponse(self):
+        for algo in [None, "md5", "sha256"]:
+            with self.subTest(f"{algo=}"):
+                msg = b'is-twenty-bytes-long'  # The length of a legacy message.
+                if algo:
+                    prefix = b'{%s}' % algo.encode("ascii")
+                else:
+                    prefix = b''
+                msg = prefix + msg
+                response = self.create_response(msg)
+                if not response.startswith(prefix):
+                    self.fail(response)
+                self.verify_challenge(msg, response)
+
+    # TODO(gpshead): We need integration tests for handshakes between modern
+    # deliver_challenge() and verify_response() code and connections running a
+    # test-local copy of the legacy Python <=3.11 implementations.
+
+    # TODO(gpshead): properly annotate tests for requires_hashdigest rather than
+    # only running these on a platform supporting everything. otherwise logic
+    # issues preventing it from working on FIPS mode setups will be hidden.
+
 #
 # Test Manager.start()/Pool.__init__() initializer feature - see issue 5585
 #
 
 def initializer(ns):
     ns.test += 1
 
-@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
 class TestInitializers(unittest.TestCase):
     def setUp(self):
         self.mgr = multiprocessing.Manager()
@@ -5729,7 +5766,7 @@ def is_alive(self):
             any(process.is_alive() for process in forked_processes))
 
 
-@hashlib_helper.requires_hashdigest('md5')
+@hashlib_helper.requires_hashdigest('sha256')
 class TestSyncManagerTypes(unittest.TestCase):
     """Test all the types which can be shared between a parent and a
     child process by using a manager which acts as an intermediary
@@ -6174,7 +6211,7 @@ def install_tests_in_module_dict(remote_globs, start_method,
                 class Temp(base, Mixin, unittest.TestCase):
                     pass
                 if type_ == 'manager':
-                    Temp = hashlib_helper.requires_hashdigest('md5')(Temp)
+                    Temp = hashlib_helper.requires_hashdigest('sha256')(Temp)
                 Temp.__name__ = Temp.__qualname__ = newname
                 Temp.__module__ = __module__
                 remote_globs[newname] = Temp
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+:mod:`multiprocessing` now supports stronger HMAC algorithms for inter-process
+connection authentication rather than only HMAC-MD5.
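
From the caller's side, `deliver_challenge` and `answer_challenge` keep the same roles as before this change. The sketch below exercises them end to end inside one process. It is an illustration under stated assumptions, not part of the commit: `handshake` is a hypothetical helper, and running the serving side in a thread over `multiprocessing.Pipe()` is a choice made here only to keep the example self-contained.

```python
import threading
from multiprocessing import Pipe
from multiprocessing.connection import (
    AuthenticationError, answer_challenge, deliver_challenge)


def handshake(server_key: bytes, client_key: bytes) -> bool:
    """Run one handshake over an in-process pipe; True if both sides accept.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    server_end, client_end = Pipe()
    outcome = {}

    def serve():
        # Serving side: send the challenge and check the client's reply.
        try:
            deliver_challenge(server_end, server_key)
            outcome["server"] = True
        except AuthenticationError:
            outcome["server"] = False

    thread = threading.Thread(target=serve)
    thread.start()
    try:
        # Client side: answer the challenge, then wait for the verdict.
        answer_challenge(client_end, client_key)
        outcome["client"] = True
    except AuthenticationError:
        outcome["client"] = False
    thread.join()
    server_end.close()
    client_end.close()
    return outcome["server"] and outcome["client"]


print(handshake(b"shared key", b"shared key"))  # True
print(handshake(b"shared key", b"other key"))   # False
```

With mismatched keys the serving side sends the failure message and raises, and the client raises on receiving it, so both ends report the rejection.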
