Newline embedded in email RFC2047 encoding raises exception when parsed #114906

@fsc-eriker

Bug report

Bug description:

I came across some messages where the sender had embedded a newline in the From: header's display string. This was (ostensibly) a legitimate sender, possibly hoping to get more real estate for their message in the recipient's inbox listing or something, or just making a configuration mistake.

Unfortunately, this makes Python's email package raise an exception as soon as the headers of a message parsed with policy=default are accessed. (The legacy parser works fine, simply because it doesn't attempt to decode RFC 2047 encoded words by default.)

>>> from email import message_from_bytes
>>> from email.policy import default
>>> message = message_from_bytes(b'From: =?UTF-8?Q?=0AMy_self?= <me@example.org>\r\n\r\nHello\r\n', policy=default)
>>> message.items()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/fsc-eriker/git/cpython/email/message.py", line 491, in items
    return [(k, self.policy.header_fetch_parse(k, v))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/fsc-eriker/git/cpython/email/message.py", line 491, in <listcomp>
    return [(k, self.policy.header_fetch_parse(k, v))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/fsc-eriker/git/cpython/email/policy.py", line 163, in header_fetch_parse
    return self.header_factory(name, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/fsc-eriker/git/cpython/email/headerregistry.py", line 604, in __call__
    return self[name](name, value)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/fsc-eriker/git/cpython/email/headerregistry.py", line 192, in __new__
    cls.parse(value, kwds)
  File "/Users/fsc-eriker/git/cpython/email/headerregistry.py", line 346, in parse
    [Address(mb.display_name or '',
  File "/Users/fsc-eriker/git/cpython/email/headerregistry.py", line 346, in <listcomp>
    [Address(mb.display_name or '',
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/fsc-eriker/git/cpython/email/headerregistry.py", line 33, in __init__
    raise ValueError("invalid arguments; address parts cannot contain CR or LF")
ValueError: invalid arguments; address parts cannot contain CR or LF
>>> 
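
For comparison, here is a small sketch of the legacy behaviour mentioned above. It assumes the same test message and relies on `message_from_bytes` using the `compat32` policy when no policy is given; the variable names are only illustrative.

```python
from email import message_from_bytes

RAW = b'From: =?UTF-8?Q?=0AMy_self?= <me@example.org>\r\n\r\nHello\r\n'

# No policy argument: the legacy compat32 policy is used.
legacy_message = message_from_bytes(RAW)

# The header comes back as the raw string; the RFC 2047 encoded word
# is not decoded, so the embedded newline never surfaces.
raw_from = legacy_message['From']
print(repr(raw_from))  # '=?UTF-8?Q?=0AMy_self?= <me@example.org>'
```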

While the error message is correct, it should probably not raise an exception; perhaps instead register a defect?
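
Until something like that exists, a caller-side workaround might look like the sketch below. This is not a proposed fix for the library: `safe_items` is a hypothetical helper, and it leans on `Message.raw_items()`, which the source describes as an internal API, to get at the unparsed header text.

```python
from email import message_from_bytes
from email.policy import default

RAW = b'From: =?UTF-8?Q?=0AMy_self?= <me@example.org>\r\n\r\nHello\r\n'


def safe_items(message):
    """Like message.items(), but fall back to the raw header text on ValueError."""
    items = []
    # raw_items() yields the stored (name, value) pairs without parsing them.
    for name, raw_value in message.raw_items():
        try:
            value = message.policy.header_fetch_parse(name, raw_value)
        except ValueError:
            # The value could not be turned into a structured header object,
            # so keep the unparsed text instead of failing the whole listing.
            value = raw_value
        items.append((name, value))
    return items


message = message_from_bytes(RAW, policy=default)
headers = safe_items(message)
print(headers)
```

On an interpreter where the header parses without raising, the helper simply returns the parsed header objects, so it should behave the same as `message.items()` there.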

I have tested this on 3.11 out of the box and with the current sources from the cpython GitHub repo, but I would expect it to manifest on all versions of the modern email module and on all platforms.

CPython versions tested on:

3.11

Operating systems tested on:

macOS
