annotate roundup/anypy/email_.py @ 5548:fea11d05110e
Avoid errors from selecting "no selection" on multilink (issue2550722).

As discussed in issue 2550722 there are various cases where selecting
"no selection" on a multilink can result in inappropriate errors from
Roundup:

* If selecting "no selection" produces a null edit (a value was set in
  the multilink in an edit with an error, then removed again, along
  with all other changes, in the next form submission), so the page is
  rendered from the form contents including the "-<id>" value for "no
  selection" for the multilink.
* If creating an item with a nonempty value for a multilink has an
  error, and the resubmission changes that multilink to "no selection"
  (and this in turn has subcases, according to whether the creation
  then succeeds or fails on the resubmission, which need fixes in
  different places in the Roundup code).

All of these cases have in common that it is expected and OK to have a
"-<id>" value for a submission for a multilink when <id> is not set in
that multilink in the database (because the original attempt to set
<id> in that multilink had an error), so the hyperdb.py logic to give
an error in that case is thus removed. In the subcase of the second
case where the resubmission with "no selection" has an error, the
templating code tries to produce a menu entry for the "-<id>"
multilink value, which also results in an error, hence the
templating.py change to ignore such values in the list for a
multilink.
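
The behaviour described in this changeset message can be illustrated with a small standalone sketch. This is not Roundup's code: `remove_ids` and `menu_values` are hypothetical helpers, and the sketch assumes that submitted multilink values are plain strings in which a leading "-" requests removal of that id.

```python
def remove_ids(current_ids, submitted):
    """Apply the "-<id>" entries of a submission to the stored ids.

    Hypothetical helper: removing an id that is not currently stored is
    accepted as a no-op instead of being reported as an error.
    """
    result = list(current_ids)
    for value in submitted:
        if value.startswith('-'):
            target = value[1:]
            if target in result:
                result.remove(target)
            # Otherwise there is nothing to remove; no error is raised.
    return result


def menu_values(submitted):
    """Hypothetical helper: values to render as menu entries.

    "-<id>" removal requests are skipped, since they do not name an
    item that could be shown in the menu.
    """
    return [value for value in submitted if not value.startswith('-')]


print(remove_ids(['1', '2'], ['-2']))   # ['1']
print(remove_ids(['1'], ['-5']))        # ['1']
print(menu_values(['3', '-1']))         # ['3']
```

The second call shows the case the message is about: a removal request for an id that was never stored leaves the stored list unchanged.
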
| author | Joseph Myers <jsm@polyomino.org.uk> |
|---|---|
| date | Thu, 27 Sep 2018 11:33:01 +0000 |
| parents | 29346d92d80c |
| children | cacef71b3a54 |
| rev | line source |
|---|---|
| 4575 c426cb251bc7 (Ralf Schlatterbeck <rsc@runtux.com>, parents: 4447): Be more tolerant when parsing RFC2047 encoded mail headers. | 1 import re |
| 4575 | 2 import binascii |
| 4979 f1a2bd1dea77 (Bernhard Reiter <bernhard@intevation.de>, parents: 4575): issue2550877: Writing headers with the email module will use continuation_ws = ' ' now for python 2.5 and 2.6 when importing anypy.email_. | 3 import email |
| 4575 | 4 from email import quoprimime, base64mime |
| 4575 | 5 |
| 5494 b7fa56ced601 (Christof Meerwald <cmeerw@cmeerw.org>, parents: 5421): use gpg module instead of pyme module for PGP encryption | 6 if str == bytes: |
| 5494 | 7     message_from_bytes = email.message_from_string |
| 5542 29346d92d80c (Joseph Myers <jsm@polyomino.org.uk>, parents: 5494): Fix email interfaces with Python 3 (issue 2550974, issue 2551000). | 8     message_from_binary_file = email.message_from_file |
| 5494 | 9 else: |
| 5494 | 10     message_from_bytes = email.message_from_bytes |
| 5542 | 11     message_from_binary_file = email.message_from_binary_file |
| 5494 | 12 |
| 4979 | 13 ## please import this file if you are using the email module |
| 4979 | 14 |
| 4575 | 15 # Match encoded-word strings in the form =?charset?q?Hello_World?= |
| 4575 | 16 ecre = re.compile(r''' |
| 4575 | 17   =\? # literal =? |
| 4575 | 18   (?P<charset>[^?]*?) # non-greedy up to the next ? is the charset |
| 4575 | 19   \? # literal ? |
| 4575 | 20   (?P<encoding>[qb]) # either a "q" or a "b", case insensitive |
| 4575 | 21   \? # literal ? |
| 4575 | 22   (?P<encoded>.*?) # non-greedy up to the next ?= is the encoded string |
| 4575 | 23   \?= # literal ?= |
| 4575 | 24   ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE) |
| 4575 | 25 |
| 4575 | 26 |
| 4575 | 27 # Fixed header parser, see my proposed patch and discussions: |
| 4575 | 28 # http://bugs.python.org/issue1079 "decode_header does not follow RFC 2047" |
| 4575 | 29 # http://bugs.python.org/issue1467619 "Header.decode_header eats up spaces" |
| 4575 | 30 # This implements the decode_header specific parts of my proposed patch |
| 4575 | 31 # backported to python2.X |
| 4575 | 32 def decode_header(header): |
| 4575 | 33     """Decode a message header value without converting charset. |
| 4575 | 34 |
| 4575 | 35     Returns a list of (string, charset) pairs containing each of the decoded |
| 4575 | 36     parts of the header. Charset is None for non-encoded parts of the header, |
| 4575 | 37     otherwise a lower-case string containing the name of the character set |
| 4575 | 38     specified in the encoded string. |
| 4575 | 39 |
| 4575 | 40     header may be a string that may or may not contain RFC2047 encoded words, |
| 4575 | 41     or it may be a Header object. |
| 4575 | 42 |
| 4575 | 43     An email.errors.HeaderParseError may be raised when certain decoding error |
| 4575 | 44     occurs (e.g. a base64 decoding exception). |
| 4575 | 45     """ |
| 4575 | 46     # If it is a Header object, we can just return the encoded chunks. |
| 4575 | 47     if hasattr(header, '_chunks'): |
| 4575 | 48         return [(_charset._encode(string, str(charset)), str(charset)) |
| 4575 | 49                 for string, charset in header._chunks] |
| 4575 | 50     # If no encoding, just return the header with no charset. |
| 4575 | 51     if not ecre.search(header): |
| 4575 | 52         return [(header, None)] |
| 4575 | 53     # First step is to parse all the encoded parts into triplets of the form |
| 4575 | 54     # (encoded_string, encoding, charset). For unencoded strings, the last |
| 4575 | 55     # two parts will be None. |
| 4575 | 56     words = [] |
| 4575 | 57     for line in header.splitlines(): |
| 4575 | 58         parts = ecre.split(line) |
| 4575 | 59         first = True |
| 4575 | 60         while parts: |
| 4575 | 61             unencoded = parts.pop(0) |
| 4575 | 62             if first: |
| 4575 | 63                 unencoded = unencoded.lstrip() |
| 4575 | 64                 first = False |
| 4575 | 65             if unencoded: |
| 4575 | 66                 words.append((unencoded, None, None)) |
| 4575 | 67             if parts: |
| 4575 | 68                 charset = parts.pop(0).lower() |
| 4575 | 69                 encoding = parts.pop(0).lower() |
| 4575 | 70                 encoded = parts.pop(0) |
| 4575 | 71                 words.append((encoded, encoding, charset)) |
| 4575 | 72     # Now loop over words and remove words that consist of whitespace |
| 4575 | 73     # between two encoded strings. |
| 4575 | 74     import sys |
| 4575 | 75     droplist = [] |
| 4575 | 76     for n, w in enumerate(words): |
| 4575 | 77         if n>1 and w[1] and words[n-2][1] and words[n-1][0].isspace(): |
| 4575 | 78             droplist.append(n-1) |
| 4575 | 79     for d in reversed(droplist): |
| 4575 | 80         del words[d] |
| 4575 | 81 |
| 4575 | 82     # The next step is to decode each encoded word by applying the reverse |
| 4575 | 83     # base64 or quopri transformation. decoded_words is now a list of the |
| 4575 | 84     # form (decoded_word, charset). |
| 4575 | 85     decoded_words = [] |
| 4575 | 86     for encoded_string, encoding, charset in words: |
| 4575 | 87         if encoding is None: |
| 4575 | 88             # This is an unencoded word. |
| 4575 | 89             decoded_words.append((encoded_string, charset)) |
| 4575 | 90         elif encoding == 'q': |
| 4575 | 91             word = quoprimime.header_decode(encoded_string) |
| 4575 | 92             decoded_words.append((word, charset)) |
| 4575 | 93         elif encoding == 'b': |
| 4575 | 94             paderr = len(encoded_string) % 4 # Postel's law: add missing padding |
| 4575 | 95             if paderr: |
| 4575 | 96                 encoded_string += '==='[:4 - paderr] |
| 4575 | 97             try: |
| 4575 | 98                 word = base64mime.decode(encoded_string) |
| 4575 | 99             except binascii.Error: |
| 5238 758edaa61ec0 (John Rouillard <rouilj@ieee.org>, parents: 5090): pylint flagged HeaderParseError as an Undefined variable. | 100                 raise email.errors.HeaderParseError('Base64 decoding error') |
| 4575 | 101             else: |
| 4575 | 102                 decoded_words.append((word, charset)) |
| 4575 | 103         else: |
| 4575 | 104             raise AssertionError('Unexpected encoding: ' + encoding) |
| 4575 | 105     # Now convert all words to bytes and collapse consecutive runs of |
| 4575 | 106     # similarly encoded words. |
| 4575 | 107     collapsed = [] |
| 4575 | 108     last_word = last_charset = None |
| 4575 | 109     for word, charset in decoded_words: |
| 4575 | 110         if isinstance(word, str): |
| 4575 | 111             pass |
| 4575 | 112         if last_word is None: |
| 4575 | 113             last_word = word |
| 4575 | 114             last_charset = charset |
| 4575 | 115         elif charset != last_charset: |
| 4575 | 116             collapsed.append((last_word, last_charset)) |
| 4575 | 117             last_word = word |
| 4575 | 118             last_charset = charset |
| 4575 | 119         elif last_charset is None: |
| 5090 89c2c1a88927 (John Rouillard <rouilj@ieee.org>, parents: 4983): issue2550850 anypy/email_.py uses BSPACE which is not defined in python 2.7 | 120             BSPACE = b' ' |
| 4575 | 121             last_word += BSPACE + word |
| 4575 | 122         else: |
| 4575 | 123             last_word += word |
| 4575 | 124     collapsed.append((last_word, last_charset)) |
| 4575 | 125     return collapsed |
| 4575 | 126 |
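
Two steps of `decode_header` are easier to follow in isolation: splitting a header line into (text, encoding, charset) triplets with the encoded-word pattern, and restoring base64 padding that a sender left off. The following is a minimal sketch assuming Python 3. `split_encoded_words` and `pad_base64` are hypothetical helper names that do not exist in Roundup, and the sample header is made up for illustration.

```python
import base64
import re

# Same pattern as ecre in the listing above.
ECRE = re.compile(r'''
  =\?                   # literal =?
  (?P<charset>[^?]*?)   # non-greedy up to the next ? is the charset
  \?                    # literal ?
  (?P<encoding>[qb])    # either a "q" or a "b", case insensitive
  \?                    # literal ?
  (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
  \?=                   # literal ?=
  ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)


def split_encoded_words(line):
    """Split one header line into (text, encoding, charset) triplets.

    Unencoded text gets None for encoding and charset.
    """
    # With three capture groups, re.split yields
    # [text, charset, encoding, encoded, text, ...].
    parts = ECRE.split(line)
    words = []
    first = True
    while parts:
        unencoded = parts.pop(0)
        if first:
            unencoded = unencoded.lstrip()
            first = False
        if unencoded:
            words.append((unencoded, None, None))
        if parts:
            charset = parts.pop(0).lower()
            encoding = parts.pop(0).lower()
            encoded = parts.pop(0)
            words.append((encoded, encoding, charset))
    return words


def pad_base64(encoded):
    """Append '=' characters until the length is a multiple of four."""
    paderr = len(encoded) % 4
    if paderr:
        encoded += '==='[:4 - paderr]
    return encoded


words = split_encoded_words('Re: =?UTF-8?B?SGVsbG8?= world')
print(words)
# [('Re: ', None, None), ('SGVsbG8', 'b', 'utf-8'), (' world', None, None)]

print(base64.b64decode(pad_base64('SGVsbG8')))
# b'Hello'
```

The encoded word in the sample is missing its trailing `=`; padding it before decoding is the tolerance that the "Postel's law" comment in the listing refers to.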
