annotate roundup/token_r.py @ 8180:d02ce1d14acd
feat: issue2551068 - Provide way to retrieve file/msg data via rest endpoint.
Use Accept header to change format of /binary_content endpoint. If
Accept header for endpoint is not application/json, it will be matched
against the mime type for the file. */*, text/* are supported and will
return the native mime type if present.
Changes:
move */* mime type from static dict of supported types. It was
hardcoded to return json only. Now it can return a matching
non-json mime type for the /binary_content endpoint.
Edited some errors to explicitly add */* mime type.
Cleanups to use ', ' separation in lists of valid mime types rather
than just space separated.
Remove ETag header when sending raw content. See issue 2551375 for
background.
Doc added to rest.txt.
Small format fix up (add dash) in CHANGES.txt.
Make passing an unset/None/False accept_mime_type to
format_dispatch_output a 500 error. This used to be the fallback
to produce a 406 error after all processing had happened. It
should no longer be possible to take that code path as all 406
errors (with valid accept_mime_types) are generated before
processing takes place.
Make format_dispatch_output handle output other than json/xml so it
can send back binary_content data.
Removed a spurious client.response_code = 400 that seems to not be
used.
Tests added for all code paths.
Database setup for tests msg and file entry. This required a file
upload test to change so it doesn't look for file1 as the link
returned by the upload. Download the link and verify the data
rather than verifying the link.
Multiple formatting changes to error messages to make all lists of
valid mime types ', ' and not just space separated.
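
The mime-type matching described in the commit message can be pictured with a minimal sketch. It is not Roundup's code: the function name `choose_output_type`, its parameters, the generalisation of `text/*` to any `<type>/*` range, and the use of `None` for "nothing acceptable" (where a server would answer 406) are all assumptions made for illustration.

```python
def choose_output_type(accept_mime_type, file_mime_type):
    """Hypothetical helper: pick the response type for a raw-content request.

    accept_mime_type: a single requested type such as 'application/json',
                      '*/*' or 'text/*' (already extracted from the request).
    file_mime_type:   the native mime type stored for the file, or None.
    """
    # An explicit request for JSON keeps the JSON representation.
    if accept_mime_type == 'application/json':
        return 'application/json'

    # Without a native type there is nothing to match against. What the
    # real endpoint does in this case is not stated in the commit message.
    if file_mime_type is None:
        return None

    # Assumption: wildcards are matched as '*/*' or '<main type>/*'.
    main_type = file_mime_type.split('/', 1)[0]
    if accept_mime_type in ('*/*', file_mime_type, main_type + '/*'):
        return file_mime_type

    # No match: the caller would report 406 Not Acceptable.
    return None
```

Under these assumptions, `choose_output_type('text/*', 'text/plain')` yields the native type, while a request for `image/png` against a `text/plain` file yields `None`.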
| author | John Rouillard <rouilj@ieee.org> |
|---|---|
| date | Sun, 08 Dec 2024 17:22:33 -0500 |
| parents | 9a74dfeb8620 |
| children | |

| rev | changeset | description | author | parents |
|---|---|---|---|---|
| 7178 | db06d4aeb978 | unshadow stdlib token from roundup's token. | John Rouillard <rouilj@ieee.org> | |
| 7228 | 07ce4e4110f5 | flake8 fixes: whitespace, remove unused imports | John Rouillard <rouilj@ieee.org> | 7178 |
| 7859 | 9a74dfeb8620 | feat: can use escaped tokens inside quotes including quotes. | John Rouillard <rouilj@ieee.org> | 7228 |

| rev | source |
|---|---|
| 7178 | # |
| 7178 | # Copyright (c) 2001 Richard Jones, richard@bofh.asn.au. |
| 7178 | # This module is free software, and you may redistribute it and/or modify |
| 7178 | # under the same terms as Python, so long as this copyright message and |
| 7178 | # disclaimer are retained in their original form. |
| 7178 | # |
| 7178 | # This module is distributed in the hope that it will be useful, |
| 7178 | # but WITHOUT ANY WARRANTY; without even the implied warranty of |
| 7178 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. |
| 7178 | # |
| 7178 | |
| 7178 | """This module provides the tokeniser used by roundup-admin. |
| 7178 | """ |
| 7178 | __docformat__ = 'restructuredtext' |
| 7178 | |
| 7178 | |
| 7178 | def token_split(s, whitespace=' \r\n\t', quotes='\'"', |
| 7178 |                 escaped={'r': '\r', 'n': '\n', 't': '\t'}): |
| 7178 |     r'''Split the string up into tokens. An occurence of a ``'`` or ``"`` in |
| 7178 |     the input will cause the splitter to ignore whitespace until a matching |
| 7178 |     quote char is found. Embedded non-matching quote chars are also skipped. |
| 7178 | |
| 7178 |     Whitespace and quoting characters may be escaped using a backslash. |
| 7178 |     ``\r``, ``\n`` and ``\t`` are converted to carriage-return, newline and |
| 7178 |     tab. All other backslashed characters are left as-is. |
| 7178 | |
| 7178 |     Valid examples:: |
| 7178 | |
| 7178 |         hello world (2 tokens: hello, world) |
| 7178 |         "hello world" (1 token: hello world) |
| 7178 |         "Roch'e" Compaan (2 tokens: Roch'e Compaan) |
| 7178 |         Roch\'e Compaan (2 tokens: Roch'e Compaan) |
| 7178 |         address="1 2 3" (1 token: address=1 2 3) |
| 7178 |         \\ (1 token: \) |
| 7178 |         \n (1 token: a newline) |
| 7178 |         \o (1 token: \o) |
| 7178 | |
| 7178 |     Invalid examples:: |
| 7178 | |
| 7178 |         "hello world (no matching quote) |
| 7178 |         Roch'e Compaan (no matching quote) |
| 7178 |     ''' |
| 7178 |     l = [] |
| 7178 |     pos = 0 |
| 7178 |     NEWTOKEN = 'newtoken' |
| 7178 |     TOKEN = 'token' |
| 7178 |     QUOTE = 'quote' |
| 7178 |     ESCAPE = 'escape' |
| 7178 |     quotechar = '' |
| 7178 |     state = NEWTOKEN |
| 7178 |     oldstate = ''  # one-level state stack ;) |
| 7178 |     length = len(s) |
| 7178 |     token = '' |
| 7178 |     while 1: |
| 7178 |         # end of string, finish off the current token |
| 7178 |         if pos == length: |
| 7228 |             if state == QUOTE: raise ValueError  # noqa: E701 |
| 7228 |             elif state == TOKEN: l.append(token)  # noqa: E701 |
| 7178 |             break |
| 7178 |         c = s[pos] |
| 7178 |         if state == NEWTOKEN: |
| 7178 |             # looking for a new token |
| 7178 |             if c in quotes: |
| 7178 |                 # quoted token |
| 7178 |                 state = QUOTE |
| 7178 |                 quotechar = c |
| 7178 |                 pos = pos + 1 |
| 7178 |                 continue |
| 7178 |             elif c in whitespace: |
| 7178 |                 # skip whitespace |
| 7178 |                 pos = pos + 1 |
| 7178 |                 continue |
| 7178 |             elif c == '\\': |
| 7178 |                 pos = pos + 1 |
| 7178 |                 oldstate = TOKEN |
| 7178 |                 state = ESCAPE |
| 7178 |                 continue |
| 7178 |             # otherwise we have a token |
| 7178 |             state = TOKEN |
| 7178 |         elif state == TOKEN: |
| 7178 |             if c in whitespace: |
| 7178 |                 # have a token, and have just found a whitespace terminator |
| 7178 |                 l.append(token) |
| 7178 |                 pos = pos + 1 |
| 7178 |                 state = NEWTOKEN |
| 7178 |                 token = '' |
| 7178 |                 continue |
| 7178 |             elif c in quotes: |
| 7178 |                 # have a token, just found embedded quotes |
| 7178 |                 state = QUOTE |
| 7178 |                 quotechar = c |
| 7178 |                 pos = pos + 1 |
| 7178 |                 continue |
| 7178 |             elif c == '\\': |
| 7178 |                 pos = pos + 1 |
| 7178 |                 oldstate = state |
| 7178 |                 state = ESCAPE |
| 7178 |                 continue |
| 7859 |         elif state == QUOTE and c == '\\': |
| 7859 |             # in a quoted token and found an escape sequence |
| 7859 |             pos = pos + 1 |
| 7859 |             oldstate = state |
| 7859 |             state = ESCAPE |
| 7859 |             continue |
| 7178 |         elif state == QUOTE and c == quotechar: |
| 7178 |             # in a quoted token and found a matching quote char |
| 7178 |             pos = pos + 1 |
| 7178 |             # now we're looking for whitespace |
| 7178 |             state = TOKEN |
| 7178 |             continue |
| 7178 |         elif state == ESCAPE: |
| 7178 |             # escaped-char conversions (t, r, n) |
| 7178 |             # TODO: octal, hexdigit |
| 7178 |             state = oldstate |
| 7178 |             if c in escaped: |
| 7178 |                 c = escaped[c] |
| 7178 |         # just add this char to the token and move along |
| 7178 |         token = token + c |
| 7178 |         pos = pos + 1 |
| 7178 |     return l |
| 7178 | |
| 7178 | # vim: set filetype=python ts=4 sw=4 et si |
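
To make the quoting and escaping rules from the `token_split` docstring easier to try out, here is a condensed, self-contained sketch of the same idea. It is an illustration, not the module's code: the name `split_tokens` and the `started` flag are hypothetical, and raising `ValueError` on a trailing backslash is a choice of this sketch that is not taken from the annotated source.

```python
def split_tokens(s, whitespace=' \r\n\t', quotes='\'"',
                 escaped={'r': '\r', 'n': '\n', 't': '\t'}):
    """Condensed illustration of quote- and escape-aware splitting."""
    tokens = []
    token = ''
    started = False   # a token is in progress (it may still be empty, e.g. "")
    quotechar = ''    # the currently open quote character, '' outside quotes
    pos = 0
    while pos < len(s):
        c = s[pos]
        pos += 1
        if c == '\\':
            # Escape: take the next character, translating r, n and t.
            if pos == len(s):
                # Assumption of this sketch: a dangling backslash is an error.
                raise ValueError('dangling backslash')
            c = s[pos]
            pos += 1
            token += escaped.get(c, c)
            started = True
        elif quotechar:
            # Inside quotes: only the matching quote character ends the run.
            if c == quotechar:
                quotechar = ''
            else:
                token += c
        elif c in quotes:
            # Opening quote, either at the start of a token or embedded in one.
            quotechar = c
            started = True
        elif c in whitespace:
            # Unquoted whitespace terminates the token in progress, if any.
            if started:
                tokens.append(token)
                token, started = '', False
        else:
            token += c
            started = True
    if quotechar:
        raise ValueError('no matching quote')
    if started:
        tokens.append(token)
    return tokens


# A few of the docstring's examples, run through the sketch.
print(split_tokens('hello world'))         # ['hello', 'world']
print(split_tokens('"hello world"'))       # ['hello world']
print(split_tokens('address="1 2 3"'))     # ['address=1 2 3']
```

Tracking a single `quotechar` plays the role of the separate QUOTE state, and handling the backslash before any other test mirrors the behaviour of revision 7859, where escapes are honoured inside quotes as well as outside them.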
