annotate roundup/token.py @ 2467:76ead526113d

client instances may be used as translation engines. any backend translator may be passed as constructor argument or via setTranslator() method. by default, templating.translationService is used. use this engine to translate client messages.
author Alexander Smishlajev <a1s@users.sourceforge.net>
date Tue, 15 Jun 2004 09:19:49 +0000
parents fc52d57c6c3e
children 6e3e4f24c753
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
1 #
475
a1a44636bace Fix breakage caused by transaction changes.
Richard Jones <richard@users.sourceforge.net>
parents: 470
diff changeset
2 # Copyright (c) 2001 Richard Jones, richard@bofh.asn.au.
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
3 # This module is free software, and you may redistribute it and/or modify
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
4 # under the same terms as Python, so long as this copyright message and
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
5 # disclaimer are retained in their original form.
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
6 #
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
7 # This module is distributed in the hope that it will be useful,
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
8 # but WITHOUT ANY WARRANTY; without even the implied warranty of
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
9 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
10 #
2005
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
11 # $Id: token.py,v 1.4 2004-02-11 23:55:08 richard Exp $
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
12 #
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
13
2005
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
14 """This module provides the tokeniser used by roundup-admin.
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
15 """
2005
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
16 __docformat__ = 'restructuredtext'
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
17
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
18 def token_split(s, whitespace=' \r\n\t', quotes='\'"',
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
19 escaped={'r':'\r', 'n':'\n', 't':'\t'}):
2005
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
20 '''Split the string up into tokens. An occurence of a ``'`` or ``"`` in
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
21 the input will cause the splitter to ignore whitespace until a matching
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
22 quote char is found. Embedded non-matching quote chars are also skipped.
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
23
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
24 Whitespace and quoting characters may be escaped using a backslash.
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
25 ``\r``, ``\n`` and ``\t`` are converted to carriage-return, newline and
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
26 tab. All other backslashed characters are left as-is.
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
27
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
28 Valid examples::
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
29
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
30 hello world (2 tokens: hello, world)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
31 "hello world" (1 token: hello world)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
32 "Roch'e" Compaan (2 tokens: Roch'e Compaan)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
33 Roch\'e Compaan (2 tokens: Roch'e Compaan)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
34 address="1 2 3" (1 token: address=1 2 3)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
35 \\ (1 token: \)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
36 \n (1 token: a newline)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
37 \o (1 token: \o)
2005
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
38
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
39 Invalid examples::
fc52d57c6c3e documentation cleanup
Richard Jones <richard@users.sourceforge.net>
parents: 1090
diff changeset
40
470
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
41 "hello world (no matching quote)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
42 Roch'e Compaan (no matching quote)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
43 '''
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
44 l = []
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
45 pos = 0
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
46 NEWTOKEN = 'newtoken'
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
47 TOKEN = 'token'
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
48 QUOTE = 'quote'
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
49 ESCAPE = 'escape'
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
50 quotechar = ''
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
51 state = NEWTOKEN
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
52 oldstate = '' # one-level state stack ;)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
53 length = len(s)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
54 finish = 0
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
55 token = ''
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
56 while 1:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
57 # end of string, finish off the current token
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
58 if pos == length:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
59 if state == QUOTE: raise ValueError, "unmatched quote"
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
60 elif state == TOKEN: l.append(token)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
61 break
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
62 c = s[pos]
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
63 if state == NEWTOKEN:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
64 # looking for a new token
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
65 if c in quotes:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
66 # quoted token
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
67 state = QUOTE
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
68 quotechar = c
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
69 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
70 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
71 elif c in whitespace:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
72 # skip whitespace
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
73 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
74 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
75 elif c == '\\':
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
76 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
77 oldstate = TOKEN
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
78 state = ESCAPE
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
79 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
80 # otherwise we have a token
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
81 state = TOKEN
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
82 elif state == TOKEN:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
83 if c in whitespace:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
84 # have a token, and have just found a whitespace terminator
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
85 l.append(token)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
86 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
87 state = NEWTOKEN
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
88 token = ''
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
89 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
90 elif c in quotes:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
91 # have a token, just found embedded quotes
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
92 state = QUOTE
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
93 quotechar = c
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
94 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
95 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
96 elif c == '\\':
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
97 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
98 oldstate = state
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
99 state = ESCAPE
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
100 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
101 elif state == QUOTE and c == quotechar:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
102 # in a quoted token and found a matching quote char
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
103 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
104 # now we're looking for whitespace
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
105 state = TOKEN
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
106 continue
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
107 elif state == ESCAPE:
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
108 # escaped-char conversions (t, r, n)
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
109 # TODO: octal, hexdigit
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
110 state = oldstate
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
111 if escaped.has_key(c):
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
112 c = escaped[c]
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
113 # just add this char to the token and move along
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
114 token = token + c
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
115 pos = pos + 1
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
116 return l
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
117
9f7320624bc2 Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
118 # vim: set filetype=python ts=4 sw=4 et si

Roundup Issue Tracker: http://roundup-tracker.org/