annotate roundup/cgi/TAL/TALDefs.py @ 5096:e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
Also adds new config.ini setting called indexer to select
indexer. See ``doc/upgrading.txt`` for details. Initial patch
done by David Wolever. Patch modified (see ticket or below for
changes), docs updated and committed.
I have an outstanding issue with test/test_indexer.py. I have to
comment out all imports and tests for indexers I don't have (i.e.
mysql, postgres) otherwise no tests run.
With that change made, dbm, sqlite (rdbms), xapian and whoosh indexes
are all passing the indexer tests.
Changes summary:
1) support native back ends dbm and rdbms. (original patch only fell
through to dbm)
2) Developed whoosh stopfilter to not index stopwords or words outside
the maxlength and minlength limits defined in index_common.py.
Required to pass the extremewords test_indexer test. Also I
removed a call to .lower on the input text as the tokenizer I chose
automatically does the lowercase.
3) Added support for max/min length to find. This was needed to pass
extremewords test.
4) Added back a call to save_index in add_text. This allowed all but
two tests to pass.
5) Fixed a call to:
results = searcher.search(query.Term("identifier", identifier))
which had an extra parameter that is an error under current whoosh.
6) Set limit=None in the search call for find(); otherwise it only returns
10 items. This allowed it to pass the manyresults test.
Also due to changes in the roundup code removed the call in
indexer_whoosh to
from roundup.anypy.sets_ import set
since we use the python builtin set.
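
The length and stopword filtering described in items 2 and 3 can be sketched in plain Python. This is only an illustration of the idea, not the Whoosh filter from the patch: the names `MIN_LENGTH`, `MAX_LENGTH`, `STOPWORDS` and `index_terms` are hypothetical, and the limit values are assumptions rather than the ones Roundup defines.

```python
import re

# Hypothetical limits and stopword list, chosen only for this sketch.
MIN_LENGTH = 2
MAX_LENGTH = 25
STOPWORDS = frozenset(["a", "and", "the", "to"])


def index_terms(text, minlength=MIN_LENGTH, maxlength=MAX_LENGTH,
                stopwords=STOPWORDS):
    """Return the lowercased words from text that would be indexed.

    Words shorter than minlength, longer than maxlength, or present in
    stopwords are dropped, mirroring the filtering the commit describes.
    """
    words = re.findall(r"\w+", text.lower())
    return [w for w in words
            if minlength <= len(w) <= maxlength and w not in stopwords]


if __name__ == "__main__":
    print(index_terms("The quick fox jumped to a " + "x" * 30))
```

Applying the same filter to both indexed text and search terms keeps the two sides consistent, which is presumably why the commit adds the length limits to find() as well.
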
| author | John Rouillard <rouilj@ieee.org> |
|---|---|
| date | Sat, 25 Jun 2016 20:10:03 -0400 |
| parents | 8c2402a78bb0 |
| children | 63868084b8bb |
| rev | line source |
|---|---|
| 1049 | 1 ############################################################################## |
| 1049 | 2 # |
| 1049 | 3 # Copyright (c) 2001, 2002 Zope Corporation and Contributors. |
| 1049 | 4 # All Rights Reserved. |
| 2348 | 5 # |
| 1049 | 6 # This software is subject to the provisions of the Zope Public License, |
| 1049 | 7 # Version 2.0 (ZPL). A copy of the ZPL should accompany this distribution. |
| 1049 | 8 # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED |
| 1049 | 9 # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED |
| 1049 | 10 # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS |
| 2348 | 11 # FOR A PARTICULAR PURPOSE. |
| 2348 | 12 # |
| 1049 | 13 ############################################################################## |
| 2348 | 14 # Modifications for Roundup: |
| 2348 | 15 # 1. commented out ITALES references |
| 1049 | 16 """ |
| 2348 | 17 Common definitions used by TAL and METAL compilation an transformation. |
| 2348 | 18 """ |
| 1049 | 19 |
| 1049 | 20 from types import ListType, TupleType |
| 1049 | 21 |
| 2348 | 22 #from ITALES import ITALESErrorInfo |
| 2348 | 23 |
| 2348 | 24 TAL_VERSION = "1.4" |
| 1049 | 25 |
| 1049 | 26 XML_NS = "http://www.w3.org/XML/1998/namespace" # URI for XML namespace |
| 1049 | 27 XMLNS_NS = "http://www.w3.org/2000/xmlns/" # URI for XML NS declarations |
| 1049 | 28 |
| 1049 | 29 ZOPE_TAL_NS = "http://xml.zope.org/namespaces/tal" |
| 1049 | 30 ZOPE_METAL_NS = "http://xml.zope.org/namespaces/metal" |
| 2348 | 31 ZOPE_I18N_NS = "http://xml.zope.org/namespaces/i18n" |
| 1049 | 32 |
| 2348 | 33 # This RE must exactly match the expression of the same name in the |
| 2348 | 34 # zope.i18n.simpletranslationservice module: |
| 2348 | 35 NAME_RE = "[a-zA-Z_][-a-zA-Z0-9_]*" |
| 1049 | 36 |
| 1049 | 37 KNOWN_METAL_ATTRIBUTES = [ |
| 1049 | 38     "define-macro", |
| 1049 | 39     "use-macro", |
| 1049 | 40     "define-slot", |
| 1049 | 41     "fill-slot", |
| 2348 | 42     "slot", |
| 1049 | 43     ] |
| 1049 | 44 |
| 1049 | 45 KNOWN_TAL_ATTRIBUTES = [ |
| 1049 | 46     "define", |
| 1049 | 47     "condition", |
| 1049 | 48     "content", |
| 1049 | 49     "replace", |
| 1049 | 50     "repeat", |
| 1049 | 51     "attributes", |
| 1049 | 52     "on-error", |
| 1049 | 53     "omit-tag", |
| 1049 | 54     "tal tag", |
| 1049 | 55     ] |
| 1049 | 56 |
| 2348 | 57 KNOWN_I18N_ATTRIBUTES = [ |
| 2348 | 58     "translate", |
| 2348 | 59     "domain", |
| 2348 | 60     "target", |
| 2348 | 61     "source", |
| 2348 | 62     "attributes", |
| 2348 | 63     "data", |
| 2348 | 64     "name", |
| 2348 | 65     ] |
| 2348 | 66 |
| 1049 | 67 class TALError(Exception): |
| 1049 | 68 |
| 1049 | 69     def __init__(self, msg, position=(None, None)): |
| 1049 | 70         assert msg != "" |
| 1049 | 71         self.msg = msg |
| 1049 | 72         self.lineno = position[0] |
| 1049 | 73         self.offset = position[1] |
| 2348 | 74         self.filename = None |
| 2348 | 75 |
| 2348 | 76     def setFile(self, filename): |
| 2348 | 77         self.filename = filename |
| 1049 | 78 |
| 1049 | 79     def __str__(self): |
| 1049 | 80         result = self.msg |
| 1049 | 81         if self.lineno is not None: |
| 1049 | 82             result = result + ", at line %d" % self.lineno |
| 1049 | 83         if self.offset is not None: |
| 1049 | 84             result = result + ", column %d" % (self.offset + 1) |
| 2348 | 85         if self.filename is not None: |
| 2348 | 86             result = result + ', in file %s' % self.filename |
| 1049 | 87         return result |
| 1049 | 88 |
| 1049 | 89 class METALError(TALError): |
| 1049 | 90     pass |
| 1049 | 91 |
| 1049 | 92 class TALESError(TALError): |
| 1049 | 93     pass |
| 1049 | 94 |
| 2348 | 95 class I18NError(TALError): |
| 2348 | 96     pass |
| 2348 | 97 |
| 2348 | 98 |
| 1049 | 99 class ErrorInfo: |
| 1049 | 100 |
| 2348 | 101     #__implements__ = ITALESErrorInfo |
| 2348 | 102 |
| 1049 | 103     def __init__(self, err, position=(None, None)): |
| 1049 | 104         if isinstance(err, Exception): |
| 1049 | 105             self.type = err.__class__ |
| 1049 | 106             self.value = err |
| 1049 | 107         else: |
| 1049 | 108             self.type = err |
| 1049 | 109             self.value = None |
| 1049 | 110         self.lineno = position[0] |
| 1049 | 111         self.offset = position[1] |
| 1049 | 112 |
| 2348 | 113 |
| 2348 | 114 |
| 1049 | 115 import re |
| 1049 | 116 _attr_re = re.compile(r"\s*([^\s]+)\s+([^\s].*)\Z", re.S) |
| 1049 | 117 _subst_re = re.compile(r"\s*(?:(text|structure)\s+)?(.*)\Z", re.S) |
| 1049 | 118 del re |
| 1049 | 119 |
| 2348 | 120 def parseAttributeReplacements(arg, xml): |
| 1049 | 121     dict = {} |
| 1049 | 122     for part in splitParts(arg): |
| 1049 | 123         m = _attr_re.match(part) |
| 1049 | 124         if not m: |
| 2348 | 125             raise TALError("Bad syntax in attributes: " + `part`) |
| 1049 | 126         name, expr = m.group(1, 2) |
| 2348 | 127         if not xml: |
| 2348 | 128             name = name.lower() |
| 1049 | 129         if dict.has_key(name): |
| 2348 | 130             raise TALError("Duplicate attribute name in attributes: " + `part`) |
| 1049 | 131         dict[name] = expr |
| 1049 | 132     return dict |
| 1049 | 133 |
| 1049 | 134 def parseSubstitution(arg, position=(None, None)): |
| 1049 | 135     m = _subst_re.match(arg) |
| 1049 | 136     if not m: |
| 1049 | 137         raise TALError("Bad syntax in substitution text: " + `arg`, position) |
| 1049 | 138     key, expr = m.group(1, 2) |
| 1049 | 139     if not key: |
| 1049 | 140         key = "text" |
| 1049 | 141     return key, expr |
| 1049 | 142 |
| 1049 | 143 def splitParts(arg): |
| 1049 | 144     # Break in pieces at undoubled semicolons and |
| 1049 | 145     # change double semicolons to singles: |
| 2348 | 146     arg = arg.replace(";;", "\0") |
| 2348 | 147     parts = arg.split(';') |
| 2348 | 148     parts = [p.replace("\0", ";") for p in parts] |
| 2348 | 149     if len(parts) > 1 and not parts[-1].strip(): |
| 1049 | 150         del parts[-1] # It ended in a semicolon |
| 1049 | 151     return parts |
| 1049 | 152 |
| 1049 | 153 def isCurrentVersion(program): |
| 1049 | 154     version = getProgramVersion(program) |
| 1049 | 155     return version == TAL_VERSION |
| 1049 | 156 |
| 1049 | 157 def getProgramMode(program): |
| 1049 | 158     version = getProgramVersion(program) |
| 1049 | 159     if (version == TAL_VERSION and isinstance(program[1], TupleType) and |
| 1049 | 160         len(program[1]) == 2): |
| 1049 | 161         opcode, mode = program[1] |
| 1049 | 162         if opcode == "mode": |
| 1049 | 163             return mode |
| 1049 | 164     return None |
| 1049 | 165 |
| 1049 | 166 def getProgramVersion(program): |
| 1049 | 167     if (len(program) >= 2 and |
| 1049 | 168         isinstance(program[0], TupleType) and len(program[0]) == 2): |
| 1049 | 169         opcode, version = program[0] |
| 1049 | 170         if opcode == "version": |
| 1049 | 171             return version |
| 1049 | 172     return None |
| 1049 | 173 |
| 2348 | 174 import re |
| 2348 | 175 _ent1_re = re.compile('&(?![A-Z#])', re.I) |
| 2348 | 176 _entch_re = re.compile('&([A-Z][A-Z0-9]*)(?![A-Z0-9;])', re.I) |
| 2348 | 177 _entn1_re = re.compile('&#(?![0-9X])', re.I) |
| 2348 | 178 _entnx_re = re.compile('&(#X[A-F0-9]*)(?![A-F0-9;])', re.I) |
| 2348 | 179 _entnd_re = re.compile('&(#[0-9][0-9]*)(?![0-9;])') |
| 2348 | 180 del re |
| 2348 | 181 |
| 2348 | 182 def attrEscape(s): |
| 2348 | 183     """Replace special characters '&<>' by character entities, |
| 2348 | 184     except when '&' already begins a syntactically valid entity.""" |
| 2348 | 185     s = _ent1_re.sub('&amp;', s) |
| 2348 | 186     s = _entch_re.sub(r'&amp;\1', s) |
| 2348 | 187     s = _entn1_re.sub('&amp;#', s) |
| 2348 | 188     s = _entnx_re.sub(r'&amp;\1', s) |
| 2348 | 189     s = _entnd_re.sub(r'&amp;\1', s) |
| 2348 | 190     s = s.replace('<', '&lt;') |
| 2348 | 191     s = s.replace('>', '&gt;') |
| 2348 | 192     s = s.replace('"', '&quot;') |
| 2348 | 193     return s |
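
To see how `splitParts` and `attrEscape` behave on concrete input, here is a small standalone sketch for Python 3. It is not part of the annotated file: the snake_case names `split_parts` and `attr_escape` are chosen for this sketch, the regular expressions are copied from the listing, and the replacement strings are assumed to be the HTML entities `&amp;`, `&lt;`, `&gt;` and `&quot;`, as the `attrEscape` docstring implies.

```python
import re

# Patterns copied from the annotated source: each one matches an '&' that
# does NOT already start a syntactically valid entity.
_ent1_re = re.compile('&(?![A-Z#])', re.I)
_entch_re = re.compile('&([A-Z][A-Z0-9]*)(?![A-Z0-9;])', re.I)
_entn1_re = re.compile('&#(?![0-9X])', re.I)
_entnx_re = re.compile('&(#X[A-F0-9]*)(?![A-F0-9;])', re.I)
_entnd_re = re.compile('&(#[0-9][0-9]*)(?![0-9;])')


def attr_escape(s):
    """Escape '&', '<', '>' and '"', leaving existing entities untouched."""
    s = _ent1_re.sub('&amp;', s)
    s = _entch_re.sub(r'&amp;\1', s)
    s = _entn1_re.sub('&amp;#', s)
    s = _entnx_re.sub(r'&amp;\1', s)
    s = _entnd_re.sub(r'&amp;\1', s)
    s = s.replace('<', '&lt;')
    s = s.replace('>', '&gt;')
    s = s.replace('"', '&quot;')
    return s


def split_parts(arg):
    """Split at single semicolons; a doubled ';;' becomes a literal ';'."""
    arg = arg.replace(";;", "\0")          # protect doubled semicolons
    parts = arg.split(';')
    parts = [p.replace("\0", ";") for p in parts]
    if len(parts) > 1 and not parts[-1].strip():
        del parts[-1]                      # the input ended in a semicolon
    return parts


if __name__ == "__main__":
    print(split_parts("href url;; title text;"))
    print(attr_escape('AT&T says "1 < 2" &amp; more'))
```

The NUL character works as a placeholder in `split_parts` only on the assumption that it never appears in the input, which the original code also relies on.
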
