annotate roundup/backends/indexer_dbm.py @ 5548:fea11d05110e
Avoid errors from selecting "no selection" on multilink (issue2550722).

As discussed in issue 2550722 there are various cases where selecting
"no selection" on a multilink can result in inappropriate errors from
Roundup:

* If selecting "no selection" produces a null edit (a value was set in
  the multilink in an edit with an error, then removed again, along
  with all other changes, in the next form submission), so the page is
  rendered from the form contents including the "-<id>" value for "no
  selection" for the multilink.

* If creating an item with a nonempty value for a multilink has an
  error, and the resubmission changes that multilink to "no selection"
  (and this in turn has subcases, according to whether the creation
  then succeeds or fails on the resubmission, which need fixes in
  different places in the Roundup code).

All of these cases have in common that it is expected and OK to have a
"-<id>" value for a submission for a multilink when <id> is not set in
that multilink in the database (because the original attempt to set
<id> in that multilink had an error), so the hyperdb.py logic to give
an error in that case is thus removed. In the subcase of the second
case where the resubmission with "no selection" has an error, the
templating code tries to produce a menu entry for the "-<id>"
multilink value, which also results in an error, hence the
templating.py change to ignore such values in the list for a
multilink.
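
The behaviour described in this commit message can be illustrated with a minimal sketch. It is not the actual hyperdb.py or templating.py code: the function names `apply_multilink_values` and `menu_values` are hypothetical, and the handling of "+" and "-" prefixes is a simplified assumption about how submitted multilink values are interpreted.

```python
def apply_multilink_values(current, submitted):
    """Apply submitted form values to the ids stored in a multilink.

    Hypothetical helper: a "-<id>" value asks for <id> to be removed,
    a "+<id>" or bare "<id>" value asks for it to be added.
    """
    result = list(current)
    for raw in submitted:
        if raw.startswith('-'):
            item = raw[1:]
            # Removing an id that is not stored is treated as a no-op
            # rather than an error, as the commit message describes.
            if item in result:
                result.remove(item)
        else:
            item = raw[1:] if raw.startswith('+') else raw
            if item not in result:
                result.append(item)
    return result


def menu_values(submitted):
    """Drop "-<id>" entries before building menu options (hypothetical)."""
    return [value for value in submitted if not value.startswith('-')]


print(apply_multilink_values(['1', '2'], ['-3']))
print(menu_values(['-3', '4']))
```

The first call leaves the stored ids untouched because "3" was never set, and the second shows a removal marker being skipped when the form is re-rendered.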
| author | Joseph Myers <jsm@polyomino.org.uk> |
|---|---|
| date | Thu, 27 Sep 2018 11:33:01 +0000 |
| parents | e2baa4e6ed6d |
| children | 4c7662c86a36 |
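
The class annotated on this page documents three dictionaries (`files`, `words`, `fileids`) and a `find` method that returns entries matching all the given words. The following is a minimal in-memory sketch of that idea under stated assumptions: `TinyIndex` is a hypothetical name, the length bounds of 2 and 25 are assumed defaults, and stopword handling, purging of old entries and on-disk segments are left out.

```python
import re


class TinyIndex:
    """In-memory sketch of the files/words/fileids layout (hypothetical)."""

    def __init__(self, minlength=2, maxlength=25):
        # '_TOP' holds a negative counter so it cannot clash with a file id
        self.files = {'_TOP': (0, None)}   # {identifier: (fileid, wordcount)}
        self.words = {}                    # {word: {fileid: count}}
        self.fileids = {}                  # {fileid: identifier}
        self.minlength = minlength
        self.maxlength = maxlength

    def add_text(self, identifier, text):
        # Case-insensitive split into words within the length bounds
        pattern = r'\b\w{%d,%d}\b' % (self.minlength, self.maxlength)
        words = re.findall(pattern, str(text).upper())

        # Allocate the next file id by decrementing the '_TOP' counter
        top = self.files['_TOP'][0] - 1
        self.files['_TOP'] = (top, None)
        fileid = abs(top)
        self.files[identifier] = (fileid, len(words))
        self.fileids[fileid] = identifier

        # Record how often each word occurs in this text
        for word in words:
            entry = self.words.setdefault(word, {})
            entry[fileid] = entry.get(fileid, 0) + 1

    def find(self, wordlist):
        # Intersect the file ids of every word: all words must match
        hits = None
        for word in wordlist:
            entry = self.words.get(word.upper())
            if not entry:
                return []
            ids = set(entry)
            hits = ids if hits is None else hits & ids
        return [self.fileids[fileid] for fileid in sorted(hits or ())]


index = TinyIndex()
index.add_text(('issue', '1', 'title'), 'Search index broken')
index.add_text(('msg', '2', 'content'), 'index rebuilt')
print(index.find(['index', 'broken']))
```

Keeping a separate `fileids` map lets the per-word dictionaries store small integers instead of full `(classname, nodeid, propertyname)` tuples.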
| rev | line source |
|---|---|
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
1 # |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
2 # This module is derived from the module described at: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
3 # http://gnosis.cx/publish/programming/charming_python_15.txt |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
4 # |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
5 # Author: David Mertz (mertz@gnosis.cx) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
6 # Thanks to: Pat Knight (p.knight@ktgroup.co.uk) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
7 # Gregory Popovitch (greg@gpy.com) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
8 # |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
9 # The original module was released under this license, and remains under |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
10 # it: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
11 # |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
12 # This file is released to the public domain. I (dqm) would |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
13 # appreciate it if you choose to keep derived works under terms |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
14 # that promote freedom, but obviously am giving up any rights |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
15 # to compel such. |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
16 # |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
17 '''This module provides an indexer class, RoundupIndexer, that stores text |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
18 indices in a roundup instance. This class makes searching the content of |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
19 messages, string properties and text files possible. |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
20 ''' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
21 __docformat__ = 'restructuredtext' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
22 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
23 import os, shutil, re, mimetypes, marshal, zlib, errno |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
24 from roundup.hyperdb import Link, Multilink |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3295
diff
changeset
|
25 from roundup.backends.indexer_common import Indexer as IndexerBase |
|
2872
d530b68e4b42
don't index common words [SF#1046612]
Richard Jones <richard@users.sourceforge.net>
parents:
2089
diff
changeset
|
26 |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3295
diff
changeset
|
27 class Indexer(IndexerBase): |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
28 '''Indexes information from roundup's hyperdb to allow efficient |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
29 searching. |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
30 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
31 Three structures are created by the indexer:: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
32 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
33 files {identifier: (fileid, wordcount)} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
34 words {word: {fileid: count}} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
35 fileids {fileid: identifier} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
36 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
37 where identifier is (classname, nodeid, propertyname) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
38 ''' |
|
3295
a615cc230160
added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
39 def __init__(self, db): |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3295
diff
changeset
|
40 IndexerBase.__init__(self, db) |
|
3295
a615cc230160
added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
41 self.indexdb_path = os.path.join(db.config.DATABASE, 'indexes') |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
42 self.indexdb = os.path.join(self.indexdb_path, 'index.db') |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
43 self.reindex = 0 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
44 self.quiet = 9 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
45 self.changed = 0 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
46 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
47 # see if we need to reindex because of a change in code |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
48 version = os.path.join(self.indexdb_path, 'version') |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
49 if (not os.path.exists(self.indexdb_path) or |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
50 not os.path.exists(version)): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
51 # for now the file itself is a flag |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
52 self.force_reindex() |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
53 elif os.path.exists(version): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
54 version = open(version).read() |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
55 # check the value and reindex if it's not the latest |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
56 if version.strip() != '1': |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
57 self.force_reindex() |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
58 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
59 def force_reindex(self): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
60 '''Force a reindex condition |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
61 ''' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
62 if os.path.exists(self.indexdb_path): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
63 shutil.rmtree(self.indexdb_path) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
64 os.makedirs(self.indexdb_path) |
|
5380
64c4e43fbb84
Python 3 preparation: numeric literal syntax.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5248
diff
changeset
|
65 os.chmod(self.indexdb_path, 0o775) |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
66 open(os.path.join(self.indexdb_path, 'version'), 'w').write('1\n') |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
67 self.reindex = 1 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
68 self.changed = 1 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
69 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
70 def should_reindex(self): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
71 '''Should we reindex? |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
72 ''' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
73 return self.reindex |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
74 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
75 def add_text(self, identifier, text, mime_type='text/plain'): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
76 '''Add some text associated with the (classname, nodeid, property) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
77 identifier. |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
78 ''' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
79 # make sure the index is loaded |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
80 self.load_index() |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
81 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
82 # remove old entries for this identifier |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
83 if identifier in self.files: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
84 self.purge_entry(identifier) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
85 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
86 # split into words |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
87 words = self.splitter(text, mime_type) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
88 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
89 # Find new file index, and assign it to identifier |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
90 # (_TOP uses trick of negative to avoid conflict with file index) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
91 self.files['_TOP'] = (self.files['_TOP'][0]-1, None) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
92 file_index = abs(self.files['_TOP'][0]) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
93 self.files[identifier] = (file_index, len(words)) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
94 self.fileids[file_index] = identifier |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
95 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
96 # find the unique words |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
97 filedict = {} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
98 for word in words: |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3295
diff
changeset
|
99 if self.is_stopword(word): |
|
2872
d530b68e4b42
don't index common words [SF#1046612]
Richard Jones <richard@users.sourceforge.net>
parents:
2089
diff
changeset
|
100 continue |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
101 if word in filedict: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
102 filedict[word] = filedict[word]+1 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
103 else: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
104 filedict[word] = 1 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
105 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
106 # now add to the totals |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
107 for word in filedict: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
108 # each word has a dict of {identifier: count} |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
109 if word in self.words: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
110 entry = self.words[word] |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
111 else: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
112 # new word |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
113 entry = {} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
114 self.words[word] = entry |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
115 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
116 # make a reference to the file for this word |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
117 entry[file_index] = filedict[word] |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
118 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
119 # save needed |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
120 self.changed = 1 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
121 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
122 def splitter(self, text, ftype): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
123 '''Split the contents of a text string into a list of 'words' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
124 ''' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
125 if ftype == 'text/plain': |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
126 words = self.text_splitter(text) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
127 else: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
128 return [] |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
129 return words |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
130 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
131 def text_splitter(self, text): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
132 """Split text/plain string into a list of words |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
133 """ |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
134 # case insensitive |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
135 text = str(text).upper() |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
136 |
|
4252
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
137 # Split the raw text |
|
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
138 return re.findall(r'\b\w{%d,%d}\b' % (self.minlength, self.maxlength), |
|
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
139 text) |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
140 |
|
4252
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
141 # we override this to ignore too short and too long words |
|
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
142 # and also to fix a bug - the (fail) case. |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
143 def find(self, wordlist): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
144 '''Locate files that match ALL the words in wordlist |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
145 ''' |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
146 if not hasattr(self, 'words'): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
147 self.load_index() |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
148 self.load_index(wordlist=wordlist) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
149 entries = {} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
150 hits = None |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
151 for word in wordlist: |
|
4252
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
152 if not self.minlength <= len(word) <= self.maxlength: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
153 # word outside the bounds of what we index - ignore |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
154 continue |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
155 word = word.upper() |
|
4252
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
156 if self.is_stopword(word): |
|
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
3613
diff
changeset
|
157 continue |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
158 entry = self.words.get(word) # For each word, get index |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
159 entries[word] = entry # of matching files |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
160 if not entry: # Nothing for this one word (fail) |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
161 return {} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
162 if hits is None: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
163 hits = {} |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
164 for k in entry: |
|
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
165 if k not in self.fileids: |
|
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
166 raise ValueError('Index is corrupted: re-generate it') |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
167 hits[k] = self.fileids[k] |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
168 else: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
169 # Eliminate hits for every non-match |
|
4362
74476eaac38a
more modernisation
Richard Jones <richard@users.sourceforge.net>
parents:
4357
diff
changeset
|
170 for fileid in list(hits): |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
171 if fileid not in entry: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
172 del hits[fileid] |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
173 if hits is None: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
174 return {} |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
175 return list(hits.values()) |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
176 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
177 segments = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_-!" |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
178 def load_index(self, reload=0, wordlist=None): |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
179 # Unless reload is indicated, do not load twice |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
180 if self.index_loaded() and not reload: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
181 return 0 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
182 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
183 # Ok, now let's actually load it |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
184 db = {'WORDS': {}, 'FILES': {'_TOP':(0,None)}, 'FILEIDS': {}} |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
185 |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
186 # Identify the relevant word-dictionary segments |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
187 if not wordlist: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
188 segments = self.segments |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
189 else: |
|
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
190 segments = ['-','#'] |
191 for word in wordlist: |
|
5470
e2baa4e6ed6d
handle words starting with unicode characters
Christof Meerwald <cmeerw@cmeerw.org>
parents:
5395
diff
changeset
|
192 initchar = word[0].upper() |
193 if initchar not in self.segments: |
194 initchar = '_' |
195 segments.append(initchar) |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
196 |
197 # Load the segments |
198 for segment in segments: |
199 try: |
200 f = open(self.indexdb + segment, 'rb') |
|
5248
198b6e810c67
Use Python-3-compatible 'as' syntax for except statements
Eric S. Raymond <esr@thyrsus.com>
parents:
4570
diff
changeset
|
201 except IOError as error: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
202 # probably just nonexistent segment index file |
203 if error.errno != errno.ENOENT: raise |
204 else: |
205 pickle_str = zlib.decompress(f.read()) |
206 f.close() |
207 dbslice = marshal.loads(pickle_str) |
208 if dbslice.get('WORDS'): |
209 # if it has some words, add them |
|
5395
23b8e6067f7c
Python 3 preparation: update calls to dict methods.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5380
diff
changeset
|
210 for word, entry in dbslice['WORDS'].items(): |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
211 db['WORDS'][word] = entry |
212 if dbslice.get('FILES'): |
213 # if it has some files, add them |
214 db['FILES'] = dbslice['FILES'] |
215 if dbslice.get('FILEIDS'): |
216 # if it has fileids, add them |
217 db['FILEIDS'] = dbslice['FILEIDS'] |
218 |
219 self.words = db['WORDS'] |
220 self.files = db['FILES'] |
221 self.fileids = db['FILEIDS'] |
222 self.changed = 0 |
223 |
224 def save_index(self): |
225 # only save if the index is loaded and changed |
226 if not self.index_loaded() or not self.changed: |
227 return |
228 |
229 # brutal space saver... delete all the small segments |
230 for segment in self.segments: |
231 try: |
232 os.remove(self.indexdb + segment) |
|
5248
198b6e810c67
Use Python-3-compatible 'as' syntax for except statements
Eric S. Raymond <esr@thyrsus.com>
parents:
4570
diff
changeset
|
233 except OSError as error: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
234 # probably just nonexistent segment index file |
235 if error.errno != errno.ENOENT: raise |
236 |
237 # First write the much simpler filename/fileid dictionaries |
238 dbfil = {'WORDS':None, 'FILES':self.files, 'FILEIDS':self.fileids} |
239 open(self.indexdb+'-','wb').write(zlib.compress(marshal.dumps(dbfil))) |
240 |
241 # The hard part is splitting the word dictionary up, of course |
242 letters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_" |
243 segdicts = {} # Need batch of empty dicts |
244 for segment in letters: |
245 segdicts[segment] = {} |
|
5395
23b8e6067f7c
Python 3 preparation: update calls to dict methods.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5380
diff
changeset
|
246 for word, entry in self.words.items(): # Split into segment dicts |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
247 initchar = word[0].upper() |
|
5470
e2baa4e6ed6d
handle words starting with unicode characters
Christof Meerwald <cmeerw@cmeerw.org>
parents:
5395
diff
changeset
|
248 if initchar not in letters: |
249 # if it's a unicode character, add it to the '_' segment |
250 initchar = '_' |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
251 segdicts[initchar][word] = entry |
252 |
253 # save |
254 for initchar in letters: |
255 db = {'WORDS':segdicts[initchar], 'FILES':None, 'FILEIDS':None} |
256 pickle_str = marshal.dumps(db) |
257 filename = self.indexdb + initchar |
258 pickle_fh = open(filename, 'wb') |
259 pickle_fh.write(zlib.compress(pickle_str)) |
|
5380
64c4e43fbb84
Python 3 preparation: numeric literal syntax.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5248
diff
changeset
|
260 os.chmod(filename, 0o664) |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
261 |
262 # save done |
263 self.changed = 0 |
264 |
265 def purge_entry(self, identifier): |
266 '''Remove a file from file index and word index |
267 ''' |
268 self.load_index() |
269 |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
270 if identifier not in self.files: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
271 return |
272 |
273 file_index = self.files[identifier][0] |
274 del self.files[identifier] |
275 del self.fileids[file_index] |
276 |
277 # The much harder part: clean up the word index |
|
5395
23b8e6067f7c
Python 3 preparation: update calls to dict methods.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5380
diff
changeset
|
278 for key, occurs in self.words.items(): |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4252
diff
changeset
|
279 if file_index in occurs: |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
280 del occurs[file_index] |
281 |
282 # save needed |
283 self.changed = 1 |
284 |
285 def index_loaded(self): |
286 return (hasattr(self,'fileids') and hasattr(self,'files') and |
287 hasattr(self,'words')) |
288 |
289 def rollback(self): |
290 ''' load last saved index info. ''' |
291 self.load_index(reload=1) |
292 |
|
3613
5f4db2650da3
implement close() on all indexers [SF#1242477]
Richard Jones <richard@users.sourceforge.net>
parents:
3555
diff
changeset
|
293 def close(self): |
294 pass |
295 |
296 |
|
2089
93f03c6714d8
A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
297 # vim: set filetype=python ts=4 sw=4 et si |
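
The segment scheme used by save_index() and load_index() above can be sketched in isolation. This is a minimal illustration, not the module's actual code: the helper names segment_for, save_segments and load_segments are hypothetical, the sketch writes only non-empty segments (the listing writes every segment), and it leaves out the FILES/FILEIDS slice stored under '-'.

```python
import marshal
import zlib

# Same segment alphabet as the listing; '_' collects every word whose
# first character falls outside 0-9 / A-Z (e.g. accented letters).
LETTERS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_"


def segment_for(word):
    """Return the one-character segment name a word is filed under."""
    initchar = word[0].upper()
    return initchar if initchar in LETTERS else '_'


def save_segments(prefix, words):
    """Write one zlib-compressed, marshalled dict per non-empty segment."""
    segdicts = {}
    for word, entry in words.items():
        segdicts.setdefault(segment_for(word), {})[word] = entry
    for initchar, segwords in segdicts.items():
        payload = zlib.compress(marshal.dumps({'WORDS': segwords}))
        # The context manager closes the handle even if write() fails.
        with open(prefix + initchar, 'wb') as fh:
            fh.write(payload)


def load_segments(prefix, wordlist):
    """Load only the segments needed to look up the given words."""
    words = {}
    for initchar in sorted({segment_for(w) for w in wordlist}):
        try:
            with open(prefix + initchar, 'rb') as fh:
                dbslice = marshal.loads(zlib.decompress(fh.read()))
        except FileNotFoundError:
            continue  # no word with this initial has been saved
        words.update(dbslice.get('WORDS') or {})
    return words
```

Because a lookup touches only the segment files for the initials of the requested words, a search for a single term avoids decompressing the rest of the index.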
