annotate roundup/backends/indexer_dbm.py @ 7800:2d4684e4702d

fix: enhancement to history command output and % template fix. Rather than using the key field, use the label field for descriptions. Call cls.labelprop(default_to_id=True) so it returns id rather than the first sorted property name. If labelprop() returns 'id' or 'title', we return nothing. 'id' means there is no label set and no properties named 'name' or 'title'. So have the caller do whatever it wants (prepend classname for example) when there is no human readable name. This prevents %(name)s%(key)s from producing: 23(23). Also don't accept the 'title' property. Titles can be too long. Arguably we could: '%(name)20s' to limit the title length. However without ellipses or something truncating the title might be confusing. So again pretend there is no human readable name.
author John Rouillard <rouilj@ieee.org>
date Tue, 12 Mar 2024 11:52:17 -0400
parents d17e57220a62
children
#
# This module is derived from the module described at:
# http://gnosis.cx/publish/programming/charming_python_15.txt
#
# Author: David Mertz (mertz@gnosis.cx)
# Thanks to: Pat Knight (p.knight@ktgroup.co.uk)
#            Gregory Popovitch (greg@gpy.com)
#
# The original module was released under this license, and remains under
# it:
#
# This file is released to the public domain. I (dqm) would
# appreciate it if you choose to keep derived works under terms
# that promote freedom, but obviously am giving up any rights
# to compel such.
#
'''This module provides an indexer class, RoundupIndexer, that stores text
indices in a roundup instance. This class makes searching the content of
messages, string properties and text files possible.
'''
__docformat__ = 'restructuredtext'

import errno
import marshal
import os
import re
import shutil
import zlib

from roundup.backends.indexer_common import Indexer as IndexerBase


class Indexer(IndexerBase):
    '''Indexes information from roundup's hyperdb to allow efficient
    searching.

    Three structures are created by the indexer::

          files   {identifier: (fileid, wordcount)}
          words   {word: {fileid: count}}
          fileids {fileid: identifier}

    where identifier is (classname, nodeid, propertyname)
    '''
    def __init__(self, db):
        IndexerBase.__init__(self, db)
        self.indexdb_path = os.path.join(db.config.DATABASE, 'indexes')
        self.indexdb = os.path.join(self.indexdb_path, 'index.db')
        self.reindex = 0
        self.quiet = 9
        self.changed = 0

        # see if we need to reindex because of a change in code
        version = os.path.join(self.indexdb_path, 'version')
        if (not os.path.exists(self.indexdb_path) or
                not os.path.exists(version)):
            # for now the file itself is a flag
            self.force_reindex()
        elif os.path.exists(version):
            fd = open(version)
            version = fd.read()
            fd.close()
            # check the value and reindex if it's not the latest
            if version.strip() != '1':
                self.force_reindex()

    def force_reindex(self):
        '''Force a reindex condition
        '''
        if os.path.exists(self.indexdb_path):
            shutil.rmtree(self.indexdb_path)
        os.makedirs(self.indexdb_path)
        os.chmod(self.indexdb_path, 0o775)  # nosec - allow group write
        fd = open(os.path.join(self.indexdb_path, 'version'), 'w')
        fd.write('1\n')
        fd.close()
        self.reindex = 1
        self.changed = 1

    def should_reindex(self):
        '''Should we reindex?
        '''
        return self.reindex

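The version-flag logic above can be tried outside a tracker. The following is a minimal sketch, not the module's own code: `needs_reindex` and `write_version` are hypothetical helper names introduced for illustration, and the directory passed in stands in for the tracker's `indexes` directory.

```python
import os

CURRENT_VERSION = '1'  # the same marker value the module writes to its flag file


def needs_reindex(index_dir):
    """Return True when the index directory or its version flag is missing,
    or when the flag holds anything other than the current version."""
    version_file = os.path.join(index_dir, 'version')
    if not os.path.exists(index_dir) or not os.path.exists(version_file):
        return True
    with open(version_file) as fd:
        return fd.read().strip() != CURRENT_VERSION


def write_version(index_dir):
    """Create the directory if needed and write the version flag
    (hypothetical helper; the module does this inside force_reindex)."""
    os.makedirs(index_dir, exist_ok=True)
    with open(os.path.join(index_dir, 'version'), 'w') as fd:
        fd.write(CURRENT_VERSION + '\n')
```

Calling `needs_reindex()` on a fresh directory should report that a reindex is due; after `write_version()` it should no longer do so until the flag content changes.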
    def add_text(self, identifier, text, mime_type='text/plain'):
        '''Add some text associated with the (classname, nodeid, property)
        identifier.
        '''
        # make sure the index is loaded
        self.load_index()

        # remove old entries for this identifier
        if identifier in self.files:
            self.purge_entry(identifier)

        # split into words
        words = self.splitter(text, mime_type)

        # Find new file index, and assign it to identifier
        # (_TOP uses trick of negative to avoid conflict with file index)
        self.files['_TOP'] = (self.files['_TOP'][0]-1, None)
        file_index = abs(self.files['_TOP'][0])
        self.files[identifier] = (file_index, len(words))
        self.fileids[file_index] = identifier

        # find the unique words
        filedict = {}
        for word in words:
            if self.is_stopword(word):
                continue
            if word in filedict:
                filedict[word] = filedict[word]+1
            else:
                filedict[word] = 1

        # now add to the totals
        for word in filedict:
            # each word has a dict of {identifier: count}
            if word in self.words:
                entry = self.words[word]
            else:
                # new word
                entry = {}
                self.words[word] = entry

            # make a reference to the file for this word
            entry[file_index] = filedict[word]

        # save needed
        self.changed = 1

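To see what the three structures look like after a couple of calls, here is a small in-memory sketch of the same bookkeeping. It is an illustration under simplifying assumptions, not the module's implementation: the free function `add_text` and the `index` dict are hypothetical, words are passed in already split, and stopword filtering and purging of old entries are left out.

```python
def add_text(index, identifier, words):
    """Record per-word counts for one identifier in a tiny in-memory index.

    `index` is a dict with 'files', 'words' and 'fileids' keys, mirroring
    the three structures named in the class docstring.
    """
    files = index['files']
    all_words = index['words']
    fileids = index['fileids']

    # _TOP counts downwards, so its stored value never collides with a file id
    files['_TOP'] = (files['_TOP'][0] - 1, None)
    file_index = abs(files['_TOP'][0])
    files[identifier] = (file_index, len(words))
    fileids[file_index] = identifier

    # count the occurrences of each word in this text
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1

    # merge the counts into the global {word: {fileid: count}} mapping
    for word, count in counts.items():
        all_words.setdefault(word, {})[file_index] = count


index = {'files': {'_TOP': (0, None)}, 'words': {}, 'fileids': {}}
add_text(index, ('issue', '1', 'title'), ['CRASH', 'ON', 'LOGIN', 'CRASH'])
add_text(index, ('msg', '7', 'content'), ['LOGIN', 'FAILS'])
```

After these two calls, `index['words']['LOGIN']` maps both file ids to a count of 1, and `index['files']['_TOP']` has moved to `(-2, None)`.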
    def splitter(self, text, ftype):
        '''Split the contents of a text string into a list of 'words'
        '''
        if ftype == 'text/plain':
            words = self.text_splitter(text)
        else:
            return []
        return words

    def text_splitter(self, text):
        """Split text/plain string into a list of words
        """
        if not text:
            return []

        # case insensitive
        text = text.upper()

        # Split the raw text
        return re.findall(r'\b\w{%d,%d}\b' % (self.minlength, self.maxlength),
                          text, re.UNICODE)

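The splitting regex can be checked on its own. The sketch below wraps it in a hypothetical `split_words` function; the default bounds of 2 and 25 are assumed values chosen for illustration, since `self.minlength` and `self.maxlength` are not defined in the code shown here.

```python
import re


def split_words(text, minlength=2, maxlength=25):
    """Upper-case the text and return the runs of word characters whose
    length lies between minlength and maxlength (inclusive)."""
    if not text:
        return []
    pattern = r'\b\w{%d,%d}\b' % (minlength, maxlength)
    return re.findall(pattern, text.upper(), re.UNICODE)


short_dropped = split_words('A bug in roundup-indexer v2')
long_dropped = split_words('hi extraordinarily ok', maxlength=5)
```

Because the pattern is anchored with `\b` on both sides, a word longer than `maxlength` is skipped entirely rather than truncated, and a word shorter than `minlength` does not match at all.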
    # we override this to ignore too short and too long words
    # and also to fix a bug - the (fail) case.
    def find(self, wordlist):
        '''Locate files that match ALL the words in wordlist
        '''
        if not hasattr(self, 'words'):
            self.load_index()
        self.load_index(wordlist=wordlist)
        entries = {}
        hits = None
        for word in wordlist:
            if not self.minlength <= len(word) <= self.maxlength:
                # word outside the bounds of what we index - ignore
                continue
            word = word.upper()
            if self.is_stopword(word):
                continue
            entry = self.words.get(word)    # For each word, get index
            entries[word] = entry           # of matching files
            if not entry:                   # Nothing for this one word (fail)
                return {}
            if hits is None:
                hits = {}
                for k in entry:
                    if k not in self.fileids:
                        raise ValueError('Index is corrupted: re-generate it')
                    hits[k] = self.fileids[k]
            else:
                # Eliminate hits for every non-match
                for fileid in list(hits):
                    if fileid not in entry:
                        del hits[fileid]
        if hits is None:
            return {}
        return list(hits.values())

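The "match ALL words" search amounts to intersecting the per-word sets of file ids. A compact sketch of that idea follows; `find_all` is a hypothetical standalone function, it uses set intersection instead of deleting non-matching keys, and it leaves out the length bounds and stopword checks that the method above applies.

```python
def find_all(words_index, fileids, wordlist):
    """Return the identifiers of files that contain every word in wordlist.

    words_index maps word -> {fileid: count}; fileids maps fileid -> identifier.
    """
    hits = None
    for word in wordlist:
        entry = words_index.get(word.upper())
        if not entry:
            # a single unknown word means no file can match all of them
            return []
        if hits is None:
            hits = set(entry)
        else:
            hits &= set(entry)
    if hits is None:
        return []
    # sort the file ids so the result order is deterministic
    return [fileids[fileid] for fileid in sorted(hits)]


words_index = {'CRASH': {1: 2}, 'LOGIN': {1: 1, 2: 1}, 'FAILS': {2: 1}}
fileids = {1: ('issue', '1', 'title'), 2: ('msg', '7', 'content')}
both = find_all(words_index, fileids, ['login'])
only_issue = find_all(words_index, fileids, ['login', 'crash'])
```

One design difference is deliberate: this sketch returns a list on every path, whereas the method above returns an empty dict on its failure paths and a list otherwise.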
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
190 segments = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_-!"
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
191
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
192     def load_index(self, reload=0, wordlist=None):
193         # Unless reload is indicated, do not load twice
194         if self.index_loaded() and not reload:
195             return 0
196
197         # Ok, now let's actually load it
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
198         db = {'WORDS': {}, 'FILES': {'_TOP': (0, None)}, 'FILEIDS': {}}
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
199
200         # Identify the relevant word-dictionary segments
201         if not wordlist:
202             segments = self.segments
203         else:
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
204             segments = ['-', '#']
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
205             for word in wordlist:
5470
e2baa4e6ed6d handle words starting with unicode characters
Christof Meerwald <cmeerw@cmeerw.org>
parents: 5395
diff changeset
206                 initchar = word[0].upper()
207                 if initchar not in self.segments:
208                     initchar = '_'
209                 segments.append(initchar)
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
210
211         # Load the segments
212         for segment in segments:
213             try:
214                 f = open(self.indexdb + segment, 'rb')
5248
198b6e810c67 Use Python-3-compatible 'as' syntax for except statements
Eric S. Raymond <esr@thyrsus.com>
parents: 4570
diff changeset
215             except IOError as error:
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
216                 # probably just nonexistent segment index file
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
217                 if error.errno != errno.ENOENT: raise  # noqa: E701
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
218             else:
219                 pickle_str = zlib.decompress(f.read())
220                 f.close()
221                 dbslice = marshal.loads(pickle_str)
222                 if dbslice.get('WORDS'):
223                     # if it has some words, add them
5395
23b8e6067f7c Python 3 preparation: update calls to dict methods.
Joseph Myers <jsm@polyomino.org.uk>
parents: 5380
diff changeset
224                     for word, entry in dbslice['WORDS'].items():
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
225                         db['WORDS'][word] = entry
226                 if dbslice.get('FILES'):
227                     # if it has some files, add them
228                     db['FILES'] = dbslice['FILES']
229                 if dbslice.get('FILEIDS'):
230                     # if it has fileids, add them
231                     db['FILEIDS'] = dbslice['FILEIDS']
232
233         self.words = db['WORDS']
234         self.files = db['FILES']
235         self.fileids = db['FILEIDS']
236         self.changed = 0
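
The segment-selection step of load_index() (source lines 200-209) can be read in isolation. The following is a minimal sketch, not the module's own code: `pick_segments` and the `SEGMENTS` string are hypothetical stand-ins for the method body and for `self.segments`, whose actual value is defined outside this part of the listing.

```python
# Hypothetical stand-in for self.segments; the real value is set elsewhere
# in indexer_dbm.py and may differ.
SEGMENTS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_-"


def pick_segments(wordlist=None, known=SEGMENTS):
    """Return the segment suffixes that must be read for a lookup.

    With no wordlist every known segment is returned. Otherwise only the
    '-' and '#' segments plus one segment per word's initial character are
    returned; initials outside the known set map to the '_' segment.
    """
    if not wordlist:
        return list(known)
    segments = ['-', '#']
    for word in wordlist:
        initchar = word[0].upper()
        if initchar not in known:
            initchar = '_'
        # Duplicates are not filtered here, mirroring the listing above.
        segments.append(initchar)
    return segments


selected = pick_segments(["roundup", "Émile", "42"])
print(selected)
```

Restricting the load to a handful of segments is what lets a search for a few words avoid reading every word-dictionary file from disk.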
237
238     def save_index(self):
239         # only save if the index is loaded and changed
240         if not self.index_loaded() or not self.changed:
241             return
242
243         # brutal space saver... delete all the small segments
244         for segment in self.segments:
245             try:
246                 os.remove(self.indexdb + segment)
5248
198b6e810c67 Use Python-3-compatible 'as' syntax for except statements
Eric S. Raymond <esr@thyrsus.com>
parents: 4570
diff changeset
247             except OSError as error:
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
248                 # probably just nonexistent segment index file
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
249                 if error.errno != errno.ENOENT: raise  # noqa: E701
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
250
251         # First write the much simpler filename/fileid dictionaries
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
252         dbfil = {'WORDS': None, 'FILES': self.files, 'FILEIDS': self.fileids}
7690
d17e57220a62 fix: close file properly in indexer_dbm.py:save_index()
John Rouillard <rouilj@ieee.org>
parents: 6982
diff changeset
253         marshal_fh = open(self.indexdb+'-', 'wb')
254         marshal_fh.write(zlib.compress(marshal.dumps(dbfil)))
255         marshal_fh.close()
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
256
257         # The hard part is splitting the word dictionary up, of course
258         letters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_"
259         segdicts = {}  # Need batch of empty dicts
260         for segment in letters:
261             segdicts[segment] = {}
5395
23b8e6067f7c Python 3 preparation: update calls to dict methods.
Joseph Myers <jsm@polyomino.org.uk>
parents: 5380
diff changeset
262         for word, entry in self.words.items():  # Split into segment dicts
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
263             initchar = word[0].upper()
5470
e2baa4e6ed6d handle words starting with unicode characters
Christof Meerwald <cmeerw@cmeerw.org>
parents: 5395
diff changeset
264             if initchar not in letters:
265                 # if it's a unicode character, add it to the '_' segment
266                 initchar = '_'
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
267             segdicts[initchar][word] = entry
268
269         # save
270         for initchar in letters:
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
271             db = {'WORDS': segdicts[initchar], 'FILES': None, 'FILEIDS': None}
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
272             pickle_str = marshal.dumps(db)
273             filename = self.indexdb + initchar
274             pickle_fh = open(filename, 'wb')
275             pickle_fh.write(zlib.compress(pickle_str))
6491
087cae2fbcea Handle more ResourceWarning issues.
John Rouillard <rouilj@ieee.org>
parents: 6002
diff changeset
276             pickle_fh.close()
5380
64c4e43fbb84 Python 3 preparation: numeric literal syntax.
Joseph Myers <jsm@polyomino.org.uk>
parents: 5248
diff changeset
277             os.chmod(filename, 0o664)
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
278
279         # save done
280         self.changed = 0
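
The on-disk format used by save_index() and load_index() is a marshal dump wrapped in zlib compression, with the word dictionary split by initial character. The sketch below shows that round trip under stated assumptions: `split_words`, `write_segment` and `read_segment` are hypothetical helper names, the file name `index.db` is made up, and the word entries are assumed to map a file index to a count. Note that despite the variable name `pickle_str` in the listing, the data is serialized with marshal, not pickle.

```python
import marshal
import os
import tempfile
import zlib

# Same character set as the `letters` string in save_index().
LETTERS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#_"


def split_words(words, letters=LETTERS):
    """Split a word dictionary into one dictionary per initial character."""
    segdicts = {ch: {} for ch in letters}
    for word, entry in words.items():
        initchar = word[0].upper()
        if initchar not in letters:
            initchar = '_'  # anything else lands in the '_' segment
        segdicts[initchar][word] = entry
    return segdicts


def write_segment(path, payload):
    """Serialize with marshal, compress with zlib, write as bytes."""
    with open(path, 'wb') as fh:
        fh.write(zlib.compress(marshal.dumps(payload)))


def read_segment(path):
    """Inverse of write_segment()."""
    with open(path, 'rb') as fh:
        return marshal.loads(zlib.decompress(fh.read()))


# Assumed entry shape: {file_index: occurrence_count}
words = {"ROUNDUP": {1: 2}, "ÉMILE": {1: 1}, "42": {2: 1}}
segdicts = split_words(words)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.db" + "R")  # hypothetical file name
    write_segment(path, {'WORDS': segdicts['R'],
                         'FILES': None, 'FILEIDS': None})
    restored = read_segment(path)

print(restored)
```

The helpers use `with` blocks so the file handle is closed even if compression or serialization raises, which is the concern changesets 6491 and 7690 address with explicit close() calls.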
281
282     def purge_entry(self, identifier):
283         '''Remove a file from file index and word index
284         '''
285         self.load_index()
286
4357
13b3155869e0 Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents: 4252
diff changeset
287         if identifier not in self.files:
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
288             return
289
290         file_index = self.files[identifier][0]
291         del self.files[identifier]
292         del self.fileids[file_index]
293
294         # The much harder part, cleanup the word index
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
295         for _key, occurs in self.words.items():
4357
13b3155869e0 Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents: 4252
diff changeset
296             if file_index in occurs:
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
297                 del occurs[file_index]
298
299         # save needed
300         self.changed = 1
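
What purge_entry() does to the three dictionaries can be tried on plain data. This is a minimal sketch with assumed shapes: `files` maps an identifier to a tuple whose first element is the file index, `fileids` maps that index back to the identifier, and each `words` entry maps file indexes to counts. The standalone `purge` function and the sample identifiers are hypothetical.

```python
def purge(files, fileids, words, identifier):
    """Remove one identifier from the file index and the word index.

    Returns True if something was removed, False if the identifier
    was not indexed.
    """
    if identifier not in files:
        return False

    file_index = files[identifier][0]
    del files[identifier]
    del fileids[file_index]

    # Only the inner per-word dictionaries are modified, so iterating
    # over `words` while deleting is safe.
    for occurs in words.values():
        occurs.pop(file_index, None)
    return True


files = {('issue', '1', 'title'): (1, 3), ('issue', '2', 'title'): (2, 2)}
fileids = {1: ('issue', '1', 'title'), 2: ('issue', '2', 'title')}
words = {'CRASH': {1: 1, 2: 1}, 'LOGIN': {1: 2}}

removed = purge(files, fileids, words, ('issue', '1', 'title'))
print(removed, words)
```

As in the listing, a word whose last occurrence is purged keeps an empty entry in the word dictionary; nothing here prunes it.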
301
302     def index_loaded(self):
6982
e605ddb45701 flake8 - one var rename, import, whitespace
John Rouillard <rouilj@ieee.org>
parents: 6491
diff changeset
303         return (hasattr(self, 'fileids') and hasattr(self, 'files') and
304                 hasattr(self, 'words'))
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
305
306     def rollback(self):
307         ''' load last saved index info. '''
308         self.load_index(reload=1)
309
3613
5f4db2650da3 implement close() on all indexers [SF#1242477]
Richard Jones <richard@users.sourceforge.net>
parents: 3555
diff changeset
310     def close(self):
311         pass
312
313
2089
93f03c6714d8 A few big changes in this commit:
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
314 # vim: set filetype=python ts=4 sw=4 et si
