Mercurial > p > roundup > code
annotate roundup/backends/indexer_common.py @ 6915:9ff091537f43
postgresql native-fts; more indexer tests
1) Make postgresql native-fts actually work.
2) Add simple stopword filtering to sqlite native-fts indexer.
3) Add more tests for indexer_common get_indexer
Details:
1) roundup/backends/indexer_postgresql_fts.py:
ignore ValueError raised if we try to index a string with a null
character in it. This could happen due to an incorrect text/ mime
type on a file that has nulls in it.
Replace ValueError raised by postgresql with customized
IndexerQueryError if a search string has a null in it.
roundup/backends/rdbms_common.py:
Make postgresql native-fts work. When specified it was using using
whatever was returned from get_indexer(). However loading the
native-fts indexer backend failed because there was no connection to
the postgresql database when this call was made.
Simple solution, move the call after the open_connection call in
Database::__init__().
However the open_connection call creates the schema for the
database if it is not there. The schema builds tables for
indexer=native type indexing. As part of the build it looks at the
indexer to see the min/max size of the indexed tokens. No indexer
define, we get a crash.
So it's a a chicken/egg issue. I solved it by setting the indexer
to the Indexer from indexer_common which has the min/max token size
info. I also added a no-op save_indexer to this Indexer class. I
claim save_indexer() isn't needed as a commit() on the db does all
the saving required. Then after open_connection is called, I call
get_indexer to retrieve the correct indexer and
indexer_postgresql_fts woks since the conn connection property is
defined.
roundup/backends/indexer_common.py:
add save_index() method for indexer. It does nothing but is needed
in rdbms backends during schema initialization.
2) roundup/backends/indexer_sqlite_fts.py:
when this indexer is used, the indexer test in DBTest on the word
"the" fail. This is due to missing stopword filtering. Implement
basic stopword filtering for bare stopwords (like 'the') to make the
test pass. Note: this indexer is not currently automatically run by
the CI suite, it was found during manual testing. However there is a
FIXME to extract the indexer tests from DBTest and run it using this
backend.
roundup/configuration.py, roundup/doc/admin_guide.txt:
update doc on stopword use for sqlite native-fts.
test/db_test_base.py:
DBTest::testStringBinary creates a file with nulls in it. It was
breaking postgresql with native-fts indexer. Changed test to assign
mime type application/octet-stream that prevents it from being
processed by any text search indexer.
add test to exclude indexer searching in specific props. This code
path was untested before.
test/test_indexer.py:
add test to call find with no words. Untested code path.
add test to index and find a string with a null \x00 byte. it was
tested inadvertently by testStringBinary but this makes it explicit
and moves it to indexer testing. (one version each for: generic,
postgresql and mysql)
Renamed Get_IndexerAutoSelectTest to Get_IndexerTest and renamed
autoselect tests to include autoselect. Added tests for an invalid
indexer and using native-fts with anydbm (unsupported combo) to make
sure the code does something useful if the validation in
configuration.py is broken.
test/test_liveserver.py:
add test to load an issue
add test using text search (fts) to find the issue
add tests to find issue using postgresql native-fts
test/test_postgresql.py, test/test_sqlite.py:
added explanation on how to setup integration test using native-fts.
added code to clean up test environment if native-fts test is run.
| author | John Rouillard <rouilj@ieee.org> |
|---|---|
| date | Mon, 05 Sep 2022 16:25:20 -0400 |
| parents | 1f9291fbb835 |
| children |
| rev | line source |
|---|---|
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
1 from roundup import hyperdb |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
2 |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
3 STOPWORDS = [ |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
4 "A", "AND", "ARE", "AS", "AT", "BE", "BUT", "BY", |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
5 "FOR", "IF", "IN", "INTO", "IS", "IT", |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
6 "NO", "NOT", "OF", "ON", "OR", "SUCH", |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
7 "THAT", "THE", "THEIR", "THEN", "THERE", "THESE", |
|
3997
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
8 "THEY", "THIS", "TO", "WAS", "WILL", "WITH" |
|
3092
a8c2371f45b6
Some cleanup:
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
3088
diff
changeset
|
9 ] |
|
a8c2371f45b6
Some cleanup:
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
3088
diff
changeset
|
10 |
| 6910 | 11 |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
12 def _isLink(propclass): |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
13 return (isinstance(propclass, hyperdb.Link) or |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
14 isinstance(propclass, hyperdb.Multilink)) |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
15 |
| 6910 | 16 |
|
3613
5f4db2650da3
implement close() on all indexers [SF#1242477]
Richard Jones <richard@users.sourceforge.net>
parents:
3544
diff
changeset
|
17 class Indexer: |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
18 def __init__(self, db): |
|
4089
eddb82d0964c
Add compatibility package to allow us to deal with Python versions 2.3..2.6.
Richard Jones <richard@users.sourceforge.net>
parents:
4017
diff
changeset
|
19 self.stopwords = set(STOPWORDS) |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
20 for word in db.config[('main', 'indexer_stopwords')]: |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
21 self.stopwords.add(word) |
|
6599
39189dd94f2c
issue2551189 - increase size of words in full text index.
John Rouillard <rouilj@ieee.org>
parents:
6588
diff
changeset
|
22 # Do not index anything longer than maxlength characters since |
|
39189dd94f2c
issue2551189 - increase size of words in full text index.
John Rouillard <rouilj@ieee.org>
parents:
6588
diff
changeset
|
23 # that'll be gibberish (encoded text or somesuch) or shorter |
|
39189dd94f2c
issue2551189 - increase size of words in full text index.
John Rouillard <rouilj@ieee.org>
parents:
6588
diff
changeset
|
24 # than 2 characters |
|
4252
2ff6f39aa391
Indexers behaviour made more consistent regarding length of indexed words...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents:
4089
diff
changeset
|
25 self.minlength = 2 |
|
6599
39189dd94f2c
issue2551189 - increase size of words in full text index.
John Rouillard <rouilj@ieee.org>
parents:
6588
diff
changeset
|
26 self.maxlength = 50 |
| 6910 | 27 self.language = db.config[('main', 'indexer_language')] |
|
6588
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
28 # Some indexers have a query language. If that is the case, |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
29 # we don't parse the user supplied query into a wordlist. |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
30 self.query_language = False |
|
3544
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
31 |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
32 def is_stopword(self, word): |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
33 return word in self.stopwords |
|
5cd1c83dea50
Features and fixes.
Richard Jones <richard@users.sourceforge.net>
parents:
3092
diff
changeset
|
34 |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
35 def getHits(self, search_terms, klass): |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
36 return self.find(search_terms) |
|
3997
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
37 |
|
6915
9ff091537f43
postgresql native-fts; more indexer tests
John Rouillard <rouilj@ieee.org>
parents:
6910
diff
changeset
|
38 def save_index(self): |
|
9ff091537f43
postgresql native-fts; more indexer tests
John Rouillard <rouilj@ieee.org>
parents:
6910
diff
changeset
|
39 pass |
|
9ff091537f43
postgresql native-fts; more indexer tests
John Rouillard <rouilj@ieee.org>
parents:
6910
diff
changeset
|
40 |
| 6910 | 41 def search(self, search_terms, klass, ignore=None): |
|
4089
eddb82d0964c
Add compatibility package to allow us to deal with Python versions 2.3..2.6.
Richard Jones <richard@users.sourceforge.net>
parents:
4017
diff
changeset
|
42 """Display search results looking for [search, terms] associated |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
43 with the hyperdb Class "klass". Ignore hits on {class: property}. |
|
4089
eddb82d0964c
Add compatibility package to allow us to deal with Python versions 2.3..2.6.
Richard Jones <richard@users.sourceforge.net>
parents:
4017
diff
changeset
|
44 """ |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
45 # do the index lookup |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
46 hits = self.getHits(search_terms, klass) |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
47 if not hits: |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
48 return {} |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
49 |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
50 designator_propname = {} |
|
5395
23b8e6067f7c
Python 3 preparation: update calls to dict methods.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5388
diff
changeset
|
51 for nm, propclass in klass.getprops().items(): |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
52 if _isLink(propclass): |
|
3751
44603dd791b7
full-text search wasn't coping with multiple multilinks to the same class
Richard Jones <richard@users.sourceforge.net>
parents:
3718
diff
changeset
|
53 designator_propname.setdefault(propclass.classname, |
| 6910 | 54 []).append(nm) |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
55 |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
56 # build a dictionary of nodes and their associated messages |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
57 # and files |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
58 nodeids = {} # this is the answer |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
59 propspec = {} # used to do the klass.find |
| 6910 | 60 for pn in designator_propname.values(): |
| 61 for propname in pn: | |
|
3751
44603dd791b7
full-text search wasn't coping with multiple multilinks to the same class
Richard Jones <richard@users.sourceforge.net>
parents:
3718
diff
changeset
|
62 propspec[propname] = {} # used as a set (value doesn't matter) |
|
3718
0d561b24ceff
support sqlite3
Richard Jones <richard@users.sourceforge.net>
parents:
3613
diff
changeset
|
63 |
| 6910 | 64 if ignore is None: |
| 65 ignore = {} | |
|
3718
0d561b24ceff
support sqlite3
Richard Jones <richard@users.sourceforge.net>
parents:
3613
diff
changeset
|
66 # don't unpack hits entries as sqlite3's Row can't be unpacked :( |
|
0d561b24ceff
support sqlite3
Richard Jones <richard@users.sourceforge.net>
parents:
3613
diff
changeset
|
67 for entry in hits: |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
68 # skip this result if we don't care about this class/property |
|
3718
0d561b24ceff
support sqlite3
Richard Jones <richard@users.sourceforge.net>
parents:
3613
diff
changeset
|
69 classname = entry[0] |
|
0d561b24ceff
support sqlite3
Richard Jones <richard@users.sourceforge.net>
parents:
3613
diff
changeset
|
70 property = entry[2] |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4281
diff
changeset
|
71 if (classname, property) in ignore: |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
72 continue |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
73 |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
74 # if it's a property on klass, it's easy |
|
3998
20c9a1cefb39
make sure item ids are str()
Richard Jones <richard@users.sourceforge.net>
parents:
3997
diff
changeset
|
75 # (make sure the nodeid is str() not unicode() as returned by some |
|
20c9a1cefb39
make sure item ids are str()
Richard Jones <richard@users.sourceforge.net>
parents:
3997
diff
changeset
|
76 # backends as that can cause problems down the track) |
|
20c9a1cefb39
make sure item ids are str()
Richard Jones <richard@users.sourceforge.net>
parents:
3997
diff
changeset
|
77 nodeid = str(entry[1]) |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
78 if classname == klass.classname: |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4281
diff
changeset
|
79 if nodeid not in nodeids: |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
80 nodeids[nodeid] = {} |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
81 continue |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
82 |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
83 # make sure the class is a linked one, otherwise ignore |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4281
diff
changeset
|
84 if classname not in designator_propname: |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
85 continue |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
86 |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
87 # it's a linked class - set up to do the klass.find |
|
3751
44603dd791b7
full-text search wasn't coping with multiple multilinks to the same class
Richard Jones <richard@users.sourceforge.net>
parents:
3718
diff
changeset
|
88 for linkprop in designator_propname[classname]: |
|
44603dd791b7
full-text search wasn't coping with multiple multilinks to the same class
Richard Jones <richard@users.sourceforge.net>
parents:
3718
diff
changeset
|
89 propspec[linkprop][nodeid] = 1 |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
90 |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
91 # retain only the meaningful entries |
|
4359
b9abbdd15259
another module modernised
Richard Jones <richard@users.sourceforge.net>
parents:
4357
diff
changeset
|
92 for propname, idset in list(propspec.items()): |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
93 if not idset: |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
94 del propspec[propname] |
|
3751
44603dd791b7
full-text search wasn't coping with multiple multilinks to the same class
Richard Jones <richard@users.sourceforge.net>
parents:
3718
diff
changeset
|
95 |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
96 # klass.find tells me the klass nodeids the linked nodes relate to |
|
3997
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
97 propdefs = klass.getprops() |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
98 for resid in klass.find(**propspec): |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
99 resid = str(resid) |
|
4017
605f4a7910b4
Small performance-improvement and bug-fix for indexer:
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents:
3998
diff
changeset
|
100 if resid in nodeids: |
| 6910 | 101 continue # we ignore duplicate resids |
|
4017
605f4a7910b4
Small performance-improvement and bug-fix for indexer:
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents:
3998
diff
changeset
|
102 nodeids[resid] = {} |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
103 node_dict = nodeids[resid] |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
104 # now figure out where it came from |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4281
diff
changeset
|
105 for linkprop in propspec: |
|
3997
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
106 v = klass.get(resid, linkprop) |
|
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
107 # the link might be a Link so deal with a single result or None |
|
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
108 if isinstance(propdefs[linkprop], hyperdb.Link): |
| 6910 | 109 if v is None: |
| 110 continue | |
|
3997
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
111 v = [v] |
|
edbb89730dc2
Fix indexer handling of indexed Link properties
Richard Jones <richard@users.sourceforge.net>
parents:
3751
diff
changeset
|
112 for nodeid in v: |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4281
diff
changeset
|
113 if nodeid in propspec[linkprop]: |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
114 # OK, this node[propname] has a winner |
|
4357
13b3155869e0
Beginnings of a big code cleanup / modernisation to make 2to3 happy
Richard Jones <richard@users.sourceforge.net>
parents:
4281
diff
changeset
|
115 if linkprop not in node_dict: |
|
3058
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
116 node_dict[linkprop] = [nodeid] |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
117 else: |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
118 node_dict[linkprop].append(nodeid) |
|
1c063814d567
Move search method duplicated in indexer_dbm and indexer_tsearch2...
Johannes Gijsbers <jlgijsbers@users.sourceforge.net>
parents:
diff
changeset
|
119 return nodeids |
|
3613
5f4db2650da3
implement close() on all indexers [SF#1242477]
Richard Jones <richard@users.sourceforge.net>
parents:
3544
diff
changeset
|
120 |
| 6910 | 121 |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
122 def get_indexer(config, db): |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
123 indexer_name = getattr(config, "INDEXER", "") |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
124 if not indexer_name: |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
125 # Try everything |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
126 try: |
|
5388
d26921b851c3
Python 3 preparation: make relative imports explicit.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5096
diff
changeset
|
127 from .indexer_xapian import Indexer |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
128 return Indexer(db) |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
129 except ImportError: |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
130 pass |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
131 |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
132 try: |
|
5388
d26921b851c3
Python 3 preparation: make relative imports explicit.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5096
diff
changeset
|
133 from .indexer_whoosh import Indexer |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
134 return Indexer(db) |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
135 except ImportError: |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
136 pass |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
137 |
| 6910 | 138 indexer_name = "native" # fallback to native full text search |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
139 |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
140 if indexer_name == "xapian": |
|
5388
d26921b851c3
Python 3 preparation: make relative imports explicit.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5096
diff
changeset
|
141 from .indexer_xapian import Indexer |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
142 return Indexer(db) |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
143 |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
144 if indexer_name == "whoosh": |
|
5388
d26921b851c3
Python 3 preparation: make relative imports explicit.
Joseph Myers <jsm@polyomino.org.uk>
parents:
5096
diff
changeset
|
145 from .indexer_whoosh import Indexer |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
146 return Indexer(db) |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
147 |
|
6588
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
148 if indexer_name == "native-fts": |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
149 if db.dbtype not in ("sqlite", "postgres"): |
| 6910 | 150 raise AssertionError("Indexer native-fts is configured, but only " |
| 151 "sqlite and postgres support it. Database is: %r" % db.dbtype) | |
|
6588
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
152 |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
153 if db.dbtype == "sqlite": |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
154 from roundup.backends.indexer_sqlite_fts import Indexer |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
155 return Indexer(db) |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
156 |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
157 if db.dbtype == "postgres": |
|
6908
3260926d7e7e
fix postgresl-fts indexer: get_indexer reported not implemented
John Rouillard <rouilj@ieee.org>
parents:
6599
diff
changeset
|
158 from roundup.backends.indexer_postgresql_fts import Indexer |
|
6588
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
159 return Indexer(db) |
|
91ab3e0ffcd0
Summary: Add test cases for sqlite fts
John Rouillard <rouilj@ieee.org>
parents:
6353
diff
changeset
|
160 |
|
5096
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
161 if indexer_name == "native": |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
162 # load proper native indexing based on database type |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
163 if db.dbtype == "anydbm": |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
164 from roundup.backends.indexer_dbm import Indexer |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
165 return Indexer(db) |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
166 |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
167 if db.dbtype in ("sqlite", "postgres", "mysql"): |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
168 from roundup.backends.indexer_rdbms import Indexer |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
169 return Indexer(db) |
|
e74c3611b138
- issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents:
4687
diff
changeset
|
170 |
| 6910 | 171 raise AssertionError("Invalid indexer: %r" % (indexer_name)) |
