annotate test/test_indexer.py @ 5096:e74c3611b138

- issue2550636, issue2550909: Added support for Whoosh indexer. Also adds a new config.ini setting called indexer to select the indexer. See ``doc/upgrading.txt`` for details.

Initial patch done by David Wolever. Patch modified (see ticket or below for changes), docs updated and committed.

I have an outstanding issue with test/test_indexer.py: I have to comment out all imports and tests for indexers I don't have (i.e. mysql, postgres), otherwise no tests run. With that change made, the dbm, sqlite (rdbms), xapian and whoosh indexes all pass the indexer tests.

Changes summary:

1) Support native back ends dbm and rdbms (the original patch only fell through to dbm).
2) Developed a whoosh stopfilter to not index stopwords or words outside the maxlength and minlength limits defined in index_common.py. Required to pass the extremewords test_indexer test. Also removed a call to .lower on the input text, as the tokenizer I chose lowercases automatically.
3) Added support for max/min length to find. This was needed to pass the extremewords test.
4) Added back a call to save_index in add_text. This allowed all but two tests to pass.
5) Fixed a call to: results = searcher.search(query.Term("identifier", identifier)) which had an extra parameter that is an error under current whoosh.
6) Set limit=None in the search call for find(), otherwise it only returns 10 items. This allowed it to pass the manyresults test.

Also, due to changes in the roundup code, removed the import in indexer_whoosh of from roundup.anypy.sets_ import set, since we use the Python builtin set.
author John Rouillard <rouilj@ieee.org>
date Sat, 25 Jun 2016 20:10:03 -0400
parents c977f3530944
children 37d1e24fb941
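
Items 2) and 3) of the changes summary describe the same filtering rule applied on both sides: words are dropped at index time and again at query time if they are stopwords or fall outside the length limits. The following is a minimal pure-Python sketch of that rule, not the Whoosh filter from the patch. The names `indexable_words`, `STOPWORDS`, `MIN_LENGTH` and `MAX_LENGTH` are hypothetical, and the limits 2 and 25 are assumptions inferred from test_extremewords (a one-letter word is rejected, "py" is accepted, a 26-letter word is rejected).

```python
import re

# Assumed values for illustration only; the real limits live in roundup.
MIN_LENGTH = 2
MAX_LENGTH = 25
STOPWORDS = frozenset(["a", "and", "the", "with"])


def indexable_words(text, stopwords=STOPWORDS,
                    minlength=MIN_LENGTH, maxlength=MAX_LENGTH):
    """Return the lowercased words of text that would be indexed.

    A word is kept only if its length is within [minlength, maxlength]
    and it is not a stopword.
    """
    words = re.findall(r"\w+", text.lower())
    return [w for w in words
            if minlength <= len(w) <= maxlength and w not in stopwords]


# Index side: only the significant words of a document are stored.
doc_words = indexable_words("blah py b abcdefghijklmnopqrstuvwxyz")

# Query side: the same rule is applied to the search terms, so terms
# that could never have been indexed do not make the search fail.
query_words = indexable_words("b world abcdefghijklmnopqrstuvwxyz b")
```

Applying one function to both the indexed text and the query terms keeps the two sides consistent, which is what the comment in test_extremewords about "length(word)>=2" versus "length(word)>=3" is guarding against.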
rev   line source
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
1 # Copyright (c) 2002 ekit.com Inc (http://www.ekit-inc.com/)
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
2 #
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
3 # Permission is hereby granted, free of charge, to any person obtaining a copy
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
4 # of this software and associated documentation files (the "Software"), to deal
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
5 # in the Software without restriction, including without limitation the rights
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
6 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
7 # copies of the Software, and to permit persons to whom the Software is
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
8 # furnished to do so, subject to the following conditions:
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
9 #
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
10 # The above copyright notice and this permission notice shall be included in
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
11 # all copies or substantial portions of the Software.
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
12 #
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
13 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
14 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
15 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
16 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
17 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
18 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
19 # SOFTWARE.
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
20
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
21 import os, unittest, shutil
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
22
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
23 import pytest
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
24 from roundup.backends import get_backend, have_backend
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
25 from roundup.backends.indexer_rdbms import Indexer
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
26
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
27 # borrow from other tests
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
28 from db_test_base import setupSchema, config
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
29 from .test_postgresql import postgresqlOpener, skip_postgresql
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
30 from .test_mysql import mysqlOpener, skip_mysql
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
31 from test_sqlite import sqliteOpener
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
32
5038
c977f3530944 Work-around for pytest.mark.skipif() bug
John Kristensen <john@jerrykan.com>
parents: 5037
diff changeset
33 # FIX: workaround for a bug in pytest.mark.skipif():
c977f3530944 Work-around for pytest.mark.skipif() bug
John Kristensen <john@jerrykan.com>
parents: 5037
diff changeset
34 # https://github.com/pytest-dev/pytest/issues/568
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
35 try:
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
36 import xapian
5038
c977f3530944 Work-around for pytest.mark.skipif() bug
John Kristensen <john@jerrykan.com>
parents: 5037
diff changeset
37 skip_xapian = lambda func, *args, **kwargs: func
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
38 except ImportError:
5038
c977f3530944 Work-around for pytest.mark.skipif() bug
John Kristensen <john@jerrykan.com>
parents: 5037
diff changeset
39 skip_xapian = pytest.skip(
c977f3530944 Work-around for pytest.mark.skipif() bug
John Kristensen <john@jerrykan.com>
parents: 5037
diff changeset
40 "Skipping Xapian indexer tests: 'xapian' not installed")
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
41
5096
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
42 try:
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
43 import whoosh
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
44 skip_whoosh = lambda func, *args, **kwargs: func
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
45 except ImportError:
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
46 skip_whoosh = pytest.skip(
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
47 "Skipping Whoosh indexer tests: 'whoosh' not installed")
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
48
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
49 class db:
3546
a4edd24c32be test fixes and checking of indexer overwrites (xapian currently fails)
Richard Jones <richard@users.sourceforge.net>
parents: 3297
diff changeset
50 class config(dict):
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
51 DATABASE = 'test-index'
3546
a4edd24c32be test fixes and checking of indexer overwrites (xapian currently fails)
Richard Jones <richard@users.sourceforge.net>
parents: 3297
diff changeset
52 config = config()
a4edd24c32be test fixes and checking of indexer overwrites (xapian currently fails)
Richard Jones <richard@users.sourceforge.net>
parents: 3297
diff changeset
53 config[('main', 'indexer_stopwords')] = []
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
54
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
55 class IndexerTest(unittest.TestCase):
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
56 def setUp(self):
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
57 if os.path.exists('test-index'):
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
58 shutil.rmtree('test-index')
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
59 os.mkdir('test-index')
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
60 os.mkdir('test-index/files')
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
61 from roundup.backends.indexer_dbm import Indexer
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
62 self.dex = Indexer(db)
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
63 self.dex.load_index()
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
64
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
65 def assertSeqEqual(self, s1, s2):
4102
dcca66d56815 fix unit test compatibility
Richard Jones <richard@users.sourceforge.net>
parents: 4016
diff changeset
66 # First argument is the db result we're testing, second is the
dcca66d56815 fix unit test compatibility
Richard Jones <richard@users.sourceforge.net>
parents: 4016
diff changeset
67 # desired result. Some db results don't have iterable rows, so we
dcca66d56815 fix unit test compatibility
Richard Jones <richard@users.sourceforge.net>
parents: 4016
diff changeset
68 # have to work around that.
4015
6eec11b197aa fix for indexer-test:
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4008
diff changeset
69 # Also work around some dbs not returning items in the expected
4102
dcca66d56815 fix unit test compatibility
Richard Jones <richard@users.sourceforge.net>
parents: 4016
diff changeset
70 # order.
dcca66d56815 fix unit test compatibility
Richard Jones <richard@users.sourceforge.net>
parents: 4016
diff changeset
71 s1 = list([tuple([r[n] for n in range(len(r))]) for r in s1])
4015
6eec11b197aa fix for indexer-test:
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4008
diff changeset
72 s1.sort()
4102
dcca66d56815 fix unit test compatibility
Richard Jones <richard@users.sourceforge.net>
parents: 4016
diff changeset
73 if s1 != s2:
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
74 self.fail('contents of %r != %r'%(s1, s2))
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
75
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
76 def test_basics(self):
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
77 self.dex.add_text(('test', '1', 'foo'), 'a the hello world')
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
78 self.dex.add_text(('test', '2', 'foo'), 'blah blah the world')
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
79 self.assertSeqEqual(self.dex.find(['world']), [('test', '1', 'foo'),
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
80 ('test', '2', 'foo')])
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
81 self.assertSeqEqual(self.dex.find(['blah']), [('test', '2', 'foo')])
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
82 self.assertSeqEqual(self.dex.find(['blah', 'hello']), [])
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
83
3547
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
84 def test_change(self):
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
85 self.dex.add_text(('test', '1', 'foo'), 'a the hello world')
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
86 self.dex.add_text(('test', '2', 'foo'), 'blah blah the world')
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
87 self.assertSeqEqual(self.dex.find(['world']), [('test', '1', 'foo'),
3547
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
88 ('test', '2', 'foo')])
3546
a4edd24c32be test fixes and checking of indexer overwrites (xapian currently fails)
Richard Jones <richard@users.sourceforge.net>
parents: 3297
diff changeset
89 self.dex.add_text(('test', '1', 'foo'), 'a the hello')
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
90 self.assertSeqEqual(self.dex.find(['world']), [('test', '2', 'foo')])
3546
a4edd24c32be test fixes and checking of indexer overwrites (xapian currently fails)
Richard Jones <richard@users.sourceforge.net>
parents: 3297
diff changeset
91
3547
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
92 def test_clear(self):
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
93 self.dex.add_text(('test', '1', 'foo'), 'a the hello world')
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
94 self.dex.add_text(('test', '2', 'foo'), 'blah blah the world')
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
95 self.assertSeqEqual(self.dex.find(['world']), [('test', '1', 'foo'),
3547
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
96 ('test', '2', 'foo')])
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
97 self.dex.add_text(('test', '1', 'foo'), '')
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
98 self.assertSeqEqual(self.dex.find(['world']), [('test', '2', 'foo')])
3547
7728ee93efd2 fix reindexing in Xapian
Richard Jones <richard@users.sourceforge.net>
parents: 3546
diff changeset
99
4251
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
100 def test_stopwords(self):
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
101 """Test that we can find a text with a stopword in it."""
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
102 stopword = "with"
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
103 self.assert_(self.dex.is_stopword(stopword.upper()))
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
104 self.dex.add_text(('test', '1', 'bar'), '%s hello world' % stopword)
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
105 self.dex.add_text(('test', '2', 'bar'), 'blah a %s world' % stopword)
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
106 self.dex.add_text(('test', '3', 'bar'), 'blah Blub river')
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
107 self.dex.add_text(('test', '4', 'bar'), 'blah river %s' % stopword)
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
108 self.assertSeqEqual(self.dex.find(['with','world']),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
109 [('test', '1', 'bar'),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
110 ('test', '2', 'bar')])
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
111 def test_extremewords(self):
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
112 """Testing too short or too long words."""
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
113 short = "b"
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
114 long = "abcdefghijklmnopqrstuvwxyz"
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
115 self.dex.add_text(('test', '1', 'a'), '%s hello world' % short)
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
116 self.dex.add_text(('test', '2', 'a'), 'blah a %s world' % short)
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
117 self.dex.add_text(('test', '3', 'a'), 'blah Blub river')
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
118 self.dex.add_text(('test', '4', 'a'), 'blah river %s %s'
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
119 % (short, long))
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
120 self.assertSeqEqual(self.dex.find([short,'world', long, short]),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
121 [('test', '1', 'a'),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
122 ('test', '2', 'a')])
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
123 self.assertSeqEqual(self.dex.find([long]),[])
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
124
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
125 # special test because some faulty code indexed length(word)>=2
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
126 # but only considered length(word)>=3 to be significant
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
127 self.dex.add_text(('test', '5', 'a'), 'blah py %s %s'
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
128 % (short, long))
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
129 self.assertSeqEqual(self.dex.find(["py"]), [('test', '5', 'a')])
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
130
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
131 def test_casesensitity(self):
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
132 """Test if searches are case-in-sensitive."""
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
133 self.dex.add_text(('test', '1', 'a'), 'aaaa bbbb')
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
134 self.dex.add_text(('test', '2', 'a'), 'aAaa BBBB')
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
135 self.assertSeqEqual(self.dex.find(['aaaa']),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
136 [('test', '1', 'a'),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
137 ('test', '2', 'a')])
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
138 self.assertSeqEqual(self.dex.find(['BBBB']),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
139 [('test', '1', 'a'),
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
140 ('test', '2', 'a')])
2b1241daaa20 Added more indexer tests for stopwords, case-insensitity...
Bernhard Reiter <Bernhard.Reiter@intevation.de>
parents: 4102
diff changeset
141
4314
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
142 def test_wordsplitting(self):
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
143 """Test if word splitting works."""
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
144 self.dex.add_text(('test', '1', 'a'), 'aaaa-aaa bbbb*bbb')
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
145 self.dex.add_text(('test', '2', 'a'), 'aaaA-aaa BBBB*BBB')
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
146 for k in 'aaaa', 'aaa', 'bbbb', 'bbb':
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
147 self.assertSeqEqual(self.dex.find([k]),
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
148 [('test', '1', 'a'), ('test', '2', 'a')])
b41a033bffcc - add a small word-splitting test for the indexers...
Ralf Schlatterbeck <schlatterbeck@users.sourceforge.net>
parents: 4251
diff changeset
149
4841
3ff1a288fb9c issue2550583, issue2550635 Do not limit results with Xapian indexer
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4570
diff changeset
150 def test_manyresults(self):
3ff1a288fb9c issue2550583, issue2550635 Do not limit results with Xapian indexer
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4570
diff changeset
151 """Test if searches find many results."""
3ff1a288fb9c issue2550583, issue2550635 Do not limit results with Xapian indexer
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4570
diff changeset
152 for i in range(123):
3ff1a288fb9c issue2550583, issue2550635 Do not limit results with Xapian indexer
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4570
diff changeset
153 self.dex.add_text(('test', str(i), 'many'), 'many')
3ff1a288fb9c issue2550583, issue2550635 Do not limit results with Xapian indexer
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4570
diff changeset
154 self.assertEqual(len(self.dex.find(['many'])), 123)
3ff1a288fb9c issue2550583, issue2550635 Do not limit results with Xapian indexer
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4570
diff changeset
155
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
156 def tearDown(self):
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
157 shutil.rmtree('test-index')
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
158
5096
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
159 @skip_whoosh
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
160 class WhooshIndexerTest(IndexerTest):
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
161 def setUp(self):
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
162 if os.path.exists('test-index'):
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
163 shutil.rmtree('test-index')
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
164 os.mkdir('test-index')
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
165 from roundup.backends.indexer_whoosh import Indexer
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
166 self.dex = Indexer(db)
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
167 def tearDown(self):
e74c3611b138 - issue2550636, issue2550909: Added support for Whoosh indexer.
John Rouillard <rouilj@ieee.org>
parents: 5038
diff changeset
168 shutil.rmtree('test-index')
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
169
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
170 @skip_xapian
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
171 class XapianIndexerTest(IndexerTest):
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
172     def setUp(self):
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
173         if os.path.exists('test-index'):
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
174             shutil.rmtree('test-index')
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
175         os.mkdir('test-index')
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
176         from roundup.backends.indexer_xapian import Indexer
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
177         self.dex = Indexer(db)
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
178     def tearDown(self):
3297
8f7dc283bfa5 some more Xapian stuff (doc, test fixes)
Richard Jones <richard@users.sourceforge.net>
parents: 3295
diff changeset
179         shutil.rmtree('test-index')
3295
a615cc230160 added Xapian indexer; replaces standard indexers if Xapian is available
Richard Jones <richard@users.sourceforge.net>
parents: 3078
diff changeset
180
5033
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
181
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
182 class RDBMSIndexerTest(object):
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
183     def setUp(self):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
184         # remove previous test, ignore errors
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
185         if os.path.exists(config.DATABASE):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
186             shutil.rmtree(config.DATABASE)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
187         self.db = self.module.Database(config, 'admin')
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
188         self.dex = Indexer(self.db)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
189     def tearDown(self):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
190         if hasattr(self, 'db'):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
191             self.db.close()
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
192         if os.path.exists(config.DATABASE):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
193             shutil.rmtree(config.DATABASE)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
194
5033
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
195
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
196 @skip_postgresql
5033
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
197 class postgresqlIndexerTest(postgresqlOpener, RDBMSIndexerTest, IndexerTest):
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
198     def setUp(self):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
199         postgresqlOpener.setUp(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
200         RDBMSIndexerTest.setUp(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
201     def tearDown(self):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
202         RDBMSIndexerTest.tearDown(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
203         postgresqlOpener.tearDown(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
204
5033
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
205
5036
380d8d8b30a3 Replace existing run_tests.py script with a pytest script
John Kristensen <john@jerrykan.com>
parents: 5033
diff changeset
206 @skip_mysql
5033
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
207 class mysqlIndexerTest(mysqlOpener, RDBMSIndexerTest, IndexerTest):
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
208     def setUp(self):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
209         mysqlOpener.setUp(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
210         RDBMSIndexerTest.setUp(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
211     def tearDown(self):
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
212         RDBMSIndexerTest.tearDown(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
213         mysqlOpener.tearDown(self)
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
214
5033
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
215
63c79c0992ae Update tests to work with py.test
John Kristensen <john@jerrykan.com>
parents: 4841
diff changeset
216 class sqliteIndexerTest(sqliteOpener, RDBMSIndexerTest, IndexerTest):
4008
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
217     pass
0bf9f8ae7d1b fix bug introduced in 1.4.5 in RDBMS full-text indexing;
Richard Jones <richard@users.sourceforge.net>
parents: 3547
diff changeset
218
848
2a928d404af8 ehem, forgot to add
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
219 # vim: set filetype=python ts=4 sw=4 et si
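
The backend-specific classes in the listing above combine an opener mixin with shared indexer fixtures and call each base's setUp and tearDown explicitly, tearing down in the reverse order of setup. The following is a minimal sketch of that pattern, not Roundup code: `FakeOpener`, `DirIndexerMixin`, `SharedTests` and `backend_available` are hypothetical stand-ins, and the standard library's `unittest.skipUnless` is assumed here in place of the `skip_xapian`, `skip_postgresql` and `skip_mysql` markers, whose definitions are not shown in this part of the file.

```python
import importlib.util
import os
import shutil
import tempfile
import unittest


def backend_available(module_name):
    """Return True if a top-level optional module can be imported (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


class FakeOpener(object):
    """Hypothetical stand-in for an opener mixin such as sqliteOpener."""

    def setUp(self):
        # Start a log so the call order can be inspected afterwards.
        self.events = ['opener.setUp']

    def tearDown(self):
        self.events.append('opener.tearDown')


class DirIndexerMixin(object):
    """Hypothetical stand-in for RDBMSIndexerTest: owns a scratch directory."""

    def setUp(self):
        self.index_dir = tempfile.mkdtemp(prefix='test-index-')
        self.events.append('indexer.setUp')

    def tearDown(self):
        # Guard the cleanup so a setUp that failed early does not raise again here.
        if hasattr(self, 'index_dir') and os.path.exists(self.index_dir):
            shutil.rmtree(self.index_dir)
        self.events.append('indexer.tearDown')


class SharedTests(object):
    """Tests every backend inherits; not a TestCase, so it is never run by itself."""

    def test_scratch_dir_exists(self):
        self.assertTrue(os.path.isdir(self.index_dir))


# 'json' is always importable, so this class runs; a missing module would skip it.
@unittest.skipUnless(backend_available('json'), 'backend not installed')
class FakeBackendIndexerTest(FakeOpener, DirIndexerMixin, SharedTests,
                             unittest.TestCase):
    def setUp(self):
        # Opener first, then the indexer fixture that depends on it.
        FakeOpener.setUp(self)
        DirIndexerMixin.setUp(self)

    def tearDown(self):
        # Reverse order: release the indexer fixture before the opener.
        DirIndexerMixin.tearDown(self)
        FakeOpener.tearDown(self)
```

Calling each base explicitly, rather than relying on `super()`, keeps the setup and teardown order visible in the concrete class, which matters when one fixture depends on another.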

Roundup Issue Tracker: http://roundup-tracker.org/