Mercurial > p > roundup > code
annotate test/test_token.py @ 6588:91ab3e0ffcd0
Summary: Add test cases for sqlite fts
Add support for using the FTS5 full text query engine for sqlite.
Also stubbed out some sections for adding postgresql FTS support as
well.
Added nee indexer type native-fts. It is not selected by default. The
indexer=native is used if no indexer is set. This prevents an upgrade
from seeming to wipe out the native index if upgraded and
indexer=native is not explicitly set.
Docs updated. Also changed section headers to sentence case for the
current release notes.
Indexing backend can control if the full text search phrase is broken
into a list of words or passed intact. For backends with query
languages (sqlite and can be enabled for whoosh and xapian) we do not
want the phrase "tokenized" on whitespace.
This also updates the rdbms database version to version 7 to add FTS
table. I will be using the same version when I add postgresql. If
somebody runs this version on postgresql, they will have to manually
add the fts tables for postgresql if they want to use it.
Added a new renderError method to client. This allows errors to be
reported still using page.html rather than raw html. It also supports
templates for any error code. If no template for the error code
(e.g. 400) is found, the error in raw html with no page frame is
shown.
New IndexerQueryError exception to pass back message about query syntax
errors.
| author | John Rouillard <rouilj@ieee.org> |
|---|---|
| date | Sun, 23 Jan 2022 18:57:45 -0500 |
| parents | 364c54991861 |
| children | 6971c9249c6d |
| rev | line source |
|---|---|
|
470
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
1 # |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
2 # Copyright (c) 2001 Richard Jones |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
3 # This module is free software, and you may redistribute it and/or modify |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
4 # under the same terms as Python, so long as this copyright message and |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
5 # disclaimer are retained in their original form. |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
6 # |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
7 # This module is distributed in the hope that it will be useful, |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
8 # but WITHOUT ANY WARRANTY; without even the implied warranty of |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
9 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
10 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
11 import unittest, time |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
12 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
13 from roundup.token import token_split |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
14 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
15 class TokenTestCase(unittest.TestCase): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
16 def testValid(self): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
17 l = token_split('hello world') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
18 self.assertEqual(l, ['hello', 'world']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
19 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
20 def testIgnoreExtraSpace(self): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
21 l = token_split('hello world ') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
22 self.assertEqual(l, ['hello', 'world']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
23 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
24 def testQuoting(self): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
25 l = token_split('"hello world"') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
26 self.assertEqual(l, ['hello world']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
27 l = token_split("'hello world'") |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
28 self.assertEqual(l, ['hello world']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
29 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
30 def testEmbedQuote(self): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
31 l = token_split(r'Roch\'e Compaan') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
32 self.assertEqual(l, ["Roch'e", "Compaan"]) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
33 l = token_split('address="1 2 3"') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
34 self.assertEqual(l, ['address=1 2 3']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
35 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
36 def testEscaping(self): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
37 l = token_split('"Roch\'e" Compaan') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
38 self.assertEqual(l, ["Roch'e", "Compaan"]) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
39 l = token_split(r'hello\ world') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
40 self.assertEqual(l, ['hello world']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
41 l = token_split(r'\\') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
42 self.assertEqual(l, ['\\']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
43 l = token_split(r'\n') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
44 self.assertEqual(l, ['\n']) |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
45 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
46 def testBadQuote(self): |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
47 self.assertRaises(ValueError, token_split, '"hello world') |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
48 self.assertRaises(ValueError, token_split, "Roch'e Compaan") |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
49 |
|
9f7320624bc2
Added better tokenising to roundup-admin - handles spaces and stuff.
Richard Jones <richard@users.sourceforge.net>
parents:
diff
changeset
|
50 # vim: set filetype=python ts=4 sw=4 et si |
