changeset 6588:91ab3e0ffcd0

Summary: Add test cases for sqlite fts Add support for using the FTS5 full text query engine for sqlite. Also stubbed out some sections for adding postgresql FTS support as well. Added nee indexer type native-fts. It is not selected by default. The indexer=native is used if no indexer is set. This prevents an upgrade from seeming to wipe out the native index if upgraded and indexer=native is not explicitly set. Docs updated. Also changed section headers to sentence case for the current release notes. Indexing backend can control if the full text search phrase is broken into a list of words or passed intact. For backends with query languages (sqlite and can be enabled for whoosh and xapian) we do not want the phrase "tokenized" on whitespace. This also updates the rdbms database version to version 7 to add FTS table. I will be using the same version when I add postgresql. If somebody runs this version on postgresql, they will have to manually add the fts tables for postgresql if they want to use it. Added a new renderError method to client. This allows errors to be reported still using page.html rather than raw html. It also supports templates for any error code. If no template for the error code (e.g. 400) is found, the error in raw html with no page frame is shown. New IndexerQueryError exception to pass back message about query syntax errors.
author John Rouillard <rouilj@ieee.org>
date Sun, 23 Jan 2022 18:57:45 -0500
parents 4f8fc55730e1
children 5ce396880899
files CHANGES.txt doc/installation.txt doc/upgrading.txt roundup/backends/back_sqlite.py roundup/backends/indexer_common.py roundup/backends/indexer_sqlite_fts.py roundup/backends/rdbms_common.py roundup/cgi/actions.py roundup/cgi/client.py roundup/cgi/exceptions.py roundup/cgi/templating.py roundup/configuration.py test/db_test_base.py test/test_cgi.py test/test_indexer.py
diffstat 15 files changed, 581 insertions(+), 25 deletions(-) [+]
line wrap: on
line diff
--- a/CHANGES.txt	Sat Jan 22 01:33:06 2022 -0500
+++ b/CHANGES.txt	Sun Jan 23 18:57:45 2022 -0500
@@ -122,6 +122,11 @@
   as the secret. This allows committing config.ini to a VCS. (John
   Rouillard)
 - Added xapian indexer to Docker container. (John Rouillard)
+- Add support for indexer type native-fts to use FTS5 for sqlite
+  databases. (John Rouillard)
+- Add better error display to the user. Needed to expose errors in fts5
+  search syntax to the user while also displaying the template page
+  structure. (John Rouillard)
 
 2021-07-13 2.1.0
 
--- a/doc/installation.txt	Sat Jan 22 01:33:06 2022 -0500
+++ b/doc/installation.txt	Sun Jan 23 18:57:45 2022 -0500
@@ -517,6 +517,10 @@
 
   Installed SQLite should be the latest version available (3.3.8 is known
   to work, 3.1.3 is known to have problems).
+
+  Roundup supports using sqlite's full text search capability. This
+  can improve searching if you are not installing another indexer like
+  xapian or whoosh. It works best with English text.
 **postgresql**
   Backend for popular RDBMS PostgreSQL. You must read doc/postgresql.txt for
   additional installation steps and requirements. You must also configure
--- a/doc/upgrading.txt	Sat Jan 22 01:33:06 2022 -0500
+++ b/doc/upgrading.txt	Sun Jan 23 18:57:45 2022 -0500
@@ -35,7 +35,7 @@
 Migrating from 2.1.0 to 2.x.y
 =============================
 
-Check Compression Settings
+Check compression settings
 --------------------------
 
 Read the `administration guide`_ section on 'Configuring Compression'.
@@ -50,13 +50,55 @@
 compression settings as you want. Then replace ``config.ini`` with the
 ``newconfig.ini`` file.
 
-Search Added to User Index Page
+Search added to user index page
 -------------------------------
 
 A search form and count of number of hits has been added to the
 ``user.index.html`` template page in the classic template. You may
 want to merge the search form and footer into your template.
 
+Enhanced full-text search
+-------------------------
+
+SQLite's `FTS5 full-text search engine`_ is available. It should work
+with any recent sqlite3 installation. Any full text search field will
+allow searching using the MATCH query format described at:
+https://www.sqlite.org/fts5.html#full_text_query_syntax. A list of
+words behaves almost the same as the default text search
+(`native`). So the search string `fts search` will find all issues
+that have both of those words (an AND search) in a text-field (like
+title) or in a message (or file) attached to the issue.
+
+One thing to note is that the fts search does not ignore words longer
+than 25 characters or less than 2 characters. Also it does not filter
+out common works (i.e. there is no stoplist). So words like "and",
+"or", "then", "with" ...  are included in the FTS5 search. The native
+search applies both filters.
+
+Using SQLite FTS requires a schema change so you should run
+`roundup-admin -i tracker_home migrate` as the FTS specific tables
+need to be created.
+
+Then you must explicitly enable it by changing the `indexer` setting
+in `config.ini` to `native-fts`. Native-fts is never chosen by default
+like xapian or whoosh. This prevents the existing native indexing from
+being discarded if `indexer` is not set.
+
+Next re-index your data with `roundup-admin -i tracker_home
+reindex`. This can take a while depending on the size of the tracker.
+
+You may want to update your `config.ini` by following the directions
+above to get the latest documentation.
+
+If you are happy with the fts indexing, you can save some space by
+removing the data from the native text indexing tables. This requires
+using the `sqlite3` command to delete the rows in the `__textids` and
+`__words` tables. You can do this with the following sqlite3
+commands::
+
+   delete from __words;
+   delete from __textids;
+
 
 Migrating from 2.0.0 to 2.1.0
 =============================
@@ -3270,3 +3312,4 @@
 .. _`security documentation`: security.html
 .. _`administration guide`: admin_guide.html
 .. _`xmlrpc guide`: xmlrpc.html
+.. _FTS5 full-text search engine: https://www.sqlite.org/fts5.html
--- a/roundup/backends/back_sqlite.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/backends/back_sqlite.py	Sun Jan 23 18:57:45 2022 -0500
@@ -174,6 +174,7 @@
             self.sql('create table ids (name varchar, num integer)')
             self.sql('create index ids_name_idx on ids(name)')
             self.create_version_2_tables()
+            self._add_fts5_table()
 
     def create_version_2_tables(self):
         self.sql('create table otks (otk_key varchar, '
@@ -212,6 +213,15 @@
         # NOOP - no restriction on column length here
         pass
 
+    def _add_fts5_table(self):
+        self.sql('CREATE virtual TABLE __fts USING fts5(_class, '
+                 '_itemid, _prop, _textblob)'
+        )
+
+    def fix_version_6_tables(self):
+        # Add native full-text indexing table
+        self._add_fts5_table()
+
     def update_class(self, spec, old_spec, force=0, adding_v2=0):
         """ Determine the differences between the current spec and the
             database version of the spec, and update where necessary.
--- a/roundup/backends/indexer_common.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/backends/indexer_common.py	Sun Jan 23 18:57:45 2022 -0500
@@ -24,6 +24,9 @@
         self.minlength = 2
         self.maxlength = 25
         self.language = db.config[('main','indexer_language')]
+        # Some indexers have a query language. If that is the case,
+        # we don't parse the user supplied query into a wordlist.
+        self.query_language = False
 
     def is_stopword(self, word):
         return word in self.stopwords
@@ -134,6 +137,19 @@
         from .indexer_whoosh import Indexer
         return Indexer(db)
 
+    if indexer_name == "native-fts":
+        if db.dbtype not in ("sqlite", "postgres"):
+            raise AssertionError("Indexer native-fts is configured, but only sqlite and postgres support it. Database is: %r" % db.dbtype)
+
+        if db.dbtype == "sqlite":
+            from roundup.backends.indexer_sqlite_fts import Indexer
+            return Indexer(db)
+
+        if db.dbtype == "postgres":
+            raise NotImplementedError("Postgres FTS not available")
+            from roundup.backends.indexer_postgres_fts import Indexer
+            return Indexer(db)
+
     if indexer_name == "native":
         # load proper native indexing based on database type
         if db.dbtype == "anydbm":
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/roundup/backends/indexer_sqlite_fts.py	Sun Jan 23 18:57:45 2022 -0500
@@ -0,0 +1,123 @@
+""" This implements the full-text indexer using fts5 in sqlite.
+The table consists of (Class, propname, itemid) instances as columns
+along with a textblob column. The textblob column is searched using
+MATCH and the instances returned.
+
+sqlite test commands to manage schema version change required by
+this update.
+
+-- check length before and after
+select length(schema) from schema;
+
+-- reset from version 7 (with fts index) to version 6
+ update schema set schema = (select replace(schema,
+   '''version'': 7','''version'': 6') as new_schema from schema);
+
+-- check version. Good thing it's at the front of the schema
+ select substr(schema,0,15) from schema;
+ {'version': 6,
+"""
+
+from roundup.backends.indexer_common import Indexer as IndexerBase
+from roundup.i18n import _
+from roundup.cgi.exceptions import IndexerQueryError
+
+try:
+    import sqlite3 as sqlite
+    if sqlite.sqlite_version_info < (3, 9, 0):
+        raise ValueError('sqlite minimum version for FTS5 is 3.9.0+ '
+                         '- %s found' % sqlite.sqlite_version)
+except ImportError:
+    raise ValueError('Unable to import sqlite3 to support FTS.')
+
+
+class Indexer(IndexerBase):
+    def __init__(self, db):
+        IndexerBase.__init__(self, db)
+        self.db = db
+        self.reindex = 0
+        self.query_language = True
+
+    def close(self):
+        """close the indexing database"""
+        # just nuke the circular reference
+        self.db = None
+
+    def save_index(self):
+        """Save the changes to the index."""
+        # not necessary - the RDBMS connection will handle this for us
+        pass
+
+    def force_reindex(self):
+        """Force a reindexing of the database.  This essentially
+        empties the __fts table and sets a flag so
+        that the databases are reindexed"""
+        self.reindex = 1
+
+    def should_reindex(self):
+        """returns True if the indexes need to be rebuilt"""
+        return self.reindex
+
+    def add_text(self, identifier, text, mime_type='text/plain'):
+        """ "identifier" is  (classname, itemid, property) """
+        if mime_type != 'text/plain':
+            return
+
+        # Ensure all elements of the identifier are strings 'cos the itemid
+        # column is varchar even if item ids may be numbers elsewhere in the
+        # code. ugh.
+        identifier = tuple(map(str, identifier))
+
+        # removed pre-processing of text that incudes only words with:
+        # self.minlength <= len(word) <= self.maxlength
+        # Not sure if that is correct.
+
+        # first, find the rowid of the (classname, itemid, property)
+        a = self.db.arg  # arg is the token for positional parameters
+        sql = 'select rowid from __fts where _class=%s and '\
+            '_itemid=%s and _prop=%s' % (a, a, a)
+        self.db.cursor.execute(sql, identifier)
+        r = self.db.cursor.fetchone()
+        if not r:
+            # not previously indexed
+            sql = 'insert into __fts (_class, _itemid, _prop, _textblob)'\
+                ' values (%s, %s, %s, %s)' % (a, a, a, a)
+            self.db.cursor.execute(sql, identifier + (text,))
+        else:
+            id = int(r[0])
+            sql = 'update __fts set _textblob=%s where rowid=%s' % \
+                  (a, a)
+            self.db.cursor.execute(sql, (text, id))
+
+    def find(self, wordlist):
+        """look up all the words in the wordlist.
+           For testing wordlist is actually a list.
+           In production, wordlist is a list of a single string
+           that is a sqlite MATCH query.
+
+           https://www.sqlite.org/fts5.html#full_text_query_syntax
+        """
+        if not wordlist:
+            return []
+
+        a = self.db.arg  # arg is the token for positional parameters
+
+        # removed filtering of word in wordlist to include only
+        # words with:  self.minlength <= len(word) <= self.maxlength
+
+        sql = 'select _class, _itemid, _prop from __fts '\
+              'where _textblob MATCH %s' % a
+
+        try:
+            # tests supply a multi element word list. Join them.
+            self.db.cursor.execute(sql, (" ".join(wordlist),))
+        except sqlite.OperationalError as e:
+            if 'no such column' in e.args[0]:
+                raise IndexerQueryError(
+                    _("Search failed. Try quoting any terms that "
+                      "include a '-' and retry the search."))
+            else:
+                raise IndexerQueryError(e.args[0].replace("fts5:",
+                                                          "Query error:"))
+
+        return self.db.cursor.fetchall()
--- a/roundup/backends/rdbms_common.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/backends/rdbms_common.py	Sun Jan 23 18:57:45 2022 -0500
@@ -329,8 +329,8 @@
         self.sql_commit()
 
     # update this number when we need to make changes to the SQL structure
-    # of the backen database
-    current_db_version = 6
+    # of the backend database
+    current_db_version = 7
     db_version_updated = False
 
     def upgrade_db(self):
@@ -376,10 +376,18 @@
             self.log_info('upgrade to version 6')
             self.fix_version_5_tables()
 
+        if version < 7:
+            self.log_info('upgrade to version 7')
+            self.fix_version_6_tables()
+
         self.database_schema['version'] = self.current_db_version
         self.db_version_updated = True
         return 1
 
+    def fix_version_2_tables(self):
+        # Default (used by sqlite): NOOP
+        pass
+
     def fix_version_3_tables(self):
         # drop the shorter VARCHAR OTK column and add a new TEXT one
         for name in ('otk', 'session'):
@@ -387,10 +395,6 @@
             self.sql('ALTER TABLE %ss DROP %s_value' % (name, name))
             self.sql('ALTER TABLE %ss ADD %s_value TEXT' % (name, name))
 
-    def fix_version_2_tables(self):
-        # Default (used by sqlite): NOOP
-        pass
-
     def fix_version_4_tables(self):
         # note this is an explicit call now
         c = self.cursor
@@ -411,6 +415,12 @@
         # as version 5.
         pass
 
+    def fix_version_6_tables(self):
+        # Default (used by mysql): NOOP
+        # sqlite/postgres override this to add fts
+        # full text search tables.
+        pass
+
     def _convert_journal_tables(self):
         """Get current journal table contents, drop the table and re-create"""
         c = self.cursor
--- a/roundup/cgi/actions.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/cgi/actions.py	Sun Jan 23 18:57:45 2022 -0500
@@ -1433,8 +1433,19 @@
 
         # full-text search
         if request.search_text:
-            matches = self.db.indexer.search(
-                re.findall(r'\b\w{2,25}\b', request.search_text), klass)
+            if self.db.indexer.query_language:
+                try:
+                    matches = self.db.indexer.search(
+                        [request.search_text], klass)
+                except Exception as e:
+                    error = " ".join(e.args)
+                    self.client.add_error_message(error)
+                    self.client.response_code = 400
+                    # trigger error reporting. NotFound isn't right but...
+                    raise exceptions.NotFound(error)
+            else:
+                matches = self.db.indexer.search(
+                    re.findall(r'\b\w{2,25}\b', request.search_text), klass)
         else:
             matches = None
 
@@ -1598,8 +1609,20 @@
 
         # full-text search
         if request.search_text:
-            matches = self.db.indexer.search(
-                re.findall(r'\b\w{2,25}\b', request.search_text), klass)
+            if self.db.indexer.query_language:
+                try:
+                    matches = self.db.indexer.search(
+                        [request.search_text], klass)
+                except Exception as e:
+                    error = " ".join(e.args)
+                    self.client.add_error_message(error)
+                    self.client.response_code = 400
+                    # trigger error reporting. NotFound isn't right but...
+                    raise exceptions.NotFound(error)
+            else:
+                matches = self.db.indexer.search(
+                    re.findall(r'\b\w{2,25}\b', request.search_text),
+                    klass)
         else:
             matches = None
 
--- a/roundup/cgi/client.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/cgi/client.py	Sun Jan 23 18:57:45 2022 -0500
@@ -31,8 +31,8 @@
 from roundup.exceptions import LoginError, Reject, RejectRaw, \
                                Unauthorised, UsageError
 from roundup.cgi.exceptions import (
-    FormError, NotFound, NotModified, Redirect, SendFile, SendStaticFile,
-    DetectorError, SeriousError)
+    FormError, IndexerQueryError, NotFound, NotModified, Redirect,
+    SendFile, SendStaticFile, DetectorError, SeriousError)
 from roundup.cgi.form_parser import FormParser
 from roundup.mailer import Mailer, MessageSendError
 from roundup.cgi import accept_language
@@ -1923,7 +1923,11 @@
             }
             pt = self.instance.templates.load(tplname)
             # let the template render figure stuff out
-            result = pt.render(self, None, None, **args)
+            try:
+                result = pt.render(self, None, None, **args)
+            except IndexerQueryError as e:
+                result = self.renderError(e.args[0])
+
             self.additional_headers['Content-Type'] = pt.content_type
             if self.env.get('CGI_SHOW_TIMING', ''):
                 if self.env['CGI_SHOW_TIMING'].upper() == 'COMMENT':
@@ -1970,6 +1974,46 @@
                 else:
                     exec('raise exc_info[0], exc_info[1], exc_info[2]')  # nosec
 
+    def renderError(self, error, response_code=400, use_template=True):
+        self.response_code = response_code
+
+        # see if error message already logged add if not
+        if error not in self._error_message:
+            self.add_error_message(error, escape=True)
+
+        # allow use of template for a specific code
+        trial_templates = []
+        if use_template:
+            if response_code == 400:
+                trial_templates = [ "400" ]
+            else:
+                trial_templates = [ str(response_code), "400" ]
+
+        tplname = None
+        for rcode in trial_templates:
+            try:
+                tplname = self.selectTemplate(self.classname, rcode)
+                break
+            except templating.NoTemplate:
+                pass
+
+        if not tplname:
+            # call string of serious error to get basic html
+            # response.
+            return str(SeriousError(error))
+
+        args = {
+            'ok_message': self._ok_message,
+            'error_message': self._error_message
+        }
+
+        try:
+            pt = self.instance.templates.load(tplname)
+            return pt.render(self, None, None, **args)
+        except Exception:
+            # report original error
+            return str(SeriousError(error))
+
     # these are the actions that are available
     actions = (
         ('edit',        actions.EditItemAction),
--- a/roundup/cgi/exceptions.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/cgi/exceptions.py	Sun Jan 23 18:57:45 2022 -0500
@@ -54,6 +54,11 @@
     """
     pass
 
+class IndexerQueryError(RoundupException):
+    """Raised to handle errors from FTS searches due to query
+       syntax errors.
+    """
+    pass
 
 class SendFile(RoundupException):
     """Send a file from the database."""
--- a/roundup/cgi/templating.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/cgi/templating.py	Sun Jan 23 18:57:45 2022 -0500
@@ -3318,11 +3318,19 @@
         # get the list of ids we're batching over
         klass = self.client.db.getclass(self.classname)
         if self.search_text:
-            matches = self.client.db.indexer.search(
-                [u2s(w.upper()) for w in re.findall(
-                    r'(?u)\b\w{2,25}\b',
-                    s2u(self.search_text, "replace")
-                )], klass)
+            if self.client.db.indexer.query_language:
+                try:
+                    matches = self.client.db.indexer.search(
+                        [self.search_text], klass)
+                except Exception as e:
+                    self.client.add_error_message(" ".join(e.args))
+                    raise
+            else:
+                matches = self.client.db.indexer.search(
+                    [u2s(w.upper()) for w in re.findall(
+                        r'(?u)\b\w{2,25}\b',
+                        s2u(self.search_text, "replace")
+                    )], klass)
         else:
             matches = None
 
--- a/roundup/configuration.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/roundup/configuration.py	Sun Jan 23 18:57:45 2022 -0500
@@ -426,7 +426,7 @@
 class IndexerOption(Option):
     """Valid options for indexer"""
 
-    allowed = ['', 'xapian', 'whoosh', 'native']
+    allowed = ['', 'xapian', 'whoosh', 'native', 'native-fts']
     class_description = "Allowed values: %s" % ', '.join("'%s'" % a
                                                          for a in allowed)
 
@@ -875,7 +875,8 @@
             "Force Roundup to use a particular text indexer.\n"
             "If no indexer is supplied, the first available indexer\n"
             "will be used in the following order:\n"
-            "Possible values: xapian, whoosh, native (internal)."),
+            "Possible values: xapian, whoosh, native (internal), "
+            "native-fts.\nNote 'native-fts' will only be used if set."),
         (Option, "indexer_language", "english",
             "Used to determine what language should be used by the\n"
             "indexer above. Currently only affects Xapian indexer. It\n"
@@ -885,7 +886,8 @@
         (WordListOption, "indexer_stopwords", "",
             "Additional stop-words for the full-text indexer specific to\n"
             "your tracker. See the indexer source for the default list of\n"
-            "stop-words (eg. A,AND,ARE,AS,AT,BE,BUT,BY, ...)"),
+            "stop-words (eg. A,AND,ARE,AS,AT,BE,BUT,BY, ...). This is\n"
+            "not used by the native-fts indexer."),
         (OctalNumberOption, "umask", "0o002",
             "Defines the file creation mode mask."),
         (IntegerNumberGeqZeroOption, 'csv_field_size', '131072',
--- a/test/db_test_base.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/test/db_test_base.py	Sun Jan 23 18:57:45 2022 -0500
@@ -252,8 +252,9 @@
         self.db.issue.create(title="flebble frooz")
         self.db.commit()
 
-        self.assertEqual(self.db.database_schema['version'], 6,
-                         "This test only runs for database version 6")
+        if self.db.database_schema['version'] != 6:
+            self.skipTest("This test only runs for database version 6")
+
         self.db.database_schema['version'] = 5
         if self.db.dbtype == 'mysql':
             # version 6 has 5 indexes
--- a/test/test_cgi.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/test/test_cgi.py	Sun Jan 23 18:57:45 2022 -0500
@@ -2036,6 +2036,61 @@
         self.assertNotEqual(-1,
            self.output[0].index('<!-- SHA: c87a4e18d59a527331f1d367c0c6cc67ee123e63 -->'))
 
+    def testRenderError(self):
+        # set up the client;
+        # run determine_context to set the required client attributes
+        # run renderError(); check result for proper page
+
+        self.client.form=db_test_base.makeForm({})
+        self.client.path = ''
+        self.client.determine_context()
+
+        error = "Houston, we have a problem"
+
+
+        # template rendering will fail and return fallback html
+        out = self.client.renderError(error, 404)
+
+        expected_fallback = (
+            '\n<html><head><title>Roundup issue tracker: '
+            'An error has occurred</title>\n'
+            ' <link rel="stylesheet" type="text/css" href="@@file/style.css">\n'
+            '</head>\n'
+            '<body class="body" marginwidth="0" marginheight="0">\n'
+            ' <p class="error-message">Houston, we have a problem</p>\n'
+            '</body></html>\n')
+
+        self.assertEqual(out, expected_fallback)
+        self.assertIn(error, self.client._error_message)
+        self.assertEqual(self.client.response_code, 404)
+
+        ### next test
+        # Set this so template rendering works.
+        self.client.classname = 'issue'
+
+        out = self.client.renderError("Houston, we have a problem", 404)
+        # match hard coded line in 404 template
+        expected = ('There is no <span>issue</span> with id')
+        
+        self.assertIn(expected, out)
+        self.assertEqual(self.client.response_code, 404)
+
+
+        ### next test
+        # disable template use get fallback
+        out = self.client.renderError("Houston, we have a problem", 404,
+                                      use_template=False)
+        
+        self.assertEqual(out, expected_fallback)
+        self.assertEqual(self.client.response_code, 404)
+
+        ### next test
+        # no 400 template (default 2nd param) so we get fallback
+        out = self.client.renderError("Houston, we have a problem")
+        self.assertEqual(out, expected_fallback)
+        self.assertIn(error, self.client._error_message)
+        self.assertEqual(self.client.response_code, 400)
+
     def testrenderContext(self):
         # set up the client;
         # run determine_context to set the required client attributes
@@ -2288,4 +2343,69 @@
         r = t.selectTemplate("user", "subdir/item")
         self.assertEqual("subdir/user.item", r)
 
+class SqliteCgiTest(unittest.TestCase):
+    """All of the rest of the tests use anydbm as the backend.
+       This class tests renderError when renderContext fails.
+       Triggering this error requires the native-fts backend for
+       the sqlite db.
+    """
+
+    def setUp(self):
+        self.dirname = '_test_template'
+        # set up and open a tracker
+        self.instance = setupTracker(self.dirname, backend="sqlite")
+
+        self.instance.config.INDEXER = "native-fts"
+
+        # open the database
+        self.db = self.instance.open('admin')
+        self.db.tx_Source = "web"
+
+        # create a client instance and hijack write_html
+        self.client = client.Client(self.instance, "user",
+                {'PATH_INFO':'/user', 'REQUEST_METHOD':'POST'},
+                form=db_test_base.makeForm({"@template": "item"}))
+
+        self.client._error_message = []
+        self.client._ok_message = []
+        self.client.db = self.db
+        self.client.userid = '1'
+        self.client.language = ('en',)
+        self.client.session_api = MockNull(_sid="1234567890")
+
+        self.output = []
+        # ugly hack to get html_write to return data here.
+        def html_write(s):
+            self.output.append(s)
+
+        # hijack html_write
+        self.client.write_html = html_write
+
+    def tearDown(self):
+        self.db.close()
+        try:
+            shutil.rmtree(self.dirname)
+        except OSError as error:
+            if error.errno not in (errno.ENOENT, errno.ESRCH): raise
+
+    def testRenderContextBadFtsQuery(self):
+        # only test for sqlite
+        if self.db.dbtype not in [ "sqlite" ]:
+            pytest.skip("Not tested for backends without native FTS")
+
+        # generate a bad fts query
+        self.client.form=db_test_base.makeForm(
+            { "@ok_message": "ok message", "@template": "index",
+            "@search_text": "foo-bar"})
+        self.client.path = 'issue'
+        self.client.determine_context()
+
+        result = self.client.renderContext()
+
+        expected = '\n<html><head><title>Roundup issue tracker: An error has occurred</title>\n <link rel="stylesheet" type="text/css" href="@@file/style.css">\n</head>\n<body class="body" marginwidth="0" marginheight="0">\n <p class="error-message">Search failed. Try quoting any terms that include a \'-\' and retry the search.</p>\n</body></html>\n'
+
+        self.assertEqual(result, expected)
+        self.assertEqual(self.client.response_code, 400)
+
+
 # vim: set filetype=python sts=4 sw=4 et si :
--- a/test/test_indexer.py	Sat Jan 22 01:33:06 2022 -0500
+++ b/test/test_indexer.py	Sun Jan 23 18:57:45 2022 -0500
@@ -24,6 +24,8 @@
 from roundup.backends import get_backend, have_backend
 from roundup.backends.indexer_rdbms import Indexer
 
+from roundup.cgi.exceptions import IndexerQueryError
+
 # borrow from other tests
 from .db_test_base import setupSchema, config
 from .test_postgresql import postgresqlOpener, skip_postgresql
@@ -115,6 +117,11 @@
                                                      ('test', '2', 'bar')])
     def test_extremewords(self):
         """Testing too short or too long words."""
+
+        # skip this for FTS test
+        if isinstance(self,sqliteFtsIndexerTest):
+            pytest.skip("extremewords not tested for native FTS backends")
+
         short = "b"
         long = "abcdefghijklmnopqrstuvwxyz"
         self.dex.add_text(('test', '1', 'a'), '%s hello world' % short)
@@ -219,6 +226,16 @@
         RDBMSIndexerTest.tearDown(self)
         postgresqlOpener.tearDown(self)
 
+"""
+@skip_postgresql
+class postgresqlFtsIndexerTest(postgresqlOpener, RDBMSIndexerTest, IndexerTest):
+    def setUp(self):
+        postgresqlOpener.setUp(self)
+        RDBMSIndexerTest.setUp(self)
+    def tearDown(self):
+        RDBMSIndexerTest.tearDown(self)
+        postgresqlOpener.tearDown(self)
+"""
 
 @skip_mysql
 class mysqlIndexerTest(mysqlOpener, RDBMSIndexerTest, IndexerTest):
@@ -233,4 +250,129 @@
 class sqliteIndexerTest(sqliteOpener, RDBMSIndexerTest, IndexerTest):
     pass
 
+class sqliteFtsIndexerTest(sqliteOpener, RDBMSIndexerTest, IndexerTest):
+    def setUp(self):
+        RDBMSIndexerTest.setUp(self)
+        from roundup.backends.indexer_sqlite_fts import Indexer
+        self.dex = Indexer(db)
+        self.dex.db = self.db
+
+    def test_phrase_and_near(self):
+        self.dex.add_text(('test', '1', 'foo'), 'a the hello world')
+        self.dex.add_text(('test', '2', 'foo'), 'helh blah blah the world')
+        self.dex.add_text(('test', '3', 'foo'), 'blah hello the world')
+        self.dex.add_text(('test', '4', 'foo'), 'hello blah blech the world')
+
+        # test two separate words for sanity
+        self.assertSeqEqual(self.dex.find(['"hello" "world"']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ('test', '4', 'foo')
+                                                    ])
+        # now check the phrase
+        self.assertSeqEqual(self.dex.find(['"hello world"']),
+                                                    [('test', '1', 'foo'),
+                                                     ])
+
+        # now check the phrase with near explicitly 0 intervening items
+        self.assertSeqEqual(self.dex.find(['NEAR(hello world, 0)']),
+                                                    [('test', '1', 'foo'),
+                                                     ])
+
+        # now check the phrase with near explicitly 1 intervening item
+        self.assertSeqEqual(self.dex.find(['NEAR(hello world, 1)']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ])
+        # now check the phrase with near explicitly 3 intervening item
+        self.assertSeqEqual(self.dex.find(['NEAR(hello world, 3)']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ('test', '4', 'foo'),
+                                                     ])
+
+    def test_prefix(self):
+        self.dex.add_text(('test', '1', 'foo'), 'a the hello world')
+        self.dex.add_text(('test', '2', 'foo'), 'helh blah blah the world')
+        self.dex.add_text(('test', '3', 'foo'), 'blah hello the world')
+        self.dex.add_text(('test', '4', 'foo'), 'hello blah blech the world')
+
+        self.assertSeqEqual(self.dex.find(['hel*']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '2', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ('test', '4', 'foo')
+                                                    ])
+
+
+    def test_bool_start(self):
+        self.dex.add_text(('test', '1', 'foo'), 'a the hello world')
+        self.dex.add_text(('test', '2', 'foo'), 'helh blah blah the world')
+        self.dex.add_text(('test', '3', 'foo'), 'blah hello the world')
+        self.dex.add_text(('test', '4', 'foo'), 'hello blah blech the world')
+
+        self.assertSeqEqual(self.dex.find(['hel* NOT helh NOT blech']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                    ])
+
+        self.assertSeqEqual(self.dex.find(['hel* NOT helh NOT blech OR the']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '2', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ('test', '4', 'foo'),
+                                                    ])
+
+        self.assertSeqEqual(self.dex.find(['helh OR hello']),
+                                                    [('test', '1', 'foo'),
+                                                     ('test', '2', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ('test', '4', 'foo'),
+                                                    ])
+
+
+        self.assertSeqEqual(self.dex.find(['helh AND hello']),
+                                                    [])
+        # matches if line starts with hello
+        self.assertSeqEqual(self.dex.find(['^hello']),
+                                                    [
+                                                     ('test', '4', 'foo'),
+                                                    ])
+
+        self.assertSeqEqual(self.dex.find(['hello']),
+                                                    [
+                                                     ('test', '1', 'foo'),
+                                                     ('test', '3', 'foo'),
+                                                     ('test', '4', 'foo'),
+                                                    ])
+
+    def test_query_errors(self):
+        """test query phrases that generate an error. Also test the
+           correction"""
+
+        self.dex.add_text(('test', '1', 'foo'), 'a the hello-world')
+        self.dex.add_text(('test', '2', 'foo'), 'helh blah blah the world')
+        self.dex.add_text(('test', '3', 'foo'), 'blah hello the world')
+        self.dex.add_text(('test', '4', 'foo'), 'hello blah blech the world')
+
+        # handle known error that roundup recognizes and tries to diagnose
+        with self.assertRaises(IndexerQueryError) as ctx:
+            self.dex.find(['the hello-world'])
+
+        error = ( "Search failed. Try quoting any terms that include a '-' "
+                  "and retry the search.")
+        self.assertEqual(str(ctx.exception), error)
+
+
+        self.assertSeqEqual(self.dex.find(['the "hello-world"']),
+                                                    [('test', '1', 'foo'),
+                                                    ])
+
+        # handle known error that roundup recognizes and tries to diagnose
+        with self.assertRaises(IndexerQueryError) as ctx:
+                self.dex.find(['hello world + ^the'])
+
+        error = 'Query error: syntax error near "^"'
+        self.assertEqual(str(ctx.exception), error)
+
 # vim: set filetype=python ts=4 sw=4 et si

Roundup Issue Tracker: http://roundup-tracker.org/