annotate doc/implementation.txt @ 6915:9ff091537f43

postgresql native-fts; more indexer tests 1) Make postgresql native-fts actually work. 2) Add simple stopword filtering to sqlite native-fts indexer. 3) Add more tests for indexer_common get_indexer Details: 1) roundup/backends/indexer_postgresql_fts.py: ignore ValueError raised if we try to index a string with a null character in it. This could happen due to an incorrect text/ mime type on a file that has nulls in it. Replace ValueError raised by postgresql with customized IndexerQueryError if a search string has a null in it. roundup/backends/rdbms_common.py: Make postgresql native-fts work. When specified it was using using whatever was returned from get_indexer(). However loading the native-fts indexer backend failed because there was no connection to the postgresql database when this call was made. Simple solution, move the call after the open_connection call in Database::__init__(). However the open_connection call creates the schema for the database if it is not there. The schema builds tables for indexer=native type indexing. As part of the build it looks at the indexer to see the min/max size of the indexed tokens. No indexer define, we get a crash. So it's a a chicken/egg issue. I solved it by setting the indexer to the Indexer from indexer_common which has the min/max token size info. I also added a no-op save_indexer to this Indexer class. I claim save_indexer() isn't needed as a commit() on the db does all the saving required. Then after open_connection is called, I call get_indexer to retrieve the correct indexer and indexer_postgresql_fts woks since the conn connection property is defined. roundup/backends/indexer_common.py: add save_index() method for indexer. It does nothing but is needed in rdbms backends during schema initialization. 2) roundup/backends/indexer_sqlite_fts.py: when this indexer is used, the indexer test in DBTest on the word "the" fail. This is due to missing stopword filtering. Implement basic stopword filtering for bare stopwords (like 'the') to make the test pass. Note: this indexer is not currently automatically run by the CI suite, it was found during manual testing. However there is a FIXME to extract the indexer tests from DBTest and run it using this backend. roundup/configuration.py, roundup/doc/admin_guide.txt: update doc on stopword use for sqlite native-fts. test/db_test_base.py: DBTest::testStringBinary creates a file with nulls in it. It was breaking postgresql with native-fts indexer. Changed test to assign mime type application/octet-stream that prevents it from being processed by any text search indexer. add test to exclude indexer searching in specific props. This code path was untested before. test/test_indexer.py: add test to call find with no words. Untested code path. add test to index and find a string with a null \x00 byte. it was tested inadvertently by testStringBinary but this makes it explicit and moves it to indexer testing. (one version each for: generic, postgresql and mysql) Renamed Get_IndexerAutoSelectTest to Get_IndexerTest and renamed autoselect tests to include autoselect. Added tests for an invalid indexer and using native-fts with anydbm (unsupported combo) to make sure the code does something useful if the validation in configuration.py is broken. test/test_liveserver.py: add test to load an issue add test using text search (fts) to find the issue add tests to find issue using postgresql native-fts test/test_postgresql.py, test/test_sqlite.py: added explanation on how to setup integration test using native-fts. added code to clean up test environment if native-fts test is run.
author John Rouillard <rouilj@ieee.org>
date Mon, 05 Sep 2022 16:25:20 -0400
parents 9ca128103a3a
children 6f5054751fb6
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
688
b38f4f95bffd More doc tweaks
Richard Jones <richard@users.sourceforge.net>
parents: 686
diff changeset
1 ====================
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
2 Implementation notes
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
3 ====================
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
4
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
5 [see also the roundup package docstring]
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
6
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
7 There have been some modifications to the spec. I've marked these in the
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
8 source with 'XXX' comments when I remember to.
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
9
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
10 In short:
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
11 Class.find() - may match multiple properties, uses keyword args.
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
12
1661
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
13 Class.filter() - isn't in the spec and it's very useful to have at the
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
14 Class level.
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
15
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
16 CGI interface index view specifier layout part - lose the '+' from the
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
17 sorting arguments (it's a reserved URL character ;). Just made no
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
18 prefix mean ascending and '-' prefix descending.
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
19
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
20 ItemClass - renamed to IssueClass to better match it only having one
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
21 hypderdb class "issue". Allowing > 1 hyperdb class breaks the
1661
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
22 "superseder" multilink (since it can only link to one thing, and
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
23 we'd want bugs to link to support and vice-versa).
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
24
1661
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
25 template - the call="link()" is handled by special-case mechanisms in
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
26 my top-level CGI handler. In a nutshell, the handler looks for a
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
27 method on itself called 'index%s' or 'item%s' where %s is a class.
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
28 Most items pass on to the templating mechanism, but the file class
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
29 _always_ does downloading. It'll probably stay this way too...
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
30
1089
43ab730ee194 instance -> tracker, node -> item
Richard Jones <richard@users.sourceforge.net>
parents: 907
diff changeset
31 template - call="link(property)" may be used to link "the current item"
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
32 (from an index) - the link text is the property specified.
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
33
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
34 template - added functions that I found very useful: List, History and
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
35 Submit.
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
36
1661
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
37 template - items must specify the message lists, history, etc. Having
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
38 them by default was sometimes not wanted.
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
39
1661
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
40 template - index view determines its default columns from the
b9c1226cb600 Reflowed text to 72 cols...
Jean Jordaan <neaj@users.sourceforge.net>
parents: 1089
diff changeset
41 template's ``tal:condition="request/show/<property>"`` directives.
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
42
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
43 template - menu() and field() look awfully similar now .... ;)
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
44
907
38a74d1351c5 documentation updates
Richard Jones <richard@users.sourceforge.net>
parents: 688
diff changeset
45 roundup_admin.py - the command-line tool has a lot more commands at its
38a74d1351c5 documentation updates
Richard Jones <richard@users.sourceforge.net>
parents: 688
diff changeset
46 disposal
111
6e8197a5880f Split off implementation notes into separate file in doc directory.
Richard Jones <richard@users.sourceforge.net>
parents:
diff changeset
47
686
c52122f38c9b Documentation cleanup, added info for potential (and current) developers
Richard Jones <richard@users.sourceforge.net>
parents: 111
diff changeset
48 -----------------
c52122f38c9b Documentation cleanup, added info for potential (and current) developers
Richard Jones <richard@users.sourceforge.net>
parents: 111
diff changeset
49
c52122f38c9b Documentation cleanup, added info for potential (and current) developers
Richard Jones <richard@users.sourceforge.net>
parents: 111
diff changeset
50 Back to `Table of Contents`_
c52122f38c9b Documentation cleanup, added info for potential (and current) developers
Richard Jones <richard@users.sourceforge.net>
parents: 111
diff changeset
51
6905
9ca128103a3a fix broken links to doc table of contents
John Rouillard <rouilj@ieee.org>
parents: 4557
diff changeset
52 .. _`Table of Contents`: ../docs.html
686
c52122f38c9b Documentation cleanup, added info for potential (and current) developers
Richard Jones <richard@users.sourceforge.net>
parents: 111
diff changeset
53

Roundup Issue Tracker: http://roundup-tracker.org/