view doc/implementation.txt @ 5096:e74c3611b138

- issue2550636, issue2550909: Added support for Whoosh indexer. Also adds new config.ini setting called indexer to select indexer. See ``doc/upgrading.txt`` for details. Initial patch done by David Wolever. Patch modified (see ticket or below for changes), docs updated and committed. I have an outstanding issue with test/test_indexer.py. I have to comment out all imports and tests for indexers I don't have (i.e. mysql, postgres) otherwise no tests run. With that change made, dbm, sqlite (rdbms), xapian and whoosh indexes are all passing the indexer tests. Changes summary: 1) support native back ends dbm and rdbms. (original patch only fell through to dbm) 2) Developed whoosh stopfilter to not index stopwords or words outside the the maxlength and minlength limits defined in index_common.py. Required to pass the extremewords test_indexer test. Also I removed a call to .lower on the input text as the tokenizer I chose automatically does the lowercase. 3) Added support for max/min length to find. This was needed to pass extremewords test. 4) Added back a call to save_index in add_text. This allowed all but two tests to pass. 5) Fixed a call to: results = searcher.search(query.Term("identifier", identifier)) which had an extra parameter that is an error under current whoosh. 6) Set limit=None in search call for find() otherwise it only return 10 items. This allowed it to pass manyresults test Also due to changes in the roundup code removed the call in indexer_whoosh to from roundup.anypy.sets_ import set since we use the python builtin set.
author John Rouillard <rouilj@ieee.org>
date Sat, 25 Jun 2016 20:10:03 -0400
parents 33a1f03b9de0
children 9ca128103a3a
line wrap: on
line source

====================
Implementation notes
====================

[see also the roundup package docstring]

There have been some modifications to the spec. I've marked these in the
source with 'XXX' comments when I remember to.

In short:
 Class.find() - may match multiple properties, uses keyword args.

 Class.filter() - isn't in the spec and it's very useful to have at the
    Class level.

 CGI interface index view specifier layout part - lose the '+' from the
    sorting arguments (it's a reserved URL character ;). Just made no
    prefix mean ascending and '-' prefix descending.

 ItemClass - renamed to IssueClass to better match it only having one
    hypderdb class "issue". Allowing > 1 hyperdb class breaks the
    "superseder" multilink (since it can only link to one thing, and
    we'd want bugs to link to support and vice-versa).

 template - the call="link()" is handled by special-case mechanisms in
    my top-level CGI handler. In a nutshell, the handler looks for a
    method on itself called 'index%s' or 'item%s' where %s is a class.
    Most items pass on to the templating mechanism, but the file class
    _always_ does downloading. It'll probably stay this way too...

 template - call="link(property)" may be used to link "the current item"
    (from an index) - the link text is the property specified.

 template - added functions that I found very useful: List, History and
    Submit.

 template - items must specify the message lists, history, etc. Having
    them by default was sometimes not wanted.

 template - index view determines its default columns from the
    template's ``tal:condition="request/show/<property>"`` directives.

 template - menu() and field() look awfully similar now .... ;)

 roundup_admin.py - the command-line tool has a lot more commands at its
    disposal

-----------------

Back to `Table of Contents`_

.. _`Table of Contents`: index.html


Roundup Issue Tracker: http://roundup-tracker.org/