annotate scripts/schema-dump.py @ 5096:e74c3611b138

- issue2550636, issue2550909: Added support for Whoosh indexer. Also adds a new config.ini setting called indexer to select the indexer. See ``doc/upgrading.txt`` for details.

Initial patch done by David Wolever. Patch modified (see ticket or below for changes), docs updated and committed.

I have an outstanding issue with test/test_indexer.py. I have to comment out all imports and tests for indexers I don't have (i.e. mysql, postgres), otherwise no tests run. With that change made, dbm, sqlite (rdbms), xapian and whoosh indexes are all passing the indexer tests.

Changes summary:

1) Support native back ends dbm and rdbms (the original patch only fell through to dbm).
2) Developed a whoosh stopfilter to not index stopwords or words outside the maxlength and minlength limits defined in index_common.py. Required to pass the extremewords test_indexer test. Also removed a call to .lower on the input text, as the tokenizer I chose lowercases automatically.
3) Added support for max/min length to find. This was needed to pass the extremewords test.
4) Added back a call to save_index in add_text. This allowed all but two tests to pass.
5) Fixed a call to: results = searcher.search(query.Term("identifier", identifier)) which had an extra parameter that is an error under current whoosh.
6) Set limit=None in the search call for find(), otherwise it only returns 10 items. This allowed it to pass the manyresults test.

Also, due to changes in the roundup code, removed the ``from roundup.anypy.sets_ import set`` import in indexer_whoosh, since we use the Python builtin set.
author John Rouillard <rouilj@ieee.org>
date Sat, 25 Jun 2016 20:10:03 -0400
parents 27f592f3d696
children e46ce04d5bbc
All lines below were last changed in rev 4937, changeset 9369ade6c24b, by anatoly techtonik <techtonik@gmail.com>: "scripts/schema-dump.py: New script to dump schema from tracker through XML-RPC"
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Use recently documented XML-RPC API to dump
Roundup data schema in human readable form.

Future development may cover:
[ ] unreadable dump formats
[ ] access to local database
[ ] lossless dump/restore cycle
[ ] data dump and filtering with preserved
"""
__license__ = "Public Domain"
__version__ = "1.0"
__authors__ = [
    "anatoly techtonik <techtonik@gmail.com>"
]

import os
import sys
import xmlrpclib
import pprint
import textwrap
from optparse import OptionParser

sname = os.path.basename(sys.argv[0])
usage = """\
usage: %s [options] URL

URL is XML-RPC endpoint for your tracker, such as:

http://localhost:8917/demo/xmlrpc

options:
--pprint (default)
--json
--yaml
--raw

-h --help
--version
""" % sname

def format_pprint(var):
    return pprint.pformat(var)

def format_json(var):
    jout = pprint.pformat(var)
    jout = jout.replace('"', "\\'") # " to \'
    jout = jout.replace("'", '"') # ' to "
    jout = jout.replace('\\"', "'") # \" to '
    return jout

def format_yaml(var):
    out = pprint.pformat(var)
    out = out.replace('{', ' ')
    out = out.replace('}', '')
    out = textwrap.dedent(out)
    out = out.replace("'", '')
    out = out.replace(' [[', '\n [')
    out = out.replace(']]', ']')
    out = out.replace('],', '')
    out = out.replace(']', '')
    out2 = []
    for line in out.splitlines():
        if '[' in line:
            line = ' ' + line.lstrip(' [')
            line = line.replace('>', '')
            line = line.replace('roundup.hyperdb.', '')
            # expandtabs(16) with limit=1
            n, v = line.split(', <')
            if len(n) > 14:
                indent = 0
            else:
                indent = 14 - len(n)
            line = line.replace(', <', ': '+' '*indent)
        line.split(",")
        out2.append(line)
    out = '\n'.join(out2)
    return out

if __name__ == "__main__":
    if len(sys.argv) < 2 or "-h" in sys.argv or "--help" in sys.argv:
        sys.exit(usage)
    if "--version" in sys.argv:
        sys.exit(sname + " " + __version__)

    parser = OptionParser()
    parser.add_option("--raw", action='store_true')
    parser.add_option("--yaml", action='store_true')
    parser.add_option("--json", action='store_true')
    (options, args) = parser.parse_args()

    url = args[0]
    roundup_server = xmlrpclib.ServerProxy(url, allow_none=True)
    schema = roundup_server.schema()
    if options.raw:
        print(str(schema))
    elif options.yaml:
        print(format_yaml(schema))
    elif options.json:
        print(format_json(schema))
    else:
        print(format_pprint(schema))

    print("")
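
The format_json helper in this script builds its output by swapping quote characters in pprint output. That may not yield valid JSON when a value contains an apostrophe or is None. A minimal sketch of an alternative follows, assuming the schema returned by the server consists only of dicts, lists, strings and None. The name format_json_strict and the sample data are hypothetical and not part of the script.

```python
import json


def format_json_strict(var):
    """Serialize var with the json module instead of rewriting quotes."""
    # sort_keys gives a stable key order; indent keeps the dump readable.
    return json.dumps(var, indent=2, sort_keys=True)


# Hypothetical sample: class name mapped to [property, type] pairs.
sample = {
    "issue": [
        ["title", "<roundup.hyperdb.String>"],
        ["note", "it's optional"],  # apostrophe would trip quote swapping
        ["extra", None],            # becomes JSON null
    ]
}

text = format_json_strict(sample)
print(text)

# The output parses back to the same structure.
assert json.loads(text) == sample
```

Such a helper could be called from the --json branch in place of format_json. The output layout differs from pprint's, so anything consuming the existing format would need checking first.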

Roundup Issue Tracker: http://roundup-tracker.org/