
Commit 39f857e

Tweakage
1 parent 95f6a45 commit 39f857e


1 file changed, +8 -11 lines changed


code/gmane/README.txt

Lines changed: 8 additions & 11 deletions
@@ -9,22 +9,21 @@ You should install the SQLite browser to view and modify the databases from:
 
 http://sqlitebrowser.org/
 
-The base URL is hard-coded in the gmane.py.
-Make sure to delete the content.sqlite file if you
-switch the base url. The gmane.py file operates as a spider in
-that it runs slowly and retrieves one mail message per second so
-as to avoid getting throttled by gmane.org. It stores all of
+The base URL is hard-coded in the gmane.py. Make sure to delete the
+content.sqlite file if you switch the base url. The gmane.py file
+operates as a spider in that it runs slowly and retrieves one mail
+message per second so as to avoid getting throttled. It stores all of
 its data in a database and can be interrupted and re-started
 as often as needed. It may take many hours to pull all the data
 down. So you may need to restart several times.
 
 To give you a head-start, I have put up 600MB of pre-spidered Sakai
 email here:
 
-https://online.dr-chuck.com/files/sakai/email/content.sqlite
+https://online.dr-chuck.com/files/sakai/email/content.sqlite.zip
 
-If you download this, you can "catch up with the latest" by
-running gmane.py.
+If you download and unzip this, you can "catch up with the
+latest" by running gmane.py.
 
 Navigate to the folder where you extracted the gmane.zip
 
@@ -42,14 +41,12 @@ http://mbox.dr-chuck.net/sakai.devel/6/7 3586
 http://mbox.dr-chuck.net/sakai.devel/7/8 10600
 john@caret.cam.ac.uk 2005-12-09T13:42:24+00:00 re: lms/vle rants/comments
 
-Does not start with From
-
 The program scans content.sqlite from 1 up to the first message number not
 already spidered and starts spidering at that message. It continues spidering
 until it has spidered the desired number of messages or it reaches a page
 that does not appear to be a properly formatted message.
 
-Sometimes gmane.org is missing a message. Perhaps administrators can delete messages
+Sometimes there is missing a message. Perhaps administrators can delete messages
 or perhaps they get lost - I don't know. If your spider stops, and it seems it has hit
 a missing message, go into the SQLite Manager and add a row with the missing id - leave
 all the other fields blank - and then restart gmane.py. This will unstick the
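
The resume and "unstick" behaviour described in the README text of this diff can be illustrated with a minimal sketch. This is not the code in gmane.py: the table name `Messages`, its columns, and the helper names `first_unspidered_id` and `add_placeholder` are assumptions made for illustration, and an in-memory database stands in for content.sqlite.

```python
import sqlite3


def first_unspidered_id(conn):
    """Scan ids upward from 1 and return the first one with no row."""
    expected = 1
    rows = conn.execute("SELECT id FROM Messages WHERE id >= 1 ORDER BY id")
    for (msg_id,) in rows:
        if msg_id != expected:
            break  # gap found: 'expected' has not been spidered yet
        expected += 1
    return expected


def add_placeholder(conn, missing_id):
    """Insert a row holding only the id, leaving every other field blank (NULL)."""
    conn.execute("INSERT OR IGNORE INTO Messages (id) VALUES (?)", (missing_id,))
    conn.commit()


# Hypothetical schema; the real content.sqlite layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Messages (id INTEGER UNIQUE, email TEXT, sent_at TEXT, "
    "subject TEXT, headers TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO Messages (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (4, "c@example.com")],
)

start = first_unspidered_id(conn)   # message 3 is missing, so the scan stops there
add_placeholder(conn, start)        # the manual fix the README describes
after = first_unspidered_id(conn)   # the scan now continues past message 4
print(start, after)
```

Under these assumptions, the placeholder row is what lets a restarted spider skip a message that can never be retrieved, instead of stopping at the same id on every run.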
