<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head>
<title></title>
</head><body>
<div class="article">
<h2>Description</h2>
<p>The PlotDevice Web library offers a collection of services to retrieve content from the
internet. You can use the library to query <a href="http://www.yahoo.com">Yahoo!</a> for links,
images, news and spelling suggestions, to read RSS and Atom newsfeeds, to retrieve articles
from <a href="http://en.wikipedia.org">Wikipedia</a>, to collect quality images from <a href="http://www.morguefile.com">morgueFile</a> or <a href="http://www.flickr.com">Flickr</a>, to
get color themes from <a href="http://kuler.adobe.com">kuler</a> or <a href="http://www.colr.org">Colr</a>, to browse through HTML documents, to clean up HTML, to validate
URLs, to create GIF images from math equations using <a href="http://www.forkosh.com/mimetex.html">mimeTeX</a>, and to get ironic word definitions from <a href="http://www.urbandictionary.com">Urban Dictionary</a>.<br/></p>
<p>The PlotDevice Web library works with a caching mechanism that stores things you download from
the web, so they can be retrieved faster the next time. Many of the services also work
asynchronously. This means you can use the library in an animation that keeps on running while
new content is downloaded in the background.</p>
<p>The library bundles Leonard Richardson’s <a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> to parse HTML, Mark Pilgrim’s
<a href="http://www.feedparser.org/">Universal Feed Parser</a> for newsfeeds, a connection to
John Forkosh’s mathTeX server (thanks Cedric Foellmi), Leif K-Brooks’ entity replacement algorithm,
<a href="http://code.google.com/p/simplejson/">simplejson</a>, and patches for Debian from the
people at <a href="http://indywiki.sourceforge.net/">Indywiki</a>.<br/></p>
<h2>Download</h2>
<table border="0">
<tbody>
<tr>
<td><a href="http://plotdevice.io/extras/web.zip"><img alt="download" height="20" src="../etc/lib/download.gif" width="20"/></a>
</td><td><a href="http://plotdevice.io/extras/web.zip">web.zip</a> (390KB)<br/>
<i>Last updated for NodeBox 1.9.4.6<br/>
Licensed under GPL<br/></i><i>Author: Tom De Smedt</i><br/>
</td></tr></tbody></table>
<h2>Documentation</h2>
<ul>
<li><a href="#library">How to get the library up and running</a>
</li><li><a href="#validation">Validating web content</a><br/>
</li><li><a href="#url">Working with URLs</a>
</li><li><a href="#html">Working with HTML</a>
</li><li><a href="#yahoo">Querying Yahoo! for links, images and news</a>
</li><li><a href="#yahoocontextual">Improving Yahoo! results with a contextual search</a>
</li><li><a href="#yahoospelling">Using Yahoo! to suggest spelling corrections</a><br/>
</li><li><a href="#yahoosort">Using Yahoo! to sort associatively</a>
</li><li><a href="#google">Querying Google</a><br/>
</li><li><a href="#newsfeed">Reading newsfeeds</a>
</li><li><a href="#wikipedia">Retrieving articles from Wikipedia</a>
</li><li><a href="#wikipediahelpers">Some helper commands to draw Wikipedia content in PlotDevice</a>
</li><li><a href="#morguefile">Querying morgueFile for images</a>
</li><li><a href="#flickr">Querying Flickr for images</a><br/>
</li><li><a href="#kuler">Querying kuler for color themes</a>
</li><li><a href="#colr">Querying Colr for color schemes</a>
</li><li><a href="#math">Creating GIF images from math equations</a>
</li><li><a href="#urbandictionary">Word definitions from Urban Dictionary</a><br/>
</li><li><a href="#asynchronous">Working with asynchronous downloads</a>
</li><li><a href="#cache">Clearing the cache</a>
</li><li><a href="#json">Reading JSON</a><br/>
</li></ul>
<p> </p>
<hr/>
<h2><a id="library" name="library" title="library"></a>How to get the library up and
running</h2>
<p>Put the <i>web</i> library folder in the same folder as your script so PlotDevice can find the
library.<br/>
You can also put it in <i>~/Library/Application Support/PlotDevice/.</i></p>
<pre>web = ximport("web")
</pre>
<p><span class="small_text">Outside of PlotDevice you can also just do <i>import web</i>.</span></p>
<p><span class="grey_box">Proxy servers</span> <br/></p>
<p>If you are behind a proxy server the library may not be able to connect to the internet.<br/>
In that case you need to inform the library with the <i>set_proxy()</i> command:</p>
<pre>web.set_proxy("https://www.myproxyserver.com:80", type="https")
</pre>
<hr/>
<h2><a id="validation" name="validation" title="validation"></a>Validating web content</h2>
<p>Web content is accessed with a URL, the address you use to connect to a place on the
internet. The library has a number of commands to find out what type of content (e.g. web page,
image, ...) is associated with a given URL.</p>
<p>The most basic command, <i>is_url()</i>, checks whether a given string is a syntactically
well-formed URL (e.g. <i>http://nodebox.net</i> but not <i>htp://nodebox.net</i>). It takes a <i>wait</i>
parameter indicating the number of seconds after which the library should stop trying to connect to
the internet and give up.</p>
<pre>web.is_url(url, wait=10)
</pre>
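<p>How such a syntax check might work can be sketched with the Python 3 standard library. This is a rough illustration, not the library’s own implementation: the helper name <i>looks_like_url()</i> and the list of accepted schemes are assumptions made for the example.</p>

```python
from urllib.parse import urlparse

# Hypothetical helper (not part of the Web library): a string counts as a URL
# here when it has a known scheme and a host part. No connection is made.
ACCEPTED_SCHEMES = ("http", "https", "ftp")

def looks_like_url(string):
    parts = urlparse(string)
    return parts.scheme in ACCEPTED_SCHEMES and bool(parts.netloc)

print(looks_like_url("http://nodebox.net"))
print(looks_like_url("htp://nodebox.net"))
```

<p>The misspelled scheme <i>htp</i> is rejected because it is not in the accepted list, even though the rest of the address is fine.</p>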
<p>Even if a URL is valid, it might not refer to actual content on the internet. We can check
whether a URL leads nowhere with the <i>not_found()</i> command:</p>
<pre>web.url.not_found(url, wait=10)
</pre>
<p>The following commands are useful to find out what content is associated with the URL. We
can discern between HTML web pages which we can parse with <i><a href="#html">page.parse()</a></i>, newsfeeds which we can parse with <i><a href="#newsfeed">newsfeed.parse()</a></i>, images, audio and video etc. which we can download with
<i><a href="#url">url.retrieve()</a></i>.</p>
<pre>web.url.is_webpage(url, wait=10)
</pre>
<pre>web.url.is_stylesheet(url, wait=10)
</pre>
<pre>web.url.is_plaintext(url, wait=10)
</pre>
<pre>web.url.is_pdf(url, wait=10)
</pre>
<pre>web.url.is_newsfeed(url, wait=10)
</pre>
<pre>web.url.is_image(url, wait=10)
</pre>
<pre>web.url.is_audio(url, wait=10)
</pre>
<pre>web.url.is_video(url, wait=10)
</pre>
<pre>web.url.is_archive(url, wait=10)
</pre>
<hr/>
<h2><a id="url" name="url" title="url"></a>Working with URLs</h2>
<p>A <a href="http://en.wikipedia.org/wiki/Url">URL</a> is the address you use to connect
to a page on the internet, for example: <i>http://nodebox.net</i>. The PlotDevice Web library can
do three different things with a URL: download the content associated with it, parse it (find
out which parts make up the URL) and construct it from scratch (a simple way to create a URL
with HTTP GET or HTTP POST data).</p>
<pre>web.download(url, wait=60, cache=None, type=".html")
</pre>
<pre>web.save(url, path="", wait=60)
</pre>
<pre>web.url.retrieve(url, wait=60, asynchronous=False, cache=None, type=".html")
</pre>
<pre>web.url.parse(url)
</pre>
<pre>web.url.create(url="", method="get")
</pre>
<p>The <i>download()</i> command returns the content associated with the given web address. The
command has an optional parameter <i>wait</i> that determines <b>how long to wait for a
download</b>. If the time is exceeded, the download is aborted.</p>
<p>The last two parameters can be used to <b>cache downloaded content locally</b>, so it
doesn’t have to be downloaded again in the future. The <i>cache</i> parameter is a string with the
name of a subfolder in /cache in which to store content. The <i>type</i> parameter is the file
extension of the downloaded content.</p>
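<p>The idea behind such a cache can be sketched as follows. This is a minimal illustration under stated assumptions, not the library’s actual cache layout: the helper <i>cached_fetch()</i>, the hashed file names and the fake downloader are all hypothetical, and Python 3 is assumed.</p>

```python
import hashlib
import os
import tempfile

def cached_fetch(url, fetch, folder, extension=".html"):
    """Return the content for url, reading it from folder when already stored.

    fetch is any function that takes a URL and returns bytes; it is only
    called when no cached copy exists. (Hypothetical helper for illustration.)
    """
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + extension
    path = os.path.join(folder, name)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = fetch(url)
    with open(path, "wb") as f:
        f.write(data)
    return data

# Demonstration with a fake downloader, so no network access is needed.
requests_made = []

def fake_fetch(url):
    requests_made.append(url)
    return b"<html>hello</html>"

cache_folder = tempfile.mkdtemp()
first = cached_fetch("http://example.com/", fake_fetch, cache_folder)
second = cached_fetch("http://example.com/", fake_fetch, cache_folder)
print(first == second, len(requests_made))
```

<p>The second call finds the stored file and never invokes the downloader, which is why repeated requests are faster.</p>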
<p>The <i>save()</i> command stores the URL’s content at the given local path. If no path is
given it will attempt to extract a filename from the URL and store that in the current working
directory. The path to the saved file is returned.</p>
<p>The Web library also has easier ways to deal with specific web content like HTML
(<i>page.parse()</i> command) or Wikipedia articles (<i>wikipedia.search()</i> command) for
example.</p>
<p>The <i>download()</i> command is actually an alias of the <i>url.retrieve()</i> command.
This command has an additional <i>asynchronous</i> parameter useful to <b>download stuff in the
background</b> while an animation keeps on running. We’ll see about asynchronous downloads
later on. The <i>url.retrieve()</i> command returns an object with a <i>data</i> property
containing a string with the downloaded content. If you don’t need anything that complicated
just use the easy <i>download()</i> command:<br/></p>
<pre># Download an image from the PlotDevice Gallery.
url = "http://nodebox.net/code/data/media/twisted-final.jpg"
img = web.download(url)
# Display the image data in PlotDevice.
image(None, 0, 0, data=img)
# Write the image data to a file (binary mode, since this is image data).
f = open("twisted.jpg", "wb")
f.write(img)
f.close()
</pre>
<p><img alt="web_url1" height="390" src="../etc/lib/web_url1.jpg" width="550"/><br/></p>
<p> </p>
<p>The <i>url.parse()</i> command splits a given URL into its components. The returned object
has the following properties:</p>
<ul>
<li><i>url.protocol</i>: the type of internet service, usually <i>http</i>
</li><li><i>url.domain</i>: the domain name, for example, <i>nodebox.net</i><br/>
</li><li><i>url.username</i>: a username for a secure connection<br/>
</li><li><i>url.password</i>: a password for a secure connection
</li><li><i>url.port</i>: the port number at the host
</li><li><i>url.path</i>: the subdirectory at the server, for example <i>/code/index.php/</i>
</li><li><i>url.page</i>: the name of the document, for example <i>search</i><br/>
</li><li><i>url.anchor</i>: named anchor on the page<br/>
</li><li><i>url.query</i>: a dictionary of query string values, for example <i>{ ‘q’: ‘pixels’
}</i><br/>
</li><li><i>url.method</i>: the query string method either ‘get’ or ‘post’<br/>
</li></ul>
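<p>To get a feel for how a URL breaks down into these parts, here is a sketch using only the Python 3 standard library. It is not the library’s <i>url.parse()</i>: the function <i>split_url()</i> is a hypothetical stand-in that returns a plain dictionary keyed by the property names above, and the example address is made up.</p>

```python
from urllib.parse import urlparse, parse_qs

def split_url(url):
    """Split a URL into parts named after the properties listed above."""
    parts = urlparse(url)
    # Separate the directory part of the path from the document name.
    directory, _, page = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "username": parts.username,
        "password": parts.password,
        "port": parts.port,
        "path": directory + "/",
        "page": page,
        "anchor": parts.fragment,
        # parse_qs returns lists of values; keep the first value of each key.
        "query": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }

parts = split_url("http://nodebox.net/code/index.php/search?q=pixels#results")
for name in ("protocol", "domain", "path", "page", "anchor", "query"):
    print(name, parts[name])
```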
<p>In the same way the <i>url.create()</i> command returns an object with these properties.
This command is useful to, for example, construct URLs with a POST query and pass them to
<i>url.retrieve()</i> or <i>page.parse()</i>.</p>
<p>For example, this script retrieves the first 10 forum pages from PlotDevice:</p>
<pre>url = web.url.create("http://nodebox.net/code/index.php/Share")
url.query["p"] = 1
for i in range(10):
    print web.url.retrieve(url).data
    url.query["p"] += 1
</pre>
<hr/>
<h2><a id="html" name="html" title="html"></a>Working with HTML</h2>
<p>The PlotDevice Web library uses Leonard Richardson’s <a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> to parse HTML content. This
means you can search and loop through all the tags in the HTML. For example, you might want to
download an HTML page from the internet, find all the links in it and then download all those
links. Or find all the image tags in the source and then retrieve all those images with
<i>download()</i>.</p>
<p>The <i>page.parse()</i> command takes a URL as input and returns a Beautiful Soup object.
The optional <i>cached</i> parameter determines if downloaded pages should be cached locally
for faster retrieval.</p>
<pre>web.page.parse(url, wait=10, cached=True)
</pre>
<p>You can get the meta information in the HTML header with the returned object’s <i>title</i>,
<i>description</i> and <i>keywords</i> properties:</p>
<pre>html = web.page.parse("http://nodebox.net")
print html.description
>>> PlotDevice is a Mac OS X application that lets you create 2D visuals
>>> (static, animated or interactive) using Python programming code
>>> and export them as a PDF or a QuickTime movie. PlotDevice is free
>>> and well-documented.
print html.keywords
>>>[u'PlotDevice', u'Home']
</pre>
<p>You can easily get all the links in the HTML with the <i>links()</i> method. It takes an
optional <i>external</i> parameter which, when True, restricts the result to links to other
domains/websites.</p>
<pre>html = web.page.parse("http://nodebox.net/code/index.php/About")
print html.links()
>>> [u'http://research.nodebox.net',
>>> u'http://www.opensource.org/licenses/mit-license.php',
>>> u'http://www.python.org/', u'http://bert.debruijn.be/kgp/',
>>> u'http://diveintopython.org/xml_processing/',
>>> u'http://processing.org',
>>> u'http://www.freelists.org/archives/the_posthumans/',
>>> u'http://research.nodebox.net'
>>> ]
</pre>
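<p>The difference between all links and external links can be illustrated with a small standard-library sketch. It does not use Beautiful Soup and is not how the library implements <i>links()</i>: the <i>LinkCollector</i> class, the <i>collect_links()</i> function and the sample HTML are assumptions made for the example (Python 3).</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base):
        HTMLParser.__init__(self)
        self.base = base
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base, href))

def collect_links(html, base, external=False):
    parser = LinkCollector(base)
    parser.feed(html)
    if external:
        # Keep only links whose host differs from the host of the page itself.
        host = urlparse(base).netloc
        return [url for url in parser.links if urlparse(url).netloc != host]
    return parser.links

sample = '<a href="/about">About</a> <a href="http://www.python.org/">Python</a>'
print(collect_links(sample, "http://nodebox.net/"))
print(collect_links(sample, "http://nodebox.net/", external=True))
```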
<p>The <i>find_all()</i> method returns a list of specific tags. The <i>find()</i> method just
returns the first tag:</p>
<pre>html = web.page.parse("http://nodebox.net/")
print html.find("title").string
>>> PlotDevice | Home
titles = html.find_all("h2")
for title in titles:
    print title.string
>>> News
>>> Current projects
>>> Gallery favorites
content = html.find(id="content")
print web.html.plain(content)
>>> Welcome to PlotDevice PlotDevice is a Mac OS X application
>>> that lets you create 2D visuals (static, animated or interactive)
>>> using Python programming code and export them as a PDF or
>>> a QuickTime movie. PlotDevice is free and well-documented.
>>> Read more
>>>
>>> Download PlotDevice for Mac OS X (version 1.8.5) Universal Binary
>>>
>>> Latest updates:
>>>
>>> * Interactivity
>>> * Stop running scripts by hitting command-dot.
>>> ...
</pre>
<p>As you can see you can supply tag names or attribute-value pairs to the find methods.</p>
<p>The <i>find()</i> and <i>find_all()</i> methods return <i>Tag</i> objects that each have
<i>find()</i> and <i>find_all()</i> too. Alternatively, you can also find tags more directly,
for example: <i>html.body.p</i> returns the first p tag in the body, and calling a tag as in
<i>html.body("p")</i> returns a list of all matching tags.</p>
<pre>html = web.page.parse("http://nodebox.net/")
list = html.find(id="content").find("ul")
print web.html.plain(list)
>>> * Interactivity
>>> * Stop running scripts by hitting command-dot.
>>> * Transparent PDFs with the background() command.
>>> * Fast, integrated path mathematics .
>>> * Store libaries centrally in the Application Support folder.
# These statements retrieve exactly the same.
list = html.body(id="content")[0].ul
list = html.body.ul
</pre>
<p>If you need to retrieve tags by their CSS classname, use the <i>find_class()</i> method.<br/></p>
<p>To get attributes from a tag, address it as a dictionary:</p>
<pre>html = web.page.parse("http://nodebox.net/")
html.body.a["href"] # the first link's href attribute
>>> Home
</pre>
<hr/>
<p>The PlotDevice Web library has various commands to clean up HTML.</p>
<pre># Replaces HTML special characters by readable characters.
web.html.replace_entities(unicode, placeholder=" ")
</pre>
<pre># Removes all tags from HTML except those in the exclude list.
web.html.strip_tags(html, exclude=[], linebreaks=False,
blocks="\n", breaks="\n", columns="\n")
</pre>
<pre>web.html.strip_javascript(html)
</pre>
<pre>web.html.strip_inline_css(html)
</pre>
<pre>web.html.strip_comments(html)
</pre>
<pre>web.html.strip_forms(html)
</pre>
<pre># If there are 10 consecutive spaces, 9 of them are removed.
web.html.collapse_spaces(str)
</pre>
<pre># Allow only a maximum of max linebreaks to build up.
web.html.collapse_linebreaks(str, max=2)
</pre>
<pre># Converts tabs to spaces, optionally leaving the left indentation unmodified.
web.html.collapse_tabs(str, indent=False)
</pre>
<pre># Combines all of the above.
web.html.plain(html)
</pre>
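<p>What the collapse commands do can be approximated with a few regular expressions. The sketch below is an illustration in Python 3, not the library’s code: the function bodies are assumptions, the <i>limit</i> parameter plays the role of <i>max</i>, and the naive <i>strip_tags()</i> ignores everything a real HTML parser would handle.</p>

```python
import re

def collapse_spaces(text):
    # Replace every run of two or more spaces with a single space.
    return re.sub(r" {2,}", " ", text)

def collapse_linebreaks(text, limit=2):
    # Shorten every run of more than `limit` linebreaks to exactly `limit`.
    return re.sub(r"\n{%d,}" % (limit + 1), "\n" * limit, text)

def strip_tags(html):
    # Naive tag removal; real HTML needs a proper parser.
    return re.sub(r"<[^>]+>", "", html)

sample = "<p>Hello" + " " * 10 + "world</p>\n\n\n\n\n<p>Bye</p>"
cleaned = collapse_linebreaks(collapse_spaces(strip_tags(sample)))
print(repr(cleaned))
```

<p>Ten spaces become one and five linebreaks become two, which matches the behaviour described in the comments above.</p>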
<hr/>
<h2><a id="yahoo" name="yahoo" title="yahoo"></a>Querying Yahoo! for links, images and
news</h2>
<p>Before you can use the Web library to query the Yahoo! search engine, you need to obtain
<b>a license key</b>: <a href="http://developer.yahoo.com/search/">http://developer.yahoo.com/search/</a></p>
<p>Click ‘get an application ID’. Fill out the form and you’ll end up with a long string
of numbers and characters which is your Yahoo! license key. It entitles you to 5000 queries a
day. Now register your license key in your PlotDevice script:<br/></p>
<pre>web.yahoo.license_key("myID")
print web.yahoo.license_key()
>>> myID
</pre>
<p>Note that you can query Yahoo! without setting a license key, in which case you are using a
default key that you share with all the other PlotDevice users who work with the Web library.</p>
<p>Use the <i>yahoo.search()</i>, <i>yahoo.search_images()</i> and <i>yahoo.search_news()</i>
commands to query Yahoo! for links to relevant webpages/images/news:</p>
<pre>web.yahoo.search(q, start=1, count=10, context=None, cached=False)
</pre>
<pre>web.yahoo.search_images(q, start=1, count=10, cached=False)
</pre>
<pre>web.yahoo.search_news(q, start=1, count=10, cached=False)
</pre>
<p>The commands take a <i>q</i> query parameter, and optional <i>start</i>, <i>count</i> and
<i>cached</i> parameters. The <i>start</i> parameter defines the first link to return,
<i>count</i> defines the number of links to return. The <i>cached</i> parameter defines
whether Yahoo! queries are cached locally (so they can be retrieved faster in the future).</p>
<pre>results = web.yahoo.search("plotdevice", start=1, count=5, cached=False)
for item in results:
    print item.title
>>> PlotDevice | Home
>>> PlotDevice | Features
>>> visualcomplexity.com | PlotDevice
>>> PlotDevice - Wikipedia, the free encyclopedia
>>> Nodebox - SWiK
</pre>
<p>Each item in the list of results is a <i>YahooResult</i> object with the following
properties:</p>
<ul>
<li><i>result.url</i>: the URL of the linked page
</li><li><i>result.title</i>: the title of the linked page
</li><li><i>result.description</i>: a short description for the page
</li><li><i>result.type</i>: the <a href="http://en.wikipedia.org/wiki/Mimetype">MIME-type</a> of
the linked page
</li><li><i>result.date</i>: the modification date of the linked page
</li><li><i>result.width</i>: for images, the width in pixels
</li><li><i>result.height</i>: for images, the height in pixels
</li><li><i>result.source</i>: for news items, the source of the article
</li><li><i>result.language</i>: for news items, the language used<br/>
</li></ul>
<p>The list of results has a <i>total</i> property containing the total number of results
Yahoo! has for your query:</p>
<pre>print results.total
>>> 37200
</pre>
<hr/>
<h2><a id="yahoocontextual" name="yahoocontextual" title="yahoocontextual"></a>Improving Yahoo!
results with a contextual search</h2>
<p>Suppose you are querying Yahoo! for <i>apple</i>. Most likely Yahoo! will return links to
pages relating to the Apple Computer company. But perhaps you wanted links to apples as in
<i>fruit</i>. The <i>yahoo.search()</i> command has an optional <i>context</i> parameter to which
you can supply a description of what exactly you mean by <i>apple</i>:</p>
<pre>ctx = '''
The apple tree was perhaps the earliest tree to be cultivated,
and apples have remained an important food in all cooler climates.
To a greater degree than other tree fruit, except possibly citrus,
apples store for months while still retaining their nutritive value.
We are not looking for a company named Apple.
'''
results = web.yahoo.search("apple", count=5, context=ctx)
print results.total
for item in results:
    print item.title
>>> Apple - Wikipedia, the free encyclopedia
>>> Apple Core
>>> Apple - Free Encyclopedia
>>> Apple : Apple (fruit)
>>> About the Apple -- Fruit
</pre>
<p> </p>
<hr/>
<h2><a id="yahoospelling" name="yahoospelling" title="yahoospelling"></a>Using Yahoo! to
suggest spelling corrections</h2>
<p>To get spelling suggestions from Yahoo!, use the <i>yahoo.suggest_spelling()</i> command. It
returns a list of suggested spelling corrections:</p>
<pre>corrections = web.yahoo.suggest_spelling("amazoon", cached=False)
print corrections
>>> ['amazon']
</pre>
<p> </p>
<hr/>
<h2><a id="yahoosort" name="yahoosort" title="yahoosort"></a>Using Yahoo! to sort
associatively</h2>
<p>You can use Yahoo! to sort concepts according to association. Is <i>sky</i> more
<i>green</i>, <i>red</i> or <i>blue</i>?</p>
<pre>colors = ["green", "blue", "red"]
sorted = web.yahoo.sort(colors, "sky", strict=False, cached=True)
for word, weight in sorted:
    print word, weight
>>> blue sky 0.396039408425
>>> red sky 0.33604773392
>>> green sky 0.267912857655
</pre>
<p>In this example, Yahoo! is queried for <i>green sky</i>, <i>blue sky</i> and <i>red sky</i>.
The result is a sorted list of (<i>query</i>, <i>weight</i>) tuples. We learn that <i>blue</i>
is the color best associated with <i>sky</i>.</p>
<p>The <a href="http://nodebox.net/code/index.php/Prism">Prism</a> algorithm roughly works in
this way.</p>
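<p>The three weights in the output above add up to 1, which suggests that they are normalized scores. One way such weights could be derived is sketched below, assuming they are proportional to the number of hits per query. That assumption, the function <i>sort_by_weight()</i> and the hit counts are all made up for illustration (Python 3).</p>

```python
def sort_by_weight(counts):
    """Turn {query: hit count} into (query, weight) pairs, highest weight first."""
    total = float(sum(counts.values())) or 1.0  # avoid dividing by zero
    weighted = [(query, count / total) for query, count in counts.items()]
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)

# Made-up hit counts, for illustration only.
hypothetical_counts = {"green sky": 268, "blue sky": 396, "red sky": 336}
ranking = sort_by_weight(hypothetical_counts)
for query, weight in ranking:
    print(query, weight)
```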
<p> </p>
<hr size="2" width="100%"/>
<h2><a id="google" name="google" title="google"></a>Querying Google</h2>
<p>You can run <a href="http://www.google.com">Google</a> queries in the same way as <a href="#yahoo">querying Yahoo!</a>.<br/>
The library has the following commands available:</p>
<pre>web.google.search(q, start=0, cached=False)
</pre>
<pre>web.google.search_images(q, start=0, size="", cached=False)
</pre>
<pre>web.google.search_news(q, start=0, cached=False)
</pre>
<pre>web.google.search_blogs(q, start=0, cached=False)
</pre>
<pre>web.google.sort(words, context="", strict=True, cached=False)
</pre>
<p>The search commands return a list of results. This list has an additional <i>total</i>
property. Each item in the list is a <i>GoogleResult</i> object with the following properties:</p>
<ul>
<li><i>result.url</i>: the URL of the linked page
</li><li><i>result.title</i>: the title of the linked page
</li><li><i>result.description</i>: a short description for the page
</li><li><i>result.date</i>: the date of the item (news and blog searches only)
</li><li><i>result.author</i>: the author of the item (news and blog searches only)
</li><li><i>result.location</i>: the location of the item (news searches only)
</li></ul>
<p>Per search, a list of 8 items is returned from the given <i>start</i> item. Google will only
ever return the first 32 results, so the maximum value for <i>start</i> is 24.</p>
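<p>The usable values for <i>start</i> follow directly from those two numbers. A small sketch in Python 3; the constant and function names are chosen for the example only:</p>

```python
PAGE_SIZE = 8      # results per search, as stated above
MAX_RESULTS = 32   # Google returns no more than this many results

def start_values(page_size=PAGE_SIZE, max_results=MAX_RESULTS):
    """List every usable value for the start parameter."""
    return list(range(0, max_results, page_size))

print(start_values())
```

<p>This gives 0, 8, 16 and 24. Each value could be passed as <i>start</i> to one of the search commands to walk through all available results.</p>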
<p>When searching for images, the results can be filtered for image <i>size</i> with the
optional size parameter.<br/>
Acceptable values are ‘small’, ‘medium’, ‘large’ and ‘wallpaper’.<br/></p>
<p> </p>
<hr/>
<h2><a id="newsfeed" name="newsfeed" title="newsfeed"></a>Reading newsfeeds</h2>
<p>The <i>newsfeed.parse()</i> command returns information from <a href="http://en.wikipedia.org/wiki/Rss">RSS</a> or <a href="http://en.wikipedia.org/wiki/Atom_%28standard%29">Atom</a> newsfeeds as a collection of news
items with a title, link, description, and more.</p>
<pre>web.newsfeed.parse(url, wait=10, cached=True)
</pre>
<p>The returned newsfeed object has the following properties:</p>
<ul>
<li><i>newsfeed.title</i>: the title of the newsfeed<br/>
</li><li><i>newsfeed.description</i>: a short description for the newsfeed<br/>
</li><li><i>newsfeed.link</i>: a link to the homepage of the news channel<br/>
</li><li><i>newsfeed.date</i>: a publication date of the news channel<br/>
</li><li><i>newsfeed.encoding</i>: the text encoding used (usually Unicode)<br/>
</li><li><i>newsfeed.items</i>: a list of news items<br/>
</li></ul>
<p>The <i>items</i> property is a list in which each item object has properties of its own:</p>
<ul>
<li><i>item.title</i>: the title of the news item<br/>
</li><li><i>item.link</i>: a link to the full article online
</li><li><i>item.description</i>: a short description of the news item
</li><li><i>item.date</i>: the publication date of the news item<br/>
</li><li><i>item.author</i>: the author of the news item
</li></ul><br/>
<pre>newsfeed = web.newsfeed.parse("http://www.whitehouse.gov/rss/news.xml")
print "Channel:", newsfeed.title
print "Channel description:", newsfeed.description
for item in newsfeed.items:
    print "Title:", item.title
    print "Link:", item.link
    print "Description:", item.description
    print "Date:", item.date
    print "Author:", item.author
</pre>
<p><br/>
There are other properties as well, like <i>item.date_parsed</i>, <i>item.author_detail</i> and
<i>item.author_detail.email</i>. See the <a href="http://www.feedparser.org/">Universal Feed
Parser</a> documentation for more information.</p>
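<p>To see where the channel and item properties come from, here is a sketch that reads a tiny, made-up RSS 2.0 document with the Python 3 standard library. It is not a substitute for the Universal Feed Parser, which copes with many more formats and with malformed feeds. The function <i>read_items()</i> and the feed text are assumptions made for the example.</p>

```python
import xml.etree.ElementTree as ET

# A made-up, minimal RSS 2.0 feed.
RSS = """<rss version="2.0"><channel>
<title>Example channel</title>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>"""

def read_items(xml_text):
    """Return the channel title and a list of (title, link) pairs."""
    channel = ET.fromstring(xml_text).find("channel")
    items = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return channel.findtext("title"), items

channel_title, items = read_items(RSS)
print("Channel:", channel_title)
for title, link in items:
    print("Title:", title, "Link:", link)
```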
<p>The address of some well-known newsfeeds can be found in the <i>newsfeed.favorites</i>
dictionary or with the <i>newsfeed.favorite_url()</i> command:</p>
<pre>print web.newsfeed.favorite_url("apple")
>>> http://images.apple.com/main/rss/hotnews/hotnews.rss
</pre>
<hr/>
<h2><a id="wikipedia" name="wikipedia" title="wikipedia"></a>Retrieving articles from
Wikipedia</h2>
<p><a href="http://en.wikipedia.org/">Wikipedia</a> is a multilingual, web-based, free content
encyclopedia project. Wikipedia is written collaboratively by volunteers; the vast majority of
its articles can be edited by anyone with access to the Internet.</p>
<p>The <i>wikipedia.search()</i> command retrieves articles from Wikipedia. It parses an
article corresponding to the given query into lists of related articles and paragraphs of plain
text without any HTML or other markup in it:</p>
<pre>web.wikipedia.search(q, language="en", light=False, wait=10, cached=True)
</pre>
<p>The command takes a <i>q</i> query parameter and, optionally, the <i>language</i> the
article should be written in. When <i>light</i> is True, only a <i>title</i>, <i>links</i> to
other articles, <i>categories</i> and <i>disambiguation</i> will be parsed from the article
(it’s faster than a full parse).</p>
<p>Note that the <i>q</i> parameter is case-insensitive - this gives the best chance of
retrieving an article.<br/>
If you do want case-sensitivity use <i>search(q, case_sensitive=True)</i>.</p>
<p>The return value is an article object with the following properties:</p>
<ul>
<li><i>article.title</i>: the title of the article<br/>
</li><li><i>article</i><i>.links</i>: a list of titles of related articles
</li><li><i>article</i><i>.categories</i>: a list of categories this article belongs to<br/>
</li><li><i>article</i><i>.disambiguation</i>: a list of article titles describing other
interpretations of the query
</li><li><i>article</i><i>.paragraphs</i>: a list of paragraph objects
(<i>WikipediaParagraph</i>)<br/>
</li><li><i>article</i><i>.images</i>: a list of image objects (<i>WikipediaImage</i>)
</li><li><i>article</i><i>.tables</i>: a list of table objects (<i>WikipediaTable</i>)
</li><li><i>article</i><i>.references</i>: a list of reference objects (<i>WikipediaReference</i>)
</li><li><i>article</i><i>.translations</i>: a dictionary of language keys linking to title
translations<br/>
</li><li><i>article</i><i>.important</i>: important phrases that appear in bold in the online
article<br/>
</li><li><i>article</i><i>.markup</i>: the source text of the article in <a href="http://www.mediawiki.org/wiki/MediaWiki">MediaWiki</a> markup
</li></ul>
<p> </p>
<hr/>
<p><span class="grey_box">Article links</span></p>
<p>Each of the titles in the <i>article.links</i> list can be supplied to
<i>wikipedia.search()</i> to get more information on that topic.</p>
<pre>article = web.wikipedia.search("plotdevice")
print article.title
print article.links
>>> PlotDevice
>>> [u'2007', u'2D computer graphics', u'Adobe Photoshop', u'Animation',
>>> u'CMYK', u'Computer animation', u'Core Image', u'DrawBot',
>>> u'February 27', u'Graphic design', u'HSV color space',
>>> u'MIT License', u'Mac OS X', u'OpenGL', u'Portable Document Format',
>>> u'PostScript', u'Processing (programming language)',
>>> u'Python (programming language)', u'QuickTime', u'RGB', u'SVG',
>>> u'WordNet', u'alpha transparency', u'artificial intelligence',
>>> u'collage', u'graphic design'
>>> ]
</pre>
<hr/>
<p><span class="grey_box">Article paragraphs</span></p>
<p>Each paragraph object in <i>article.paragraphs</i> is a list of plain text blocks.
Furthermore, a paragraph has a number of properties. This code snippet loops through all
the paragraphs:</p>
<pre>article = web.wikipedia.search("plotdevice")
for paragraph in article.paragraphs:
    # A paragraph's depth determines
    # if it's a subparagraph or a top-level paragraph.
    if paragraph.depth <= 1:
        print "="*100
    else:
        print "-"*100
    print paragraph.title + "\n"
    # Each paragraph is a list of separate blocks of text:
    for textblock in paragraph:
        print textblock + "\n"
</pre>
<p>To display a paragraph with the <a href="../ref/Primitives.html#text()">text()</a> command
you can loop over all the textblocks individually:</p>
<pre>fontsize(10)
x, y, w = 20, 20, 300
for textblock in article.paragraphs[0]:
    text(textblock, x, y, width=w)
    y += textheight(textblock, width=w) + 10
</pre>
<p>Or simply use the Python <i>str()</i> command on the entire list:</p>
<pre>text(str(article.paragraphs[0]), 20, 20, width=300)
</pre>
<p>A paragraph object has the following properties:</p>
<ul>
<li><i>paragraph.title</i>: the title or heading of this paragraph<br/>
</li><li><i>paragraph.depth</i>: the depth of the paragraph<br/>
</li><li><i>paragraph.parent</i>: a <i>WikipediaParagraph</i> object containing this subparagraph
</li><li><i>paragraph.children</i>: a list of sub-<i>WikipediaParagraph</i> objects
</li><li><i>paragraph.main</i>: a list of article titles whose contents describe this paragraph in
detail
</li><li><i>paragraph.related</i>: more article titles that have related contents
</li><li><i>paragraph.tables</i>: a list of <i>WikipediaTable</i> objects found in this
paragraph<br/>
</li></ul>
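<p>As a rough illustration of how <i>title</i> and <i>depth</i> combine, the sketch below turns
a flat list of paragraphs into an indented outline. <i>StubParagraph</i> and <i>outline()</i> are
hypothetical names used only so the example runs outside PlotDevice; in a script you would
pass <i>article.paragraphs</i> instead of the stand-in list.</p>
<pre>class StubParagraph(list):
    # Hypothetical stand-in for a paragraph object: a list of text blocks
    # that also carries the title and depth properties described above.
    def __init__(self, title, depth, blocks=()):
        list.__init__(self, blocks)
        self.title = title
        self.depth = depth

def outline(paragraphs, indent="  "):
    # One line per paragraph, indented according to its depth.
    return [indent * p.depth + p.title for p in paragraphs]

paragraphs = [
    StubParagraph("Computer", 0, ["First block of text."]),
    StubParagraph("History", 1),
    StubParagraph("Early devices", 2),
]
for line in outline(paragraphs):
    print(line)
</pre>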
<p> </p>
<hr/>
<p><span class="grey_box">Article images</span></p>
<p>Image objects in the <i>article.images</i> list have properties (like a description) that
can help in discerning what is being displayed in the image:</p>
<pre>article = web.wikipedia.search("computer")
for img in article.images:
    print img.description
>>> The NASA Columbia Supercomputer.
>>> A computer in a wristwatch.
>>> The Jacquard loom was one of the first programmable devices.
>>> ...
</pre>
<p>An image object has the following properties:</p>
<ul>
<li><i>image.path</i>: the image’s filename
</li><li><i>image.description</i>: a description parsed from the source content
</li><li><i>image.links</i>: a list of related article titles
</li><li><i>image.properties</i>: a list of properties parsed from the source (e.g. <i>left</i>,
<i>thumb</i>, ...)<br/>
</li></ul>
<p><br/>
Finally, here is a little web mash-up to draw article images in PlotDevice:</p>
<pre># 1) Get the image filename from the article.
article = web.wikipedia.search("computer")
img = article.images[0]
# 2) Get the HTML for the Wikipedia page displaying the full-size image
img = img.path.replace(" ", "_")
html = web.page.parse("http://en.wikipedia.org/wiki/Image:"+img)
# 3) Find the link in the HTML pointing to the image file.
for a in html.find_all("a"):
    if a.has_key("href") and a["href"].endswith(img):
        img = a["href"]
        break
# 4) Download the image link.
# Pass the downloaded data to the image() command.
img = web.download(img)
image(None, 0, 0, data=img)
</pre>
<p><img alt="web-wikipedia1" height="326" src="../etc/lib/web-wikipedia1.jpg" width="550"/><br/></p>
<p> </p>
<hr/>
<p><span class="grey_box">Article tables</span><br/></p>
<p>Tables parsed from an article can be accessed from the <i>article.tables</i> list or from
<i>article.paragraphs[i].tables</i>. A table object is a list of rows. Each row is a list of
cells:</p>
<pre>article = web.wikipedia.search("computer")
table = article.tables[0]
print table.paragraph.title
print table.title, "("+table.properties+")"
for row in table:
    print "-"*50
    print row.properties
    for cell in row:
        print cell, "("+cell.properties+")"
</pre>
<p>As you can see, tables, rows and cells all have <i>properties.</i> Tables also have a
<i>title</i> property and a <i>paragraph</i> property linking to the paragraph object in which
this table was found.</p>
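<p>Because a table is a list of rows and each row is a list of cells, it can be flattened
into lines of text with ordinary list handling. The sketch below assumes that cells behave
like strings; <i>table_rows()</i> is a hypothetical helper and the nested list is stand-in
data, not output from Wikipedia.</p>
<pre>def table_rows(table, separator=" | "):
    # Join the cells of each row into a single line of text.
    return [separator.join(row) for row in table]

# Hypothetical stand-in for a table object: a list of rows,
# each of which is a list of cells.
table = [["a", "b"], ["c", "d"]]
for line in table_rows(table):
    print(line)
</pre>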
<p> </p>
<hr/>
<p><span class="grey_box">Article references</span><br/></p>
<p>Text blocks in a Wikipedia paragraph can contain numerous references to websites, journals
and footnotes. They are marked as a number between square brackets, e.g. [15].</p>
<p>For example:</p>
<pre>>>> A key component common to all CPUs is the program counter,
>>> a special memory cell (a register) that keeps track of which
>>> location in memory the next instruction is to be read from. [11]
</pre>
<p>The marker [11] corresponds to <i>article.references[10]</i>, since list indices start from
0:</p>
<pre>print article.references[10]
>>> Instructions often occupy more than one memory address,
>>> so the program counters usually increases by the number of
>>> memory locations required to store one instruction
</pre>
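<p>The off-by-one mapping between markers and list indices can be captured in a small helper.
This is only a sketch: <i>reference_indices()</i> is a hypothetical function, and it assumes
markers always appear as a plain number between square brackets.</p>
<pre>import re

def reference_indices(textblock):
    # Find markers such as [11] and convert them to zero-based list indices.
    return [int(number) - 1 for number in re.findall(r"\[(\d+)\]", textblock)]

textblock = "location in memory the next instruction is to be read from. [11]"
print(reference_indices(textblock))
</pre>
<p>Each returned index can then be used to look up the matching entry in
<i>article.references</i>.</p>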
<p>Each reference object in the <i>article.references</i> list has a number of properties. In
the worst case all of the information is stored in <i>reference.note</i>; in the best case the
reference has data for all of the following properties:</p>
<ul>
<li><i>reference.title</i>: a title of a publication<br/>
</li><li><i>reference.url</i>: a link to a web page<br/>
</li><li><i>reference.author</i>: the author of a publication<br/>
</li><li><i>reference.first</i>: the author’s first name<br/>
</li><li><i>reference.last</i>: the author’s last name
</li><li><i>reference.journal</i>: the journal in which the article is published
</li><li><i>reference.publisher</i>: the journal’s publisher
</li><li><i>reference.date</i>: publication date<br/>
</li><li><i>reference.year</i>: publication year<br/>
</li><li><i>reference.id</i>: an ISBN book number
</li><li><i>reference.note</i>: footnotes and descriptions
</li></ul>
<p> </p>
<hr/>
<p><span class="grey_box">Article translations</span></p>
<p>Here’s an example script that shows how to get the translated version of an article:</p>
<pre>article = web.wikipedia.search("computer")
language = "fr"
if article.translations.has_key(language):
    translation = article.translations[language]
    article = web.wikipedia.search(translation, language)
print article.title
>>> Ordinateur
</pre>
<p>The dictionary of all languages used in Wikipedia:</p>
<pre>print web.wikipedia.languages
</pre>
<hr/>
<h2><a id="wikipediahelpers" name="wikipediahelpers" title="wikipediahelpers"></a>Some helper
commands to draw Wikipedia content in PlotDevice</h2>
<p>A number of commands in the library can help you find out how to display content from
Wikipedia in PlotDevice. There are commands to draw lists, math equations and tables.</p>
<pre># Returns True if a given text block str in a paragraph
# is preformatted text, e.g. a programming code example.
web.wikipedia.is_preformatted(str)
</pre>
<pre># Returns True if a text block in a paragraph is a (numbered) list.
web.wikipedia.is_list(str)
</pre>
<pre># Returns True if a text block in a paragraph is a math equation.
web.wikipedia.is_math(str)
</pre>
<pre># Draws a list text block at x, y coordinates in PlotDevice.
web.wikipedia.draw_list(str, x, y, w, padding=5, callback=None)
</pre>
<pre># Use mimeTeX to draw an image of a math equation at x, y.
web.wikipedia.draw_math(str, x, y, alpha=1.0)
</pre>
<pre># Draws WikipediaTable object in PlotDevice; works very poorly.
web.wikipedia.draw_table(table, x, y, w, padding=5)
</pre>
<p><img alt="web_wikipedia2" height="324" src="../etc/lib/web_wikipedia2.jpg" width="550"/></p>
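<p>When drawing a paragraph, each text block has to be matched to the helper that suits it.
The sketch below shows one way to make that decision. The order in which the checks are tried
is an assumption, <i>block_kind()</i> is a hypothetical name, and <i>StubWikipedia</i> is a
stand-in with made-up rules so the example runs outside PlotDevice; in a script you would pass
<i>web.wikipedia</i> as the second argument.</p>
<pre>def block_kind(textblock, wikipedia):
    # Ask the helper predicates in turn and fall back to plain text.
    if wikipedia.is_preformatted(textblock):
        return "preformatted"
    if wikipedia.is_list(textblock):
        return "list"
    if wikipedia.is_math(textblock):
        return "math"
    return "text"

class StubWikipedia(object):
    # Hypothetical stand-in; the real predicates apply their own rules.
    def is_preformatted(self, s):
        return s.startswith(" ")
    def is_list(self, s):
        return s.startswith("*")
    def is_math(self, s):
        return s.startswith("\\")

print(block_kind("* first item", StubWikipedia()))
</pre>
<p>A drawing loop could then call <i>draw_list()</i> for "list" blocks, <i>draw_math()</i> for
"math" blocks and <i>text()</i> for everything else.</p>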
<p> </p>
<hr/>
<h2><a id="morguefile" name="morguefile" title="morguefile"></a>Querying morgueFile for
images</h2>
<p><a href="http://www.morguefile.com">morgueFile</a> contains photographs freely contributed
by many artists to be used in creative projects by visitors to the site.</p>
<p>The <i>morguefile.search()</i> command returns a list of images on morgueFile that
correspond to a given query. It has an optional parameter <i>max</i> specifying the maximum
number of images to return:</p>
<pre>web.morguefile.search(q, max=100, wait=10, cached=True)
</pre>
<pre>web.morguefile.search_by_author(q, max=100, wait=10, cached=True)
</pre>
<p>Each image object in the returned list has the following properties:</p>
<ul>
<li><i>img.id</i>: the unique morgueFile ID for the image
</li><li><i>img.category</i>: the category the image belongs to
</li><li><i>img.author</i>: the name of the author
</li><li><i>img.name</i>: the image’s name
</li><li><i>img.url</i>: the URL of the image thumbnail
</li><li><i>img.width</i>: the image width in pixels
</li><li><i>img.height</i>: the image height in pixels
</li><li><i>img.date</i>: the date the image was added to morgueFile
</li></ul><br/>
<pre>images = web.morguefile.search("leaf", max=10)
for img in images:
    print img.name
>>> IMG_1662_d.JPG
>>> fedegrafo_100_0221.jpg
>>> cha827.jpg
>>> Filiford_P1010003.JPG
>>> IMG_8336.jpg
>>> CIMG0012_s.JPG
>>> Target_spot_disease_on_maple.JPG
>>> IMG_1664.JPG
>>> bumpy_leaf.JPG
>>> Aztec_Grass.JPG
</pre>
<p>Each image object in the list has a <i>download()</i> method that stores the image file
locally in cache. It has an optional parameter <i>size</i>, which can be set to ‘small’ (image
thumbnail) or ‘large’ (default):</p>
<pre>img = images[0]
img.download(size="large", wait=60)
image(img.path, 0, 0, width=img.width, height=img.height)
print img.author, img.path
>>> Filiford cache/morguefile/240060eb1e1a0628ae32aff811b167ef.JPG
</pre>
<p><img alt="web_morguefile2" height="366" src="../etc/lib/web_morguefile2.jpg" width="550"/></p>
<p> </p>
<hr size="2" width="100%"/>
<h2><a id="flickr" name="flickr" title="flickr"></a>Querying Flickr for images</h2>
<p><a href="http://www.flickr.com">Flickr</a> is an online photo management and sharing
application.<br/>
You can query it for images in the same way as <a href="#morguefile">querying morgueFile</a>.</p>
<pre>web.flickr.search(q, start=1, count=100, wait=10, cached=True)
</pre>
<pre>web.flickr.recent(start=1, count=100, wait=10, cached=True)
</pre>
<p>The <i>flickr.search()</i> command has two optional parameters: <i>sort</i> and
<i>match</i>. The sort order can be set to ‘interest’, ‘relevance’ (default) or ‘date’.
The <i>match</i> parameter can be either ‘all’ (image tags must include all of the search
words) or ‘any’ (default).</p>
<p>Each image object in the returned list has a <i>download()</i> method like the morgueFile
interface. Its optional <i>size</i> parameter selects the size of the image to download and
can be ‘square’, ‘small’, ‘medium’, ‘large’ or ‘wallpaper’.<br/></p>
<p> </p>
<hr/>
<h2><a id="kuler" name="kuler" title="kuler"></a>Querying kuler for color themes</h2>
<p><a href="http://kuler.adobe.com/">kuler</a> is an Adobe web-application that allows users to
construct and share color themes.</p>
<p>The <i>kuler.search()</i> command returns a list of color themes on kuler that correspond to
a given query. It has an optional <i>page</i> parameter defining the starting page (each page
has 30 themes).</p>
<pre>web.kuler.search(q, page=0, wait=10, cached=True)
</pre>
<pre>web.kuler.search_by_id(id, page=0, wait=10, cached=True)
</pre>
<pre>web.kuler.search_by_popularity(page=0, wait=10, cached=True)
</pre>
<pre>web.kuler.search_by_rating(page=0, wait=10, cached=True)
</pre>
<p>Each theme object in the returned list has the following properties:</p>
<ul>
<li><i>theme.id</i>: the unique kuler id for the theme
</li><li><i>theme.author</i>: the name of the author<br/>
</li><li><i>theme.label</i>: the title of the theme<br/>
</li><li><i>theme.tags</i>: a list of keywords for a theme found with
<i>kuler.search_by_id()</i>
</li><li><i>theme.darkest</i>: a tuple of (R, G, B) values for the darkest color in the theme<br/>
</li><li><i>theme.lightest</i>: a tuple of (R, G, B) values for the lightest color in the theme<br/>
</li></ul>
<p>You can loop through a theme object as a list. Each item in the theme is a tuple of (R, G,
B) values, which you can supply to <a href="../ref/Line+Color.html#fill()">fill()</a> or
<a href="../ref/Line+Color.html#stroke()">stroke()</a> in PlotDevice.</p>
<pre>themes = web.kuler.search("banana")
for r, g, b in themes[0]:
    print r, g, b
# The kuler.preview() command gives you an idea of the theme's colors.
web.kuler.preview(themes[0])
</pre>
<p><img alt="web_kuler1" height="394" src="../etc/lib/web_kuler1.jpg" width="550"/><br/></p>
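<p>Since a theme can be treated as a list of (R, G, B) tuples, its colors can be ranked with
plain Python. The sketch below orders colors by the sum of their channels as a rough measure
of brightness. This is only one plausible reading of "darkest" and "lightest"; the library may
compute <i>theme.darkest</i> and <i>theme.lightest</i> differently. The function name and the
stand-in theme are hypothetical.</p>
<pre>def darkest_and_lightest(theme):
    # Rank colors by the sum of their channels, a rough brightness measure
    # that works whether channels run from 0 to 1 or from 0 to 255.
    colors = sorted(theme, key=sum)
    return colors[0], colors[-1]

# Hypothetical stand-in for a theme: a list of (R, G, B) tuples.
theme = [(0.9, 0.8, 0.2), (0.2, 0.1, 0.0), (1.0, 1.0, 0.9)]
dark, light = darkest_and_lightest(theme)
print(dark)
print(light)
</pre>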
<p> </p>
<p>Each theme also has a draw() method that displays the colors in the theme in PlotDevice:</p>
<pre>themes = web.kuler.search("banana")
themes[0].draw(50, 50, w=40, h=120)
</pre>
<p><img alt="web_kuler2" height="299" src="../etc/lib/web_kuler2.jpg" width="549"/><br/></p>
<p> </p>
<hr size="2" width="100%"/>
<h2><a id="colr" name="colr" title="colr"></a>Querying Colr for color themes</h2>
<p><a href="http://www.colr.org">Colr</a> is an online tool by Lloyd Dalton to let people
fiddle around with colors.<br/>
You can query it for color themes in the same way as querying <a href="#kuler">kuler</a>.</p>
<pre>web.colr.search(q, page=0, wait=10, cached=True)
</pre>
<pre>web.colr.search_by_id(id, page=0, wait=10, cached=True)
</pre>
<pre>web.colr.latest(page=0, wait=10, cached=True)
</pre>
<pre>web.colr.random(page=0, wait=10, cached=True)
</pre>
<p>You can manipulate each theme object in the returned list as with the kuler interface.<br/></p>
<p> </p>
<hr/>
<h2><a id="math" name="math" title="math"></a> Creating PNG or GIF images from math
equations</h2>
<p>John Forkosh has a <a href="http://www.forkosh.com/mathtex.html">mathTeX server</a> that
creates PNG or GIF images from math equations.</p>
<pre>web.mathtex.png(equation, dpi=120, color="")
</pre>
<pre>web.mathtex.gif(equation, dpi=120, color="")
</pre>
<p>The optional <i>dpi</i> parameter sets the image resolution, while <i>color</i> can be the
name of a primary color (e.g. blue, green, red, ...).</p>
<pre>equation = r"E = hf = \frac{hc}{\lambda} \,\! "
img = web.mathtex.gif(equation)
image(img, 0, 0)
</pre>
<p><img alt="web_mimetex1" height="35" src="../etc/lib/web_mimetex1.jpg" width="153"/></p>
<p> </p>
<hr/>
<h2><a id="urbandictionary" name="urbandictionary" title="urbandictionary"></a>Word definitions
from Urban Dictionary</h2>
<p><a href="http://www.urbandictionary.com">Urban Dictionary</a> is a slang dictionary with
user-defined descriptions of words. You can often get some humorous (or cruel and childish)
results from it.</p>
<p>The <i>urbandictionary.search()</i> command returns a list of definitions for a given word.</p>
<pre>web.urbandictionary.search(q, cached=True)
</pre>
<p>Each definition object in the returned list has the following properties:</p>
<ul>
<li><i>definition.description</i>: a description of the given word
</li><li><i>definition.example</i>: example usage of the word in a sentence
</li><li><i>definition.author</i>: the author who came up with the definition
</li><li><i>definition.links</i>: a list of words linked to the definition
</li><li><i>definition.url</i>: the web page where the definition can be found
</li></ul>
<pre>definitions = web.urbandictionary.search("life")
print definitions[0].description
>>> A sexually-transmitted, terminal disease.
</pre>
<hr/>
<h2><a id="asynchronous" name="asynchronous" title="asynchronous"></a>Working with asynchronous
downloads</h2>
<p>Downloading content from the internet may take a moment to complete. When running an
animation in PlotDevice, you don’t want it to halt while PlotDevice waits for the download to
complete. The Web library offers you the capability to download <i>asynchronously</i>.
<b>PlotDevice will then continue running with the download taking place in the
background</b>. Once it is done you can start manipulating the retrieved data.</p>
<p>The <i>url.retrieve()</i> command has some optional parameters to do asynchronous downloads:</p>
<pre>web.url.retrieve(url, wait=60, asynchronous=False, cache=None, type=".html")
</pre>
<p>With <i>asynchronous</i> set to True, the download will occur in the background. The
returned object has a <i>done</i> property which is True when downloading has terminated. The
object’s <i>data</i> property then contains the source data.</p>
<p>You can also set <i>wait</i>, the maximum number of seconds PlotDevice will spend on the
connection. When that limit is exceeded before the data has been fully retrieved, the
returned object will have its <i>error</i> property set. The <i>error</i> property is also set
when something else goes wrong, usually with a <i>URLTimeout</i>, an
<i>HTTPError</i> or an <i>HTTP404NotFound</i> exception.</p>
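<p>The polling pattern behind the <i>done</i>, <i>data</i> and <i>error</i> properties can be
sketched with a plain Python thread. <i>BackgroundTask</i> and <i>poll()</i> are hypothetical
names that only mimic the behaviour described above; they are not the library's own
implementation. In a PlotDevice script the object would come from
<i>web.url.retrieve(url, asynchronous=True)</i> and the check would run once per frame.</p>
<pre>import threading
import time

class BackgroundTask(object):
    # Hypothetical stand-in that mimics the done, data and error
    # properties of the object returned by an asynchronous download.
    def __init__(self, fetch):
        self.data = None
        self.error = None
        self._thread = threading.Thread(target=self._run, args=(fetch,))
        self._thread.daemon = True
        self._thread.start()

    def _run(self, fetch):
        # Runs in the background; stores either the result or the exception.
        try:
            self.data = fetch()
        except Exception as e:
            self.error = e

    @property
    def done(self):
        return not self._thread.is_alive()

def poll(task):
    # Meant to be called once per animation frame; it never blocks.
    if not task.done:
        return "downloading"
    if task.error is not None:
        return "failed"
    return "ready"

task = BackgroundTask(lambda: "stand-in for downloaded data")
while not task.done:
    time.sleep(0.01)  # an animation would draw a frame here instead
print(poll(task))
</pre>
<p>Checking <i>error</i> before touching <i>data</i> keeps a timed-out or failed download from
being mistaken for an empty result.</p>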