Web Scraper Scenario by sirMackk · Pull Request #235 · realpython/python-guide

sirMackk · 2012-12-31T22:28:16Z

I wrote a scenario about scraping data from a website using lxml and Requests, with pointers on ways to expand on this functionality. One reason I got around to this was that I couldn't find a good resource for writing a scraper in Python that had everything in one place.
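
For readers skimming this thread, a minimal sketch of the approach the scenario describes (fetch with Requests, parse with lxml, query with XPath) might look like the following. The sample markup, the XPath queries, and the helper names `extract_text` and `fetch_page` are illustrative assumptions, not code taken from this pull request.

```python
from lxml import html


def extract_text(page_source, xpath_query):
    """Return the non-empty, stripped text results of an XPath query on an HTML string."""
    tree = html.fromstring(page_source)
    return [text.strip() for text in tree.xpath(xpath_query) if text.strip()]


def fetch_page(url):
    """Download a page with Requests and return its body as bytes."""
    import requests  # imported here so the parsing helper also works offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.content


# Hypothetical markup standing in for a downloaded page.
SAMPLE_PAGE = """
<html><body>
  <div title="buyer-name">Carson Busses</div>
  <span class="item-price">$29.95</span>
  <div title="buyer-name">Earl E. Byrd</div>
  <span class="item-price">$8.37</span>
</body></html>
"""

buyers = extract_text(SAMPLE_PAGE, '//div[@title="buyer-name"]/text()')
prices = extract_text(SAMPLE_PAGE, '//span[@class="item-price"]/text()')
```

Against a live site, the string passed to `extract_text` would come from `fetch_page(url)` instead of `SAMPLE_PAGE`, with the XPath queries adjusted to that site's markup.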


markotibold · 2013-01-02T12:04:21Z

Is this a typo?

lyndsysimon · 2013-01-07T22:08:52Z

Considering the opinionated nature of this guide, does anyone have thoughts on rewriting this example to use BeautifulSoup? It's much more Pythonic, and doesn't require lxml (which can be difficult to install).

kennethreitz · 2013-01-07T22:15:51Z

@lyndsysimon BeautifulSoup has a lot of nuances (its releases often behave very differently from one another), and I'd like to avoid recommending it to people if possible. It deserves a mention though.

lyndsysimon · 2013-01-07T23:06:18Z

Fair enough.

sirMackk added 7 commits December 31, 2012 10:22

Added scenario about web scraping using lxml

faae04c

2nd draft of web scraping scenario

3aef3bd

Fixed some markup.

Third, final markup fixes.

c3d7bdd

Added a bit more code to improve understanding.

83c9cba

Fixing html code-block

a22a6e9

Using requests instead of urllib2, final draft.

32dea94

Final version

aa7f9aa

kennethreitz pushed a commit that referenced this pull request Dec 31, 2012

Merge pull request #235 from sirMackk/master

e1b1201

Web Scraper Scenario

kennethreitz merged commit e1b1201 into realpython:master Dec 31, 2012

markotibold reviewed Jan 2, 2013
Comment thread docs/scenarios/scrape.rst

markotibold Jan 2, 2013

Is this a typo?
