Commit eba6ce5

committed
updating ch 7 ordering
1 parent cc4e51b commit eba6ce5

4 files changed

+236
-2
lines changed

feeds/all.atom.xml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,2 +1,2 @@
11
<?xml version="1.0" encoding="utf-8"?>
2-
<feed xmlns="http://www.w3.org/2005/Atom"><title>Matt Makai</title><link href="http://www.fullstackpython.com/" rel="alternate"></link><link href="http://www.fullstackpython.com/feeds/all.atom.xml" rel="self"></link><id>http://www.fullstackpython.com/</id><updated>2014-08-13T09:27:55Z</updated></feed>
2+
<feed xmlns="http://www.w3.org/2005/Atom"><title>Matt Makai</title><link href="http://www.fullstackpython.com/" rel="alternate"></link><link href="http://www.fullstackpython.com/feeds/all.atom.xml" rel="self"></link><id>http://www.fullstackpython.com/</id><updated>2014-08-13T20:06:30Z</updated></feed>

source/content/pages/07-performance/0701-static-content.markdown

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
title: Static Content
22
category: page
33
slug: static-content
4-
sort-order: 071
4+
sort-order: 0701
55
choice1url: /caching.html
66
choice1icon: fa-repeat
77
choice1text: How do I cache repeated operations to improve performance?
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
title: Caching
2+
category: page
3+
slug: caching
4+
sort-order: 0702
5+
choice1url: /task-queues.html
6+
choice1icon: fa-tasks
7+
choice1text: How do I run Python outside the HTTP request-response cycle?
8+
choice2url: /web-analytics.html
9+
choice2icon: fa-dashboard
10+
choice2text: What can I learn about my users through web analytics?
11+
choice3url: /web-application-security.html
12+
choice3icon: fa-lock fa-inverse
13+
choice3text: What should I know about security to protect my app?
14+
choice4url: /configuration-management.html
15+
choice4icon: fa-gears fa-inverse
16+
choice4text: How do I automate the server configuration that I set up?
17+
18+
19+
# Caching
20+
Caching can reduce the load on servers by storing the results of common
21+
operations and serving the precomputed answers to clients.
22+
23+
For example, instead of retrieving data from database tables that rarely
24+
change, you can store the values in memory. Retrieving values from an
25+
in-memory location is far faster than retrieving them from a database (which
26+
stores them on a persistent disk such as a hard drive). When the cached values
27+
change, the system can invalidate the cache and re-retrieve the updated values
28+
for future requests.
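
The read-through and invalidation flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how memcached or Redis work internally: `SimpleCache`, `load_from_database` and `fake_database` are hypothetical names, and a plain dictionary stands in for the database.

```python
class SimpleCache:
    """Tiny in-memory cache with explicit invalidation (illustrative only)."""

    def __init__(self, loader):
        self._loader = loader   # function that reads from the slow data store
        self._values = {}       # cached values, kept in process memory

    def get(self, key):
        # Only call the slow loader when the key has not been cached yet.
        if key not in self._values:
            self._values[key] = self._loader(key)
        return self._values[key]

    def invalidate(self, key):
        # Drop the cached value so the next get() reloads it.
        self._values.pop(key, None)


# Hypothetical stand-in for a database table that rarely changes.
fake_database = {"site_name": "Full Stack Python"}
db_reads = []  # records every time the "database" is actually read


def load_from_database(key):
    db_reads.append(key)
    return fake_database[key]


cache = SimpleCache(load_from_database)
first = cache.get("site_name")    # reads from the database
second = cache.get("site_name")   # served from memory, no database read

fake_database["site_name"] = "FSP"   # the underlying value changes
cache.invalidate("site_name")        # so the cached copy is invalidated
third = cache.get("site_name")       # re-retrieved from the database
```

In a real deployment the dictionary inside `SimpleCache` would typically be replaced by a shared cache backend so that every application process sees the same values.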
29+
30+
A cache can be created for multiple layers of the stack.
31+
32+
33+
## Caching backends
34+
* [memcached](http://memcached.org/) is a common in-memory caching system.
35+
36+
* [Redis](http://redis.io/) is a key-value in-memory data store that can
37+
easily be configured for caching with libraries such as
38+
[django-redis-cache](https://github.com/sebleier/django-redis-cache).
39+
40+
41+
## Caching resources
42+
* "[Caching: Varnish or Nginx?](https://bjornjohansen.no/caching-varnish-or-nginx)"
43+
reviews some considerations such as SSL and SPDY support when choosing
44+
between Nginx and Varnish as a reverse proxy.
45+
46+
* [Caching is Hard, Draw me a Picture](http://bizcoder.com/caching-is-hard-draw-me-a-picture)
47+
has diagrams of how web request caching layers work. The post is relevant
48+
reading even though the author is describing his Microsoft code as the
49+
impetus for writing the content.
50+
51+
52+
## Caching learning checklist
53+
<i class="fa fa-check-square-o"></i>
54+
Analyze your web application for the slowest parts. It's likely there are
55+
complex database queries that can be precomputed and stored in an in-memory
56+
data store.
57+
58+
<i class="fa fa-check-square-o"></i>
59+
Leverage the in-memory data store you already use for session data
60+
to cache the results of those complex database queries.
61+
A [task queue](/task-queues.html) can often be used to precompute the results
62+
on a regular basis and save them in the data store.
63+
64+
<i class="fa fa-check-square-o"></i>
65+
Incorporate a cache invalidation scheme so the precomputed results remain
66+
accurate when served up to the user.
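
One common invalidation scheme is time-based expiry, where each cached value is discarded after a fixed lifetime. The sketch below assumes that approach; `TTLCache` and its parameters are hypothetical names, and the clock is injectable only so the example can be run without waiting.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (illustrative)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock     # injectable so expiry can be simulated
        self._entries = {}      # key -> (expires_at, value)

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The entry is stale: invalidate it and report a miss.
            del self._entries[key]
            return default
        return value


# Simulated clock: a one-element list that the lambda reads from.
now = [0.0]
ttl_cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])

ttl_cache.set("top_repos", ["repo-a", "repo-b"])
fresh = ttl_cache.get("top_repos")   # still within the 60 second lifetime

now[0] = 61.0                        # pretend 61 seconds have passed
stale = ttl_cache.get("top_repos")   # expired, so the default is returned
```

Expiry is simple but allows stale reads until the lifetime runs out; explicit invalidation on write is the stricter alternative when accuracy matters more.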
67+
68+
69+
70+
### What do you want to learn now that your app is responding faster?
Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
1+
title: Task Queues
2+
category: page
3+
slug: task-queues
4+
sort-order: 0703
5+
choice1url: /logging.html
6+
choice1icon: fa-align-left fa-inverse
7+
choice1text: How do I monitor my app and its task queues with logging?
8+
choice2url: /web-analytics.html
9+
choice2icon: fa-dashboard
10+
choice2text: How can I learn more about the users of my application?
11+
choice3url: /monitoring.html
12+
choice3icon: fa-bar-chart-o fa-inverse
13+
choice3text: What tools exist for monitoring a live web application?
14+
choice4url:
15+
choice4icon:
16+
choice4text:
17+
18+
19+
# Task queues
20+
Task queues manage background work that must be executed outside the usual
21+
HTTP request-response cycle.
22+
23+
24+
## Why are task queues necessary?
25+
Tasks are handled asynchronously either because they are not initiated by
26+
an HTTP request or because they are long-running jobs that would dramatically
27+
slow down an HTTP response.
28+
29+
For example, a web application could poll the GitHub API every 10 minutes to
30+
collect the names of the top 100 starred repositories. A task queue would
31+
handle invoking code to call the GitHub API, process the results and store them
32+
in a persistent database for later use.
33+
34+
Another example is when a database query would take too long during the HTTP
35+
request-response cycle. The query could be performed in the background on a
36+
fixed interval with the results stored in the database. When an
37+
HTTP request comes in that needs those results, a query would simply fetch the
38+
precalculated result instead of re-executing the longer query.
39+
This precalculation scenario is a form of [caching](/caching.html) enabled
40+
by task queues.
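
The precalculation scenario can be sketched with only the Python standard library. This is an in-process illustration of the idea, not a replacement for a real task queue: `slow_query`, `precomputed` and `handle_request` are hypothetical names, a dictionary stands in for the database, and a single worker thread stands in for task queue workers.

```python
import queue
import threading

precomputed = {}        # hypothetical stand-in for the persistent database
tasks = queue.Queue()   # in-process stand-in for a task queue and its broker


def slow_query():
    """Pretend this is the query that is too slow for a request."""
    return sum(n * n for n in range(1000))


def worker():
    """Run queued jobs in the background until a None sentinel arrives."""
    while True:
        job = tasks.get()
        try:
            if job is None:
                return
            name, func = job
            precomputed[name] = func()   # store the result for later requests
        finally:
            tasks.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# Enqueue the slow work outside of any HTTP request.
tasks.put(("slow_query", slow_query))
tasks.join()   # only for this demo: wait until the job has finished


def handle_request():
    # The request handler fetches the stored result instead of recomputing it.
    return precomputed["slow_query"]


response = handle_request()

tasks.put(None)   # tell the worker to stop
thread.join()
```

A production setup would have a scheduler enqueue the job on a fixed interval and would run the workers in separate processes, which is the part that projects such as Celery take care of.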
41+
42+
Other types of jobs for task queues include:
43+
44+
* spreading out large numbers of independent database inserts over time
45+
instead of inserting everything at once
46+
47+
* aggregating collected data values on a fixed interval, such as every
48+
15 minutes
49+
50+
* scheduling periodic jobs such as batch processes
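
As a rough illustration of the first job type in the list above, a large set of independent inserts can be split into small batches that are handed to the queue one at a time. The names `batches`, `insert_batch` and `pending_rows` are hypothetical, and a list stands in for the database.

```python
def batches(rows, size):
    """Yield consecutive slices of rows containing at most size items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


pending_rows = list(range(10))   # stand-in for rows waiting to be inserted
inserted = []                    # stand-in for the database table


def insert_batch(batch):
    # A real task would perform one bulk INSERT here.
    inserted.append(list(batch))


for batch in batches(pending_rows, 4):
    # With a task queue, each batch would be enqueued as its own task,
    # optionally with a delay, instead of being run inline like this.
    insert_batch(batch)
```

Keeping each batch small bounds the load any single task puts on the database, and independent batches can be retried individually if one fails.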
51+
52+
53+
## Task queue projects
54+
The de facto standard Python task queue is Celery. The other task queue
55+
projects that arise tend to come from the perspective that Celery is overly
56+
complicated for simple use cases. My recommendation is to put the effort into
57+
Celery's reasonable learning curve as it is worth the time it takes to
58+
understand how to use the project.
59+
60+
* The [Celery](http://www.celeryproject.org/) distributed task queue is the
61+
most commonly used Python library for handling asynchronous tasks and
62+
scheduling.
63+
64+
* [RQ (Redis Queue)](http://python-rq.org/) is a simple Python
65+
library for queueing jobs and processing them in the background with workers.
66+
RQ is backed by Redis and is designed to have a low barrier to entry.
67+
The [intro post](http://nvie.com/posts/introducing-rq/) contains information
68+
on design decisions and how to use RQ.
69+
70+
* [Taskmaster](https://github.com/dcramer/taskmaster) is a lightweight, simple
71+
distributed queue for handling large volumes of one-off tasks.
72+
73+
74+
## Hosted message and task queue services
75+
Third-party task queue services aim to solve the complexity issues that arise
76+
when scaling out a large deployment of distributed task queues.
77+
78+
* [Iron.io](http://www.iron.io/) is a distributed messaging service platform
79+
that works with many types of task queues such as Celery. It is also built
80+
to work with other IaaS and PaaS environments such as Amazon Web Services
81+
and Heroku.
82+
83+
* [Amazon Simple Queue Service (SQS)](http://aws.amazon.com/sqs/) is a
84+
set of five APIs for creating, sending, receiving, modifying and deleting
85+
messages.
86+
87+
* [CloudAMQP](http://www.cloudamqp.com/) is at its core managed servers with
88+
RabbitMQ installed and configured. This service is an option if you are
89+
using RabbitMQ and do not want to maintain RabbitMQ installations on your
90+
own servers.
91+
92+
93+
## Task queue resources
94+
* [Getting Started Scheduling Tasks with Celery](http://www.caktusgroup.com/blog/2014/06/23/scheduling-tasks-celery/)
95+
is a detailed walkthrough for setting up Celery with Django (although
96+
Celery can also be used without a problem with other frameworks).
97+
98+
* [Distributing work without Celery](http://justcramer.com/2012/05/04/distributing-work-without-celery/)
99+
provides a scenario in which Celery and RabbitMQ are not the right tools
100+
for scheduling asynchronous jobs.
101+
102+
* [Evaluating persistent, replicated message queues](http://www.warski.org/blog/2014/07/evaluating-persistent-replicated-message-queues/)
103+
is a detailed comparison of Amazon SQS, MongoDB, RabbitMQ, HornetQ and
104+
Kafka's designs and performance.
105+
106+
* [Queues.io](http://queues.io/) is a collection of task queue systems with
107+
short summaries for each one. The task queues are not all compatible with
108+
Python but ones that work with it are tagged with the "Python" keyword.
109+
110+
* [Why Task Queues](http://www.slideshare.net/bryanhelmig/task-queues-comorichweb-12962619)
111+
is a presentation on what task queues are and why they are needed.
112+
113+
* [How to use Celery with RabbitMQ](https://www.digitalocean.com/community/articles/how-to-use-celery-with-rabbitmq-to-queue-tasks-on-an-ubuntu-vps)
114+
is a detailed walkthrough for using these tools on an Ubuntu VPS.
115+
116+
* Heroku has a clear walkthrough for using
117+
[RQ for background tasks](https://devcenter.heroku.com/articles/python-rq).
118+
119+
* [Introducing Celery for Python+Django](http://www.linuxforu.com/2013/12/introducing-celery-pythondjango/)
120+
provides an introduction to the Celery task queue.
121+
122+
* [Celery - Best Practices](https://denibertovic.com/posts/celery-best-practices/)
123+
explains things you should not do with Celery and shows some underused
124+
features for making task queues easier to work with.
125+
126+
* The "Django in Production" series by
127+
[Rob Golding](https://twitter.com/robgolding63) contains a post
128+
specifically on [Background Tasks](http://www.robgolding.com/blog/2011/11/27/django-in-production-part-2---background-tasks/).
129+
130+
* [Asynchronous Processing in Web Applications Part One](http://blog.thecodepath.com/2012/11/15/asynchronous-processing-in-web-applications-part-1-a-database-is-not-a-queue/)
131+
and [Part Two](http://blog.thecodepath.com/2013/01/06/asynchronous-processing-in-web-applications-part-2-developers-need-to-understand-message-queues/)
132+
are great reads for understanding what a task queue is and
133+
why you shouldn't use your database as one.
134+
135+
* [A 4 Minute Intro to Celery](https://www.youtube.com/watch?v=68QWZU_gCDA) is
136+
a short introductory task queue screencast.
137+
138+
139+
## Task queue learning checklist
140+
<i class="fa fa-check-square-o"></i>
141+
Pick a slow function in your project that is called during an HTTP request.
142+
143+
<i class="fa fa-check-square-o"></i>
144+
Determine if you can precompute the results on a fixed interval instead of
145+
during the HTTP request. If so, create a separate function you can call
146+
from elsewhere then store the precomputed value in the database.
147+
148+
<i class="fa fa-check-square-o"></i>
149+
Read the Celery documentation and the links in the resources section above
150+
to understand how the project works.
151+
152+
<i class="fa fa-check-square-o"></i>
153+
Install a message broker such as RabbitMQ or Redis and then add Celery to your
154+
project. Configure Celery to work with the installed message broker.
155+
156+
<i class="fa fa-check-square-o"></i>
157+
Use Celery to invoke the precomputation function on a regular basis.
158+
159+
<i class="fa fa-check-square-o"></i>
160+
Have the HTTP request function use the precomputed value instead of the
161+
slow-running code it originally relied upon.
162+
163+
164+
### What's next after task queues?
