Commit 0f458c9

cleaning up and polishing the task queues page
1 parent 9c4aed6 commit 0f458c9

3 files changed: +44 -35 lines changed

feeds/all.atom.xml

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Matt Makai</title><link href="http://www.fullstackpython.com/" rel="alternate"></link><link href="http://www.fullstackpython.com/feeds/all.atom.xml" rel="self"></link><id>http://www.fullstackpython.com/</id><updated>2014-05-10T12:45:37Z</updated></feed>
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Matt Makai</title><link href="http://www.fullstackpython.com/" rel="alternate"></link><link href="http://www.fullstackpython.com/feeds/all.atom.xml" rel="self"></link><id>http://www.fullstackpython.com/</id><updated>2014-05-10T13:10:58Z</updated></feed>

source/content/pages/07-performance/0705-task-queues.markdown

Lines changed: 22 additions & 16 deletions
@@ -17,40 +17,46 @@ choice4text:
 
 
 # Task queues
-Task queues handle background work processed outside the usual HTTP
-request-response cycle.
+Task queues manage background work that must be executed outside the usual
+HTTP request-response cycle.
+
 
 ## Why are tasks queues necessary?
-Some tasks are handled asynchronously either because they are not initiated by
-an HTTP request or because they are long-running jobs that take longer than
-a few milliseconds.
+Tasks are handled asynchronously either because they are not initiated by
+an HTTP request or because they are long-running jobs that would dramatically
+reduce the performance of an HTTP response.
 
 For example, a web application could poll the GitHub API every 10 minutes to
-find out what are the top 100 starred repositories. A task queue would be set
-up to automatically call the GitHub API, process the results and store them
+collect the names of the top 100 starred repositories. A task queue would
+handle invoking code to call the GitHub API, process the results and store them
 in a persistent database for later use.
 
 Another example is when a database query would take too long during the HTTP
 request-response cycle. The query could be performed in the background on a
-fixed interval with the results stored in the database. Then when the
-HTTP request comes in it could fetch the calculated results from the database
-instead of re-executing the query. This is a form of [caching](/caching.html)
-enabled by task queues.
+fixed interval with the results stored in the database. When an
+HTTP request comes in that needs those results a query would simply fetch the
+precalculated result instead of re-executing the longer query.
+This precalculation scenario is a form of [caching](/caching.html) enabled
+by task queues.
 
 Other types of jobs for task queues include
 
-* calculating computationally expensive data analytics
-
-* scheduling periodic jobs such as batch processes
-
 * spreading out large numbers of independent database inserts over time
-instead of all at once
+instead of inserting everything at once
 
 * aggregating collected data values on a fixed interval, such as every
 15 minutes
 
+* scheduling periodic jobs such as batch processes
+
 
 ## Task queue projects
+The defacto standard Python task queue is Celery. The other task queue
+projects that arise tend to come from the perspective that Celery is overly
+complicated for simple use cases. My recommendation is to put the effort into
+Celery's reasonable learning curve as it is worth the time it takes to
+understand how to use the project.
+
 * The [Celery](http://www.celeryproject.org/) distributed task queue is the
 most commonly used Python library for handling asynchronous tasks and
 scheduling.
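
The precalculation scenario described in the revised page text can be sketched in Python. This is a minimal illustration using only the standard library, not code from this commit and not Celery's API. The names `task_queue`, `results_store`, `slow_query`, `worker` and `handle_request` are all hypothetical, and an in-memory dictionary stands in for the persistent database the page mentions.

```python
import queue
import threading

# Hypothetical in-process stand-ins: a real deployment would use a task queue
# such as Celery and a persistent database instead of these two objects.
task_queue = queue.Queue()
results_store = {}


def slow_query():
    """Stand-in for a long-running database query (hypothetical)."""
    return sum(n * n for n in range(1000))


def worker():
    """Run queued tasks outside the request-response cycle."""
    while True:
        name, func = task_queue.get()
        if func is None:  # sentinel value: stop the worker
            task_queue.task_done()
            break
        results_store[name] = func()  # store the precalculated result
        task_queue.task_done()


def handle_request(name):
    """Request handler: fetch the precalculated result, never run the query."""
    return results_store.get(name)


thread = threading.Thread(target=worker, daemon=True)
thread.start()

task_queue.put(("squares_total", slow_query))  # enqueue the background work
task_queue.join()                              # block here only for the demo
task_queue.put((None, None))                   # ask the worker to stop
thread.join()
```

After the worker has run, `handle_request("squares_total")` returns the stored value without re-executing `slow_query`, which is the caching effect the page describes. A production setup would enqueue the task on a fixed interval rather than once.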

task-queues.html

Lines changed: 21 additions & 18 deletions
@@ -45,40 +45,43 @@
 <div class="row">
 <div class="col-md-8">
 <h1>Task queues</h1>
-<p>Task queues handle background work processed outside the usual HTTP
-request-response cycle. </p>
+<p>Task queues manage background work that must be executed outside the usual
+HTTP request-response cycle.</p>
 <h2>Why are tasks queues necessary?</h2>
-<p>Some tasks are handled asynchronously either because they are not initiated by
-an HTTP request or because they are long-running jobs that take longer than
-a few milliseconds. </p>
+<p>Tasks are handled asynchronously either because they are not initiated by
+an HTTP request or because they are long-running jobs that would dramatically
+reduce the performance of an HTTP response.</p>
 <p>For example, a web application could poll the GitHub API every 10 minutes to
-find out what are the top 100 starred repositories. A task queue would be set
-up to automatically call the GitHub API, process the results and store them
+collect the names of the top 100 starred repositories. A task queue would
+handle invoking code to call the GitHub API, process the results and store them
 in a persistent database for later use.</p>
 <p>Another example is when a database query would take too long during the HTTP
 request-response cycle. The query could be performed in the background on a
-fixed interval with the results stored in the database. Then when the
-HTTP request comes in it could fetch the calculated results from the database
-instead of re-executing the query. This is a form of <a href="/caching.html">caching</a>
-enabled by task queues.</p>
+fixed interval with the results stored in the database. When an
+HTTP request comes in that needs those results a query would simply fetch the
+precalculated result instead of re-executing the longer query.
+This precalculation scenario is a form of <a href="/caching.html">caching</a> enabled
+by task queues.</p>
 <p>Other types of jobs for task queues include</p>
 <ul>
 <li>
-<p>calculating computationally expensive data analytics</p>
-</li>
-<li>
-<p>scheduling periodic jobs such as batch processes</p>
-</li>
-<li>
 <p>spreading out large numbers of independent database inserts over time
-instead of all at once</p>
+instead of inserting everything at once</p>
 </li>
 <li>
 <p>aggregating collected data values on a fixed interval, such as every
 15 minutes</p>
 </li>
+<li>
+<p>scheduling periodic jobs such as batch processes</p>
+</li>
 </ul>
 <h2>Task queue projects</h2>
+<p>The defacto standard Python task queue is Celery. The other task queue
+projects that arise tend to come from the perspective that Celery is overly
+complicated for simple use cases. My recommendation is to put the effort into
+Celery's reasonable learning curve as it is worth the time it takes to
+understand how to use the project.</p>
 <ul>
 <li>
 <p>The <a href="http://www.celeryproject.org/">Celery</a> distributed task queue is the
