cleaning up and polishing the task queues page

mattmakai · mattmakai · commit 0f458c92c3cc · 2014-05-10T13:11:46.000-04:00
diff --git a/feeds/all.atom.xml b/feeds/all.atom.xml
@@ -1,2 +1,2 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Matt Makai</title><link href="http://www.fullstackpython.com/" rel="alternate"></link><link href="http://www.fullstackpython.com/feeds/all.atom.xml" rel="self"></link><id>http://www.fullstackpython.com/</id><updated>2014-05-10T12:45:37Z</updated></feed>
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Matt Makai</title><link href="http://www.fullstackpython.com/" rel="alternate"></link><link href="http://www.fullstackpython.com/feeds/all.atom.xml" rel="self"></link><id>http://www.fullstackpython.com/</id><updated>2014-05-10T13:10:58Z</updated></feed>
diff --git a/source/content/pages/07-performance/0705-task-queues.markdown b/source/content/pages/07-performance/0705-task-queues.markdown
@@ -17,40 +17,46 @@ choice4text:
 
 
 # Task queues
-Task queues handle background work processed outside the usual HTTP 
-request-response cycle. 
+Task queues manage background work that must be executed outside the usual
+HTTP request-response cycle.
+
 
 ## Why are tasks queues necessary?
-Some tasks are handled asynchronously either because they are not initiated by 
-an HTTP request or because they are long-running jobs that take longer than
-a few milliseconds. 
+Tasks are handled asynchronously either because they are not initiated by 
+an HTTP request or because they are long-running jobs that would dramatically
+reduce the performance of an HTTP response.
 
 For example, a web application could poll the GitHub API every 10 minutes to
-find out what are the top 100 starred repositories. A task queue would be set
-up to automatically call the GitHub API, process the results and store them
+collect the names of the top 100 starred repositories. A task queue would
+handle invoking code to call the GitHub API, process the results and store them
 in a persistent database for later use.
 
 Another example is when a database query would take too long during the HTTP
 request-response cycle. The query could be performed in the background on a
-fixed interval with the results stored in the database. Then when the 
-HTTP request comes in it could fetch the calculated results from the database 
-instead of re-executing the query. This is a form of [caching](/caching.html)
-enabled by task queues.
+fixed interval with the results stored in the database. When an
+HTTP request comes in that needs those results a query would simply fetch the
+precalculated result instead of re-executing the longer query.
+This precalculation scenario is a form of [caching](/caching.html) enabled 
+by task queues.
 
 Other types of jobs for task queues include
 
-* calculating computationally expensive data analytics
-
-* scheduling periodic jobs such as batch processes
-
 * spreading out large numbers of independent database inserts over time 
-  instead of all at once
+  instead of inserting everything at once
 
 * aggregating collected data values on a fixed interval, such as every
   15 minutes
 
+* scheduling periodic jobs such as batch processes
+
 
 ## Task queue projects
+The defacto standard Python task queue is Celery. The other task queue 
+projects that arise tend to come from the perspective that Celery is overly
+complicated for simple use cases. My recommendation is to put the effort into
+Celery's reasonable learning curve as it is worth the time it takes to 
+understand how to use the project.
+
 * The [Celery](http://www.celeryproject.org/) distributed task queue is the
   most commonly used Python library for handling asynchronous tasks and 
   scheduling.
diff --git a/task-queues.html b/task-queues.html
@@ -45,40 +45,43 @@
         <div class="row">
     <div class="col-md-8">
       <h1>Task queues</h1>
-<p>Task queues handle background work processed outside the usual HTTP 
-request-response cycle. </p>
+<p>Task queues manage background work that must be executed outside the usual
+HTTP request-response cycle.</p>
 <h2>Why are tasks queues necessary?</h2>
-<p>Some tasks are handled asynchronously either because they are not initiated by 
-an HTTP request or because they are long-running jobs that take longer than
-a few milliseconds. </p>
+<p>Tasks are handled asynchronously either because they are not initiated by 
+an HTTP request or because they are long-running jobs that would dramatically
+reduce the performance of an HTTP response.</p>
 <p>For example, a web application could poll the GitHub API every 10 minutes to
-find out what are the top 100 starred repositories. A task queue would be set
-up to automatically call the GitHub API, process the results and store them
+collect the names of the top 100 starred repositories. A task queue would
+handle invoking code to call the GitHub API, process the results and store them
 in a persistent database for later use.</p>
 <p>Another example is when a database query would take too long during the HTTP
 request-response cycle. The query could be performed in the background on a
-fixed interval with the results stored in the database. Then when the 
-HTTP request comes in it could fetch the calculated results from the database 
-instead of re-executing the query. This is a form of <a href="/caching.html">caching</a>
-enabled by task queues.</p>
+fixed interval with the results stored in the database. When an
+HTTP request comes in that needs those results a query would simply fetch the
+precalculated result instead of re-executing the longer query.
+This precalculation scenario is a form of <a href="/caching.html">caching</a> enabled 
+by task queues.</p>
 <p>Other types of jobs for task queues include</p>
 <ul>
 <li>
-<p>calculating computationally expensive data analytics</p>
-</li>
-<li>
-<p>scheduling periodic jobs such as batch processes</p>
-</li>
-<li>
 <p>spreading out large numbers of independent database inserts over time 
-  instead of all at once</p>
+  instead of inserting everything at once</p>
 </li>
 <li>
 <p>aggregating collected data values on a fixed interval, such as every
   15 minutes</p>
 </li>
+<li>
+<p>scheduling periodic jobs such as batch processes</p>
+</li>
 </ul>
 <h2>Task queue projects</h2>
+<p>The defacto standard Python task queue is Celery. The other task queue 
+projects that arise tend to come from the perspective that Celery is overly
+complicated for simple use cases. My recommendation is to put the effort into
+Celery's reasonable learning curve as it is worth the time it takes to 
+understand how to use the project.</p>
 <ul>
 <li>
 <p>The <a href="http://www.celeryproject.org/">Celery</a> distributed task queue is the
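
A note on the pattern the revised page describes: the precalculation scenario (run the slow query in the background, let the request read the stored result) can be sketched with nothing but the standard library. This is a minimal illustration under stated assumptions, not Celery's API and not code from this commit. The names `results_store`, `slow_query`, `worker`, and `handle_request` are hypothetical, and a plain dict stands in for the persistent database.

```python
import queue
import threading

# Hypothetical in-memory stand-in for the persistent database in the page text.
results_store = {}


def slow_query():
    """Placeholder for a long-running database query (hypothetical)."""
    return sum(i * i for i in range(1000))


def worker(tasks):
    """Pull jobs off the queue and run them outside the request path."""
    while True:
        job = tasks.get()
        if job is None:  # sentinel value: shut the worker down
            tasks.task_done()
            break
        name, func = job
        results_store[name] = func()  # store the precalculated result
        tasks.task_done()


def handle_request(name):
    """Request handler: reads the stored value and never runs the query."""
    return results_store.get(name)


tasks = queue.Queue()
thread = threading.Thread(target=worker, args=(tasks,), daemon=True)
thread.start()

tasks.put(("report", slow_query))  # enqueue the background work
tasks.join()                       # block here only to keep the demo deterministic
tasks.put(None)                    # ask the worker to stop
thread.join()

print(handle_request("report"))
```

A real deployment would likely replace the thread and the dict with a task queue project such as Celery plus a message broker and a database, and would run the job on a schedule rather than once.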
