Effective Python
https://effectivepython.com/
2024-09-07T14:30:00-07:00

Preorder the Third Edition
2024-09-07T14:30:00-07:00
Brett Slatkin
<p><em>Effective Python: Third Edition</em> is now available for preorder! <a target="_blank" href="https://amzn.to/46xVwYd">Follow this link to buy your copy in advance</a>. It will ship in late November 2024 once the book has finished printing and is stocked in the warehouse. Digital editions will become available when the physical book ships or sooner.</p>

Item 10: Prevent Repetition with Assignment Expressions
2020-02-02T22:00:00-08:00
Brett Slatkin
<p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>An assignment expression, also known as the <em>walrus operator</em>, is a new syntax introduced in Python 3.8 to solve a long-standing problem with the language that can cause code duplication. 
Whereas normal assignment statements are written <code>a = b</code> and pronounced &#8220;a equals b&#8221;, these assignments are written <code>a := b</code> and pronounced &#8220;a <em>walrus</em> b&#8221; (because <code>:=</code> looks like a pair of eyeballs and tusks).</p> <p>Assignment expressions are useful because they enable you to assign variables in places where assignment statements are disallowed, such as in the conditional expression of an <code>if</code> statement. An assignment expression&#8217;s value evaluates to whatever was assigned to the identifier on the left side of the walrus operator.</p> <p>For example, say that I have a basket of fresh fruit that I’m trying to manage for a juice bar. Here, I define the contents of the basket:</p> <div class="highlight"><pre><span></span><span class="n">fresh_fruit</span> <span class="o">=</span> <span class="p">{</span> <span class="s1">&#39;apple&#39;</span><span class="p">:</span> <span class="mi">10</span><span class="p">,</span> <span class="s1">&#39;banana&#39;</span><span class="p">:</span> <span class="mi">8</span><span class="p">,</span> <span class="s1">&#39;lemon&#39;</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span> <span class="p">}</span> </pre></div> <p>When a customer comes to the counter to order some lemonade, I need to make sure there is at least one lemon in the basket to squeeze. 
Here, I do this by retrieving the count of lemons and then using an <code>if</code> statement to check for a non-zero value:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">make_lemonade</span><span class="p">(</span><span class="n">count</span><span class="p">):</span> <span class="o">...</span> <span class="k">def</span> <span class="nf">out_of_stock</span><span class="p">():</span> <span class="o">...</span> <span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;lemon&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">count</span><span class="p">:</span> <span class="n">make_lemonade</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>The problem with this seemingly simple code is that it’s noisier than it needs to be. The <code>count</code> variable is used only within the first block of the <code>if</code> statement. Defining <code>count</code> above the <code>if</code> statement causes it to appear to be more important than it really is, as if all code that follows, including the <code>else</code> block, will need to access the <code>count</code> variable, when in fact that is not the case.</p> <p>This pattern of fetching a value, checking to see if it’s non-zero, and then using it is extremely common in Python. Many programmers try to work around the multiple references to <code>count</code> with a variety of tricks that hurt readability (see Item 5: “Write Helper Functions Instead of Complex Expressions” for an example). Luckily, assignment expressions were added to the language to streamline exactly this type of code. 
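</p> <p>As a minimal sketch of the property that makes this possible (not from the book; the <code>'kiwi'</code> key and the <code>value</code>, <code>missing</code>, and <code>status</code> names are assumptions for illustration), an assignment expression both binds a name and evaluates to the bound value, so it can be tested directly in a condition:</p>

```python
# Illustrative data, matching the basket defined earlier.
fresh_fruit = {'apple': 10, 'banana': 8, 'lemon': 5}

# An assignment expression binds the name on the left and also
# evaluates to the value that was bound.
value = (count := fresh_fruit.get('lemon', 0))
print(count, value)

# Because the expression has a value, it can be tested directly.
# The hypothetical 'kiwi' key is absent, so the default 0 is bound,
# and 0 is falsy.
if missing := fresh_fruit.get('kiwi', 0):
    status = 'in stock'
else:
    status = 'out of stock'
print(missing, status)
```

<p>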
Here, I rewrite this example using the walrus operator:</p> <div class="highlight"><pre><span></span><span class="k">if</span> <span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;lemon&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">):</span> <span class="n">make_lemonade</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>Though this is only one line shorter, it’s a lot more readable because it’s now clear that <code>count</code> is only relevant to the first block of the <code>if</code> statement. The assignment expression is first assigning a value to the <code>count</code> variable, and then evaluating that value in the context of the <code>if</code> statement to determine how to proceed with flow control. This two-step behavior—assign and then evaluate—is the fundamental nature of the walrus operator.</p> <p>Lemons are quite potent, so only one is needed for my lemonade recipe, which means a non-zero check is good enough. If a customer orders a cider, though, I need to make sure that I have at least four apples. 
Here, I do this by fetching the <code>count</code> from the <code>fresh_fruit</code> dictionary, and then using a comparison in the condition of the <code>if</code> statement:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">make_cider</span><span class="p">(</span><span class="n">count</span><span class="p">):</span>
    <span class="o">...</span>

<span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;apple&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="k">if</span> <span class="n">count</span> <span class="o">&gt;=</span> <span class="mi">4</span><span class="p">:</span>
    <span class="n">make_cider</span><span class="p">(</span><span class="n">count</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="n">out_of_stock</span><span class="p">()</span>
</pre></div> <p>This has the same problem as the lemonade example, where the assignment of <code>count</code> puts distracting emphasis on that variable. 
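</p> <p>One detail matters when a comparison like this is folded into an assignment expression: <code>:=</code> binds more loosely than <code>&gt;=</code>. The following is a rough sketch of that pitfall, not code from the book; the name <code>unparenthesized</code> is an assumption chosen for illustration:</p>

```python
fresh_fruit = {'apple': 10, 'banana': 8, 'lemon': 5}

# Without parentheses, the comparison is evaluated first, so the
# name is bound to the boolean result rather than to the count.
if unparenthesized := fresh_fruit.get('apple', 0) >= 4:
    pass
print(unparenthesized)

# With parentheses, the name is bound to the count, and the
# comparison is then applied to that value.
if (count := fresh_fruit.get('apple', 0)) >= 4:
    pass
print(count)
```

<p>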
Here, I improve the clarity of this code by also using the walrus operator:</p> <div class="highlight"><pre><span></span><span class="k">if</span> <span class="p">(</span><span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;apple&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> <span class="o">&gt;=</span> <span class="mi">4</span><span class="p">:</span> <span class="n">make_cider</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>This works as expected and makes the code one line shorter. It’s important to note how I needed to surround the assignment expression with parentheses to compare it with <code>4</code> in the <code>if</code> statement. In the lemonade example, no surrounding parentheses were required because the assignment expression stood on its own as a non-zero check; it wasn’t a subexpression of a larger expression. As with other expressions, you should avoid surrounding assignment expressions with parentheses when possible.</p> <p>Another common variation of this repetitive pattern occurs when I need to assign a variable in the enclosing scope depending on some condition, and then reference that variable shortly afterward in a function call. For example, say that a customer orders some banana smoothies. In order to make them, I need to have at least two bananas’ worth of slices, or else an <code>OutOfBananas</code> exception will be raised. 
Here, I implement this logic in a typical way:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">):</span> <span class="o">...</span> <span class="k">class</span> <span class="nc">OutOfBananas</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span> <span class="k">pass</span> <span class="k">def</span> <span class="nf">make_smoothies</span><span class="p">(</span><span class="n">count</span><span class="p">):</span> <span class="o">...</span> <span class="n">pieces</span> <span class="o">=</span> <span class="mi">0</span> <span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;banana&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">count</span> <span class="o">&gt;=</span> <span class="mi">2</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">try</span><span class="p">:</span> <span class="n">smoothies</span> <span class="o">=</span> <span class="n">make_smoothies</span><span class="p">(</span><span class="n">pieces</span><span class="p">)</span> <span class="k">except</span> <span class="n">OutOfBananas</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>The other common way to do this is to put the <code>pieces = 0</code> assignment in the <code>else</code> block:</p> <div class="highlight"><pre><span></span><span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span 
class="s1">&#39;banana&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">count</span> <span class="o">&gt;=</span> <span class="mi">2</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">try</span><span class="p">:</span> <span class="n">smoothies</span> <span class="o">=</span> <span class="n">make_smoothies</span><span class="p">(</span><span class="n">pieces</span><span class="p">)</span> <span class="k">except</span> <span class="n">OutOfBananas</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>This second approach can feel odd because it means that the <code>pieces</code> variable has two different locations—in each block of the <code>if</code> statement—where it can be initially defined. This split definition technically works because of Python’s scoping rules (see Item 21: “Know How Closures Interact with Variable Scope”), but it isn’t easy to read or discover, which is why many people prefer the construct above, where the <code>pieces = 0</code> assignment is first.</p> <p>The walrus operator can again be used to shorten this example by one line of code. This small change removes any emphasis on the <code>count</code> variable. 
Now, it’s clearer that <code>pieces</code> will be important beyond the <code>if</code> statement:</p> <div class="highlight"><pre><span></span><span class="n">pieces</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">if</span> <span class="p">(</span><span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;banana&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> <span class="o">&gt;=</span> <span class="mi">2</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">try</span><span class="p">:</span> <span class="n">smoothies</span> <span class="o">=</span> <span class="n">make_smoothies</span><span class="p">(</span><span class="n">pieces</span><span class="p">)</span> <span class="k">except</span> <span class="n">OutOfBananas</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>Using the walrus operator also improves the readability of splitting the definition of <code>pieces</code> across both parts of the <code>if</code> statement. 
It’s easier to trace the <code>pieces</code> variable when the <code>count</code> definition no longer precedes the <code>if</code> statement:</p> <div class="highlight"><pre><span></span><span class="k">if</span> <span class="p">(</span><span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;banana&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> <span class="o">&gt;=</span> <span class="mi">2</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">try</span><span class="p">:</span> <span class="n">smoothies</span> <span class="o">=</span> <span class="n">make_smoothies</span><span class="p">(</span><span class="n">pieces</span><span class="p">)</span> <span class="k">except</span> <span class="n">OutOfBananas</span><span class="p">:</span> <span class="n">out_of_stock</span><span class="p">()</span> </pre></div> <p>One frustration that programmers who are new to Python often have is the lack of a flexible switch/case statement. The general style for approximating this type of functionality is to have a deep nesting of multiple <code>if</code>, <code>elif</code>, and <code>else</code> statements.</p> <p>For example, imagine that I want to implement a system of precedence so that each customer automatically gets the best juice available and doesn’t have to order. 
Here, I define logic to make it so banana smoothies are served first, followed by apple cider, and then finally lemonade:</p> <div class="highlight"><pre><span></span><span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;banana&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">count</span> <span class="o">&gt;=</span> <span class="mi">2</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="n">make_smoothies</span><span class="p">(</span><span class="n">pieces</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;apple&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">count</span> <span class="o">&gt;=</span> <span class="mi">4</span><span class="p">:</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="n">make_cider</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">count</span> <span class="o">=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;lemon&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">count</span><span class="p">:</span> <span class="n">to_enjoy</span> <span 
class="o">=</span> <span class="n">make_lemonade</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="s1">&#39;Nothing&#39;</span> </pre></div> <p>Ugly constructs like this are surprisingly common in Python code. Luckily, the walrus operator provides an elegant solution that can feel nearly as versatile as dedicated syntax for switch/case statements:</p> <div class="highlight"><pre><span></span><span class="k">if</span> <span class="p">(</span><span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;banana&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> <span class="o">&gt;=</span> <span class="mi">2</span><span class="p">:</span> <span class="n">pieces</span> <span class="o">=</span> <span class="n">slice_bananas</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="n">make_smoothies</span><span class="p">(</span><span class="n">pieces</span><span class="p">)</span> <span class="k">elif</span> <span class="p">(</span><span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;apple&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> <span class="o">&gt;=</span> <span class="mi">4</span><span class="p">:</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="n">make_cider</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">elif</span> <span class="n">count</span> <span class="o">:=</span> <span class="n">fresh_fruit</span><span 
class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;lemon&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">):</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="n">make_lemonade</span><span class="p">(</span><span class="n">count</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">to_enjoy</span> <span class="o">=</span> <span class="s1">&#39;Nothing&#39;</span> </pre></div> <p>The version that uses assignment expressions is only five lines shorter than the original, but the improvement in readability is vast due to the reduction in nesting and indentation. If you ever see such ugly constructs emerge in your code, I suggest that you move them over to using the walrus operator if possible.</p> <p>Another common frustration of new Python programmers is the lack of a do/while loop construct. For example, say that I want to bottle juice as new fruit is delivered until there’s no fruit remaining. 
Here, I implement this logic with a <code>while</code> loop:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">pick_fruit</span><span class="p">():</span>
    <span class="o">...</span>

<span class="k">def</span> <span class="nf">make_juice</span><span class="p">(</span><span class="n">fruit</span><span class="p">,</span> <span class="n">count</span><span class="p">):</span>
    <span class="o">...</span>

<span class="n">bottles</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">fresh_fruit</span> <span class="o">=</span> <span class="n">pick_fruit</span><span class="p">()</span>
<span class="k">while</span> <span class="n">fresh_fruit</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">fruit</span><span class="p">,</span> <span class="n">count</span> <span class="ow">in</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">batch</span> <span class="o">=</span> <span class="n">make_juice</span><span class="p">(</span><span class="n">fruit</span><span class="p">,</span> <span class="n">count</span><span class="p">)</span>
        <span class="n">bottles</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">batch</span><span class="p">)</span>
    <span class="n">fresh_fruit</span> <span class="o">=</span> <span class="n">pick_fruit</span><span class="p">()</span>
</pre></div> <p>This is repetitive because it requires two separate <code>fresh_fruit = pick_fruit()</code> calls: one before the loop to set initial conditions, and another at the end of the loop to replenish the <code>dict</code> of delivered fruit.</p> <p>A strategy for improving code reuse in this situation is to use the <em>loop-and-a-half</em> idiom. This eliminates the redundant lines, but it also undermines the <code>while</code> loop’s contribution by making it a dumb infinite loop. 
Now, all of the flow control of the loop depends on the conditional <code>break</code> statement:</p> <div class="highlight"><pre><span></span><span class="n">bottles</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">while</span> <span class="kc">True</span><span class="p">:</span> <span class="c1"># Loop</span> <span class="n">fresh_fruit</span> <span class="o">=</span> <span class="n">pick_fruit</span><span class="p">()</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">fresh_fruit</span><span class="p">:</span> <span class="c1"># And a half</span> <span class="k">break</span> <span class="k">for</span> <span class="n">fruit</span><span class="p">,</span> <span class="n">count</span> <span class="ow">in</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">items</span><span class="p">():</span> <span class="n">batch</span> <span class="o">=</span> <span class="n">make_juice</span><span class="p">(</span><span class="n">fruit</span><span class="p">,</span> <span class="n">count</span><span class="p">)</span> <span class="n">bottles</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">batch</span><span class="p">)</span> </pre></div> <p>The walrus operator obviates the need for the loop-and-a-half idiom by allowing the <code>fresh_fruit</code> variable to be reassigned and then conditionally evaluated each time through the <code>while</code> loop. 
This solution is short and easy to read, and it should be the preferred approach in your code:</p> <div class="highlight"><pre><span></span><span class="n">bottles</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">while</span> <span class="n">fresh_fruit</span> <span class="o">:=</span> <span class="n">pick_fruit</span><span class="p">():</span> <span class="k">for</span> <span class="n">fruit</span><span class="p">,</span> <span class="n">count</span> <span class="ow">in</span> <span class="n">fresh_fruit</span><span class="o">.</span><span class="n">items</span><span class="p">():</span> <span class="n">batch</span> <span class="o">=</span> <span class="n">make_juice</span><span class="p">(</span><span class="n">fruit</span><span class="p">,</span> <span class="n">count</span><span class="p">)</span> <span class="n">bottles</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">batch</span><span class="p">)</span> </pre></div> <p>There are many other situations where assignment expressions can be used to eliminate redundancy (see Item 29: “Avoid Repeated Work in Comprehensions by Using Assignment Expressions” for another). 
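</p> <p>As a rough sketch of the comprehension case mentioned above (not the example from Item 29; the <code>bottles_for</code> helper and its four-fruit-per-bottle rule are assumptions made up for illustration), an assignment expression in the condition lets the value expression reuse a result instead of computing it twice:</p>

```python
fresh_fruit = {'apple': 10, 'banana': 8, 'lemon': 5}

def bottles_for(count):
    # Hypothetical rule for illustration: four pieces of fruit per bottle.
    return count // 4

# Without an assignment expression, bottles_for is called twice per
# item: once in the filter and once in the value expression.
repeated = {name: bottles_for(count)
            for name, count in fresh_fruit.items()
            if bottles_for(count) >= 2}

# With an assignment expression, the result is computed once in the
# condition and reused in the value expression.
single = {name: bottles
          for name, count in fresh_fruit.items()
          if (bottles := bottles_for(count)) >= 2}

print(repeated == single)
```

<p>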
In general, when you find yourself repeating the same expression or assignment multiple times within a grouping of lines, it’s time to consider using assignment expressions in order to improve readability.</p>
<h3>Things to Remember</h3>
<ul>
<li>Assignment expressions use the walrus operator (<code>:=</code>) to both assign and evaluate variable names in a single expression, thus reducing repetition.</li>
<li>When an assignment expression is a subexpression of a larger expression, it must be surrounded with parentheses.</li>
<li>Although switch/case statements and do/while loops are not available in Python, their functionality can be emulated much more clearly by using assignment expressions.</li>
</ul>

Item 51: Prefer Class Decorators Over Metaclasses for Composable Class Extensions
2019-12-18T23:00:00-08:00
Brett Slatkin
<p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>Although metaclasses allow you to customize class creation in multiple ways (see Item 48: “Validate Subclasses with <code>__init_subclass__</code>” and Item 49: “Register Class Existence with <code>__init_subclass__</code>”), they still fall short of handling every situation that may arise.</p> <p>For example, say that I want to decorate all of the methods of a class with a helper that prints arguments, return values, and exceptions raised. 
Here, I define the debugging decorator (see Item 26: “Define Function Decorators with <code>functools.wraps</code>” for background):</p> <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">functools</span> <span class="kn">import</span> <span class="n">wraps</span> <span class="k">def</span> <span class="nf">trace_func</span><span class="p">(</span><span class="n">func</span><span class="p">):</span> <span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">func</span><span class="p">,</span> <span class="s1">&#39;tracing&#39;</span><span class="p">):</span> <span class="c1"># Only decorate once</span> <span class="k">return</span> <span class="n">func</span> <span class="nd">@wraps</span><span class="p">(</span><span class="n">func</span><span class="p">)</span> <span class="k">def</span> <span class="nf">wrapper</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span> <span class="n">result</span> <span class="o">=</span> <span class="kc">None</span> <span class="k">try</span><span class="p">:</span> <span class="n">result</span> <span class="o">=</span> <span class="n">func</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span> <span class="k">return</span> <span class="n">result</span> <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> <span class="n">result</span> <span class="o">=</span> <span class="n">e</span> <span class="k">raise</span> <span class="k">finally</span><span class="p">:</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span 
class="n">func</span><span class="o">.</span><span class="vm">__name__</span><span class="si">}</span><span class="s1">(</span><span class="si">{</span><span class="n">args</span><span class="si">!r}</span><span class="s1">, </span><span class="si">{</span><span class="n">kwargs</span><span class="si">!r}</span><span class="s1">) -&gt; &#39;</span> <span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="n">result</span><span class="si">!r}</span><span class="s1">&#39;</span><span class="p">)</span> <span class="n">wrapper</span><span class="o">.</span><span class="n">tracing</span> <span class="o">=</span> <span class="kc">True</span> <span class="k">return</span> <span class="n">wrapper</span> </pre></div> <p>I can apply this decorator to various special methods in my new <code>dict</code> subclass (see Item 43: “Inherit from <code>collections.abc</code> for Custom Container Types”):</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">TraceDict</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span> <span class="nd">@trace_func</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span> <span class="nd">@trace_func</span> <span class="k">def</span> <span class="fm">__setitem__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span 
class="o">**</span><span class="n">kwargs</span><span class="p">):</span> <span class="k">return</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__setitem__</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span> <span class="nd">@trace_func</span> <span class="k">def</span> <span class="fm">__getitem__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span> <span class="k">return</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__getitem__</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span> <span class="o">...</span> </pre></div> <p>And I can verify that these methods are decorated by interacting with an instance of the class:</p> <div class="highlight"><pre><span></span><span class="n">trace_dict</span> <span class="o">=</span> <span class="n">TraceDict</span><span class="p">([(</span><span class="s1">&#39;hi&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">)])</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;there&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;hi&#39;</span><span class="p">]</span> <span class="k">try</span><span class="p">:</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;does not exist&#39;</span><span class="p">]</span> <span class="k">except</span> <span 
class="ne">KeyError</span><span class="p">:</span> <span class="k">pass</span> <span class="c1"># Expected</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">__init__(({&#39;hi&#39;: 1}, [(&#39;hi&#39;, 1)]), {}) -&gt; None</span> <span class="go">__setitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;there&#39;, 2), {}) -&gt; None</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;hi&#39;), {}) -&gt; 1</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;does not exist&#39;), {}) -&gt; KeyError(&#39;does not exist&#39;)</span> </pre></div> <p>The problem with this code is that I had to redefine all of the methods that I wanted to decorate with <code>@trace_func</code>. This is redundant boilerplate that’s hard to read and error prone. Further, if a new method is later added to the <code>dict</code> superclass, it won’t be decorated unless I also define it in <code>TraceDict</code>.</p> <p>One way to solve this problem is to use a metaclass to automatically decorate all methods of a class. 
Here, I implement this behavior by wrapping each function or method in the new type with the <code>trace_func</code> decorator:</p> <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">types</span> <span class="n">trace_types</span> <span class="o">=</span> <span class="p">(</span> <span class="n">types</span><span class="o">.</span><span class="n">MethodType</span><span class="p">,</span> <span class="n">types</span><span class="o">.</span><span class="n">FunctionType</span><span class="p">,</span> <span class="n">types</span><span class="o">.</span><span class="n">BuiltinFunctionType</span><span class="p">,</span> <span class="n">types</span><span class="o">.</span><span class="n">BuiltinMethodType</span><span class="p">,</span> <span class="n">types</span><span class="o">.</span><span class="n">MethodDescriptorType</span><span class="p">,</span> <span class="n">types</span><span class="o">.</span><span class="n">ClassMethodDescriptorType</span><span class="p">)</span> <span class="k">class</span> <span class="nc">TraceMeta</span><span class="p">(</span><span class="nb">type</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__new__</span><span class="p">(</span><span class="n">meta</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">bases</span><span class="p">,</span> <span class="n">class_dict</span><span class="p">):</span> <span class="n">klass</span> <span class="o">=</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__new__</span><span class="p">(</span><span class="n">meta</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">bases</span><span class="p">,</span> <span class="n">class_dict</span><span class="p">)</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="nb">dir</span><span class="p">(</span><span 
class="n">klass</span><span class="p">):</span> <span class="n">value</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">klass</span><span class="p">,</span> <span class="n">key</span><span class="p">)</span> <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">trace_types</span><span class="p">):</span> <span class="n">wrapped</span> <span class="o">=</span> <span class="n">trace_func</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="nb">setattr</span><span class="p">(</span><span class="n">klass</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">wrapped</span><span class="p">)</span> <span class="k">return</span> <span class="n">klass</span> </pre></div> <p>Now, I can declare my <code>dict</code> subclass by using the <code>TraceMeta</code> metaclass and verify that it works as expected:</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">TraceDict</span><span class="p">(</span><span class="nb">dict</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">TraceMeta</span><span class="p">):</span> <span class="k">pass</span> <span class="n">trace_dict</span> <span class="o">=</span> <span class="n">TraceDict</span><span class="p">([(</span><span class="s1">&#39;hi&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">)])</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;there&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;hi&#39;</span><span class="p">]</span> <span class="k">try</span><span class="p">:</span> <span class="n">trace_dict</span><span 
class="p">[</span><span class="s1">&#39;does not exist&#39;</span><span class="p">]</span> <span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span> <span class="k">pass</span> <span class="c1"># Expected</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">__new__((&lt;class &#39;__main__.TraceDict&#39;&gt;, [(&#39;hi&#39;, 1)]), {}) -&gt; {}</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;hi&#39;), {}) -&gt; 1</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;does not exist&#39;), {}) -&gt; KeyError(&#39;does not exist&#39;)</span> </pre></div> <p>This works, and it even prints out a call to <code>__new__</code> that was missing from my earlier implementation. What happens if I try to use <code>TraceMeta</code> when a superclass already has specified a metaclass?</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">OtherMeta</span><span class="p">(</span><span class="nb">type</span><span class="p">):</span> <span class="k">pass</span> <span class="k">class</span> <span class="nc">SimpleDict</span><span class="p">(</span><span class="nb">dict</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">OtherMeta</span><span class="p">):</span> <span class="k">pass</span> <span class="k">class</span> <span class="nc">TraceDict</span><span class="p">(</span><span class="n">SimpleDict</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">TraceMeta</span><span class="p">):</span> <span class="k">pass</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Traceback ...</span> <span class="go">TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases</span> 
</pre></div> <p>This fails because <code>TraceMeta</code> does not inherit from <code>OtherMeta</code>. In theory, I can use metaclass inheritance to solve this problem by having <code>OtherMeta</code> inherit from <code>TraceMeta</code>:</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">TraceMeta</span><span class="p">(</span><span class="nb">type</span><span class="p">):</span> <span class="o">...</span> <span class="k">class</span> <span class="nc">OtherMeta</span><span class="p">(</span><span class="n">TraceMeta</span><span class="p">):</span> <span class="k">pass</span> <span class="k">class</span> <span class="nc">SimpleDict</span><span class="p">(</span><span class="nb">dict</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">OtherMeta</span><span class="p">):</span> <span class="k">pass</span> <span class="k">class</span> <span class="nc">TraceDict</span><span class="p">(</span><span class="n">SimpleDict</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">TraceMeta</span><span class="p">):</span> <span class="k">pass</span> <span class="n">trace_dict</span> <span class="o">=</span> <span class="n">TraceDict</span><span class="p">([(</span><span class="s1">&#39;hi&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">)])</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;there&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;hi&#39;</span><span class="p">]</span> <span class="k">try</span><span class="p">:</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;does not exist&#39;</span><span class="p">]</span> <span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span> <span 
class="k">pass</span> <span class="c1"># Expected</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">__init_subclass__((), {}) -&gt; None</span> <span class="go">__new__((&lt;class &#39;__main__.TraceDict&#39;&gt;, [(&#39;hi&#39;, 1)]), {}) -&gt; {}</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;hi&#39;), {}) -&gt; 1</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;does not exist&#39;), {}) -&gt; KeyError(&#39;does not exist&#39;)</span> </pre></div> <p>But this won’t work if the metaclass is from a library that I can’t modify, or if I want to use multiple utility metaclasses like <code>TraceMeta</code> at the same time. The metaclass approach puts too many constraints on the class that’s being modified.</p> <p>To solve this problem, Python supports <em>class decorators</em>. Class decorators work just like function decorators: They’re applied with the <code>@</code> symbol prefixing a function before the class declaration. 
The function is expected to modify or re-create the class accordingly and then return it:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">my_class_decorator</span><span class="p">(</span><span class="n">klass</span><span class="p">):</span> <span class="n">klass</span><span class="o">.</span><span class="n">extra_param</span> <span class="o">=</span> <span class="s1">&#39;hello&#39;</span> <span class="k">return</span> <span class="n">klass</span> <span class="nd">@my_class_decorator</span> <span class="k">class</span> <span class="nc">MyClass</span><span class="p">:</span> <span class="k">pass</span> <span class="nb">print</span><span class="p">(</span><span class="n">MyClass</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">MyClass</span><span class="o">.</span><span class="n">extra_param</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">&lt;class &#39;__main__.MyClass&#39;&gt;</span> <span class="go">hello</span> </pre></div> <p>I can implement a class decorator to apply <code>trace_func</code> to all methods and functions of a class by moving the core of the <code>TraceMeta.__new__</code> method above into a stand-alone function. 
This implementation is much shorter than the metaclass version:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">trace</span><span class="p">(</span><span class="n">klass</span><span class="p">):</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="nb">dir</span><span class="p">(</span><span class="n">klass</span><span class="p">):</span> <span class="n">value</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">klass</span><span class="p">,</span> <span class="n">key</span><span class="p">)</span> <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">trace_types</span><span class="p">):</span> <span class="n">wrapped</span> <span class="o">=</span> <span class="n">trace_func</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="nb">setattr</span><span class="p">(</span><span class="n">klass</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">wrapped</span><span class="p">)</span> <span class="k">return</span> <span class="n">klass</span> </pre></div> <p>I can apply this decorator to my <code>dict</code> subclass to get the same behavior as I get by using the metaclass approach above:</p> <div class="highlight"><pre><span></span><span class="nd">@trace</span> <span class="k">class</span> <span class="nc">TraceDict</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span> <span class="k">pass</span> <span class="n">trace_dict</span> <span class="o">=</span> <span class="n">TraceDict</span><span class="p">([(</span><span class="s1">&#39;hi&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">)])</span> <span class="n">trace_dict</span><span class="p">[</span><span 
class="s1">&#39;there&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;hi&#39;</span><span class="p">]</span> <span class="k">try</span><span class="p">:</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;does not exist&#39;</span><span class="p">]</span> <span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span> <span class="k">pass</span> <span class="c1"># Expected</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">__new__((&lt;class &#39;__main__.TraceDict&#39;&gt;, [(&#39;hi&#39;, 1)]), {}) -&gt; {}</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;hi&#39;), {}) -&gt; 1</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;does not exist&#39;), {}) -&gt; KeyError(&#39;does not exist&#39;)</span> </pre></div> <p>Class decorators also work when the class being decorated already has a metaclass:</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">OtherMeta</span><span class="p">(</span><span class="nb">type</span><span class="p">):</span> <span class="k">pass</span> <span class="nd">@trace</span> <span class="k">class</span> <span class="nc">TraceDict</span><span class="p">(</span><span class="nb">dict</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">OtherMeta</span><span class="p">):</span> <span class="k">pass</span> <span class="n">trace_dict</span> <span class="o">=</span> <span class="n">TraceDict</span><span class="p">([(</span><span class="s1">&#39;hi&#39;</span><span class="p">,</span> <span class="mi">1</span><span class="p">)])</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;there&#39;</span><span class="p">]</span> <span 
class="o">=</span> <span class="mi">2</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;hi&#39;</span><span class="p">]</span> <span class="k">try</span><span class="p">:</span> <span class="n">trace_dict</span><span class="p">[</span><span class="s1">&#39;does not exist&#39;</span><span class="p">]</span> <span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span> <span class="k">pass</span> <span class="c1"># Expected</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">__new__((&lt;class &#39;__main__.TraceDict&#39;&gt;, [(&#39;hi&#39;, 1)]), {}) -&gt; {}</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;hi&#39;), {}) -&gt; 1</span> <span class="go">__getitem__(({&#39;hi&#39;: 1, &#39;there&#39;: 2}, &#39;does not exist&#39;), {}) -&gt; KeyError(&#39;does not exist&#39;)</span> </pre></div> <p>When you’re looking for composable ways to extend classes, class decorators are the best tool for the job. 
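</p> <p>As a minimal sketch of that composability, the example below stacks two class decorators on a class that also declares a metaclass. The decorators <code>add_label</code> and <code>add_describe</code> and the class <code>TaggedDict</code> are hypothetical names invented for illustration; they are not part of the tracing example above:</p>

```python
def add_label(klass):
    # Hypothetical decorator: attach a class attribute derived from the name
    klass.label = klass.__name__.lower()
    return klass


def add_describe(klass):
    # Hypothetical decorator: attach an extra method to the class
    def describe(self):
        return f'{type(self).__name__} with {len(self)} items'

    klass.describe = describe
    return klass


class OtherMeta(type):
    pass


# Decorators are applied bottom-up; neither needs to know about
# the other or about the metaclass already in use
@add_label
@add_describe
class TaggedDict(dict, metaclass=OtherMeta):
    pass


tagged = TaggedDict(hi=1)
print(TaggedDict.label)
print(tagged.describe())
print(type(TaggedDict) is OtherMeta)
```

<p>Because each decorator simply receives a class and returns it, the two can be combined in any order here, and the metaclass of the decorated class is left untouched.</p> <p>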
(See Item 73: “Know How to Use <code>heapq</code> for Priority Queues” for a useful class decorator called <code>functools.total_ordering</code>.)</p> <h3>Things to Remember</h3> <ul> <li>A class decorator is a simple function that receives a <code>class</code> instance as a parameter and returns either a new class or a modified version of the original class.</li> <li>Class decorators are useful when you want to modify every method or attribute of a class with minimal boilerplate.</li> <li>Metaclasses can’t be composed together easily, while many class decorators can be used to extend the same class without conflicts.</li> </ul>Digital Versions of the 2nd Edition are Now Available2019-11-06T12:00:00-08:002019-11-06T19:30:00-08:00Brett Slatkintag:effectivepython.com,2019-11-06:/2019/11/06/digital-second-edition-available/<p>You can immediately read the second edition of <em>Effective Python</em> as a <a target="_blank" href="https://click.linksynergy.com/link?id=YvEWtFaKGwg&offerid=145238.2820370&type=2&murl=http%3A%2F%2Fwww.informit.com%2Ftitle%2F9780134854762"><span class="caps">DRM</span>-free eBook</a>, <a target="_blank" href="https://amzn.to/2pQOqy2">on Kindle</a>, <a target="_blank" href="https://play.google.com/store/books/details/Brett_Slatkin_Effective_Python?id=9kG4DwAAQBAJ">on Google Play</a>, and on <a href="https://learning.oreilly.com/library/view/effective-python-90/9780134854717/">O&#8217;Reilly Online Learning</a>.</p>Item 74: Consider memoryview and bytearray for Zero-Copy Interactions with bytes2019-10-22T05:00:00-07:002019-10-22T05:00:00-07:00Brett Slatkintag:effectivepython.com,2019-10-22:/2019/10/22/memoryview-bytearray-zero-copy-interactions/
<p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>Though Python isn’t able to parallelize <span class="caps">CPU</span>-bound computation without extra effort (see Item 64: “Consider <code>concurrent.futures</code> for True Parallelism”), it is able to support high-throughput, parallel I/O in a variety of ways (see Item 53: “Use Threads for Blocking I/O, Avoid for Parallelism” and Item 60: “Achieve Highly Concurrent I/O with Coroutines” for details). That said, it’s surprisingly easy to use these I/O tools the wrong way and reach the conclusion that the language is too slow for even I/O-bound workloads.</p> <p>For example, say that you’re building a media server to stream television or movies over a network to users so they can watch without having to download the video data in advance. One of the key features of such a system is the ability for users to move forward or backward in the video playback so they can skip or repeat parts. 
In the client program I can implement this by requesting a chunk of data from the server corresponding to the new time index selected by the user:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">timecode_to_index</span><span class="p">(</span><span class="n">video_id</span><span class="p">,</span> <span class="n">timecode</span><span class="p">):</span> <span class="o">...</span> <span class="c1"># Returns the byte offset in the video data</span> <span class="k">def</span> <span class="nf">request_chunk</span><span class="p">(</span><span class="n">video_id</span><span class="p">,</span> <span class="n">byte_offset</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span> <span class="o">...</span> <span class="c1"># Returns size bytes of video_id&#39;s data from the offset</span> <span class="n">video_id</span> <span class="o">=</span> <span class="o">...</span> <span class="n">timecode</span> <span class="o">=</span> <span class="s1">&#39;01:09:14:28&#39;</span> <span class="n">byte_offset</span> <span class="o">=</span> <span class="n">timecode_to_index</span><span class="p">(</span><span class="n">video_id</span><span class="p">,</span> <span class="n">timecode</span><span class="p">)</span> <span class="n">size</span> <span class="o">=</span> <span class="mi">20</span> <span class="o">*</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">1024</span> <span class="n">video_data</span> <span class="o">=</span> <span class="n">request_chunk</span><span class="p">(</span><span class="n">video_id</span><span class="p">,</span> <span class="n">byte_offset</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span> </pre></div> <p>How would you implement the server-side handler that receives the <code>request_chunk</code> request and returns the corresponding <span class="caps">20MB</span> chunk of video data? 
For the sake of this example, I’m going to assume that the command and control parts of the server have already been hooked up (see Item 61: “Know How to Port Threaded I/O to <code>asyncio</code>” for what that requires). I’m going to focus on the last steps where the requested chunk is extracted from gigabytes of video data that’s cached in memory, and is then sent over a socket back to the client. Here’s what the implementation would look like:</p> <div class="highlight"><pre><span></span><span class="n">socket</span> <span class="o">=</span> <span class="o">...</span> <span class="c1"># socket connection to client</span> <span class="n">video_data</span> <span class="o">=</span> <span class="o">...</span> <span class="c1"># bytes containing data for video_id</span> <span class="n">byte_offset</span> <span class="o">=</span> <span class="o">...</span> <span class="c1"># Requested starting position</span> <span class="n">size</span> <span class="o">=</span> <span class="mi">20</span> <span class="o">*</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">1024</span> <span class="c1"># Requested chunk size</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">video_data</span><span class="p">[</span><span class="n">byte_offset</span><span class="p">:</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">]</span> <span class="n">socket</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span> </pre></div> <p>The latency and throughput of this code will come down to two factors: how much time it takes to slice the <span class="caps">20MB</span> video <code>chunk</code> from <code>video_data</code>, and how much time the socket takes to transmit that data to the client. 
If I assume that the socket is infinitely fast, I can run a micro-benchmark using the <code>timeit</code> built-in module to understand the performance characteristics of slicing <code>bytes</code> instances this way to create chunks (see Item 11: “Know How to Slice Sequences” for background).</p> <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">timeit</span> <span class="k">def</span> <span class="nf">run_test</span><span class="p">():</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">video_data</span><span class="p">[</span><span class="n">byte_offset</span><span class="p">:</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">]</span> <span class="c1"># Call socket.send(chunk), but ignoring for benchmark</span> <span class="n">result</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="n">stmt</span><span class="o">=</span><span class="s1">&#39;run_test()&#39;</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span> <span class="o">/</span> <span class="mi">100</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="n">result</span><span class="si">:</span><span class="s1">0.9f</span><span class="si">}</span><span class="s1"> seconds&#39;</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">0.004925669 seconds</span> </pre></div> <p>It took roughly 5 milliseconds to extract the <span class="caps">20MB</span> slice of data to transmit to the client. 
That means the overall throughput of my server is limited to a theoretical maximum of <span class="caps">20MB</span> / 5 milliseconds = roughly <span class="caps">4GB</span> / second, since that’s the fastest I can extract the video data from memory. My server will also be limited to 1 <span class="caps">CPU</span>-second / 5 milliseconds = 200 clients requesting new chunks in parallel, which is tiny compared to the tens of thousands of simultaneous connections that tools like the <code>asyncio</code> built-in module can support. The problem is that slicing a <code>bytes</code> instance causes the underlying data to be copied, which takes <span class="caps">CPU</span> time.</p> <p>A better way to write this code is to use Python’s built-in <code>memoryview</code> type, which exposes CPython’s high-performance <em>buffer protocol</em> to programs. The buffer protocol is a low-level C <span class="caps">API</span> that allows the Python runtime and C extensions to access the underlying data buffers that are behind objects like <code>bytes</code> instances. The best part about <code>memoryview</code> instances is that slicing them results in another <code>memoryview</code> instance without copying the underlying data. 
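</p> <p>To make the no-copy claim concrete, here is a minimal sketch. It assumes a mutable <code>bytearray</code> as the underlying buffer (an assumption made only so that a later write is observable), and the names <code>buffer</code>, <code>copied</code>, and <code>view</code> are illustrative:</p>

```python
# Assumption: a mutable bytearray stands in for the underlying data
buffer = bytearray(b'shave and a haircut, two bits')

copied = bytes(buffer[12:19])     # Slicing the buffer copies seven bytes out
view = memoryview(buffer)[12:19]  # Slicing a view shares the same memory

buffer[12] = ord('H')             # Overwrite one byte of the original in place

print(copied)             # The copy still holds the old contents
print(view.tobytes())     # The view reflects the write
print(view.obj is buffer)
```

<p>The copied slice keeps its old contents, while the <code>memoryview</code> slice sees the change because it still refers to the original buffer.</p> <p>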
Here, I create a <code>memoryview</code> wrapping a <code>bytes</code> instance and inspect a slice of it:</p> <div class="highlight"><pre><span></span><span class="n">data</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">&#39;shave and a haircut, two bits&#39;</span> <span class="n">view</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">view</span><span class="p">[</span><span class="mi">12</span><span class="p">:</span><span class="mi">19</span><span class="p">]</span> <span class="nb">print</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Size: &#39;</span><span class="p">,</span> <span class="n">chunk</span><span class="o">.</span><span class="n">nbytes</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Data in view: &#39;</span><span class="p">,</span> <span class="n">chunk</span><span class="o">.</span><span class="n">tobytes</span><span class="p">())</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Underlying data:&#39;</span><span class="p">,</span> <span class="n">chunk</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">&lt;memory at 0x105d6ba00&gt;</span> <span class="go">Size: 7</span> <span class="go">Data in view: b&#39;haircut&#39;</span> <span class="go">Underlying data: b&#39;shave and a haircut, two bits&#39;</span> </pre></div> <p>By enabling <em>zero-copy</em> operations, <code>memoryview</code> can provide enormous speedups for code that needs to quickly process large amounts of memory, such as numerical C-extensions like NumPy and 
I/O bound programs like this one. Here, I replace the simple <code>bytes</code> slicing above with <code>memoryview</code> slicing instead, and repeat the same micro-benchmark:</p> <div class="highlight"><pre><span></span><span class="n">video_view</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">video_data</span><span class="p">)</span> <span class="k">def</span> <span class="nf">run_test</span><span class="p">():</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">video_view</span><span class="p">[</span><span class="n">byte_offset</span><span class="p">:</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">]</span> <span class="c1"># Call socket.send(chunk), but ignoring for benchmark</span> <span class="n">result</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="n">stmt</span><span class="o">=</span><span class="s1">&#39;run_test()&#39;</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span> <span class="o">/</span> <span class="mi">100</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="n">result</span><span class="si">:</span><span class="s1">0.9f</span><span class="si">}</span><span class="s1"> seconds&#39;</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">0.000000250 seconds</span> </pre></div> <p>The result is 250 nanoseconds. 
Now the theoretical maximum throughput of my server is <span class="caps">20MB</span> / 250 nanoseconds = roughly 80 <span class="caps">TB</span>/second. For parallel clients, I can theoretically support up to 1 <span class="caps">CPU</span>-second / 250 nanoseconds = 4 million. That’s more like it! This means that now my program is entirely bound by the underlying performance of the socket connection to the client, not by <span class="caps">CPU</span> constraints.</p> <p>Now, imagine that the data must flow in the other direction, where some clients are sending live video streams to the server in order to broadcast them to other users. In order to do this, I need to store the latest video data from the user in a cache that other clients can read from. Here’s what the implementation of reading <span class="caps">1MB</span> of new data from the incoming client would look like:</p> <div class="highlight"><pre><span></span><span class="n">socket</span> <span class="o">=</span> <span class="o">...</span> <span class="c1"># socket connection to the client</span> <span class="n">video_cache</span> <span class="o">=</span> <span class="o">...</span> <span class="c1"># Cache of incoming video stream</span> <span class="n">byte_offset</span> <span class="o">=</span> <span class="o">...</span> <span class="c1"># Incoming buffer position</span> <span class="n">size</span> <span class="o">=</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">1024</span> <span class="c1"># Incoming chunk size</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="n">recv</span><span class="p">(</span><span class="n">size</span><span class="p">)</span> <span class="n">video_view</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">video_cache</span><span class="p">)</span> <span class="n">before</span> <span class="o">=</span> <span
class="n">video_view</span><span class="p">[:</span><span class="n">byte_offset</span><span class="p">]</span> <span class="n">after</span> <span class="o">=</span> <span class="n">video_view</span><span class="p">[</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">:]</span> <span class="n">new_cache</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">&#39;&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="n">before</span><span class="p">,</span> <span class="n">chunk</span><span class="p">,</span> <span class="n">after</span><span class="p">])</span> </pre></div> <p>The <code>socket.recv</code> method will return a <code>bytes</code> instance. I can splice the new data with the existing cache at the current <code>byte_offset</code> by using simple slicing operations and the <code>bytes.join</code> method. To understand the performance of this, I can run another micro-benchmark. 
I’m using a dummy socket so the performance test is only for the memory operations, not the I/O interaction.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">run_test</span><span class="p">():</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="n">recv</span><span class="p">(</span><span class="n">size</span><span class="p">)</span> <span class="n">before</span> <span class="o">=</span> <span class="n">video_view</span><span class="p">[:</span><span class="n">byte_offset</span><span class="p">]</span> <span class="n">after</span> <span class="o">=</span> <span class="n">video_view</span><span class="p">[</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">:]</span> <span class="n">new_cache</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">&#39;&#39;</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="n">before</span><span class="p">,</span> <span class="n">chunk</span><span class="p">,</span> <span class="n">after</span><span class="p">])</span> <span class="n">result</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="n">stmt</span><span class="o">=</span><span class="s1">&#39;run_test()&#39;</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span> <span class="o">/</span> <span class="mi">100</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="n">result</span><span class="si">:</span><span class="s1">0.9f</span><span class="si">}</span><span 
class="s1"> seconds&#39;</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">0.033520550 seconds</span> </pre></div> <p>It takes 33 milliseconds to receive <span class="caps">1MB</span> and update the video cache. That means my maximum receive throughput is <span class="caps">1MB</span> / 33 milliseconds = <span class="caps">31MB</span> / second, and I’m limited to <span class="caps">31MB</span> / <span class="caps">1MB</span> = 31 simultaneous clients streaming in video data this way. This doesn’t scale.</p> <p>A better way to write this code is to use Python’s built-in <code>bytearray</code> type in conjunction with <code>memoryview</code>. One limitation with <code>bytes</code> instances is that they are read-only, and don’t allow for individual indexes to be updated.</p> <div class="highlight"><pre><span></span><span class="n">my_bytes</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">&#39;hello&#39;</span> <span class="n">my_bytes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">&#39;</span><span class="se">\x79</span><span class="s1">&#39;</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Traceback ...</span> <span class="go">TypeError: &#39;bytes&#39; object does not support item assignment</span> </pre></div> <p>The <code>bytearray</code> type is like a mutable version of <code>bytes</code> that allows for arbitrary positions to be overwritten. 
<code>bytearray</code> uses integers for its values instead of <code>bytes</code>.</p> <div class="highlight"><pre><span></span><span class="n">my_array</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="sa">b</span><span class="s1">&#39;hello&#39;</span><span class="p">)</span> <span class="n">my_array</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x79</span> <span class="nb">print</span><span class="p">(</span><span class="n">my_array</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">bytearray(b&#39;yello&#39;)</span> </pre></div> <p>A <code>memoryview</code> can also be used to wrap a <code>bytearray</code>. When you slice such a <code>memoryview</code>, the resulting object can be used to assign data to a particular portion of the underlying buffer. This avoids the copying costs from above that were required to splice the <code>bytes</code> instances back together after data was received from the client.</p> <div class="highlight"><pre><span></span><span class="n">my_array</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="sa">b</span><span class="s1">&#39;row, row, row your boat&#39;</span><span class="p">)</span> <span class="n">my_view</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">my_array</span><span class="p">)</span> <span class="n">write_view</span> <span class="o">=</span> <span class="n">my_view</span><span class="p">[</span><span class="mi">3</span><span class="p">:</span><span class="mi">13</span><span class="p">]</span> <span class="n">write_view</span><span class="p">[:]</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">&#39;-10 bytes-&#39;</span> <span class="nb">print</span><span 
class="p">(</span><span class="n">my_array</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">bytearray(b&#39;row-10 bytes- your boat&#39;)</span> </pre></div> <p>There are many libraries in Python that use the buffer protocol to receive or read data quickly, such as <code>socket.recv_into</code> and <code>RawIOBase.readinto</code>. The benefit of these methods is that they avoid allocating memory and creating another copy of the data—what’s received goes straight into an existing buffer. Here, I use <code>socket.recv_into</code> along with a <code>memoryview</code> slice to receive data into an underlying <code>bytearray</code> without the need for any splicing:</p> <div class="highlight"><pre><span></span><span class="n">video_array</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">video_cache</span><span class="p">)</span> <span class="n">write_view</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">video_array</span><span class="p">)</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">write_view</span><span class="p">[</span><span class="n">byte_offset</span><span class="p">:</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">]</span> <span class="n">socket</span><span class="o">.</span><span class="n">recv_into</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span> </pre></div> <p>I can run another micro-benchmark to compare the performance of this approach to the earlier example that used <code>socket.recv</code>.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">run_test</span><span class="p">():</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">write_view</span><span 
class="p">[</span><span class="n">byte_offset</span><span class="p">:</span><span class="n">byte_offset</span> <span class="o">+</span> <span class="n">size</span><span class="p">]</span> <span class="n">socket</span><span class="o">.</span><span class="n">recv_into</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span> <span class="n">result</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="n">stmt</span><span class="o">=</span><span class="s1">&#39;run_test()&#39;</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span> <span class="o">/</span> <span class="mi">100</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="n">result</span><span class="si">:</span><span class="s1">0.9f</span><span class="si">}</span><span class="s1"> seconds&#39;</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">0.000033925 seconds</span> </pre></div> <p>It took 33 microseconds to receive a <span class="caps">1MB</span> video transmission. That means my server can support <span class="caps">1MB</span> / 33 microseconds = <span class="caps">31GB</span> / second of max throughput, and <span class="caps">31GB</span> / <span class="caps">1MB</span> = 31,000 parallel streaming clients. 
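</p> <p>To reproduce the comparison without a network connection, the two receive paths can be sketched side by side. This is only a minimal sketch under stated assumptions: <code>FakeSocket</code> is a hypothetical stand-in for the dummy socket mentioned earlier (the original harness isn&#8217;t shown), and the 8MB cache size and the write offset are arbitrary choices made to keep the run short. Timings depend on the machine, so none are quoted here.</p>

```python
import timeit


class FakeSocket:
    """Hypothetical stand-in for a real socket: it only moves memory."""

    def __init__(self, payload):
        self._payload = payload
        self._view = memoryview(payload)

    def recv(self, size):
        # Mirrors socket.recv: hands back a bytes object.
        return self._payload[:size]

    def recv_into(self, buffer):
        # Mirrors socket.recv_into: fills the caller's buffer and
        # returns the number of bytes written.
        count = min(len(buffer), len(self._view))
        buffer[:count] = self._view[:count]
        return count


size = 1024 * 1024        # Incoming chunk size (1MB), as in the text
cache_size = 8 * size     # Assumed cache size for this sketch
byte_offset = 2 * size    # Assumed write position inside the cache

fake_socket = FakeSocket(b'x' * size)

# Approach 1: immutable cache, spliced back together with bytes.join
video_cache = bytes(cache_size)
video_view = memoryview(video_cache)


def splice_with_join():
    chunk = fake_socket.recv(size)
    before = video_view[:byte_offset]
    after = video_view[byte_offset + size:]
    return b''.join([before, chunk, after])  # Copies the whole cache


# Approach 2: mutable cache, written in place through a memoryview
video_array = bytearray(cache_size)
write_view = memoryview(video_array)


def receive_in_place():
    chunk = write_view[byte_offset:byte_offset + size]
    return fake_socket.recv_into(chunk)  # Copies only the new chunk


new_cache = splice_with_join()
received = receive_in_place()
same_result = bytes(video_array) == new_cache  # Both caches hold the same data

for func in (splice_with_join, receive_in_place):
    seconds = timeit.timeit(func, number=10) / 10
    print(f'{func.__name__}: {seconds:0.9f} seconds')
```

<p>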
That’s the type of scalability that I’m looking for!</p> <h3>Things to Remember</h3> <ul> <li>The <code>memoryview</code> built-in type provides a zero-copy interface for reading and writing slices of objects that support Python’s high-performance buffer protocol.</li> <li>The <code>bytearray</code> built-in type provides a mutable <code>bytes</code>-like type that can be used for zero-copy data reads with functions like <code>socket.recv_into</code>.</li> <li>A <code>memoryview</code> can wrap a <code>bytearray</code>, allowing for received data to be spliced into an arbitrary buffer location without copying costs.</li> </ul>Preorder the Second Edition2019-10-21T23:00:00-07:002019-10-21T23:00:00-07:00Brett Slatkintag:effectivepython.com,2019-10-21:/2019/10/21/preorder-second-edition/<p><em>Effective Python: Second Edition</em> is now available for preorder. <a target="_blank" href="https://amzn.to/30fJtiG">Follow this link to buy your copy in advance</a>. It will ship in mid-November (2019) once the book has finished printing and is stocked in the warehouse. <a target="_blank" href="https://click.linksynergy.com/link?id=YvEWtFaKGwg&offerid=145238.2820370&type=2&murl=http%3A%2F%2Fwww.informit.com%2Ftitle%2F9780134854762">Digital editions will become available</a> when the physical book ships or sooner.</p>Секреты Python2019-08-07T09:00:00-07:002019-08-07T09:00:00-07:00Brett Slatkintag:effectivepython.com,2019-08-07:/2019/08/07/секреты-python-ru/<p><a href="https://www.ozon.ru/context/detail/id/136880759/"><img class="learn-more-photo" alt="Секреты Python" src="https://effectivepython.com/images/cover_ru.jpg"></a></p> <p>The publishing house Вильямс has translated and released a Russian version of <em>Effective Python</em>. 
You can buy the book <a href="https://www.ozon.ru/context/detail/id/136880759/">online from Ozon</a> and learn more on <a href="http://www.williamspublishing.com/Books/978-5-8459-2078-2.html">the publisher website</a>.</p>Python Eficaz2016-08-06T18:45:00-07:002016-08-06T18:45:00-07:00Brett Slatkintag:effectivepython.com,2016-08-06:/2016/08/06/python-eficaz-português/<p><a href="http://novatec.com.br/livros/python-eficaz/"><img class="learn-more-photo" alt="Python Eficaz" src="https://effectivepython.com/images/cover_pt_br.jpg"></a></p> <p>The publishing house Novatec Editora has translated and released a Portuguese version of <em>Effective Python</em>. You can buy the book <a href="http://novatec.com.br/livros/python-eficaz/">directly from the publisher</a>.</p>Effective Python 简体中文2016-05-07T09:00:00-07:002016-05-07T09:00:00-07:00Brett Slatkintag:effectivepython.com,2016-05-07:/2016/05/07/effective-python-hans/<p><a href="https://www.amazon.cn/Effective-Python-%E7%BC%96%E5%86%99%E9%AB%98%E8%B4%A8%E9%87%8FPython%E4%BB%A3%E7%A0%81%E7%9A%8459%E4%B8%AA%E6%9C%89%E6%95%88%E6%96%B9%E6%B3%95-%E5%B8%83%E9%9B%B7%E7%89%B9%C2%B7%E6%96%AF%E6%8B%89%E7%89%B9%E9%87%91/dp/B01ASI36QS"><img class="learn-more-photo" alt="Effective Python 简体中文" src="https://effectivepython.com/images/cover_zh_hans.jpg"></a></p> <p>The publishing house 机械工业出版社 (China Machine Press) has translated and released a Chinese (Simplified) version of <em>Effective Python</em>. 
You can buy the book <a href="http://www.cmpbook.com/stackroom.php?id=41800">directly from the publisher</a> or <a href="https://www.amazon.cn/Effective-Python-%E7%BC%96%E5%86%99%E9%AB%98%E8%B4%A8%E9%87%8FPython%E4%BB%A3%E7%A0%81%E7%9A%8459%E4%B8%AA%E6%9C%89%E6%95%88%E6%96%B9%E6%B3%95-%E5%B8%83%E9%9B%B7%E7%89%B9%C2%B7%E6%96%AF%E6%8B%89%E7%89%B9%E9%87%91/dp/B01ASI36QS">get it on Amazon.cn</a>.</p>Effective Python 파이썬 코딩의 기술2016-04-03T17:00:00-07:002016-04-03T17:00:00-07:00Brett Slatkintag:effectivepython.com,2016-04-03:/2016/04/03/effective-python-kr/<p><a href="https://www.gilbut.co.kr/book/view?bookcode=BN001430&keyword=python&collection=GB_BOOK"><img class="learn-more-photo" alt="Effective Python 파이썬 코딩의 기술" src="https://effectivepython.com/images/cover_kr.jpg"></a></p> <p>The publishing house Gilbut Inc. has translated and released a Korean version of <em>Effective Python</em>. <a href="https://www.gilbut.co.kr/book/view?bookcode=BN001430&amp;keyword=python&amp;collection=GB_BOOK">The publisher&#8217;s website</a> links to many different retailers online where you can buy the book.</p>Effective Python 日本語2016-01-23T09:00:00-08:002016-01-23T09:00:00-08:00Brett Slatkintag:effectivepython.com,2016-01-23:/2016/01/23/effective-python-jp/<p><a href="http://www.amazon.co.jp/exec/obidos/ASIN/4873117569"><img class="learn-more-photo" alt="Effective Python 日本語" src="https://effectivepython.com/images/cover_jp.jpg"></a></p> <p>The publishing house O&#8217;Reilly Japan has translated and released a Japanese version of <em>Effective Python</em>. 
You can buy the book <a href="http://www.oreilly.co.jp/books/9784873117560/">directly from the publisher</a> or <a href="http://www.amazon.co.jp/exec/obidos/ASIN/4873117569">get it on Amazon.jp</a>.</p>Efektywny Python2015-11-20T09:00:00-08:002015-11-15T09:00:00-08:00Brett Slatkintag:effectivepython.com,2015-11-20:/2015/11/20/efektywny-python-pl/<p><a href="http://helion.pl/ksiazki/efektywny-python-59-sposobow-na-lepszy-kod-brett-slatkin,efepyt.htm"><img class="learn-more-photo" alt="Efektywny Python" src="https://effectivepython.com/images/cover_pl.jpg"></a></p> <p>The publishing house Helion has translated and released a Polish version of <em>Effective Python</em>. You can buy the book <a href="http://helion.pl/ksiazki/efektywny-python-59-sposobow-na-lepszy-kod-brett-slatkin,efepyt.htm">directly from the publisher</a>.</p>Effektiv Python ​programmieren2015-11-15T21:40:00-08:002015-11-15T21:40:00-08:00Brett Slatkintag:effectivepython.com,2015-11-15:/2015/11/15/effektiv-python-deutsch/<p><a href="http://www.amazon.de/Effektiv-Python-programmieren-mitp-Professional/dp/3958451810"><img class="learn-more-photo" alt="Effektiv Python programmieren" src="https://effectivepython.com/images/cover_de.jpg"></a></p> <p>The publishing house mitp-Verlag has translated and released a German version of <em>Effective Python</em>. You can buy the book <a href="http://www.mitp.de/IT-Web/Programmierung/Effektiv-Python-programmieren.html">directly from the publisher</a> or <a href="http://www.amazon.de/Effektiv-Python-programmieren-mitp-Professional/dp/3958451810">get it on Amazon.de</a> (including Kindle edition).</p>Talk Python To Me Podcast2015-09-09T09:00:00-07:002015-09-09T09:00:00-07:00Brett Slatkintag:effectivepython.com,2015-09-09:/2015/09/09/talk-python-to-me-podcast/<p>I was invited on to the <a href="https://talkpython.fm/episodes/show/25/effective-python">Talk Python To Me Podcast</a> to talk about <em>Effective Python</em>. 
You can <a href="https://talkpython.fm/episodes/transcript/25/effective-python">read the full transcript here</a> or listen to the audio embedded below. Thanks to Michael Kennedy for being such a welcoming host.</p> <p><audio controls> <source src="https://talkpython.fm/episodes/download/25/effective-python.mp3" type="audio/mpeg"> </audio></p>Effective Python 中文版2015-08-27T09:00:00-07:002015-08-27T09:00:00-07:00Brett Slatkintag:effectivepython.com,2015-08-27:/2015/08/27/effective-python-cn/<p><a href="http://books.gotop.com.tw/v_ICL043700"><img class="learn-more-photo" alt="Effective Python 中文版" src="https://effectivepython.com/images/cover_zh_hant.jpg"></a></p> <p>The publishing house 碁峰 (Acer Peak) has translated and released a Chinese (Traditional) version of <em>Effective Python</em>. You can buy the book <a href="http://books.gotop.com.tw/v_ICL043700">directly from the publisher</a> or <a href="https://play.google.com/store/books/details/Brett_Slatkin_Effective_Python_%E4%B8%AD%E6%96%87%E7%89%88_%E5%AF%AB%E5%87%BA%E8%89%AF%E5%A5%BD_Python_%E7%A8%8B%E5%BC%8F%E7%9A%84?id=V2m7CgAAQBAJ">get a digital edition on Google Play</a>.</p>Live Lessons Video2015-08-04T09:00:00-07:002015-08-04T09:00:00-07:00Brett Slatkintag:effectivepython.com,2015-08-04:/2015/08/04/live-lessons-video/<p><a href="http://www.informit.com/store/effective-python-livelessons-video-training-downloadable-9780134175164"><img alt="Effective Python Live Lessons" class="learn-more-photo" src="https://effectivepython.com/images/live_lessons.jpg"></a></p> <p>I worked with Addison-Wesley to produce a video version of the book <em>Effective Python</em>. You can view samples and buy the video <a href="http://www.informit.com/store/effective-python-livelessons-video-training-downloadable-9780134175164">on the publisher&#8217;s website</a>.</p> <p>It includes 5 hours of video, covering 32 items from the book in six lessons. 
The content is primarily me using a source code editor to write Python programs that demonstrate the items from the book.</p>Talk at PyCon Montréal2015-04-10T10:50:00-07:002015-04-11T10:15:00-07:00Brett Slatkintag:effectivepython.com,2015-04-10:/2015/04/10/pycon-montreal/ <p>I gave a talk at PyCon Montréal entitled &#8220;How to Be More Effective with Functions&#8221;.</p> <ul> <li>The slides are embedded below (click the gear to download as a <span class="caps">PDF</span>).</li> <li><a href="https://github.com/bslatkin/pycon2015">The code from the examples is here on GitHub</a>.</li> <li>The <a href="https://www.youtube.com/watch?v=WjJUPxKB164">video is available here</a> (also embedded below).</li> </ul> <p><br></p> <iframe src="https://docs.google.com/presentation/d/14GbOzGgZacdw7zQN6yt-V0MVO6upL5Gd9VJ9Il6DsHQ/embed?start=false&loop=false&delayms=10000" frameborder="0" width="480" height="389" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe> <p><br></p> <iframe width="640" height="360" src="https://www.youtube.com/embed/WjJUPxKB164?rel=0" frameborder="0" allowfullscreen></iframe>Item 40: Consider Coroutines to Run Many Functions Concurrently2015-03-10T22:45:00-07:002015-03-10T22:45:00-07:00Brett Slatkintag:effectivepython.com,2015-03-10:/2015/03/10/consider-coroutines-to-run-many-functions-concurrently/ <p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>Threads give Python programmers a way to run multiple functions seemingly at the same time. But there are three big problems with threads:</p> <ul> <li> <p>They require special tools to coordinate with each other safely. 
This makes code that uses threads harder to reason about than procedural, single-threaded code. This complexity makes threaded code more difficult to extend and maintain over time.</p> </li> <li> <p>Threads require a lot of memory, about <span class="caps">8MB</span> per executing thread. On many computers, that amount of memory doesn&#8217;t matter for a dozen threads or so. But what if you want your program to run tens of thousands of functions &#8220;simultaneously&#8221;? These functions may correspond to user requests to a server, pixels on a screen, particles in a simulation, etc. Running a thread per unique activity just won&#8217;t work.</p> </li> <li> <p>Threads are costly to start. If you want to constantly be creating new concurrent functions and finishing them, the overhead of using threads becomes large and slows everything down.</p> </li> </ul> <p>Python can work around all these issues with <em>coroutines</em>. Coroutines let you have many seemingly simultaneous functions in your Python programs. They&#8217;re implemented as an extension to generators. The cost of starting a generator coroutine is a function call. Once active, they each use less than <span class="caps">1KB</span> of memory until they&#8217;re exhausted.</p> <p>Coroutines work by enabling the code consuming a generator to <code>send</code> a value back into the generator function after each <code>yield</code> expression. 
The generator function receives the value passed to the <code>send</code> function as the result of the corresponding <code>yield</code> expression.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">my_coroutine</span><span class="p">():</span> <span class="k">while</span> <span class="kc">True</span><span class="p">:</span> <span class="n">received</span> <span class="o">=</span> <span class="k">yield</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Received:&#39;</span><span class="p">,</span> <span class="n">received</span><span class="p">)</span> <span class="n">it</span> <span class="o">=</span> <span class="n">my_coroutine</span><span class="p">()</span> <span class="nb">next</span><span class="p">(</span><span class="n">it</span><span class="p">)</span> <span class="c1"># Prime the coroutine</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="s1">&#39;First&#39;</span><span class="p">)</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="s1">&#39;Second&#39;</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Received: First</span> <span class="go">Received: Second</span> </pre></div> <p>The initial call to <code>next</code> is required to prepare the generator for receiving the first <code>send</code> by advancing it to the first <code>yield</code> expression. Together, <code>yield</code> and <code>send</code> provide generators with a standard way to vary their next yielded value in response to external input.</p> <p>For example, say you want to implement a generator coroutine that yields the minimum value it&#8217;s been sent so far. Here the bare <code>yield</code> prepares the coroutine with the initial minimum value sent in from the outside. 
Then the generator repeatedly yields the new minimum in exchange for the next value to consider.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">minimize</span><span class="p">():</span> <span class="n">current</span> <span class="o">=</span> <span class="k">yield</span> <span class="k">while</span> <span class="kc">True</span><span class="p">:</span> <span class="n">value</span> <span class="o">=</span> <span class="k">yield</span> <span class="n">current</span> <span class="n">current</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">current</span><span class="p">)</span> </pre></div> <p>The code consuming the generator can run one step at a time and will output the minimum value seen after each input.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="n">minimize</span><span class="p">()</span> <span class="nb">next</span><span class="p">(</span><span class="n">it</span><span class="p">)</span> <span class="c1"># Prime the generator</span> <span class="nb">print</span><span class="p">(</span><span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="mi">10</span><span class="p">))</span> <span class="nb">print</span><span class="p">(</span><span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="mi">4</span><span class="p">))</span> <span class="nb">print</span><span class="p">(</span><span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="mi">22</span><span class="p">))</span> <span class="nb">print</span><span class="p">(</span><span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">))</span> 
</pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">10</span> <span class="go">4</span> <span class="go">4</span> <span class="go">-1</span> </pre></div> <p>The generator function will seemingly run forever, making forward progress with each new call to <code>send</code>. Like threads, coroutines are independent functions that can consume inputs from their environment and produce resulting outputs. The difference is that coroutines pause at each <code>yield</code> expression in the generator function and resume after each call to <code>send</code> from the outside. This is the magical mechanism of coroutines.</p> <p>This behavior allows the code consuming the generator to take action after each <code>yield</code> expression in the coroutine. The consuming code can use the generator&#8217;s output values to call other functions and update data structures. Most importantly, it can advance other generator functions until their next <code>yield</code> expressions. By advancing many separate generators in lockstep, they will all seem to be running simultaneously, mimicking the concurrent behavior of Python threads.</p> <h3>The Game of Life</h3> <p>Let me demonstrate the simultaneous behavior of coroutines with an example. Say you want to use coroutines to implement <a href="http://en.wikipedia.org/wiki/Conway's_Game_of_Life">Conway&#8217;s Game of Life</a>. The rules of the game are simple. You have a two-dimensional grid of an arbitrary size. Each cell in the grid can either be alive or empty.</p> <div class="highlight"><pre><span></span><span class="n">ALIVE</span> <span class="o">=</span> <span class="s1">&#39;*&#39;</span> <span class="n">EMPTY</span> <span class="o">=</span> <span class="s1">&#39;-&#39;</span> </pre></div> <p>The game progresses one tick of the clock at a time. At each tick, each cell counts how many of its neighboring eight cells are still alive. 
Based on its neighbor count, each cell decides if it will keep living, die, or regenerate. Here&#8217;s an example of a 5x5 Game of Life grid after four generations with time going to the right. I&#8217;ll explain the specific rules further below.</p> <div class="highlight"><pre><span></span>  0   |   1   |   2   |   3   |   4
----- | ----- | ----- | ----- | -----
-*--- | --*-- | --**- | --*-- | -----
--**- | --**- | -*--- | -*--- | -**--
---*- | --**- | --**- | --*-- | -----
----- | ----- | ----- | ----- | -----
</pre></div> <p>I can model this game by representing each cell as a generator coroutine running in lockstep with all the others.</p> <p>To implement this, first I need a way to retrieve the status of neighboring cells. I can do this with a coroutine named <code>count_neighbors</code> that works by yielding <code>Query</code> objects. The <code>Query</code> class I define myself. Its purpose is to provide the generator coroutine with a way to ask its surrounding environment for information.</p> <div class="highlight"><pre><span></span><span class="n">Query</span> <span class="o">=</span> <span class="n">namedtuple</span><span class="p">(</span><span class="s1">&#39;Query&#39;</span><span class="p">,</span> <span class="p">(</span><span class="s1">&#39;y&#39;</span><span class="p">,</span> <span class="s1">&#39;x&#39;</span><span class="p">))</span> </pre></div> <p>The coroutine yields a <code>Query</code> for each neighbor. The result of each <code>yield</code> expression will be the value <code>ALIVE</code> or <code>EMPTY</code>. That&#8217;s the interface contract I&#8217;ve defined between the coroutine and its consuming code. 
The <code>count_neighbors</code> generator sees the neighbors&#8217; states and returns the count of living neighbors.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">count_neighbors</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span> <span class="n">n_</span> <span class="o">=</span> <span class="k">yield</span> <span class="n">Query</span><span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># North</span> <span class="n">ne</span> <span class="o">=</span> <span class="k">yield</span> <span class="n">Query</span><span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="c1"># Northeast</span> <span class="c1"># Define e_, se, s_, sw, w_, nw ...</span> <span class="c1"># ...</span> <span class="n">neighbor_states</span> <span class="o">=</span> <span class="p">[</span><span class="n">n_</span><span class="p">,</span> <span class="n">ne</span><span class="p">,</span> <span class="n">e_</span><span class="p">,</span> <span class="n">se</span><span class="p">,</span> <span class="n">s_</span><span class="p">,</span> <span class="n">sw</span><span class="p">,</span> <span class="n">w_</span><span class="p">,</span> <span class="n">nw</span><span class="p">]</span> <span class="n">count</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">state</span> <span class="ow">in</span> <span class="n">neighbor_states</span><span class="p">:</span> <span class="k">if</span> <span class="n">state</span> <span class="o">==</span> <span 
class="n">ALIVE</span><span class="p">:</span> <span class="n">count</span> <span class="o">+=</span> <span class="mi">1</span> <span class="k">return</span> <span class="n">count</span> </pre></div> <p>I can drive the <code>count_neighbors</code> coroutine with fake data to test it. Here I show how <code>Query</code> objects will be yielded for each neighbor. <code>count_neighbors</code> expects to receive cell states corresponding to each <code>Query</code> through the coroutine&#8217;s <code>send</code> method. The final count is returned in the <code>StopIteration</code> exception that is raised when the generator is exhausted by the <code>return</code> statement.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="n">count_neighbors</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span> <span class="n">q1</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">it</span><span class="p">)</span> <span class="c1"># Get the first query</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;First yield: &#39;</span><span class="p">,</span> <span class="n">q1</span><span class="p">)</span> <span class="n">q2</span> <span class="o">=</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">ALIVE</span><span class="p">)</span> <span class="c1"># Send q1 state, get q2</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Second yield:&#39;</span><span class="p">,</span> <span class="n">q2</span><span class="p">)</span> <span class="n">q3</span> <span class="o">=</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">ALIVE</span><span class="p">)</span> <span class="c1"># Send q2 state, get q3</span> <span 
class="c1"># ...</span> <span class="k">try</span><span class="p">:</span> <span class="n">count</span> <span class="o">=</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">EMPTY</span><span class="p">)</span> <span class="c1"># Send q8 state, retrieve count</span> <span class="k">except</span> <span class="ne">StopIteration</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Count: &#39;</span><span class="p">,</span> <span class="n">e</span><span class="o">.</span><span class="n">value</span><span class="p">)</span> <span class="c1"># Value from return statement</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">First yield: Query(y=11, x=5)</span> <span class="go">Second yield: Query(y=11, x=6)</span> <span class="go">...</span> <span class="go">Count: 2</span> </pre></div> <p>Now I need the ability to indicate that a cell will transition to a new state in response to the neighbor count that it found from <code>count_neighbors</code>. To do this, I define another coroutine called <code>step_cell</code>. This generator will indicate transitions in a cell&#8217;s state by yielding <code>Transition</code> objects. This is another class that I define, just like the <code>Query</code> class.</p> <div class="highlight"><pre><span></span><span class="n">Transition</span> <span class="o">=</span> <span class="n">namedtuple</span><span class="p">(</span><span class="s1">&#39;Transition&#39;</span><span class="p">,</span> <span class="p">(</span><span class="s1">&#39;y&#39;</span><span class="p">,</span> <span class="s1">&#39;x&#39;</span><span class="p">,</span> <span class="s1">&#39;state&#39;</span><span class="p">))</span> </pre></div> <p>The <code>step_cell</code> coroutine receives its coordinates in the grid as arguments. 
It yields a <code>Query</code> to get the initial state of those coordinates. It runs <code>count_neighbors</code> to inspect the cells around it. It runs the game logic to determine what state the cell should have for the next clock tick. Finally, it yields a <code>Transition</code> object to tell the environment the cell&#8217;s next state.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">game_logic</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">neighbors</span><span class="p">):</span> <span class="c1"># ...</span> <span class="k">def</span> <span class="nf">step_cell</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span> <span class="n">state</span> <span class="o">=</span> <span class="k">yield</span> <span class="n">Query</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="n">neighbors</span> <span class="o">=</span> <span class="k">yield from</span> <span class="n">count_neighbors</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="n">next_state</span> <span class="o">=</span> <span class="n">game_logic</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">neighbors</span><span class="p">)</span> <span class="k">yield</span> <span class="n">Transition</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">next_state</span><span class="p">)</span> </pre></div> <p>Importantly, the call to <code>count_neighbors</code> uses the <code>yield from</code> expression. 
This expression allows Python to compose generator coroutines together, making it easy to reuse smaller pieces of functionality and build complex coroutines from simpler ones. When <code>count_neighbors</code> is exhausted, the final value it returns (with the <code>return</code> statement) will be passed to <code>step_cell</code> as the result of the <code>yield from</code> expression.</p> <p>Now I can finally define the simple game logic for Conway&#8217;s Game of Life. There are only three rules.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">game_logic</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">neighbors</span><span class="p">):</span> <span class="k">if</span> <span class="n">state</span> <span class="o">==</span> <span class="n">ALIVE</span><span class="p">:</span> <span class="k">if</span> <span class="n">neighbors</span> <span class="o">&lt;</span> <span class="mi">2</span><span class="p">:</span> <span class="k">return</span> <span class="n">EMPTY</span> <span class="c1"># Die: Too few</span> <span class="k">elif</span> <span class="n">neighbors</span> <span class="o">&gt;</span> <span class="mi">3</span><span class="p">:</span> <span class="k">return</span> <span class="n">EMPTY</span> <span class="c1"># Die: Too many</span> <span class="k">else</span><span class="p">:</span> <span class="k">if</span> <span class="n">neighbors</span> <span class="o">==</span> <span class="mi">3</span><span class="p">:</span> <span class="k">return</span> <span class="n">ALIVE</span> <span class="c1"># Regenerate</span> <span class="k">return</span> <span class="n">state</span> </pre></div> <p>I can drive the <code>step_cell</code> coroutine with fake data to test it.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="n">step_cell</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span 
class="mi">5</span><span class="p">)</span> <span class="n">q0</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">it</span><span class="p">)</span> <span class="c1"># Initial location query</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Me: &#39;</span><span class="p">,</span> <span class="n">q0</span><span class="p">)</span> <span class="n">q1</span> <span class="o">=</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">ALIVE</span><span class="p">)</span> <span class="c1"># Send my status, get neighbor query</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Q1: &#39;</span><span class="p">,</span> <span class="n">q1</span><span class="p">)</span> <span class="c1"># ...</span> <span class="n">t1</span> <span class="o">=</span> <span class="n">it</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">EMPTY</span><span class="p">)</span> <span class="c1"># Send for q8, get game decision</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Outcome: &#39;</span><span class="p">,</span> <span class="n">t1</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Me: Query(y=10, x=5)</span> <span class="go">Q1: Query(y=11, x=5)</span> <span class="go">...</span> <span class="go">Outcome: Transition(y=10, x=5, state=&#39;-&#39;)</span> </pre></div> <p>The goal of the game is to run this logic for a whole grid of cells in lockstep. To do this, I can further compose the <code>step_cell</code> coroutine into a <code>simulate</code> coroutine. This coroutine progresses the grid of cells forward by yielding from <code>step_cell</code> many times. 
After progressing every coordinate, it yields a <code>TICK</code> object to indicate that the current generation of cells have all transitioned.</p> <div class="highlight"><pre><span></span><span class="n">TICK</span> <span class="o">=</span> <span class="nb">object</span><span class="p">()</span> <span class="k">def</span> <span class="nf">simulate</span><span class="p">(</span><span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">):</span> <span class="k">while</span> <span class="kc">True</span><span class="p">:</span> <span class="k">for</span> <span class="n">y</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">height</span><span class="p">):</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">width</span><span class="p">):</span> <span class="k">yield from</span> <span class="n">step_cell</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="k">yield</span> <span class="n">TICK</span> </pre></div> <p>What&#8217;s impressive about <code>simulate</code> is that it&#8217;s completely disconnected from the surrounding environment. I still haven&#8217;t defined how the grid is represented in Python objects, how <code>Query</code>, <code>Transition</code>, and <code>TICK</code> values are handled on the outside, nor how the game gets its initial state. But the logic is clear. Each cell will transition by running <code>step_cell</code>. Then the game clock will tick. This will continue forever, as long as the <code>simulate</code> coroutine is advanced.</p> <p>This is the beauty of coroutines. They help you focus on the logic of what you&#8217;re trying to accomplish. They decouple your code&#8217;s instructions for the environment from the implementation that carries out your wishes. 
This enables you to run coroutines seemingly in parallel. It also lets you improve the implementation that carries out those instructions over time without changing the coroutines.</p> <p>Now I want to run <code>simulate</code> in a real environment. To do that, I need to represent the state of each cell in the grid. Here I define a class to contain the grid.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Grid</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">height</span> <span class="o">=</span> <span class="n">height</span> <span class="bp">self</span><span class="o">.</span><span class="n">width</span> <span class="o">=</span> <span class="n">width</span> <span class="bp">self</span><span class="o">.</span><span class="n">rows</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">height</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">rows</span><span class="o">.</span><span class="n">append</span><span class="p">([</span><span class="n">EMPTY</span><span class="p">]</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">width</span><span class="p">)</span> <span class="k">def</span> <span class="fm">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="c1"># ...</span> </pre></div> <p>The grid allows you to get and set the value 
of any coordinate. Coordinates that are out of bounds will wrap around, making the grid act like an infinite looping space.</p> <div class="highlight"><pre><span></span> <span class="k">def</span> <span class="nf">query</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span> <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">rows</span><span class="p">[</span><span class="n">y</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">height</span><span class="p">][</span><span class="n">x</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">width</span><span class="p">]</span> <span class="k">def</span> <span class="nf">assign</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">state</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">rows</span><span class="p">[</span><span class="n">y</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">height</span><span class="p">][</span><span class="n">x</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">width</span><span class="p">]</span> <span class="o">=</span> <span class="n">state</span> </pre></div> <p>At last, I can define the function that interprets the values yielded from <code>simulate</code> and all of its interior coroutines. This function turns the instructions from the coroutines into interactions with the surrounding environment. 
It progresses the whole grid of cells forward a single step and then returns a new grid containing the next state.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">live_a_generation</span><span class="p">(</span><span class="n">grid</span><span class="p">,</span> <span class="n">sim</span><span class="p">):</span> <span class="n">progeny</span> <span class="o">=</span> <span class="n">Grid</span><span class="p">(</span><span class="n">grid</span><span class="o">.</span><span class="n">height</span><span class="p">,</span> <span class="n">grid</span><span class="o">.</span><span class="n">width</span><span class="p">)</span> <span class="n">item</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">sim</span><span class="p">)</span> <span class="k">while</span> <span class="n">item</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">TICK</span><span class="p">:</span> <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">item</span><span class="p">,</span> <span class="n">Query</span><span class="p">):</span> <span class="n">state</span> <span class="o">=</span> <span class="n">grid</span><span class="o">.</span><span class="n">query</span><span class="p">(</span><span class="n">item</span><span class="o">.</span><span class="n">y</span><span class="p">,</span> <span class="n">item</span><span class="o">.</span><span class="n">x</span><span class="p">)</span> <span class="n">item</span> <span class="o">=</span> <span class="n">sim</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">state</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="c1"># Must be a Transition</span> <span class="n">progeny</span><span class="o">.</span><span class="n">assign</span><span class="p">(</span><span class="n">item</span><span 
class="o">.</span><span class="n">y</span><span class="p">,</span> <span class="n">item</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="n">item</span><span class="o">.</span><span class="n">state</span><span class="p">)</span> <span class="n">item</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">sim</span><span class="p">)</span> <span class="k">return</span> <span class="n">progeny</span> </pre></div> <p>To see this function in action, I need to create a grid and set its initial state. Here I make a classic shape called a glider.</p> <div class="highlight"><pre><span></span><span class="n">grid</span> <span class="o">=</span> <span class="n">Grid</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">9</span><span class="p">)</span> <span class="n">grid</span><span class="o">.</span><span class="n">assign</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">ALIVE</span><span class="p">)</span> <span class="c1"># ...</span> <span class="nb">print</span><span class="p">(</span><span class="n">grid</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">---*-----</span> <span class="go">----*----</span> <span class="go">--***----</span> <span class="go">---------</span> <span class="go">---------</span> </pre></div> <p>Now I can progress this grid forward one generation at a time. 
You can see how the glider moves down and to the right on the grid based on the simple rules from the <code>game_logic</code> function.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">ColumnPrinter</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> <span class="c1"># ...</span> <span class="n">columns</span> <span class="o">=</span> <span class="n">ColumnPrinter</span><span class="p">()</span> <span class="n">sim</span> <span class="o">=</span> <span class="n">simulate</span><span class="p">(</span><span class="n">grid</span><span class="o">.</span><span class="n">height</span><span class="p">,</span> <span class="n">grid</span><span class="o">.</span><span class="n">width</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">5</span><span class="p">):</span> <span class="n">columns</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">grid</span><span class="p">))</span> <span class="n">grid</span> <span class="o">=</span> <span class="n">live_a_generation</span><span class="p">(</span><span class="n">grid</span><span class="p">,</span> <span class="n">sim</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">columns</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go"> 0 | 1 | 2 | 3 | 4</span> <span class="go">---*----- | --------- | --------- | --------- | ---------</span> <span class="go">----*---- | --*-*---- | ----*---- | ---*----- | ----*----</span> <span class="go">--***---- | ---**---- | --*-*---- | ----**--- | -----*---</span> <span class="go">--------- | ---*----- | ---**---- | ---**---- | ---***---</span> <span 
class="go">--------- | --------- | --------- | --------- | ---------</span> </pre></div> <p>The best part about this approach is that I can change the <code>game_logic</code> function without having to update the code that surrounds it. I can change the rules or add larger spheres of influence with the existing mechanics of <code>Query</code>, <code>Transition</code>, and <code>TICK</code>. This demonstrates how coroutines enable the separation of concerns, which is an important design principle.</p> <h3>Coroutines in Python 2</h3> <p>Unfortunately, Python 2 is missing some of the syntactical sugar that makes coroutines so elegant in Python 3. There are two limitations. First, there is no <code>yield from</code> expression. That means when you want to compose generator coroutines in Python 2, you need to include an additional loop at the delegation point.</p> <div class="highlight"><pre><span></span><span class="c1"># Python 2</span> <span class="k">def</span> <span class="nf">delegated</span><span class="p">():</span> <span class="k">yield</span> <span class="mi">1</span> <span class="k">yield</span> <span class="mi">2</span> <span class="k">def</span> <span class="nf">composed</span><span class="p">():</span> <span class="k">yield</span> <span class="s1">&#39;A&#39;</span> <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">delegated</span><span class="p">():</span> <span class="c1"># yield from in Python 3</span> <span class="k">yield</span> <span class="n">value</span> <span class="k">yield</span> <span class="s1">&#39;B&#39;</span> <span class="nb">print</span> <span class="nb">list</span><span class="p">(</span><span class="n">composed</span><span class="p">())</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">[&#39;A&#39;, 1, 2, &#39;B&#39;]</span> </pre></div> <p>The second limitation is that there is no support for the <code>return</code> 
statement in Python 2 generators. To get the same behavior that interacts correctly with <code>try</code>/<code>except</code>/<code>finally</code> blocks, you need to define your own exception type and raise it when you want to return a value.</p> <div class="highlight"><pre><span></span><span class="c1"># Python 2</span> <span class="k">class</span> <span class="nc">MyReturn</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span> <span class="k">def</span> <span class="nf">delegated</span><span class="p">():</span> <span class="k">yield</span> <span class="mi">1</span> <span class="k">raise</span> <span class="n">MyReturn</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span> <span class="c1"># return 2 in Python 3</span> <span class="k">yield</span> <span class="s1">&#39;Not reached&#39;</span> <span class="k">def</span> <span class="nf">composed</span><span class="p">():</span> <span class="k">try</span><span class="p">:</span> <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">delegated</span><span class="p">():</span> <span class="k">yield</span> <span class="n">value</span> <span class="k">except</span> <span class="n">MyReturn</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> <span class="n">output</span> <span class="o">=</span> <span class="n">e</span><span class="o">.</span><span class="n">value</span> <span class="k">yield</span> <span class="n">output</span> <span class="o">*</span> <span class="mi">4</span> <span class="nb">print</span> <span class="nb">list</span><span class="p">(</span><span 
class="n">composed</span><span class="p">())</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">[1, 8]</span> </pre></div> <h3>Things to Remember</h3> <ul> <li>Coroutines provide an efficient way to run tens of thousands of functions seemingly at the same time.</li> <li>Within a generator, the value of the <code>yield</code> expression will be whatever value was passed to the generator&#8217;s <code>send</code> method from the exterior code.</li> <li>Coroutines give you a powerful tool for separating the core logic of your program from its interaction with the surrounding environment.</li> <li>Python 2 doesn&#8217;t support <code>yield from</code> or <code>return</code> within generators.</li> </ul>The Book Is Published2015-03-06T14:00:00-08:002015-03-06T14:00:00-08:00Brett Slatkintag:effectivepython.com,2015-03-06:/2015/03/06/books-is-published/<p>Printed, physical copies of the <em>Effective Python</em> book are now for sale! <a href="http://goo.gl/mpVxz2">Follow this link to buy directly from the publisher</a> (free shipping in the <span class="caps">USA</span>). The publisher also has ePub and <span class="caps">PDF</span> versions available. <a href="http://amzn.to/1ylkKmc">Follow this link to buy from Amazon</a>. Amazon also has a Kindle edition available.</p>Digital Editions Now Available2015-02-15T13:00:00-08:002015-02-15T13:00:00-08:00Brett Slatkintag:effectivepython.com,2015-02-15:/2015/02/15/digital-editions-now-available/<p>Digital editions of <em>Effective Python</em> are now available. <a href="http://click.linksynergy.com/link?id=YvEWtFaKGwg&amp;offerid=145238.2235742&amp;type=2&amp;murl=http%3A%2F%2Fwww.informit.com%2Ftitle%2F9780134034423">Follow this link to buy the ePub or <span class="caps">PDF</span> version</a>. <a href="http://amzn.to/1AFwumA">Follow this link to buy the Kindle edition</a>. 
The print copy is due out on March 6th.</p>Item 23: Accept Functions for Simple Interfaces Instead of Classes2015-02-12T10:20:00-08:002015-02-12T10:20:00-08:00Brett Slatkintag:effectivepython.com,2015-02-12:/2015/02/12/accept-functions-for-simple-interfaces-instead-of-classes/ <p>Many of Python&#8217;s built-in APIs allow you to customize behavior by passing in a function. These <em>hooks</em> are used by APIs to call back your code while they execute.<p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>Many of Python&#8217;s built-in APIs allow you to customize behavior by passing in a function. These <em>hooks</em> are used by APIs to call back your code while they execute. For example, the <code>list</code> type&#8217;s <code>sort</code> method takes an optional <code>key</code> argument that&#8217;s used to determine each index&#8217;s value for sorting. Here I sort a list of names based on their lengths by providing a <code>lambda</code> expression as the <code>key</code> hook.</p> <div class="highlight"><pre><span></span><span class="n">names</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;Socrates&#39;</span><span class="p">,</span> <span class="s1">&#39;Archimedes&#39;</span><span class="p">,</span> <span class="s1">&#39;Plato&#39;</span><span class="p">,</span> <span class="s1">&#39;Aristotle&#39;</span><span class="p">]</span> <span class="n">names</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="nb">print</span><span class="p">(</span><span class="n">names</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span 
class="gp">&gt;&gt;&gt;</span> <span class="go">[&#39;Plato&#39;, &#39;Socrates&#39;, &#39;Aristotle&#39;, &#39;Archimedes&#39;]</span> </pre></div> <p>In other languages, you might expect hooks to be defined by an abstract class. In Python, many hooks are just stateless functions with well-defined arguments and return values. Functions are ideal for hooks because they are easier to describe and simpler to define than classes. Functions work as hooks because Python has <em>first-class</em> functions: Functions and methods can be passed around and referenced like any other value in the language.</p> <p>For example, say you want to customize the behavior of the <code>defaultdict</code> class. This data structure allows you to supply a function that will be called each time a missing key is accessed. The function must return the default value the missing key should have in the dictionary. Here I define a hook that logs each time a key is missing and returns <code>0</code> for the default value.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">log_missing</span><span class="p">():</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Key added&#39;</span><span class="p">)</span> <span class="k">return</span> <span class="mi">0</span> </pre></div> <p>Given an initial dictionary and a set of desired increments, I can cause the <code>log_missing</code> function to run and print twice (for <code>'red'</code> and <code>'orange'</code>).</p> <div class="highlight"><pre><span></span><span class="n">current</span> <span class="o">=</span> <span class="p">{</span><span class="s1">&#39;green&#39;</span><span class="p">:</span> <span class="mi">12</span><span class="p">,</span> <span class="s1">&#39;blue&#39;</span><span class="p">:</span> <span class="mi">3</span><span class="p">}</span> <span class="n">increments</span> <span class="o">=</span> <span class="p">[</span> <span class="p">(</span><span 
class="s1">&#39;red&#39;</span><span class="p">,</span> <span class="mi">5</span><span class="p">),</span> <span class="p">(</span><span class="s1">&#39;blue&#39;</span><span class="p">,</span> <span class="mi">17</span><span class="p">),</span> <span class="p">(</span><span class="s1">&#39;orange&#39;</span><span class="p">,</span> <span class="mi">9</span><span class="p">),</span> <span class="p">]</span> <span class="n">result</span> <span class="o">=</span> <span class="n">defaultdict</span><span class="p">(</span><span class="n">log_missing</span><span class="p">,</span> <span class="n">current</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Before:&#39;</span><span class="p">,</span> <span class="nb">dict</span><span class="p">(</span><span class="n">result</span><span class="p">))</span> <span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">amount</span> <span class="ow">in</span> <span class="n">increments</span><span class="p">:</span> <span class="n">result</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">+=</span> <span class="n">amount</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;After: &#39;</span><span class="p">,</span> <span class="nb">dict</span><span class="p">(</span><span class="n">result</span><span class="p">))</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Before: {&#39;blue&#39;: 3, &#39;green&#39;: 12}</span> <span class="go">Key added</span> <span class="go">Key added</span> <span class="go">After: {&#39;red&#39;: 5, &#39;green&#39;: 12, &#39;blue&#39;: 20, &#39;orange&#39;: 9}</span> </pre></div> <p>Supplying functions like <code>log_missing</code> makes APIs easy to build and test because it separates side effects from deterministic behavior. 
For example, say you now want the default value hook passed to <code>defaultdict</code> to count the total number of keys that were missing. One way to achieve this is using a stateful closure (see Item 15 for details). Here I define a helper function that uses such a closure as the default value hook.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">increment_with_report</span><span class="p">(</span><span class="n">current</span><span class="p">,</span> <span class="n">increments</span><span class="p">):</span> <span class="n">added_count</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">def</span> <span class="nf">missing</span><span class="p">():</span> <span class="k">nonlocal</span> <span class="n">added_count</span> <span class="c1"># Stateful closure</span> <span class="n">added_count</span> <span class="o">+=</span> <span class="mi">1</span> <span class="k">return</span> <span class="mi">0</span> <span class="n">result</span> <span class="o">=</span> <span class="n">defaultdict</span><span class="p">(</span><span class="n">missing</span><span class="p">,</span> <span class="n">current</span><span class="p">)</span> <span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">amount</span> <span class="ow">in</span> <span class="n">increments</span><span class="p">:</span> <span class="n">result</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">+=</span> <span class="n">amount</span> <span class="k">return</span> <span class="n">result</span><span class="p">,</span> <span class="n">added_count</span> </pre></div> <p>Running this function produces the expected result (<code>2</code>), even though the <code>defaultdict</code> has no idea that the <code>missing</code> hook maintains state. This is another benefit of accepting simple functions for interfaces. 
It&#8217;s easy to add functionality later by hiding state in a closure.</p>
<div class="highlight"><pre><span></span><span class="n">result</span><span class="p">,</span> <span class="n">count</span> <span class="o">=</span> <span class="n">increment_with_report</span><span class="p">(</span><span class="n">current</span><span class="p">,</span> <span class="n">increments</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">count</span> <span class="o">==</span> <span class="mi">2</span>
</pre></div>
<p>The problem with defining a closure for stateful hooks is that it&#8217;s harder to read than the stateless function example. Another approach is to define a small class that encapsulates the state you want to track.</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">CountMissing</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">added</span> <span class="o">=</span> <span class="mi">0</span>

    <span class="k">def</span> <span class="nf">missing</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">added</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="k">return</span> <span class="mi">0</span>
</pre></div>
<p>In other languages, you might expect that now <code>defaultdict</code> would have to be modified to accommodate the interface of <code>CountMissing</code>. But in Python, thanks to first-class functions, you can reference the <code>CountMissing.missing</code> method directly on an object and pass it to <code>defaultdict</code> as the default value hook. 
It&#8217;s trivial to have a method satisfy a function interface.</p>
<div class="highlight"><pre><span></span><span class="n">counter</span> <span class="o">=</span> <span class="n">CountMissing</span><span class="p">()</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">defaultdict</span><span class="p">(</span><span class="n">counter</span><span class="o">.</span><span class="n">missing</span><span class="p">,</span> <span class="n">current</span><span class="p">)</span>  <span class="c1"># Method reference</span>

<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">amount</span> <span class="ow">in</span> <span class="n">increments</span><span class="p">:</span>
    <span class="n">result</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">+=</span> <span class="n">amount</span>
<span class="k">assert</span> <span class="n">counter</span><span class="o">.</span><span class="n">added</span> <span class="o">==</span> <span class="mi">2</span>
</pre></div>
<p>Using a helper class like this to provide the behavior of a stateful closure is clearer than the <code>increment_with_report</code> function above. However, in isolation it&#8217;s still not immediately obvious what the purpose of the <code>CountMissing</code> class is. Who constructs a <code>CountMissing</code> object? Who calls the <code>missing</code> method? Will the class need other public methods to be added in the future? Until you see its usage with <code>defaultdict</code>, the class is a mystery.</p>
<p>To clarify this situation, Python allows classes to define the <code>__call__</code> special method. <code>__call__</code> allows an object to be called just like a function. 
It also causes the <code>callable</code> built-in function to return <code>True</code> for such an instance.</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">BetterCountMissing</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">added</span> <span class="o">=</span> <span class="mi">0</span>

    <span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">added</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="k">return</span> <span class="mi">0</span>

<span class="n">counter</span> <span class="o">=</span> <span class="n">BetterCountMissing</span><span class="p">()</span>
<span class="n">counter</span><span class="p">()</span>
<span class="k">assert</span> <span class="nb">callable</span><span class="p">(</span><span class="n">counter</span><span class="p">)</span>
</pre></div>
<p>Here I use a <code>BetterCountMissing</code> instance as the default value hook for a <code>defaultdict</code> to track the number of missing keys that were added.</p>
<div class="highlight"><pre><span></span><span class="n">counter</span> <span class="o">=</span> <span class="n">BetterCountMissing</span><span class="p">()</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">defaultdict</span><span class="p">(</span><span class="n">counter</span><span class="p">,</span> <span class="n">current</span><span class="p">)</span>  <span class="c1"># Relies on __call__</span>
<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">amount</span> <span class="ow">in</span> <span class="n">increments</span><span class="p">:</span>
    <span class="n">result</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">+=</span> <span class="n">amount</span>
<span class="k">assert</span> <span class="n">counter</span><span class="o">.</span><span class="n">added</span> <span class="o">==</span> <span class="mi">2</span>
</pre></div>
<p>This is much clearer than the <code>CountMissing.missing</code> example. The <code>__call__</code> method indicates that a class&#8217;s instances will be used somewhere a function argument would also be suitable (like <span class="caps">API</span> hooks). It directs new readers of the code to the entry point that&#8217;s responsible for the class&#8217;s primary behavior. It provides a strong hint that the goal of the class is to act as a stateful closure.</p>
<p>Best of all, <code>defaultdict</code> still has no view into what&#8217;s going on when you use <code>__call__</code>. All that <code>defaultdict</code> requires is a function for the default value hook. 
Python provides many different ways to satisfy a simple function interface depending on what you need to accomplish.</p> <h3>Things to Remember</h3> <ul> <li>Instead of defining and instantiating classes, functions are often all you need for simple interfaces between components in Python.</li> <li>References to functions and methods in Python are first-class, meaning they can be used in expressions like any other type.</li> <li>The <code>__call__</code> special method enables instances of a class to be called like plain Python functions.</li> <li>When you need a function to maintain state, consider defining a class that provides the <code>__call__</code> method instead of defining a stateful closure.</li> </ul>Item 34: Register Class Existence with Metaclasses2015-02-02T22:22:00-08:002015-02-02T22:22:00-08:00Brett Slatkintag:effectivepython.com,2015-02-02:/2015/02/02/register-class-existence-with-metaclasses/ <p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>A common use of metaclasses is to automatically register types in your program. Registration is useful for doing reverse lookups, where you need to map a simple identifier back to a corresponding class.</p> <p>For example, say you want to implement your own serialized representation of a Python object using <span class="caps">JSON</span>. You need a way to take an object and turn it into a <span class="caps">JSON</span> string. 
Here I do this generically by defining a base class that records the constructor parameters and turns them into a <span class="caps">JSON</span> dictionary.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Serializable</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">args</span> <span class="o">=</span> <span class="n">args</span> <span class="k">def</span> <span class="nf">serialize</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">return</span> <span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">({</span><span class="s1">&#39;args&#39;</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">args</span><span class="p">})</span> </pre></div> <p>This class makes it easy to serialize simple, immutable data structures like <code>Point2D</code> to a string.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Point2D</span><span class="p">(</span><span class="n">Serializable</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="bp">self</span><span class="o">.</span><span class="n">x</span> <span 
class="o">=</span> <span class="n">x</span> <span class="bp">self</span><span class="o">.</span><span class="n">y</span> <span class="o">=</span> <span class="n">y</span> <span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">return</span> <span class="s1">&#39;Point2D(</span><span class="si">%d</span><span class="s1">, </span><span class="si">%d</span><span class="s1">)&#39;</span> <span class="o">%</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">y</span><span class="p">)</span> <span class="n">point</span> <span class="o">=</span> <span class="n">Point2D</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Object: &#39;</span><span class="p">,</span> <span class="n">point</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Serialized:&#39;</span><span class="p">,</span> <span class="n">point</span><span class="o">.</span><span class="n">serialize</span><span class="p">())</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Object: Point2D(5, 3)</span> <span class="go">Serialized: {&quot;args&quot;: [5, 3]}</span> </pre></div> <p>Now I need to deserialize this <span class="caps">JSON</span> string and construct the <code>Point2D</code> object it represents. 
Here I define another class that can deserialize the data from its <code>Serializable</code> parent class.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Deserializable</span><span class="p">(</span><span class="n">Serializable</span><span class="p">):</span> <span class="nd">@classmethod</span> <span class="k">def</span> <span class="nf">deserialize</span><span class="p">(</span><span class="bp">cls</span><span class="p">,</span> <span class="n">json_data</span><span class="p">):</span> <span class="n">params</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">json_data</span><span class="p">)</span> <span class="k">return</span> <span class="bp">cls</span><span class="p">(</span><span class="o">*</span><span class="n">params</span><span class="p">[</span><span class="s1">&#39;args&#39;</span><span class="p">])</span> </pre></div> <p>Using <code>Deserializable</code> makes it easy to serialize and deserialize simple, immutable objects in a generic way.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">BetterPoint2D</span><span class="p">(</span><span class="n">Deserializable</span><span class="p">):</span> <span class="c1"># ...</span> <span class="n">point</span> <span class="o">=</span> <span class="n">BetterPoint2D</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Before: &#39;</span><span class="p">,</span> <span class="n">point</span><span class="p">)</span> <span class="n">data</span> <span class="o">=</span> <span class="n">point</span><span class="o">.</span><span class="n">serialize</span><span class="p">()</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Serialized:&#39;</span><span 
class="p">,</span> <span class="n">data</span><span class="p">)</span> <span class="n">after</span> <span class="o">=</span> <span class="n">BetterPoint2D</span><span class="o">.</span><span class="n">deserialize</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;After: &#39;</span><span class="p">,</span> <span class="n">after</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Before: BetterPoint2D(5, 3)</span> <span class="go">Serialized: {&quot;args&quot;: [5, 3]}</span> <span class="go">After: BetterPoint2D(5, 3)</span> </pre></div> <p>The problem with this approach is it only works if you know the intended type of the serialized data ahead of time (e.g., <code>Point2D</code>, <code>BetterPoint2D</code>). Ideally you&#8217;d have a large number of classes serializing to <span class="caps">JSON</span> and one common function that could deserialize any of them back to a corresponding Python object.</p> <p>To do this, I can include the serialized object&#8217;s class name in the <span class="caps">JSON</span> data.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">BetterSerializable</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">args</span> <span class="o">=</span> <span class="n">args</span> <span class="k">def</span> <span class="nf">serialize</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">return</span> <span class="n">json</span><span class="o">.</span><span 
class="n">dumps</span><span class="p">({</span> <span class="s1">&#39;class&#39;</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__class__</span><span class="o">.</span><span class="vm">__name__</span><span class="p">,</span> <span class="s1">&#39;args&#39;</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">args</span><span class="p">,</span> <span class="p">})</span> <span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="c1"># ...</span> </pre></div> <p>Then I can maintain a mapping of class names back to constructors for those objects. The general <code>deserialize</code> function will work for any classes passed to <code>register_class</code>.</p> <div class="highlight"><pre><span></span><span class="n">registry</span> <span class="o">=</span> <span class="p">{}</span> <span class="k">def</span> <span class="nf">register_class</span><span class="p">(</span><span class="n">target_class</span><span class="p">):</span> <span class="n">registry</span><span class="p">[</span><span class="n">target_class</span><span class="o">.</span><span class="vm">__name__</span><span class="p">]</span> <span class="o">=</span> <span class="n">target_class</span> <span class="k">def</span> <span class="nf">deserialize</span><span class="p">(</span><span class="n">data</span><span class="p">):</span> <span class="n">params</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="n">name</span> <span class="o">=</span> <span class="n">params</span><span class="p">[</span><span class="s1">&#39;class&#39;</span><span class="p">]</span> <span class="n">target_class</span> <span class="o">=</span> <span class="n">registry</span><span 
class="p">[</span><span class="n">name</span><span class="p">]</span> <span class="k">return</span> <span class="n">target_class</span><span class="p">(</span><span class="o">*</span><span class="n">params</span><span class="p">[</span><span class="s1">&#39;args&#39;</span><span class="p">])</span> </pre></div> <p>To ensure <code>deserialize</code> always works properly, I must call <code>register_class</code> for every class I may want to deserialize in the future.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">EvenBetterPoint2D</span><span class="p">(</span><span class="n">BetterSerializable</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="bp">self</span><span class="o">.</span><span class="n">x</span> <span class="o">=</span> <span class="n">x</span> <span class="bp">self</span><span class="o">.</span><span class="n">y</span> <span class="o">=</span> <span class="n">y</span> <span class="n">register_class</span><span class="p">(</span><span class="n">EvenBetterPoint2D</span><span class="p">)</span> </pre></div> <p>Now I can deserialize an arbitrary <span class="caps">JSON</span> string without having to know which class it contains.</p> <div class="highlight"><pre><span></span><span class="n">point</span> <span class="o">=</span> <span class="n">EvenBetterPoint2D</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span 
class="s1">&#39;Before: &#39;</span><span class="p">,</span> <span class="n">point</span><span class="p">)</span> <span class="n">data</span> <span class="o">=</span> <span class="n">point</span><span class="o">.</span><span class="n">serialize</span><span class="p">()</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Serialized:&#39;</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span> <span class="n">after</span> <span class="o">=</span> <span class="n">deserialize</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;After: &#39;</span><span class="p">,</span> <span class="n">after</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Before: EvenBetterPoint2D(5, 3)</span> <span class="go">Serialized: {&quot;class&quot;: &quot;EvenBetterPoint2D&quot;, &quot;args&quot;: [5, 3]}</span> <span class="go">After: EvenBetterPoint2D(5, 3)</span> </pre></div> <p>The problem with this approach is that you can forget to call <code>register_class</code>.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Point3D</span><span class="p">(</span><span class="n">BetterSerializable</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">):</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span 
class="bp">self</span><span class="o">.</span><span class="n">x</span> <span class="o">=</span> <span class="n">x</span> <span class="bp">self</span><span class="o">.</span><span class="n">y</span> <span class="o">=</span> <span class="n">y</span> <span class="bp">self</span><span class="o">.</span><span class="n">z</span> <span class="o">=</span> <span class="n">z</span> <span class="c1"># Forgot to call register_class! Whoops!</span> </pre></div> <p>This will cause your code to break at runtime, when you finally try to deserialize an object of a class you forgot to register.</p> <div class="highlight"><pre><span></span><span class="n">point</span> <span class="o">=</span> <span class="n">Point3D</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span><span class="p">)</span> <span class="n">data</span> <span class="o">=</span> <span class="n">point</span><span class="o">.</span><span class="n">serialize</span><span class="p">()</span> <span class="n">deserialize</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">KeyError: &#39;Point3D&#39;</span> </pre></div> <p>Even though you chose to subclass <code>BetterSerializable</code>, you won&#8217;t actually get all of its features if you forget to call <code>register_class</code> after your <code>class</code> statement body. This approach is error prone and especially challenging for beginners. The same omission can happen with <em>class decorators</em> in Python 3.</p> <p>What if you could somehow act on the programmer&#8217;s intent to use <code>BetterSerializable</code> and ensure <code>register_class</code> is called in all cases? Metaclasses enable this by intercepting the <code>class</code> statement when subclasses are defined. 
This lets you register the new type immediately after the class&#8217;s body.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Meta</span><span class="p">(</span><span class="nb">type</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__new__</span><span class="p">(</span><span class="n">meta</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">bases</span><span class="p">,</span> <span class="n">class_dict</span><span class="p">):</span> <span class="bp">cls</span> <span class="o">=</span> <span class="nb">type</span><span class="o">.</span><span class="fm">__new__</span><span class="p">(</span><span class="n">meta</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">bases</span><span class="p">,</span> <span class="n">class_dict</span><span class="p">)</span> <span class="n">register_class</span><span class="p">(</span><span class="bp">cls</span><span class="p">)</span> <span class="k">return</span> <span class="bp">cls</span> <span class="k">class</span> <span class="nc">RegisteredSerializable</span><span class="p">(</span><span class="n">BetterSerializable</span><span class="p">,</span> <span class="n">metaclass</span><span class="o">=</span><span class="n">Meta</span><span class="p">):</span> <span class="k">pass</span> </pre></div> <p>When I define a subclass of <code>RegisteredSerializable</code>, I can be confident that the call to <code>register_class</code> happened and <code>deserialize</code> will always work as expected.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Vector3D</span><span class="p">(</span><span class="n">RegisteredSerializable</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span 
class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">):</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="bp">self</span><span class="o">.</span><span class="n">x</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">y</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">z</span> <span class="o">=</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span> <span class="n">v3</span> <span class="o">=</span> <span class="n">Vector3D</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="o">-</span><span class="mi">7</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Before: &#39;</span><span class="p">,</span> <span class="n">v3</span><span class="p">)</span> <span class="n">data</span> <span class="o">=</span> <span class="n">v3</span><span class="o">.</span><span class="n">serialize</span><span class="p">()</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Serialized:&#39;</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;After: &#39;</span><span class="p">,</span> <span class="n">deserialize</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">Before: Vector3D(10, -7, 3)</span> <span 
class="go">Serialized: {&quot;class&quot;: &quot;Vector3D&quot;, &quot;args&quot;: [10, -7, 3]}</span> <span class="go">After: Vector3D(10, -7, 3)</span> </pre></div> <p>Using metaclasses for class registration ensures that you&#8217;ll never miss a class as long as the inheritance tree is right. This works well for serialization, as I&#8217;ve shown, and also applies to Database ORMs, plug-in systems, and system hooks.</p> <h3>Things to Remember</h3> <ul> <li>Class registration is a helpful pattern for building modular Python programs.</li> <li>Metaclasses let you run registration code automatically each time your base class is subclassed in a program.</li> <li>Using metaclasses for class registration avoids errors by ensuring that you never miss a registration call.</li> </ul>Item 53: Use Virtual Environments for Isolated and Reproducible Dependencies2015-01-21T19:45:00-08:002015-01-21T19:45:00-08:00Brett Slatkintag:effectivepython.com,2015-01-21:/2015/01/21/use-virtual-environments-for-isolated-and-reproducible-dependencies/ <p><strong>This sample is from a previous version of the book. <a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>Building larger and more complex programs often leads you to rely on various packages from the Python community. 
You&#8217;ll find yourself running <code>pip</code> to install packages like <code>pytz</code>, <code>numpy</code>, and many others.</p>
<p>The problem is that by default <code>pip</code> installs new packages in a global location. That causes all Python programs on your system to be affected by these installed modules. In theory, this shouldn&#8217;t be an issue. If you install a package and never <code>import</code> it, how could it affect your programs?</p>
<p>The trouble comes from transitive dependencies: the packages that the packages you install depend on. For example, you can see what the <code>Sphinx</code> package depends on after installing it by asking <code>pip</code>.</p>
<div class="highlight"><pre><span></span>$ pip3 show Sphinx
---
Name: Sphinx
Version: 1.2.2
Location: /usr/local/lib/python3.4/site-packages
Requires: docutils, Jinja2, Pygments
</pre></div>
<p>If you install another package like <code>flask</code>, you can see that it too depends on the <code>Jinja2</code> package.</p>
<div class="highlight"><pre><span></span>$ pip3 show flask
---
Name: Flask
Version: 0.10.1
Location: /usr/local/lib/python3.4/site-packages
Requires: Werkzeug, Jinja2, itsdangerous
</pre></div>
<p>The conflict arises as <code>Sphinx</code> and <code>flask</code> diverge over time. Perhaps right now they both require the same version of <code>Jinja2</code> and everything is fine. But six months or a year from now, <code>Jinja2</code> may release a new version that makes breaking changes to users of the library. If you update your global version of <code>Jinja2</code> with <code>pip install --upgrade</code>, you may find that <code>Sphinx</code> breaks while <code>flask</code> keeps working.</p>
<p>The cause of this breakage is that Python can only have a single global version of a module installed at a time. 
If one of your installed packages must use the new version and another package must use the old version, your system isn&#8217;t going to work properly.</p> <p>Such breakage can even happen when package maintainers try their best to preserve <span class="caps">API</span> compatibility between releases. New versions of a library can subtly change behaviors that <span class="caps">API</span>-consuming code relies on. Users on a system may upgrade one package to a new version but not others, breaking dependencies. There&#8217;s a constant risk of the ground moving beneath your feet.</p> <p>These difficulties are magnified when you collaborate with other developers who do their work on separate computers. It&#8217;s reasonable to assume that the versions of Python and global packages they have installed on their machines will be slightly different from your own. This can cause frustrating situations where a codebase works perfectly on one programmer&#8217;s machine and is completely broken on another&#8217;s.</p> <p>The solution to all of these problems is a tool called <code>pyvenv</code>, which provides <em>virtual environments</em>. Since Python 3.4, the <code>pyvenv</code> command-line tool is available by default along with the Python installation (it&#8217;s also accessible with <code>python -m venv</code>, which is the only form left since Python 3.8 removed the standalone <code>pyvenv</code> script). Prior versions of Python require installing a separate package (with <code>pip install virtualenv</code>) and using a command-line tool called <code>virtualenv</code>.</p> <p><code>pyvenv</code> allows you to create isolated versions of the Python environment. Using <code>pyvenv</code>, you can have many different versions of the same package installed on the same system at the same time without conflicts. This lets you work on many different projects and use many different tools on the same computer.</p> <p><code>pyvenv</code> does this by installing explicit versions of packages and their dependencies into completely separate directory structures. 
This makes it possible to reproduce a Python environment that you know will work with your code. It&#8217;s a reliable way to avoid surprising breakages.</p> <h3>The <code>pyvenv</code> Command</h3> <p>Here&#8217;s a quick tutorial on how to use <code>pyvenv</code> effectively. Before using the tool, it&#8217;s important to note what the <code>python3</code> command refers to on your system. On my computer, <code>python3</code> is located in the <code>/usr/local/bin</code> directory and resolves to version 3.4.2.</p> <div class="highlight"><pre><span></span>$ which python3 /usr/local/bin/python3 $ python3 --version Python 3.4.2 </pre></div> <p>To demonstrate the setup of my environment, I can test that running a command to import the <code>pytz</code> module doesn&#8217;t cause an error. This works because I already have the <code>pytz</code> package installed as a global module.</p> <div class="highlight"><pre><span></span>$ python3 -c &#39;import pytz&#39; $ </pre></div> <p>Now I use <code>pyvenv</code> to create a new virtual environment called <code>myproject</code>. Each virtual environment must live in its own unique directory. The result of the command is a tree of directories and files.</p> <div class="highlight"><pre><span></span>$ pyvenv /tmp/myproject $ cd /tmp/myproject $ ls bin include lib pyvenv.cfg </pre></div> <p>To start using the virtual environment, I use the <code>source</code> command from my shell on the <code>bin/activate</code> script. <code>activate</code> modifies my environment variables to match the virtual environment. 
It also updates my command-line prompt to include the virtual environment name (<code>'myproject'</code>) to make it extremely clear what I&#8217;m working on.</p> <div class="highlight"><pre><span></span>$ source bin/activate (myproject)$ </pre></div> <p>After activation, you can see that the path to the <code>python3</code> command-line tool has moved to within the virtual environment directory.</p> <div class="highlight"><pre><span></span>(myproject)$ which python3 /tmp/myproject/bin/python3 (myproject)$ ls -l /tmp/myproject/bin/python3 ... -&gt; /tmp/myproject/bin/python3.4 (myproject)$ ls -l /tmp/myproject/bin/python3.4 ... -&gt; /usr/local/bin/python3.4 </pre></div> <p>This ensures that changes to the outside system will not affect the virtual environment. Even if the outer system upgrades its default <code>python3</code> to version 3.5, my virtual environment will still explicitly point at version 3.4.</p> <p>The virtual environment I created with <code>pyvenv</code> starts with no packages installed except for <code>pip</code> and <code>setuptools</code>. 
Trying to use the <code>pytz</code> package that was installed as a global module in the outside system will fail because it&#8217;s unknown to the virtual environment.</p> <div class="highlight"><pre><span></span>(myproject)$ python3 -c &#39;import pytz&#39; Traceback (most recent call last): File &quot;&lt;string&gt;&quot;, line 1, in &lt;module&gt; ImportError: No module named &#39;pytz&#39; </pre></div> <p>I can use <code>pip</code> to install the <code>pytz</code> module into my virtual environment.</p> <div class="highlight"><pre><span></span>(myproject)$ pip3 install pytz </pre></div> <p>Once it&#8217;s installed, I can verify it&#8217;s working with the same test import command.</p> <div class="highlight"><pre><span></span>(myproject)$ python3 -c &#39;import pytz&#39; (myproject)$ </pre></div> <p>When you&#8217;re done with a virtual environment and want to go back to your default system, you use the <code>deactivate</code> command. This restores your environment to the system defaults, including the location of the <code>python3</code> command-line tool.</p> <div class="highlight"><pre><span></span>(myproject)$ deactivate $ which python3 /usr/local/bin/python3 </pre></div> <p>If you ever want to work in the <code>myproject</code> environment again, you can just run <code>source bin/activate</code> in the directory like before.</p> <h3>Reproducing Dependencies</h3> <p>Once you have a virtual environment, you can continue installing packages with <code>pip</code> as you need them. Eventually, you may want to copy your environment somewhere else. For example, say you want to reproduce your development environment on a production server. Or maybe you want to clone someone else&#8217;s environment on your own machine so you can run their code.</p> <p><code>pyvenv</code> makes these situations easy. You can use the <code>pip freeze</code> command to save all of your explicit package dependencies into a file. 
By convention, this file is named <code>requirements.txt</code>.</p> <div class="highlight"><pre><span></span>(myproject)$ pip3 freeze &gt; requirements.txt (myproject)$ cat requirements.txt numpy==1.8.2 pytz==2014.4 requests==2.3.0 </pre></div> <p>Now imagine you&#8217;d like to have another virtual environment that matches the <code>myproject</code> environment. You can create a new directory like before using <code>pyvenv</code> and <code>activate</code> it.</p> <div class="highlight"><pre><span></span>$ pyvenv /tmp/otherproject $ cd /tmp/otherproject $ source bin/activate (otherproject)$ </pre></div> <p>The new environment will have no extra packages installed.</p> <div class="highlight"><pre><span></span>(otherproject)$ pip3 list pip (1.5.6) setuptools (2.1) </pre></div> <p>You can install all of the packages from the first environment by running <code>pip install -r</code> on the <code>requirements.txt</code> file that you generated with the <code>pip freeze</code> command.</p> <div class="highlight"><pre><span></span>(otherproject)$ pip3 install -r /tmp/myproject/requirements.txt </pre></div> <p>This command will crank along for a little while as it retrieves and installs all of the packages required to reproduce the first environment. Once it&#8217;s done, listing the set of installed packages in the second virtual environment will produce the same list of dependencies found in the first virtual environment.</p> <div class="highlight"><pre><span></span>(otherproject)$ pip3 list numpy (1.8.2) pip (1.5.6) pytz (2014.4) requests (2.3.0) setuptools (2.1) </pre></div> <p>Using a <code>requirements.txt</code> file is ideal for collaborating with others through a revision control system. 
You can commit changes to your code at the same time you update your list of package dependencies, ensuring they move in lockstep.</p> <p>The gotcha with virtual environments is that moving them breaks everything because all of the paths, like <code>python3</code>, are hard-coded to the environment&#8217;s install directory. But that doesn&#8217;t matter. The whole purpose of virtual environments is to make it easy to reproduce the same setup. Instead of moving a virtual environment directory, just <code>freeze</code> the old one, create a new one somewhere else, and reinstall everything from the <code>requirements.txt</code> file.</p> <h3>Things to Remember</h3> <ul> <li>Virtual environments allow you to use <code>pip</code> to install many different versions of the same package on the same machine without conflicts.</li> <li>Virtual environments are created with <code>pyvenv</code>, enabled with <code>source bin/activate</code>, and disabled with <code>deactivate</code>.</li> <li>You can dump all of the requirements of an environment with <code>pip freeze</code>. You can reproduce the environment by supplying the <code>requirements.txt</code> file to <code>pip install -r</code>.</li> <li>In versions of Python before 3.4, virtual environment support must be installed separately (with <code>pip install virtualenv</code>), and the command-line tool is called <code>virtualenv</code> instead of <code>pyvenv</code>.</li> </ul>Item 17: Be Defensive When Iterating Over Arguments2015-01-03T23:30:00-08:002014-12-05T23:30:00-08:00Brett Slatkintag:effectivepython.com,2015-01-03:/2015/01/03/be-defensive-when-iterating-over-arguments/ <p><strong>This sample is from a previous version of the book. 
<a href="https://effectivepython.com/">See the new third edition here</a>.</strong><br><br></p> <p>When a function takes a list of objects as a parameter, it&#8217;s often important to iterate over that list multiple times. For example, say you want to analyze tourism numbers for the <span class="caps">U.S.</span> State of Texas. Imagine the data set is the number of visitors to each city (in millions per year). You&#8217;d like to figure out what percentage of overall tourism each city receives.</p> <p>To do this you need a normalization function. It sums the inputs to figure out the total number of tourists per year. Then it divides each city&#8217;s individual visitor count by the total to find that city&#8217;s contribution to the whole.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">normalize</span><span class="p">(</span><span class="n">numbers</span><span class="p">):</span> <span class="n">total</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">numbers</span><span class="p">)</span> <span class="n">result</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">numbers</span><span class="p">:</span> <span class="n">percent</span> <span class="o">=</span> <span class="mi">100</span> <span class="o">*</span> <span class="n">value</span> <span class="o">/</span> <span class="n">total</span> <span class="n">result</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">percent</span><span class="p">)</span> <span class="k">return</span> <span class="n">result</span> </pre></div> <p>This function works when given a list of visits.</p> <div class="highlight"><pre><span></span><span class="n">visits</span> <span class="o">=</span> <span class="p">[</span><span class="mi">15</span><span class="p">,</span> <span 
class="mi">35</span><span class="p">,</span> <span class="mi">80</span><span class="p">]</span> <span class="n">percentages</span> <span class="o">=</span> <span class="n">normalize</span><span class="p">(</span><span class="n">visits</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">percentages</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">[11.538461538461538, 26.923076923076923, 61.53846153846154]</span> </pre></div> <p>To scale this up, I need to read the data from a file that contains every city in all of Texas. I define a generator to do this because then I can reuse the same function later when I want to compute tourism numbers for the whole world, a much larger data set.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">read_visits</span><span class="p">(</span><span class="n">data_path</span><span class="p">):</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">data_path</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">f</span><span class="p">:</span> <span class="k">yield</span> <span class="nb">int</span><span class="p">(</span><span class="n">line</span><span class="p">)</span> </pre></div> <p>Surprisingly, calling <code>normalize</code> on the generator&#8217;s return value produces no results.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="n">read_visits</span><span class="p">(</span><span class="s1">&#39;/tmp/my_numbers.txt&#39;</span><span class="p">)</span> <span class="n">percentages</span> <span class="o">=</span> <span class="n">normalize</span><span class="p">(</span><span class="n">it</span><span 
class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">percentages</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">[]</span> </pre></div> <p>The cause of this behavior is that an iterator only produces its results a single time. If you iterate over an iterator or generator that has already raised a <code>StopIteration</code> exception, you won&#8217;t get any results the second time around.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="n">read_visits</span><span class="p">(</span><span class="s1">&#39;/tmp/my_numbers.txt&#39;</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">it</span><span class="p">))</span> <span class="nb">print</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">it</span><span class="p">))</span> <span class="c1"># Already exhausted</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">[15, 35, 80]</span> <span class="go">[]</span> </pre></div> <p>What&#8217;s confusing is that you also won&#8217;t get any errors when you iterate over an already-exhausted iterator. <code>for</code> loops, the <code>list</code> constructor, and many other functions throughout the Python standard library expect the <code>StopIteration</code> exception to be raised during normal operation. These functions can&#8217;t tell the difference between an iterator that has no output and an iterator that had output and is now exhausted.</p> <p>To solve this problem, you can explicitly exhaust an input iterator and keep a copy of its entire contents in a list. You can then iterate over the list version of the data as many times as you need to. 
Here&#8217;s the same function as before, but it defensively copies the input iterator.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">normalize_copy</span><span class="p">(</span><span class="n">numbers</span><span class="p">):</span> <span class="n">numbers</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">numbers</span><span class="p">)</span> <span class="c1"># Copy the iterator</span> <span class="n">total</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">numbers</span><span class="p">)</span> <span class="n">result</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">numbers</span><span class="p">:</span> <span class="n">percent</span> <span class="o">=</span> <span class="mi">100</span> <span class="o">*</span> <span class="n">value</span> <span class="o">/</span> <span class="n">total</span> <span class="n">result</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">percent</span><span class="p">)</span> <span class="k">return</span> <span class="n">result</span> </pre></div> <p>Now the function works correctly on a generator&#8217;s return value.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="n">read_visits</span><span class="p">(</span><span class="s1">&#39;/tmp/my_numbers.txt&#39;</span><span class="p">)</span> <span class="n">percentages</span> <span class="o">=</span> <span class="n">normalize_copy</span><span class="p">(</span><span class="n">it</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">percentages</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span 
class="go">[11.538461538461538, 26.923076923076923, 61.53846153846154]</span> </pre></div> <p>The problem with this approach is that the copy of the input iterator&#8217;s contents could be large. Copying the iterator could cause your program to run out of memory and crash. One way around this is to accept a function that returns a new iterator each time it&#8217;s called.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">normalize_func</span><span class="p">(</span><span class="n">get_iter</span><span class="p">):</span> <span class="n">total</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">get_iter</span><span class="p">())</span> <span class="c1"># New iterator</span> <span class="n">result</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">get_iter</span><span class="p">():</span> <span class="c1"># New iterator</span> <span class="n">percent</span> <span class="o">=</span> <span class="mi">100</span> <span class="o">*</span> <span class="n">value</span> <span class="o">/</span> <span class="n">total</span> <span class="n">result</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">percent</span><span class="p">)</span> <span class="k">return</span> <span class="n">result</span> </pre></div> <p>To use <code>normalize_func</code>, you can pass in a <code>lambda</code> expression that calls the generator and produces a new iterator each time.</p> <div class="highlight"><pre><span></span><span class="n">path</span> <span class="o">=</span> <span class="s1">&#39;/tmp/my_numbers.txt&#39;</span> <span class="n">percentages</span> <span class="o">=</span> <span class="n">normalize_func</span><span class="p">(</span><span class="k">lambda</span><span class="p">:</span> <span class="n">read_visits</span><span class="p">(</span><span class="n">path</span><span class="p">))</span> </pre></div> <p>Though it works, having to pass a lambda function 
like this is clumsy. The better way to achieve the same result is to provide a new container class that implements the <em>iterator protocol</em>.</p> <p>The iterator protocol is how Python <code>for</code> loops and related expressions traverse the contents of a container type. When Python sees a statement like <code>for x in foo</code> it will actually call <code>iter(foo)</code>. The <code>iter</code> built-in function calls the <code>foo.__iter__</code> special method in turn. The <code>__iter__</code> method must return an iterator object (which itself implements the <code>__next__</code> special method). Then the <code>for</code> loop repeatedly calls the <code>next</code> built-in function on the iterator object until it&#8217;s exhausted (and raises a <code>StopIteration</code> exception).</p> <p>It sounds complicated, but practically speaking you can achieve all of this behavior for your classes by implementing the <code>__iter__</code> method as a generator. Here I define an iterable container class that reads the files containing tourism data.</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">ReadVisits</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data_path</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">data_path</span> <span class="o">=</span> <span class="n">data_path</span> <span class="k">def</span> <span class="fm">__iter__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">data_path</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> 
<span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">f</span><span class="p">:</span> <span class="k">yield</span> <span class="nb">int</span><span class="p">(</span><span class="n">line</span><span class="p">)</span> </pre></div> <p>This new container type works correctly when passed to the original function without any modifications.</p> <div class="highlight"><pre><span></span><span class="n">visits</span> <span class="o">=</span> <span class="n">ReadVisits</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> <span class="n">percentages</span> <span class="o">=</span> <span class="n">normalize</span><span class="p">(</span><span class="n">visits</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="n">percentages</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">[11.538461538461538, 26.923076923076923, 61.53846153846154]</span> </pre></div> <p>This works because the <code>sum</code> built-in function in <code>normalize</code> will call <code>ReadVisits.__iter__</code> to allocate a new iterator object. The <code>for</code> loop to normalize the numbers will also call <code>__iter__</code> to allocate a second iterator object. Each of those iterators will be advanced and exhausted independently, ensuring that each unique iteration sees all of the input data values. The only downside of this approach is that it reads the input data multiple times.</p> <p>Now that you know how containers like <code>ReadVisits</code> work, you can write your functions to ensure that parameters aren&#8217;t just iterators. The protocol states that when an iterator is passed to the <code>iter</code> built-in function, <code>iter</code> will return the iterator itself. In contrast, when a container type is passed to <code>iter</code>, a new iterator object will be returned each time. 
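</p> <p>As a minimal sketch of that difference, the following uses only the <code>iter</code> built-in function on a small list and on an iterator taken from it. The variable names are illustrative assumptions and not part of the original example.</p>

```python
# Illustrative sketch: how iter() treats a container versus an iterator.
# The names below are assumptions made for this example only.
numbers = [15, 35, 80]   # A container (a list)
it = iter(numbers)       # An iterator over that container

# A container returns a new iterator object for each iter() call.
container_gives_new = iter(numbers) is not iter(numbers)

# An iterator returns itself from iter(), so both calls yield one object.
iterator_gives_self = iter(it) is iter(it)

print(container_gives_new)
print(iterator_gives_self)
```

<p>Both comparisons evaluate to <code>True</code>. 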
Thus, you can test an input value for this behavior and raise a <code>TypeError</code> to reject iterators.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">normalize_defensive</span><span class="p">(</span><span class="n">numbers</span><span class="p">):</span> <span class="k">if</span> <span class="nb">iter</span><span class="p">(</span><span class="n">numbers</span><span class="p">)</span> <span class="ow">is</span> <span class="nb">iter</span><span class="p">(</span><span class="n">numbers</span><span class="p">):</span> <span class="c1"># An iterator -- bad!</span> <span class="k">raise</span> <span class="ne">TypeError</span><span class="p">(</span><span class="s1">&#39;Must supply a container&#39;</span><span class="p">)</span> <span class="n">total</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">numbers</span><span class="p">)</span> <span class="n">result</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">numbers</span><span class="p">:</span> <span class="n">percent</span> <span class="o">=</span> <span class="mi">100</span> <span class="o">*</span> <span class="n">value</span> <span class="o">/</span> <span class="n">total</span> <span class="n">result</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">percent</span><span class="p">)</span> <span class="k">return</span> <span class="n">result</span> </pre></div> <p>This is ideal if you don&#8217;t want to copy the full input iterator as <code>normalize_copy</code> does above, but you also need to iterate over the input data multiple times. This function works as expected for <code>list</code> and <code>ReadVisits</code> inputs because they are containers. 
It will work for any type of container that follows the iterator protocol.</p> <div class="highlight"><pre><span></span><span class="n">visits</span> <span class="o">=</span> <span class="p">[</span><span class="mi">15</span><span class="p">,</span> <span class="mi">35</span><span class="p">,</span> <span class="mi">80</span><span class="p">]</span> <span class="n">normalize_defensive</span><span class="p">(</span><span class="n">visits</span><span class="p">)</span> <span class="c1"># No error</span> <span class="n">visits</span> <span class="o">=</span> <span class="n">ReadVisits</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> <span class="n">normalize_defensive</span><span class="p">(</span><span class="n">visits</span><span class="p">)</span> <span class="c1"># No error</span> </pre></div> <p>The function will raise an exception if the input is iterable but not a container.</p> <div class="highlight"><pre><span></span><span class="n">it</span> <span class="o">=</span> <span class="nb">iter</span><span class="p">(</span><span class="n">visits</span><span class="p">)</span> <span class="n">normalize_defensive</span><span class="p">(</span><span class="n">it</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt;</span> <span class="go">TypeError: Must supply a container</span> </pre></div> <h3>Things to Remember</h3> <ul> <li>Beware of functions that iterate over input arguments multiple times. 
If these arguments are iterators, you may see strange behavior and missing values.</li> <li>Python&#8217;s iterator protocol defines how containers and iterators interact with the <code>iter</code> and <code>next</code> built-in functions, <code>for</code> loops, and related expressions.</li> <li>You can easily define your own iterable container type by implementing the <code>__iter__</code> method as a generator.</li> <li>You can detect that a value is an iterator (instead of a container) if calling <code>iter</code> on it twice returns the same object, which can then be progressed with the <code>next</code> built-in function.</li> </ul>The Rough Cut Is Online2014-12-18T19:15:00-08:002014-12-18T19:15:00-08:00Brett Slatkintag:effectivepython.com,2014-12-18:/2014/12/18/rough-cut-is-online/<p>Can&#8217;t wait until next year for the book to be published? The rough cut is now <a href="http://safari.informit.com/9780134034416">available on Safari Books Online</a>. This is an early preview of the full content of the book before editing has been completed.</p>Now Available for Preorder2014-12-04T23:45:00-08:002014-12-04T23:45:00-08:00Brett Slatkintag:effectivepython.com,2014-12-04:/2014/12/04/preorder-the-book/<p><em>Effective Python</em> is now available for preorder on Amazon. <a href="http://amzn.to/1ylkKmc">Follow this link to buy your copy in advance</a>. It will ship in early 2015 once the book is published.</p>Final Draft Done2014-11-16T00:30:00-08:002014-11-16T00:30:00-08:00Brett Slatkintag:effectivepython.com,2014-11-16:/2014/11/16/final-draft-done/<p>The final draft of the book is done. It&#8217;s 55,000 words, 250+ pages, 8 chapters, 59 items. Reviewers spent a lot of time looking over earlier drafts to ensure that the book will be useful. 
Now it&#8217;s off to production to be turned into a printable layout.</p>Welcome2014-08-18T09:45:00-07:002014-08-18T09:45:00-07:00Brett Slatkintag:effectivepython.com,2014-08-18:/2014/08/18/welcome-to-effective-python/<p>This website is now live! Here you&#8217;ll find updates about my progress towards <em>Effective Python</em>&#8217;s eventual publication by Addison-Wesley. You can find other books from the <em>Effective</em> series <a href="http://www.informit.com/imprint/series_detail.aspx?st=61267">in Pearson&#8217;s online store</a>.</p>