tag:blogger.com,1999:blog-69505590499157687472024-09-02T04:44:51.525+03:00Alex's Tech RantsAlex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.comBlogger6125tag:blogger.com,1999:blog-6950559049915768747.post-17034269300451881392017-10-17T03:38:00.000+03:002017-10-17T03:38:48.013+03:00Automating packaging and uploading of binary wheelsThere has been plenty of frustration surrounding Python libraries which contain C extensions. Users are frustrated because pre-built wheels are not available for their platform, requiring them to install a compatible compiler. This has been particularly painful for Windows users. On the developer side, projects that have gone through the trouble of providing binary wheels often have a rather complicated process for building and uploading them, requiring several manual steps to get it done. On the other hand, projects that choose not to build binary wheels sometimes have droves of users nagging them to build those wheels (yes, I've been guilty of that). All in all, nobody's happy.<br /> <div> <br /></div> <div> What makes this situation particularly challenging is that there is no single CI service available for F/OSS projects which could build wheels for all three major platforms (Linux, macOS and Windows). <a href="https://travis-ci.org/" target="_blank">Travis</a>&nbsp;supports Linux and macOS but not Windows. Then there's <a href="https://www.appveyor.com/" target="_blank">AppVeyor</a>&nbsp;which only supports Windows. So in order to make everyone happy, you would need to combine the powers of both continuous integration services.</div> <div> <br /></div> <div> An additional challenge has been the lack of coordination tools for ensuring that a new release is uploaded to <a href="https://pypi.org/" target="_blank">PyPI</a>&nbsp;only if all the build jobs succeed. Naive configurations (yes, I've been guilty of that too) build the wheels and upload them to PyPI independently. This can lead to situations where a build job legitimately fails in a way that should've been a release blocker, but the jobs that succeeded have already uploaded their artifacts to PyPI. Now you have a botched release in your hands.</div> <div> <br /></div> <div class="separator" style="clear: both; text-align: center;"> <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVQeHiOQA7xyQWKc7JtMj3K97DW2f_pDZPqqxVfg22LLeawMzDJM8OeZcsCDwYff0t82VCiQigQ1dnRhAxTX5XBhjaN8c0uKO725xVkWGg30ArSr9k5t1sPU7KMf3rUBLEsKOjuDMLoZRU/s1600/morpheus.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="303" data-original-width="500" height="193" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVQeHiOQA7xyQWKc7JtMj3K97DW2f_pDZPqqxVfg22LLeawMzDJM8OeZcsCDwYff0t82VCiQigQ1dnRhAxTX5XBhjaN8c0uKO725xVkWGg30ArSr9k5t1sPU7KMf3rUBLEsKOjuDMLoZRU/s320/morpheus.jpg" width="320" /></a></div> <div> <br /></div> What if I told you that there is a way to set up your project so that all you have to do is add a git tag and push to Github, and the wheels and the source distribution would automatically get built and uploaded to PyPI if (and only if) all goes well?<br /> <br /> Yes, folks. It can be done. You just need an adventurous mind. Take the red pill and find out how deep the rabbit hole goes.<br /> <br /> <h2> How it works</h2> <div> As I hinted earlier, the recipe I'm about to present combines three important factors:</div> <div> <ol> <li>Use of Travis's "<a href="https://docs.travis-ci.com/user/build-stages/" target="_blank">Build Stages</a>" feature</li> <li>Use of AppVeyor via its <a href="https://www.appveyor.com/docs/api/" target="_blank">ReST API</a></li> <li>Use of an external storage service (Amazon S3 is used here, but it could be something else)</li> </ol> </div> <div> The gist is this: The Travis build first runs the tests against all supported Python versions. After the tests have finished successfully, Travis starts building wheels for Linux and macOS. Meanwhile, an additional job is started which sends a request to AppVeyor's ReST API which tells it to start a build against the current git changeset. It will then poll the status of the build on regular intervals until it finishes one way or another. If it fails, the Travis build is failed. If it succeeds, the build artifacts are downloaded to the container running the Travis build.</div> <div> <br /></div> <div> When all the wheels have been built, their respective build jobs will upload them to the shared storage. Then the final job is started which pulls all the artifacts from this storage and uploads them to PyPI.<br /> <br /></div> <h2> Setting it up</h2> Tl;dr: Go see the <a href="https://github.com/agronholm/sample-c-extension" target="_blank">example project</a>&nbsp;and adapt its configuration to your needs.<br /> <br /> You will need to have the following set up before proceeding:<br /> <ul> <li>A project on Github</li> <li>A PyPI account</li> <li>An AppVeyor account</li> <li>An AWS account (if S3 is used)</li> </ul> <div> The use of Amazon's S3 could be replaced with any other storage service. However, the free tier on AWS should get you enough disk space to satisfy the needs of most projects. You do need to have a valid credit card, however.</div> <div> <br /></div> <div> <div> You will need at least these two configuration files present in the project's root directory:</div> <div> <ul> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;"><a href="https://github.com/agronholm/sample-c-extension/blob/master/.travis.yml" target="_blank">.travis.yml</a></span>: build configuration for Travis (this is the most important part)</li> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;"><a href="https://github.com/agronholm/sample-c-extension/blob/master/appveyor.yml" target="_blank">appveyor.yml</a></span>: build configuration for AppVeyor</li> </ul> <div> You can copy the linked files to your own project as a base. Just remember to replace the environment variables in <span style="font-family: Courier New, Courier, monospace;">.travis.yml</span> (or remove them altogether, as explained below).<br /> <br /></div> </div> </div> <div> </div> <h3> Travis setup</h3> <div> If your project does not yet have Travis integration enabled, you need to do the following:</div> <div> <ol> <li>Go to your project's settings on Github</li> <li>Click on "Integrations and services"</li> <li>Click on "Add service"</li> <li>Choose "Travis CI" and enter your Github password when prompted to do so</li> <li>Go to your <a href="https://travis-ci.org/profile" target="_blank">Travis profile settings</a> on their site</li> <li>Click on "Sync account" (at the top right) to refresh the list of projects</li> <li>Find your project on the list after the sync is complete and turn the switch on</li> <li>Click on the cogwheel next to your project's name to enter the settings page</li> </ol> <div> The following Travis project settings are recommended:</div> <div> <br /></div> <div class="separator" style="clear: both; text-align: center;"> <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLpFrgsNzHAIF0042sVRgwEYXlK1FsTrtsiUFDP4-m4n_N5-Cchp415-2XvpPifeyBpoEM-s9vjjKi0IIH7JAeVHkyn_EuBZYGDlQeYOclCRv2ukORPFTPBntOfmVf5JnNdJbP9jP6X35T/s1600/travis-settings.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="374" data-original-width="789" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLpFrgsNzHAIF0042sVRgwEYXlK1FsTrtsiUFDP4-m4n_N5-Cchp415-2XvpPifeyBpoEM-s9vjjKi0IIH7JAeVHkyn_EuBZYGDlQeYOclCRv2ukORPFTPBntOfmVf5JnNdJbP9jP6X35T/s320/travis-settings.png" width="320" /></a></div> <h3 style="clear: both;"> <div style="font-size: medium; font-weight: normal;"> <span style="font-family: inherit;">Next, you will need to define the following environment variables:</span></div> <ul style="font-size: medium; font-weight: normal;"> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">APPVEYOR_SLUG</span>&nbsp;(your project name on AppVeyor)</li> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">APPVEYOR_ACCOUNT</span>&nbsp;(your account name on AppVeyor)</li> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">TWINE_USERNAME</span>&nbsp;(your PyPI user name)</li> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">TWINE_PASSWORD</span>&nbsp;(your PyPI password)</li> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">AWS_ACCESS_KEY_ID</span>&nbsp;(the access key ID from AWS, for shared storage)</li> <li><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">AWS_SECRET_ACCESS_KEY</span>&nbsp;(the secret key from AWS, for shared storage)</li> </ul> </h3> <h3 style="clear: both; text-align: left;"> <div> <div style="font-size: medium; font-weight: normal;"> There are two ways you can provide your build jobs environment variables:</div> <ol> <li><span style="font-size: small;"><span style="font-weight: normal;">Add them to your <span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">.travis.yml</span> file, encrypting any confidential ones like passwords</span></span></li> <li><span style="font-size: small;"><span style="font-weight: normal;">Add them on your Travis project settings page</span></span></li> </ol> <span style="font-size: small; font-weight: normal;">To encrypt an environment variable, you will need to have Travis's&nbsp;</span><a href="https://github.com/travis-ci/travis.rb" style="font-size: medium; font-weight: normal;" target="_blank">command line client</a><span style="font-size: small; font-weight: normal;">&nbsp;installed. Then, you can do something like this:</span></div> <blockquote class="tr_bq" style="font-size: medium; font-weight: normal;"> <span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">echo -n TWINE_PASSWORD=foobarbaz | travis encrypt</span></blockquote> <div style="font-size: medium; font-weight: normal;"> This will output an encrypted secret which you can paste into&nbsp;<span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">.travis.yml</span>. Note the importance of the&nbsp;<span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">-n</span>&nbsp;switch, as without that a newline character would be added to the end which would cause the wrong text to be encrypted.<br /> <br /> <h3> AppVeyor setup</h3> Assuming you have your AppVeyor account set up, you need to add your project to it. First, go to the Projects section in the top level menu and click "New Project". Then select Github and pick the project from that list.<br /> <br /> Next, you need to disable the web hook AppVeyor just added to your Github project. This is necessary because&nbsp;the&nbsp;AppVeyor&nbsp;build should only be triggered by the wheel build stage on Travis. On Github, go to Settings -&gt; Webhooks&nbsp;and edit the AppVeyor hook. Uncheck the "Active" check box and press "Update webhook", as shown below:<br /> <div class="separator" style="clear: both; text-align: center;"> <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBd6TZaAJWgTlfCtuWVb7p27FJ2pz37nX8phK-tShZdvIPDNit6NzF-zI4cAC1dXEaGJk8ff9uwXXuU0x3axbh3HexR4WKaC7372NwVN4AayCSyZlQ7QFM97_z07S8pzCuT81FopOJPZPy/s1600/appveyor-webhook.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="109" data-original-width="338" height="103" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBd6TZaAJWgTlfCtuWVb7p27FJ2pz37nX8phK-tShZdvIPDNit6NzF-zI4cAC1dXEaGJk8ff9uwXXuU0x3axbh3HexR4WKaC7372NwVN4AayCSyZlQ7QFM97_z07S8pzCuT81FopOJPZPy/s320/appveyor-webhook.png" width="320" /></a></div> That's it for the AppVeyor configuration, unless your project has some special requirements.<br /> <br /> <h2> Afterthoughts</h2> <div> I hope this will lower the barrier for projects to start producing binary wheels.</div> <div> <br /></div> <div> It should be noted that some projects prefer running their test suites against the wheels separately on each platform, but this is left as an exercise for the reader to implement.</div> <div> <br /></div> <div> Hopefully Travis will some day sort out their <a href="https://github.com/travis-ci/travis-ci/issues/2104" target="_blank">lack of Windows support</a>.</div> <div> <br /></div> <div> The example configuration could be simplified somewhat once pip starts supporting <span style="font-family: Courier New, Courier, monospace;">pyproject.toml</span> (included in the sample project). That should enable the removal of the "<span style="font-family: Courier New, Courier, monospace;">pip install wheel Cython</span>" lines all over the configuration.</div> </div> </h3> </div> Alex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.com0tag:blogger.com,1999:blog-6950559049915768747.post-65631690992988729092015-05-04T11:22:00.001+03:002022-03-03T11:51:03.485+02:00concurrent.futures backport updatedIt's been a while since I last updated the <a href="https://pypi.python.org/pypi/futures" target="_blank">backport</a> of the <a href="https://docs.python.org/3.2/library/concurrent.futures.html">concurrent.futures</a> package for Python versions older than 3.2. I was looking at my github repositories and noticed that I wasn't marked as watching the <a href="https://github.com/agronholm/pythonfutures">pythonfutures</a> repository. As I viewed the issue tracker, I noticed a huge slew of issues in it, some of them dating back to 2013! "Oops", I said to myself. I immediately started working on the bugs, mostly applying patches from upstream CPython code. I also decided to drop support for Python 2.5 and 3.1, since nobody seems to be using them anymore. Likewise, the "futures" top level package (which was an alias to concurrent.futures) is now gone.<br /> <br /> Several issues were fixed by the 3.0.1 release:<br /> <br /> <ul> <li>CPython issue&nbsp;<a href="http://bugs.python.org/issue16284" target="_blank">16284</a> (bug&nbsp;<a href="https://github.com/agronholm/pythonfutures/issues/32" target="_blank">#32</a>;&nbsp;memory leak in thread and process pools)</li> <li>CPython issue&nbsp;<a href="http://bugs.python.org/issue11777" target="_blank">11777</a> (bug&nbsp;<a href="https://github.com/agronholm/pythonfutures/issues/28" target="_blank">#28</a>;&nbsp;map() doesn't start tasks until generator is used)</li> <li>CPython issue&nbsp;<a href="http://bugs.python.org/issue15015" target="_blank">15015</a>&nbsp;(accessing a nonexistent attribute)</li> <li>CPython issue&nbsp;<a href="http://bugs.python.org/issue15015" target="_blank">20367</a> (behavior of concurrent.futures.as_completed() for duplicate arguments)</li> <li>backport specific bug&nbsp;<a href="https://github.com/agronholm/pythonfutures/issues/20" target="_blank">#20</a>&nbsp;(map() is greedy; fixed in 3.0.1)</li> </ul> <div> Many apologies for the unreasonable delays in getting these fixed!</div> <div> <br /></div> <div> There are still some major issues with ProcessPoolExecutor, but the upstream code in Python 3.3+ now uses Python 3 specific features and I'm not brave enough to try and backport that. I'm not sure it's even possible. So if you need reliable process pools, you're going to have to switch to Python 3 :)</div> Alex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.com0tag:blogger.com,1999:blog-6950559049915768747.post-36674505587304730692014-08-01T22:34:00.002+03:002014-08-01T22:59:01.751+03:00APScheduler 3.0 releasedThe first final version of <a href="https://pypi.python.org/pypi/APScheduler" target="_blank">APScheduler</a>'s 3.0 branch has been released. <br /> <br /> For the uninitiated, APScheduler is a task scheduling and management system written in Python. Thinking of it as a cron/at daemon running inside your application is not far off, but APScheduler also provides management and monitoring of jobs, and much more. And of course it runs Python code instead of shell commands.<br /> <br /> If one were to compare APScheduler with <a href="http://www.celeryproject.org/" target="_blank">Celery</a>, the difference could be summarized like this: Celery is a distributed task queue with basic scheduling capabilities, while APScheduler is a full featured scheduler with basic task queuing capabilities. Users tell me that APScheduler is easier to set up. I haven't personally used Celery, so I can't comment on that.<br /> <br /> The 3.0 update brings many new features and enhancements, albeit at the cost of a backward-incompatible API. Virtually all of the feature requests from 2.x have been fulfilled. A <a href="https://apscheduler.readthedocs.org/en/latest/migration.html" target="_blank">guide</a> is also provided for 2.x users for smoother migration to 3.0.<br /> <br /> <h3> Performance improvements</h3> Probably the most important change in 3.0 is about the job stores. In previous versions, all the job stores cached <i>all</i> their jobs in memory. This was to eliminate the overhead of fetching them from the backend (file or database). And that would've been fine with a small number of jobs, but when use cases started popping up that required thousands upon thousands of jobs, it became a severe problem. So starting with 3.0, persistent job stores no longer keep the jobs in memory, but instead rely on backend specific mechanisms (such as indexes) to efficiently fetch due jobs. This will greatly help reduce the memory footprints of applications that need to handle large numbers of jobs.<br /> <br /> <h3> Time zone support</h3> One of the most frequent complaints about APScheduler was that it always operated in the host's local time. Many users would've preferred it to always use the UTC timezone instead. Now, in 3.0, all datetimes are timezone aware. The scheduler has a set timezone which defaults to the local timezone, but can easily set to, say, UTC instead. Individual jobs can also be scheduled with different timezones if necessary.<br /> <br /> <h3> Integration with asynchronous event loops</h3> APScheduler now integrates with several widely used asynchronous application frameworks. The integration involves, at a minimum, the use of the event loop's built-in delayed execution mechanism. This avoids the use of a dedicated thread for the scheduler. With some frameworks, the integration can even provide a custom default executor (more on those in the next section) that runs the jobs in a built-in thread pool or similar.<br /> <br /> <h3> Pluggable executor system</h3> The built-in thread pool from the previous versions has been replaced with a pluggable executor system. Each scheduler subclass can specify its own default executor. For example, GeventScheduler uses a gevent specific executor that spawns jobs as greenlets. An executor based on the PEP 3148 (concurrent.futures) thread pool is used as the default executor on most scheduler subclasses.<br /> <br /> While the thread pool in APScheduler 2.x was supposedly replaceable, using a process pool as a replacement didn't work in practice. This has been rectified by providing an officially supported ProcessPoolExecutor.<br /> <br /> Although no such executors are yet provided, this API allows for remote execution, much like Celery does.<br /> <br /> <h3> Scheduler API improvements</h3> With 3.0, all the parameters of the job (except for its ID) can now be modified. In the previous versions, you had to remove and recreate the job from scratch to change anything about it. You can also pause, resume or completely reschedule jobs. This avoids having to keep the job parameters around in order to recreate the job. <br /> <br /> The scheduler API now operates on jobs based on their IDs. This removes a lot of pain when implementing a remote scheduler service based on APScheduler. All the job related methods are also proxied on the job instances returned by <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">add_job()</span>. You can also now retrieve a particular job instance from the scheduler based on its ID – something that was painful to do with older versions of APScheduler.<br /> <br /> The scheduler now allows you to schedule callables based on a text reference consisting of the fully qualified module name and a variable lookup path (for example, <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">x.y.z:func_name</span>). This is handy for when you need to schedule a function for which a reference can't be automatically determined, like static methods.<br /> <br /> Finally, <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">add_job()</span> can now optionally replace an existing job (by its ID). This fixes a long-standing design flaw in APScheduler 2.x, in which adding a job in a persistent job store at application startup (usually using the scheduling decorators) would always add a new instance of the job without removing the old one. By supplying a static ID for the job, the user can ensure that there will be no duplicates of the job.<br /> <br /> <h3> What's next?</h3> A couple new features, contributed by other people, didn't make it into the 3.0 release. For one, there is a job store class for <a href="http://rethinkdb.com/" target="_blank">RethinkDB</a>. Then there is support for getting the current number of running instances for each job. These will likely debut in the 3.1 release.<br /> <br /> The rest will depend on user requirements and feedback. Happy scheduling :)Alex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.com0tag:blogger.com,1999:blog-6950559049915768747.post-56506135185648478722014-01-01T19:19:00.000+02:002014-02-12T03:52:52.095+02:00Unit testing SQLAlchemy apps, part 2: the universal method<h3> The problem</h3> In an <a href="http://alextechrants.blogspot.com/2013/08/unit-testing-sqlalchemy-apps.html" target="_blank">earlier post</a> I presented a method for unit testing SQLAlchemy applications cleanly, without leaving any trace of the tables created during the run of the test suite. However, the problem with this method was that it required the target RDBMS to support both transactional DDL and nested transactions. This requirement left two prominent vendors in the cold: MySQL and Oracle.<br /> <br /> <h3> What didn't work</h3> In search for a solution, I was reminded of the existence of a feature called <b>temporary tables</b>. This seemed like a perfect fit for the problem! Most RDBMS's can create both "local" and "global" temporary tables. Local temporary tables are only visible to the connection that created them, while global temporary tables are structurally visible to all connections, but the data still remains specific to each connection. Unfortunately, Oracle only supports global temporary tables which are unusable for unit testing since they stick around after the test suite has finished. MySQL, however, <i>does</i> support local temporary tables, so I modified the original unit testing code to create all the tables as temporary. But that didn't work either because apparently foreign keys between temporary tables aren't allowed.<br /> <br /> Since temporary tables didn't provide a working solution, I had to look into other alternatives. The most obvious one would of course be to reflect the metadata at the beginning and use <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">metadata.drop_all()</span>. This approach, however, has one subtle issue <i>–</i> circular relationships. If the tables are linked in a cyclic relationship, like A → B → C → A, then these tables can't be dropped without first dropping the foreign key constraints. But then I heard of the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">DROP TABLE ... CASCADE</span> command which supposedly drops the foreign keys too along the way. This got my hopes up, even though SQLAlchemy didn't support this directly. Those hopes, however, died quickly when I looked at the <a href="http://dev.mysql.com/doc/refman/5.6/en/drop-table.html" target="_blank">MySQL documentation</a> and saw this:<br /> <blockquote class="tr_bq"> <b><code class="literal">RESTRICT</code></b> and <b><code class="literal">CASCADE</code></b> are permitted to make porting easier. In MySQL 5.6, they do nothing.</blockquote> <br /> <h3> The solution </h3> With this, I had to admit defeat and settle for the lowest common denominator. As such, the revised testing process goes as follows:<br /> <ol> <li>Reflect the metadata from the database reserved for unit testing</li> <li>Drop all foreign key constraints using the old metadata</li> <li>Drop all tables using the reflected metadata</li> <li>Create all tables from the current model</li> <li>Add any base data (fixtures) and commit the session</li> <li>Prevent your application and framework from committing or closing the session</li> <li>Run the tests, rolling back the transaction at the end of each test</li> </ol> Step 1 is necessary because there's no telling how much your model has changed between test runs, so simply running <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">drop_all()</span> with your current model's metadata is not guaranteed to do the right thing.<br /> Step 2 is necessary because of potential circular foreign key dependencies preventing some tables from being dropped (see the previous section for a deeper explanation).<br /> Step 6 is necessary for two reasons:<br /> <ol> <li>Allowing <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">commit()</span> would break test isolation by leaking database changes to other tests</li> <li>Allowing the session to close would mean that any changes made between requests within a single test would be rolled back</li> </ol> Finally, a couple reminders:<br /> <ul> <li>Remember to point the connection URI to a dedicated testing database so you don't lose development (not to mention production) data</li> <li>When testing on MySQL, use InnoDB tables since MyISAM doesn't support transactions</li> </ul> <br /> <ul> </ul> <h3> Putting it to practice with Flask and py.test</h3> I developed this testing method to run unit tests on a new, mid-sized Flask based web app of mine that had to use MySQL as its data store for compatibility with an older version. I've also recently migrated from <a href="https://nose.readthedocs.org/en/latest/" target="_blank">nose</a> to the wonderful <a href="http://pytest.org/" target="_blank">py.test</a> testing framework.<br /> <br /> The following code is a generic testing example, adapted from my application's test suite. It works with a single-db configuration, but could be adapted to a multi-db configuration. I've tested it against MySQL 5.5, PostgreSQL 9.1 and SQLite 2.6.0.<br /> <br /> It should be mentioned that I experimented with applying the faster method (based on nested transactions) but my test suite ran only 0.6 seconds faster with it and the approach required a whole different code path on several fixtures so I decided to drop it.<br /> <br /> <pre style="background: #ffffff; color: black;"><pre><pre><span style="color: maroon; font-weight: bold;">from</span> sqlalchemy<span style="color: #808030;">.</span>schema <span style="color: maroon; font-weight: bold;">import</span> MetaData<span style="color: #808030;">,</span> DropConstraint <span style="color: maroon; font-weight: bold;">import</span> pytest <span style="color: maroon; font-weight: bold;">from</span> yourapp <span style="color: maroon; font-weight: bold;">import</span> db<span style="color: #808030;">,</span> create_app @pytest<span style="color: #808030;">.</span>fixture<span style="color: #808030;">(</span>scope<span style="color: #808030;">=</span><span style="color: #0000e6;">'session'</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">def</span> app<span style="color: #808030;">(</span>request<span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: maroon; font-weight: bold;">return</span> create_app<span style="color: #808030;">(</span><span style="color: #808030;">)</span> @pytest<span style="color: #808030;">.</span>fixture<span style="color: #808030;">(</span>scope<span style="color: #808030;">=</span><span style="color: #0000e6;">'session'</span><span style="color: #808030;">,</span> autouse<span style="color: #808030;">=</span><span style="color: #e34adc;">True</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">def</span> setup_db<span style="color: #808030;">(</span>request<span style="color: #808030;">,</span> app<span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: dimgrey;"># Clear out any existing tables</span> metadata <span style="color: #808030;">=</span> MetaData<span style="color: #808030;">(</span>db<span style="color: #808030;">.</span>engine<span style="color: #808030;">)</span> metadata<span style="color: #808030;">.</span>reflect<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">for</span> table <span style="color: maroon; font-weight: bold;">in</span> metadata<span style="color: #808030;">.</span>tables<span style="color: #808030;">.</span>values<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: maroon; font-weight: bold;">for</span> fk <span style="color: maroon; font-weight: bold;">in</span> table<span style="color: #808030;">.</span>foreign_keys<span style="color: #808030;">:</span> db<span style="color: #808030;">.</span>engine<span style="color: #808030;">.</span>execute<span style="color: #808030;">(</span>DropConstraint<span style="color: #808030;">(</span>fk<span style="color: #808030;">.</span>constraint<span style="color: #808030;">)</span><span style="color: #808030;">)</span> metadata<span style="color: #808030;">.</span>drop_all<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: dimgrey;"># Create the tables based on the current model</span> db<span style="color: #808030;">.</span>create_all<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: dimgrey;"># Add base data here</span> <span style="color: dimgrey;"># ...</span> db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>flush<span style="color: #808030;">(</span><span style="color: #808030;">)</span> db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>expunge_all<span style="color: #808030;">(</span><span style="color: #808030;">)</span> db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>commit<span style="color: #808030;">(</span><span style="color: #808030;">)</span> @pytest<span style="color: #808030;">.</span>fixture<span style="color: #808030;">(</span>autouse<span style="color: #808030;">=</span><span style="color: #e34adc;">True</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">def</span> dbsession<span style="color: #808030;">(</span>request<span style="color: #808030;">,</span> monkeypatch<span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: dimgrey;"># Roll back at the end of every test</span> request<span style="color: #808030;">.</span>addfinalizer<span style="color: #808030;">(</span>db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>remove<span style="color: #808030;">)</span> <span style="color: dimgrey;"># Prevent the session from closing (make it a no-op) and</span> <span style="color: dimgrey;"># committing (redirect to flush() instead)</span> monkeypatch<span style="color: #808030;">.</span><span style="color: #e34adc;">setattr</span><span style="color: #808030;">(</span>db<span style="color: #808030;">.</span>session<span style="color: #808030;">,</span> <span style="color: #0000e6;">'commit'</span><span style="color: #808030;">,</span> db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>flush<span style="color: #808030;">)</span> monkeypatch<span style="color: #808030;">.</span><span style="color: #e34adc;">setattr</span><span style="color: #808030;">(</span>db<span style="color: #808030;">.</span>session<span style="color: #808030;">,</span> <span style="color: #0000e6;">'remove'</span><span style="color: #808030;">,</span> <span style="color: #e34adc;">lambda</span><span style="color: #808030;">:</span> <span style="color: #e34adc;">None</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">def</span> test_example<span style="color: #808030;">(</span>app<span style="color: #808030;">)</span><span style="color: #808030;">:</span> with app<span style="color: #808030;">.</span>test_client<span style="color: #808030;">(</span><span style="color: #808030;">)</span> as client<span style="color: #808030;">:</span> response <span style="color: #808030;">=</span> client<span style="color: #808030;">.</span>get<span style="color: #808030;">(</span><span style="color: #0000e6;">'/some/path'</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">assert</span> response<span style="color: #808030;">.</span>status_code <span style="color: #808030;">=</span><span style="color: #808030;">=</span> <span style="color: #008c00;">200</span> <span style="color: maroon; font-weight: bold;">assert</span> db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>query<span style="color: #808030;">(</span>Foo<span style="color: #808030;">)</span><span style="color: #808030;">.</span>count<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">=</span><span style="color: #808030;">=</span> <span style="color: #008c00;">5</span><span style="color: #008c00;"></span></pre> </pre> </pre> Alex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.com0tag:blogger.com,1999:blog-6950559049915768747.post-64762918324977296842013-11-09T04:18:00.001+02:002013-11-09T18:16:36.464+02:0010 common stumbling blocks for SQLAlchemy newbiesI've been hanging around on #sqlalchemy for years now, helping other users where I could. People come there for help with many different kinds of problems, but after a while, you start seeing common patterns. Certain difficulties seem to persist amongst SQLAlchemy newbies. This is my attempt at documenting these issues and providing answers and solutions where applicable. These 10 issues are not in any real order of appearance frequency – I just wrote them down in the order in which they popped into my mind.<br /> <br /> <b>Updated </b>2013-09-09: Corrected facts in section 5.<br /> <ol> </ol> <h3> 1. Not finding the (right) tutorial</h3> Occasionally, someone comes on the channel asking questions while referring to a tutorial outside of <a href="http://sqlalchemy.org/">sqlalchemy.org</a>. These tutorials are often out of date or just plain incorrect. Personally, when I look for a tutorial on something, the first place to look would be the official website of the project you're trying to learn to use. I would say that is common sense.<br /> <br /> There's also another group of people, who are able to find the official documentation but miss the tutorials, despite them being the first links on both the ORM and Core sections.<br /> <br /> While I don't think there's any important information really missing from the SQLAlchemy documentation, the top page could probably be laid out differently to emphasize the suggested starting points for new users.<br /> <br /> <h3> 2. Boilerplate constructors</h3> The declarative metaclass provides a default constructor for model classes, making explicit constructors largely unnecessary. The issue is that many people just don't seem to know about this, and happily add explicit constructors to all their classes. I actually blame the SQLAlchemy documentation for not clearly pointing this out. I've only ever found a single mention of this feature in the documentation. In my opinion, its existence should be prominently advertised in the ORM tutorial.<br /> <br /> So instead of creating a constructor that accepts all your fields as arguments, just skip the constructor and instantiate classes like <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company(name='Foo', address='blah')</span>.<br /> <ol> </ol> <h3> 3. Believing that defining or modifying the model classes creates or modifies the tables themselves</h3> A surprising amount of new users seem to believe that SQLAlchemy is capable of automatically modifying their tables to match their classes. I am not aware of any ORM that does such a thing automatically, least of all reliably.<br /> <br /> If you change your declared classes, you need to explicitly alter the schema in the actual database too, by either:<br /> <ol> <li>using the tool of your choice (usually <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">psql</span> / <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">mysql</span> / other command line tool or a graphical tool like phpMyAdmin etc.) to manually create/alter/drop the tables</li> <li>using a migration library like <a href="https://pypi.python.org/pypi/alembic" target="_blank">Alembic</a></li> </ol> <h3> 4. Forgetting to import model modules before attempting to call metadata.create_all()</h3> I must admit that I was bitten by this one when I was starting with SQLAlchemy. You already know that <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">metadata.create_all()</span> creates all your tables, right? Sure, but before that can happen, all the modules that declare tables must have been imported. I mean, how else would SQLAlchemy know what the tables are?<br /> <br /> Personally I hadn't even thought about how that works, but then there are some people who seem to expect SQLAlchemy to scan the project structure for model classes! Admittedly, the metaclass machinery that puts declarative classes in the metadata may in the beginning seem so magical that you expect everything to "just work" by itself.<br /> <br /> So remember kids, you need to import all your model modules so that the metadata can be filled in. Only then will the call to <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">metadata.create_all()</span> do anything useful.<br /> <br /> <h3> 5. Confusion between .one(), .first() and .scalar()</h3> Every week I see someone pasting code that uses <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.one() </span>and catching the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">NoResultException</span>. Then others complain that <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.first()</span> should return a scalar value instead of a <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">KeyedTuple</span> when a single column or expression has been selected. <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.scalar()</span> is virtually unheard of.<br /> <br /> So let me clear up the confusion once and for all:<br /> <ul> <li><span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.one()</span> executes the query and raises <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">MultipleResultsFound</span> if the number of rows was more than 1, and <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">NoResultFound</span> no rows were found. Otherwise it returns the sole result row (as tuple)</li> <li><span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.first()</span> executes the query with <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">LIMIT 1</span> and returns the result row as tuple, or <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">None</span> if no rows were found</li> <li><span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.scalar()</span> executes the query and raises <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">MultipleResultsFound<span style="font-family: inherit;"> </span></span> if the number of rows was more than 1. If no rows were found, it returns <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">None<span style="font-family: inherit;">. Otherwise it </span></span>returns the first column of the sole result row. </li> </ul> <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.one()</span> should be used if you want to select a single row and you assume there is exactly one row to be found. <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.first()</span> should be used when you want just one row but you're not sure if one will be found. <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Query.scalar()</span> should be used when you want, for example, just the maximum of a single column.<br /> <br /> <h3> 6. Not understanding scoped_session and sessionmaker</h3> This is quite possibly the greatest source of confusion among SQLAlchemy users. Especially so for developers who have very little experience in threaded programming. First, a quick recap:<br /> <ul> <li><span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Session</span>: a container of instances of mapped classes; among other things lets you construct queries dynamically and add/delete/update entities</li> <li><span style="font-family: &quot;Courier New&quot;,Courier,monospace;">sessionmaker</span>: a session factory generator (in other words, a function that returns a function that returns a new Session for each call)</li> <li><span style="font-family: &quot;Courier New&quot;,Courier,monospace;">scoped_session</span>: a threadlocal manager and proxy of sessions </li> </ul> If you're still confused, you're not alone! So let me try and clear this up a bit. The <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">sqlalchemy.orm.session.Session</span> class is where the real meat is. The other two are merely helpers. In trivial scripts, this one is all you need.<br /> <br /> The <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">sessionmaker</span> function is a convenience function for instantiating <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Session<span style="font-family: inherit;"> objects </span></span>with customized parameters. Most notably session makers can't be used as proxies – you have to call one to get a session first.<br /> <br /> The <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">scoped_session</span> function creates a thread local proxy/manager of <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Session</span> instances. What this means is that you can access the scoped session object as if it was like a Session instance. The scoped session "knows" (by virtue of <a href="http://docs.python.org/3.3/library/threading.html#thread-local-data" target="_blank">threadlocal</a> objects) which thread is asking for a session and so it always returns the same session for the same thread. This is very convenient for multithreaded applications because you don't have to worry about sessions accidentally crossing threads.<br /> <br /> Some want to call the scoped session to get a Session instance. While that certainly works, it's almost always unnecessary. Instead:<br /> <br /> <pre style="background: #ffffff; color: black;"><pre>DBSession <span style="color: #808030;">=</span> scoped_session<span style="color: #808030;">(</span>sessionmaker<span style="color: #808030;">(</span>engine<span style="color: #808030;">)</span><span style="color: #808030;">)</span> companies <span style="color: #808030;">=</span> DBSession<span style="color: #808030;">.</span>query<span style="color: #808030;">(</span>Company<span style="color: #808030;">)</span><span style="color: #808030;">.</span>all<span style="color: #808030;">(</span><span style="color: #808030;">)</span> </pre> </pre> <h3> 7. Query properties</h3> Certain web frameworks (I'm looking at you, <a href="http://pythonhosted.org/Flask-SQLAlchemy/queries.html#querying-records" rel="nofollow" target="_blank">Flask-SQLAlchemy</a>!) promote the use of query properties that supposedly make querying easier. Trouble is, it is apparently not obvious from the example code how one would query for specific columns, aggregates or other expressions using said query property.<br /> <br /> The solution is to use the session directly:<br /> <br /> <pre style="background: #ffffff; color: black;"><pre>db<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>query<span style="color: #808030;">(</span>Company<span style="color: #808030;">.</span>address<span style="color: #808030;">)</span> </pre> </pre> <h3> 8. Attempting to access related classes through relationships in queries:</h3> Someone asks something like this every week on #sqlalchemy:<br /> <blockquote class="tr_bq"> &lt;<b>anon</b>&gt; anyone see anything wrong with this query: c = Category.query.join(Category.parent).filter(Category.name=='Social Issues', Category.parent.name=='Teens').first()</blockquote> What is wrong with this query is of course that Category.parent is a relationship and thus it doesn't have a "name" attribute. I don't blame anyone who falls for this though. It would make sense, from the syntactic perspective, for this to work. It might automatically add a join to Category.parent in the query. The reason why this can't be done is beyond my knowledge of SQLAlchemy. Anyway, to fix the query, you add an alias and join that:<br /> <br /> <pre style="background: #ffffff; color: black;"><pre>parent <span style="color: #808030;">=</span> aliased<span style="color: #808030;">(</span>Category<span style="color: #808030;">)</span> c <span style="color: #808030;">=</span> Category<span style="color: #808030;">.</span>query<span style="color: #808030;">.</span>join<span style="color: #808030;">(</span>parent<span style="color: #808030;">,</span> Category<span style="color: #808030;">.</span>parent<span style="color: #808030;">)</span><span style="color: #808030;">.</span>\ <span style="color: #e34adc;">filter</span><span style="color: #808030;">(</span>Category<span style="color: #808030;">.</span>name<span style="color: #808030;">=</span><span style="color: #808030;">=</span><span style="color: #0000e6;">'Social Issues'</span><span style="color: #808030;">,</span> parent<span style="color: #808030;">.</span>name<span style="color: #808030;">=</span><span style="color: #808030;">=</span><span style="color: #0000e6;">'Teens'</span><span style="color: #808030;">)</span><span style="color: #808030;">.</span>first<span style="color: #808030;">(</span><span style="color: #808030;">)</span> </pre> </pre> <h3> 9. Wanting to commit the session to get the ID of an object so it can be used to insert related objects</h3> Most nontrivial SQLAlchemy applications need to work with relationships. Imagine a situation like this: You need to create a <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> and add an <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Employee</span> to it. You already have a relationship called employees in the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> class. So how would you accomplish this task? The typical answer goes along the lines of:<br /> <ol> <li>Create the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> instance</li> <li>Add the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> instance to the session</li> <li>Flush the session</li> <li>Create the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Employee</span> instance with <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">company_id</span> from the previously flushed Company instance</li> <li>Add the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Employee</span> instance to the session</li> <li>Commit the session</li> </ol> This is just about how you'd do it without SQLAlchemy. But with SQLAlchemy this can be done much easier:<br /> <ol> <li>Create the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> instance</li> <li>Create the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Employee</span> instance</li> <li>Add the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> instance to the session</li> <li>Commit the session</li> </ol> When the session is flushed, the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Company</span> row is first inserted to the database. Then, by virtue of relationship cascade, the <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Employee</span> instance is discovered and also inserted into the database. The RDBMS can't tell the difference, but for the developer, the latter approach is much nicer.<br /> <br /> <h3> 10. Running tests using SQLite instead of the production RDBMS</h3> I've written a <a href="http://alextechrants.blogspot.fi/2013/08/unit-testing-sqlalchemy-apps.html" target="_blank">separate blog post</a> about this, but tl;dr: always test against the same RDBMS that you deploy with.Alex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.com0tag:blogger.com,1999:blog-6950559049915768747.post-17123989369665017102013-08-25T23:33:00.003+03:002022-03-03T11:49:01.611+02:00Unit testing SQLAlchemy appsI help out a lot on the #sqlalchemy channel on Freenode IRC. When people ask questions there, one of the issues that comes up often is unit testing applications that use SQLAlchemy. Almost all developers instinctively use SQLite for testing due to its simplicity and the possibility to use an in-memory database that leaves no garbage behind to be cleaned up. It seems like a clean and easy solution, yes? Well, my answer is that if you don't plan on <b>deploying</b> with SQLite support, <b>don't test on SQLite!</b> Always use the production RDBMS for testing. Going to deploy with PostgreSQL? Then test with PostgreSQL! Why?<br /> <br /> There are at least two good reasons why testing and deploying with the same RDBMS is a good idea. The first and foremost is that SQLite is vastly different from other RDBMS's. For one, it does not really enforce column types, so code that erroneously inputs data of the wrong type won't cause errors when it should. There are also many semantic differences on how embedded and your typical client-server RDBMS's work, so you may run into bugs that only occur in production while all the tests pass just fine. The second reason is that SQLite's rather modest design, which lets it fit into small memory spaces, is also a big hindrance since it can't support some more advanced database features like window functions or recursive queries. This shortcoming prevents you from taking full advantage of the features of your chosen RDBMS.<br /> <br /> If I managed to convince you, then you'll probably be asking how testing <b>should</b> be done on RDBMS's other than SQLite. The answer boils down to whether your RDBMS supports two crucial features: <a href="https://en.wikipedia.org/wiki/Nested_transaction" target="_blank">nested transactions</a> and <a href="http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis" target="_blank">transactional DDL</a>. Nested transactions are savepoints within a transaction, to which you can roll back without losing any changes done before the savepoint. Transactional DDL means that in addition to normal data modification (INSERT, UPDATE, DELETE), schema changes are also transactional. That means they can be rolled back, which is a very nice thing to have when unit testing. According to the article linked to above, the following RDBMS's support transactional DDL: PostgreSQL, SyBase, Microsoft SQL Server, DB2, Informix and Firebird. Most notably, MySQL and Oracle <b>do not</b> support it.<br /> <br /> If your RDBMS does support the aforementioned two features, then you can conveniently test your SQLAlchemy apps in the following manner:<br /> <ol> <li>Make sure you have an empty database for unit testing </li> <li>Create the engine, create one connection and start a transaction in it</li> <li>Create the tables</li> <li>Optionally, insert test fixtures </li> <li>For every test, repeat:</li> <ol> <li>Create a savepoint</li> <li>Run the test</li> <li>Roll back to the savepoint</li> </ol> <li>Roll back the transaction</li> </ol> This way all the tests are isolated from each other. Every tests gets the same "blank slate" as far as the database state is concerned. After the tests are done, the database will look just as empty as before the test suite was run, even if the tests were interrupted.<br /> <br /> So how to actually accomplish this in practice? I'll give you an example using PostgreSQL and <a href="https://nose.readthedocs.org/en/latest/" target="_blank">nose.</a> This example should be adaptable for other equally capable test runners. The following code should be placed in <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">__init__.py</span> of the root package of your tests.<br /> <br /> <pre style="background: #ffffff; color: black;"><span style="color: maroon; font-weight: bold;">from</span> sqlalchemy<span style="color: #808030;">.</span>engine <span style="color: maroon; font-weight: bold;">import</span> create_engine <span style="color: maroon; font-weight: bold;">from</span> sqlalchemy<span style="color: #808030;">.</span>orm<span style="color: #808030;">.</span>session <span style="color: maroon; font-weight: bold;">import</span> Session <span style="color: maroon; font-weight: bold;">from</span> your<span style="color: #808030;">.</span>package <span style="color: maroon; font-weight: bold;">import</span> Base <span style="color: dimgrey;"># This is your declarative base class</span> <span style="color: maroon; font-weight: bold;">def</span> setup_module<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: maroon; font-weight: bold;">global</span> transaction<span style="color: #808030;">,</span> connection<span style="color: #808030;">,</span> engine <span style="color: dimgrey;"># Connect to the database and create the schema within a transaction</span> engine <span style="color: #808030;">=</span> create_engine<span style="color: #808030;">(</span><span style="color: #0000e6;">'postgresql:///yourdb'</span><span style="color: #808030;">)</span> connection <span style="color: #808030;">=</span> engine<span style="color: #808030;">.</span>connect<span style="color: #808030;">(</span><span style="color: #808030;">)</span> transaction <span style="color: #808030;">=</span> connection<span style="color: #808030;">.</span>begin<span style="color: #808030;">(</span><span style="color: #808030;">)</span> Base<span style="color: #808030;">.</span>metadata<span style="color: #808030;">.</span>create_all<span style="color: #808030;">(</span>connection<span style="color: #808030;">)</span> <span style="color: dimgrey;"># If you want to insert fixtures to the DB, do it here</span> <span style="color: maroon; font-weight: bold;">def</span> teardown_module<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: dimgrey;"># Roll back the top level transaction and disconnect from the database</span> transaction<span style="color: #808030;">.</span>rollback<span style="color: #808030;">(</span><span style="color: #808030;">)</span> connection<span style="color: #808030;">.</span>close<span style="color: #808030;">(</span><span style="color: #808030;">)</span> engine<span style="color: #808030;">.</span>dispose<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">class</span> DatabaseTest<span style="color: #808030;">(</span>object<span style="color: #808030;">)</span><span style="color: #808030;">:</span> <span style="color: maroon; font-weight: bold;">def</span> setup<span style="color: #808030;">(</span>self<span style="color: #808030;">)</span><span style="color: #808030;">:</span> self<span style="color: #808030;">.</span>__transaction <span style="color: #808030;">=</span> connection<span style="color: #808030;">.</span>begin_nested<span style="color: #808030;">(</span><span style="color: #808030;">)</span> self<span style="color: #808030;">.</span>session <span style="color: #808030;">=</span> Session<span style="color: #808030;">(</span>connection<span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">def</span> teardown<span style="color: #808030;">(</span>self<span style="color: #808030;">)</span><span style="color: #808030;">:</span> self<span style="color: #808030;">.</span>session<span style="color: #808030;">.</span>close<span style="color: #808030;">(</span><span style="color: #808030;">)</span> self<span style="color: #808030;">.</span>__transaction<span style="color: #808030;">.</span>rollback<span style="color: #808030;">(</span><span style="color: #808030;">)</span> </pre> <br /> To take advantage of this setup, your test class should inherit from DatabaseTest. If you override either <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">setup()</span> or <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">teardown()</span>, make sure you remember to call the superclass implementations too.<br /> <br /> If you want to use this testing scheme with your web framework or other application framework, you should be aware of the requirement that the framework's SQLAlchemy extension/middleware/whatever must have a way to receive the connectable (<span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Engine</span> or <span style="font-family: &quot;Courier New&quot;,Courier,monospace;">Connection</span>) as a Python object and not just as a connection URL.Alex Grönholmhttp://www.blogger.com/profile/03624571541745668603noreply@blogger.com9