Neopythonic: appengine

Showing posts with label appengine. Show all posts

Thursday, August 25, 2011

Compare-And-Set in Memcache

With the most recent release (1.5.3, last week) App Engine's Python API for Memcache has added a new feature, Compare-And-Set. This feature (with a different API) was already available in Java; it has also been available in the non-App-Engine pure-Python memcache client. In fact, I designed the App Engine Python API for this feature to be compatible with the latter, since most of the rest of the App Engine Python API also strives to be at least a superset of that package.

But what is it? There seems to be little information on how to use Compare-And-Set with memcache. It is also sometimes (incorrectly) referred to as Compare-And-Swap -- incorrect, because the cas() operation does not actually "swap" anything. The first response when we closed the bug requesting this feature was "Some examples of usage are appreciated." So here goes.

The basic use case for Compare-And-Set is when multiple requests that are being handled concurrently need to update the same memcache key in an atomic fashion. Let's assume you are managing a counter in memcache. (Actually, you could use the incr() and decr() operations to update 64-bit integer counters atomically, but just for argument's sake assume you cannot use those -- there are other data types for which the memcache service does not have built-in support.)

The naive code to update a counter would be something like this:

def init_counter(key):
. memcache.set(key, 0)

def bump_counter(key):
. counter = memcache.get(key)
. assert counter is not None, 'Uninitialized counter'
. memcache.set(key, counter+1)

(Aside: I don't want to have to think about how to get blogger to properly format Python code. I really don't. So just bear with the dots I use for indentation. Okay? Comments pointing me to solutions will be DELETED.)

(Aside 2: The assert is kind of naive; in practice you'll have to somehow deal with counter initialization. You also should implement a backup for your counter using the App Engine datastore, so that it can survive eviction by the memcache service. However interesting these details are on their own, I leave them for another time.)

Hopefully you can spot the problem in this version of bump_counter(): if two requests execute concurrently (on different instances of the same app), the sequence of operations might be as follows, labeling the two requests as A and B:

A: counter = memcache.get(key) # Reads 42
B: counter = memcache.get(key) # Reads 42
A: memcache.set(key, counter+1) # Writes 43
B: memcache.set(key, counter+1) # Writes 43

So even though two requests were executed, the counter only gets incremented by one. This is called a race condition. Various interlacings of these lines can have the same effect; a race condition occurs whenever B reads the counter before A has written it (or vice versa).

You could try to guard against this by reading the counter value back and checking that it was incremented by one; however this solution still has a race condition (see if you can figure it out for yourself). There are other solutions possible involving a separate "lock" variable, managed using the add() and delete() operations. However these in general require more server roundtrips, and it is pretty hard to manufacture a decent lock out of the basic memcache operations (try for yourself -- think about what would happen if your request was somehow aborted after acquiring the lock, without having a chance of releasing it).

Using the Compare-And-Set operation, writing a reliable bump_counter() function is a cinch:

def bump_counter(key):
. client = memcache.Client()
. while True: # Retry loop
. . counter = client.gets(key)
. . assert counter is not None, 'Uninitialized counter'
. . if client.cas(key, counter+1):
. . . break

I've highlighted the changes from the previous version. There are several essential differences:

The use of a memcache Client object instead of memcache functions
The use of a retry loop
The use of gets() and cas() instead of get() and set()

The Client object is required because the gets() operation actually squirrels away some hidden information that is used by the subsequent cas() operation. Because the memcache functions are stateless (meaning they don't alter any global values), these operations are only available as methods on the Client object, not as functions in the memcache module. (Apart from these two, the methods on the Client object are exactly the same as the functions in the module, as you can tell by comparing the documentation.)

The retry loop is necessary because this code doesn't actually avoid race conditions -- it just detects them! The memcache service guarantees that when used in the pattern shown here (i.e. using gets() instead of get() and cas() instead of set()), if two (or more) different client instances happen to be involved a race condition like I showed earlier, only the first one to execute the cas() operation will succeed (return True), while the second one (and later ones) will fail (return False). Let's spell out the events that happen when a race condition occurs:

A: counter = memcache.gets(key) # Reads 42
B: counter = memcache.gets(key) # Reads 42
A: memcache.cas(key, counter+1) # Writes 43, returns True
B: memcache.cas(key, counter+1) # Returns False
B: counter = memcache.gets(key) # Reads 43
B: memcache.cas(key, counter+1) # Writes 44, returns True

Another refinement I've left out here for brevity is to set a limit on the number of retries, to avoid an infinite loop in worst-case scenarios where there is a lot of contention for the same counter (meaning more requests are trying to update the counter than the memcache service can process in real time). You can figure out how to code this for yourself. (UPDATE: see comments #1 and #2 for a note about busy-waiting.)

Now let me explain roughly how this actually works. For some people that helps understanding how to use it: this is often the case for me when I am trying to understand some new concept.

The gets() operation internally receives two values from the memcache service: the value stored for the key (in our example the counter value), and a timestamp (also known as the cas_id). The timestamp is an opaque number; only the memcache service knows what it means. The important thing is that each time the value associated with a memcache key is updated, the associated timestamp is changed. The gets() operation stores this timestamp in a Python dict on the Client object, using the key passed to gets() as the dict key.

The cas() operation internally adds the timestamp to the request it sends to the memcache service. The service then compares the timestamp received with a cas() operation to the timestamp currently associated with the key. If they match, it updates the value and the timestamp, and returns success. If they don't match, it leaves the value and timestamp alone, and returns failure. (By the way, it does not send the new timestamp back with a successful response. The only way to retrieve the timestamp is to call gets().)

Of course, there's one more important ingredient: the App Engine memcache service itself behaves atomically. That is, when two concurrent requests (for the same app id) use memcache, they will go to the same memcache service instance (for historic reasons called a shard), and the memcache service has enough internal locking so that concurrent requests for the same key are properly serialized. In particular this means that two cas() requests for the same key do not actually run in parallel -- the service handles the first request that came in until completion (i.e., updating the value and timestamp) before it starts handling the second request.

And that's Compare-And-Set in a nutshell. If you have questions please don't hesitate to ask!

(UPDATE: The memcache API defines batch versions of most of its APIs. For example, to get multiple keys in a single call, there is get_multi(); to set multiple keys, there is set_multi(). Corresponding to cas(), there is cas_multi(). But there is no gets_multi(): instead, you can use get_multi(keys, for_cas=True). Finally, there's cas_reset(), which clears the dict used to store timestamps. But I haven't figured out what to do with it yet. :-)

Monday, January 24, 2011

Asynchronous RPC in App Engine Today

While I was laying the groundwork for a new datastore client library with support for asynchronous requests, I added some low-level support for asynchronous RPCs that you can use today. The only App Engine API with documented support for asynchronous RPCs is urlfetch, and it happens to be quite useful with that.

Suppose you want to fetch some data from a remote service. The remote service has two instances, both of which are slightly flaky. What you want to do is send off requests to both servers simultaneous (this is the easy part) and then wait for the first one to give you a result. The latter uses the new API that I'm about to describe here.


from google.appengine.api import urlfetch, apiproxy_stub_map

urls = ['http://service1.com', 'http://service2.com']  # Etc.

rpcs = []
for url in urls:
   rpc = urlfetch.create_rpc(deadline=1.0)
   urlfetch.make_fetch_call(rpc, url)
   rpcs.append(rpc)

rpc = apiproxy_stub_map.UserRPC.wait_any(rpcs)
# Now rpc is the first rpc that returned a result. Have at it!

That's all! If you're interested in learning more about this handy class method, just check out its docstring in the App Engine SDK. Note that technically you should loop until it doesn't return None.

You can also repeatedly call wait_any() to get subsequent result. Make sure to remove the rpc it returns (if any) from the list, since otherwise it will return the same rpc over and over again: the specification of wait_any() says it returns the first rpc in the given list that completes, regardless of whether you have seen it before.

Also note that there currently is no way to cancel the other RPCs, which is why I passed a low deadline to the create_rpc() call. The problem is that even if you completely ignore the other RPCs, the App Engine runtime still waits for them to finish or timeout.

Finally, there is also a similar class method UserRPC.wait_all(), which waits until all RPCs in the list you pass it are complete. (It doesn't return anything.)

PS. Don't look too closely at the implementation of these methods. It may change as we think of a better way to do it. But we're committed to the API.

Friday, January 7, 2011

A new App Engine datastore API

This post is primarily intended for App Engine users (and of those, only Python users :-).

Over the past months I've been working on a new design for the Python datastore API, under the code name Datastore Plus. The new design is very ambitious, and changes a lot of things:

New, cleaner implementations of Key, Model, Property and Query classes
High-level asynchronous API using Python generators as coroutines (PEP 342)

The design is meant to eventually replace the existing db package in the App Engine runtime library, but for now, it is just an open source project which you have to download and copy into your application.

I am not at all finished with this design, but I believe in listening to users, so I am making a preliminary version of the new API available for review. Please send me your thoughts, either in this blog, or via private mail to guido (at) google.com. Note that the implementation works, but I cannot guarantee that it won't change.

Documentation is here: http://goo.gl/D6Onw

The project to check out is here: http://goo.gl/GapXI

You must use Mercurial to check out the project, but it's fine to check it out anonymously -- I don't require anybody to log in to look or comment. (If using Mercurial is too much of a burden, there's also a zipfile on the site, but I don't plan to update it frequently.)

I'm interested in receiving any kind of feedback at all. It would help me if you could clarify whether your feedback is about an issue with the documentation, an issue with the implementation, or an issue with the API design -- though I realize you can't always tell the difference. :-)

(You can also comment on the thread in the google-appengine-python group here: https://groups.google.com/group/google-appengine-python/browse_thread/thread/454cb81d49e759f2.)

Monday, June 15, 2009

New App Engine Book

At Google I/O I received a copy of Using Google App Engine by Charles Severance, published by O'Reilly. I haven't kept track, but this appears one of the first App Engine books to actually hit the stores -- an Amazon search for App Engine showed up one other book (Developing with Google App Engine by Eugene Ciara, published by APress) and many titles available for pre-order (including additional titles from the same publishers).

Severance's book is a quick read if you're already familiar with the basic premises of web programming. I think it would do well in an introductory course about the topic. (The author teaches at the University of Michigan so this is likely how he developed the material in the first place.) In fact, quite a bit of the book could well have come from a pre-existing earlier course: the chapters on HTML, CSS, Python and JavaScript barely mention App Engine.

Don't get me wrong, I think that's a good approach: in my experience quite a few App Engine users are new to web programming in general, or could at least use a refresher course. If you don't fall in this category, don't feel offended: you just probably aren't the intended audience for this book. On the other hand, if you've developed for the web but haven't used Python before, you could probably just skip the HTML/CSS chapter and dive right into Python and App Engine.

If you're a blank sheet where it comes to programming, don't expect to come out an experienced Python developer: the book only covers enough of the language so you can get started with App Engine without feeling you're just copying and pasting text. The same is actually true for any topic covered -- in many cases the book actually recommends that you study a topic more in-depth using other resources. But in each case the book's coverage is enough to get you started with the creation of dynamic web sites, and that's the important part. After all, you didn't learn your mother tongue by studying the rules of grammar either: you learned a few nouns, a few verbs, a few adjectives, and a few grammatical forms ("Daddy throw toy again") and you were on your way to communicating with others.

Actually, if you read this book from cover to cover, you might not be ready to create the Greate American Website, but you'll be well past the "Daddy throw toy" level. For example, you'll be creating App Engine datastore models with ease, tying them together with forms, and you'll even be able to use simple AJAX patterns. You will also have learned about the importance of caching, and you'll have more than a fleeting experience debugging problems using tracebacks and logs.

I also enjoyed some of the history bits that Severance presents (it makes me feel old to see 1990 referred to as ancient history :-). A downside is that sometimes the exercises given at the end of each chapter seem to be focused more on assessing that you were awake during class than that you actually have learned a useful skill (e.g. "Give a brief history of the major phases of the internet"). Teachers considering to use this book in the classroom might appreciate such questions; but for self-study, I would focus on the difference between the class= and id= attributes in HTML...

What's missing? The book doesn't touch Django (except for the templating facility built into App Engine's webapp package, which is based on Django). If our customer support traffic is any indication, Django is very popular with professional App Engine developers. The book also doesn't describe the various APIs offered by App Engine for things like sending mail, fetching other web resources by URL, or image processing. But arguably you can learn those directly from the App Engine docs. Oh, and the book doesn't touch on App Engine's Java support. I expect other books will fill that void.

Monday, October 27, 2008

Questions Answered

I have now answered the top 20 questions in my section of "Ask a Google Engineer". Many of the remaining ones sound inappropriate or unanswerable, so I don't expect I'll be answering them, unless the popular vote really brings some of them to the top.