Guide to twisted.internet.defer

  1. Deferreds standardize callbacks
  2. Deferred
  3. Chaining Deferreds
  4. Composing Deferreds

This document addresses Twisted's implementation of Deferred objects in twisted.internet.defer.Deferred. It assumes familiarity with the basics of event loops and asynchronous programming.

Deferreds standardize callbacks

Callbacks are the lingua franca of asynchronous programming: any time you need to process the result of a non-blocking operation, you give that operation a callback for it to call when it has finished processing and has a result for you.

If you were implementing an asynchronous function from scratch, you might be tempted to define it like this:

    def nonblocking_call(input, on_success, on_error):
        pass

The person using this code would then pass the functions they wanted called to this function at call time, like this:

    def success_handler(result):
        print("Success: %s" % result)

    def error_handler(error):
        print("Failure: %s" % str(error))

    nonblocking_call("input", success_handler, error_handler)

This works quite well for many simple cases, where you only need one success handler and one error handler, and the nonblocking call is a relatively one-off function.

But what if you are Twisted and you have many nonblocking functions: do you force every one of these functions to have an on_success and an on_error parameter? What if you want to perform a calculation on the result of the success_handler: do you write all of the code into a bigger success_handler and increase the indentation?

Twisted's elegant solution to this problem is Deferreds. Since the nonblocking call doesn't have a meaningful return value anyway (remember, it's asynchronous; it can return before it has a result), we return a Deferred which you can attach callbacks to.

    d = nonblocking_call("input")
    d.addCallback(success_handler)
    d.addErrback(error_handler)

The Deferred object doesn't do anything that you couldn't have done with the two callback parameters. This point is worth repeating: Deferred is an abstraction over callback parameters. It does nothing magical and is not, itself, asynchronous. It is merely a standard: if a function returns a Deferred, you know that you are dealing with an asynchronous function, and you know exactly what its API for adding callbacks is.

Deferred

Callbacks

At its very simplest, the Deferred has a single callback attached to it, which gets invoked with the result as an argument when it becomes available:

Synchronous:

    value = synchronous_operation()
    process(value)

Asynchronous:

    d = asynchronous_operation()
    d.addCallback(process)

Errbacks

Error handling is an ever-present concern in synchronous code. Deferred implements a system of errbacks in order to simulate Python try/except blocks. Just like in synchronous code, you should always register an errback in order to deal with an error gracefully.

Synchronous:

    try:
        synchronous_operation()
    except UserError as e:
        handle_error(e)

Asynchronous:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        failure.trap(UserError)      # re-raises anything that is not a UserError
        handle_error(failure.value)  # the original exception instance
    d.addErrback(handle_twisted_error)

There are plenty of things going on here:

  • Instead of being passed an exception object, which is roughly analogous to the result in the no-error case, you are passed a twisted.python.failure.Failure object. This is roughly a wrapper around the standard Exception with a few crucial enhancements to make it useful in an asynchronous context.
  • Consequently, we filter by exception type with failure.trap(UserError) and read the real exception from failure.value. trap is the userland implementation of except: it returns the exception class that matched, and if the exception is not trapped, it gets re-thrown, so the rest of our errback is skipped and the failure moves on to the next errback.
  • You can trap multiple types of exceptions by simply calling trap with multiple arguments, e.g. failure.trap(UserError, OtherUserError).

Omitting the trap declaration is equivalent to a catch-all except block:

Synchronous:

    try:
        synchronous_operation()
    except:
        handle_error()
        raise

Asynchronous:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        handle_error()
        return failure
    d.addErrback(handle_twisted_error)

Notice that in order to re-raise the exception, we simply return it from our errback handler. Deferred will notice that the return value is a Failure object, and act accordingly. You can also raise an exception and Deferred will handle it properly:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        status = handle_error(failure.value)
        if not status:
            raise UserError
    d.addErrback(handle_twisted_error)

If you would like to re-raise the original error, it is preferred to use failure.raiseException(), which preserves traceback information if available.

Failure has another convenience function, check(), which makes it easier to simulate multiple except blocks:

Synchronous:

    try:
        synchronous_operation()
    except UserError:
        handle_error()
    except AnotherUserError:
        handle_another_error()

Asynchronous:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        if failure.check(UserError):
            handle_error()
        elif failure.check(AnotherUserError):
            handle_another_error()
        else:
            failure.raiseException()
    d.addErrback(handle_twisted_error)

Callbacks and errbacks

In most cases, you'll want to perform some processing on the deferred result as well as have error handling. As you may have guessed, this simply means you should define both a callback and an errback.

Synchronous:

    try:
        value = synchronous_operation()
        process(value)
    except UserError as e:
        handle_error(e)

Asynchronous:

    d = asynchronous_operation()
    d.addCallback(process)
    def handle_twisted_error(failure):
        failure.trap(UserError)
        handle_error(failure.value)  # trap() returns the class, so use .value
    d.addErrback(handle_twisted_error)

Notice that in the synchronous version, process is inside the try..except block. This translates over to the asynchronous code: if process throws an exception, handle_twisted_error will get a Failure object corresponding to that exception. The errback could handle either an error from the asynchronous operation or from our callback. Why does this happen? This is because Deferreds chain callbacks.

Chaining callbacks

A common pattern in programs is the notion of one function returning an intermediate result, which gets passed to another function to calculate a further result, and so forth. Such a chain of data processing entities is called a pipeline, and Deferreds are ideally suited for modeling them.

Synchronous:

    value = synchronous_operation()
    value2 = process(value)
    another_process(value2)  # value2, not value!

Asynchronous:

    d = asynchronous_operation()
    d.addCallback(process)
    d.addCallback(another_process)

This behavior makes the name addCallback slightly misleading, since each of these callbacks will get a different result. If you would like to multiplex (have multiple callbacks handle the same result), you should code this directly into your callback:

Synchronous:

    value = synchronous_operation()
    process(value)
    another_process(value)

Asynchronous:

    d = asynchronous_operation()
    def multi_process(value):
        process(value)
        return another_process(value)
    d.addCallback(multi_process)

Errbacks work similarly, but instead of pipelining values through multiple functions, they create nested try..except blocks:

Synchronous:

    try:
        try:
            synchronous_operation()
        except UserError as e:
            handle_error(e)
    except AnotherUserError as e:
        handle_another_error(e)

Asynchronous:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        failure.trap(UserError)
        handle_error(failure.value)  # trap() returns the class, so use .value
    d.addErrback(handle_twisted_error)
    def handle_twisted_another_error(failure):
        failure.trap(AnotherUserError)
        handle_another_error(failure.value)
    d.addErrback(handle_twisted_another_error)

Now, we can do tricky things with chaining callbacks and errbacks. The following code makes it possible for the errback function to gracefully provide a result for the computation (perhaps from a cache), even though the operation failed.

Synchronous:

    try:
        value = synchronous_operation()
    except UserError as e:
        value = handle_error(e)
    process(value)

Asynchronous:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        failure.trap(UserError)
        return handle_error(failure.value)  # the return value becomes the result
    d.addErrback(handle_twisted_error)
    d.addCallback(process)

The next example introduces a new method: addCallbacks, which adds both a callback and an errback. Unlike adding them individually, if the callback errors, the errback will not receive the error, and if the errback returns a valid result, the callback will not receive it. They are completely isolated from each other.

Synchronous:

    try:
        value = synchronous_operation()
    except UserError as e:
        handle_error(e)
    else:
        process(value)

Asynchronous:

    d = asynchronous_operation()
    def handle_twisted_error(failure):
        failure.trap(UserError)
        handle_error(failure.value)  # trap() returns the class, so use .value
    d.addCallbacks(process, handle_twisted_error)

Let's stick our hand inside the black box and see what actually is happening. The order in which we add callbacks and errbacks is obviously influencing the end behavior. Here's why:

Internally, Deferred stores callbacks and errbacks in a list of callback/errback tuples. When you call addCallback or addErrback, you are not adding a callback/errback to separate stacks; instead, Deferred wraps your callback into a tuple (substituting a "pass through" function for the missing callback/errback) and sticks this on the callback/errback tuple list.

The result from the asynchronous function will either be a Failure object, or some other Python value. If it is the former, Deferred will call your errback function in the tuple with the result; the latter will result in a call to the callback function in the tuple. The function call itself can result in two end results, another failure (either by returning a Failure object or by raising an Exception) or a regular Python value. Deferred will then move to the next tuple and repeat until there are no more tuples left.
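The algorithm just described fits in a few lines of plain Python. The sketch below is a deliberately simplified toy, not Twisted's implementation: ToyDeferred and ToyFailure are invented names, all handlers must be added before fire() is called, and none of the real class's bookkeeping is modeled.

```python
def passthru(value):
    return value


class ToyFailure:
    """Stand-in for twisted.python.failure.Failure: just wraps an exception."""

    def __init__(self, value):
        self.value = value


class ToyDeferred:
    def __init__(self):
        self.pairs = []  # list of (callback, errback) tuples

    def addCallbacks(self, callback, errback=passthru):
        self.pairs.append((callback, errback))
        return self

    def addCallback(self, callback):
        return self.addCallbacks(callback, passthru)

    def addErrback(self, errback):
        return self.addCallbacks(passthru, errback)

    def fire(self, result):
        for callback, errback in self.pairs:
            # Pick one side of the tuple depending on the current result.
            handler = errback if isinstance(result, ToyFailure) else callback
            try:
                result = handler(result)
            except Exception as exc:
                result = ToyFailure(exc)  # raising switches to the error side
        return result


d = ToyDeferred()
d.addCallback(lambda v: v + 1)
d.addCallback(lambda v: v // 0)       # raises ZeroDivisionError
d.addErrback(lambda f: "recovered")   # handles it and supplies a new result
d.addCallback(str.upper)
final = d.fire(1)
```

Here the value travels 1, 2, a ToyFailure, "recovered", and finally "RECOVERED", crossing from the callback side to the errback side and back again.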

Take the following code as an example:

    d = asynchronous_operation()
    d.addCallback(callback1)             # tuple 1
    d.addCallback(callback2)             # tuple 2
    d.addErrback(errback3)               # tuple 3
    d.addCallbacks(callback4, errback4)  # tuple 4

Consider two possible scenarios. First, success:

  1. The asynchronous operation succeeds with a result of "Foo".
  2. No failure. We give "Foo" to the callback of tuple 1, callback1. It returns ("Foo", 123).
  3. No failure. We give ("Foo", 123) to the callback of tuple 2, callback2. It returns "Foo123".
  4. No failure. We give "Foo123" to the callback of tuple 3, which happens to be a pass through function. It returns "Foo123".
  5. No failure. We give "Foo123" to the callback of tuple 4, callback4. It does something, but the return value is not given to anyone.

What about failure?

  1. The asynchronous operation fails, and a Failure object is constructed.
  2. Failure. We give the failure object to the errback of tuple 1, which happens to be a pass through function. It returns the failure object.
  3. Failure. We give the failure object to the errback of tuple 2, which is also a pass through function. It returns the failure object.
  4. Failure. We give the failure object to the errback of tuple 3, errback3. It acknowledges and logs the error. It doesn't return anything.
  5. No failure (remember, None is a valid result value!) We give None to the callback of tuple 4, callback4.

Think of your callback/errback chains as parallel pipes of execution, which could transfer to one another at any point. As a parting word, here is a use of one convenience function, addBoth.

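Synchronous:

    try:
        synchronous_operation()
    finally:
        cleanup()

Asynchronous:

    d = asynchronous_operation()
    d.addBoth(lambda x: cleanup())

The lambda is simply a convenient way to avoid passing x to cleanup() (lest Python raise a TypeError). Note one difference from finally: the return value of the lambda replaces the result, so a Failure passing through is treated as handled. If later callbacks or errbacks need the original result or Failure, have the function return x after cleaning up.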

Fluent interface

Deferred implements a fluent interface for adding callbacks, where the return value of addCallback, addErrback or any other similar method is the object itself (return self). This means you can write this:

    d = asynchronous_operation()
    d.addCallback(f1).addCallback(f2).addCallback(f3)

which is equivalent to:

    d = asynchronous_operation()
    d.addCallback(f1)
    d.addCallback(f2)
    d.addCallback(f3)

Use of this style is a matter of taste and consistency.

Chaining Deferreds

All of the examples, to this point, have been focused around a single asynchronous operation, and the synchronous post-processing of that operation. However, in the real world, you will often need to run multiple asynchronous operations, one after the other. For example, if you make an HTTP request, and find out that the request is a redirect, you need to make another (asynchronous) HTTP request.

Our code, then, is fatally hobbled if we can't easily chain deferreds together. With the framework we set up previously, we could implement something along the lines of having the callback call the next asynchronous function, and then set up the callbacks on the deferred that function returned.

Synchronous:

    value = synchronous_operation_a()
    value2 = synchronous_operation_b(value)
    process(value2)

Asynchronous:

    d = asynchronous_operation_a()
    def chain(result):
        d2 = asynchronous_operation_b(result)  # a second, separate Deferred
        d2.addCallback(process)
    d.addCallback(chain)

But we just spent the first section explaining our wonderful system of multiple callbacks and errbacks and, as you might notice, there isn't actually a way to get chain to return the value of process in this example without making it synchronous.

To make this work, Twisted does something special: it lets callbacks return a Deferred, and takes that to mean, "this callback doesn't have the answer yet, but when this Deferred fires it will!"

Synchronous:

    value = synchronous_operation_a()
    value2 = synchronous_operation_b(value)
    process(value2)

Asynchronous:

    d = asynchronous_operation_a()
    d.addCallback(asynchronous_operation_b)
    d.addCallback(process)

Written a little more explicitly (in case you're still squeamish about higher order functions), the asynchronous code is equivalent to this:

    d = asynchronous_operation_a()
    def chain(result):
        return asynchronous_operation_b(result)
    d.addCallback(chain)
    d.addCallback(process)

Here is the mantra: Callbacks and errbacks can return deferreds.

Chaining in Pictures

We're now going to introduce some visual aids to see how you can use deferred chaining to modify program flow. We'll represent Deferred objects as "pipes," that is, a series of callbacks that take some input, process it in turn, and then return some output.

[ Figure: plain old deferred object ]

This is a Deferred that we instantiated from scratch; it doesn't do anything and unless we explicitly call it, it will never run (in the next section, Composing Deferreds, we will see why Deferred objects like this can be useful). In many other cases, the function we called to get this deferred object promised to call back at some point: we'll represent this as the red text "Asynchronous Code". This code provides the input A that gets the ball rolling.

[ Figure: Deferred object that some asynchronous code will call ]

Under normal circumstances, C simply falls off into oblivion; no other code cares about it!

Now, suppose that the asynchronous code finishes its job and calls the Deferred. While processing this value, Callback 1 returns a Deferred B instead of an actual B, indicating, "No wait, the value isn't ready yet!"

[ Figure: Deferred object that some asynchronous code will call ]

We can't just pass Deferred B to Callback 2, since it's expecting a B, not a deferred. How do we get B out of Deferred B? Well, recall what Deferred B looks like:

[ Figure: the deferred object that callback 1 returned ]

There are a few comments to be made about this deferred: first off, it's a fully formed deferred object that some other asynchronous code, Asynchronous Code for B, has promised to call back with a result. However, the result in this example isn't actually B; it's B''. You can imagine this as some precursor value for B that needs to go through Callback 1' and Callback 2' before it becomes B. We've used the prime symbol (') in order to distinguish Callback 1 from Callback 1'; they are distinct and may be completely different functions.

By now, the words "chain" and the arrow labeled B probably have given you some idea how to reincorporate Deferred B into the original deferred. Sure enough, we simply plug it in.

[ Figure: the deferred object that callback 1 returned ]

(We've omitted Callback 1 from the diagram in the interest of brevity; it is now inaccessible and non-existent for the purposes of finishing processing.) The evaluation proceeds as normal. Note that any of the callbacks in our new chained Deferred can return a deferred and repeat this process.

One last comment: something interesting has happened to the value that comes out of the last callback: for Deferred B, it was actually used! Chaining deferreds means that we care about the ultimate end result of our callback chain.

Dependencies

Well written, maintainable callbacks maintain "contracts" with respect to their behavior. Any given callback should have a well-defined value it takes and a well-defined value that it returns. This is good sense that applies not only to callbacks but also to functions.

We've now added a slight twist to this, in that any callback can return the value that it is contractually obligated to supply, or it can promise to return the value in the form of a Deferred. (Imagine if you could get away with doing this in real life!) And, in the process of fulfilling that promise, you may discover you need to do another asynchronous request. Something has just happened: you're resolving a dependency chain.

[ here goes an example with actual running Twisted code in three steps. Pictures of how the "callback" chain looks like as we discover more and more dependencies should be supplied ]

Looping

A common form of dependency is needing to perform the asynchronous operation all over again. The canonical example of this is an HTTP redirect: when the callback for a deferred from a page request is called, the response could be the final result, or it could be an empty body with the Location HTTP header set, in which case you simply perform the operation over again.

[ here is the HTTP redirect example. It also should have pictures. ]

Lambdas

We now take this opportunity to remind you that chaining deferreds often results in the creation of lots of little functions to shuffle the result of one operation to the next asynchronous function. Sometimes you can be clever and pass the asynchronous function itself as a callback, but this only works if the next asynchronous function takes a single parameter, and that parameter is the result of the previous computation.

In simple cases, you may want to use a lambda to move a parameter around, or partially apply a function. Suppose we have an asynchronous function send_message(value, type), and we know that in our code type should equal 2, then:

Without lambdas:

    d = asynchronous_operation()
    def send_message_callback(result):
        return send_message(result, 2)
    d.addCallback(send_message_callback)

With lambdas:

    d = asynchronous_operation()
    d.addCallback(lambda x: send_message(x, 2))

Composing Deferreds

Chaining deferreds dealt with sequential computation: each successive asynchronous operation required the result of the previous computation in order to run. But we could have done this very easily synchronously: asynchronous execution shines when we want to perform computations in parallel. Parallelizing computations raises some questions, though: when is a parallel computation complete? How do I treat these parallel computations as a single unit?

The answer is composition, that is, we can combine deferreds together into a single deferred. As it turns out, Twisted has some built-in facilities for doing this.

The implementation of a DeferredList

Consider a Deferred that would only fire after a number of other Deferreds have fired.

    from twisted.internet.defer import Deferred

    class FireWhenAllFinish(Deferred):
        def __init__(self, deferreds):
            super(FireWhenAllFinish, self).__init__()
            self.deferreds = deferreds

We start off with a logical constructor for our class: a simple list of the Deferred objects we want to finish before this Deferred fires. Recall that we need to set up callbacks on each Deferred in deferreds to tell us when they've finished. Thus:

    from twisted.internet.defer import Deferred

    class FireWhenAllFinish(Deferred):
        def __init__(self, deferreds):
            super(FireWhenAllFinish, self).__init__()
            self.deferreds = deferreds
            for d in self.deferreds:
                # Attach to each child Deferred, not to ourselves.
                d.addCallbacks(self._cbDeferred, self._ebDeferred)

        def _cbDeferred(self, result):
            raise NotImplementedError

        def _ebDeferred(self, failure):
            raise NotImplementedError

Now, for the definition of _cbDeferred, after a little thought, and the knowledge that callback() and errback() are the methods you can use to fire a deferred (it's what asynchronous_operation() would have called behind the veil), a relatively simple implementation comes to mind:

    from twisted.internet.defer import Deferred

    class FireWhenAllFinish(Deferred):
        def __init__(self, deferreds):
            super(FireWhenAllFinish, self).__init__()
            self.deferreds = deferreds
            self.finishedCount = 0
            if not self.deferreds:
                self.callback(None)  # nothing to wait for; callback() needs a result
            for d in self.deferreds:
                # Attach to each child Deferred, not to ourselves.
                d.addCallbacks(self._cbDeferred, self._ebDeferred)

        def _cbDeferred(self, result):
            self.finishedCount += 1
            if self.finishedCount == len(self.deferreds):
                self.callback(None)

        def _ebDeferred(self, failure):
            # self.called is True if callback()/errback() has already been called
            if not self.called:
                self.failed = True
                self.errback(failure)  # pass the child's Failure along

There are two gotchas: the first is that if there were no deferreds passed into this deferred, we should automatically fire our callback; after all, we're not waiting on anything. The second is that callback() and errback() must only be called (between the two of them) once, so we manually guard for this by checking that self.called is False before making the errback call (why such a check is unnecessary for the callback call is left as an exercise for the reader).

Making your own deferreds

Simple case: batons
