Continuation Passing Style
Continuations and Continuation Passing Style come up in a number of different scenarios and have been (re-)discovered multiple times since the 1960s.1 Reynolds, J.C. 1993. The discoveries of continuations. Lisp and symbolic computation. 6, (1993), 233β247.
What are continuations and what does it mean for a program to be in continuation passing style? How does continuation passing style transform recursive functions into a tail-recursive function and how does it allow one to make the order of evaluation explicit? These are some of the questions we will be answering in this post.
All Python code examples were interpreted on version:
3.11.4 | packaged by conda-forge | (main, Jun 10 2023, 18:08:17) [GCC 12.2.0]
Let’s dig in!
Continuations: rest of the program2The term Continuations is credited to Wadsworth and Strachey.
Wikipedia describes continuations as
… an abstract representation of the control state of a computer program.
That’s a mouthful. There’s a simpler, more intuitive, definition:3 Strachey, C. and Wadsworth, C.P. 1974. Continuations: A mathematical semantics for handling full jumps. (no title). (1974).
[A continuation is] [t]he (meaning of the) “rest of the program” …
Let’s say, our program is as follows:
1: x = "hello world!" 2: print(x)
Then, the continuation at the point where x
is defined, is: lambda x: print(x)
. After x
gets defined, the Python interpreter continues with
the rest of the program. In other words, it continues with … π₯ … the
continuation.
For a given point in the execution of a program, the corresponding continuation is whatever will happen from that point to the end of the corresponding program’s “main” function.
One thing to note is that continuations can be represented by functions. As
functions they take as input the current state of the program4In the code
example, this is simply x
. and produce as output the output of the program.
Another thing to note is that continuations of a program do not compose.5Composable continuations, aka delimited continuations, do exist and are distinct. Intuitively, they represent the meaning of the “rest of the program” up to a point (i.e., they are delimited). Why is that? Continuations extend till the end of the program. If they could be composed with something, that something would have to be after the end of the program and as such wouldn’t be a continuation of that program.
Continuation Passing Style
Having defined continuations, continuation passing style (CPS) is a program transformation that transforms functions such that they accept an extra argument which represents the continuation at the call-site.
We begin our exploration with the following recursive function. For the purposes
of this blog post we will ignore what might happen when subtractorial
is
evaluated with a sufficiently large integer as input.
from __future__ import annotations def subtractorial(n: int) -> int: match n: case 0: return 0 case _: return (n - subtractorial(n-1))
With a little algebra it can be demonstrated that subtractorial(x)
6For a
non-negative number. is equivalent to \(\lceil \frac{x}{2} \rceil\). We can
observe this below:
for x in range(10): print(f"{x} => {subtractorial(x)}")
0 => 0 1 => 1 2 => 1 3 => 2 4 => 2 5 => 3 6 => 3 7 => 4 8 => 4 9 => 5
The CPS transformation of subtractorial
is subtractorial1
, which is defined
as:
1: from typing import Callable, TypeVar 2: 3: R = TypeVar('R') 4: 5: def subtractorial1(n: int, c: Callable[[int], R]) -> R: 6: return c(subtractorial(n))
Our previously defined subtractorial
is what you get when you supply the
identity function7lambda x: x
as the continuation:
print(subtractorial1(11, lambda x: x))
6
CPS as a tail-recursive program transformation
Using the above definitions of subtractorial1
and subtractorial
we can
synthesize an equivalent definition of subtractorial1
that no longer depends
on subtractorial
.
1: from __future__ import annotations 2: from typing import Callable, TypeVar 3: 4: R = TypeVar('R') 5: 6: def subtractorial2(n: int, c: Callable[[int], R]) -> R: 7: match n: 8: case 0: 9: return c(0) 10: case _: 11: return subtractorial2((n - 1), c(lambda x: n - x))
CPS is implemented by having the last call in a function be the call to the
continuation. And since subtractorial2
is recursive and in CPS, it is
tail-recursive. Additionally, this form of subtractorial2
make clear that
the continuation is acting as an accumulating parameter. During the tail-call,
the continuation being passed to the recursive call is the composition of (i.e.,
the accumulation of) c
and lambda x: n - x
.
For completeness, an equivalent formulation for the tail-call is:
return subtractorial2((n - 1), lambda x: c(n - x))
CPS and defunctionalization
Functions aren’t the only representation for continuations. Specifically, CPS can be defunctionalized. Defunctionalization8 Reynolds, J.C. 1972. Definitional interpreters for higher-order programming languages. Proceedings of the acm annual conference-volume 2 (1972), 717β740. proceeds by choosing an appropriate representation to represent and distinguish all the higher-order calls in a program.
All the continuations in subtractorial2
are of the form
int -> int
.9I.e., taking a single int
as an input
and producing an int
. Even more specifically, the continuations are
a result of composing functions of the form lambda x: n - x
for some
integer n
. One way to represent compositions of such functions is as
a list of integers. The abstraction function subtabs3
converts from the
“concrete” representation (list of integers) to the previous “abstract”
(functional) representation.
1: from __future__ import annotations 2: from typing import Callable 3: from functools import reduce 4: 5: def subtabs3(xs: [int]) -> Callable[[int], int]: 6: return reduce(lambda x, y: lambda z: x(y - z), xs, lambda x: x)
Note the similarity between lambda z: x(y - z)
in subtabs3
and
lambda x: c(n - x)
in subtractorial
.
Using the above we can now define a new implementation that is both tail-recursive and first-order.
7: def subtractorial3(x: int) -> int: 8: return _subtractorial3(x, []) 9: 10: def _subtractorial3(x: int, xs: [int]) -> int: 11: match x: 12: case 0: 13: return subtabs3(xs)(0) 14: case _: 15: xs.append(x) 16: return _subtractorial3((x - 1), xs)
And we can confirm that it computes the same results as subtractorial
above.
for x in range(10): print(f"{x} => {subtractorial3(x)}")
0 => 0 1 => 1 2 => 1 3 => 2 4 => 2 5 => 3 6 => 3 7 => 4 8 => 4 9 => 5
The significance of the choice of representation
A list of integers isn’t the only way to represent the continuations in
subtractorial2
. Another way is to represent the composition of functions as
a single integer.
1: from __future__ import annotations 2: from typing import Callable 3: from functools import reduce 4: 5: def subtabs4(x: int) -> Callable[[int], int]: 6: return lambda y: abs(x - y)
If, starting with the initial argument to subtractorial4
, we club every pair
of integers we observe that every pair contributes 1
10x - (x - 1) == 1
to the final result. Thus, the key is to pick a representation
that focuses on the absolute value of the difference, rather than the difference
itself. Having done so, the definition of subtractorial4
is straightforward.
7: def subtractorial4(x: int) -> int: 8: return _subtractorial4(x, 0) 9: 10: def _subtractorial4(x: int, y: int) -> int: 11: match x: 12: case 0: 13: return subtabs4(y)(0) 14: case _: 15: return _subtractorial4((x - 1), (x - y))
Having defined subtractorial4
as above, we can indeed confirm that it computes
the same results as subtractorial
and subtractorial3
above.
for x in range(10): print(f"{x} => {subtractorial4(x)}")
0 => 0 1 => 1 2 => 1 3 => 2 4 => 2 5 => 3 6 => 3 7 => 4 8 => 4 9 => 5
As an optimization aid
Let’s consider the example of the singly-linked lists and the naive
implementation of reverse
11The departure to Haskell for this section is a
didactic choice. Haskell lists have less magic in their implementation than
Python’s list. which reverses the list:
- The
reverse
of an empty list is the empty list. - The
reverse
of a list with a head and a tail is the result of appending the head to thereverse
of the tail.
reverse :: [a] -> [a] reverse [] = [] reverse (h:t) = (reverse t) ++ [h]
This naive implementation is quadratic in the length of the list since append
(++
) is linear in the length of its left argument.
For the uninitiated, here’s a brief glossary of Haskell syntax:
::
denotes the type of an expression.x::[a]
states that the type of the variablex
is that of[a]
.12I.e., a list comprised of elements whose type isa
, for some typea
.a -> b
is the type of a function that takesa
as input and produces a result of typeb
.a -> b -> c
is equivalent13More accurately, it is the curried form of such a function. to a function that takes two arguments, one of typea
and the other of typeb
, and produces a result of typec
.(h:t)
deconstructs a list into a head element and the remainder of the list, i.e., the tail.
Returning to the example, we can CPS-transform reverse
and get reverse2
in a
manner similar to how we obtained subtractorial2
from subtractorial
. Eliding
the intermediate steps, we get:
reverse2 :: [a] -> [a] reverse2 xs = reverse2' xs id where reverse2' :: [a] -> ([a] -> [a]) -> [a] reverse2' [] c = c [] reverse2' (h:t) c = reverse2' t (\xs -> c (xs ++ [h]))
In reverse2
, id
stands for the identity function.
We can continue in the spirit of subtractorial3
and defunctionalize
reverse2
. As the continuations accumulate they form a chain of function
compositions where each function appends a single element. If we14How we
choose a representation appropriately is out of scope for this post. represent
this composition (++ [a]) . (++ [b]) ... (++ [c])
as the list
[c, ..., b, a]
of appendees in reverse order we obtain:
revabs3 :: [a] -> ([a] -> [a]) revabs3 xs = \ys -> (ys ++ xs) reverse3 :: [a] -> [a] reverse3 xs = reverse3' xs [] where reverse3' :: [a] -> [a] -> [a] reverse3' [] c = revabs3 c [] reverse3' (h : t) c = reverse3' t (h : c)
Inlining the definition of revabs3
, we obtain the well-known, linear-time
reverse function using a list accumulator.
reverse4 :: [a] -> [a] reverse4 xs = reverse4' xs [] where reverse4' :: [a] -> [a] -> [a] reverse4' [] c = c reverse4' (h : t) c = reverse4' t (h : c)
Let’s reflect for a moment. The transformation to CPS made explicit the context surrounding the recursive call(s). Performing defunctionalization forced us to consider the question of what would be an appropriate representation that can adequately address the, possibly many, recursive calls. Having picked an appropriate representation our function went from quadratic-time to linear-time.
CPS and Evaluation order
There is more than one evaluation strategy that an interpreter or compiler can follow when evaluating expressions.
As noted by Reynolds15 Reynolds, J.C. 1998. Definitional interpreters revisited. Higher-order and symbolic computation. 11, (1998), 355β361. a program in CPS has the property that
Whenever some function calls a serious function16A serious function is defined as a function that may not terminate (depending on the arguments passed to it)., the calling function must return the same result as the called function, without performing any further computation.
In other words, as we’ve seen in this blog post, CPS results in tail-calls which, at a particular point, explicitly represent what the meaning of the “rest of the program” is. The continuations in these tail-calls stack or nest,17Not unlike Matryoshka dolls. with the order of nesting representing the order in which computations are to be evaluated.
This provides an additional “degree of freedom” when defining interpreters which allow sufficient flexibility18As it turns out, CPS isn’t necessary for order-of-application independence. such that the order of evaluation of the defined language can be made independent of the order of evaluation of the defining language.8 Reynolds, J.C. 1972. Definitional interpreters for higher-order programming languages. Proceedings of the acm annual conference-volume 2 (1972), 717β740. , 19At least for Call by name and Call by value evaluation orders.
And that is all we’ll say on this specific topic. Not because the topic isn’t worth elaborating, but because it’d be difficult to compete with Reynolds excellent treatment of it.15 Reynolds, J.C. 1998. Definitional interpreters revisited. Higher-order and symbolic computation. 11, (1998), 355β361. , 20 Reynolds, J.C. 1998. Definitional interpreters for higher-order programming languages. Higher-order and symbolic computation. 11, (1998), 363β397.
Historical tidbit
Continuations and continuation passing style have been (re-)discovered multiple times since the 1960s.1 Reynolds, J.C. 1993. The discoveries of continuations. Lisp and symbolic computation. 6, (1993), 233β247. The earliest description of the use of continuations (specifically, CPS) was by Adriaan van Wijngaarden in September 1964, at an IFIP Working Conference on Formal Language Description Languages held in Baden bei Wien, Austria.
Participants in the discussion of his presentation included Dijkstra, Hoare, McCarthy, McIlroy, and Strachey, and other conference attendees included BΓΆhm, Elgot, Landin and Nivat.
van Wijngaarden used CPS transformation (before it was called as such) to
compile a program with goto
statements into another that didn’t need any
goto
statements.
Under the inspiration of the notion of the unnecessity of
goto
’s, Dijkstra spent the evening constructing realistic examples of programs withoutgoto
’s, which he scribbled on napkins at coffee break the next day. … That coffee-break palaver ripened into the most celebrated21 So celebrated, in fact, that we have not one, but two posts on the topic. letter to the editor in the history of computing.22 Dijkstra, E.W. 1968. Letters to the editor: go to statement considered harmful. Communications of the acm. 11, 3 (1968), 147β148.
Conclusion
Continuations reify the meaning of the rest of the program. Continuation passing style transforms a program such that each function takes an additional function which is a representation of the continuation at the call-site. When these continuations are represented as functions, every function in the CPS-transformed program makes a tail-call to said continuation.23Passing as an argument the output of the original function. CPS and defunctionalization complement each other. Appropriate choice of defunctionalized representations can lead to asymptotic improvements. And finally, because continuations make explicit “what to do after this step”, CPS can be used to achieve order-of-application independence in defining and defined languages.
That’s it. That’s the idea.
Comments
Comments can be left on twitter, mastodon, as well as below, so have at it.
New post!
— The Weary Travelers blog (@wearyTravlrsBlg) July 9, 2023
What is the Continuation Passing Style of programming?https://t.co/KBp0IlBJsP
Reply here if you have comments.
Footnotes:
Reynolds, J.C. 1993. The discoveries of continuations. Lisp and symbolic computation. 6, (1993), 233β247.
The term Continuations is credited to Wadsworth and Strachey.
Strachey, C. and Wadsworth, C.P. 1974. Continuations: A mathematical semantics for handling full jumps. (no title). (1974).
In the code
example, this is simply x
.
Composable continuations, aka delimited continuations, do exist and are distinct. Intuitively, they represent the meaning of the “rest of the program” up to a point (i.e., they are delimited).
For a non-negative number.
lambda x: x
Reynolds, J.C. 1972. Definitional interpreters for higher-order programming languages. Proceedings of the acm annual conference-volume 2 (1972), 717β740.
I.e., taking a single int
as an input
and producing an int
.
x - (x - 1) == 1
The departure to Haskell for this section is a didactic choice. Haskell lists have less magic in their implementation than Python’s list.
I.e., a list comprised of elements whose type is
a
, for some type a
.
More accurately, it is the curried form of such a function.
How we choose a representation appropriately is out of scope for this post.
Reynolds, J.C. 1998. Definitional interpreters revisited. Higher-order and symbolic computation. 11, (1998), 355β361.
A serious function is defined as a function that may not terminate (depending on the arguments passed to it).
Not unlike Matryoshka dolls.
As it turns out, CPS isn’t necessary for order-of-application independence.
At least for Call by name and Call by value evaluation orders.
Reynolds, J.C. 1998. Definitional interpreters for higher-order programming languages. Higher-order and symbolic computation. 11, (1998), 363β397.
Dijkstra, E.W. 1968. Letters to the editor: go to statement considered harmful. Communications of the acm. 11, 3 (1968), 147β148.
Passing as an argument the output of the original function.