Defunctionalization: Simulating higher-order functions
Functions, in general, cannot be serialized. This is a pity, because it means we may not be able to pass a function as an argument in some situations. 1For instance, in a distributed program or when you want to specify a function as a configuration parameter and your configuration DSL, e.g. YAML, doesn’t support function references. Defunctionalization is a program transformation which allows us to simulate higher-order functions via first-order functions. Using defunctionalization we can mechanically convert a higher-order program into one that only requires first-order capabilities to execute.
Introduction
A common higher-order function2A
higher-order
function (cf. first-order) is a function that either returns a function or
takes one or more functions as arguments. is map
. It accepts a function and
something list-like, and applies that function to each element in the list.
1: numbers = [1, 2, 3, 4, 5] 2: 3: squared_numbers = list(map(lambda x: x * x, numbers)) 4: print(f"squared_numbers: {squared_numbers}") 5: 6: scaling_factor = 2 7: scaled_numbers = list(map(lambda x: scaling_factor * x, numbers)) 8: print(f"scaled_numbers : {scaled_numbers}")
squared_numbers: [1, 4, 9, 16, 25] scaled_numbers : [2, 4, 6, 8, 10]
But what if map
were a function that had to be executed on a different
machine? There are two complications.
- We wouldn’t be able to pass the
lambda
expressions as inputs since functions aren’t serializable. - Additionally, the other machine wouldn’t have access to the
scaling_factor
. 3Thescaling_factor
is an example of a free variable, aka a non-local variable. It’s a variable whose value is used in the function, but isn’t passed in as an argument.
map
: defunctionalized
The key insight of Reynolds4
John C. Reynolds. 1972.
Definitional
interpreters for higher-order programming languages.
In Proc. ACM Annu. Conf., Vol. 2. ACM, 717–740.
https://doi.org/10.1145/800194.805852/.
was that a given program is finite
and contains only finitely many functions. As such, it is possible to enumerate
all functions and assign a unique identifier to denote each. Each higher-order
function can then be replaced by a function that accepts these identifiers
instead of function arguments.
In Listing No description for this link, each use of map
uses a different lambda
abstraction.5The
first to square the list and the second to scale the list by a
scaling_factor
. Since the point of defunctionalization is to not pass
functions by reference, what can we do? It turns out any other permitted value
that we can compare (to other values) will suffice as an identifier. Below, we
represent each of these functions by a dataclass
.6Equivalently, values of
an algebraic data
type.
3: from dataclasses import dataclass 4: 5: @dataclass 6: class Square: 7: pass 8: @dataclass 9: class Scale: 10: factor: int 11: MapCases = Square | Scale
We can certainly serialize values representing Square
and
Scale
. Additionally, since Scale
contains the factor
member
variable7This is an instance of
Lambda lifting
or Closure conversion. it, when deserialized, will make the value
available to the process executing on the different machine.
Now what of the code that executes on the other machine?8Or, rather, could execute on the other machine. Instead of accepting a function which it could then apply to each of the elements of the list, it performs case-analysis on the first argument.
13: def applyMap(tag: MapCases, xs: list[int]) -> list[int]: 14: result = [] 15: match tag: 16: case Square(): 17: for x in xs: 18: result.append(x * x) 19: return result 20: case Scale(y): 21: for x in xs: 22: result.append(y * x) 23: return result 24: case _ as unreachable: 25: assert_never(unreachable)
Having defined such an apply function, we update each of the use-sites and
replace the higher-order map
with the first-order applyMap
instead.
27: numbers = [1, 2, 3, 4, 5] 28: 29: squared_numbers = applyMap(Square(), numbers) 30: print(f"squared_numbers: {squared_numbers}")
In the case of scaled_numbers
, we additionally pass the scaling_factor
as an
input.
32: scaling_factor = 2 33: scaled_numbers = applyMap(Scale(scaling_factor), numbers) 34: print(f"scaled_numbers : {scaled_numbers}")
And that’s it. Voilà! Listing No description for this link shows the defunctionalized code in its entirety.
1: from __future__ import annotations 2: from typing import assert_never 3: from dataclasses import dataclass 4: 5: @dataclass 6: class Square: 7: pass 8: @dataclass 9: class Scale: 10: factor: int 11: MapCases = Square | Scale 12: 13: def applyMap(tag: MapCases, xs: list[int]) -> list[int]: 14: result = [] 15: match tag: 16: case Square(): 17: for x in xs: 18: result.append(x * x) 19: return result 20: case Scale(y): 21: for x in xs: 22: result.append(y * x) 23: return result 24: case _ as unreachable: 25: assert_never(unreachable) 26: 27: numbers = [1, 2, 3, 4, 5] 28: 29: squared_numbers = applyMap(Square(), numbers) 30: print(f"squared_numbers: {squared_numbers}") 31: 32: scaling_factor = 2 33: scaled_numbers = applyMap(Scale(scaling_factor), numbers) 34: print(f"scaled_numbers : {scaled_numbers}")
$> mypy ./defunc_map.py Success: no issues found in 1 source file
squared_numbers: [1, 4, 9, 16, 25] scaled_numbers : [2, 4, 6, 8, 10]
Related ideas and further reading
Defunctionalization was first introduced in 19724
John C. Reynolds. 1972.
Definitional
interpreters for higher-order programming languages.
In Proc. ACM Annu. Conf., Vol. 2. ACM, 717–740.
https://doi.org/10.1145/800194.805852/.
in the
context of developing definitional interpreters. The seminal paper demonstrated
the use of defunctionalization to implement an interpreter for a higher-order
defined language in a first-order defining language.
The paper used defunctionalization in conjunction with the Continuation-passing
style (CPS) transformation.9
Adriaan van Wijngaarden. 1966. Recursive definition of
syntax and semantics. In T.B. Steel Jr. (Ed.), Formal Language Description
Languages for Computer Programming, 13-24. North-Holland.
The latter was used to make the
defined language independent of the order of evaluation of the defining
language. CPS and defunctionalization are complementary, and can be used to
derive many abstract machines10
Olivier Danvy and Lasse
R. Nielsen. 2001.
Defunctionalization at work.
In Proc. 3rd ACM SIGPLAN Int. Conf. Princ. Pract. Declar. Program.,
162–174. ACM.
https://doi.org/10.1145/773184.773202.
as well as data structures.11For
instance, the Zipper data structure. They can also be used as a refactoring
technique12
James Koppel. 2019. Defunctionalization: Everybody Does
It, Nobody Talks About It. SIGPLAN Blog (Dec. 2019).
to, among other things, convert recursion into
iteration.
Using CPS and defunctionalization together allows us to write an interpreter
of a higher-order programming language in a first-order language. Converting
this into a compiler requires leveraging (pseudo-)
associativity 13
Mitchell
Wand. 1982. Deriving Target Code as a Representation of Continuation Semantics.
ACM Trans. Program. Lang. Syst. 4, 3 (July),
496–517. https://doi.org/10.1145/357172.357179.
of generalized function composition. This
associativity property also can be used to make the resulting interpreter more
efficient by obviating some intermediate data structures.14
Jeremy Gibbons. 2021.
Continuation-Passing Style,
Defunctionalization, Accumulations, and Associativity.
In Art Sci. Eng. Program., Vol. 6,
Issue 2.
https://doi.org/10.22152/programming-journal.org/2022/6/7.
Without leveraging associativity we are required to “think
creatively”.15
Li-yao Xia. 2020. Defunctionalize the Continuation. In PL
Club Blog. University of Pennsylvania (May 2020).
And to
quote Knuth:
The ultimate goal of mathematics is to eliminate all need for intelligent thought.
Conclusion
To summarize, the steps to defunctionalize a higher-order function are simple.
- Tag: Denote each function argument of a higher-order function by an identifying tag.
- Apply: Replace each use of the higher-order function by the use of a first-order apply function that accepts a tag instead of a function reference, and additional parameters as needed.
- Closure conversion: Ensure that the apply function doesn’t have any free variables and that they are passed as extra arguments16As needed. at the use-site(s).
That’s it. That’s the idea.17Step 4: Profit.
Comments
Comments can be left on twitter, mastodon, as well as below, so have at it.
New post!
— The Weary Travelers blog (@wearyTravlrsBlg) June 11, 2023
Idea: What is "defunctionalization"?https://t.co/VnWdkhL801
Reply here if you have comments.
Footnotes:
For instance, in a distributed program or when you want to specify a function as a configuration parameter and your configuration DSL, e.g. YAML, doesn’t support function references.
A higher-order function (cf. first-order) is a function that either returns a function or takes one or more functions as arguments.
The scaling_factor
is an example of a
free variable, aka a
non-local variable. It’s a variable whose value is used in the
function, but isn’t passed in as an argument.
John C. Reynolds. 1972.
Definitional
interpreters for higher-order programming languages.
In Proc. ACM Annu. Conf., Vol. 2. ACM, 717–740.
https://doi.org/10.1145/800194.805852/.
The
first to square the list and the second to scale the list by a
scaling_factor
.
Equivalently, values of an algebraic data type.
This is an instance of Lambda lifting or Closure conversion.
Or, rather, could execute on the other machine.
Adriaan van Wijngaarden. 1966. Recursive definition of syntax and semantics. In T.B. Steel Jr. (Ed.), Formal Language Description Languages for Computer Programming, 13-24. North-Holland.
Olivier Danvy and Lasse
R. Nielsen. 2001.
Defunctionalization at work.
In Proc. 3rd ACM SIGPLAN Int. Conf. Princ. Pract. Declar. Program.,
162–174. ACM.
https://doi.org/10.1145/773184.773202.
For instance, the Zipper data structure.
James Koppel. 2019. Defunctionalization: Everybody Does It, Nobody Talks About It. SIGPLAN Blog (Dec. 2019).
Mitchell
Wand. 1982. Deriving Target Code as a Representation of Continuation Semantics.
ACM Trans. Program. Lang. Syst. 4, 3 (July),
496–517. https://doi.org/10.1145/357172.357179.
Jeremy Gibbons. 2021.
Continuation-Passing Style,
Defunctionalization, Accumulations, and Associativity.
In Art Sci. Eng. Program., Vol. 6,
Issue 2.
https://doi.org/10.22152/programming-journal.org/2022/6/7.
Li-yao Xia. 2020. Defunctionalize the Continuation. In PL Club Blog. University of Pennsylvania (May 2020).
As needed.