Code size analysis#
Core async2 implementation#
The following table shows the code size cost of adding pw_async2 to a
system. These size reports assume a baseline system with an RTOS which already
uses a handful of core Pigweed components including HAL abstractions and
pw_allocator.
The first row captures the core of pw_async2: the dispatcher, tasks, and
wakers, using the pw::async2::BasicDispatcher. This is the minimum size
cost a system must pay to adopt pw_async2. The second row shows the cost of
adding another task to this system. Most of a task's cost lies in its own
implementation; this row only shows that the internal per-task overhead is
minimal.
| Label | Segment | Delta |
|---|---|---|
| Full cost of including pw_async2 | FLASH | +3,184 |
| | RAM | +40 |
| Base incremental cost of adding a task to an existing async system | FLASH | +48 |
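
As a rough planning aid, the two FLASH deltas above can be combined into a linear estimate. The sketch below is only an illustration: the helper name `EstimateFrameworkFlash` is hypothetical, and it assumes that the per-task delta scales linearly and that the first row does not already include the additional task. The figures come from one specific build, so other toolchains and targets will differ.

```cpp
#include <cstddef>

// Figures copied from the table above, in bytes. They describe one specific
// build configuration, so treat them as planning numbers only.
constexpr std::size_t kCoreFlashBytes = 3184;
constexpr std::size_t kPerTaskFlashBytes = 48;

// Hypothetical helper (not part of Pigweed): estimates the framework's flash
// overhead when `extra_tasks` tasks are added on top of the core.
// Assumption: the per-task delta scales linearly. Each task's own
// implementation is NOT included.
constexpr std::size_t EstimateFrameworkFlash(std::size_t extra_tasks) {
  return kCoreFlashBytes + extra_tasks * kPerTaskFlashBytes;
}

static_assert(EstimateFrameworkFlash(0) == 3184, "core only");
static_assert(EstimateFrameworkFlash(5) == 3424, "core plus five tasks");
```

Under these assumptions, the fixed core cost dominates: five extra tasks add 240 bytes of framework overhead on top of the 3,184-byte core.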
Futures#
Futures are the core abstraction in
pw_async2, providing a standardized way of polling an asynchronous
operation to completion.
The design of futures has some implications for code size:
- All futures are templated on the type of value they produce, which means that the compiler must generate separate code for each type.
- Additionally, futures use CRTP for compile-time polymorphism, so each concrete future type is a distinct class and may duplicate common behavior.
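
To make the two points above concrete, here is a minimal standalone sketch of a CRTP future base templated on its value type. It does not use pw_async2: `PollResult`, `FutureBase`, `ReadyInt`, and `CountdownFloat` are hypothetical names invented for this illustration, and `std::optional` stands in for a real poll type.

```cpp
#include <optional>

// Hypothetical stand-in for a poll result; not pw_async2's own type.
// std::nullopt means "still pending".
template <typename T>
using PollResult = std::optional<T>;

// CRTP base: Pend() forwards to Derived::DoPend() without a virtual call.
// Every distinct (Derived, T) pair instantiates its own copy of Pend(),
// which is where the duplicated code comes from.
template <typename Derived, typename T>
class FutureBase {
 public:
  PollResult<T> Pend() {
    if (completed_) {
      return std::nullopt;  // Value already produced; nothing more to yield.
    }
    PollResult<T> result = static_cast<Derived&>(*this).DoPend();
    completed_ = result.has_value();
    return result;
  }

  bool is_complete() const { return completed_; }

 private:
  bool completed_ = false;
};

// Completes on the first poll.
class ReadyInt : public FutureBase<ReadyInt, int> {
 public:
  PollResult<int> DoPend() { return 42; }
};

// Stays pending for `polls` calls, then completes.
class CountdownFloat : public FutureBase<CountdownFloat, float> {
 public:
  explicit CountdownFloat(int polls) : remaining_(polls) {}

  PollResult<float> DoPend() {
    if (remaining_ > 0) {
      --remaining_;
      return std::nullopt;
    }
    return 1.5f;
  }

 private:
  int remaining_;
};
```

In this sketch the shared bookkeeping in `Pend()` is compiled once for `FutureBase<ReadyInt, int>` and again for `FutureBase<CountdownFloat, float>`, even though the logic is identical. A linker may fold some identical functions, but that depends on the toolchain.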
The following sections detail the code size of various future implementations and utilities.
ValueFuture#
ValueFuture is the simplest future type, used to return a single result from
an asynchronous operation. Its implementation contains effectively the minimal
code required for a future, making it a good baseline for understanding the size
cost of a future in pw_async2.
The table below shows the size of ValueFuture. The first row shows the base
cost of using a single ValueFuture. The second row adds another
ValueFuture with a different return type to demonstrate the incremental cost
of template specialization. The third row shows the size of VoidFuture
(alias for ValueFuture<void>), which is specialized to avoid storing a
value.
| Label | Segment | Delta |
|---|---|---|
| Size of | FLASH | +1,064 |
| Cost of additional | FLASH | +1,096 |
| Size of | FLASH | +888 |
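
The idea of specializing the `void` case to avoid storing a value can be sketched in plain C++. This is not Pigweed's implementation: `ValueSlot` is a hypothetical type that only illustrates how a full specialization drops the storage and the code that manages it.

```cpp
#include <optional>
#include <utility>

// Hypothetical single-value slot, loosely modeled on the idea of a future
// that resolves to one value. Not Pigweed's ValueFuture.
template <typename T>
class ValueSlot {
 public:
  void Resolve(T value) { value_ = std::move(value); }
  bool ready() const { return value_.has_value(); }
  const T& value() const { return *value_; }

 private:
  std::optional<T> value_;  // Storage for the result plus an engaged flag.
};

// Specialization for void: there is nothing to store, so only a flag remains.
template <>
class ValueSlot<void> {
 public:
  void Resolve() { ready_ = true; }
  bool ready() const { return ready_; }

 private:
  bool ready_ = false;
};

// The void specialization carries no value storage.
static_assert(sizeof(ValueSlot<void>) < sizeof(ValueSlot<int>),
              "void slot should be smaller than a value-carrying slot");
```

The same reasoning applies to code size: the `void` variant has no value to construct, move, or destroy, so fewer instructions are generated for it.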
OnceSender and OnceReceiver#
The next table shows sizes of the pair of OnceSender and OnceReceiver
types, which allow for returning a delayed result from an async function,
similar to a Future type in other languages. Both types are templated on the
stored value type, so each type sent through a sender/receiver pair adds
specialization overhead. The first row shows the base cost of using a
OnceSender and OnceReceiver; the second row adds another template
specialization on top of this to show the incremental cost.
| Label | Segment | Delta |
|---|---|---|
| Size of OnceSender / OnceReceiver | FLASH | +936 |
| | RAM | +8 |
| Cost of additional OnceSender / OnceReceiver template specialization | FLASH | +760 |
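
For readers unfamiliar with the pattern, a one-shot sender/receiver pair can be sketched as follows. This is a simplified illustration, not the pw_async2 API: `OneShotSlot`, `OneShotSender`, and `OneShotReceiver` are hypothetical names, the slot is owned by the caller, and there is no waker or synchronization.

```cpp
#include <optional>
#include <utility>

// Hypothetical caller-owned storage shared by one sender and one receiver.
template <typename T>
struct OneShotSlot {
  std::optional<T> value;
};

// Hypothetical sender: delivers at most one value into the slot.
template <typename T>
class OneShotSender {
 public:
  explicit OneShotSender(OneShotSlot<T>& slot) : slot_(&slot) {}

  // Stores the value and returns true on the first call. Later calls are
  // ignored and return false.
  bool Send(T value) {
    if (slot_ == nullptr) {
      return false;
    }
    slot_->value = std::move(value);
    slot_ = nullptr;
    return true;
  }

 private:
  OneShotSlot<T>* slot_;
};

// Hypothetical receiver: polls the slot for the value.
template <typename T>
class OneShotReceiver {
 public:
  explicit OneShotReceiver(OneShotSlot<T>& slot) : slot_(&slot) {}

  // Returns the value once it has been sent, or std::nullopt while pending.
  // The value is handed out only once.
  std::optional<T> Poll() {
    std::optional<T> out = std::move(slot_->value);
    slot_->value.reset();
    return out;
  }

 private:
  OneShotSlot<T>* slot_;
};
```

Because all three templates depend on `T`, sending a second value type through such a pair instantiates a second copy of every member function, which is the kind of specialization cost the second row of the table measures.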
async2 utilities#
Pigweed provides several utilities to simplify writing asynchronous code. Among
these are combinators which operate over several pendables, such as Join
which waits for all pendables to complete, and Select which waits for the
first pendable to complete.
The table below demonstrates the code size impact of using these utilities.
For both Join and Select, the report shows:
- The initial cost of using the utility with multiple pendables of the same type.
- The incremental cost of adding a second call with pendables of different types, which demonstrates the overhead of template specialization.
Additionally, the table includes a comparison showing the code size difference
between using the Select helper versus manually polling each pendable.
| Label | Segment | Delta |
|---|---|---|
| Size of calling Select() with several pendables of the same type | FLASH | +768 |
| Size of an additional call to Select() with pendables of different types | FLASH | +1,024 |
| Cost of using Select() versus manually polling each pendable | FLASH | +936 |
| Size of calling Join() with several pendables of the same type | FLASH | +656 |
| Size of an additional call to Join() with pendables of different types | FLASH | +928 |
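
The difference between a generic helper and manual polling can be illustrated with a small standalone sketch. It does not use pw_async2: `Countdown`, `PollManually`, and `SelectFirst` are hypothetical names, and the helper assumes every pendable yields the same result type, which is simpler than a real select combinator that must handle different result types.

```cpp
#include <optional>

// Hypothetical pendable: stays pending for `remaining` polls, then yields
// `value` on every later poll.
struct Countdown {
  int remaining;
  int value;

  std::optional<int> Pend() {
    if (remaining > 0) {
      --remaining;
      return std::nullopt;
    }
    return value;
  }
};

// Manual polling: straight-line code with no templates.
inline std::optional<int> PollManually(Countdown& a, Countdown& b) {
  if (auto result = a.Pend()) {
    return result;
  }
  return b.Pend();
}

// Generic helper: polls each pendable in order and returns the first ready
// result. The compiler generates one instantiation per distinct list of
// pendable types, which is the source of the per-call-site overhead.
// Assumption: all pendables produce the same result type.
template <typename First, typename... Rest>
auto SelectFirst(First& first, Rest&... rest) -> decltype(first.Pend()) {
  if (auto result = first.Pend()) {
    return result;
  }
  if constexpr (sizeof...(Rest) > 0) {
    return SelectFirst(rest...);
  } else {
    return std::nullopt;
  }
}
```

Both functions implement the same policy here. The generic version is more convenient for longer or mixed lists of pendables, at the cost of a fresh instantiation whenever the argument types change.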