2016-10-12



From ②ality – JavaScript and more.

The ECMAScript proposal “Asynchronous Iteration” by Domenic Denicola is currently at stage 3. This blog post explains how it works.

Asynchronous iteration

With ECMAScript 6, JavaScript got built-in support for synchronously iterating over data. But what about data that is delivered asynchronously? For example, lines of text, read asynchronously from a file or an HTTP connection.

This proposal brings support for that kind of data. Before we go into it, let’s first recap synchronous iteration.

Synchronous iteration

Synchronous iteration was introduced with ES6 and works as follows:

Iterable: an object that signals that it can be iterated over, via a method whose key is Symbol.iterator.

Iterator: an object returned by invoking [Symbol.iterator]() on an iterable. It wraps each iterated element in an object and returns it via its method next() – one at a time.

IteratorResult: an object returned by next(). Property value contains an iterated element, property done is true after the last element (value can usually be ignored then; it’s almost always undefined).

I’ll demonstrate via an Array:
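A minimal sketch of the protocol in action, calling the iterator of a two-element Array by hand (the variable names are only illustrative):

```javascript
const iterable = ['a', 'b'];
// Invoking the method keyed by Symbol.iterator returns an iterator
const iterator = iterable[Symbol.iterator]();

console.log(iterator.next()); // { value: 'a', done: false }
console.log(iterator.next()); // { value: 'b', done: false }
console.log(iterator.next()); // { value: undefined, done: true }
```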

Asynchronous iteration

The problem is that the previously explained way of iterating is synchronous; it doesn’t work for asynchronous sources of data. For example, in the following code, readLinesFromFile() cannot deliver its data asynchronously:
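The following sketch shows the shape of such code. The body of readLinesFromFile() is a hypothetical stand-in that splits a hard-coded string; a real version would read from disk. Either way, a sync generator must have each line ready at the moment next() is called:

```javascript
// Hypothetical stand-in: the file name is ignored and the contents are hard-coded.
function* readLinesFromFile(fileName) {
    const contents = 'first line\nsecond line'; // pretend this came from fileName
    yield* contents.split('\n');
}

// for-of calls next() and expects an IteratorResult right away,
// so there is no point at which the producer could wait for I/O
for (const line of readLinesFromFile('demo.txt')) {
    console.log(line);
}
```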

The proposal specifies a new protocol for iteration that works asynchronously:

Async iterables are marked via Symbol.asyncIterator.

Method next() of an async iterator returns Promises for IteratorResults (vs. IteratorResults directly).

You may wonder whether it would be possible to instead use a synchronous iterator that returns one Promise for each iterated element. But that is not enough – whether or not iteration is done is generally determined asynchronously.

Using an asynchronous iterable looks as follows. Function createAsyncIterable() is explained later. It converts its synchronously iterable parameter into an async iterable.
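A sketch of such manual use, with Promise chaining. So that the snippet runs on its own, it includes an assumed definition of createAsyncIterable() based on an async generator:

```javascript
// Assumed helper: wraps a sync iterable in an async iterable
async function* createAsyncIterable(syncIterable) {
    for (const elem of syncIterable) {
        yield elem;
    }
}

const asyncIterable = createAsyncIterable(['a', 'b']);
const asyncIterator = asyncIterable[Symbol.asyncIterator]();

asyncIterator.next()
.then(iterResult1 => {
    console.log(iterResult1); // { value: 'a', done: false }
    return asyncIterator.next();
})
.then(iterResult2 => {
    console.log(iterResult2); // { value: 'b', done: false }
    return asyncIterator.next();
})
.then(iterResult3 => {
    console.log(iterResult3); // { value: undefined, done: true }
});
```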

Within an asynchronous function, you can process the results of the Promises via await and the code becomes simpler:
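The same steps, sketched inside an async function. The function name f() is arbitrary, and createAsyncIterable() is again an assumed helper:

```javascript
// Assumed helper: wraps a sync iterable in an async iterable
async function* createAsyncIterable(syncIterable) {
    for (const elem of syncIterable) {
        yield elem;
    }
}

async function f() {
    const asyncIterable = createAsyncIterable(['a', 'b']);
    const asyncIterator = asyncIterable[Symbol.asyncIterator]();
    console.log(await asyncIterator.next()); // { value: 'a', done: false }
    console.log(await asyncIterator.next()); // { value: 'b', done: false }
    console.log(await asyncIterator.next()); // { value: undefined, done: true }
}
f();
```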

The interfaces for async iteration

In TypeScript notation, the interfaces look as follows.
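As a plain-JavaScript illustration of the shapes these interfaces describe, here is a hand-written object that plays both roles. The name countTo() is hypothetical and not part of the proposal:

```javascript
// Returns an object that is both an async iterable and its own async iterator
function countTo(limit) {
    let current = 0;
    return {
        // AsyncIterable: a method keyed by Symbol.asyncIterator returns an async iterator
        [Symbol.asyncIterator]() {
            return this;
        },
        // AsyncIterator: next() returns a Promise for an IteratorResult
        next() {
            current++;
            const result = current <= limit
                ? { value: current, done: false }
                : { value: undefined, done: true };
            return Promise.resolve(result);
        },
    };
}
```

Calling countTo(3)[Symbol.asyncIterator]().next() returns a Promise that is fulfilled with { value: 1, done: false }.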

for-await-of

The proposal also specifies an asynchronous version of the for-of loop: for-await-of:
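A short sketch of the loop, with createAsyncIterable() once more as an assumed helper:

```javascript
// Assumed helper: wraps a sync iterable in an async iterable
async function* createAsyncIterable(syncIterable) {
    for (const elem of syncIterable) {
        yield elem;
    }
}

async function f() {
    for await (const x of createAsyncIterable(['a', 'b'])) {
        console.log(x);
    }
}
f();
// Output:
// a
// b
```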

for-await-of and rejections

Similarly to how await works in async functions, the loop throws an exception if next() returns a rejection:
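A sketch of that behavior. createRejectingIterable() is a hypothetical helper whose next() always returns a rejected Promise:

```javascript
// Hypothetical helper: an async iterable whose iterator always rejects
function createRejectingIterable() {
    return {
        [Symbol.asyncIterator]() {
            return this;
        },
        next() {
            return Promise.reject(new Error('Problem!'));
        },
    };
}

(async function () { // (A)
    try {
        for await (const x of createRejectingIterable()) {
            console.log(x); // never reached
        }
    } catch (e) {
        console.error(e.message); // Problem!
    }
})(); // (B)
```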

Note that we have just used an Immediately Invoked Async Function Expression (IIAFE, pronounced “yaffee”). It starts in line (A) and ends in line (B). We need to do that because for-await-of doesn’t work at the top level of modules and scripts. It does work everywhere where await can be used, namely in async functions and async generators (which are explained later).

for-await-of and sync iterables

for-await-of can also be used to iterate over sync iterables:
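For instance, an Array can be the operand of the loop. In this sketch, main() is an arbitrary name:

```javascript
async function main() {
    const seen = [];
    // The Array is a sync iterable; for-await-of converts it automatically
    for await (const x of ['a', 'b']) {
        seen.push(x);
    }
    return seen;
}
main().then(seen => console.log(seen)); // [ 'a', 'b' ]
```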

Asynchronous generators

Normal (synchronous) generators help with implementing synchronous iterables. Asynchronous generators do the same for asynchronous iterables.

For example, we have previously used the function createAsyncIterable(syncIterable) which converts a syncIterable into an asynchronous iterable. This is how you would implement this function via an async generator:
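A minimal sketch of such an implementation, followed by a short usage check:

```javascript
async function* createAsyncIterable(syncIterable) {
    for (const elem of syncIterable) {
        // Each yield fulfills the Promise returned by one next() call
        yield elem;
    }
}

// Usage: the generator object is an async iterable and its own async iterator
const genObj = createAsyncIterable(['a', 'b']);
genObj.next().then(result => console.log(result)); // { value: 'a', done: false }
```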

Note the asterisk after function:

A normal function is turned into a normal generator by putting an asterisk after function.

An async function is turned into an async generator by doing the same.

How do async generators work?

A normal generator returns a generator object genObj. Each invocation genObj.next() returns an object {value,done} that wraps a yielded value.

An async generator returns a generator object genObj. Each invocation genObj.next() returns a Promise for an object {value,done} that wraps a yielded value.

Queuing next() invocations

The JavaScript engine internally queues invocations of next() and feeds them to an async generator once it is ready. That is, after calling next(), you can call it again right away; you don’t have to wait for the Promise it returns to be settled. In most cases, though, you do want to wait for the settlement, because you need the value of done in order to decide whether to call next() again or not. That’s how the for-await-of loop works.

Use cases for calling next() several times without waiting for settlements include:

Use case: Retrieving Promises to be processed via Promise.all(). If you know how many elements there are in an async iterable, you don’t need to check done.

Use case: Async generators as sinks for data, where you don’t always need to know when they are done.
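The Promise.all() use case can be sketched as follows, assuming we know that the async iterable has exactly two elements. createAsyncIterable() is an assumed helper and main() an arbitrary name:

```javascript
// Assumed helper: wraps a sync iterable in an async iterable
async function* createAsyncIterable(syncIterable) {
    for (const elem of syncIterable) {
        yield elem;
    }
}

async function main() {
    const asyncGenObj = createAsyncIterable(['a', 'b']);
    // Both next() calls are made before either Promise is settled;
    // the requests are queued and answered in order
    const [result1, result2] = await Promise.all([
        asyncGenObj.next(),
        asyncGenObj.next(),
    ]);
    return [result1.value, result2.value];
}
main().then(values => console.log(values)); // [ 'a', 'b' ]
```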

Acknowledgement: Thanks to @domenic and @zenparsing for these use cases.

await in async generators

You can use await and for-await-of inside async generators. For example:
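One possible example is an async generator that prefixes each line it receives. The names prefixLines() and collect() are illustrative:

```javascript
async function* prefixLines(asyncIterable) {
    // for-await-of consumes the input, yield produces the output
    for await (const line of asyncIterable) {
        yield '> ' + line;
    }
}

// Illustrative helper that gathers all elements in an Array
async function collect(asyncIterable) {
    const result = [];
    for await (const x of asyncIterable) {
        result.push(x);
    }
    return result;
}

collect(prefixLines(['first', 'second']))
.then(lines => console.log(lines)); // [ '> first', '> second' ]
```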

One interesting aspect of combining await and yield is that await can’t stop yield from returning a Promise, but it can stop that Promise from being settled:
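A sketch of such a generator. doSomethingAsync() is a hypothetical function; here it simply fulfills with 'ok' after a short delay:

```javascript
// Hypothetical async operation
function doSomethingAsync() {
    return new Promise(resolve => setTimeout(() => resolve('ok'), 10));
}

async function* asyncGenerator() {
    console.log('Start');
    const result = await doSomethingAsync(); // (A)
    yield 'Result: ' + result; // (B)
    console.log('Done');
}

const genObj = asyncGenerator();
// next() hands back a Promise at once; it is settled only after line (A) is done
const promise = genObj.next();
promise.then(x => console.log(x)); // { value: 'Result: ok', done: false }
```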

Let’s take a closer look at line (A) and (B):

The yield in line (B) fulfills a Promise. That Promise is returned by next() immediately.

Before that Promise is fulfilled, the operand of await (the Promise returned by doSomethingAsync() in line (A)) must be fulfilled.

That means that these two lines correspond (roughly) to this code:
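A rough sketch of that correspondence, wrapped in a function so that it can run on its own. The name firstNext() is illustrative, doSomethingAsync() is again hypothetical, and forwarding failures via reject is an assumption about what the engine does:

```javascript
// Hypothetical async operation
function doSomethingAsync() {
    return new Promise(resolve => setTimeout(() => resolve('ok'), 10));
}

// Roughly what the first next() call does for lines (A) and (B)
function firstNext() {
    return new Promise((resolve, reject) => {
        doSomethingAsync()
        .then(result => {
            resolve({
                value: 'Result: ' + result,
                done: false,
            });
        })
        .catch(reject); // a failed await becomes a rejection
    });
}
```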

If you want to dig deeper – this is how one would implement the functionality of asyncGenerator via a normal function:

yield* in async generators

yield* in async generators works analogously to how it works in normal generators – like a recursive invocation:

In line (A), gen2() calls gen1(), which means that all elements yielded by gen1() are yielded by gen2():
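A sketch of the two generators and of a loop that consumes gen2(). The return value 2 is only there to show what yield* evaluates to:

```javascript
async function* gen1() {
    yield 'a';
    yield 'b';
    return 2;
}
async function* gen2() {
    const result = yield* gen1(); // (A)
    // result is the return value of gen1(), i.e. 2
}

(async function () {
    for await (const x of gen2()) {
        console.log(x);
    }
})();
// Output:
// a
// b
```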

The operand of yield* can be any async iterable. Sync iterables are automatically converted to async iterables, just like for for-await-of.

Errors

In normal generators, next() can throw exceptions. In async generators, next() can reject the Promise it returns:
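A minimal sketch of an async generator that throws:

```javascript
async function* asyncGenerator() {
    // Thrown inside the generator, delivered to the caller as a rejection
    throw new Error('Problem!');
}

asyncGenerator().next()
.catch(err => console.log(err.message)); // Problem!
```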

Converting exceptions to rejections is similar to how async functions work.

Async function vs. async generator function

Async function:

Returns immediately with a Promise.

That Promise is fulfilled via return and rejected via throw.

Async generator function:

Returns immediately with an async iterable.

Every invocation of next() returns a Promise. yield x fulfills the “current” Promise with {value: x, done: false}. throw err rejects the “current” Promise with err.

Examples

The source code for the examples is available via the repository async-iter-demo on GitHub.

Using asynchronous iteration via Babel

The example repo uses babel-node to run its code. This is how it configures Babel in its package.json:

Example: turning an async iterable into an Array

Function takeAsync() collects all elements of asyncIterable in an Array. I don’t use for-await-of in this case; instead, I invoke the async iteration protocol manually. I also don’t close asyncIterable if I’m finished before the iterable is done.
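A sketch of what takeAsync() might look like, consistent with the description above. The parameter count and its default value are assumptions, as is the helper createAsyncIterable() used in the usage line:

```javascript
async function takeAsync(asyncIterable, count = Infinity) {
    const result = [];
    // Manual use of the protocol instead of for-await-of
    const iterator = asyncIterable[Symbol.asyncIterator]();
    while (result.length < count) {
        const { value, done } = await iterator.next();
        if (done) break;
        result.push(value);
    }
    // The iterator is deliberately not closed if we stop early
    return result;
}

// Assumed helper: wraps a sync iterable in an async iterable
async function* createAsyncIterable(syncIterable) {
    for (const elem of syncIterable) {
        yield elem;
    }
}

takeAsync(createAsyncIterable(['a', 'b', 'c']), 2)
.then(arr => console.log(arr)); // [ 'a', 'b' ]
```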

This is the test for takeAsync():

Note how nicely async functions work together with the mocha test framework: for asynchronous tests, the second parameter of test() can return a Promise.

Example: a queue as an async iterable

The example repo also has an implementation for an asynchronous queue, called AsyncQueue. Its implementation is relatively complex, which is why I don’t show it here. This is the test for AsyncQueue:
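To give a feeling for how such a queue can be used, here is a heavily simplified sketch. It is not the implementation from the example repo; the method names enqueue() and close() and all internals are assumptions:

```javascript
// Simplified sketch of an asynchronously iterable queue
class AsyncQueue {
    constructor() {
        this._values = [];   // enqueued values that nobody has asked for yet
        this._settlers = []; // resolve functions of next() calls that are still waiting
        this._closed = false;
    }
    [Symbol.asyncIterator]() {
        return this;
    }
    enqueue(value) {
        if (this._closed) {
            throw new Error('Queue is closed');
        }
        if (this._settlers.length > 0) {
            // A consumer is already waiting: hand the value over directly
            this._settlers.shift()({ value, done: false });
        } else {
            this._values.push(value);
        }
    }
    close() {
        this._closed = true;
        // Consumers that are still waiting will not receive any more values
        while (this._settlers.length > 0) {
            this._settlers.shift()({ value: undefined, done: true });
        }
    }
    next() {
        if (this._values.length > 0) {
            return Promise.resolve({ value: this._values.shift(), done: false });
        }
        if (this._closed) {
            return Promise.resolve({ value: undefined, done: true });
        }
        return new Promise(resolve => this._settlers.push(resolve));
    }
}

// Usage: values can be enqueued before or after a consumer asks for them
async function demo() {
    const queue = new AsyncQueue();
    queue.enqueue('a');
    const pending = queue.next(); // already has its value
    const waiting = queue.next(); // waits for the next enqueue()
    queue.enqueue('b');
    queue.close();
    return [(await pending).value, (await waiting).value];
}
demo().then(values => console.log(values)); // [ 'a', 'b' ]
```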

Example: reading text lines asynchronously

Let’s implement code that reads text lines asynchronously. We’ll do it in three steps.

Step 1: read text data in chunks via the Node.js ReadStream API (which is based on callbacks) and push it into an AsyncQueue (which was introduced in the previous section).

Step 2: Use for-await-of to iterate over the chunks of text and yield lines of text.
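A sketch of what this step might look like. The name splitLines() comes from the text; its parameter name and the handling of a final line without a line break are assumptions. In the usage line, an Array of chunks stands in for the queue:

```javascript
async function* splitLines(chunksAsync) {
    let previous = ''; // text received so far that is not yet a complete line
    for await (const chunk of chunksAsync) {
        previous += chunk;
        let eolIndex;
        while ((eolIndex = previous.indexOf('\n')) >= 0) {
            // Yield everything before the line break, keep the rest
            yield previous.slice(0, eolIndex);
            previous = previous.slice(eolIndex + 1);
        }
    }
    if (previous.length > 0) {
        yield previous; // last line without a trailing line break
    }
}

(async function () {
    for await (const line of splitLines(['ab\ncd', 'e\nf'])) {
        console.log(line);
    }
})();
// Output:
// ab
// cde
// f
```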

Step 3: combine the two previous functions. We first feed chunks of text into a queue via readFile() and then convert that queue into an async iterable over lines of text via splitLines().

Lastly, this is how you’d use readLines() from within a Node.js script:

The specification of asynchronous iteration

The spec introduces several new concepts and entities:

Two new interfaces, AsyncIterable and AsyncIterator

New well-known intrinsic objects: %AsyncGenerator%, %AsyncFromSyncIteratorPrototype%, %AsyncGeneratorFunction%, %AsyncGeneratorPrototype%, %AsyncIteratorPrototype%.

One new well-known symbol: Symbol.asyncIterator

No new global variables are introduced by this feature.

Async generators

If you want to understand how async generators work, it’s best to start with Sect. “AsyncGenerator Abstract Operations”. Many things are straightforward; most difficult to understand is how queueing works.

Each async generator manages a queue of pending Promises in the internal property [[AsyncGeneratorQueue]]. Each entry contains two fields:

[[Completion]]: the parameter of next(), throw() or return() that led to the entry being enqueued. The type of the completion (normal, throw, return) determines what happens after dequeuing.

[[Capability]]: the PromiseCapability of the pending Promise.

The queue is managed mainly via two operations:

Enqueuing happens via AsyncGeneratorEnqueue(). This is the operation that is called by next(), return() and throw(). It adds an entry to the AsyncGeneratorQueue. AsyncGeneratorResumeNext() is only called if the generator isn’t already executing. That means if a generator calls next(), return() or throw() from inside itself then the effects of that call will be delayed.

Dequeuing happens via AsyncGeneratorResumeNext(). The current Promise is always the first element of the queue. If the async generator is currently suspended, it is resumed and continues to run. The current Promise is later settled via AsyncGeneratorResolve() or AsyncGeneratorReject(). AsyncGeneratorResumeNext() is invoked after enqueuing, but also after settling a previous Promise, because there may now be new queued pending Promises, allowing execution to continue. If the generator is already completed, this operation calls AsyncGeneratorResolve() and AsyncGeneratorReject() itself, meaning that all queued pending Promises will eventually be settled.

Async-from-Sync Iterator Objects

To get an async iterator from an object iterable, you call GetIterator(iterable, async) (async is a symbol). If iterable doesn’t have a method [Symbol.asyncIterator](), GetIterator() retrieves a sync iterator via method iterable[Symbol.iterator]() and converts it to an async iterator via CreateAsyncFromSyncIterator().

The for-await-of loop

for-await-of works almost exactly like for-of, but there is an await whenever the contents of an IteratorResult are accessed. You can see that by looking at Sect. “Runtime Semantics: ForIn/OfBodyEvaluation”. Notably, iterators are closed similarly, via IteratorClose(), towards the end of this section.

Is async iteration worth it?

Now that I’ve used asynchronous iteration a little, I’ve come to a few conclusions. These conclusions are evolving, so let me know if you disagree with anything.

Promises have become the primitive building block for everything async in JavaScript. And it’s a joy to see how everything is constantly improving: more and more APIs are using Promises, stack traces are getting better, performance is increasing, using them via async functions is great, etc.

We need some kind of support for asynchronous sequences of data (which are different from streams!). That much is obvious. At the moment, the situation is similar to one-time async data before Promises: various patterns exist, but there is no clear standard and little interoperability. Additionally, processing async data is clumsy, similarly to how using Promises directly is clumsy compared to using them via async functions.

Async iteration brings with it considerable additional cognitive load:

There are now normal functions, generator functions, async functions and async generator functions. And each kind of function exists as declaration, expression and method. Knowing what to use when is becoming more complicated.

Async iteration combines iteration with Promises. Individually, each of the two patterns takes a while to fully figure out (especially if you include generators under iteration). Combining them significantly increases the learning curve.

Operations are missing: Sync iterables can be converted to Arrays via the spread operator and accessed via Array destructuring. There are no equivalent operations for async iterables.

Converting legacy APIs to async iteration isn’t easy. A queue that is asynchronously iterable helps, but the pieces don’t fit together as neatly as I would like. In comparison, going from callbacks to Promises is quite elegant.

Next, we’ll look at two alternatives to async iteration for processing async data.

Alternative 1: Communicating Sequential Processes (CSP)

The following code demonstrates the CSP library js-csp:

player defines a “process” that is instantiated twice (in line (B) and in line (C), via csp.go()). The processes are connected via the “channel” table, which is created in line (A) and passed to player via its second parameter. A channel is basically a queue.

How does CSP compare to async iteration?

Its coding style is also synchronous.

Channels feel like a good abstraction for producing and consuming async data.

Making the connections between processes explicit, as channels, means that you can configure how they work (how much is buffered, when to block, etc.).

The abstraction “channel” works for many use cases: communicating with web workers, distributed programming, etc.

Alternative 2: Reactive Programming

The following code demonstrates Reactive Programming via the JavaScript library RxJS:

In line (A), we create a stream of click events via fromEvent(). These events are then filtered so that there is at most one event per second. Every time there is an event, scan() counts how many events there have been, so far. In the last line, we log all counts.

How does Reactive Programming compare to async iteration?

The coding style is completely different, but has some similarities to Promises.

Chaining operations (such as throttle()) to process streams is quite elegant.

Further reading

Iterables and iterators (chapter on sync iteration in “Exploring ES6”)

Generators (chapter on sync generators in “Exploring ES6”)

ES proposal: async functions (blog post)
