async/await and hidden races

JavaScript has become one of my favorite languages. This is no secret to those who know me in a professional capacity. It used to be one of my most hated languages until I spent some time with it, over a dozen years ago. When I have to use other languages, I almost always find myself missing some feature from JavaScript.

Promises are one of the most straightforward and flexible implementations of a future that I’ve seen in a language. Aside from helping manage callback hell and allowing errors to propagate effortlessly, they generally do a very good job of making it abundantly clear when you are giving control back to the event loop: at the end of your handler function. This bright line makes the code flow of promise chains fairly easy to reason about.

Then, along come async functions, which promise (hah!) to make your asynchronous code even more readable! This is mostly true. However, today I stumbled upon something that I have not dealt with in JavaScript in a long time… quite possibly ever.

A race condition.

In a language without threads, race conditions are exceedingly rare because the environment cannot preempt a line of execution to give control to something else. You are absolutely in control of when you yield control back to the event loop.

You still are in control with async functions, but it’s substantially easier to accidentally make an incorrect assumption. Let’s start with a (mostly) real-world example of some node.js code that I was working with. It’s a fairly straightforward task: given an async iterator that produces file paths, return a promise for the sum of their sizes.

I’m using the async library to run this operation with limited concurrency. If the generator produces tens of thousands of files, we don’t want to read their sizes all at once. Also, we don’t want to produce a list of sizes and then sum that. Since async’s reduce function can’t operate concurrently, the most efficient way to approach this is to use eachLimit() and add each size to an accumulator. It’s not clean functional programming, but it works.

Using promises, we might do this:

const asyncify = require('async/asyncify');
const eachLimit = require('async/eachLimit');
const fs = require('fs').promises;

function diskUsage(pathGenerator) {
  let total = 0;

  return eachLimit(
    pathGenerator,
    10,
    asyncify(path =>
      fs.stat(path)
      .then(ent => { total += ent.size; })
    )
  )
  .then(() => total);
}

This is pretty straightforward, and it works. Now let’s rewrite this to use async functions. In particular, await can be used in the middle of an expression, so we can use that to shorten our iteration function … right?

const asyncify = require('async/asyncify');
const eachLimit = require('async/eachLimit');
const fs = require('fs').promises;

async function diskUsage(pathGenerator) {
  let total = 0;

  await eachLimit(
    pathGenerator,
    10,
    async path => { total += (await fs.stat(path)).size; }
  );

  return total;
}

If you run this, you might be surprised to get a different answer. What is going on here?

The answer, while simple and understandable, is a bit depressing. Recall that x += y; means the same thing as x = x + y;. In JavaScript, operator precedence decides how an expression is grouped, but the operands themselves are always evaluated left to right. The statement x += y; is thus broken up into four operations:

  1. Evaluate x and remember the result.
  2. Evaluate y and remember the result.
  3. Add the two results.
  4. Store the result in x.

Now, what happens if y contains an await expression? Step 2 is suspended until the awaited promise is fulfilled.

By now, you can probably see where I’m going with this: two calls to the async iteratee function both evaluated x (total in this case) and remembered its value, then awaited the fs.stat() promise. When the first promise was fulfilled, the resulting file size was added to the previously remembered value of total and then stored in total. When the second promise was fulfilled, its remembered value of total was no longer equal to the value in total, so the second store overwrote the first update!

The expression x += await y; is inherently dangerous if the value of x can be updated while the awaited promise is still pending.
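To see the lost updates in isolation, here is a minimal sketch that swaps fs.stat() for a timer. The fakeStat helper and the racyTotal function are made up for this illustration, and Promise.all stands in for eachLimit(); none of them are part of the code above.

```javascript
// Hypothetical stand-in for fs.stat(): resolves with { size } after delayMs.
function fakeStat(size, delayMs) {
  return new Promise(resolve => setTimeout(() => resolve({ size }), delayMs));
}

async function racyTotal(sizes) {
  let total = 0;

  await Promise.all(sizes.map(async (size, i) => {
    // Every call reads `total` (still 0) before its await suspends it,
    // so each store below overwrites the previous one.
    total += (await fakeStat(size, 10 * (i + 1))).size;
  }));

  return total;
}

racyTotal([1, 2, 3]).then(total => console.log(total)); // logs 3, not 6
```

All three calls start before any timer fires, so each one remembers total as 0. The last promise to settle wins, and the earlier additions are silently dropped.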

The Babel transpiler preserves this behavior exactly. Given this input:

let x = 1;

async function foo() {
  x += await bar();
}

The following state machine is produced:

switch (_context.prev = _context.next) {
  case 0:
    _context.t0 = x;
    _context.next = 3;
    return bar();

  case 3:
    x = _context.t0 += _context.sent;

  case 4:
  case "end":
    return _context.stop();
}

Async functions are incredibly powerful and can make your code much easier to follow, but can also make it harder to follow! Be exceedingly careful when using await in the middle of expressions! Know which subexpressions are evaluated before and after await.

One of the huge advantages of promises is that .then provides a totally clear and unmistakable boundary, and can’t be used mid-expression in the same way that await can. If you use syntax highlighting, I would strongly suggest giving await a style that is hard to miss.

When in doubt, always assign the result of an await expression to a variable. x += await y; should become const result = await y; x += result; in this case.
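As a rough sketch of that advice, the following applies it to a timer-based stand-in for fs.stat(). The fakeStat helper and the safeTotal function are hypothetical names invented for this example, and Promise.all is used in place of eachLimit() so that the snippet runs on its own.

```javascript
// Hypothetical stand-in for fs.stat(): resolves with { size } after delayMs.
function fakeStat(size, delayMs) {
  return new Promise(resolve => setTimeout(() => resolve({ size }), delayMs));
}

async function safeTotal(sizes) {
  let total = 0;

  await Promise.all(sizes.map(async (size, i) => {
    // Await first, on its own line...
    const ent = await fakeStat(size, 10 * (i + 1));
    // ...then read and update `total` in one synchronous step.
    total += ent.size;
  }));

  return total;
}

safeTotal([1, 2, 3]).then(total => console.log(total)); // logs 6
```

Because nothing can run between the read and the write of total on the second line, each update sees the result of every update before it.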
