Skip to content Skip to sidebar Skip to footer

Why Is Using A Generator Function Slower Than Filling And Iterating An Array In This Example?

A Tale of Two Functions I have one function that fills an array up to a specified value: function getNumberArray(maxValue) { const a = []; for (let i = 0; i < maxValue;

Solution 1:

The terribly unsatisfying answer is probably this: Your ES5 function relies on features that (with the exceptions of let and const) have been in V8 since it was released in 2008 (and presumably for some time before, as I understand that what became V8 originated as part of Google's web crawler). Generators, on the other hand, have only been in V8 since 2013. So not only has the ES5 code had seven years to be optimized while the ES6 code has had only two, almost nobody (compared to the many millions of sites using code just like your ES5 code) is using generators in V8 yet, which means there has been very little opportunity to discover, or incentive to implement, optimizations for it.

If you really want a technical answer as to why generators are comparatively slow in Node.js, you'll probably have to dive into the V8 source yourself, or ask the people who wrote it.

Solution 2:

FYI this question is ancient in internet terms and generators have caught up (at least when tested in Chrome) https://jsperf.com/generator-vs-loops1

Solution 3:

Try replacing the 'let' in the generator function with a function scoped 'var'. It seems that the 'let' inside the loop incurs a lot of overhead. Only use let if you absolutely have to.

Solution 4:

In fact, running this benchmark now, generators at ~2x faster.

I've modified the code slightly (moved let i) and here is the full gist: https://gist.github.com/antonkatz/6b0911c85ddadae39c434bf8ac32b340

On my machine, these are the results:

Running Array... Total time: 4,022ms Rss: 2,728,939,520 Heap Total: 2,726,199,296 Heap Used: 2,704,236,368

Running Generator... Total time: 2,541ms Rss: 851,968 Heap Total: 0 Heap Used: -5,073,968

I was very curious myself and could not find a proper answer. Thanks @David for providing the test code.

Solution 5:

In the OP's example, the generator will always be slower

While JS engine authors will continue working to improve performance, there are some underlying structural realities that virtually guarantee that, for the OP's test case, building the array will always be faster than using an iterator.

Consider that a generator function returns a generator object.

A generator object will, by definition, have a next()function, and calling a function in Javascript means adding an entry to your call stack. While this is fast, it's likely never going to be as fast as direct property access. (At least, not until the singularity.)

So if you are going to iterate over every single element in a collection, then a for loop over a simple array, which accesses elements via direct property access, is always going to be faster than a for loop over an iterator, which accesses each element via a call to the next() function.

As I'm writing this in January of 2022, running Chrome 97, the generator function is 60% slower than the array function using the OP's example.

Performance is use-case-dependent

It's not difficult to imagine scenarios where the generator would be faster. The major downside to the array function is that it must build the entire collection before the code can start iterating over the elements, whether or not you need all the elements.

Consider a basic search operation which will only access, on average, half the elements of the collection. In this scenario, the array function exposes its "Achilles' heel": it must build an array with all the results, even though half will never be seen. This is where a generator has the potential to far outstrip the array function.

To demonstrate this, I slightly modified the OP's use-case. I made the elements of the array slightly more expensive to calculate (with a little division and square root logic) and modified the loop to terminate at about the halfway mark (to mimic a basic search).

Setup

functiongetNumberArray(maxValue) {
    const a = [];

    for (let i = 0; i < maxValue; i++) {
        const half = i / 2;
        const double = half * 2;
        const root = Math.sqrt(double);
        const square = Math.round(root * root);
        a.push(square);
    }

    return a;
}

function* getNumberGenerator(maxValue) {
    for (let i = 0; i < maxValue; i++) {
        const half = i / 2;
        const double = half * 2;
        const root = Math.sqrt(double);
        const square = Math.round(root * root);
        yield square;
    }
}
let dummyCalculation;
const numIterations = 99999;
const searchNumber = numIterations / 2;

Generator

const iterator = getNumberGenerator(numIterations);
for (let val of iterator) {
    dummyCalculation = numIterations - val;
    if (val > searchNumber) break;
}

Array

const iterator = getNumberArray(numIterations);
for (let val of iterator) {
    dummyCalculation = numIterations - val;
    if (val > searchNumber) break;
}

With this code, the two approaches are neck-and-neck. After repeated test runs, the generator and array functions trade first and second place. It's not difficult to imagine that if the elements of the array were even more expensive to calculate (for example, cloning a complex object, making a REST callout, etc), then the generator would win easily.

Considerations beyond performance

While recognizing that the OP's question is specifically about performance, I think it's worth calling out that generator functions were not primarily developed as a faster alternative to looping over arrays.

Memory efficiency

The OP has already acknowledged this, but memory efficiency is one of the main benefits that generators provide over building arrays. Generators can build objects on the fly and then discard them when they are no longer needed. In its most ideal implementation, a generator need only hold one object in memory at a time, while an array must hold all of them simultaneously.

For a very memory-intensive collection, a generator would allow the system to build objects as they are needed and then reclaim that memory when the calling code moves on to the next element.

Representation of non-static collections

Generators don't have to resolve the entire collection, which free them up to represent collections that might not exist entirely in memory.

For example, a generator can represent collections where the logic to fetch the "next" item is time-consuming (such as paging over the results of a database query, where items are fetched in batches) or state-dependent (such as iterating over a collection where operations on the current item affect which item is "next") or. These are scenarios where building an array would be either impractical or impossible.

A generator could be used, for example, to return endless random data, or to represent game levels where each iteration is based in part on the previous, or even to represent a theoretical AI's "stream of consciousness" (for example, playing a word association game). These are interesting scenarios that would not be possible to represent using a standard array or list, but where a loop structure might feel more natural in code.

Post a Comment for "Why Is Using A Generator Function Slower Than Filling And Iterating An Array In This Example?"