stream: Readable batched iteration #34207

Open
@ronag

This is a continuation of #34035 and of the promises session we had at OpenJS about the async iteration performance of streams. One alternative discussed was to batch reads.

I was thinking we could do something along the lines of:

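// Sketch intended for Node.js core: `nop` is assumed to be a no-op function
// and `destroyImpl` the internal stream destroy helper; neither is defined here.
async function* createBatchedAsyncIterator(stream, batchLen) {
  let callback = nop;

  // Doubles as event listener and Promise executor. Called as a listener
  // (`this === stream`), it wakes the pending waiter. Called as an executor,
  // it stores `resolve` so that the next stream event can call it.
  function next(resolve) {
    if (this === stream) {
      callback();
      callback = nop;
    } else {
      callback = resolve;
    }
  }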

  stream
    .on('readable', next)
    .on('error', next)
    .on('end', next)
    .on('close', next);

  try {
    const state = stream._readableState;
    while (true) {
      let buffer;
      while (true) {
        const chunk = stream.read();
        if (chunk === null) break;
        if (!buffer) buffer = [];
        buffer.push(chunk);
        if (batchLen && buffer.length >= batchLen) break;
      }
      if (buffer) {
        yield buffer;
      } else if (state.errored) {
        throw state.errored;
      } else if (state.ended) {
        break;
      } else if (state.closed) {
        // TODO(ronag): ERR_PREMATURE_CLOSE?
        break;
      } else {
        await new Promise(next);
      }
    }
  } catch (err) {
    destroyImpl.destroyer(stream, err);
    throw err;
  } finally {
    destroyImpl.destroyer(stream, null);
  }
}

Readable.batched = function (stream, batchLen) {
  return createBatchedAsyncIterator(stream, batchLen);
};

Which would make the following possible:

// Concurrency
for await (const requests of Readable.batched(stream, 128)) {
  // Process each batch in parallel, with at most 128 requests in flight.
  await Promise.all(requests.map(dispatch));
}

// Speed
for await (const requests of Readable.batched(stream, 128)) {
  for (const request of requests) {
    // All in the same tick
  }
}

It's still not perfect: if one element takes very long, the next batch cannot start until it finishes, which reduces the effective concurrency. However, it would still be a step forward, and it would also reduce the async iteration overhead.
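
To experiment with the idea outside of core, the proposal can be roughly approximated with public APIs only. The following is a minimal sketch, not the proposed implementation: the helper name `batched` is hypothetical, it assumes Node.js 18 or later (for `stream.errored`), and it ignores the premature-close question raised in the TODO above.

```javascript
const { Readable } = require('node:stream');

// Hypothetical userland helper (not part of Node.js): yields arrays of up to
// `batchLen` chunks and waits for stream events only when nothing is buffered.
async function* batched(stream, batchLen) {
  let wake = null; // resolver of the pending wait, if any

  const onEvent = () => {
    if (wake) {
      wake();
      wake = null;
    }
  };
  // Listeners stay attached for the whole iteration (as in the sketch above),
  // so an 'error' emitted while the consumer is busy is not left unhandled.
  for (const name of ['readable', 'end', 'close', 'error']) {
    stream.on(name, onEvent);
  }

  try {
    while (true) {
      const batch = [];
      let chunk;
      // Synchronously drain whatever is buffered right now.
      while ((chunk = stream.read()) !== null) {
        batch.push(chunk);
        if (batchLen && batch.length >= batchLen) break;
      }
      if (batch.length > 0) {
        yield batch;
      } else if (stream.errored) {
        throw stream.errored;
      } else if (stream.readableEnded || stream.destroyed) {
        return;
      } else {
        // Nothing buffered yet: sleep until the next stream event.
        await new Promise((resolve) => { wake = resolve; });
      }
    }
  } finally {
    // Runs on normal completion, on error, and on early `break` by the consumer.
    stream.destroy();
  }
}

// Usage: batch sizes depend on how much the stream has buffered at each read.
(async () => {
  for await (const batch of batched(Readable.from([1, 2, 3, 4, 5]), 2)) {
    console.log(batch);
  }
})().catch(console.error);
```

The batch size is only an upper bound: a batch contains whatever is buffered when the iterator wakes up, so a slow producer yields small batches and a fast one yields full batches.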

Labels: stream (Issues and PRs related to the stream subsystem.)
