
Possible unhandleable crash in kad-dht #2216

@saul-jb

  • Version:
    • libp2p: 0.46.16
    • @libp2p/kad-dht: 10.0.11
  • Platform: Linux 5.15.0-88-generic Ubuntu x86_64 GNU/Linux
  • Subsystem: kad-dht

Severity: ?

Description:

I have been downloading many blocks in parallel via Helia, and on rare occasions the process crashes with little more than this trace:

file://node_modules/@libp2p/kad-dht/src/query/query-path.ts:227
      deferred.reject(new CodeError('Query aborted', 'ERR_QUERY_ABORTED'))
                      ^
CodeError: Query aborted
    at EventTarget.<anonymous> (file://node_modules/@libp2p/kad-dht/src/query/query-path.ts:227:23)
    at EventTarget.[nodejs.internal.kHybridDispatch] (node:internal/event_target:757:20)
    at EventTarget.dispatchEvent (node:internal/event_target:692:26)
    at abortSignal (node:internal/abort_controller:369:10)
    at AbortController.abort (node:internal/abort_controller:403:5)
    at EventTarget.onAbort (file://node_modules/any-signal/src/index.ts:14:16)
    at EventTarget.[nodejs.internal.kHybridDispatch] (node:internal/event_target:757:20)
    at EventTarget.dispatchEvent (node:internal/event_target:692:26)
    at abortSignal (node:internal/abort_controller:369:10)
    at AbortController.abort (node:internal/abort_controller:403:5) {
  code: 'ERR_QUERY_ABORTED',
  props: {}
}

Further debugging shows that the abort signal that triggers this can originate from bitswap.
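Since the trace above contains no application frames, a process-level hook can help confirm that the crash really is an unhandled rejection. The following is only a diagnostic sketch, not kad-dht code: `rejectWithNoConsumer` is a hypothetical stand-in for an abort listener that rejects a promise nobody is awaiting, and the `unhandled` array exists only so the result can be inspected.

```javascript
// Collect unhandled rejections instead of letting Node terminate the process.
const unhandled = [];

process.on('unhandledRejection', (reason) => {
  unhandled.push(reason);
  console.error('unhandled rejection:', reason);
});

// Hypothetical stand-in: an abort listener rejects a promise that has no consumer.
function rejectWithNoConsumer() {
  const controller = new AbortController();
  let reject;

  // Nothing ever awaits or attaches a handler to this promise.
  new Promise((_, rej) => { reject = rej; });

  controller.signal.addEventListener('abort', () => {
    reject(new Error('Query aborted'));
  });

  // Listeners run synchronously, so the promise is rejected right here.
  controller.abort();
}
```

With the handler registered, Node reports the rejection to the callback rather than exiting, which makes it possible to log extra context at the moment the problem occurs.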

Steps to reproduce the error:

Unfortunately I can't seem to trigger this reliably. The error is thrown in query-path.ts#L227. I believe this error is thrown and is uncatchable if the following steps happen:

In case that is not clear enough here is some code illustrating the error when everything aligns:

import defer from 'p-defer';

async function * toGenerator () {
	let deferred = defer();

	// Resolve the first deferred shortly after the generator starts.
	setTimeout(() => deferred.resolve(), 10);

	for (;;) {
		// Wait for the current deferred to settle.
		await deferred.promise;
		deferred = defer();

		// Reject the new deferred later, while the generator is paused at `yield`.
		setTimeout(() => deferred.reject(new Error('Query aborted')), 10);

		yield;
	}
}

const itr = toGenerator();

try {
	// The first iteration resolves normally.
	await itr.next();

	// Give the timer time to reject the deferred before the next .next() call.
	await new Promise(r => setTimeout(r, 30));

	// Nothing is awaiting the deferred when it rejects, so the process
	// crashes with an unhandled rejection before this line runs.
	await itr.next();
} catch (error) {
	// Never reached: the rejection happens outside this try/catch.
}
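If that diagnosis is right, one possible mitigation is to attach a no-op rejection handler to each deferred as soon as it is created. The sketch below is a variant of the snippet above under that assumption, not the actual kad-dht fix. The local `defer` helper stands in for `p-defer` so the example has no dependencies, and `run` is a hypothetical wrapper that returns what the consumer observed.

```javascript
// Minimal stand-in for p-defer.
function defer() {
  let resolve, reject;
  const promise = new Promise((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}

async function* toGenerator() {
  let deferred = defer();
  setTimeout(() => deferred.resolve(), 10);

  for (;;) {
    await deferred.promise;
    deferred = defer();

    // Mark the promise as handled so an early rejection is not fatal.
    // The later `await deferred.promise` still throws the original error.
    deferred.promise.catch(() => {});

    setTimeout(() => deferred.reject(new Error('Query aborted')), 10);

    yield;
  }
}

async function run() {
  const itr = toGenerator();
  await itr.next();

  // The deferred rejects during this pause, as in the original snippet.
  await new Promise(r => setTimeout(r, 30));

  try {
    await itr.next();
    return 'no error';
  } catch (error) {
    // The rejection now surfaces here instead of crashing the process.
    return error.message;
  }
}
```

The no-op handler only silences the "nobody was listening" condition. The error itself is still delivered to whoever awaits the promise next, so the consumer's `try`/`catch` sees it.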

It's possible that I am barking up the wrong tree here and my crash is caused by something else but after many hours of debugging it is the only possibility that I can think of that causes this issue.

Labels: need/triage (Needs initial labeling and prioritization)
