-
Notifications
You must be signed in to change notification settings - Fork 229
Description
The Problem
const apm = require('elastic-apm-node').start()
async function somethingToAwait () {
return 42
}
async function firstThing () {
// await Promise.resolve()
const span = apm.startSpan('firstThing')
await somethingToAwait()
span.end()
}
async function secondThing () {
const span = apm.startSpan('secondThing')
// ...
span.end()
}
async function main () {
const trans = apm.startTransaction('myTransaction')
await firstThing()
await secondThing()
trans.end()
}
main()
Tracing this results in this trace:
trace 5d54ad
`- transaction 731c00 "myTransaction"
`- span 67b9ad "firstThing"
`- span 2f21b1 "secondThing"
The problem is that secondThing
is a child of firstThing
when we would not expect it to be so.
Workaround
If you comment out await somethingToAwait()
(i.e. have an async function that does not use await
) then the parent/child relationships are as expected. However, obviously, avoiding await
isn't a reasonable solution.
trace b877f3
`- transaction 6c3739 "myTransaction"
`- span 285f98 "firstThing"
`- span 54ac1b "secondThing"
A different workaround is to add await Promise.resolve()
before apm.startSpan(...)
.
This is effectively the solution that Dario has used in @kbn/apm-utils
's withSpan()
: https://github.com/elastic/kibana/blob/595b6948a1492a04fff14c8d6b2f8b50308681c0/packages/kbn-apm-utils/src/index.ts#L54
This workaround is still required after the run context work, as discuss in the comments starting at: #2181 (comment)
(/cc @AlexanderWert This is the same thing you have hit in your Lambda example.)
To be continued...
tl;dr: Execution of an async function creates and returns a Promise either when it hits return
(or the end of the function) or when it hits the first await
. The way run context (a.k.a. async context) tracking is done with async_hooks
, is to (indirectly) attach the current tracing state to newly created async tasks. When await somethingToAwait()
is hit, the currentSpan is Span firstThing
, and this is returned to the calling function main
. When main
resumes, its current execution async task is the same as that promise. By design, it doesn't matter that Span firstThing
has ended, it is still used as the parent for the subsequently created Span secondThing
. Why "by design"? See https://github.com/elastic/apm-agent-nodejs/blob/main/test/instrumentation/run-context/fixtures/parentage-with-ended-span.js
TODO: I'll add some details, and why I think an automatic fix for startSpan()
may not be feasible, and adding a withSpan
API (akin to OTel's withActiveSpan()
and similar in other tracers) is an appropriate fix.