Skip to content

fix: stop double recording exceptions on spans#1069

Closed
robertlaurin wants to merge 1 commit into
petergoldstein:mainfrom
robertlaurin:fix-span-exception-double-record
Closed

fix: stop double recording exceptions on spans#1069
robertlaurin wants to merge 1 commit into
petergoldstein:mainfrom
robertlaurin:fix-span-exception-double-record

Conversation

@robertlaurin

@robertlaurin robertlaurin commented Feb 4, 2026

Copy link
Copy Markdown
Contributor

The in_span method already records exceptions. So the explicit recording of exceptions here could result in doubling the exception event count.

On a high volume traced service an error with memcached could result in a lot of stacktraces, doubling them can compound the issue.

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'

  gem 'debug'
  gem 'dalli', "4.2.0"
  gem 'opentelemetry-api'
  gem 'opentelemetry-sdk'
end

require 'opentelemetry-api'
require 'opentelemetry-sdk'
require 'dalli'

exporter = OpenTelemetry::SDK::Trace::Export::InMemorySpanExporter.new
OpenTelemetry::SDK.configure do |c|
  c.service_name = "test"
  c.add_span_processor(OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor.new(
    exporter
  ))
end

begin
  Dalli::Instrumentation.trace("foo") { raise StandardError.new("foo") }
rescue => e
  #
end

puts exporter.finished_spans[0].events

Lightly formatted the output

#<struct OpenTelemetry::SDK::Trace::Event 
    name="exception", 
    attributes={
        "exception.type" => "StandardError", 
        "exception.message" => "foo", 
        "exception.stacktrace" => "dalli_span.rb:26:in 'block in <main>': foo (StandardError)\n\tfrom..."
    }, 
    timestamp=1770245395869251000>

#<struct OpenTelemetry::SDK::Trace::Event 
    name="exception", 
        attributes={
        "exception.type" => "StandardError", 
        "exception.message" => "foo", 
        "exception.stacktrace" => "dalli_span.rb:26:in 'block in <main>': foo (StandardError)\n\tfrom ..."
    }, 
    timestamp=1770245395869262000>

The tests are currently mocking out OpenTelemetry entirely which means they would not catch this issue. Is it ok to bring in Otel as a test dependency, if so then I can add some proper regression tests here, or we can just remove the tests that assert the recording of exceptions as its a feature of the otel library itself and doesn't really need to be retested.

petergoldstein added a commit that referenced this pull request Feb 5, 2026
OpenTelemetry's in_span method already records exceptions and sets
error status, so Dalli's explicit recording was redundant and caused
duplicate exception events.

Also applied rubocop fixes and removed tests that were testing
OpenTelemetry's internal behavior rather than Dalli's code.

Co-Authored-By: Robert Laurin <robert.laurin@shopify.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
petergoldstein added a commit that referenced this pull request Feb 5, 2026
@petergoldstein

Copy link
Copy Markdown
Owner

Thanks @robertlaurin . Cherry picked into 4.3.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants