Skip to content

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Jan 6, 2021

Rollbar gem versions <= 3.1.1 have an incompatibility with ddtrace's
CPU profiling. This was ack'd and fixed upstream in
rollbar/rollbar-gem#1018.

This commit adds code to detect when customers are still using a legacy
version of rollbar, printing a warning to ask them to upgrade.

Note: Because the fixed version of rollbar has not yet been released,
the tests that validate that the warning is not shown on unpatched
versions are disabled and tagged with a FIXME.

See the "FIXME NEW ROLLBAR NEEDED" note in the setup_spec.rb for more
details on what's disabled and what steps we need to take once the
new version gets released.

First commit @ Datadog 🎉

@ivoanjo ivoanjo requested a review from a team January 6, 2021 12:34
Rollbar gem versions <= 3.1.1 have an incompatibility with ddtrace's
CPU profiling. This was ack'd and fixed upstream in
<rollbar/rollbar-gem#1018>.

This commit adds code to detect when customers are still using a legacy
version of rollbar, printing a warning to ask them to upgrade.

Note: Because the fixed version of rollbar has not yet been released,
the tests that validate that the warning is not shown on unpatched
versions are disabled and tagged with a FIXME.

See the "FIXME NEW ROLLBAR NEEDED" note in the setup_spec.rb for more
details on what's disabled and what steps we need to take once the
new version gets released.

First commit @ Datadog 🎉
@ivoanjo ivoanjo force-pushed the fix/warn_on_incompatible_rollbar branch from d7ea76a to 662a44e Compare January 6, 2021 12:41
@codecov-io
Copy link

codecov-io commented Jan 6, 2021

Codecov Report

Merging #1300 (91b5b32) into feature/profiling (6f2f1ce) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@                Coverage Diff                 @@
##           feature/profiling    #1300   +/-   ##
==================================================
  Coverage              97.91%   97.92%           
==================================================
  Files                    848      848           
  Lines                  39745    39779   +34     
==================================================
+ Hits                   38918    38952   +34     
  Misses                   827      827           
Impacted Files Coverage Δ
lib/ddtrace/profiling/tasks/setup.rb 95.91% <100.00%> (+0.68%) ⬆️
spec/ddtrace/profiling/tasks/setup_spec.rb 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f2f1ce...91b5b32. Read the comment docs.

Copy link
Contributor

@ericmustin ericmustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not super familiar with the profiling branch yet but generally lgtm, deferring to @delner here though

Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the actual warning itself looks good! However, I think we could leverage a test helper to greatly simplify the tests for this, and eliminate much of the appraisal related changes.

We did this in another file, which tested for behavior based on the presence or non-presence of a gem (and its version):

RSpec.describe Datadog::Contrib::Faraday::Integration do
  extend ConfigurationHelpers

  describe '.version' do
    subject(:version) { described_class.version }

    context 'when the "faraday" gem is loaded' do
      include_context 'loaded gems', faraday: described_class::MINIMUM_VERSION
      it { is_expected.to be_a_kind_of(Gem::Version) }
    end

    context 'when "faraday" gem is not loaded' do
      include_context 'loaded gems', faraday: nil
      it { is_expected.to be nil }
    end
  end

I think you could use this, and remove all the appraisal changes. What do you think @ivoanjo ?

@ivoanjo
Copy link
Member Author

ivoanjo commented Jan 8, 2021

Ah, interesting. It took me a few minutes to wrap my head around what the helper was doing ;)

If I understand correctly, the loaded gems helper simulates looking up the info on a gem that is loaded (e.g. the response from Gem.loaded_specs).

I think the general idea would work for this PR. The tests in my first attempt are more realistic, but they do introduce a number of extra steps (and make testing slower), as well as making it hard for testing all variants in one run (you always need to restart your ruby with different environments).

I can't quite use the helper directly since I'm using Gem::Specification.find_all_by_name which differs from Gem.loaded_specs because it includes gems that may not be loaded:

$ pry
[1] pry(main)> Gem.loaded_specs.keys.include?("rollbar")
=> false
[2] pry(main)> Gem::Specification.find_all_by_name("rollbar").map { |it| [it.name, it.version] }
=> [["rollbar", Gem::Version.new("3.1.1")], ["rollbar", Gem::Version.new("2.27.1")]]

...but the approach is the same -- I shall mock the other method instead.

Now I shall quote Arnie:

The approach taken in my first attempt to test this change was rather
heavy handed -- needing a lot of changes to the appraisals/rakefile
and multiple runs of the test to validate.

As suggested during review, let's instead mock out the Gem apis, which
are documented and pretty stable, and avoid the impact on the test
suite running time/complexity.
Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the use of mocks here is appropriate (Gem::Specification is a stable API) and serves us well in simplifying the test code; it's now much easier to reason about its mechanisms.

Nice work @ivoanjo ! Thank you :)

@ivoanjo ivoanjo merged commit 0484aff into feature/profiling Jan 11, 2021
@ivoanjo ivoanjo deleted the fix/warn_on_incompatible_rollbar branch January 11, 2021 08:55
ivoanjo added a commit that referenced this pull request Feb 2, 2021
[PROF-2585] Warn on incompatible rollbar version
ivoanjo added a commit that referenced this pull request Feb 9, 2021
[PROF-2585] Warn on incompatible rollbar version
ivoanjo added a commit that referenced this pull request Feb 11, 2021
When I initially worked on #1300 ("Warn on incompatible rollbar
version") my understanding of the issue on rollbar/rollbar-gem#1018 was
incomplete and I did not realize that old rollbar versions, beyond
breaking datadog profiler's CPU time instrumentation, could actually
break the customer's application.

One of our private beta customers mentioned he had seen both the
warning but also had seen a stack trace:

```
[1] ! Unable to load application: SystemStackError: stack level too deep
/usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:5:in `initialize_with_rollbar': stack level too deep (SystemStackError)
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
	from /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:6:in `initialize_with_rollbar'
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
```

This led me to try to reproduce this issue. I discovered that with

```ruby
ENV['DD_PROFILING_ENABLED'] = 'true'
require 'ddtrace/profiling/preload'

require 'rollbar'
Rollbar.configure do |c|
  c.access_token = "nope"
end

Thread.new { }.join
```

I could trigger the issue, and that the current behavior was that
the warning from #1300 would be printed, but then the application
would proceed to fail.

Having a bit more experience with the codebase now, I realize that
the correct place to put this check is in the CPU extension
 #unsupported_reason method, which will ensure the extension is
not loaded at all if an incompatible version of rollbar is around.

The resulting behavior is that even with an old rollbar version,
profiler will happily load and work; it will just omit the CPU
time profiling, similarly to how it behaves on other platforms where
for different reasons we don't support CPU time profiling.
ivoanjo added a commit that referenced this pull request Mar 11, 2021
[PROF-2585] Warn on incompatible rollbar version
ivoanjo added a commit that referenced this pull request Mar 11, 2021
When I initially worked on #1300 ("Warn on incompatible rollbar
version") my understanding of the issue on rollbar/rollbar-gem#1018 was
incomplete and I did not realize that old rollbar versions, beyond
breaking datadog profiler's CPU time instrumentation, could actually
break the customer's application.

One of our private beta customers mentioned he had seen both the
warning but also had seen a stack trace:

```
[1] ! Unable to load application: SystemStackError: stack level too deep
/usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:5:in `initialize_with_rollbar': stack level too deep (SystemStackError)
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
	from /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:6:in `initialize_with_rollbar'
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
```

This led me to try to reproduce this issue. I discovered that with

```ruby
ENV['DD_PROFILING_ENABLED'] = 'true'
require 'ddtrace/profiling/preload'

require 'rollbar'
Rollbar.configure do |c|
  c.access_token = "nope"
end

Thread.new { }.join
```

I could trigger the issue, and that the current behavior was that
the warning from #1300 would be printed, but then the application
would proceed to fail.

Having a bit more experience with the codebase now, I realize that
the correct place to put this check is in the CPU extension
 #unsupported_reason method, which will ensure the extension is
not loaded at all if an incompatible version of rollbar is around.

The resulting behavior is that even with an old rollbar version,
profiler will happily load and work; it will just omit the CPU
time profiling, similarly to how it behaves on other platforms where
for different reasons we don't support CPU time profiling.
ivoanjo added a commit that referenced this pull request Mar 16, 2021
[PROF-2585] Warn on incompatible rollbar version
ivoanjo added a commit that referenced this pull request Mar 16, 2021
When I initially worked on #1300 ("Warn on incompatible rollbar
version") my understanding of the issue on rollbar/rollbar-gem#1018 was
incomplete and I did not realize that old rollbar versions, beyond
breaking datadog profiler's CPU time instrumentation, could actually
break the customer's application.

One of our private beta customers mentioned he had seen both the
warning but also had seen a stack trace:

```
[1] ! Unable to load application: SystemStackError: stack level too deep
/usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:5:in `initialize_with_rollbar': stack level too deep (SystemStackError)
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
	from /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:6:in `initialize_with_rollbar'
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
```

This led me to try to reproduce this issue. I discovered that with

```ruby
ENV['DD_PROFILING_ENABLED'] = 'true'
require 'ddtrace/profiling/preload'

require 'rollbar'
Rollbar.configure do |c|
  c.access_token = "nope"
end

Thread.new { }.join
```

I could trigger the issue, and that the current behavior was that
the warning from #1300 would be printed, but then the application
would proceed to fail.

Having a bit more experience with the codebase now, I realize that
the correct place to put this check is in the CPU extension
 #unsupported_reason method, which will ensure the extension is
not loaded at all if an incompatible version of rollbar is around.

The resulting behavior is that even with an old rollbar version,
profiler will happily load and work; it will just omit the CPU
time profiling, similarly to how it behaves on other platforms where
for different reasons we don't support CPU time profiling.
ivoanjo added a commit that referenced this pull request Mar 29, 2021
[PROF-2585] Warn on incompatible rollbar version
ivoanjo added a commit that referenced this pull request Mar 29, 2021
When I initially worked on #1300 ("Warn on incompatible rollbar
version") my understanding of the issue on rollbar/rollbar-gem#1018 was
incomplete and I did not realize that old rollbar versions, beyond
breaking datadog profiler's CPU time instrumentation, could actually
break the customer's application.

One of our private beta customers mentioned he had seen both the
warning but also had seen a stack trace:

```
[1] ! Unable to load application: SystemStackError: stack level too deep
/usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:5:in `initialize_with_rollbar': stack level too deep (SystemStackError)
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
	from /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:6:in `initialize_with_rollbar'
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
```

This led me to try to reproduce this issue. I discovered that with

```ruby
ENV['DD_PROFILING_ENABLED'] = 'true'
require 'ddtrace/profiling/preload'

require 'rollbar'
Rollbar.configure do |c|
  c.access_token = "nope"
end

Thread.new { }.join
```

I could trigger the issue, and that the current behavior was that
the warning from #1300 would be printed, but then the application
would proceed to fail.

Having a bit more experience with the codebase now, I realize that
the correct place to put this check is in the CPU extension
 #unsupported_reason method, which will ensure the extension is
not loaded at all if an incompatible version of rollbar is around.

The resulting behavior is that even with an old rollbar version,
profiler will happily load and work; it will just omit the CPU
time profiling, similarly to how it behaves on other platforms where
for different reasons we don't support CPU time profiling.
ivoanjo added a commit that referenced this pull request Apr 12, 2021
[PROF-2585] Warn on incompatible rollbar version
ivoanjo added a commit that referenced this pull request Apr 12, 2021
When I initially worked on #1300 ("Warn on incompatible rollbar
version") my understanding of the issue on rollbar/rollbar-gem#1018 was
incomplete and I did not realize that old rollbar versions, beyond
breaking datadog profiler's CPU time instrumentation, could actually
break the customer's application.

One of our private beta customers mentioned he had seen both the
warning but also had seen a stack trace:

```
[1] ! Unable to load application: SystemStackError: stack level too deep
/usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:5:in `initialize_with_rollbar': stack level too deep (SystemStackError)
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
	from /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:6:in `initialize_with_rollbar'
	from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize'
```

This led me to try to reproduce this issue. I discovered that with

```ruby
ENV['DD_PROFILING_ENABLED'] = 'true'
require 'ddtrace/profiling/preload'

require 'rollbar'
Rollbar.configure do |c|
  c.access_token = "nope"
end

Thread.new { }.join
```

I could trigger the issue, and that the current behavior was that
the warning from #1300 would be printed, but then the application
would proceed to fail.

Having a bit more experience with the codebase now, I realize that
the correct place to put this check is in the CPU extension
 #unsupported_reason method, which will ensure the extension is
not loaded at all if an incompatible version of rollbar is around.

The resulting behavior is that even with an old rollbar version,
profiler will happily load and work; it will just omit the CPU
time profiling, similarly to how it behaves on other platforms where
for different reasons we don't support CPU time profiling.
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.92%. Comparing base (6f2f1ce) to head (91b5b32).

Additional details and impacted files
@@                Coverage Diff                 @@
##           feature/profiling    #1300   +/-   ##
==================================================
  Coverage              97.91%   97.92%           
==================================================
  Files                    848      848           
  Lines                  39745    39779   +34     
==================================================
+ Hits                   38918    38952   +34     
  Misses                   827      827           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants