-
Notifications
You must be signed in to change notification settings - Fork 213
feat: Puma plugin to gracefully shut down providers #1637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution, @bdewater-thatch! 😄 I took a first pass on this PR and left a few comments related to the repo's best practices. Overall, this looks good! I still want to do more testing and thinking before I approve.
The main thing that stands out to me, is that this instrumentation isn't emitting any telemetry. (I apologize, we started talking Puma stats in Slack ages ago and I lost track of things 😅).
I'm not sure if that's a problem for the initial release. Perhaps calling out what the gem does a little more in the README is enough for now, since the shutdown problem already plagues users. What do you think?
cc'ing the folks in the on_shutdown slack thread: @arielvalentin, @hibachrach, @wsmoak
instrumentation/puma/opentelemetry-instrumentation-puma.gemspec
Outdated
Show resolved
Hide resolved
OpenTelemetry.tracer_provider.shutdown | ||
OpenTelemetry.meter_provider.shutdown if OpenTelemetry.respond_to?(:meter_provider) | ||
OpenTelemetry.logger_provider.shutdown if OpenTelemetry.respond_to?(:logger_provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love the inclusion of metrics and logs! 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit forward looking but figured it wouldn't hurt :)
I forgot to ask but was wondering if you think this should include a configurable timeout option, and if so what the default should be (nil
?). We host our apps on Render, which has a 30s timeout by default for graceful shutdown but is configurable via API/IaC to be up to 300s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, that's a good question. I'm not sure. Thinking out loud here:
Timeouts could technically be set through OTEL_BSP_EXPORT_TIMEOUT
since the tracer provider will iterate through all the span processors on shutdown and they'll use that setting. However, the tracer provider's shutdown
method has its own timeout that wraps the processors, which would have a different value.
The API doesn't have a shutdown
method for the TracerProvider, and one isn't mentioned in the spec. The SDK defines the shutdown
method. Instrumentation should be implementation-agnostic, so we don't include the SDK as a dependency in instrumentation. I wonder if we need to add a no-op shutdown API to allow this to work? And if that's even spec compliant? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to add that there is indeed no shutdown
method in the API; therefore we need to guard for when someone uses the OTEL_SDK_DISABLED envvar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 Could try()
help here maybe with a warning along the lines of "do you have an OTel SDK loaded?" logged if there isn't a shutdown()
method implemented on the configured tracer_provider?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following the "shutdown method in the API" discussion. I've looked at what Sidekiq does (which is seemingly the only example of instrumentation calling shutdown) and it doesn't do anything additional checks?
Lines 134 to 136 in 2edff11
config.on(:shutdown) do | |
OpenTelemetry.tracer_provider.shutdown | |
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this article to help me understand the difference between API and SDK. I've added an early return based on this example - since that already logs a warning I don't think we need another log here?
@kaylareopelle I addressed your feedback. Please LMK if you prefer additional commits for ease of review or for me to squash. |
I have the same concern. The plugin could be in a different folder (e.g. |
Following discussion at open-telemetry/opentelemetry-ruby#1357 and on Slack I decided to take a moment and write a Puma plugin to shutdown tracer/meter/logger providers and make sure nothing is lost before the process exits. Given that puma is the default for newly generated Rails apps, this should make things "just work" a little better.
First time contributor, I'm sure I'm missing things :) any feedback welcome!