-
Notifications
You must be signed in to change notification settings - Fork 42
add lifecycle metrics #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ea7e5ef
to
0dc16e9
Compare
79855f8
to
52a0c8f
Compare
self.state = .starting(queue) | ||
} | ||
|
||
self.log("starting") | ||
Counter(label: "\(self.label).lifecycle.start").increment() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is label risky i.e. containing spaces etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm good question, do you think we should normalize it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was just thinking about it but I also just remembered that last time we said that the libs should sanitize, so we do e.g. in the prometheus one https://github.com/MrLotU/SwiftPrometheus/blob/f4d4adfeb1178e250a0ea427017fb4d1cc877de3/Sources/Prometheus/PrometheusMetrics.swift#L112 🤔
I guess it's fine 👍
52a0c8f
to
28f7ef6
Compare
motivation: startup/shutdown metrics are important for real-life services, for example spike in start metrics indicates a crash-loop changes: * add start and shutdown counters to ComponentLifecycle * add start and shutdown timers to report duration of startup and shutdown operations
28f7ef6
to
39b9cd1
Compare
start { error in | ||
Timer(label: "\(self.label).\(tasks[index].label).lifecycle.start").recordNanoseconds(DispatchTime.now().uptimeNanoseconds - startTime.uptimeNanoseconds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be worth making some struct/enum with the labels such that it is easier to look in one place to see "aha, these are all the metric labels this lib exports"?
I do this in some other project, it is easier to know what metrics are exposed then.
Sure not all that much here but it's a nice pattern;
So either some Labels.start(base: self.label)
or Metrics.timer(base: self.label)
I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have func recordInterval(since: DispatchTime, now: DispatchTime = .now()) {
for this type of nanosecond recording, consider using it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recordInterval
is not define in the core library afaict, where did you see it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have it here https://github.com/apple/swift-metrics/blob/44e8bfc7f5a0686e21d5d21a7ea5454e74ed3449/Sources/Metrics/Metrics.swift#L42-L44
maybe it's not released yet? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see but it's not in CoreMetrics hm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes... I prefer not to introduce dependency on Foundation for low level libs if we can get away with it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah though this could live in core I guess, I'll ticketify -- it's just Dispatch dep, not foundation hm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Graphite/statsd and Prometheus have different naming schemes (dot notation vs snake_case).
For my own education, is it a general direction to use dot notation?
@@ -18,9 +18,9 @@ import Darwin | |||
import Glibc | |||
#endif | |||
import Backtrace | |||
import CoreMetrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it intentional? AFAIR libraries shouldn't normally depend on CoreMetrics unless implementing a MetricFactory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CoreMetrics is useful when you do not want to pul in Foundation
shutdown { error in | ||
Timer(label: "\(self.label).\(tasks[index].label).lifecycle.shutdown").recordNanoseconds(DispatchTime.now().uptimeNanoseconds - startTime.uptimeNanoseconds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using recordinternal(since: startTime)
I do not believe there is a standard for this, and for this reason our Metrics API is also is not opinionated about this: as you pointed out different system (Graphite, Prometheus) have their own conventions and constraints so it must be the "job" of the "backend" library to sanitize the labels to whatever its sensitive to, for example Prometheus is sensitive to dot notation, so its the "job" of the Prometheus "backend" library to convert dots into underscores or whatever. |
Yeah agreed on the sanitization, sounds good. LGTM |
motivation: startup/shutdown metrics are important for real-life services, for example spike in start metrics indicates a crash-loop
changes: