Project Proposal: system packages#3252
mmanciop wants to merge 17 commits into open-telemetry:main from
projects/packaging.md
Publish modular, well-integrated system packages for:
* OBI
* OpenTelemetry Injector
* SDK+autoinstrumentation for Java, .NET, Node.js and Python (with potentially Python and Ruby if bandwidth allows)
There was a problem hiding this comment.
How will it be chosen which autoinstrumentation packages to use? Is there a goal to make the experience cohesive across languages in terms of what is instrumented and how instrumentations are configured, or will that be dictated by the current offerings of each target language?
There was a problem hiding this comment.
Which processes to instrument and which not will mostly depend on:
(1) which autoinstrumentation packages are installed
(2) allow/deny lists in Injector and OBI
Also, I think we could additionally have:
(3) depending on other configurations, opt-ins / opt-outs via process environment
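To make the three criteria above concrete, here is a minimal sketch of how such a decision could be combined. It is only an illustration under stated assumptions, not how the Injector or OBI actually decide: the function name, the `EXAMPLE_OTEL_OPT_OUT` variable, the glob-style list format, and the rule that the deny list wins over the allow list are all hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical opt-out variable; not an existing Injector or OBI setting.
OPT_OUT_ENV = "EXAMPLE_OTEL_OPT_OUT"


def should_instrument(exe, language, env, installed, allow, deny):
    """Return True if a process would be instrumented under rules (1)-(3)."""
    # (1) Only languages whose autoinstrumentation package is installed.
    if language not in installed:
        return False
    # (2) Assumption: deny wins over allow; an empty allow list allows everything.
    if any(fnmatch(exe, pattern) for pattern in deny):
        return False
    if allow and not any(fnmatch(exe, pattern) for pattern in allow):
        return False
    # (3) Per-process opt-out via the process environment.
    if env.get(OPT_OUT_ENV, "").lower() in ("1", "true"):
        return False
    return True
```

For example, `should_instrument("/usr/bin/java", "java", {}, {"java"}, [], [])` is true in this sketch, while the same call with `deny=["/usr/bin/*"]` is false.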
There was a problem hiding this comment.
@dyladan Dynatrace has significant experience, through the OneAgent, with opt-in/opt-out mechanisms for host-wide monitoring. Is there any advice or experience you folks can share?
There was a problem hiding this comment.
I may be able to share some if you have specific questions about it. I assume you're looking beyond the obvious tension between "customers want everything instrumented automatically" and "customers want 0-byte deployments and 0 resource overhead". To that point, I would say this is a classic 80-20 rule, where 20% of the instrumentations provide 80% of the value, except it might be more like 95-5. Just HTTP and some common databases like pg, mysql, Cassandra, Redis, etc. would cover 95% of use cases. For the other 5% you can have more fine-grained installation packages.
I think if you're instrumenting with system packages you can get away with a larger installation size, because it's installed once on a host that is running many processes. Right now with OTel you often have to repeat the installation of the same instrumentation dependencies for each running process.
My question had more to do with the cohesiveness of the system though. If a customer does apt install otel (or otel-python, otel-js, etc.) they will expect roughly the same results from each language. For example, they will be frustrated if the Python package instruments databases but the JS one only does HTTP. It could be said that we should already be striving for this type of consistency with our auto-instrumentation packages, but the different installation and setup paradigms hide this from users enough that they don't notice the inconsistency as much.
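As a rough illustration of the consistency concern, a check like the following could flag languages whose package covers fewer instrumentation categories than the others. This is a sketch under assumptions: the function name and the category labels are hypothetical, and the input mirrors the Python/JS example above rather than real coverage data.

```python
def coverage_gaps(coverage):
    """Map each language to the instrumentation categories it lacks,
    relative to the union of categories covered by any language."""
    everything = set().union(*coverage.values())
    return {
        lang: sorted(everything - covered)
        for lang, covered in coverage.items()
        if everything - covered
    }


# Illustrative input mirroring the example in the comment above; not real data.
example = {"python": {"http", "database"}, "js": {"http"}}
print(coverage_gaps(example))  # {'js': ['database']}
```

An empty result would mean every language package covers the same categories.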
There was a problem hiding this comment.
I may be able to share some if you have specific questions about it.
My questions about the OneAgent are pretty deep. For example, I'd love to know how the OneAgent manages to have a weak symbol for getenv without breaking linking for dynamically linked binaries that do not link LibC. Another would be the experience and lessons learned from apparently (given the weak symbols I saw) hijacking the main process of LibC.
My question had more to do with the cohesiveness of the system though. If a customer does apt install otel (or otel-python, otel-js, etc.) they will expect roughly the same results from each language. For example, they will be frustrated if the Python package instruments databases but the JS one only does HTTP. It could be said that we should already be striving for this type of consistency with our auto-instrumentation packages, but the different installation and setup paradigms hide this from users enough that they don't notice the inconsistency as much.
I understand the GC has a workstream about making "OpenTelemetry more like a product". The consistency of auto-instrumentations fits squarely in there IMO. The packaging is just exposing issues that we already have and that have so far gone undernoticed (not unnoticed: the inconsistency of auto-instrumentations is something I have heard end users complain about any number of times).
Co-authored-by: Antoine Toulme <antoine@toulme.name>
TODO

## Staffing / Help Wanted
There was a problem hiding this comment.
This project will serve a function that doesn't yet exist in OpenTelemetry, acting as a point where we take the various components in the ecosystem and stitch them together into an opinionated set of defaults. There are going to be questions about what components are included, what the threshold for inclusion is in terms of stability, what the default configuration is, how breaking changes of dependent components are managed, and more. It's going to function as the project that takes a discrete set of tools and turns them into a cohesive product.
I think for this to be successful, this group is going to need representation from, and communication with, the various SIGs whose projects it stitches together, including: the language SIGs with auto-instrumentation solutions that will be used, the operator SIG, which serves a similar function in k8s, the collector SIG, which will no doubt be a part of this, the eBPF projects that will likely end up part of this (profiling, OBI), and maybe more.
That's not to say that people from all these groups are necessarily going to do the work, but they need to be in the loop, and offer guidance / feedback to the SIG and bring guidance / feedback back to their respective SIG.
Can we advertise this project to all the SIGs we think will be involved and request that each volunteer one or more maintainers to work in a liaison capacity?
There was a problem hiding this comment.
I also think we could make the support in the system packages (and OpenTelemetry Injector) a criterion in the Status and releases compatibility matrix for language SIGs, as it would signal to both Language SIGs and end users that OpenTelemetry considers automatic instrumentation as a first-class requirement. (Of course, assuming there is consensus around this last statement.)
There was a problem hiding this comment.
I agree that updating the status page to include Linux Package Management as a column and a section would be a good way to raise awareness. But we should also reach out to Java, .NET, Python, Ruby, and Go to confirm a liaison who will review and provide feedback.
Co-authored-by: Denys Sedchenko <9203548+x1unix@users.noreply.github.com>
[`@mmanciop`](https://github.com/mmanciop) tried to reach out to Canonical for help with DEB packaging, but while generally interested, they have not committed to helping.

Need more expertise in packaging RPM, right now the expertise in the SIG is mostly with DEB
There was a problem hiding this comment.
Are there other CNCF or LF projects we are close to that might be able to help us here?
There was a problem hiding this comment.
None that I know of
There was a problem hiding this comment.
@pavolloffay @frzifus @brunobat tagging you as you're with Red Hat, any chance you or someone you know would be able to assist here?
@open-telemetry/ebpf-profiler-maintainers please take a look!
Clarified focus on Go for OBI to prevent double instrumentation with other languages.
Removed TODO section from packaging.md
Removed the Staffing section and its TODO note.
@open-telemetry/technical-committee any update on this?
Something for this SIG. I believe this tap would have a better home within the otel org (maybe under a
@mmanciop apologies for the delay, and for the radio silence. We are trying to improve our process. In particular, we feel that we are at capacity with sponsorship, so we are trying to get better insight into how we can expand or adjust this capacity. See here: #3337. Possibly, this SIG could have a lower level of sponsorship if the SIG could instead put effort into getting feedback on packaging design decisions early and often from OTel maintainers and end users. We are also working to put a GC/TC checklist and notifications in place, to improve how we communicate around project proposals and to prevent them from accidentally getting stale.
I am interested in this proposal. Are you saying we can move forward without a TC sponsor if we do the work of reaching out to other SIGs? If so, I can certainly do that. Would you please indicate what level of effort you would expect so we can meet your bar?
@technical-committee discussed this proposal on 4/1. I believe their next step is to discuss it with the GC. Please keep us apprised of your progress, thanks!
The ultimate goal is to provide an excellent experience via:

```shell
{apt|yum} install opentelemetry
```
There was a problem hiding this comment.
My comment here was based on this proposed scope.
If the scope were reduced to one which didn't need to make decisions about defaults, gating requirements (e.g. stability thresholds), and how various components interact with each other (e.g. when to install OBI vs. SDKs), then I think we could get away with TC escalating sponsorship level.
The scope could start as just building packages for each of the language instrumentations. E.g. opentelemetry-java, opentelemetry-node, opentelemetry-{language}.
Phase 2 of the project could include the bits that have the dependencies on "stable by default", interactions between components, and things that require elevated TC sponsorship. Phase 1 would unblock the injector. By the time phase 2 rolls around, hopefully we have finished other work such that we can give this the attention it deserves. Also by the time phase 2 rolls around, we'll have developed some of the tooling / expertise / patterns needed to expand the set of packages we publish.
New project proposal to provide a product-like, idiomatic, and seamless experience for monitoring applications running on (virtual) hosts through a combination of the OpenTelemetry Injector (which injects SDKs and autoinstrumentations), OpenTelemetry eBPF Instrumentation (OBI), and the OpenTelemetry Collector.