feat: add agent config options for internal telemetry and health check, enable use of status cmd in kubernetes #182

obs-gh-mattcotter · 2025-03-27T22:31:43Z

Description

Add agent config options for internal telemetry and health check, and enable the use of the agent status command in Kubernetes. Also upgrade the internal telemetry config, since our previous metrics config is no longer supported; see the collector release notes: https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.123.0

Example:

$ kubectl -n observe exec --stdin --tty observe-agent-node-logs-metrics-agent-brjwr -- /observe-agent --observe-config=/observe-agent-conf/observe-agent.yaml status
Defaulted container "node-logs-metrics" out of: node-logs-metrics, kube-cluster-info (init)
================
Agent
================

  Host Info
  ================
  HostID: 45220b39-2c66-47bf-984d-9d0bbae88222
  Hostname: observe-agent-node-logs-metrics-agent-brjwr
  BootTime: 2025-04-10T20:05:46Z
  Uptime: 20h32m12s
  OS: linux
  Platform: alpine
  PlatformFamily: alpine
  PlatformVersion: 3.21.3
  KernelArch: aarch64
  KernelVersion: 6.10.14-linuxkit

  Agent Metrics
  ================
  ExporterQueueSize: 0
  CPUSeconds: 1.19s
  MemoryUsed: 132.86328MB
  TotalSysMemory: 60.27076MB
  Uptime: 17.37295s
  AvgServerResponseTime: 0ms
  AvgClientResponseTime: 0ms

    Logs Stats
    ================
    ReceiverAcceptedCount: 18
    ReceiverRefusedCount: 0
    ExporterSentCount: 10
    ExporterSendFailedCount: 0

    Metrics Stats
    ================
    ReceiverAcceptedCount: 601
    ReceiverRefusedCount: 0
    ExporterSentCount: 601
    ExporterSendFailedCount: 0

    Traces Stats
    ================
    ReceiverAcceptedCount: 0
    ReceiverRefusedCount: 0
    ExporterSentCount: 0
    ExporterSendFailedCount: 0

  Agent Health
  ================
  Status: Running
  TotalRefusedCount: 0
  TotalSendFailedCount: 0

obs-gh-alexlew · 2025-03-28T00:27:06Z

internal/commands/status/flags.go

+)
+
+const (
+	TelemetryEndpointFlag   = "telemetry-endpoint"


I think the more useful use case here is that we could then provide overrides in the config.yaml we generate in the Helm chart. For example, we could add telemetry_endpoint: "{{ template "config.local_host"}}:8888" and that would work correctly. Since that's the case, I'd prefer we stick to snake_case naming instead of kebab-case, and maybe nest these under something like endpoints::telemetry_endpoint.

obs-gh-mattcotter · 2025-04-03

Good point, I agree! I will update this PR after my config refactor to simplify the rebasing.

obs-gh-mattcotter · 2025-04-11T16:42:55Z

internal/commands/config/config.go

 	},
 }

-func printAllConfigsIndividually(configFilePaths []string) error {


This is all code motion. I put these methods in util because the config package depends on the start package, so start cannot import them from config without creating an import cycle, and having these methods available in start for debugging is very useful.

obs-gh-mattcotter · 2025-04-11T16:43:40Z

internal/commands/status/statusretriever.go

-func GetAgentStatusFromHealthcheck(baseURL string) (AgentStatus, error) {
-	URL := fmt.Sprintf("%s/status", baseURL)
+func GetAgentStatusFromHealthcheck(baseURL string, path string) (AgentStatus, error) {
+	baseURL = util.ReplaceEnvString(baseURL)


This is to handle our default k8s use of ${env:MY_POD_IP} in the endpoint.

obs-gh-mattcotter · 2025-04-11T16:45:12Z

internal/commands/status/statusretriever.go

@@ -119,53 +147,53 @@ func GetAgentMetricsFromEndpoint(baseURL string) (*AgentMetrics, error) {
 		if v.Type.String() == io_prometheus_client.MetricType_HISTOGRAM.String() {
 			met := v.Metric[0]
 			switch name := *v.Name; name {
-			case "otelcol_http_client_duration":
+			case "http_client_duration_milliseconds":


These metric names changed, possibly to be in line with Prometheus conventions after the config upgrade. I checked the output from the Prometheus endpoint and verified the metric names, and watched the numbers update when calling the status command.

I verified that the names after this update match what our Prometheus exporter now collects.

obs-gh-mattcotter · 2025-04-11T16:45:34Z

internal/commands/util/config_printers.go

+	"gopkg.in/yaml.v3"
+)
+
+func PrintAllConfigsIndividually(configFilePaths []string) error {


Here's where the config methods moved to. Again, no changes, just code motion.

obs-gh-mattcotter requested a review from obs-gh-alexlew March 27, 2025 22:55

obs-gh-mattcotter assigned obs-gh-alexlew Mar 27, 2025

obs-gh-alexlew reviewed Mar 28, 2025

obs-gh-alexlew assigned obs-gh-mattcotter and unassigned obs-gh-alexlew Mar 28, 2025

feat: add flags to set non-default endpoints for status command to enable use in kubernetes

53beaf9

obs-gh-mattcotter force-pushed the mc/config-fix branch from e4ce7df to 6bd451c Compare April 11, 2025 16:38

obs-gh-mattcotter changed the title ~~feat: add flags to set non-default endpoints for status command to enable use in kubernetes~~ feat: add agent config options for internal telemetry and health check, enable use of status cmd in kubernetes Apr 11, 2025

obs-gh-mattcotter commented Apr 11, 2025

obs-gh-mattcotter requested a review from obs-gh-alexlew April 11, 2025 16:50

obs-gh-mattcotter assigned obs-gh-alexlew and unassigned obs-gh-mattcotter Apr 11, 2025

refactor to use agent config

bdc1c4f

obs-gh-mattcotter force-pushed the mc/config-fix branch from 6bd451c to bdc1c4f Compare April 14, 2025 15:55

obs-gh-alexlew approved these changes Apr 15, 2025

obs-gh-mattcotter merged commit f3f56e7 into main Apr 15, 2025
8 checks passed

obs-gh-mattcotter deleted the mc/config-fix branch April 15, 2025 15:36
