[P1] Validate Coder observability stack post-install #11

@ausbru87

Description

The observability Helm chart + Argo app + Route + dashboards landed in commit cd0c0a3 but haven't been validated end-to-end on a live cluster.

Depends on: cluster install.

Validation steps

  1. Confirm Argo app is Healthy:
    `oc get application coder-observability -n openshift-gitops`
  2. Confirm pods running:
    ```bash
    oc get pods -n coder-observability
    # Expect: grafana-..., prometheus-server-..., loki-..., grafana-agent-...
    ```
  3. Confirm OCP Route is created and TLS works:
    ```bash
    curl -fsS https://grafana.apps.cluster.rhsummit.coderdemo.io | head -5
    # Should return Grafana login HTML
    ```
  4. Confirm anonymous viewer auth works (booth-friendly):
  5. Confirm Coder dashboards loaded (the chart bundles them):
    • Look for "Coder Server", "Coder Workspace Builds", "Coder Provisioner"
  6. Confirm the imported k3s-pattern dashboards loaded:
    • "Agent Boundaries": under the "Coder" folder
    • "AI Bridge": should appear; panels may show "no data" until the Postgres exporter is wired (separate issue)
  7. Confirm Loki is ingesting logs:
    • `oc logs -n coder-observability deploy/loki`
    • In Grafana → Explore → pick Loki datasource → `{namespace="coder"}` should show Coder server logs
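Steps 2, 5, and 6 can be partly scripted. The following is a minimal sketch, not a tested part of the repo: both functions read their input on stdin so they can be tried offline. It assumes the pod name prefixes from step 2 and the dashboard titles from steps 5 and 6 are exact, and that Grafana's `/api/search` endpoint answers without credentials once anonymous access is on (which would also exercise step 4). The function names are made up for this issue.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, expected names come from this issue.

# Reads `oc get pods --no-headers` output on stdin.
check_expected_pods() {
  local listing prefix rc=0
  listing="$(cat)"
  for prefix in grafana- prometheus-server- loki- grafana-agent-; do
    if ! printf '%s\n' "$listing" | awk -v p="$prefix" '
        # Column 1 is NAME, column 3 is STATUS in the default listing.
        index($1, p) == 1 && $3 == "Running" {
          # Do not let grafana-agent pods satisfy the plain grafana- prefix.
          if (p == "grafana-" && index($1, "grafana-agent-") == 1) next
          found = 1
        }
        END { exit !found }'; then
      echo "MISSING: no Running pod named ${prefix}..."
      rc=1
    fi
  done
  return "$rc"
}

# Reads the JSON body of Grafana's /api/search on stdin.
check_dashboards() {
  local search_json title rc=0
  search_json="$(cat)"
  for title in "Coder Server" "Coder Workspace Builds" "Coder Provisioner" \
               "Agent Boundaries" "AI Bridge"; do
    # Prefix match on the title field of the compact JSON response.
    if ! printf '%s\n' "$search_json" | grep -Fq "\"title\":\"${title}"; then
      echo "MISSING: dashboard \"${title}\""
      rc=1
    fi
  done
  return "$rc"
}

# Usage on a live cluster:
#   oc get pods -n coder-observability --no-headers | check_expected_pods
#   curl -fsS "https://grafana.apps.cluster.rhsummit.coderdemo.io/api/search?type=dash-db" \
#     | check_dashboards
```

Each function prints one `MISSING:` line per gap and returns non-zero if anything is missing, so the two can be chained in a smoke-test script. If the chart prefixes pod names with the release name, the prefix list needs adjusting.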

Acceptance criteria

  • Grafana reachable + TLS valid
  • Anonymous read-only access works
  • All bundled Coder dashboards load with data
  • Agent Boundaries dashboard loads (data shows once a workspace is started)
  • AI Bridge dashboard loads (some panels expect Postgres datasource — note as known gap)
  • Loki captures Coder + Agent Firewall logs
  • Prometheus retains 7d of data (we capped retention from k3s's 60d)
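For the retention criterion, one option is to read the configured value back from Prometheus itself via its `/api/v1/status/flags` endpoint. A minimal sketch follows; the helper name is made up, and the `deploy/prometheus-server` workload name and the presence of `wget` in the container are assumptions based on the pod prefix above, not confirmed by the chart.

```bash
# Sketch: extract the configured TSDB retention from Prometheus flags JSON.
# Reads the body of GET /api/v1/status/flags on stdin and prints the value
# of storage.tsdb.retention.time as Prometheus reports it (e.g. 7d).
retention_from_flags() {
  sed -n 's/.*"storage\.tsdb\.retention\.time"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage on a live cluster (workload name and in-container wget are assumptions):
#   oc exec -n coder-observability deploy/prometheus-server -- \
#     wget -qO- http://localhost:9090/api/v1/status/flags | retention_from_flags
```

This only confirms the configured retention window. Whether seven days of samples are actually on disk can only be checked after the stack has been running that long.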

If something is broken

  • Common issue: the chart's Service name doesn't match the one `manifests/observability/route.yaml` expects (`coder-observability-grafana`). Run `oc get svc -n coder-observability` to find the actual name, then update the Route.
  • Logs from the `coder` namespace may not reach Loki without additional config; check the chart's grafana-agent values.
  • If anonymous viewer access doesn't work, verify that the `auth.anonymous` section of `grafana.ini` is being applied (use `oc get cm -n coder-observability` to find the rendered config).
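The Service name mismatch in the first bullet can be checked mechanically. A minimal sketch, assuming the Route lives in `coder-observability` and is the only Route in that namespace; the helper name is made up for this issue.

```bash
# Sketch: succeed only if the Route's backend Service actually exists.
# $1: Service name the Route points at (spec.to.name).
# stdin: output of `oc get svc -o name`, one `service/<name>` per line.
route_backend_exists() {
  sed 's|^service/||' | grep -Fxq -- "$1"
}

# Usage on a live cluster (assumes a single Route in the namespace):
#   target="$(oc get route -n coder-observability -o jsonpath='{.items[0].spec.to.name}')"
#   oc get svc -n coder-observability -o name | route_backend_exists "$target" \
#     || echo "Route points at missing Service: ${target}"
```

The match is exact on the whole name, so a Service called `grafana` will not satisfy a Route that expects `coder-observability-grafana`.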

Reference: `gitops/apps/observability/application.yaml` (chart values), `manifests/observability/` (manifests).

Labels: area-observability (Grafana / Prometheus / Loki), demo (Booth demo content), p1 (Should-do for resilience or polish), rhsummit-2026 (Red Hat Summit 2026 demo asset)
