
Commit 4ae7129

feat: add readme for k8sattributesprocessor (#201)
### Description

OB-XXX Please explain the changes you made here.

### Checklist

- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
1 parent 2bc36cb commit 4ae7129


1 file changed

+205
-5
lines changed
  • components/processors/observek8sattributesprocessor

@@ -1,12 +1,212 @@
11
# Observe K8s Attributes Processor
22
This processor operates on K8s resource logs from the `k8sobjectsreceiver` and adds additional attributes.
33

4+
## Purpose
5+
6+
This is a specialized processor component that enriches Kubernetes resource logs with additional attributes. It takes raw Kubernetes events and adds computed, normalized attributes that make the data more useful for observability.
7+
8+
The processor is designed to be one stage of a larger observability pipeline. The computed attributes and status information it adds are often not directly available from raw Kubernetes API responses, and they make the logs more useful for monitoring and debugging.
11+
12+
## Technical Rationale
13+
14+
This processor shares much of its logic with `kubectl`, the official Kubernetes command-line tool. When you run `kubectl describe <resource>`, kubectl transforms raw Kubernetes API data into human-readable information. Our processor uses the same transformation logic to generate its facets. This means most of the attributes you see in Observe's explorer are computed using the same battle-tested code that powers `kubectl`, ensuring consistency and reliability in how Kubernetes resource states are interpreted and displayed.
15+
16+
Another key reason for a dedicated processor is to compute attributes that cannot be expressed in OTTL (the transformation language used by OpenTelemetry's transform processor). One of OTTL's main limitations is that it cannot iterate over lists or maps. In a custom processor, those elements are available in Go as structured objects, so the data they contain can be accessed and manipulated with ordinary Go code.
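As a rough illustration of the kind of list iteration meant here, the following Go sketch walks a pod's container statuses and derives two counts. The `containerStatus` type and the `summarize` function are hypothetical names chosen for this example; they are not taken from the processor's code.

```go
package main

import "fmt"

// containerStatus is a hypothetical, trimmed-down stand-in for one
// entry of a Pod's status.containerStatuses list.
type containerStatus struct {
	Ready        bool
	RestartCount int
}

// summarize iterates over the list and returns the number of ready
// containers and the sum of all restart counts.
func summarize(statuses []containerStatus) (ready, restarts int) {
	for _, s := range statuses {
		if s.Ready {
			ready++
		}
		restarts += s.RestartCount
	}
	return ready, restarts
}

func main() {
	statuses := []containerStatus{
		{Ready: true, RestartCount: 2},
		{Ready: true, RestartCount: 3},
		{Ready: false, RestartCount: 0},
	}
	ready, restarts := summarize(statuses)
	fmt.Printf("ready=%d restarts=%d\n", ready, restarts) // prints "ready=2 restarts=5"
}
```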
17+
18+
Beyond adding facets, the processor performs another critical function: automatic secret redaction. While preserving the raw event structure and secret names, the processor replaces any secret values with "REDACTED" before ingestion. This security feature protects customers from accidentally exposing sensitive information, even if their Kubernetes clusters inadvertently leak secrets in their events. By performing redaction at the processor level, we ensure that secret values never reach our storage system.
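A minimal sketch of what such a redaction step could look like in Go is shown below. It assumes the Secret has already been decoded into a generic map and that the values to hide live under the `data` and `stringData` keys; both the helper name `redactSecretData` and that choice of fields are assumptions made for illustration, not a description of the processor's actual implementation.

```go
package main

import "fmt"

// redactedValue is the placeholder named in the text above.
const redactedValue = "REDACTED"

// redactSecretData is a hypothetical helper. It overwrites every value
// under the "data" and "stringData" keys of a decoded Secret object,
// leaving key names and the rest of the object untouched.
func redactSecretData(obj map[string]any) {
	for _, field := range []string{"data", "stringData"} {
		values, ok := obj[field].(map[string]any)
		if !ok {
			continue // field absent or not a map: nothing to redact
		}
		for key := range values {
			values[key] = redactedValue
		}
	}
}

func main() {
	secret := map[string]any{
		"kind":     "Secret",
		"metadata": map[string]any{"name": "db-credentials"},
		"data":     map[string]any{"password": "czNjcjN0"},
	}
	redactSecretData(secret)
	fmt.Println(secret["data"]) // prints "map[password:REDACTED]"
}
```

Because only the values are replaced, the secret's name and the names of its keys stay queryable after redaction.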
419

520
## Caveats
6-
This processor currently expects the `kind` field to be set at the base level of the event. In the case of `watch` events from the `k8sobjectsreceiver`, this field is instead present inside of the `object` field. This processor currently expects this field to be lifted from inside the `object` field to the base level by a transform processor earlier in the pipeline. If that isn't set up, this processor will only calculate status for `pull` events from the `k8sobjectsreceiver`.
21+
> [!CAUTION]
22+
> This processor expects the `kind` field at the base level of the event. For `watch` events from the `k8sobjectsreceiver`, `kind` is nested inside the `object` field instead, so a transform processor earlier in the pipeline must lift it to the base level. Without that step, this processor only calculates status for `pull` events from the `k8sobjectsreceiver`.
23+
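To make the expected event shape concrete, here is a small Go sketch of the lifting step described in the caveat. In a real pipeline this is done by a transform processor upstream; the `liftKind` helper below is hypothetical and only mirrors that logic on a decoded event map.

```go
package main

import "fmt"

// liftKind is a hypothetical helper that mimics the upstream transform
// step: if the event has no top-level "kind" but wraps the resource in
// an "object" field (as watch events do), it copies object.kind to the
// top level. It reports whether a top-level kind is present afterwards.
func liftKind(event map[string]any) bool {
	if _, ok := event["kind"].(string); ok {
		return true // kind is already at the base level
	}
	object, ok := event["object"].(map[string]any)
	if !ok {
		return false
	}
	kind, ok := object["kind"].(string)
	if !ok {
		return false
	}
	event["kind"] = kind
	return true
}

func main() {
	watchEvent := map[string]any{
		"type":   "MODIFIED",
		"object": map[string]any{"kind": "Pod"},
	}
	fmt.Println(liftKind(watchEvent), watchEvent["kind"]) // prints "true Pod"
}
```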
24+
## Description
25+
26+
1. **Main Purpose**:
27+
- Processes logs from the **`k8sobjectsreceiver`** (a Kubernetes objects receiver)
28+
- Adds additional attributes and metadata to various Kubernetes resource types
29+
- Calculates and derives status information for different Kubernetes objects
30+
2. **Supported Kubernetes Resources**: The processor handles multiple Kubernetes resource types including:
31+
- Core Resources: Pods, Nodes, Services, ServiceAccounts, Endpoints, ConfigMaps, Secrets
32+
- Apps: StatefulSets, DaemonSets, Deployments
33+
- Workloads: Jobs, CronJobs
34+
- Storage: PersistentVolumes, PersistentVolumeClaims
35+
- Network: Ingress
36+
3. **Key Features**:
37+
- Calculates derived status information for various resources
38+
- Adds metadata and attributes based on resource type
39+
- Processes both "watch" and "pull" events from the Kubernetes API
40+
- Handles resource body transformations and attribute enrichment
41+
4. **Specific Actions Per Resource**: Each resource type has its own set of actions that add specific attributes:
42+
- Pods: Status, container counts, readiness state, and conditions
43+
- Nodes: Status, roles, and node pool information
44+
- Services: Load balancer ingress, selectors, ports, and external IPs
45+
- Jobs: Status and duration calculations
46+
- And many more resource-specific attributes
47+
5. **Important Note**: There's a caveat that the processor expects the **`kind`** field to be at the base level of the event. For watch events, this field needs to be lifted from the **`object`** field to the base level by a transform processor earlier in the pipeline.
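The per-resource actions in item 4 can be pictured as a lookup from `kind` to a list of functions. The sketch below is one possible shape for that idea in Go; the `action` type, the `actionsByKind` table, and the `example_container_statuses` facet are all hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// action is a hypothetical unit of work: it inspects the decoded
// resource and returns one facet key and its computed value.
type action func(resource map[string]any) (key string, value any)

// actionsByKind maps a resource kind to the actions that apply to it.
// The single entry here is a placeholder for illustration only.
var actionsByKind = map[string][]action{
	"Pod": {
		func(r map[string]any) (string, any) {
			status, _ := r["status"].(map[string]any)
			statuses, _ := status["containerStatuses"].([]any)
			return "example_container_statuses", len(statuses)
		},
	},
}

// computeFacets looks up the top-level kind and runs every matching
// action; unknown or missing kinds yield an empty result.
func computeFacets(resource map[string]any) map[string]any {
	facets := map[string]any{}
	kind, _ := resource["kind"].(string)
	for _, run := range actionsByKind[kind] {
		key, value := run(resource)
		facets[key] = value
	}
	return facets
}

func main() {
	pod := map[string]any{
		"kind":   "Pod",
		"status": map[string]any{"containerStatuses": []any{1, 2}},
	}
	fmt.Println(computeFacets(pod)) // prints "map[example_container_statuses:2]"
}
```

A table like this also shows why the top-level `kind` matters: without it, the lookup finds no actions and no facets are added.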
48+
49+
## Example
50+
51+
**Input Example (Kubernetes Pod Event)**:
52+
53+
```json
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "purge-old-datasets-28688813-sspnn",
    "namespace": "eng",
    "labels": {
      "observeinc.com/app": "apiserver",
      "observeinc.com/environment": "eng"
    }
  },
  "status": {
    "containerStatuses": [
      {
        "ready": true,
        "restartCount": 2,
        "state": {"running": {...}}
      },
      {
        "ready": true,
        "restartCount": 3,
        "state": {"running": {...}}
      },
      {
        "ready": false,
        "restartCount": 0,
        "state": {"waiting": {...}}
      }
    ],
    "conditions": [
      // Various pod conditions...
    ]
  }
}
```
88+
89+
**Output (Added OTEL Attributes)**:
90+
91+
```json
{
  // Original event data remains unchanged, but new attributes are added:
  "attributes": {
    "observe_transform": {
      "facets": {
        // Derived status based on pod conditions and state
        "status": "Terminating",

        // Container statistics
        "total_containers": 4,
        "ready_containers": 3,
        "restarts": 5,

        // Pod conditions as a map
        "conditions": {
          "PodScheduled": true,
          "Initialized": true,
          "Ready": false,
          "ContainersReady": false,
          "PodHasNetwork": true
        },

        // If pod has readiness gates
        "readinessGatesReady": 1,
        "readinessGatesTotal": 2
      }
    }
  }
}
```
122+
123+
## Implementation Details
124+
125+
The processor enriches the original Kubernetes event by:
126+
127+
1. **Computing Status**: It analyzes the pod's conditions, container states, and metadata to determine a high-level status (e.g., "Terminating")
128+
2. **Container Statistics**: It calculates:
129+
- Total number of containers of a Pod
130+
- Number of ready containers of a Pod
131+
- Total restart count across all containers of a Pod
132+
3. **Condition Analysis**: It transforms the pod's conditions into an easily queryable map
133+
4. **Readiness Information**: If the pod has readiness gates, it computes how many are ready vs total
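Steps 3 and 4 above can be sketched in Go as follows. This is a simplified illustration, assuming a condition counts as true only when its status is the string `"True"` and that a readiness gate is ready when the condition of the same type is true. The `condition` type and both function names are hypothetical.

```go
package main

import "fmt"

// condition is a hypothetical stand-in for a Pod status condition.
type condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// conditionsToMap turns the condition list into a map keyed by type,
// where the value is true only when the status is "True".
func conditionsToMap(conds []condition) map[string]bool {
	out := make(map[string]bool, len(conds))
	for _, c := range conds {
		out[c.Type] = c.Status == "True"
	}
	return out
}

// readinessGates counts how many of the pod's readiness gate condition
// types are currently true, and how many gates exist in total.
func readinessGates(gates []string, conds map[string]bool) (ready, total int) {
	for _, g := range gates {
		if conds[g] {
			ready++
		}
	}
	return ready, len(gates)
}

func main() {
	conds := conditionsToMap([]condition{
		{Type: "PodScheduled", Status: "True"},
		{Type: "Ready", Status: "False"},
		{Type: "example.com/gate-a", Status: "True"},
	})
	gates := []string{"example.com/gate-a", "example.com/gate-b"}
	ready, total := readinessGates(gates, conds)
	fmt.Println(conds["Ready"], ready, total) // prints "false 1 2"
}
```

A gate with no matching condition, like `example.com/gate-b` here, is treated as not ready because a missing map key reads as `false`.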
134+
135+
Similar transformations happen for other Kubernetes resources. For example:
136+
137+
- For **Services**: It adds load balancer status, external IPs, and port information
138+
- For **Nodes**: It adds derived roles, pool information, and overall status
139+
- For **Jobs**: It adds duration and completion status
140+
- For **Ingress**: It adds routing and backend service information
141+
142+
These enriched attributes make it much easier to:
143+
144+
1. Query and filter Kubernetes resources based on their state
145+
2. Create meaningful visualizations and dashboards
146+
3. Set up monitoring and alerting based on derived states
147+
4. Analyze the health and status of your Kubernetes resources
148+
149+
## Added Attributes
150+
151+
The processor adds a list of attributes under the `observe_transform.facets` namespace. The processor computes these attributes based on the raw Kubernetes resource state and adds them to make querying and monitoring easier. Note that some attributes might be conditionally present based on the resource state. For example, load balancer information will only be present for Services of type LoadBalancer, and certain status attributes will only appear when specific conditions are met.
152+
153+
### **Pod Attributes**
154+
155+
- **`observe_transform.facets.status`** - Overall pod status
156+
- **`observe_transform.facets.total_containers`** - Total number of containers
157+
- **`observe_transform.facets.ready_containers`** - Number of ready containers
158+
- **`observe_transform.facets.restarts`** - Total restart count
159+
- **`observe_transform.facets.readinessGatesReady`** - Number of ready readiness gates
160+
- **`observe_transform.facets.readinessGatesTotal`** - Total number of readiness gates
161+
- **`observe_transform.facets.conditions`** - Map of pod conditions
162+
- **`observe_transform.facets.cronjob_name`** - Name of parent CronJob if applicable
163+
- **`observe_transform.facets.statefulset_name`** - Name of parent StatefulSet if applicable
164+
- **`observe_transform.facets.daemonset_name`** - Name of parent DaemonSet if applicable
165+
166+
### **Node Attributes**
167+
168+
- **`observe_transform.facets.status`** - Node status
169+
- **`observe_transform.facets.roles`** - Node roles (e.g., master, worker)
170+
- **`observe_transform.facets.pool`** - Node pool information (for managed K8s services)
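For context on where a node's roles can come from: `kubectl` derives the roles it prints from node labels that start with `node-role.kubernetes.io/`. The Go sketch below shows that label convention in simplified form. The `nodeRoles` function is a hypothetical helper, and whether the processor derives `roles` in exactly this way is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// roleLabelPrefix is the well-known label prefix that marks node roles.
const roleLabelPrefix = "node-role.kubernetes.io/"

// nodeRoles is a hypothetical helper that collects the role names
// encoded in a node's label keys and returns them sorted.
func nodeRoles(labels map[string]string) []string {
	var roles []string
	for key := range labels {
		if !strings.HasPrefix(key, roleLabelPrefix) {
			continue
		}
		if role := strings.TrimPrefix(key, roleLabelPrefix); role != "" {
			roles = append(roles, role)
		}
	}
	sort.Strings(roles) // map iteration order is random, so sort for stable output
	return roles
}

func main() {
	labels := map[string]string{
		"kubernetes.io/hostname":                "node-1",
		"node-role.kubernetes.io/control-plane": "",
		"node-role.kubernetes.io/worker":        "",
	}
	fmt.Println(nodeRoles(labels)) // prints "[control-plane worker]"
}
```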
171+
172+
### **Service Attributes**
173+
174+
- **`observe_transform.facets.loadBalancerIngress`** - Load balancer ingress information
175+
- **`observe_transform.facets.selector`** - Service selector labels
176+
- **`observe_transform.facets.ports`** - Service ports configuration
177+
- **`observe_transform.facets.externalIPs`** - External IPs assigned to the service
178+
179+
### **Job Attributes**
180+
181+
- **`observe_transform.facets.status`** - Job status
182+
- **`observe_transform.facets.duration`** - Job duration
183+
184+
### **ServiceAccount Attributes**
185+
186+
- **`observe_transform.facets.secrets`** - Associated secrets
187+
- **`observe_transform.facets.secretsNames`** - Names of associated secrets
188+
- **`observe_transform.facets.imagePullSecrets`** - Image pull secrets
189+
190+
### **Endpoints Attributes**
191+
192+
- **`observe_transform.facets.endpoints`** - List of endpoints
193+
194+
### **ConfigMap Attributes**
195+
196+
- **`observe_transform.facets.data`** - ConfigMap data
197+
198+
### **StatefulSet Attributes**
199+
200+
- **`observe_transform.facets.selector`** - StatefulSet selector labels
201+
202+
### **Deployment Attributes**
203+
204+
- **`observe_transform.facets.selector`** - Deployment selector labels
205+
206+
### **Ingress Attributes**
207+
208+
- **`observe_transform.facets.loadBalancer`** - Load balancer information
7209

8-
## Emitted Attributes
210+
### **PersistentVolume and PersistentVolumeClaim Attributes**
9211

10-
| Attribute Key | Description |
11-
|-----------------------------------|--------------------------------------------------------------|
12-
| `observe_transform.facets.status` | The derived Pod status based on the current Pod description. |
212+
- Various storage-related attributes (specific attributes depend on the storage provider)
