Add documentation for observability and port forwarding #129
Signed-off-by: Sam Pastoriza <spastoriza@nvidia.com>
Greptile Summary
This PR adds two new documentation sections to the AI-Q blueprint: a comprehensive Observability guide covering Phoenix, LangSmith, Weights & Biases Weave, the OTEL Collector with privacy redaction, and verbose logging; and a VM / Remote Development troubleshooting section explaining SSH port forwarding. Supporting changes include updating
Confidence Score: 5/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[AI-Q Application] --> B{Observability Backend?}
B -->|Phoenix| C[arize-phoenix server\nlocalhost:6006]
B -->|LangSmith| D[LangSmith Cloud\nvia LANGCHAIN_* env vars]
B -->|Weave| E[W&B Weave\nvia WANDB_API_KEY + YAML config]
B -->|OTEL Collector| F[otelcollector_redaction exporter]
B -->|Verbose| G[Console / stdout]
F --> H{Redaction enabled?}
H -->|Yes| I[Redact PII / sensitive attrs]
H -->|No| J[Forward spans as-is]
I --> K[OTEL Collector\nJaeger / Tempo / Datadog]
J --> K
C -->|Trace UI| L[localhost:6006 UI]
D -->|Cloud dashboard| M[smith.langchain.com]
E -->|Cloud dashboard| N[wandb.ai]
Last reviewed commit: 7ed9d79
AjayThorve
left a comment
One typo, otherwise looks great!
### Batch Configuration

The exporter supports standard OTEL batch settings:

```yaml
general:
  telemetry:
    tracing:
      otel:
        _type: otelcollector_redaction
        endpoint: http://your-otel-collector:4318/v1/traces
        batch_size: 512
        flush_interval: 5000
        max_queue_size: 2048
        drop_on_overflow: false
        shutdown_timeout: 30000
```
Batch configuration fields undocumented — units unclear for time values
The "Batch Configuration" section shows five fields in a code block but none of them appear in a reference table. In particular, flush_interval: 5000 and shutdown_timeout: 30000 are ambiguous — a user unfamiliar with OTEL exporters cannot tell whether these are in milliseconds, seconds, or some other unit from the snippet alone.
Consider either extending the existing configuration reference table or adding a small table here:
| Field | Description |
|---|---|
| `batch_size` | Maximum number of spans per export batch. |
| `flush_interval` | Interval in milliseconds between automatic flushes (default: 5000 = 5 s). |
| `max_queue_size` | Maximum number of spans held in the queue before exporting. |
| `drop_on_overflow` | Whether to drop spans when the queue reaches `max_queue_size`. |
| `shutdown_timeout` | Maximum time in milliseconds to wait for in-flight spans on shutdown (default: 30000 = 30 s). |
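To make the unit question concrete, here is a minimal Python sketch of how the five batch fields could be read and sanity-checked. It assumes the time values are milliseconds, as the suggested table proposes; that has not been confirmed against the exporter's source. `BatchSettings`, `validate`, and `describe` are hypothetical names for illustration only, not part of the exporter, and the `batch_size <= max_queue_size` check is borrowed from the OpenTelemetry SDK batch span processor convention rather than from this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchSettings:
    """Hypothetical mirror of the YAML batch fields; not the exporter's real class."""

    batch_size: int = 512
    flush_interval: int = 5000      # assumed to be milliseconds
    max_queue_size: int = 2048
    drop_on_overflow: bool = False
    shutdown_timeout: int = 30000   # assumed to be milliseconds

    def validate(self) -> None:
        """Reject combinations that cannot work under the assumptions above."""
        if self.batch_size <= 0 or self.max_queue_size <= 0:
            raise ValueError("batch_size and max_queue_size must be positive")
        if self.batch_size > self.max_queue_size:
            raise ValueError("batch_size should not exceed max_queue_size")
        if self.flush_interval <= 0 or self.shutdown_timeout <= 0:
            raise ValueError("time values must be positive")

    def describe(self) -> str:
        """Render the time values in seconds so the unit is explicit."""
        return (
            f"flush every {self.flush_interval / 1000:g} s, "
            f"wait up to {self.shutdown_timeout / 1000:g} s on shutdown"
        )


settings = BatchSettings()
settings.validate()
print(settings.describe())  # flush every 5 s, wait up to 30 s on shutdown
```

Spelling out the converted values like this, in the docs or in a startup log line, would remove the ambiguity even if the YAML keys keep their current names.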
What does this PR do?