
feat(config): add optional metadata dict to workflow definition (#107)

Merged
jrob5756 merged 8 commits into microsoft:main from PolyphonyRequiem:feat/workflow-metadata on May 1, 2026

Conversation

PolyphonyRequiem (Member) commented Apr 21, 2026

Summary

Adds an optional metadata dict to workflow configuration with two binding paths:

  1. Static — declared in the workflow YAML
  2. Dynamic — injected at runtime via --metadata / -m CLI flags

Both are merged (CLI wins on conflicts) and included verbatim in the workflow_started event, enabling downstream consumers to adapt behavior without parsing YAML source.

Closes #106

Changes

  • src/conductor/config/schema.py: Add metadata: dict[str, Any] field to WorkflowDef (empty dict default)
  • src/conductor/engine/workflow.py: Include metadata in the workflow_started event data
  • src/conductor/cli/app.py: Add --metadata / -m CLI option, parsed separately from --input
  • src/conductor/cli/run.py: Accept metadata param, merge CLI metadata on top of YAML metadata after config load
  • src/conductor/cli/bg_runner.py: Forward --metadata flags to background child process
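The merge semantics above (CLI metadata layered on top of YAML metadata, CLI winning on conflicts) boil down to a plain dict update. A minimal sketch — `merge_metadata` is an illustrative helper name, not necessarily what the PR's code calls it:

```python
from typing import Any


def merge_metadata(yaml_metadata: dict[str, Any],
                   cli_metadata: dict[str, str]) -> dict[str, Any]:
    """Merge CLI metadata on top of YAML metadata; CLI wins on conflicts."""
    merged = dict(yaml_metadata)   # copy so the loaded config isn't mutated
    merged.update(cli_metadata)    # later (CLI) values overwrite earlier (YAML) ones
    return merged


print(merge_metadata({"tracker": "ado", "env": "dev"}, {"env": "prod"}))
# → {'tracker': 'ado', 'env': 'prod'}
```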

Example

YAML (static)

workflow:
  name: twig-sdlc
  entry_point: intake
  metadata:
    tracker: ado
    project_url: https://dev.azure.com/org/Project

CLI (dynamic, merged on top)

conductor run twig-sdlc.yaml --metadata work_item_id=1814

Result in event log

{
  "type": "workflow_started",
  "data": {
    "name": "twig-sdlc",
    "metadata": {
      "tracker": "ado",
      "project_url": "https://dev.azure.com/org/Project",
      "work_item_id": "1814"
    }
  }
}

Backward Compatibility

  • metadata defaults to {} — existing workflows and CLI invocations are completely unaffected
  • --metadata is optional — omitting it changes nothing
  • No changes to event format beyond the additive metadata key
  • All existing schema tests pass (100/100)

Daniel Green and others added 4 commits April 21, 2026 12:10
Add a metadata field to WorkflowDef that allows workflow authors to
attach arbitrary key-value pairs for external tooling. The metadata
is included verbatim in the workflow_started event, enabling
downstream consumers (dashboards, trackers, enrichers) to adapt
behavior without parsing the YAML source.

Example usage in workflow YAML:
  workflow:
    name: twig-sdlc
    metadata:
      tracker: ado
      project_url: https://dev.azure.com/org/Project
      work_item_id_agent: intake
      work_item_id_field: epic_id

The field defaults to an empty dict, so existing workflows are
unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add --metadata / -m flag to 'conductor run' that accepts key=value
pairs, merged on top of YAML-declared metadata. This enables callers
to inject dynamic values at invocation time:

    conductor run twig-sdlc.yaml --metadata work_item_id=1814

CLI metadata is:
- Parsed separately from --input (different binding path)
- Merged on top of YAML metadata (CLI wins on conflicts)
- Forwarded through --web-bg background process spawning
- Included in the workflow_started event alongside YAML metadata

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
7 new tests verifying:
- Schema: metadata defaults to empty dict, accepts arbitrary keys,
  independent from input/context fields
- Loader: metadata round-trips through YAML, omission gives empty
  dict, nested values preserved, metadata and input are separate
  namespaces

All 140 config tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
codecov-commenter commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 62.85714% with 26 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@873c72b). Learn more about missing BASE report.

Files with missing lines         | Patch % | Missing lines
src/conductor/cli/run.py         | 20.00%  | 12 ⚠️
src/conductor/web/server.py      | 33.33%  | 8 ⚠️
src/conductor/cli/bg_runner.py   | 0.00%   | 3 ⚠️
src/conductor/engine/workflow.py | 93.10%  | 2 ⚠️
src/conductor/cli/app.py         | 66.66%  | 1 ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #107   +/-   ##
=======================================
  Coverage        ?   84.67%           
=======================================
  Files           ?       53           
  Lines           ?     7232           
  Branches        ?        0           
=======================================
  Hits            ?     6124           
  Misses          ?     1108           
  Partials        ?        0           


Daniel Green and others added 2 commits April 22, 2026 09:56
Propagate the event log's random hex suffix as a run_id across all
systems:

- EventLogSubscriber: expose run_id property (was already generated)
- WorkflowEngine: accept run_id + log_file params, include in
  workflow_started event
- PID files: include run_id + log_file fields
- Web dashboard: add /api/info endpoint returning run_id, log_file,
  workflow_name, started_at, metadata

This enables the central dashboard to match per-run dashboards to
event logs by exact run_id instead of fragile name/timestamp heuristics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Auto-inject runtime diagnostics (PID, platform, Python version, cwd,
conductor version, started_at, run_id, log_file, bg_mode) into the
workflow_started event. Dashboard port/URL included when --web is active;
parent_pid included in --web-bg mode.

System metadata flows through:
- JSONL event log (via EventLogSubscriber)
- Web dashboard /api/info endpoint
- Checkpoint files (for resume context)

PID files are intentionally left unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 (Collaborator) left a comment

Really cool. Thanks for contributing. I have a few comments. Please take a look and we can merge!

Comment thread src/conductor/engine/workflow.py Outdated
"platform": sys.platform,
"python_version": _platform.python_version(),
"conductor_version": self._conductor_version(),
"cwd": os.getcwd(),
jrob5756 (Collaborator):

Critical: os.getcwd() can raise unhandled OSError

If the working directory is deleted between process start and this call (CI runners, containers, temp-dir cleanup), this raises FileNotFoundError/OSError. Unlike _conductor_version(), which is wrapped in try/except, this call has no protection — and it runs at the top of _execute_loop(), so an unhandled exception here crashes the entire workflow with a confusing error before any agent runs.

try:
    cwd = os.getcwd()
except OSError:
    cwd = "<unavailable>"

Or wrap the entire _build_system_metadata() body in try/except to match _conductor_version()'s pattern.

PolyphonyRequiem (Member, Author):

Good catch! Wrapped os.getcwd() in try/except — falls back to '' on OSError. Matches the defensive pattern used by _conductor_version().

Comment thread src/conductor/web/server.py Outdated
"workflow_name": data.get("name", ""),
"started_at": event.get("timestamp", 0),
"metadata": data.get("metadata", {}),
"system": data.get("system", {}),
jrob5756 (Collaborator):

Critical: Unauthenticated endpoint exposes sensitive system info

The system dict contains PID, filesystem paths (cwd, log_file), platform details, and parent PID — all served via an unauthenticated HTTP GET to any network client that can reach the dashboard port.

Consider:

  • Omitting system from /api/info entirely (it's still in the event log for diagnostics)
  • Or limiting to non-sensitive fields only (e.g., conductor_version, started_at)

PolyphonyRequiem (Member, Author):

Agreed — stripped the system dict from /api/info entirely. The endpoint now only returns non-sensitive fields: run_id, workflow_name, started_at, metadata, and conductor_version. Full diagnostics remain in the event log for debugging.

Comment thread src/conductor/cli/app.py Outdated
# Parse --metadata key=value flags (separate from inputs)
cli_metadata: dict[str, str] = {}
if raw_metadata:
cli_metadata.update(parse_input_flags(raw_metadata))
jrob5756 (Collaborator):

Important: parse_input_flags silently coerces metadata values

parse_input_flags calls coerce_value(), which converts "42" → int, "true" → bool, "null" → None. So --metadata work_item_id=42 silently becomes {"work_item_id": 42} (int, not string). The type annotation says dict[str, str], but actual values will be int | bool | None | list | dict.

Metadata values should stay as strings since they're opaque key-value pairs. Consider a dedicated parse_metadata_flags() that splits on first = without coercion, or add a coerce=False parameter to parse_input_flags.

PolyphonyRequiem (Member, Author):

Great point — added a dedicated parse_metadata_flags() that splits on first = without any coercion. Metadata values stay as raw strings, keeping the dict[str, str] annotation honest.
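The fix described in this thread — split on the first `=`, no coercion — is a one-liner with `str.partition`. A sketch of what `parse_metadata_flags()` plausibly looks like; the exact error handling is an assumption:

```python
def parse_metadata_flags(raw_metadata: list[str]) -> dict[str, str]:
    """Parse key=value flags, splitting on the FIRST '=' and never coercing values."""
    parsed: dict[str, str] = {}
    for item in raw_metadata:
        key, sep, value = item.partition("=")  # only the first '=' splits
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        parsed[key] = value  # stays a raw string, even "42" or "true"
    return parsed


print(parse_metadata_flags(["work_item_id=42", "url=https://x/y?a=b"]))
# → {'work_item_id': '42', 'url': 'https://x/y?a=b'}
```

Note that `partition` keeps values containing `=` (like URLs with query strings) intact, which a naive `split("=")` would mangle.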

Comment thread src/conductor/cli/bg_runner.py Outdated
# Forward metadata
if metadata:
for key, value in metadata.items():
cmd.extend(["--metadata", f"{key}={value}"])
jrob5756 (Collaborator):

Important: Missing _serialize_value() — nested metadata breaks in background mode

Inputs (line 110) use _serialize_value(value) to handle non-string types (dicts, lists → JSON). Metadata uses bare f"{key}={value}". If YAML metadata contains nested dicts:

metadata:
  config:
    base_url: https://example.com

…this produces config={'base_url': 'https://example.com'} (Python repr, not JSON), which fails to parse on the child side.

Suggested change:
-    cmd.extend(["--metadata", f"{key}={value}"])
+    cmd.extend(["--metadata", f"{key}={_serialize_value(value)}"])

PolyphonyRequiem (Member, Author):

Nice find — switched to _serialize_value(value) so nested dicts/lists get proper JSON serialization instead of Python repr. Matches the pattern already used for inputs on line 110.

Comment thread src/conductor/cli/run.py Outdated
web_dashboard=dashboard,
run_id=event_log_subscriber.run_id if event_log_subscriber else "",
log_file=str(event_log_subscriber.path) if event_log_subscriber else "",
dashboard_port=(dashboard._actual_port if dashboard is not None else None),
jrob5756 (Collaborator):

Important: Accessing private dashboard._actual_port

_actual_port is a private attribute. Consider adding a public @property port on WebDashboard that returns self._actual_port or self._port, then use dashboard.port here.

PolyphonyRequiem (Member, Author):

Added a public port property on WebDashboard that returns _actual_port or _port. Updated run.py to use dashboard.port instead of reaching into the private attribute.

Comment thread src/conductor/engine/workflow.py Outdated
run_id: str = "",
log_file: str = "",
dashboard_port: int | None = None,
bg_mode: bool = False,
jrob5756 (Collaborator):

Suggestion: Consider grouping informational params into a dataclass

These 4 new params (run_id, log_file, dashboard_port, bg_mode) are purely informational — not used for orchestration, only passed through to event data. The constructor already has 10 params; this brings it to 14.

Consider grouping into a dataclass:

@dataclass
class RunContext:
    run_id: str = ""
    log_file: str = ""
    dashboard_port: int | None = None
    bg_mode: bool = False

Then pass a single run_context: RunContext | None = None parameter. This keeps the constructor clean and makes future additions trivial.

PolyphonyRequiem (Member, Author):

Went ahead and did this in this pass — added a RunContext dataclass grouping run_id, log_file, dashboard_port, and bg_mode. WorkflowEngine constructor now takes a single run_context param instead of four. Should make future additions painless.

Daniel Green and others added 2 commits April 27, 2026 14:57
- Guard os.getcwd() with try/except OSError in _build_system_metadata()
- Strip sensitive system info (PID, cwd, log_file) from /api/info endpoint
- Add parse_metadata_flags() to keep metadata values as raw strings (no coercion)
- Use _serialize_value() for metadata in bg_runner to handle nested dicts
- Add public WebDashboard.port property, stop accessing _actual_port externally
- Group informational params (run_id, log_file, dashboard_port, bg_mode) into RunContext dataclass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 (Collaborator) left a comment

Thanks

@jrob5756 jrob5756 merged commit fdacd32 into microsoft:main May 1, 2026
7 checks passed
@PolyphonyRequiem PolyphonyRequiem deleted the feat/workflow-metadata branch May 1, 2026 00:53
jrob5756 added a commit that referenced this pull request May 4, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 added a commit that referenced this pull request May 4, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 added a commit that referenced this pull request May 4, 2026
…ion (#129)

* fix(copilot): pass streaming=True to SDK to prevent tool-call truncation

The Copilot SDK's create_session accepts a 'streaming' parameter that
defaults to false. In non-streaming mode the model must emit its entire
turn (text + tool_use blocks + arguments) under a single per-turn output
budget. For agents that issue large tool-call arguments — most commonly
'create' with multi-KB 'file_text' — that budget is exhausted mid-JSON
and the CLI silently executes the partial tool call (path only, no
file_text). The model sees the tool succeed with no content, retries the
same broken call, and loops indefinitely until the wall-clock session
limit fires (default 1800s). The interactive 'copilot' CLI defaults to
streaming, which is why the same model + tool combination works there
but not via the SDK without this flag.

Empirically verified red→green on the same workflow + model
(claude-opus-4.7-1m-internal, single ~50 KB create tool call):
- Without streaming=True: 9m08s wall-clock failure, 0 bytes written
  (ProviderError: tool 'create' was executing).
- With streaming=True: 4m57s success, 62 KB written in a single
  create call.

Tests:
- tests/test_providers/test_copilot_streaming.py — unit test that
  verifies create_session is called with streaming=True (and that the
  existing required kwargs are preserved).
- tests/test_integration/test_copilot_large_write.py — opt-in
  (real_api marker) regression test that builds a workflow inline,
  asks the writer agent to produce a single large create call, and
  asserts the file is at least 30 KB. Skips automatically when no
  copilot CLI is available.
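The fix this commit describes is a single keyword argument at session creation. A hedged sketch using a stub client — `create_session` and its `streaming` parameter are taken from the commit message above, and `open_session`/`FakeClient` are illustrative names, not the repository's real code:

```python
class FakeClient:
    """Stand-in for the real SDK client; records the kwargs it receives."""
    def create_session(self, **kwargs):
        return kwargs


def open_session(client, model: str, tools: list):
    # streaming=True streams the turn (text + tool-call arguments) incrementally,
    # avoiding the per-turn output budget that truncated large non-streaming
    # tool calls mid-JSON
    return client.create_session(model=model, tools=tools, streaming=True)


session = open_session(FakeClient(), "some-model", [])
print(session["streaming"])  # → True
```

A unit test along these lines is what the commit's `test_copilot_streaming.py` is described as doing: assert the flag is forwarded while the existing required kwargs are preserved.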

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add changelog entry for streaming fix (#129)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add #107 and #109 to unreleased changelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add #100, #110, #111, #139, #142, #143, #144 to unreleased changelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
