All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Workflows that configure `reasoning.effort` (or workflow-wide `runtime.default_reasoning_effort`) on the Copilot provider were broken for every named Copilot model when running against `github-copilot-sdk` 0.3.0. The SDK's `models.list` response includes a `billing` object on every model, but none of them currently ship the `multiplier` field that the SDK's `ModelBilling.from_dict` parser treats as required, so every model in the response triggers `ValueError("Missing required field 'multiplier' in ModelBilling")`, which kills the entire `list_models()` call. The error then leaked through the narrow `except` tuple in `_validate_reasoning_effort_for_model` (and `get_max_prompt_tokens`), poisoned the retry loop, and surfaced as `Dialog turn failed: …` after three wasted attempts. (`get_max_prompt_tokens` was rescued by the engine's outer `except Exception`, so context-window metadata was silently unavailable rather than fatal.) Both metadata methods now catch any `Exception` raised at the SDK boundary and treat the failure as "metadata unavailable": validation is skipped permissively and the configured `reasoning_effort` is forwarded to `create_session` as before. `asyncio.CancelledError` / `KeyboardInterrupt` / `SystemExit` (all `BaseException` subclasses) still propagate.
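The boundary handling described in this entry can be sketched as follows. This is a minimal illustration, not the project's code: `fetch_model_metadata` and the two stand-in callables are hypothetical names, and the only behavior shown is that `except Exception` swallows the parser's `ValueError` while cancellation still propagates.

```python
import asyncio


def fetch_model_metadata(list_models):
    """Return the model list, or None when metadata is unavailable.

    `list_models` stands in for the SDK call; the name is hypothetical.
    """
    try:
        return list_models()
    except Exception:
        # Any failure at the SDK boundary is treated as "metadata unavailable".
        # BaseException subclasses are not caught here, so they still propagate.
        return None


def broken_list_models():
    # Mimics the parser failure quoted in the entry above.
    raise ValueError("Missing required field 'multiplier' in ModelBilling")


def cancelled_list_models():
    # CancelledError derives from BaseException (Python 3.8+).
    raise asyncio.CancelledError()
```

A caller that receives `None` would skip validation and forward the configured effort unchanged, which matches the permissive behavior the entry describes.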
0.1.16 - 2026-05-14
- `type: workflow` agents now accept registry references (`workflow[@registry][#ref]`) in the `workflow:` field, not just local file paths. Resolution prefers a local file when one exists relative to the parent workflow directory (preserves backward compatibility for extensionless local refs); otherwise the value is parsed as a registry reference, fetched via the registry cache, and executed from the cached location. `conductor validate` now recursively validates fetched sub-workflows with cycle detection (inode-based identity, so case-variant paths on macOS/Windows collapse correctly) and a depth cap of 10; when the cap is hit a warning surfaces so users know validation was truncated rather than silently clean. Mutable registry refs (`name@registry#main`, or no `#ref`) may resolve to a different commit on `conductor resume` if the upstream branch has moved; pinned tags or commit SHAs guarantee deterministic resume (#188).
- Conductor now ships as a Claude Code plugin marketplace at the repo root. Users can install the conductor skill directly from `microsoft/conductor` with `/plugin marketplace add microsoft/conductor` followed by `/plugin install conductor@conductor`. The plugin ships markdown only (no `bin/`, hooks, MCP servers, or executables), keeping the trust surface minimal. The same `SKILL.md` remains usable via `gh skill install microsoft/conductor conductor` for Copilot CLI users. The previous `.claude/skills/conductor` location was removed; the plugin is now the single home for the skill. For local development on the skill itself, use `claude --plugin-dir plugins/conductor` (#186).
- The bundled Conductor skill (`SKILL.md` + references) was refreshed to reflect the current CLI, schema, and feature set: `show` / `replay` / `--metadata` / `--workspace-instructions` quick-reference entries; new `type: workflow`, `dialog`, `retry`, `hooks`, `metadata`, `instructions`, `timeout_seconds`, and `openai-agents` provider concepts; corrected `update` behavior (default prints the install-script one-liner, `--apply` launches the installer); `CONDUCTOR_NO_UPDATE_CHECK`; registry `latest = branch HEAD` and `#ref` syntax; sub-workflow agents and dialog mode authoring guidance; script JSON-stdout auto-merge; `workflow.dir` / `workflow.file` template variables; and unknown-fields rejection in schema validation (#187).
- README "Why Conductor?" rewritten around three pillars (repeatable execution, deterministic routing, and version-controlled YAML workflows) and now leads with the real differentiator (zero-token orchestration) using concrete use-case examples (#185).
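To make the `workflow[@registry][#ref]` shape concrete, here is a small parsing sketch. It is an assumption about how such a string could be split, not Conductor's actual parser; `RegistryRef` and `parse_registry_ref` are hypothetical names, and the local-file-first check described above is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryRef:
    name: str
    registry: Optional[str]  # None when no "@registry" part is given
    ref: Optional[str]       # None when no "#ref" part is given


def parse_registry_ref(value: str) -> RegistryRef:
    """Split 'name[@registry][#ref]' into its three parts."""
    # Everything after the first "#" is the ref (tag, branch, or SHA).
    head, _, ref = value.partition("#")
    # Everything after the first "@" in the remainder is the registry.
    name, _, registry = head.partition("@")
    return RegistryRef(name=name, registry=registry or None, ref=ref or None)
```

A reference with `ref=None` is the mutable case the entry warns about: it can resolve to a different commit on resume.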
0.1.15 - 2026-05-13
- Per-agent `timeout_seconds` field for hard wall-clock timeouts on agent execution. Wraps execution in `asyncio.wait_for()` at the engine level so a slow agent no longer blocks the rest of the workflow. Effective timeout is `min(agent.timeout_seconds, remaining_workflow_timeout)`; when the workflow timeout is stricter it owns the error so attribution is never mislabeled. Raises a new `AgentTimeoutError` (subclass of `TimeoutError`) honored by existing `fail_fast` / `continue_on_error` semantics in parallel and for-each groups, and emits an `agent_timeout` event (with elapsed time and limit) for console + dashboard subscribers. Scoped to provider-backed agents; rejected on `script`, `human_gate`, and `workflow` types (#150).
- Auto-discovery of `.github/instructions/**/*.instructions.md` workspace conventions, matching GitHub Copilot's documented semantics. Files marked `applyTo: "**"` in their frontmatter are loaded into the workspace preamble alongside `AGENTS.md` / `CLAUDE.md` / `.github/copilot-instructions.md`; scoped (`applyTo: "<glob>"`) and absent-`applyTo` files are skipped per the convention's manual-attach default. The internal `CONVENTION_FILES: list[str]` table is refactored to a polymorphic `CONVENTIONS: list[Convention]` (`ConventionFile | ConventionDirectory`) so adding new conventions (Cursor rules, Cline rules, etc.) becomes one filter function plus one list entry; a `CONVENTION_FILES` module-level alias preserves backward compatibility for downstream imports (#169).
- `agent.system_prompt` is now rendered and forwarded to providers. The executor was rendering `agent.system_prompt` only to discard the result (`_ = self.renderer.render(...)`), so providers that forward `system_prompt` (notably the Copilot provider, which concatenates it into the prompt) received the un-rendered Jinja template. Agents whose instructions lived in `system_prompt` sent literal `{{ ... }}` placeholders to the model and got back "the prompt template contains unfilled variables" refusals. Also adds a `conductor validate` warning for agents that define `system_prompt` but no `prompt:` (a portability hazard since the Claude provider drops `system_prompt` entirely, and almost always a missing-`prompt:` typo) (#179).
- `conductor update` on Windows no longer attempts an in-process self-upgrade. The previous flow tried to re-install into the same venv the running `python.exe` lives in, producing "Access is denied" failures that earlier mitigations only papered over. `conductor update` now checks for a newer version and prints the OS-appropriate `install.ps1` / `install.sh` one-liner, and the install scripts become the single upgrade path: they detect other running conductor processes (auto-stopping under `-Yes`), sweep stale `*.exe.old` files, retry with backoff (2s / 5s / 10s), and, when uv can't remove the `conductor-cli` tool dir because of file locks, rename the whole dir aside and retry. `install.sh` reaches parity with `--yes` / `--force` / `--source` flags, retry-with-backoff, running-process detection, and a post-install `conductor --version` verify (#171).
- `install.ps1` is now stored without a UTF-8 BOM. The documented one-liner `irm https://aka.ms/conductor/install.ps1 | iex` returns the script body as a single string with the BOM surviving as U+FEFF at index 0; PowerShell's in-memory `iex` parser then trips on the `[CmdletBinding()]` attribute with `Unexpected attribute 'CmdletBinding'`. Both fresh installs via `irm | iex` and `conductor update --apply` (which re-runs the same command in a spawned console) now succeed. Direct `powershell.exe -File install.ps1` invocations were unaffected, which is why prior file-based integration tests didn't catch it (#178).
- `conductor stop` (including `--all` and `--port`) no longer crashes on Windows when a PID file exists in `~/.conductor/runs/`. The Unix idiom `os.kill(pid, 0)` for liveness probing is not a no-op on Windows: any signal other than `CTRL_C_EVENT` / `CTRL_BREAK_EVENT` routes through `TerminateProcess` and can raise `OSError` subclasses outside `ProcessLookupError` / `PermissionError` (e.g. `WinError 11 / ERROR_BAD_FORMAT`), and even "successful" calls would actually terminate the target with exit code 0. `_is_process_alive()` now dispatches to a Windows-specific implementation using `OpenProcess` + `GetExitCodeProcess` for a truly non-destructive liveness check (#176).
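The effective-timeout rule from the `timeout_seconds` entry (#150) can be sketched like this. It is a simplified illustration under stated assumptions: `run_agent` and `WorkflowTimeoutError` are hypothetical names, and the real engine also emits events and integrates with group error handling, which is omitted here.

```python
import asyncio


class AgentTimeoutError(TimeoutError):
    """Raised when the per-agent limit is the one that fired."""


class WorkflowTimeoutError(TimeoutError):
    """Hypothetical name for the workflow-level timeout error."""


async def run_agent(coro, agent_timeout, remaining_workflow_timeout):
    # The stricter of the two limits bounds the call.
    effective = min(agent_timeout, remaining_workflow_timeout)
    try:
        return await asyncio.wait_for(coro, timeout=effective)
    except asyncio.TimeoutError:
        # Attribute the failure to whichever limit was actually binding.
        if remaining_workflow_timeout < agent_timeout:
            raise WorkflowTimeoutError(
                f"workflow timed out after {effective}s"
            ) from None
        raise AgentTimeoutError(f"agent timed out after {effective}s") from None
```

Because `AgentTimeoutError` subclasses `TimeoutError`, callers that already catch `TimeoutError` keep working without changes.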
0.1.14 - 2026-05-06
- `conductor update` no longer reports its own launching shim as another running Conductor process. On Windows the `conductor.exe` shim is a separate process from the Python interpreter that runs the update command, so excluding only `os.getpid()` caused a false "1 other Conductor process is running" warning. The check now walks the full ancestor PID chain (via `wmic` on Windows, `ps` elsewhere) and excludes every process along the way, falling back to `{getpid(), getppid()}` if the parent map cannot be built (#164).
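The ancestor walk can be illustrated with a plain PID-to-parent mapping. This sketch assumes the map has already been built (the entry says the real code obtains it from `wmic` or `ps`); `ancestor_chain` and `pids_to_exclude` are hypothetical names.

```python
import os


def ancestor_chain(pid, parent_of):
    """Return pid plus every ancestor reachable through the parent map."""
    chain = set()
    current = pid
    # Stop at the root (no parent entry) or if the map contains a cycle.
    while current is not None and current not in chain:
        chain.add(current)
        current = parent_of.get(current)
    return chain


def pids_to_exclude(parent_of):
    """PIDs that belong to this invocation and should not count as 'other'."""
    if not parent_of:
        # Fallback described in the entry: only self and direct parent.
        return {os.getpid(), os.getppid()}
    return ancestor_chain(os.getpid(), parent_of)
```

With a map of `{300: 200, 200: 100}`, the chain for PID 300 covers the interpreter, its shim, and the shell that launched it.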
0.1.13 - 2026-05-06
- `conductor resume` is now at flag parity with `conductor run`. New flags: `--provider` / `-p` (runtime provider override), `--metadata` / `-m` (CLI metadata merged on top of YAML metadata), `--web` (real-time dashboard for the resumed run), `--web-port`, and `--web-bg` (fork a detached resume + dashboard process). `--web` and `--web-bg` are mutually exclusive, matching `run`. The dashboard only shows events from the resumed agent forward; agent runs that completed before the checkpoint were emitted in the original process and are not replayed. `--input`, `--workspace-instructions`, `--instructions`, and `--dry-run` are intentionally not mirrored (#158).
- Reasoning effort (`low` / `medium` / `high` / `xhigh`) is now displayed in the web dashboard under each agent's metadata, right after `Model`. Effective value is per-agent `reasoning.effort` if set, otherwise `runtime.default_reasoning_effort`, otherwise omitted. Backed by a new `reasoning_effort` field on the `workflow_started` event payload, so older event log JSONL files replay gracefully (the row simply doesn't render) (#160).
- New `iteration_limit_reached` and `iteration_limit_resolved` events are emitted when a workflow hits its `max_iterations` cap. Previously the console showed an interactive `IntPrompt` while the web dashboard went silently dark; the dashboard now renders the prompt state and the chosen resolution. The `iteration_limit_reached` payload includes a `possible_loop` heuristic flag (set when the last 3 history entries are the same agent) so subscribers can call out stuck review loops (#162).
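The `possible_loop` heuristic from the last entry is simple enough to sketch directly. The function below assumes the history is a list of agent names in execution order, which is an assumption about the data shape rather than something the entry states.

```python
def possible_loop(history, window=3):
    """True when the last `window` history entries are all the same agent."""
    tail = history[-window:]
    # Require a full window so short histories never trigger the flag.
    return len(tail) == window and len(set(tail)) == 1
```

For example, a history ending in three consecutive `reviewer` entries sets the flag, while an alternating `writer` / `reviewer` pattern does not.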
- Workflow registry references now resolve `latest` (and bare `name@registry` refs) to the default branch HEAD instead of the newest git tag. Previously, the moment a registry repo got its first tag, bare references silently froze at that tag and stopped picking up commits to `main`. Tags remain first-class; pin explicitly via `workflow#v1.2.3` for releases. Also saves one GitHub API call on the hot path of bare-name fetches (#157).
- Schema validation now rejects unknown fields on `AgentDef`, `ParallelGroup`, `ForEachDef`, and `WorkflowConfig` instead of silently dropping them. Misnesting `parallel:` or `for_each:` inside an `agents:` item, or typos like `prmpt:`, used to fall through to a runtime `Model "gpt-4o" is not available` error three layers downstream. They now fail at parse time with a clear Pydantic error pointing at the offending location. `conductor validate` also gained "Parallel Groups" and "For-each Groups" rows in its summary table so missing groups are immediately visible (#159).
- Tool arguments and results are now pretty-printed in dashboard / JSONL / verbose-console events. Copilot tool results no longer leak the full `Result(content=..., contents=None, detailed_content=..., kind=None)` repr with literal `\\n` escapes and doubled `\\\\` Windows paths, and tool arguments render as JSON (`{"k": "v"}`) instead of Python dict repr (`{'k': 'v'}`). Both providers share a new `src/conductor/providers/_event_format.py` helper for parity (#161).
- `install.ps1` on Windows now captures full `uv tool install` stdout AND stderr via `Start-Process -RedirectStandardOutput -RedirectStandardError` to temp files. Previously, with `$ErrorActionPreference = 'Stop'`, PowerShell treated uv's stderr as a terminating error and threw before the assignment completed, so install failures showed `(no output captured)` with no way to diagnose them (#156).
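The difference between the two renderings of tool arguments (#161) is easy to see in a few lines. This is only a sketch of the idea; `format_tool_arguments` is a hypothetical name and is not claimed to match the shared `_event_format.py` helper.

```python
import json


def format_tool_arguments(arguments):
    """Render tool arguments as indented JSON rather than a Python dict repr."""
    # default=str is an assumption: it keeps values that JSON cannot encode
    # printable instead of raising TypeError.
    return json.dumps(arguments, indent=2, ensure_ascii=False, default=str)
```

The JSON form uses double quotes and can be parsed back by any consumer of the event log, which the single-quoted `repr` output cannot.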
0.1.12 - 2026-05-05
- Unified `reasoning.effort` configuration for per-agent and workflow-wide control of model reasoning / extended-thinking effort. Set `runtime.default_reasoning_effort` (`low|medium|high|xhigh`) for a workflow-wide default, or override per agent with a `reasoning.effort` block. Translates to `reasoning_effort` on the Copilot session and to extended `thinking` budget on Claude (low=2048, medium=8192, high=16384, xhigh=32768 tokens, with `temperature` coerced to 1.0 and `max_tokens` bumped to fit). Validates against each model's supported efforts/capabilities and surfaces thinking content via `agent_reasoning` events. See `examples/reasoning-effort.yaml` (#152).
- Tag-based versioning for the workflow registry. Versions are now auto-discovered from git tags instead of being explicitly listed in `registry.yaml`, and refs accept any tag, branch, or SHA via the new `workflow#ref` syntax (e.g. `sdd/plan#v3.0.0`, `sdd/plan#main`, `sdd/plan#abc1234`). Stale CDN content is bypassed via cache-busting query parameters so registry updates are visible immediately (#151).
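The effort resolution and the Claude budget mapping from the `reasoning.effort` entry can be sketched as below. The token values are the ones listed in the entry; the function names and the table name are hypothetical, and the `temperature` / `max_tokens` adjustments are not modeled.

```python
# Budgets in tokens, as listed in the changelog entry above.
THINKING_BUDGET_TOKENS = {
    "low": 2048,
    "medium": 8192,
    "high": 16384,
    "xhigh": 32768,
}


def effective_effort(agent_effort, default_effort):
    """Per-agent value wins; otherwise fall back to the workflow default."""
    return agent_effort if agent_effort is not None else default_effort


def claude_thinking_budget(agent_effort, default_effort):
    """Return the thinking budget in tokens, or None when no effort is set."""
    effort = effective_effort(agent_effort, default_effort)
    if effort is None:
        return None
    return THINKING_BUDGET_TOKENS[effort]
```

An agent with no `reasoning.effort` block in a workflow whose default is `medium` would therefore get an 8192-token budget under this sketch.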
- `conductor update` reliability on Windows. Adds a pre-flight check for other running Conductor processes (which hold file locks on `%LOCALAPPDATA%\uv\tools\conductor-cli\` and cause `uv tool install --force` to fail with "Access is denied"), retries the install up to 3 times to absorb transient Windows Defender failures, surfaces full uv stdout AND stderr on failure with Defender-exclusion guidance, broadens the Windows entrypoint rename to cover the uv tool venv `Scripts/` directory in `%LOCALAPPDATA%` and `%APPDATA%`, and adds a new `conductor update --force` flag to skip the pre-flight check (#155).
- Dashboard layout for workflows with `human_gate` options or multiple loop-back routes (e.g. revision loops). The `workflow_started` event now emits routes from `human_gate` `options[].route` so gate edges aren't silently dropped, and the frontend pre-classifies back-edges via DFS from `$start` and feeds them to Dagre in reversed direction so cycles no longer scramble rank assignment. Workflows like `sdd/plan-v3.yaml` now render as a coherent top-to-bottom DAG instead of disconnected columns with long diagonal edges (#153).
- Windows install failures now surface useful diagnostics. `install.ps1` prints captured `uv` stdout/stderr on failure instead of swallowing it, and uses the correct Microsoft Defender cmdlet so the install path is exclusion-friendly (#149).
0.1.11 - 2026-05-04
- `metadata` dict on workflow definitions, settable statically in YAML or dynamically via `--metadata` / `-m` CLI flags. Merged metadata is included in the `workflow_started` event for downstream consumers (#107).
- `input_mapping` field on `type: workflow` agents, enabling Jinja2-templated per-call inputs to sub-workflows evaluated against the parent context. When omitted, the parent's `workflow.input.*` is forwarded as before (#109).
- `type: workflow` agents are now allowed inside `for_each` groups, enabling dynamic fan-out to sub-workflows with per-iteration `input_mapping`. Each iteration emits its own `subworkflow_started` / `subworkflow_completed` events (#110).
- Self-referential sub-workflows are now allowed; depth is bounded by the global `MAX_SUBWORKFLOW_DEPTH` plus an optional per-agent `max_depth` field on `AgentDef` (#111).
- `workflow.dir`, `workflow.file`, and `workflow.name` template variables are now available in all agent contexts (regardless of context mode). Lets registry-hosted workflows reference co-located scripts and assets without depending on the caller's working directory (#121).
- Script agent stdout that is valid JSON is auto-parsed and merged into the agent's output dict alongside `stdout`, `stderr`, and `exit_code`, enabling field-based `when:` route conditions instead of opaque exit-code matching (#122).
- `conductor validate` now performs semantic validation in addition to YAML schema checks, catching stale agent references, missing workflow inputs, and undeclared explicit-mode dependencies before runtime in `prompt`, `system_prompt`, `command`, `args`, `working_dir`, `input_mapping`, parallel-group inputs, and workflow `output:` templates (#125).
- Web dashboard: breadcrumb navigation, double-click dive-in to sub-workflow graphs, isolated subworkflow contexts (no node-status bleed across repeated runs), and reliable Stop button during subworkflows (#113, follow-up fixes in #146).
- Dialog mode for agents: multi-turn conversational interactions driven by a `dialog` gate with conditional transitions, full Copilot and Claude provider support, and dedicated dashboard UI (`DialogDetail`, `DialogEngagementPrompt`, `DialogOverlay`) (#130).
- Markdown rendering and auto-linkification in human gate prompts. Gate prompts render through Rich Markdown in the terminal and as GitHub-Flavored Markdown in the dashboard. Bare file paths and URLs in gate prompts are converted to clickable links; relative paths open a sandboxed `FileViewer` modal served via a path-traversal-safe `GET /api/files/{path}` endpoint (#131).
- Workspace instructions support: `--workspace-instructions` and `--instructions` CLI flags plus a YAML-level `instructions:` field on the workflow. Auto-discovers `AGENTS.md`, `CLAUDE.md`, and `.github/copilot-instructions.md` by walking from CWD to the git root, prepends them to every agent's prompt, inherits into sub-workflows, and persists in checkpoints (#141).
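The script-agent JSON auto-merge listed in this release (#122) can be sketched as follows. `build_script_output` is a hypothetical name, and the precedence rule (standard keys are never overwritten by parsed fields) is an assumption made for the example, since the entry does not say how collisions are handled.

```python
import json


def build_script_output(stdout, stderr, exit_code):
    """Build a script agent's output dict, merging JSON stdout when present."""
    output = {"stdout": stdout, "stderr": stderr, "exit_code": exit_code}
    try:
        parsed = json.loads(stdout)
    except (json.JSONDecodeError, TypeError):
        # Non-JSON stdout: keep only the standard keys.
        return output
    if isinstance(parsed, dict):
        for key, value in parsed.items():
            # Assumption: parsed fields never shadow stdout/stderr/exit_code.
            output.setdefault(key, value)
    return output
```

With stdout of `{"status": "pass"}`, a route condition can test `status` directly instead of inferring the result from the exit code.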
- The dashboard's "context window remaining" bar now sources `context_window_max` from each provider's SDK at runtime instead of a hand-maintained static table. Values now reflect the actual cap the SDK enforces (e.g. `claude-opus-4.6` reports 200K rather than the theoretical 1M; `gpt-5.x` reports 128K rather than 400K). The `context_window` field on `ModelPricing` has been removed; pricing data continues to be hand-maintained for cost calculation only (#144).
- Pass `streaming=True` to the Copilot SDK's `create_session` to prevent silent truncation of large tool-call arguments. In non-streaming mode the model's per-turn output budget is exhausted mid-JSON for large arguments (e.g., `create` with multi-KB `file_text`), the CLI executes the partial tool call, and the agent loops on the broken call until the wall-clock session limit fires (#129).
- Build the Copilot prompt schema recursively from nested `output:` definitions instead of flattening to top-level fields only. Nested object properties, required keys, and array item schemas are now included in the prompt-facing schema used for initial guidance and parse recovery (#100).
- Coerce Python literal `"True"` / `"False"` / `"None"` strings produced by Jinja's default `str(bool)` rendering into native Python types when building workflow output. Previously, `output: { matched: "{{ a == b }}" }` produced the string `"False"` (truthy), causing downstream `when:` comparisons against `false` to silently misbehave (#139).
- Pricing fuzzy match no longer silently inherits values across model families. Names sharing a textual prefix with a known key (e.g. `claude-opus-4.7` previously matched `claude-opus-4`) now require a `-` delimiter; non-matching names return `None` and the dashboard hides the cost field. A one-time warning is emitted per requested name on any non-exact match (#143).
- Run `uv tool update-shell` after `uv tool install` in both `install.ps1` and `install.sh` so `conductor` is available on PATH in new shells, CI agents, and IDE extensions after a fresh install (#142).
- In explicit context mode, `workflow.input` is now always available to `script` and `type: workflow` agent templates regardless of the agent's declared `input:` list. The explicit-mode contract still applies to LLM agents (no undeclared inputs in prompts to control token cost) (#119).
- Optional workflow inputs without an explicit `default:` now resolve to type-appropriate zero values (`""`, `0`, `false`, `[]`, `{}`) instead of Python `None`, so templates like `{{ workflow.input.optional | default("fallback") }}` render the fallback rather than the literal string `"None"` (#123).
- Web dashboard: events without an engine-supplied `subworkflow_path` stamp (e.g., `for_each_item_started` for a parent for_each over `type: workflow` agents) now route strictly to the root context instead of falling back to the user's currently-viewed path. This fixes two related symptoms: dashboards opened during a run with sub-workflows no longer auto-land inside an iteration, and a parent for_each panel now displays every iteration rather than silently dropping the middle ones into a sibling sub-workflow's context (#148).
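The literal-coercion fix (#139) addresses a plain Python pitfall: the string `"False"` is truthy. A minimal sketch of the coercion, with `coerce_literal` as a hypothetical name and exact-match-only behavior as an assumption, looks like this:

```python
# Only the exact strings that str(True), str(False), and str(None) produce.
_PYTHON_LITERALS = {"True": True, "False": False, "None": None}


def coerce_literal(value):
    """Map rendered 'True'/'False'/'None' strings back to native values."""
    if isinstance(value, str) and value in _PYTHON_LITERALS:
        return _PYTHON_LITERALS[value]
    # Everything else, including 'true' or ' False ', passes through unchanged.
    return value
```

After coercion, a rendered `"False"` compares equal to `False`, so a downstream condition that checks for a false value behaves as the workflow author expects.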
0.1.10 - 2026-04-30
- Sub-workflow composition support: `workflow`-type agents can now be used inside `for_each` groups, with dynamic per-iteration `input_mapping` (#101, #102).
- Bumped `github-copilot-sdk` to `>=0.3.0`. The SDK ships a bundled `copilot` CLI binary used for JSON-RPC `session.create` calls; `0.2.2` bundled CLI `1.0.21`, which rejected newer model IDs locally with `JSON-RPC -32603: Model "<id>" is not available`. `0.3.0` bundles CLI `1.0.36-0`, which accepts the current Copilot model catalog (including `claude-opus-4.7*` variants).
- Suppressed noisy PowerShell stderr output from `uv tool install` during Windows self-update (#99).