Changelog

Notable changes across all MUXI components


For detailed release notes, see the individual component repositories.

Observability events are now redacted far more thoroughly, and entity-based PII redaction (names, addresses, organizations, dates of birth, financial identifiers) ships as a built-in capability that is on by default.

Redact by default: every emitted event is scrubbed before it reaches any log sink. Previously only user-facing events were redacted, so non-user events (system, MCP, workflow) could carry secrets. Callers may opt out per event via skip_redaction for audited, non-sensitive payloads.
Two layers: an always-on regex layer (API keys/tokens, passwords, AWS credentials, database URLs, JWTs, emails, phone numbers, SSNs, and credit cards) plus an entity layer (names, addresses, orgs, DOB, financial identifiers) masked with consistent indexed tokens like [PERSON_1], [ORG_1].
Luhn-validated cards: 16-digit runs are masked only when they pass the Luhn checksum, so order IDs, timestamps, and other long digit runs are preserved; placeholders are length-accurate.
New toggle: logging.redaction.entities (default true) controls the entity layer; regex redaction is always on. If the spaCy model can't load, the layer degrades gracefully to regex-only with a one-time warning.
Core, not optional: entity redaction uses Microsoft Presidio + spaCy en_core_web_sm, baked into the default images. The memory extractor uses the same detector (and confidence threshold) so PII is also kept out of long-term memory.
See Observability deep dive.
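The Luhn gate described above can be sketched as follows. This is a minimal illustration, not the runtime's implementation: the names CARD_RUN, luhn_valid, and mask_cards are hypothetical, and the "*" mask character is an assumption (the changelog only states that placeholders are length-accurate).

```python
import re

# Hypothetical pattern: exactly 16 consecutive digits, not part of a longer run.
CARD_RUN = re.compile(r"(?<!\d)\d{16}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Return True when the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def mask_cards(text: str) -> str:
    """Mask 16-digit runs only when they pass Luhn; the mask keeps the run's length."""
    def replace(match: re.Match) -> str:
        run = match.group(0)
        # Assumption: "*" stands in for whatever placeholder the runtime emits.
        return "*" * len(run) if luhn_valid(run) else run

    return CARD_RUN.sub(replace, text)
```

With this sketch, a well-known test card number such as 4111111111111111 is masked, while a 16-digit order ID that fails the checksum passes through unchanged.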

Runtime v0.20260616.0

SOP Skill Directives -- deterministic skill activation from SOP steps

SOP steps can now declare skills that should be activated deterministically before the assigned agent processes the task. This removes the need for the LLM to choose the activate_skill tool when a skill is required by a workflow step.

Bracket syntax: [skill:skill-name] for activation-only, [skill:skill-name/script-name] to also run a script from the skill's scripts/ directory. Placed on the same line as the step heading after [agent:...].
Deterministic activation: the workflow executor calls skill_manager.activate_async directly before agent.process_message. Skill content is injected into the task prompt as a skill prelude.
Deterministic script execution: when the run form is used and an RCE client is available, the executor calls run_skill_command directly before the agent runs, and the script output is appended to the task prompt under "Skill execution results".
Request-scoped transient grants: SOP-declared skills work even when not pre-declared for the assigned agent in its YAML formation. The executor registers transient grants via skill_manager.grant_request_skills before workflow execution and revokes them in finally.
See SOPs guide and Skills guide for updated documentation.
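As an illustration of the bracket syntax above, the following sketch pulls the agent and skill directives out of a step heading. It is an assumption-laden example rather than the runtime's parser: DIRECTIVE, SkillDirective, and parse_step_heading are hypothetical names, and the heading text in the usage note is invented.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Matches [agent:...] and [skill:...] directives on a step heading line.
DIRECTIVE = re.compile(r"\[(agent|skill):([^\]\s]+)\]")


@dataclass
class SkillDirective:
    name: str
    script: Optional[str] = None  # set only for the [skill:name/script] run form


def parse_step_heading(heading: str) -> Tuple[Optional[str], List[SkillDirective]]:
    """Split a step heading into its agent and its skill directives."""
    agent = None
    skills: List[SkillDirective] = []
    for kind, value in DIRECTIVE.findall(heading):
        if kind == "agent":
            agent = value
        else:
            # "skill-name/script-name" requests a script run; a bare name only activates.
            name, _, script = value.partition("/")
            skills.append(SkillDirective(name, script or None))
    return agent, skills
```

For a heading such as "Build report [agent:analyst] [skill:charts] [skill:export/to_pdf.py]", the sketch returns the agent "analyst", an activation-only directive for "charts", and a run directive for the "to_pdf.py" script of "export".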

Dependency minimums updated to latest compatible releases

65 direct dependency minimums in pyproject.toml raised to the newest resolvable versions after uv lock --upgrade. Notable bumps include mcp>=1.27.2, fastmcp>=3.4.2, numpy>=2.2.6, pandas>=2.3.3.

May 2026

Server v0.20260514.1

Bundle extraction now lives under /tmp instead of system $TMPDIR, avoiding EXDEV cross-device link failures during os.Rename on modern Linux distros where /tmp is a separate tmpfs mount.
safeRename helper wraps os.Rename with a copyTreePreservingMode + RemoveAll fallback for EXDEV errors, preserving per-file modes (including 0600 for secrets.enc).
os.MkdirAll added to the directory setup in update.go to close the registry-without-dir race window.

Server v0.20260514.0

docker run --platform pin in runtime-runner spawn: passes --platform linux/ derived from the SIF filename, preventing Apple Silicon Docker from resolving :latest to the host-native arm64 manifest and breaking amd64 SIF launches.
sifPlatform(path) helper parses arch suffix from muxi-runtime-[-]-linux-.sif filenames.

SDKs v0.20260514.0

Go SDK module root moved from go/src to go/ so go get github.com/muxi-ai/muxi-go@latest works with standard Go module layout.
SSE keepalive/heartbeat-aware parsing across all 12 SDKs: : keepalive comments no longer cause false idle failures; multi-line data: payloads and event: done frames parse correctly; route-level event: error frames surface as SDK errors.
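A keepalive-aware SSE reader along the lines described above might look like the sketch below. It is not taken from any of the SDKs: parse_sse, read_chat, and StreamError are hypothetical names, and the choice to dispatch named events that carry no data is an assumption made so that a bare "event: done" frame is still seen.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one {"event", "data"} dict per SSE frame, skipping comment lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the frame; emit it if it carried data or a named event.
            if data or event != "message":
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment such as ": keepalive"; proves liveness, carries no event
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Multi-line payloads arrive as repeated data: lines joined with "\n".
            data.append(value[1:] if value.startswith(" ") else value)


class StreamError(Exception):
    """Hypothetical SDK error raised for route-level error frames."""


def read_chat(lines: Iterable[str]) -> List[str]:
    """Collect message payloads until a done frame; raise on an error frame."""
    chunks = []
    for frame in parse_sse(lines):
        if frame["event"] == "error":
            raise StreamError(frame["data"])
        if frame["event"] == "done":
            break
        chunks.append(frame["data"])
    return chunks
```

An idle-timeout check would sit outside this parser and reset whenever any line arrives, comment lines included.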

Runtime v0.20260508.0

Scheduler firing recursion fix

Recurring scheduled jobs were spawning a fresh one-time job on every firing instead of delivering the reminder. The overlord's request_analyzer classified the delivery message as a new scheduling request, the scheduler-intent handler created a new job, and the agent never ran. Three fixes:

overlord.chat() now takes is_scheduled_execution: bool = False. When True, scheduler-intent and scheduler-query handlers are bypassed so the message reaches the agent as a normal delivery.
SchedulerService._execute_job passes is_scheduled_execution=True so any chat invocation tied to a job session is treated as delivery, not new scheduling.
PromptRewriter no longer treats rewritten == original_prompt as failure. Empty LLM responses still fall back to prefix wrapping; surrounding quotes are stripped.

Runtime v0.20260503.0

Init-time model probe rejects bad slugs

Model slugs validated on startup: Embedding.acreate / ChatCompletion round-trips check every declared model; 404 or wrong shape → ConfigurationValidationError with a corrected-slug suggestion. Auth, rate-limit, and network errors warn-and-continue.
local// shape enforced: misformatted slugs like local/all-MiniLM-L6-v2 fail with an inline "did you mean local/sentence-transformers/all-MiniLM-L6-v2?" hint.

MCP error translation

Misleading upstream errors rewritten before reaching the planner: e.g. Microsoft Graph's WAC token error becomes "the item ID is not an Excel file — re-check list-folder-files and filter by .xlsx", letting the agent self-correct without escalating to the user.
Applies automatically to any MCP server emitting known patterns; no formation-side change required.

Lean Docker variants on python:3.13-slim; markitdown extras narrowed

Lean Dockerfiles updated to python:3.13-slim (Dockerfile, Dockerfile.production, e2e/docker/Dockerfile); requires-python floor stays >=3.10.
markitdown[all] → markitdown[docx,pdf,pptx,xls,xlsx]: drops youtube-transcription and audio/Azure-DI deps unused in the codebase; downstream consumers needing them install markitdown[all] explicitly.
Image size reductions vs 3.10 baseline: lean Docker −10%, arm64 SIF −14%, amd64 SIF −5.6%.
PyPI classifier added: Programming Language :: Python :: 3.13.

Scheduler: doubled session IDs, missing job stats, stripped delivery framing

Doubled session_id collapsed: scheduler no longer double-prefixes job_; mark_job_execution_success now reliably finds its _active_executions entry.
Job stats persisted on every completion: last_run_at, last_run_status, and total_runs update on success or failure.
Delivery framing preserved: phrases like remind me to, notify me, tell me when are no longer stripped by the prompt rewriter.

Tool whitelist / blacklist filter on MCP servers (now documented)

tools.whitelist / tools.blacklist on any MCP .afs file — mutually exclusive, fnmatch globs, applied at registration so filtered tools are invisible to the LLM. See Tools & MCP and the Add Tools guide.
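The filter semantics above (mutually exclusive lists, fnmatch globs, applied at registration) can be sketched with the standard library. The function name filter_tools and the tool names in the usage note are hypothetical; whether the runtime matches case-sensitively is not stated, so the case-sensitive fnmatchcase is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Sequence


def filter_tools(
    names: Iterable[str],
    whitelist: Optional[Sequence[str]] = None,
    blacklist: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the tool names that survive a whitelist or a blacklist of globs."""
    if whitelist and blacklist:
        raise ValueError("whitelist and blacklist are mutually exclusive")
    if whitelist:
        # Keep only tools matching at least one whitelist pattern.
        return [n for n in names if any(fnmatchcase(n, p) for p in whitelist)]
    if blacklist:
        # Drop tools matching any blacklist pattern.
        return [n for n in names if not any(fnmatchcase(n, p) for p in blacklist)]
    return list(names)
```

For example, a whitelist of "list-*" keeps list-mail-messages and drops send-mail, so the dropped tool is never registered and never appears in the catalog shown to the LLM.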

OneLLM v0.20260502.0

CoreML compiled model cached across loads

ModelCacheDirectory injected for every CoreML EP session: compiled .mlmodelc artifacts are written to $HF_HOME/onellm-coreml/// once and mmap'd on subsequent loads, eliminating the 5–15 s recompile cost per session.
SpecializationStrategy=FastPrediction enabled (onnxruntime ≥ 1.20): reduces per-input-shape recompilation; older runtimes log an unknown-option notice and proceed.
Non-CoreML EPs unaffected: CUDA, ROCm, OpenVINO, and CPU providers pass through untouched.
Operator knobs: ONELLM_COREML_DISABLED=true drops the CoreML EP entirely; ONELLM_COREML_CACHE_DIR overrides the cache root.
Measured impact (MUXI Runtime macOS arm64 6_knowledge e2e): peak RSS 8.7 GB → 3.8 GB (cold) / 4.7 GB (warm), wall time 280 s + SIGKILL → 90 s / 72 s.

CLI v0.20260501.1

muxi pull now appends ?pull=true to registry pull-info requests, so the registry records each download and refreshes total_downloads and the weekly activity charts.

CLI v0.20260501.0

muxi push bundles now include SOUL.md and the legacy mcp/ component directory, so registry round-trips no longer drop them. Both mcp/ and mcps/ directory names are matched when resolving MCP server declarations from formation.afs.
muxi validate recognizes the new MCP spec keys parameters (default tool-call parameters) and tools.whitelist / tools.blacklist (catalog filtering) as valid, and reports an error when both whitelist and blacklist are declared on the same MCP server.

April 2026

Server v0.20260428.2

muxi-server init pre-downloads a multilingual embedding model (Xenova/multilingual-e5-small, ~125 MB) alongside the existing Nomic model, so formations needing non-English retrieval skip the first-deploy stall.

Server v0.20260428.1

Self-update URL aligned with v-prefixed S3 layout: server binary downloads now use https://pkg.muxi.org/server/v{VERSION}/{binary}, matching git tags and GitHub release names. Previously the binary constructed a non-v-prefixed URL that 404'd.

CLI v0.20260428.0

muxi chat renders ordered list markers correctly in assistant messages — the enumeration element is now part of the custom glamour style, so numbered lists display as 1. Item instead of 1Item.

Server v0.20260424.0

Runtime-runner image single source of truth: new config.DefaultRuntimeRunnerImage constant. All 7 API handlers and spawn paths now resolve from the config field or constant instead of the previous hardcoded name in spawn_common.go.
SIF downloads from pkg.muxi.org: default mirror changed from GitHub releases to https://pkg.muxi.org/runtime. fetchLatestVersion reads a plain-text latest.txt instead of parsing GitHub redirect headers.
Server binaries uploaded to S3 on release with public-read ACL, mirroring the runtime's S3 distribution layout.
Test stability: TestHandleRollback and TestHandleBundleDeploy_ValidBundle no longer hang on real SIF downloads; non-routable test URLs and HTTP client timeouts added.

CLI v0.20260424.0

CLI release binaries are now uploaded to S3 (s3://BUCKET/cli/VERSION/*) with public-read ACL during the release workflow, mirroring the runtime's S3 distribution pattern.
CLI download URLs moved from releases.muxi.org to pkg.muxi.org/cli/vVERSION/BINARY across the upgrade command and the release workflow.

Server v0.20260423.0

muxi-server init pre-downloads the default embedding model (nomic-ai/nomic-embed-text-v1.5, ~524 MiB) so the first formation deploy doesn't stall on a multi-hundred-MB fetch.
Atomic cache writes: each file writes to .tmp and is atomically renamed, so killed processes can't leave partial cache entries.
The MUXI_CACHE_DIR environment variable overrides the default /cache.
Runtime variants (CPU / GPU / CUDA): formations can opt into GPU or CUDA runtime SIFs via muxi_runtime.variant. Variant validation rejects unknown names.
Init UX polish: single-line Docker pull progress (⠙ Layers 5/8 (62%)), spinner for embedding downloads, --quiet dropped so init no longer looks frozen.

Runtime v0.20260422.0

Scheduler recurring jobs resume after the first run

Recurring scheduled jobs could silently stop firing after their first successful execution because the runtime compared a naive database timestamp against a timezone-aware "now", raising a TypeError that a broad except silently translated to "not due". Timestamp handling is now UTC-normalized at both edges of the scheduler.

Recurring cron jobs keep firing: a naive last_run_at from the database is treated as UTC before being compared against croniter's aware "scheduled next", so cron jobs re-fire reliably on every tick instead of appearing not-due forever after their first run.
One-time jobs get the same guarantee: scheduled_for comparisons on one-shot jobs no longer hit the same naive-vs-aware TypeError mid-check.
Scheduler API timestamps are unambiguous UTC: ScheduledJob and ScheduledJobAudit JSON payloads now emit ISO 8601 with a trailing Z on every datetime field, removing the "local or UTC?" ambiguity for clients.
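The naive-versus-aware failure and its fix can be reproduced with the standard library alone. In this sketch, as_utc, to_wire, and is_due are hypothetical helpers, and next_fire stands in for the aware value that croniter would supply.

```python
from datetime import datetime, timezone


def as_utc(value: datetime) -> datetime:
    """Treat naive datetimes as UTC; convert aware ones to UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def to_wire(value: datetime) -> str:
    """Serialize as ISO 8601 with a trailing Z."""
    return as_utc(value).isoformat().replace("+00:00", "Z")


def is_due(last_run_at: datetime, next_fire: datetime, now: datetime) -> bool:
    """Due when the next fire time falls after the last run and is not in the future."""
    # Normalizing every operand avoids the TypeError that Python raises when a
    # naive datetime is ordered against an aware one.
    return as_utc(last_run_at) < as_utc(next_fire) <= as_utc(now)
```

Comparing the raw values directly, for example a naive last_run_at against an aware next_fire with the < operator, raises TypeError, which is the error the broad except was swallowing.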

Tool-schema-documented sentinel values are honored

Cell-specific Excel reads (A1/B2 in the authenticated user's default OneDrive) silently collapsed because the parameter-inference prompt conflated "don't invent placeholder values" with "don't emit short sentinel strings like me" -- even when the tool's own parameter description explicitly documented the sentinel as the correct value. Inference returned nothing, required-parameter repair fired, auto-discovery inserted an unrelated step, and the replan guard blocked a second attempt. The chain collapsed silently on every tool whose schema documented a sentinel.

Parameter descriptions are now a first-class contract: when a tool schema's parameter description explicitly documents a sentinel (e.g. "use 'me' for the current user's drive", "pass 'root' for the default site", "use 'primary' for the default calendar"), the runtime emits that sentinel when the user's request did not identify a specific resource and no prior step supplies the real ID. The rule is schema-driven -- any MCP (Microsoft Graph, Google Workspace, Slack, custom tool servers) whose schema documents a sentinel benefits with no runtime-code change.
Anti-guessing guardrails preserved: short strings that are not documented sentinels still get the same treatment as before -- the LLM omits them rather than inventing a value.
Specific-resource requests keep working: naming a specific drive, mailbox, site, or calendar in the user's message still triggers a discovery-first chain, because the sentinel rule is guarded by "user did not name a specific resource" and "no prior step produced the real ID".

CLI v0.20260422.0

muxi scheduler list and muxi scheduler show no longer fail when the runtime returns scheduler timestamps without timezone suffixes (e.g. 2026-04-20T11:30:35.935869).
Timezone-less scheduled_for and last_run_at values are now treated as UTC, so one-time jobs still normalize correctly in CLI output.

Runtime v0.20260421.0

Native A2A 1.0 migration

Native a2a-sdk 1.0 support: the runtime now uses the SDK's 1.0 client, server, registry, transport, and overlord APIs directly instead of the old compatibility layer.
Centralized protobuf translation: SDK/MUXI message conversion now lives in one helper layer, reducing drift across A2A call sites and preserving existing runtime contracts.
Updated A2A dependency floor: runtime now requires a2a-sdk>=1.0,<2.0 and protobuf>=5.29.5,<6.
Broader A2A verification: release coverage includes focused unit, integration, and orchestration E2E tests for internal messaging, external messaging, and discovery flows.

OneLLM v0.20260421.0

Local embedding provider

New local/ embedding provider: OneLLM can now run in-process local embedding models through the standard Embedding.create() / Embedding.acreate() surface.
ONNX-first backend selection: local models prefer ONNX Runtime when ONNX weights are available, with sentence-transformers fallback for PyTorch-only repos.
Extras split for local backends: [cache] now stays lean by default, while [local-gpu] and [local-pytorch] opt into CUDA and PyTorch-backed local embedding support.
Cleaner local model workflows: onellm download local/... now snapshots Hugging Face repos directly, and the semantic cache loads embeddings through the same LocalProvider path.
More resilient provider error handling: non-JSON HTTP error bodies across providers are normalized consistently, and OpenAI audio text formats now return raw text correctly.

Runtime v0.20260420.1

Free-text placeholder resolution

Google Calendar and Gmail text payloads now resolve cleanly: placeholder predicates can match bulleted free-text results instead of assuming structured JSON records only.
Embedded placeholders inside larger strings now substitute correctly: composed values such as draft bodies with appended text no longer send literal {{...}} tokens downstream.
Section-style field extraction works: payloads using --- FIELD --- blocks now expose values like message bodies for downstream tool calls.
Unresolved placeholders are surfaced earlier: leftover placeholder tokens are logged loudly and blocked or repaired before they can silently no-op at MCP execution time.

Runtime v0.20260420.0

Placeholder pipeline reliability

Follow-up to v0.20260418.0. The runtime now surfaces a loud warning whenever an unresolved {{...}} placeholder would have reached an MCP call, and two silent-failure paths that surfaced in field reports are fixed.

Positional indexes work in placeholders: {{WORKSHEET_LIST[0].id}} ("the first worksheet's id") and [-1] ("the last one") now resolve correctly. Previously, bare integer indexes were rejected and quietly fell back to first-match, which could bind the wrong entity and produce misleading 404s.
Nested-dict placeholders: Placeholders inside nested parameters -- e.g. OneDrive's parentReference: { id: "{{FOLDER.id}}" } when moving a file -- are now substituted recursively. Previously they were shipped as literal strings, and Microsoft Graph happily returned 200 OK while silently ignoring the bad field.
Loud warnings for unresolved placeholders: Any literal {{...}} string left at any depth after substitution now emits a warning event naming every leftover leaf. Non-required parameters containing unresolved leaves are dropped before the call; required ones are preserved so the repair-plan flow can replan.
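The three behaviors above (positional indexes, recursive substitution, and reporting leftover leaves) can be illustrated with a small resolver. This is a sketch under stated assumptions, not the runtime's code: TOKEN, substitute, and unresolved_leaves are hypothetical names, and the token grammar is reduced to NAME, NAME.field, and NAME[i].field.

```python
import re
from typing import Any, Dict, List

# Simplified grammar: {{NAME}}, {{NAME.field}}, {{NAME[0].field}}, {{NAME[-1].field}}
TOKEN = re.compile(r"\{\{([A-Z0-9_]+)(?:\[(-?\d+)\])?(?:\.(\w+))?\}\}")


def _lookup(match: re.Match, results: Dict[str, Any]) -> str:
    name, index, field = match.groups()
    if name not in results:
        return match.group(0)  # unknown name: leave the token in place
    value = results[name]
    try:
        if index is not None:
            value = value[int(index)]  # positional index, negatives count from the end
        if field is not None:
            value = value[field]
    except (IndexError, KeyError, TypeError):
        return match.group(0)
    return str(value)


def substitute(params: Any, results: Dict[str, Any]) -> Any:
    """Replace placeholders at any depth of a nested parameter structure."""
    if isinstance(params, dict):
        return {key: substitute(val, results) for key, val in params.items()}
    if isinstance(params, list):
        return [substitute(item, results) for item in params]
    if isinstance(params, str):
        return TOKEN.sub(lambda m: _lookup(m, results), params)
    return params


def unresolved_leaves(params: Any, path: str = "") -> List[str]:
    """Return the path of every string leaf that still contains a '{{' token."""
    if isinstance(params, dict):
        return [p for k, v in params.items()
                for p in unresolved_leaves(v, f"{path}.{k}" if path else k)]
    if isinstance(params, list):
        return [p for i, v in enumerate(params)
                for p in unresolved_leaves(v, f"{path}[{i}]")]
    return [path] if isinstance(params, str) and "{{" in params else []
```

A caller would run substitute first, then log every path returned by unresolved_leaves and drop the non-required parameters among them before issuing the MCP call.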

Runtime v0.20260418.0

Pick the right record from a list

Agents can now disambiguate a record from a prior step's list output using a predicate -- {{FILE_LIST[name='Book.xlsx'].id}} instead of {{FILE_LIST.id}}. This closes a long-running bug where the runtime silently resolved to the first record in the list (often wrong) when multiple candidates were available.

New syntax: {{NAME[key=value]}} and {{NAME[key=value].field}}. Values can be quoted strings, booleans, numbers, or null. Field name matching is normalized so [name=X] matches Name, display_name, and DisplayName, and string comparisons are case-insensitive.
Auto-inference: When the planner emits the old {{FOO.field}} form and the step description explicitly names a resource ("update Book.xlsx", backticked Spark Folder), the runtime synthesizes a name predicate automatically -- but only when the list actually contains a matching record. Precedence is always explicit predicate > auto-inference > legacy first-match.
Planning prompt updated with correct/wrong examples for the new syntax.
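To make the predicate matching concrete, here is a rough sketch of how a [name=X] predicate could select a record. The helper names are hypothetical, and the normalized suffix match is an assumption inferred from the statement that [name=X] matches Name, display_name, and DisplayName; the runtime's actual rule may differ.

```python
from typing import Any, Dict, List, Optional


def _norm(key: str) -> str:
    """Lowercase and drop underscores so display_name and DisplayName compare equal."""
    return key.replace("_", "").lower()


def _field_matches(field: str, wanted: str) -> bool:
    # Assumption: a suffix match lets "name" also hit "display_name".
    return _norm(field).endswith(_norm(wanted))


def pick_record(records: List[Dict[str, Any]], key: str, value: Any) -> Optional[Dict[str, Any]]:
    """Return the first record with a field matching key whose value equals value."""
    for record in records:
        for field, actual in record.items():
            if not _field_matches(field, key):
                continue
            if isinstance(actual, str) and isinstance(value, str):
                if actual.lower() == value.lower():  # string comparison ignores case
                    return record
            elif actual == value:
                return record
    return None
```

Returning None when nothing matches, rather than falling back to the first record, is what keeps the predicate form from reintroducing the silent first-match behavior.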

Runtime v0.20260417.2

Cleaner plans by default

The agent planning prompt now pins the placeholder contract directly, so plans are valid on the first pass more often and the runtime's safety nets rarely need to fire.

One placeholder syntax: only {{UPPERCASE_NAME}} / {{UPPERCASE_NAME.field}}. Variants like <>, ${{NAME}}, and {NAME} are rejected.
No invented names: a later step may only reference upstream output by the exact placeholder assigned upstream.
No sentinel values: fills like "auto-injected" or "from_server" are banned; plans omit the key instead and let server defaults take over.
No fabricated arrays: the prompt explicitly prohibits inventing extra IDs, emails, or hashes to pad a list.

Runtime v0.20260417.1

Placeholder substitution catches more cases

A set of guards that prevent literal {{...}} placeholders or LLM-fabricated values from reaching MCP calls.

Extract values from free-text MCP output: array placeholders like {{APRIL_10_MESSAGES.message_ids}} now resolve against Field: value patterns and inline JSON embedded in text chunks, eliminating the "hallucinated incrementing hex" pattern previously seen with Gmail.
Cross-placeholder fallback: if the planner emits a placeholder name it never assigned, the runtime searches all prior successful results for a single unambiguous candidate and commits only when exactly one matches.
Strip literal placeholders before the call: any non-required parameter still shaped like {{...}} / <<...>> after all substitution attempts is dropped. Required placeholders are preserved so the repair-plan flow can handle them.
Reject fabricated arrays: every item in an LLM-inferred list must appear in a prior successful result. Fabricated items are dropped; a fully fabricated array is removed entirely and triggers replanning.

Runtime v0.20260417.0

Smarter repair-tool selection

When a tool call fails and the runtime looks for a replacement, the auto-discovery scorer now understands resource domains (mail, calendar, drive, sharepoint, chat, contact, task, note). A failed get-drive-root-item can no longer be "repaired" with list-mail-folders just because both tools live on the same MCP server. Same-domain candidates get a bonus; cross-domain candidates get a penalty.

CLI v0.20260417.0

muxi scheduler list / show no longer crash on timestamps with fractional seconds and a timezone offset (e.g. 2026-04-16T09:00:00.123456+00:00), and fractional precision is preserved when normalizing one-time scheduled_for values.

Runtime v0.20260416.3

Placeholder polish and scheduler completion

Sentinel values treated as unresolved: strings like "driveId": "auto-injected" no longer block MCP server-default injection from overwriting them with a real value.
Dotted placeholder references: {{SPARK_EVENT.event_id}} now correctly extracts the event_id field from the {{SPARK_EVENT}} result (case- and underscore-insensitive) instead of passing the literal string through.
Scalar-only whole-payload fallback: LLM-hallucinated parameters not in a tool's schema no longer receive a full result dict coerced into them, preventing pydantic errors downstream.
Scheduler marks success without a webhook: formations without async.webhook_url now run jobs synchronously and record success/last-run directly, so total_runs and last_run_status stay accurate.

Runtime v0.20260416.1 / v0.20260416.0

Planning mode preserves tool parameters, knows today's date

Tool parameters no longer dropped (critical): an internal rebuild step was silently replacing every parameter set with {} after the LLM assembled the plan. Parameters are now preserved verbatim, and repeated uses of the same tool each keep their own arguments.
"Today" and "tomorrow" resolve to concrete dates: the planning prompt now includes the current date and time, so relative date references produce concrete RFC3339 values.
Parameterless MCP steps no longer crash: tools that require no user-supplied arguments (e.g. list-mail-messages, get_events, search_gmail_messages) no longer hit an uninitialized-variable error.
Scheduler jobs run on the right event loop: fixes "Future attached to a different loop" crashes on scheduled jobs whose chat path uses asyncpg, httpx, or MCP transports.
Repair scoring prefers same-server tools: a tool from a different MCP server can no longer outscore a same-server candidate during repair planning.

CLI v0.20260416.0

muxi scheduler list / show render active jobs correctly by reading the runtime's real job fields (status, is_recurring, cron_expression, scheduled_for, original_prompt, last_run_at, total_failures) instead of the stubs they were previously hitting.
Recurring cron next-run values are now computed client-side; one-time jobs reuse scheduled_for when no precomputed next_run is returned.

Runtime v0.20260415.0

Delegation and context hints

Server-default params are satisfiable: MCP-injected values like driveId are no longer treated as missing, preventing redundant LLM inference that could replace them with "me".
Generalized named-resource hints: context extraction recognizes #channel, @name, quoted names, and filenames without service-specific heuristics.
Delegated agents get context: downstream agents receive compact summaries of successful prior results instead of reasoning from scratch.
Less unnecessary delegation: agents keep arithmetic, summarization, and analysis in-house when their own tools already fetch the needed data.
Dependency floor bumps: onellm[cache] >= 0.20260415.0, fastmcp >= 3.2.0, pypdf >= 6.10.0, Pillow >= 12.2.0, aiohttp >= 3.13.4, requests >= 2.33.0, cryptography >= 46.0.7, pytest >= 9.0.3, black >= 26.3.1.

Runtime v0.20260414.0

Parameter matching across casings and step order

Recent results preferred: when binding a parameter from prior results, the most recent step wins. A worksheet GUID from step 3 now correctly binds workbookWorksheetId, not a file's driveItemId from step 2.
snake_case and camelCase interchangeable: Slack's channel_id binds against upstream records keyed as channelId (and vice versa). Fields like display_name, channel_name, drive_id, drive_item_id, and created_at are recognized alongside their camelCase equivalents.
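Both rules in this release, newest result first and casing-insensitive keys, fit in a few lines. The sketch below uses hypothetical names (canonical, bind_parameter) and models prior results as a simple list of dicts ordered oldest to newest, which is an assumption about the data shape.

```python
from typing import Any, Dict, List, Optional


def canonical(key: str) -> str:
    """Fold snake_case and camelCase into one comparable form."""
    return key.replace("_", "").lower()


def bind_parameter(param: str, prior_results: List[Dict[str, Any]]) -> Optional[Any]:
    """Search prior step results newest-first for a field matching param."""
    wanted = canonical(param)
    for record in reversed(prior_results):  # most recent step wins
        for field, value in record.items():
            if canonical(field) == wanted:
                return value
    return None
```

With results from two steps that both expose a channel ID under different casings, the value from the later step is the one that binds.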

Runtime v0.20260413.1

Microsoft 365 entity IDs flow through multi-step chains

Expanded alias list for suffix-based parameter matching -- worksheet, notebook, section, page, channel, team, plan, list, event, and contact IDs -- unblocking common MS365/Graph chains like "open the Book.xlsx worksheet and get a range".
Real GUIDs no longer mistaken for placeholders: values like {4C35B2DD-58DF-4BDB-B806-E0421A3D5456} are no longer rejected by the unresolved-placeholder check.
JSON-string result fields parsed: when a tool returns {"result": "{\"value\":[...]}", "status": "success"}, the embedded string payload is now parsed for downstream extraction (matching the existing behavior for content).

Runtime v0.20260413.0

More resilient plan parsing

Plans with a preamble parse correctly: some models emit natural-language text before the JSON plan, sometimes in a markdown fence that doesn't start at character 0. The runtime now extracts plans via direct parse, code-fence regex, and outermost-{...}-with-"steps" search.
Large tool catalogs no longer truncate plans: planning calls now cap at 16K tokens, so formations with 100+ MCP tools stop silently falling back to delegation because the 4K default got hit mid-response.
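The three-stage extraction named above (direct parse, code-fence regex, outermost braces containing "steps") could be sketched as follows. The function names are hypothetical, and the fence pattern is a simplification that only handles triple-backtick fences with an optional json tag.

```python
import json
import re
from typing import Any, Dict, Optional

# Three backticks, optional "json" tag, then the fenced body (non-greedy).
FENCE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)


def _load_plan(text: str) -> Optional[Dict[str, Any]]:
    """Parse text as JSON and accept it only if it is an object with "steps"."""
    try:
        plan = json.loads(text)
    except json.JSONDecodeError:
        return None
    return plan if isinstance(plan, dict) and "steps" in plan else None


def extract_plan(response: str) -> Optional[Dict[str, Any]]:
    """Try direct parse, then fenced blocks, then the outermost {...} span."""
    plan = _load_plan(response.strip())
    if plan is not None:
        return plan
    for body in FENCE.findall(response):
        plan = _load_plan(body.strip())
        if plan is not None:
            return plan
    start, end = response.find("{"), response.rfind("}")
    if start != -1 and end > start:
        return _load_plan(response[start:end + 1])
    return None
```

Ordering the stages from strictest to loosest means a clean response never pays for the regex or the brace search.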

CLI v0.20260413.0

muxi validate recognizes the updated MCP declaration spec (mcps.servers) while staying compatible with legacy mcp.servers manifests.
MCP config files are matched from both mcps/ and legacy mcp/ directories so declaration checks and required-field validation report correctly during migration.

Runtime v0.20260410.0

Shared MCP defaults

MCP servers now support a parameters field: a flat key-value map of default values injected into every tool call on that server. This removes infrastructure constants (org-level drive IDs, tenant IDs, project keys) from agent prompts and LLM inference, making multi-step tool chains deterministic regardless of model.

# mcp/ms365.afs
schema: "1.0.0"
id: ms365-mcp
type: http
endpoint: "https://mcp.example.com/ms365"
parameters:
  driveId: "${{ secrets.ORG_DRIVE_ID }}"

Values support ${{ secrets.X }} and ${{ user.credentials.X }} interpolation
Caller-supplied arguments always take precedence over defaults
Works on both formation-level and agent-level MCP server declarations

See Tools Reference for full documentation.
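The precedence rule for server defaults can be shown with a plain dictionary merge. This is an illustrative sketch with a hypothetical function name, build_arguments; secret interpolation is assumed to have happened before this point.

```python
from typing import Any, Dict


def build_arguments(server_defaults: Dict[str, Any], caller_args: Dict[str, Any]) -> Dict[str, Any]:
    """Merge server-level defaults into a tool call; caller-supplied keys win."""
    merged = dict(server_defaults)  # copy so the shared defaults are never mutated
    merged.update(caller_args)
    return merged
```

A call that supplies only an item ID picks up driveId from the defaults, while a call that passes its own driveId keeps it.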

Multi-step MCP reliability

A bundle of fixes that make multi-step MCP chains (find user -> get drive root -> list files -> open workbook) less brittle:

Clean context at execution time: planned tool steps resolve parameters from the current request and successful prior results only, no longer dragging in stale memory or profile data.
Placeholders never shipped as arguments: {{ROOT_FOLDER_ID}} or <> strings that slip through are blocked before the call.
Failed steps excluded: parameter substitution and inference only consider successful prior results; error payloads no longer contaminate downstream steps.
Fabricated IDs rejected: an LLM-inferred ID value that appears in no prior record is dropped.
Modern MCP results parsed: tool results returned as JSON strings are parsed for downstream extraction.

Runtime v0.20260409.0

Faster persistent memory recall

Single multi-collection query for memory search (previously fanned out into separate lookups)
Lightweight fast-path for profile/synopsis queries before broader semantic search
Best-effort PostgreSQL indexes for user/collection filtering (non-fatal if creation fails)

Runtime v0.20260408.0

Smarter agent routing

Specialists beat generalists: routing now considers an agent's MCP tools, specialties, and domain keywords, not just its name and description.
Follow-ups stay with the same agent: session-aware routing biases follow-up messages toward the agent that handled the previous turn.
Exact dates preserved: workflow synthesis and agent planning responses no longer rewrite concrete dates/times as relative language.
Delegated specialists use their tools: when a broad agent delegates to a specialist, the specialist plans and executes its MCP tools instead of answering from the model's prior knowledge.

CLI v0.20260408.0

Chat sessions no longer time out during slow-starting formations that emit SSE keepalive frames
SSE stream parsing handles full event blocks, surfacing route-level errors and preserving progress/tool-activity events

SDKs v0.20260408.0

SSE parsing across all 12 SDKs is keepalive-aware (: keepalive comments no longer cause false idle failures)
Route-level event: error frames are surfaced as SDK errors instead of being silently dropped
New runtime event types (progress, thinking, planning, tool_call) preserved in chat streams

Runtime v0.20260407.0

HTTP MCP reliability

Per-request timeouts now enforced on all MCP HTTP operations (previously only on connect/init)
Explicit transport type preserved across reconnects (no more re-detection on every reconnect)
Client disconnects now cancel in-flight MCP requests and close their pooled connections

Runtime v0.20260403.0

MCP Accept header & faissx bump

Strict FastMCP servers no longer reject transport detection: the detector's ping request now sends Accept: application/json, text/event-stream, */* explicitly, satisfying servers that previously returned 406 Not Acceptable (community PR #139).
faissx >= 0.20260403.0: improved vector-store persistence across restarts -- no data loss on formation restart.

Runtime v0.20260402.0

Workflow tool-call reliability

Isolated context for workflow tasks: workflow-dispatched agents no longer inherit full conversation history, which was causing them to reproduce prior tool calls as XML text instead of issuing real API calls.
Today's date injected: agents now know the actual current date instead of falling back to their training-data cutoff.

Server v0.20260402.0

Creates {dataDir}/tmp on startup so TMPDIR works out of the box in Docker deployments
Default health check endpoint changed from /health to /v1/health to match the runtime API

Server v0.20260401.1

Auto-installs Apptainer on muxi-server start if not found on Linux (survives container restarts)
Fixed Apptainer/Singularity binary lookup to prefer apptainer over singularity

Runtime v0.20260401.0

Skill secrets

Skills can now reference ${{ secrets.X }} directly in their SKILL.md instructions -- the same syntax used everywhere else in MUXI. The runtime resolves secrets at activation time and injects them into the agent's context. Bundled scripts receive them as environment variables in the RCE subprocess.

Note: Skills RCE must not be exposed publicly -- it is an internal service intended for trusted network boundaries only.

SOP synthesis fixes

Synthesis step instructions no longer truncated: the old 500-character cap on SOP step body text is gone; instructions pass through verbatim.
Prior-step results reach the synthesis agent: dependency outputs are extracted and presented as clear content instead of raw metadata blobs.

SDKs v0.20260324.0

Added updateSchedulerJob, pauseSchedulerJob, and resumeSchedulerJob methods to all SDKs

March 2026

Runtime v0.20260330.1

MCP connection keep-alive

MCP tool calls no longer connect and disconnect for every invocation. After the first tool call, the connection is kept alive and reused for subsequent calls. Each call resets an idle timer; a background reaper closes connections that stay idle longer than the configured TTL (default: 5 minutes). Frequently used servers stay connected indefinitely; idle servers close automatically.

Configurable TTL: Set connection_ttl globally under mcp: or per-server in individual MCP server configs. Use connection_ttl: 0 to restore the previous connect-per-call behavior.
User isolation: Connections are keyed by server ID and credential hash, so different users never share a connection.
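
The keep-alive behavior described above can be sketched as a small pool keyed by server ID and credential hash. This is a minimal illustration under stated assumptions: the ConnectionPool class, the connect factory, the SHA-256 digest, and the injectable clock are all hypothetical names and choices for the example, not the runtime's internals.

```python
import hashlib
import time


class ConnectionPool:
    """Reuse connections per (server, credential) and close idle ones."""

    def __init__(self, connect, ttl_seconds=300.0, clock=time.monotonic):
        self._connect = connect      # hypothetical factory: (server_id, credential) -> connection
        self._ttl = ttl_seconds      # 300 s mirrors the 5-minute default
        self._clock = clock
        self._entries = {}           # key -> (connection, last_used)

    @staticmethod
    def _key(server_id, credential):
        # Hash the credential so the raw secret is never stored as a key.
        digest = hashlib.sha256(credential.encode()).hexdigest()
        return (server_id, digest)

    def acquire(self, server_id, credential):
        if self._ttl <= 0:
            # TTL 0: connect-per-call, nothing is cached; the caller closes it.
            return self._connect(server_id, credential)
        key = self._key(server_id, credential)
        entry = self._entries.get(key)
        conn = entry[0] if entry else self._connect(server_id, credential)
        self._entries[key] = (conn, self._clock())   # each call resets the idle timer
        return conn

    def reap(self):
        """Close and drop connections idle for longer than the TTL."""
        now = self._clock()
        expired = [k for k, (_, used) in self._entries.items() if now - used > self._ttl]
        for key in expired:
            conn, _ = self._entries.pop(key)
            conn.close()
        return len(expired)
```

In this sketch a background task would call reap() periodically; two users with different credentials for the same server get different keys and therefore never share a connection.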

Runtime v0.20260330.0

Response latency fixes

Workflow synthesis no longer times out on large results: When SOPs fetched large amounts of data (calendar events, emails, tasks), the synthesis step timed out after 30 seconds, causing 120-175 second silent gaps while retries ran. The synthesis timeout is now 120 seconds.
User identifier resolution cached: A database lookup that ran up to 8 times per request for the same immutable mapping is now cached for the formation's lifetime.
PDF processing works in SIF containers: libpoppler.so was not discoverable inside Apptainer containers because the container's LD_LIBRARY_PATH excluded system library paths. The entrypoint now appends standard system library paths when running in SIF mode.

Runtime v0.20260329.0

MCP HTTP transport CPU fix

Fixed a bug where MCP servers using type: http caused 90%+ idle CPU after the first tool call. The root cause is an upstream SDK bug (python-sdk#1805) where memory object streams with zero-buffer capacity create an infinite busy-loop during context teardown. The runtime now closes transport streams before SDK context exit, preventing the spin. This workaround will become a harmless no-op once the upstream fix ships.

Runtime v0.20260326.3

Generative UI skill

MUXI now ships with a built-in generative-ui skill for requests that are better shown than described. Agents can use it to create self-contained interactive HTML widgets, dashboards, diagrams, and visual explainers through the existing file-generation flow.

Interactive HTML artifacts: Steers agents toward single-file .html outputs with inline CSS/JS, responsive layouts, and dark-mode-friendly presentation.
Covered by tests: Added unit coverage for skill loading and activation, plus an end-to-end test that verifies RCE-backed HTML widget generation.

Runtime v0.20260326.2

XML tool parameter extraction

Improved tool-call parameter extraction so agents can recover parameters from more XML-shaped model outputs, not just the wrapped JSON form. This makes tool execution more reliable when the model emits individual parameter tags.

Broader XML support: The runtime now extracts values from individual XML parameter tags and assembles them into tool arguments automatically.
Fewer dropped tool calls: This reduces cases where a tool call was planned correctly but skipped because argument parsing failed.
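
To make the idea concrete, here is a minimal sketch of assembling tool arguments from individual parameter tags. The tag shape (<parameter name="...">value</parameter>) and the function name extract_parameters are assumptions for illustration; the XML shapes the runtime actually accepts are not specified here.

```python
import json
import re

# Hypothetical tag shape: <parameter name="key">value</parameter>
PARAM_TAG = re.compile(r'<parameter\s+name="([^"]+)"\s*>(.*?)</parameter>', re.DOTALL)


def extract_parameters(model_output: str) -> dict:
    """Collect individual parameter tags into a tool-argument dict."""
    args = {}
    for name, raw in PARAM_TAG.findall(model_output):
        value = raw.strip()
        try:
            # Numbers, booleans, lists, and objects parse as JSON.
            args[name] = json.loads(value)
        except json.JSONDecodeError:
            # Anything else is kept as a plain string.
            args[name] = value
    return args
```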

Runtime v0.20260326.1

MCP error handling & conversation-aware routing

MCP error flag now propagated correctly: When an MCP tool returned an error (e.g., "Item not found" from Microsoft Graph), the system incorrectly reported it as a success. Error responses are now detected and surfaced properly.
Follow-up messages stay with the right agent: Short follow-up messages like "change it to normal" were routed to the wrong agent because the router had no memory of the previous conversation. The system now tracks which agent handled the last message per session and biases follow-ups toward the same agent.

Runtime v0.20260325.1

MCP tool chaining reliability

Fixed a bug where multi-step MCP tool chains silently failed. When an agent needed to call multiple tools in sequence (e.g., list tasks then update a specific task), the second tool call could be skipped without error if the LLM returned a non-standard response format during parameter inference. The agent would report success despite never executing the action.

Parameter inference resilience: The inference step now extracts JSON parameters from non-standard LLM responses (XML tool-call wrappers, embedded JSON in prose) and retries on failure.
Smarter planning: Agents now automatically add prerequisite lookup steps when a tool requires an ID they don't have (e.g., fetching the task list before updating a task).
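
One way to recover parameters from a non-standard response is to scan for the first decodable JSON object and retry the model call if none is found. The sketch below shows that approach with the standard library only; extract_json_object, infer_parameters, and the call_llm callable are hypothetical names, and the retry count is an assumption.

```python
import json


def extract_json_object(text: str):
    """Return the first JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


def infer_parameters(call_llm, attempts=2):
    """Call the model up to `attempts` times until a JSON object is recovered."""
    for _ in range(attempts):
        params = extract_json_object(call_llm())
        if params is not None:
            return params
    raise ValueError("no JSON parameters found in model output")
```

Raising on final failure, rather than returning an empty dict, keeps a skipped tool call from being reported as a success.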

Runtime v0.20260325.0

SOP synthesis control

SOPs can now declare synthesis: false in their frontmatter to bypass the Overlord's response synthesis step. When set, the last completed task's raw output is returned as-is -- useful for SOPs that specify strict output formats like JSON, CSV, or structured templates. Default remains true for backward compatibility.

Parallel SOP execution

Independent SOP steps now execute concurrently in guide mode. The workflow engine detects fan-in patterns (e.g., three parallel data-fetching steps feeding one synthesis step) and runs them simultaneously instead of forcing sequential execution. Linear chains (A->B->C) still execute in order.
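
The fan-in pattern can be sketched with asyncio: steps whose dependencies are satisfied run together, and dependent steps wait for them. This is a simplified wave-based scheduler written for illustration, assuming each step is an async callable; run_steps and its arguments are hypothetical and do not describe the workflow engine's real interface.

```python
import asyncio


async def run_steps(steps, deps):
    """Run steps concurrently once their dependencies are done.

    steps: name -> async callable taking a dict of dependency results
    deps:  name -> list of step names it depends on
    """
    results = {}
    pending = set(steps)
    waves = []                       # recorded so the execution order can be inspected
    while pending:
        ready = sorted(n for n in pending if all(d in results for d in deps.get(n, [])))
        if not ready:
            raise ValueError("dependency cycle or missing step")
        # Every step in this wave is independent of the others, so run them together.
        outputs = await asyncio.gather(
            *(steps[n]({d: results[d] for d in deps.get(n, [])}) for n in ready)
        )
        results.update(zip(ready, outputs))
        pending -= set(ready)
        waves.append(ready)
    return results, waves
```

With three fetch steps feeding one synthesis step, this produces two waves (the three fetches, then the synthesis); a linear chain A->B->C degenerates to three single-step waves.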

Scheduler chat integration

Job creation no longer times out: Fixed a missing streaming event that caused chat-created scheduled jobs to hang indefinitely despite being written to the database.
Scheduler intent routing: Scheduler-related requests are now detected before SOP matching, preventing unrelated SOPs from hijacking scheduler intents.
List jobs via chat: Users can now ask "show my scheduled jobs" or "list my reminders" directly in chat. Detected via LLM analysis with keyword heuristic fallback.
Multi-day scheduling: "every Tuesday and Thursday at 3pm" now correctly produces 0 15 * * 2,4 instead of only capturing the first day.
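
For the multi-day case, the cron expression simply lists every weekday number in the day-of-week field. A minimal sketch, assuming the days have already been extracted from the request (weekly_cron and DAY_NUMBERS are hypothetical helpers, using the standard cron numbering where Sunday is 0):

```python
# Standard cron day-of-week numbering: Sunday = 0 ... Saturday = 6.
DAY_NUMBERS = {
    "sunday": 0, "monday": 1, "tuesday": 2, "wednesday": 3,
    "thursday": 4, "friday": 5, "saturday": 6,
}


def weekly_cron(days, hour, minute=0):
    """Build a cron expression that fires on every listed weekday."""
    numbers = sorted({DAY_NUMBERS[d.lower()] for d in days})
    return f"{minute} {hour} * * {','.join(str(n) for n in numbers)}"


print(weekly_cron(["Tuesday", "Thursday"], hour=15))  # 0 15 * * 2,4
```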

Streaming reliability

Broken pipe recovery: When a client disconnects mid-stream, the background processing task is now cancelled with a 5-second timeout. A 10-minute subscribe ceiling prevents zombie streams, and a stale request reaper force-fails requests stuck in processing.
Job title expansion: Chat-created job titles now support up to 500 characters (previously truncated at 61).

MCP tool chaining

Fixed sequential MCP tool calls losing context between steps. When an agent's execution plan chains multiple tools (e.g., list task lists, then fetch tasks from a specific list), the parameter inference LLM now receives results from previous steps, allowing it to extract IDs and values needed for subsequent calls.

Runtime v0.20260324.0

Scheduler API persistence & job lifecycle

Fixed critical data loss where all scheduler jobs vanished on restart -- route handlers fell back to in-memory dicts instead of calling JobManager for database persistence. All CRUD routes now use async JobManager methods with PostgreSQL via SQLAlchemy.

Jobs not persisted: create, list, get, and delete all used the in-memory fallback. They are now rewired to JobManager.
pause/resume/delete missing user_id: SchedulerService calls omitted the required user_id argument, causing a TypeError.
FK constraint on delete: Audit records now deleted before the job to avoid ForeignKeyViolation.
Double-call bug: get_default_nanoid()() raised 'str' object is not callable. Fixed to single call.

New scheduler endpoints

PUT /v1/scheduler/jobs/{job_id} -- Update job message, schedule, or title.
POST /v1/scheduler/jobs/{job_id}/pause -- Pause an active job.
POST /v1/scheduler/jobs/{job_id}/resume -- Resume a paused job.

CLI v0.20260324.0

Chat stream timeout guard

The CLI chat TUI no longer hangs indefinitely when the server stops emitting SSE events. A 60-second inactivity timeout now surfaces a clear error instead of spinning the thinking indicator forever. (Addresses the "NLP scheduling hangs" report where certain natural-language patterns caused stalled streams.)

Runtime v0.20260323.0

Memory recall fixes

Fixed three bugs in the overlord message processing pipeline that caused buffer memory recall failures:

Non-actionable path lost all context: When a recall question (e.g., "what is my favorite turtle?") was classified as non-actionable, the persona model received zero memory context and could not answer. The non-actionable path now preserves relevant memories and conversation context.
Recall questions misclassified: The LLM actionability check could classify memory recall questions as non-actionable. Messages with relevant long-term memories are now forced through the full agent pipeline.
Duplicate buffer storage: Both the chat orchestrator and overlord independently stored messages in buffer memory, halving effective capacity. The duplicate write was removed; the chat orchestrator is now the sole owner.

Scheduler and Memobase fixes

Scheduler worker now runs in a daemon thread instead of blocking the event loop during formation startup.
Memobase collection passthrough, search filter deduplication, and parameter compatibility fixes.

Server v0.20260323.0

Docker networking for host services

Added --add-host localhost:host-gateway and --add-host host.docker.internal:host-gateway to runtime-runner Docker commands so formations can reach host-local services (e.g., PostgreSQL) via localhost.

CDN release downloads

Switched release download URLs from github.com to releases.muxi.org for server and runtime binaries.

CLI v0.20260323.0 + Installer v0.20260323.0

CDN release downloads

The CLI upgrade flow and the installer scripts (install.sh, install.ps1) now use CDN-based release download URLs (releases.muxi.org) instead of direct GitHub release URLs.

CLI v0.20260314.0

`muxi server` renamed to `muxi remote`

All commands for managing deployed formations moved from muxi server to muxi remote to avoid confusion with muxi-server (the server daemon). muxi server will become a passthrough to muxi-server in a future release.

Runtime v0.20260312.0

Formation init hook

Formations now support a top-level init: field that runs a shell command before any services are initialized. Use it for environment setup such as creating directories, installing tools, or seeding data. The command runs with a 120-second timeout, uses the formation directory as its working directory, and fails the formation on a non-zero exit.
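
The hook semantics (shell command, formation directory as working directory, 120-second timeout, failure on non-zero exit) map closely onto the standard subprocess module. The sketch below is an illustration of those semantics only; run_init_hook is a hypothetical helper and RuntimeError is an assumed failure type, not the runtime's own code.

```python
import subprocess


def run_init_hook(command, formation_dir, timeout=120):
    """Run a shell command in the formation directory, failing on error or timeout."""
    try:
        # check=True turns a non-zero exit status into CalledProcessError.
        subprocess.run(command, shell=True, cwd=formation_dir, timeout=timeout, check=True)
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(f"init hook exited with status {exc.returncode}") from exc
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"init hook exceeded {timeout}s") from exc
```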

MCP path diagnostics

When a command-type MCP server fails with "Connection closed", the runtime now checks if any args look like filesystem paths that don't exist and prints a hint pointing to the init hook.

Runtime v0.20260311.0

Agent Skills

Formations now support skills -- self-contained packages of instructions, references, and executable scripts that give agents deep expertise on demand. Skills follow the open Agent Skills specification.

Discovery: Skills in skills/ directories are parsed at startup. Agents receive a lightweight catalog (~100 tokens per skill) in their system prompt.
Progressive disclosure: Full skill content loads only when the agent activates it, keeping baseline context lean.
Script execution: Skills with scripts/ directories can execute code in sandboxed containers via the RCE service. Configure with rce: { url, token } in your formation.
Three-layer isolation: Catalog filtering, tool restriction (allowed-tools), and planning prompt scoping ensure agents only see and activate authorized skills.
Built-in file-generation skill for producing PDFs, images, spreadsheets, and charts.
REST API: GET /v1/skills, GET /v1/skills/{name}, GET /v1/agents/{agent_id}/skills.

Skills documentation

MCP transport reliability

MCP streamable HTTP transport operations now have timeouts (asyncio.wait_for): 30s for connect, 10s for cleanup. Invalid auth tokens fail in under 1 second instead of hanging indefinitely.
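
A minimal sketch of the asyncio.wait_for pattern, using the 30s and 10s values from the note above. The helper name call_with_timeout and the decision to surface a ConnectionError are assumptions for illustration.

```python
import asyncio

CONNECT_TIMEOUT = 30.0   # seconds, as stated above
CLEANUP_TIMEOUT = 10.0


async def call_with_timeout(operation, timeout, label):
    """Await operation() but give up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(operation(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner task before raising, so nothing is left running.
        raise ConnectionError(f"{label} timed out after {timeout}s") from None
```

A caller would wrap each transport operation, for example await call_with_timeout(connect, CONNECT_TIMEOUT, "connect"), where connect is whatever coroutine function opens the transport.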

Credential selection fixes

Fixed 7 bugs in multi-credential MCP flows covering sync state management, credential caching after clarification, cache-aware skip to prevent re-asking, and string/dict type handling.

Skills RCE Integration

Built-in code execution: formations now ship with a managed RCE (Remote Code Execution) service for Skills
muxi-server init: downloads RCE automatically (SIF on Linux, Docker image on macOS/Windows)
muxi-server start: launches RCE as a managed process, injects MUXI_RCE_URL and MUXI_RCE_TOKEN into all formations
Auto port discovery: if default port 7891 is occupied, scans upward for an available port
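
Upward port scanning can be sketched as a bind probe in a loop. This is an illustrative version only: find_free_port, can_bind, the loopback host, and the 100-port search limit are assumptions, and only the 7891 default comes from the entry above.

```python
import socket


def can_bind(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def find_free_port(start=7891, max_tries=100, is_free=can_bind):
    """Return the first port at or above `start` that passes the probe."""
    for port in range(start, min(start + max_tries, 65536)):
        if is_free(port):
            return port
    raise RuntimeError(f"no free port found starting at {start}")
```

A probe like this is inherently racy (another process can take the port between the check and the real bind), so a production service would normally still handle a bind failure at startup.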

Upgrade Command

muxi-server upgrade: self-update the server binary, pull latest RCE, and migrate config
Downloads latest server binary from GitHub releases (atomic swap with rollback)
Adds missing config fields (e.g. RCE auth token) to existing configurations

Fixes

HuggingFace model cache: pass HF_HOME=/opt/hf-cache to containers so the pre-cached embedding model is used instead of re-downloading on every startup (~80s), which caused health check timeouts
npm/npx in containers: npm and npx are symlinks whose targets use a relative require('../lib/cli.js'), so bind-mounting the resolved path broke the import. Wrapper scripts are now created that invoke node with the full path to the npm module
Exact runtime version pinning: versions like muxi_runtime: "0.20260220.0" were rejected by the resolver if not in the local registry. Now passes exact versions through to the downloader, which checks disk and downloads if needed
Restore path: use downloader in the restore path to resolve latest runtime from GitHub instead of building a literal muxi-runtime-latest-*.sif filename
Runtime resolution: always resolve latest runtime from GitHub instead of using stale locally-cached version
Host tools: add npm, npx, bun, uv, uvx, tar, and gzip to tools bind-mounted into containers

Runtime v0.20260306.1

Explicit component declaration

Formations now require explicit declaration of agents, MCP servers, and A2A services. Files in subdirectories (agents/, mcp/, a2a/) are pure definitions -- only components listed in the formation manifest are loaded. This replaces auto-discovery and the active field, giving full control over what gets loaded.

agents:
  - support-agent       # Loads agents/support-agent.yaml by ID
  - research-agent      # Loads agents/research-agent.yaml by ID

mcp:
  servers:
    - github-mcp        # Loads mcp/github-mcp.yaml by ID

String entries reference files by ID. Dict entries are inline definitions. Omitted or empty lists mean nothing is loaded. Agents can now reference formation-level MCP servers by string ID in their mcp_servers field.

This is a breaking change. Remove active: from all component files and add explicit agents: / mcp.servers: lists to your formation.yaml.

Runtime v0.20260306.0 + Server v0.20260305.0

Use your formation from Claude Desktop, Cursor, and any MCP client

Every formation now exposes an MCP server at /mcp. Connect from Claude Desktop, Cursor, or any MCP-compatible client and interact with your agents using the standard Model Context Protocol – same memory, same tools, same auth. All 33 client endpoints are available as MCP tools with clean names (chat, list_sessions, search_memories, etc.). Admin and internal endpoints are never exposed.

Connect via MCP guide →

MCP authentication works exactly like the REST API: pass X-Muxi-Client-Key in your transport headers. No key, no access.

Async without webhooks

You can now fire off async requests and poll for results without setting up a webhook. The response includes a poll URL – just check back when you're ready. Per-request threshold_seconds and webhook_url overrides are now available in the chat request body too.

Faster responses

Context loading (memory, synopsis, buffer) now runs in parallel, saving 300-500ms per request. Simple greetings skip context entirely, cutting response time from ~4.4s to ~2.4s. JSON serialization switched to orjson (6x faster encoding).

CLI v0.20260306.0

Explicit component declaration (CLI support)

The CLI now fully supports the runtime's explicit declaration model. When you create a component with muxi new agent, muxi new mcp, or muxi new a2a-service, it automatically registers the ID in your formation manifest. muxi validate warns about undeclared files ("it will not be loaded") and errors on declared IDs with no matching file. All templates have active: removed.

File artifacts in chat

Formations that generate files (PDFs, images, data) now have their artifacts saved to ~/.muxi/cli/artifacts/. New management commands:

muxi artifacts list - List saved artifacts grouped by formation
muxi artifacts open - Open artifacts directory in file manager
muxi artifacts cleanup - Remove old artifacts (--days N, --formation )

All commands auto-scope to the current formation when run from inside a formation directory.

Runtime v0.20260302.0

Mix and match embedding models

Formations can now use any embedding dimension – the runtime creates dimension-specific memory tables automatically. A 384-dim local model and a 1536-dim OpenAI model can coexist in the same database. Local embedding models (local/all-MiniLM-L6-v2, etc.) run via sentence-transformers with no API key required.

If upgrading from an earlier version, rename your existing table: ALTER TABLE memories RENAME TO memories_1536;

February 2026: Initial Release

Server

Production-ready orchestration platform with HMAC authentication
Multi-formation support with zero-downtime deployment
Circuit breakers and resilience patterns
Hot-reload formations and health endpoints
Local development mode: /rpc/dev/run and /rpc/dev/stop endpoints, /draft/{id}/* proxy route
SDK update notifications via X-Muxi-SDK-Latest header (fetches release versions from GitHub API, refreshes every 24h)

Runtime

4-layer memory system (buffer, working, user synopsis, persistent)
FAISSx vector store for semantic search
Full MCP tool integration for agents
Observability system with 350+ typed events
Topic tagging for analytics
Self-healing tool chaining (agents recover from tool failures automatically)
User synopsis caching (80%+ token reduction)

CLI

muxi dev / muxi up for local development, muxi down to stop
muxi deploy for production deployment
Secrets management
Formation scaffolding
Draft and live formations can run simultaneously with the same ID

SDKs

Python, TypeScript, and Go client libraries
mode parameter ("live" or "draft") to switch between live and draft formations
Streaming support

OneLLM

Unified LLM interface supporting 15+ providers (OpenAI, Anthropic, Google, Ollama, and any OpenAI-compatible API)
Semantic caching (50-80% cost savings)
Streaming support

Documentation

Complete documentation restructure with concept pages, guides, deep dives, and reference
Added: Sessions, Multi-Identity Users, Scheduled Tasks, self-healing tool chaining, topic tagging, semantic caching