Observability
Track every request from ingress to response
MUXI emits structured events across the full request lifecycle - from initial routing through agent execution to final delivery. Stream these events to your logging infrastructure for debugging, auditing, and performance monitoring.
Why this matters: When a user reports "the agent gave me a wrong answer," you need to trace exactly what happened - which agent was selected, what tools were called, what context was retrieved from memory, and how the response was generated. Without observability, you're debugging blind.
Example use cases:
- Debugging: Trace why an agent selected the wrong tool or returned unexpected results
- Auditing: Log all tool calls and data access for compliance requirements
- Performance: Identify slow LLM calls, memory lookups, or tool executions
- Alerting: Get notified when error rates spike or latency exceeds thresholds
Event System
MUXI emits 349 typed events across the system lifecycle.
Event Categories
| Category | Count | Examples |
|---|---|---|
| System | 120 | Startup, shutdown, config |
| Conversation | 157 | Request, response, routing |
| Server | 9 | HTTP, connections |
| API | 2 | Call start/end |
| Error | 61 | All error types |
Event Format
```json
{
  "event_type": "conversation.request.received",
  "timestamp": "2025-01-08T10:00:00.123Z",
  "formation_id": "my-assistant",
  "session_id": "sess_abc123",
  "user_id": "user_123",
  "data": {
    "message_length": 42,
    "agent": "assistant"
  }
}
```
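Because every event shares this envelope, a consumer can route on `event_type` alone. A minimal sketch (the dotted-prefix convention is taken from the example above; the helper name is ours):

```python
import json

# Sample event in the envelope format shown above.
raw = """{
  "event_type": "conversation.request.received",
  "timestamp": "2025-01-08T10:00:00.123Z",
  "formation_id": "my-assistant",
  "session_id": "sess_abc123",
  "user_id": "user_123",
  "data": {"message_length": 42, "agent": "assistant"}
}"""

def category(event: dict) -> str:
    """Top-level category is the first dotted segment of event_type."""
    return event["event_type"].split(".", 1)[0]

event = json.loads(raw)
print(category(event))          # conversation
print(event["data"]["agent"])   # assistant
```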
Event Stream
Subscribe to events:
```bash
curl http://localhost:8001/v1/events \
  -H "X-Muxi-Admin-Key: fma_..." \
  -H "Accept: text/event-stream"
```
Events stream via SSE:
```
event: conversation.request.received
data: {"session_id": "sess_123", ...}

event: conversation.llm.call.started
data: {"model": "GPT-5", ...}

event: conversation.response.completed
data: {"tokens": 150, ...}
```
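An SSE body like the one above is just `event:`/`data:` lines separated by blank lines, so it can be split into frames without a client library. A minimal sketch, assuming one `event:` line and one or more `data:` lines per frame (a production consumer should use a proper SSE client that handles retries and multi-line data):

```python
def parse_sse(text: str):
    """Parse a text/event-stream body into (event, data) pairs.
    Sketch only: frames are separated by blank lines."""
    frames = []
    event, data = None, []
    for line in text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif not line:  # blank line ends the frame
            if event or data:
                frames.append((event, "\n".join(data)))
            event, data = None, []
    if event or data:  # flush a trailing frame with no final blank line
        frames.append((event, "\n".join(data)))
    return frames

sample = (
    "event: conversation.request.received\n"
    'data: {"session_id": "sess_123"}\n'
    "\n"
    "event: conversation.response.completed\n"
    'data: {"tokens": 150}\n'
)
print(parse_sse(sample))
```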
Key Events
Request Lifecycle
| Event | Description |
|---|---|
| `request.received` | Request arrived |
| `auth.validated` | Authentication passed |
| `orchestration.started` | Routing began |
| `agent.selected` | Agent chosen |
| `llm.call.started` | LLM request sent |
| `llm.call.completed` | LLM response received |
| `response.completed` | Response sent |
Tool Execution
| Event | Description |
|---|---|
| `tool.invoked` | Tool call started |
| `tool.completed` | Tool returned result |
| `tool.failed` | Tool error |
Memory
| Event | Description |
|---|---|
| `memory.loaded` | Context loaded |
| `memory.updated` | Memory saved |
| `knowledge.searched` | RAG search |
Logging configuration (schema-aligned)
Two-tier config matching the formation schema:
```yaml
logging:
  system:
    level: info
    destination: stdout          # or a file path
  conversation:
    enabled: true
    streams:
      - transport: stdout        # stdout | file | stream | trail
        level: info
        format: jsonl            # jsonl | text | msgpack | datadog_json | splunk_hec | ...
        events: ["request.*", "agent.*", "tool.*"]
      - transport: stream
        protocol: http           # http/https/zmq/websocket
        destination: https://logs.example.com/ingest
        format: jsonl
        auth:
          type: bearer
          token: "${{ secrets.LOG_TOKEN }}"
```
- System tier: infrastructure events (startup/shutdown, MCP, A2A, errors) → single destination.
- Conversation tier: multi-stream with per-stream level/format/event filters; use `events` to scope (e.g., `request.*`, `memory.*`).
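The per-stream `events` filters use wildcard patterns. A sketch of how such matching can work, assuming simple glob semantics against the short event names shown in the Key Events tables (MUXI's actual matching rules may differ):

```python
from fnmatch import fnmatch

def matches(event_type: str, patterns: list[str]) -> bool:
    """True if the event type matches any configured pattern.
    Assumes glob semantics, so 'request.*' covers 'request.received'."""
    return any(fnmatch(event_type, p) for p in patterns)

stream_filter = ["request.*", "agent.*", "tool.*"]
print(matches("request.received", stream_filter))   # True
print(matches("memory.loaded", stream_filter))      # False
```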
Metrics
Available Metrics
| Metric | Description |
|---|---|
| `requests_total` | Total requests |
| `request_duration` | Request latency |
| `llm_calls_total` | LLM API calls |
| `llm_tokens` | Token usage |
| `tool_calls` | Tool invocations |
| `errors_total` | Error count |
Prometheus Export
```bash
curl http://localhost:7890/metrics
```
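The endpoint returns the standard Prometheus text format, which is easy to read ad hoc. A minimal parser sketch; the sample scrape body below is illustrative (metric names from the table above, values invented), and a real exposition may also carry labels and histogram buckets:

```python
def parse_prometheus(text: str) -> dict:
    """Parse simple Prometheus text exposition into {metric: value}.
    Sketch only: ignores labels, histograms, and comment lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blanks
        name, _, value = line.rpartition(" ")
        if name:
            metrics[name] = float(value)
    return metrics

# Illustrative scrape body; values are made up.
sample = """\
# HELP requests_total Total requests
# TYPE requests_total counter
requests_total 1024
errors_total 3
"""
print(parse_prometheus(sample))
```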
Integration
Datadog
Forward logs:
```yaml
logging:
  conversation:
    enabled: true
    streams:
      - transport: stdout       # Datadog agent picks up stdout
        format: datadog_json
```
Elastic
Use Filebeat to ship logs:
```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/muxi/*.log
    json.keys_under_root: true
```
Custom Webhook
Forward events:
```yaml
observability:
  webhook:
    url: https://your-service.com/events
    events:
      - "error.*"
      - "conversation.completed"
```
Debugging
Enable Debug Logs
```yaml
logging:
  system:
    level: debug
```
Trace Requests
Add trace ID:
```bash
curl -H "X-Trace-Id: trace_123" ...
```
Trace ID appears in all related events.
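Since the trace ID is stamped on every related event, reconstructing one request is a filter over the event log. A sketch, assuming the ID is surfaced as a top-level `trace_id` field (the field name here is illustrative):

```python
def events_for_trace(events: list[dict], trace_id: str) -> list[dict]:
    """Filter an event log down to a single request's trace.
    Assumes each event carries a top-level 'trace_id' field."""
    return [e for e in events if e.get("trace_id") == trace_id]

log = [
    {"event_type": "conversation.request.received", "trace_id": "trace_123"},
    {"event_type": "conversation.llm.call.started", "trace_id": "trace_456"},
    {"event_type": "conversation.response.completed", "trace_id": "trace_123"},
]
for e in events_for_trace(log, "trace_123"):
    print(e["event_type"])
```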