Response Formats Deep Dive
Technical implementation of LLM soul-based formatting
MUXI's response format system uses LLM soul instructions combined with post-processing validation to generate naturally formatted content while maintaining technical correctness.
Architecture
Hybrid Approach
User Request
↓
Soul Enhancement (LLM instructions)
↓
Agent Processing (natural generation)
↓
Post-Processing Validation (JSON/HTML only)
↓
Formatted Response
Why hybrid?
- LLM instructions: Natural, contextually appropriate formatting
- Post-processing: Ensures technical validity for structured formats
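The flow above can be sketched end to end as follows. This is a minimal illustration, not the runtime's actual code: `respond`, `fake_llm`, `FORMAT_INSTRUCTIONS`, and `VALIDATORS` are hypothetical names, the instructions are abbreviated, and the HTML validator is left out for brevity.

```python
import asyncio
import json
from typing import Awaitable, Callable

# Hypothetical, abbreviated instructions; the real wording lives in the runtime.
FORMAT_INSTRUCTIONS = {
    "markdown": "Format your response using proper markdown.",
    "text": "Format your response as plain text.",
    "json": "Format your response as valid JSON.",
    "html": "Format your response as valid HTML.",
}


def reformat_json(content: str) -> str:
    """Re-serialize valid JSON; return the input unchanged if it does not parse."""
    try:
        return json.dumps(json.loads(content), indent=2, ensure_ascii=False)
    except json.JSONDecodeError:
        return content


# Only structured formats get a post-processing step (HTML omitted here).
VALIDATORS: dict[str, Callable[[str], str]] = {"json": reformat_json}


async def respond(
    generate: Callable[[str, str], Awaitable[str]],
    base_soul: str,
    message: str,
    fmt: str,
) -> str:
    # 1. Soul enhancement: append the format instruction to the system prompt.
    enhanced_soul = f"{base_soul}\n\n{FORMAT_INSTRUCTIONS[fmt]}"
    # 2. Agent processing: the model generates the response.
    raw = await generate(enhanced_soul, message)
    # 3. Post-processing validation, applied only where a validator exists.
    validator = VALIDATORS.get(fmt)
    return validator(raw) if validator else raw


async def fake_llm(system: str, message: str) -> str:
    """Stand-in for a real model call; always returns compact JSON."""
    return '{"status":"ok"}'


if __name__ == "__main__":
    print(asyncio.run(respond(fake_llm, "You are a helpful assistant", "Ping", "json")))
```

Markdown and text requests pass through step 3 untouched, which mirrors the "JSON/HTML only" note in the diagram.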
Soul Enhancement
How It Works
The system modifies the agent's system prompt based on format:
# Internal implementation
def _get_format_instruction(format: str) -> str:
    if format == "markdown":
        return """
        Format your response using proper markdown with headers (# ## ###),
        bullet points, bold/italic text, and code blocks where appropriate.
        Use markdown syntax for structure and emphasis.
        """
    elif format == "text":
        return """
        Format your response as plain text with no markdown formatting,
        special characters, or HTML. Use simple text formatting like
        line breaks and spacing for structure.
        """
    elif format == "json":
        return """
        Format your response as valid JSON. Use appropriate data structures
        (objects, arrays, strings, numbers, booleans). Ensure all JSON is
        properly formatted and parseable.
        """
    elif format == "html":
        return """
        Format your response as valid HTML with proper semantic tags like
        <h1>, <h2>, <p>, <ul>, <li>, <strong>, <em>, and <code>. Include
        proper structure and ensure all tags are properly closed.
        """
Soul Injection
# Simplified flow
enhanced_soul = f"{base_soul}\n\n{format_instruction}"
llm_response = await llm.chat(
    messages=messages,
    system=enhanced_soul  # Enhanced with format instructions
)
The LLM naturally generates formatted responses based on enhanced instructions.
Post-Processing Validation
JSON Validation
import json
import logging

logger = logging.getLogger(__name__)

def validate_json_response(content: str) -> str:
    """Validate and reformat JSON response."""
    try:
        # Parse to ensure validity
        parsed = json.loads(content)
        # Re-serialize with consistent formatting
        return json.dumps(parsed, indent=2, ensure_ascii=False)
    except json.JSONDecodeError as e:
        # Invalid JSON - log and return as-is
        logger.warning(f"Invalid JSON response: {e}")
        return content
Why validate?
- Ensures response is parseable
- Consistent formatting (indentation, spacing)
- Catches LLM errors early
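To make the two outcomes concrete, here is a small standalone run of the same logic on one valid and one invalid payload. The function body repeats the validator shown above so the example runs on its own; the sample strings are made up for illustration.

```python
import json
import logging

logger = logging.getLogger(__name__)


def validate_json_response(content: str) -> str:
    """Same logic as the validator above, repeated so this example is standalone."""
    try:
        return json.dumps(json.loads(content), indent=2, ensure_ascii=False)
    except json.JSONDecodeError as e:
        logger.warning("Invalid JSON response: %s", e)
        return content


compact = '{"name":"José","tags":["a","b"]}'  # valid, but compact
broken = "{name: 'John'}"                      # unquoted key, single quotes

pretty = validate_json_response(compact)    # re-indented, non-ASCII preserved
unchanged = validate_json_response(broken)  # logged, then returned as-is

print(pretty)
print(unchanged)
```

Because invalid input comes back unchanged, a caller cannot tell from the return value alone whether validation succeeded; the warning in the log is the only signal in this version.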
HTML Validation
from bs4 import BeautifulSoup

def validate_html_response(content: str) -> str:
    """Validate and reformat HTML response."""
    try:
        # Parse with BeautifulSoup
        soup = BeautifulSoup(content, 'html.parser')
        # Validate structure
        if not soup.find():
            raise ValueError("No valid HTML tags found")
        # Return prettified HTML
        return soup.prettify()
    except Exception as e:
        logger.warning(f"Invalid HTML response: {e}")
        return content
Validation includes:
- Proper tag nesting
- Closed tags
- Semantic structure
- Auto-correction of minor issues
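The nesting and closed-tag checks listed above can also be illustrated without BeautifulSoup. The sketch below swaps in the standard library's `html.parser` and reports problems instead of repairing them; `NestingChecker` and `find_nesting_problems` are hypothetical names, not part of MUXI.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Collects nesting problems instead of repairing them."""

    def __init__(self) -> None:
        super().__init__()
        self.open_tags: list[str] = []
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def find_nesting_problems(content: str) -> list[str]:
    checker = NestingChecker()
    checker.feed(content)
    checker.close()
    # Anything still open at the end was never closed.
    return checker.problems + [f"unclosed <{t}>" for t in checker.open_tags]


print(find_nesting_problems("<p>Text<strong>Bold"))
```

This check is deliberately stricter than HTML itself, which permits omitted end tags for elements such as `p` and `li`. A report-only check like this suits logging; the BeautifulSoup path above repairs the markup instead.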
Format-Specific Behavior
Real-Time Streaming
MUXI streams tokens as the model generates them. Use SSE for simple, one-way streams (chat, dashboards) or WebSockets for bidirectional interactions. Long tasks can start streaming, then hand off to async workflows with progress events.
- Protocols: SSE, WebSockets
- When to use: Chat UIs, live dashboards, long-running jobs needing updates
- See also: Streaming, Async Operations
Markdown
No validation - LLMs are good at markdown naturally.
Features:
- Headers (#, ##, ###)
- Bold and italic
- Code blocks with syntax highlighting
- Lists (bulleted and numbered)
- Links and references
- Tables
Example response:
# Analysis Results
## Key Findings
1. **Performance**: 95% improvement
2. **Cost**: Reduced by 40%
### Recommendations
- Optimize caching strategy
- Scale horizontally for peak loads

    # Deploy command
    kubectl apply -f deployment.yaml
Plain Text
No validation - flexible by design.
Features:
- Clean, unformatted text
- Line breaks for structure
- No special characters
- Maximum compatibility
Example response:
Analysis Results
Key Findings:
1. Performance improved by 95%
2. Cost reduced by 40%
Recommendations:
- Optimize caching strategy
- Scale horizontally for peak loads
Deployment: kubectl apply -f deployment.yaml
JSON
Validated and reformatted.
Structure:
{
  "content": "The actual response content",
  "type": "response",
  "format": "json",
  "metadata": {
    // Optional metadata
  }
}
Validation:
- Must be valid JSON
- Parsed with json.loads()
- Re-serialized with consistent formatting
Error handling:
if validation_fails:
    # Return original content with error metadata
    return {
        "content": original_content,
        "error": "Invalid JSON",
        "format": "json"
    }
HTML
Validated with BeautifulSoup.
Allowed tags:
- Structural: div, span, section
- Headers: h1 through h6
- Text: p, strong, em, code
- Lists: ul, ol, li
- Links: a
- Pre-formatted: pre, code
Not allowed:
- Scripts: script
- Styles: style (inline styles ok)
- Forms: form, input (security)
- Iframes: iframe (security)
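One way to enforce the allowed and disallowed tag lists is a filter that re-emits only permitted markup. The sketch below uses the standard library's `html.parser` rather than BeautifulSoup and is an illustration only: `Sanitizer`, `sanitize_html`, and the choice to keep just `href` and `style` attributes are assumptions, and it does not vet URL schemes, so it is not a substitute for a vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Tag names taken from the allowed / not-allowed lists above.
ALLOWED_TAGS = {"div", "span", "section", "h1", "h2", "h3", "h4", "h5", "h6",
                "p", "strong", "em", "code", "ul", "ol", "li", "a", "pre"}
# Disallowed elements whose contents are dropped along with the tag.
DROP_WITH_CONTENT = {"script", "style", "form", "iframe"}
# Assumption: links keep href, inline styles are kept, everything else is dropped.
KEPT_ATTRS = {"href", "style"}


class Sanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            kept = [(k, v) for k, v in attrs if k in KEPT_ATTRS and v is not None]
            attr_text = "".join(f' {k}="{escape(v)}"' for k, v in kept)
            self.out.append(f"<{tag}{attr_text}>")
        # Any other tag (for example input) is silently dropped.

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data, quote=False))


def sanitize_html(content: str) -> str:
    sanitizer = Sanitizer()
    sanitizer.feed(content)
    sanitizer.close()
    return "".join(sanitizer.out)


print(sanitize_html("<p>Text<script>alert()</script></p>"))
```

Text inside unknown but harmless tags is kept while the tags themselves are removed; text inside `script`, `style`, `form`, and `iframe` is discarded entirely.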
Validation:
# Auto-closes unclosed tags
<p>Text<p>More → <p>Text</p><p>More</p>
# Fixes nesting
<strong><em>Text</strong></em> → <strong><em>Text</em></strong>
# Removes dangerous tags
<p>Text<script>alert()</script></p> → <p>Text</p>
Configuration
Formation YAML
# formation.afs
overlord:
  soul: "You are a helpful assistant"
  response:
    format: "markdown"   # Default format
    streaming: true      # Works with all formats
    validation: true     # Enable post-processing (JSON/HTML)
Runtime Override
from muxi.runtime.formation import Formation
formation = Formation()
await formation.load("formation.afs")
# Get overlord instance
overlord = await formation.start_overlord()
# Override format
overlord.response_format = "json"
response = await overlord.chat("Extract data...")
# Reset to formation default
overlord.response_format = None
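Since the override is a plain attribute assignment, it is easy to forget the reset step. A small context manager can restore the previous value automatically, even when the call raises. This is a sketch: `response_format` (the helper) and `FakeOverlord` are hypothetical, and the only thing assumed about the real overlord is the `response_format` attribute shown above.

```python
from contextlib import contextmanager


@contextmanager
def response_format(overlord, fmt: str):
    """Temporarily override overlord.response_format, restoring it on exit."""
    previous = overlord.response_format
    overlord.response_format = fmt
    try:
        yield overlord
    finally:
        overlord.response_format = previous


class FakeOverlord:
    """Stand-in used only to exercise the helper without a running formation."""
    response_format = None


overlord = FakeOverlord()
with response_format(overlord, "json"):
    inside = overlord.response_format  # "json" while the block runs
after = overlord.response_format       # back to None afterwards
```

If one overlord instance serves concurrent requests, mutating a shared attribute affects all of them while the override is active, so a per-request format (as in the API request) may be the safer choice there.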
API Request
curl -X POST http://localhost:8001/v1/chat \
-H "Content-Type: application/json" \
-d '{
"message": "Analyze this data",
"format": "json"
}'
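The same request can be built from Python with only the standard library. The sketch below constructs the request without sending it, so it runs without a server; `build_chat_request` is a hypothetical helper, and the URL and body fields are copied from the curl command above.

```python
import json
import urllib.request


def build_chat_request(
    message: str,
    fmt: str,
    base_url: str = "http://localhost:8001",
) -> urllib.request.Request:
    """Build the same POST as the curl command above, without sending it."""
    body = json.dumps({"message": message, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("Analyze this data", "json")

# Sending requires a running server:
# with urllib.request.urlopen(request) as resp:
#     print(resp.read().decode("utf-8"))
```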
Streaming Behavior
Format-Specific Streaming
Markdown and Text:
event: chunk
data: {"text": "# Header\n"}
event: chunk
data: {"text": "Some **bold** text"}
Streams naturally, readable as it arrives.
JSON:
event: chunk
data: {"text": "{\"name\": "}
event: chunk
data: {"text": "\"John\""}
event: chunk
data: {"text": ", \"email\": "}
event: chunk
data: {"text": "\"john@example.com\""}
event: chunk
data: {"text": "}"}
Client must collect all chunks, then parse complete JSON.
HTML:
event: chunk
data: {"text": "<h1>"}
event: chunk
data: {"text": "Header"}
event: chunk
data: {"text": "</h1>"}
Similar to JSON - collect and validate at end.
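The collect-then-parse pattern for JSON and HTML streams might look like this on the client. It is a sketch under assumptions drawn from the examples above: each event is named `chunk`, and each `data:` line holds a single-line JSON object with a `text` field. `collect_chunks` is a hypothetical helper.

```python
import json
from typing import Iterable


def collect_chunks(lines: Iterable[str]) -> str:
    """Join the "text" field of every chunk event in a stream of SSE lines."""
    parts: list[str] = []
    event = None
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:") and event == "chunk":
            payload = json.loads(line[len("data:"):].strip())
            parts.append(payload["text"])
        elif line == "":
            event = None  # a blank line ends the current event
    return "".join(parts)


stream = [
    "event: chunk", r'data: {"text": "{\"name\": "}', "",
    "event: chunk", r'data: {"text": "\"John\""}', "",
    "event: chunk", r'data: {"text": "}"}', "",
]

# Parse only after the whole stream has been collected.
document = json.loads(collect_chunks(stream))
print(document)
```

Parsing any prefix of the stream would fail, since the JSON is incomplete until the final chunk arrives; markdown and text clients can instead render each chunk immediately.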
Performance Considerations
Token Overhead
Format instructions add ~50-100 tokens to system prompt:
| Format | Instruction Tokens | Impact |
|---|---|---|
| Markdown | ~60 tokens | Minimal |
| Text | ~50 tokens | Minimal |
| JSON | ~70 tokens | Minimal |
| HTML | ~80 tokens | Minimal |
Cost impact: <0.01% for typical conversations.
Validation Overhead
| Operation | Time | When |
|---|---|---|
| JSON parse | <1ms | Every JSON response |
| HTML parse | 1-5ms | Every HTML response |
| Markdown | 0ms | No validation |
| Text | 0ms | No validation |
Latency impact: Negligible (<0.1% of total).
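The timings in the table depend on payload size and hardware, so it can be worth measuring with representative responses. A minimal sketch for the JSON path, using `timeit` from the standard library; `mean_parse_ms` and the sample payload are made up for illustration.

```python
import json
import timeit


def mean_parse_ms(payload: str, runs: int = 1000) -> float:
    """Average wall-clock time of one parse-and-reserialize pass, in milliseconds."""

    def one_pass() -> None:
        json.dumps(json.loads(payload), indent=2, ensure_ascii=False)

    total_seconds = timeit.timeit(one_pass, number=runs)
    return total_seconds / runs * 1000


# Synthetic payload: a list of 100 small objects.
sample = json.dumps({"items": [{"id": i, "name": f"item-{i}"} for i in range(100)]})
print(f"{mean_parse_ms(sample):.4f} ms per response")
```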
Error Handling
Invalid JSON
import json

# Agent returns invalid JSON
response_text = "{name: 'John'}"  # Unquoted key, single-quoted string
# Validation fails
try:
    result = json.loads(response_text)
except json.JSONDecodeError:
    # Log error, fall back to a wrapped response
    result = {
        "content": response_text,
        "error": "Invalid JSON returned by agent",
        "format": "json"
    }
Invalid HTML
from bs4 import BeautifulSoup

# Agent returns malformed HTML
response_text = "<p>Text<strong>Bold"  # Unclosed tags
# BeautifulSoup auto-corrects
soup = BeautifulSoup(response_text, 'html.parser')
corrected = soup.prettify()
# Result (prettify() also adds line breaks and indentation):
# <p>Text<strong>Bold</strong></p>
BeautifulSoup is forgiving - auto-closes tags, fixes nesting.
Best Practices
1. Clear Schema Expectations
agents:
  - id: extractor
    system_message: |
      Extract customer information as JSON with fields:
      - name (string, required)
      - email (string, required)
      - phone (string, optional)
      - company (string, optional)
      Use null for missing optional fields.
2. Validation in Code
import json
from pydantic import BaseModel

class Customer(BaseModel):
    name: str
    email: str
    phone: str | None = None
    company: str | None = None

# Validate response
response = await overlord.chat("Extract customer info...")
data = json.loads(response.content)
customer = Customer(**data)  # Validates schema
3. Retry on Failure
import json

max_retries = 3
for attempt in range(max_retries):
    response = await overlord.chat(message)
    try:
        data = json.loads(response.content)
        break  # Success
    except json.JSONDecodeError:
        if attempt == max_retries - 1:
            raise  # Give up after retries
        # Retry with clarification
        message = f"{message}\n\nPrevious response was invalid JSON. Please return valid JSON."
Future Enhancements
Planned Features
Pydantic Schema Enforcement
response:
  format: json
  schema: "schemas/customer.json"
Custom Validators
response:
  format: json
  validator: "validators/custom.py"
Format Auto-Detection
response:
  format: auto  # Detect from context
Debugging
Enable Validation Logs
logging:
  level: debug
  validators: true
Logs all validation attempts and failures.
Inspect Raw Responses
response = await overlord.chat("Extract data...")
print("Raw:", response.raw_content) # Before validation
print("Validated:", response.content) # After validation
Learn More
- Structured Output Concept - User-facing explanation
- Tools & MCP - Tools also return structured data
- API Reference - Complete API documentation