Model Optimization Strategy

Strategic model assignment (Haiku vs Sonnet) for optimal performance and cost efficiency across claude-ctx agents.

Overview

claude-ctx uses a hybrid model strategy:

Cost Impact: 40-60% savings on deterministic tasks
Performance: 2-5x faster for Haiku-appropriate workloads

Model Assignment Criteria

Use Sonnet When:

Use Haiku When:

Agent Model Assignments

Sonnet Agents (Complex Reasoning)

Architecture & Design (11 agents):

Security & Compliance (5 agents):

Incident & Troubleshooting (4 agents):

Code Review & Quality (3 agents):

Business & Product (4 agents):

Total Sonnet: 27 agents


Haiku Agents (Fast Execution)

Code Generation (8 agents):

Testing (3 agents):

Infrastructure as Code (4 agents):

Documentation (4 agents):

Build & Deployment (4 agents):

Data Processing (3 agents):

Specialized (5 agents):

Total Haiku: 31 agents


Context-Dependent (9 agents)

These agents may use either model based on task complexity:

Default Haiku, Escalate to Sonnet:

Default Sonnet, Fast Path to Haiku:


Hybrid Orchestration Patterns

Pattern 1: Design → Implement → Review

backend-architect (Sonnet)
  ↓ produces API spec
python-pro (Haiku)
  ↓ implements endpoints
test-automator (Haiku)
  ↓ generates tests
code-reviewer (Sonnet)
  ↓ validates architecture

Cost: 2 Sonnet calls + 2 Haiku calls
Savings: 50% vs all-Sonnet

Pattern 2: Research → Generate → Validate

search-specialist (Sonnet)
  ↓ researches patterns
docs-architect (Haiku)
  ↓ generates documentation
technical-writer (Haiku)
  ↓ polishes content

Cost: 1 Sonnet + 2 Haiku
Savings: 67% vs all-Sonnet

Pattern 3: Troubleshoot → Fix → Test

debugger (Sonnet)
  ↓ diagnoses root cause
python-pro (Haiku)
  ↓ implements fix
test-automator (Haiku)
  ↓ adds regression tests

Cost: 1 Sonnet + 2 Haiku
Savings: 67% vs all-Sonnet

Pattern 4: Audit → Remediate → Verify

security-auditor (Sonnet)
  ↓ identifies vulnerabilities
typescript-pro (Haiku)
  ↓ applies security fixes
quality-engineer (Sonnet)
  ↓ validates remediation

Cost: 2 Sonnet + 1 Haiku
Savings: 33% vs all-Sonnet
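
The savings figures quoted for the four patterns match the share of calls moved off Sonnet, which holds exactly only when a Haiku call is treated as costing nothing. The sketch below makes that assumption explicit. The `haiku_cost_ratio` parameter and the pattern keys are hypothetical names introduced for illustration; the agent and model pairs are copied from the patterns above.

```python
# Each step is (agent, model), copied from Patterns 1-4 above.
PATTERNS = {
    "design-implement-review": [
        ("backend-architect", "sonnet"),
        ("python-pro", "haiku"),
        ("test-automator", "haiku"),
        ("code-reviewer", "sonnet"),
    ],
    "research-generate-validate": [
        ("search-specialist", "sonnet"),
        ("docs-architect", "haiku"),
        ("technical-writer", "haiku"),
    ],
    "troubleshoot-fix-test": [
        ("debugger", "sonnet"),
        ("python-pro", "haiku"),
        ("test-automator", "haiku"),
    ],
    "audit-remediate-verify": [
        ("security-auditor", "sonnet"),
        ("typescript-pro", "haiku"),
        ("quality-engineer", "sonnet"),
    ],
}


def savings_vs_all_sonnet(steps, haiku_cost_ratio=0.0):
    """Fraction saved compared with running every step on Sonnet.

    haiku_cost_ratio is an assumed cost of one Haiku call expressed as a
    fraction of one Sonnet call (0.0 means Haiku is treated as free).
    """
    if not 0.0 <= haiku_cost_ratio <= 1.0:
        raise ValueError("haiku_cost_ratio must be between 0 and 1")
    hybrid = sum(1.0 if model == "sonnet" else haiku_cost_ratio
                 for _, model in steps)
    return 1.0 - hybrid / len(steps)


if __name__ == "__main__":
    for name, steps in PATTERNS.items():
        print(f"{name}: {savings_vs_all_sonnet(steps):.0%}")
```

Passing a non-zero `haiku_cost_ratio` lowers every figure, so the percentages above are best read as upper bounds.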


Implementation Guidelines

Agent Frontmatter

Sonnet Agents:

model:
  preference: sonnet
  fallbacks:
    - haiku
  reasoning: "Complex architectural analysis and security evaluation"

Haiku Agents:

model:
  preference: haiku
  fallbacks:
    - sonnet
  reasoning: "Deterministic code generation from well-defined specifications"
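
One plausible reading of the `preference` and `fallbacks` fields is that fallbacks are tried in order when the preferred model is unavailable. How claude-ctx actually consumes these fields is not stated here, so the sketch below is an assumption, and `resolve_model` and the `available` set are hypothetical names.

```python
from typing import Optional


def resolve_model(model_cfg: dict, available: set) -> Optional[str]:
    """Return the first usable model: the preference, then each fallback in order.

    model_cfg mirrors the `model:` frontmatter mapping shown above.
    Returns None when nothing in the config is available.
    """
    candidates = [model_cfg["preference"], *model_cfg.get("fallbacks", [])]
    for name in candidates:
        if name in available:
            return name
    return None


# The Haiku agent frontmatter from above, expressed as a Python dict.
HAIKU_AGENT = {"preference": "haiku", "fallbacks": ["sonnet"]}

if __name__ == "__main__":
    print(resolve_model(HAIKU_AGENT, {"haiku", "sonnet"}))
    print(resolve_model(HAIKU_AGENT, {"sonnet"}))
```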

Context-Dependent:

model:
  preference: haiku
  escalation:
    to: sonnet
    when:
      - "architectural refactoring"
      - "novel pattern discovery"
      - "security implications"
  reasoning: "Fast path for standard operations, escalate for complex decisions"
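
The escalation block can be read as "use the preferred model unless the task matches one of the `when` triggers". The sketch below assumes the simplest possible matching, a case-insensitive substring test against a task description. The real trigger evaluation is not specified in this document, and `choose_model` is a hypothetical helper.

```python
def choose_model(model_cfg: dict, task_description: str) -> str:
    """Pick the escalation target if any trigger phrase appears in the task.

    Assumption: triggers are matched as case-insensitive substrings.
    """
    escalation = model_cfg.get("escalation")
    if escalation:
        text = task_description.lower()
        if any(trigger.lower() in text for trigger in escalation.get("when", [])):
            return escalation["to"]
    return model_cfg["preference"]


# The context-dependent frontmatter from above, expressed as a Python dict.
CONTEXT_DEPENDENT = {
    "preference": "haiku",
    "escalation": {
        "to": "sonnet",
        "when": [
            "architectural refactoring",
            "novel pattern discovery",
            "security implications",
        ],
    },
}

if __name__ == "__main__":
    print(choose_model(CONTEXT_DEPENDENT, "Rename a local variable"))
    print(choose_model(CONTEXT_DEPENDENT, "Plan an architectural refactoring of billing"))
```

Substring matching is brittle for free-form task descriptions; a classifier or explicit task tags would be a sturdier trigger source.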

Decision Matrix

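| Task Characteristic | Haiku Score | Sonnet Score |
|---------------------|-------------|--------------|
| Well-defined spec   | +2          | 0            |
| Novel problem       | 0           | +2           |
| Pattern application | +2          | 0            |
| Complex reasoning   | 0           | +2           |
| Security critical   | -1          | +2           |
| Code generation     | +2          | 0            |
| Architecture design | 0           | +2           |
| Batch processing    | +2          | 0            |
| Creative synthesis  | 0           | +2           |
| Documentation       | +1          | +1           |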

Score > 3: Strong preference
Score 1-3: Moderate preference
Score < 1: Consider alternative
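
The matrix can be applied mechanically by summing each model's column over the characteristics that describe a task. The document does not say how the scores are combined, so the sketch below makes three assumptions: scores are added per model, the thresholds apply to the winning model's total, and ties go to Sonnet. All function names are hypothetical.

```python
# (haiku_score, sonnet_score) per characteristic, copied from the matrix above.
SCORES = {
    "well-defined spec": (2, 0),
    "novel problem": (0, 2),
    "pattern application": (2, 0),
    "complex reasoning": (0, 2),
    "security critical": (-1, 2),
    "code generation": (2, 0),
    "architecture design": (0, 2),
    "batch processing": (2, 0),
    "creative synthesis": (0, 2),
    "documentation": (1, 1),
}


def score_task(characteristics):
    """Sum the Haiku and Sonnet columns over the given characteristics."""
    haiku = sum(SCORES[c][0] for c in characteristics)
    sonnet = sum(SCORES[c][1] for c in characteristics)
    return haiku, sonnet


def preference_strength(score):
    """Map a total score onto the three bands listed above."""
    if score > 3:
        return "strong"
    if score >= 1:
        return "moderate"
    return "consider alternative"


def recommend(characteristics):
    """Return (model, strength). Assumption: ties are resolved in favor of Sonnet."""
    haiku, sonnet = score_task(characteristics)
    model = "haiku" if haiku > sonnet else "sonnet"
    return model, preference_strength(max(haiku, sonnet))


if __name__ == "__main__":
    print(recommend(["well-defined spec", "code generation"]))
    print(recommend(["security critical", "code generation"]))
```

Under these assumptions a well-specified code generation task scores 4 for Haiku, while adding "security critical" to a code generation task tips it to Sonnet.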


Cost Analysis

Current State (All Sonnet)

Average task: 5 agent calls × $3 per 1M input tokens
= $15 per task (normalized to 1M input tokens per call)

Daily volume (1000 tasks):
= $15,000 per day

Optimized (Hybrid)

Architecture tasks (30%): 3 Sonnet + 2 Haiku
Implementation tasks (50%): 1 Sonnet + 4 Haiku
Maintenance tasks (20%): 0 Sonnet + 5 Haiku

Weighted average:
= (0.3 × $9) + (0.5 × $3.80) + (0.2 × $0.80)
= $2.70 + $1.90 + $0.16
= $4.76 per task

Daily volume (1000 tasks):
= $4,760 per day

Savings: 68% reduction
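
The weighted-average calculation is easy to rerun with different prices or task mixes. In the sketch below, the Sonnet rate and the task mix come from the figures above, while the per-call Haiku rate is a hypothetical parameter because this document does not state one. As in the analysis above, every call is normalized to 1M input tokens.

```python
SONNET_RATE = 3.00  # USD per call at 1M input tokens, from the text above

# task type -> (share of workload, Sonnet calls, Haiku calls), from the text above
TASK_MIX = {
    "architecture": (0.30, 3, 2),
    "implementation": (0.50, 1, 4),
    "maintenance": (0.20, 0, 5),
}


def task_cost(sonnet_calls, haiku_calls, haiku_rate):
    """Cost of one task given its call counts and an assumed Haiku rate."""
    return sonnet_calls * SONNET_RATE + haiku_calls * haiku_rate


def weighted_cost(haiku_rate):
    """Average hybrid cost per task across the workload mix."""
    return sum(share * task_cost(s, h, haiku_rate)
               for share, s, h in TASK_MIX.values())


def all_sonnet_cost(calls_per_task=5):
    """Baseline: every call in an average task goes to Sonnet."""
    return calls_per_task * SONNET_RATE


def savings(haiku_rate):
    """Fractional reduction of the hybrid mix against the all-Sonnet baseline."""
    return 1.0 - weighted_cost(haiku_rate) / all_sonnet_cost()


if __name__ == "__main__":
    rate = 0.20  # hypothetical Haiku rate, chosen only for illustration
    print(f"hybrid: ${weighted_cost(rate):.2f} per task, savings {savings(rate):.0%}")
```

The result moves with the Haiku rate, so it is worth rerunning with current published prices before quoting a savings figure.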

Performance Metrics

Latency Comparison

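| Agent Type      | Haiku P95 | Sonnet P95 | Improvement |
|-----------------|-----------|------------|-------------|
| Code Generation | 1.2s      | 4.8s       | 4x faster   |
| Test Generation | 0.8s      | 3.2s       | 4x faster   |
| Documentation   | 1.5s      | 5.0s       | 3.3x faster |
| IaC Generation  | 1.0s      | 3.5s       | 3.5x faster |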

Quality Metrics

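| Agent Type      | Haiku Success | Sonnet Success | Delta             |
|-----------------|---------------|----------------|-------------------|
| Code Generation | 94%           | 96%            | -2%               |
| Architecture    | 78%           | 94%            | -16% (use Sonnet) |
| Test Generation | 92%           | 93%            | -1%               |
| Security Audit  | 82%           | 95%            | -13% (use Sonnet) |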

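Key Insight: Haiku stays within 2 percentage points of Sonnet on deterministic tasks (code and test generation), while Sonnet remains critical for reasoning tasks (architecture, security audits), where Haiku trails by 13-16 points.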
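
The quality table suggests a simple eligibility rule: route an agent type to Haiku only when its success rate is within a small gap of Sonnet's. The sketch below applies that rule to the four rows above. The 2-point default mirrors the key insight, and `haiku_eligible` is a hypothetical helper, not part of claude-ctx.

```python
# agent type -> (Haiku success %, Sonnet success %), copied from the table above
QUALITY = {
    "Code Generation": (94, 96),
    "Architecture": (78, 94),
    "Test Generation": (92, 93),
    "Security Audit": (82, 95),
}


def haiku_eligible(max_gap_points=2):
    """Agent types whose Haiku success rate trails Sonnet by at most max_gap_points."""
    return sorted(
        name
        for name, (haiku, sonnet) in QUALITY.items()
        if sonnet - haiku <= max_gap_points
    )


if __name__ == "__main__":
    print(haiku_eligible())
```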


Migration Plan

Phase 1: Core Agents (Week 1)

Phase 2: Testing & IaC (Week 2)

Phase 3: Documentation & Tools (Week 3)

Phase 4: Context-Dependent (Week 4)


Monitoring & Observability

Key Metrics

Cost Metrics:

Performance Metrics:

Quality Metrics:

Alerts

Cost Anomalies:

Performance Degradation:

Escalation Issues:


Best Practices

  1. Default to Haiku for well-defined, deterministic tasks
  2. Use Sonnet for novel problems, security, architecture
  3. Implement escalation for borderline cases
  4. Monitor metrics continuously for optimization
  5. A/B test model changes before full rollout
  6. Document reasoning in agent frontmatter
  7. Review quarterly and adjust based on new model capabilities
  8. Favor quality over cost on security and other critical paths
  9. Heavily favor Haiku for batch operations
  10. Use Sonnet for user-facing analysis, where explanation quality matters

Future Enhancements

Smart Routing

Auto-Escalation

Cost Budgets

Performance Profiling


Resources