Metacognitive agents for adaptive problem-solving inspired by Poetiq's record-breaking ARC-AGI solver.
This bundle provides agents that reason about how to solve problems, not just solve them directly. Based on Poetiq's state-of-the-art ARC-AGI approach, these agents enable:
- Complexity assessment: Judge if tasks are at a scale likely to succeed
- Iterative refinement: Multiple attempts with measured feedback (soft scoring)
- Solution evaluation: Objective scoring with detailed feedback
- Ensemble consensus: Parallel strategies with voting for critical decisions
```bash
# Run directly from GitHub
amplifier run --bundle git+https://github.com/ramparte/amplifier-bundle-metacognition@main "Your task"

# Or clone and run locally
git clone https://github.com/ramparte/amplifier-bundle-metacognition
amplifier run --bundle ./amplifier-bundle-metacognition/bundle.md "Your task"
```

The metacognition bundle enables adaptive problem-solving:
```bash
# The coordinator will automatically assess complexity and adapt strategy
amplifier run --bundle metacognition "Design and implement caching layer"

# Behind the scenes:
# 1. Delegates to complexity-assessor → score 7, "iterative-refinement"
# 2. Delegates to iterative-refiner → runs 3 iterations with feedback
# 3. Returns best solution with score 0.92
```

Four specialized agents orchestrate problem-solving:
complexity-assessor
- Analyzes task complexity (1-10 scale)
- Recommends strategy: solve-directly, single-pass-with-review, iterative-refinement, ensemble, decompose
- Considers scope, novelty, integration points, success criteria
iterative-refiner
- Implements Poetiq-style iterative refinement
- Soft scoring (0.0-1.0) tracks progress
- Learns from past attempts with rich feedback
- Terminates on success (score ≥ 0.9) or max iterations
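The loop described above can be sketched as follows. This is a minimal illustration, not the bundle's implementation: `generate` and `evaluate` are placeholders for delegated agent calls, not real Amplifier APIs.

```python
from typing import Callable, Tuple

def refine(task: str,
           generate: Callable[[str, str], str],
           evaluate: Callable[[str], Tuple[float, str]],
           threshold: float = 0.9,
           max_iterations: int = 5) -> Tuple[str, float]:
    """Generate → evaluate → refine, feeding each round's feedback forward."""
    best_solution, best_score, feedback = "", 0.0, ""
    for _ in range(max_iterations):
        solution = generate(task, feedback)   # feedback from the last round
        score, feedback = evaluate(solution)  # soft score in [0.0, 1.0]
        if score > best_score:
            best_solution, best_score = solution, score
        if score >= threshold:                # terminate on success
            break
    return best_solution, best_score
```

Note that the best-scoring attempt is kept even if a later iteration regresses.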
solution-evaluator
- Objective evaluation across 4 dimensions
- Scores: correctness, completeness, quality, testability
- Detailed feedback with specific locations and suggestions
- Recommendations: accept, iterate, reject
ensemble-coordinator
- Runs multiple strategies in parallel
- Groups identical solutions (consensus)
- Votes on best solution
- High consensus = high confidence
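The grouping-and-voting step is, at its core, a tally over identical solutions. A hypothetical sketch (the agent names and answers mirror the ensemble walkthrough later in this README; this is not Poetiq's actual implementation):

```python
from collections import Counter

def vote(solutions: dict[str, str]) -> tuple[str, float]:
    """Group identical solutions and return (winner, consensus fraction)."""
    tally = Counter(solutions.values())
    winner, votes = tally.most_common(1)[0]
    return winner, votes / len(solutions)

winner, consensus = vote({
    "zen-architect@0.3": "JWT + Redis",
    "modular-builder@0.5": "JWT + Redis",
    "modular-builder@0.7": "JWT + Redis",
    "bug-hunter@0.3": "OAuth2 + DB",
    "zen-architect@0.7": "Stateless JWT",
})
# winner == "JWT + Redis", consensus == 0.6
```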
Three comprehensive guides:
- poetiq-patterns.md: Poetiq's metacognitive patterns from ARC-AGI research
- complexity-signals.md: How to assess task complexity
- scoring-rubrics.md: Detailed rubrics for solution evaluation
Pre-configured bundle loading all agents with optimal settings. Includes foundation:dev for standard tools.
```text
Task → complexity-assessor → Strategy Recommendation
                      ↓
      ┌───────────────┼──────────────────┐
      ↓               ↓                  ↓
solve-directly  iterative-refinement  ensemble
      ↓               ↓                  ↓
   Execute     iterative-refiner   ensemble-coordinator
      ↓               ↓                  ↓
solution-evaluator    ↓                  ↓
      ↓               ↓                  ↓
   Result ←───────────┴──────────────────┘
```
```text
User: "Add logging to authentication module"

1. Coordinator delegates to complexity-assessor
   → Score: 4.0 (medium)
   → Recommendation: "single-pass-with-review"
2. Coordinator implements logging
3. Coordinator delegates to solution-evaluator
   → Score: 0.75 (good, some improvements needed)
   → Feedback: "Add structured logging, improve error messages"
4. Coordinator refines based on feedback
5. Coordinator evaluates again
   → Score: 0.92 (excellent)
   → Recommendation: "accept"

Result: High-quality implementation with objective validation
```
```text
User: "Design and implement caching layer"

1. Coordinator delegates to complexity-assessor
   → Score: 7.0 (high)
   → Recommendation: "iterative-refinement"
2. Coordinator delegates to iterative-refiner

   iterative-refiner orchestrates:
   - Iteration 1: Generate solution → evaluate → score 0.6
     Feedback: "Missing TTL support, no invalidation"
   - Iteration 2: Add TTL and invalidation → evaluate → score 0.8
     Feedback: "Good progress, but race conditions in concurrent access"
   - Iteration 3: Fix race conditions → evaluate → score 0.93
     Success! Return solution
3. Coordinator presents solution with confidence score

Result: Robust solution refined through measured iteration
```
```text
User: "Design authentication architecture for microservices"

1. Coordinator delegates to complexity-assessor
   → Score: 9.0 (very high)
   → Recommendation: "ensemble"
2. Coordinator delegates to ensemble-coordinator

   ensemble-coordinator spawns 5 parallel strategies:
   - zen-architect (temp=0.3)    → JWT + Redis
   - modular-builder (temp=0.5)  → JWT + Redis   ← CONSENSUS
   - modular-builder (temp=0.7)  → JWT + Redis   ←
   - bug-hunter (temp=0.3)       → OAuth2 + DB
   - zen-architect (temp=0.7)    → Stateless JWT

   Vote: 3 for JWT+Redis (60% consensus)
   Confidence: HIGH
3. Coordinator returns consensus solution with rationale

Result: Robust decision backed by multiple independent analyses
```
This collection is inspired by Poetiq's record-breaking ARC-AGI solver:
- SOTA Results: Achieved state-of-the-art on ARC-AGI-1 and ARC-AGI-2 benchmarks
- Iterative Refinement: Generate → Evaluate → Refine → Repeat
- Soft Scoring: Track progress with 0.0-1.0 scores (not just pass/fail)
- Rich Feedback: Detailed, actionable critique guides improvement
- Parallel Exploration: Multiple experts vote on best solution
Key Insight: Don't predict success upfront → measure and adapt in real-time.
See poetiq-patterns.md for complete details.
Purpose: Assess task complexity and recommend strategy
Inputs:
- Task description
- Available context (files, systems, etc.)
Outputs:

```json
{
  "complexity_score": 7.0,
  "confidence": 0.85,
  "recommendation": "iterative-refinement",
  "reasoning": "Novel architecture with multiple integration points...",
  "suggested_strategy": {
    "approach": "decompose-then-iterate",
    "substeps": [...],
    "estimated_iterations": 3
  }
}
```

When to use: Before starting any non-trivial task
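A coordinator consuming this output can dispatch on the `recommendation` field. The sketch below is hypothetical: the handler descriptions are placeholders, not real Amplifier entry points, though the five strategy names match the assessor's documented recommendations.

```python
import json

def choose_strategy(assessment_json: str) -> str:
    """Map the assessor's recommendation to a next step (placeholder text)."""
    assessment = json.loads(assessment_json)
    handlers = {
        "solve-directly": "run task inline",
        "single-pass-with-review": "solve, then delegate to solution-evaluator",
        "iterative-refinement": "delegate to iterative-refiner",
        "ensemble": "delegate to ensemble-coordinator",
        "decompose": "split into substeps, reassess each",
    }
    return handlers[assessment["recommendation"]]
```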
Purpose: Implement Poetiq-style iterative refinement
Inputs:
- Task to solve
- Max iterations (default: 5)
- Success threshold (default: 0.9)
Outputs:

```json
{
  "iteration": 3,
  "solution": "...",
  "self_score": 0.92,
  "breakdown": {
    "correctness": 0.95,
    "completeness": 0.90,
    "quality": 0.92,
    "clarity": 0.90
  },
  "history": [...],
  "best_iteration": 3
}
```

When to use: Novel work, creative tasks, complex implementations
Purpose: Objective evaluation with detailed feedback
Inputs:
- Solution (code/design/implementation)
- Requirements
Outputs:

```json
{
  "overall_score": 0.85,
  "scores": {
    "correctness": 0.9,
    "completeness": 0.8,
    "quality": 0.85,
    "testability": 0.85
  },
  "strengths": ["Clean separation of concerns", "..."],
  "weaknesses": [
    {
      "issue": "Missing null validation",
      "location": "auth.py:23",
      "severity": "high",
      "suggestion": "Add guard clause"
    }
  ],
  "recommendation": "iterate"
}
```

When to use: After completing medium+ complexity tasks, or to evaluate iterative-refiner's output
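When feeding the evaluator's output back into a refinement pass, it helps to surface the highest-severity weaknesses first. An illustrative helper (the field names match the output schema above; the function itself is an assumption, not part of the bundle):

```python
# Rank severities for sorting; lower number sorts first.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def feedback_lines(weaknesses: list[dict]) -> list[str]:
    """Format weaknesses as actionable feedback, highest severity first."""
    ordered = sorted(weaknesses, key=lambda w: SEVERITY_ORDER[w["severity"]])
    return [f'{w["location"]}: {w["issue"]} ({w["suggestion"]})'
            for w in ordered]
```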
Purpose: Parallel strategy exploration with voting
Inputs:
- Problem requiring multiple perspectives
- Number of strategies (default: 5)
- Agent/temperature diversity settings
Outputs:

```json
{
  "strategies_tried": 5,
  "consensus_groups": [
    {
      "solution_id": "A",
      "vote_count": 3,
      "agents": ["zen-architect", "modular-builder", "..."],
      "quality_score": 0.85,
      "solution": "..."
    }
  ],
  "recommendation": {
    "selected_solution": "A",
    "confidence": 0.9,
    "reasoning": "60% consensus with high quality score"
  }
}
```

When to use: Critical decisions, architectural choices, high-stakes work
Metacognitive agents work alongside standard Amplifier agents:
```yaml
agents:
  # Standard agents (if you have them)
  - zen-architect
  - modular-builder
  - bug-hunter
  # Metacognitive agents (from this collection)
  - complexity-assessor
  - iterative-refiner
  - solution-evaluator
  - ensemble-coordinator
```

Use cases:
- Use complexity-assessor to decide which standard agent to use
- Use iterative-refiner to orchestrate multiple attempts by standard agents
- Use ensemble-coordinator to get consensus from multiple standard agents
- Use solution-evaluator to assess standard agents' work
DO use for:
- Novel or creative work
- Critical decisions (architecture, security)
- Complex multi-step tasks
- When quality bar is high
DON'T use for:
- Simple, routine tasks (overhead not worth it)
- Time-sensitive work (adds latency)
- Well-defined, pattern-based work
- Complexity assessment: fast, always worth it (~5s)
- Iterative refinement: medium cost (N iterations × task time)
- Solution evaluation: fast (~10s per evaluation)
- Ensemble coordination: high cost (N parallel strategies)
Use ensemble sparingly - only for critical decisions worth the cost.
- 0.9-1.0: Excellent, ship it
- 0.7-0.9: Good, minor iteration recommended
- 0.5-0.7: Acceptable, needs iteration
- 0.3-0.5: Poor, major rework
- 0.0-0.3: Unacceptable, try different approach
Be honest with scores - 0.6 is okay and means "needs work."
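The rubric reads naturally as a threshold lookup. A sketch (band boundaries resolved upward, consistent with the 0.9 success threshold used elsewhere in this README; the function is illustrative, not part of the bundle):

```python
def interpret_score(score: float) -> str:
    """Map a soft score in [0.0, 1.0] to the rubric band above."""
    if score >= 0.9:
        return "excellent, ship it"
    if score >= 0.7:
        return "good, minor iteration recommended"
    if score >= 0.5:
        return "acceptable, needs iteration"
    if score >= 0.3:
        return "poor, major rework"
    return "unacceptable, try different approach"
```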
See the examples/ directory for complete working examples:
- example-simple.md: Low complexity task
- example-medium.md: Medium complexity with review
- example-complex.md: High complexity with iteration
- example-critical.md: Critical decision with ensemble
See examples/code/ for programmatic usage:
- basic_usage.py: Simple complexity assessment example
- iterative_workflow.py: Complete iterative refinement workflow
- error_handling.py: Error recovery patterns
This bundle embodies Amplifier's core principles:
- Ruthless Simplicity: Agents are markdown files, no complex orchestration code
- Measurement Over Prediction: Use soft scoring to track actual progress
- Present-Moment Focus: Solve current problems, don't future-proof
- Code for Structure, AI for Intelligence: Agents think, you orchestrate
From Poetiq:
- Don't guess if you'll succeed → measure and adapt
- Rich feedback enables learning
- Parallel exploration reduces bias
- Consensus = confidence
- CONTRIBUTING.md: Guidelines for contributing agents, tests, and documentation
- docs/TROUBLESHOOTING.md: Common issues and solutions
- docs/PERFORMANCE.md: Performance benchmarks and optimization guide
Found a pattern that works well? Improve the scoring rubrics? Share!
This bundle welcomes contributions. See CONTRIBUTING.md for guidelines.
MIT License - See LICENSE
Inspired by Poetiq's groundbreaking work on ARC-AGI:
- Blog: https://poetiq.ai/posts/arcagi_announcement/
- Code: https://github.com/poetiq-ai/poetiq-arc-agi-solver
- Paper: "Traversing the Frontier of Superintelligence"
Bundle Version: 0.1.0
Last Updated: 2026-01-06
Ready to get started?
```bash
amplifier run --bundle git+https://github.com/ramparte/amplifier-bundle-metacognition@main "Your complex task here"
```

Let the agents reason about how to solve your problems!