/ Agent Studio
Design agents. Test them. Ship them.
Agent Studio is the visual workspace for configuring, fine-tuning, and evaluating agents before they go to production.
01 / Visual Configuration Editor
Configure without code.
Set system prompts, bind tools, adjust model parameters, and configure context windows, all from a single visual interface. Changes preview in real time, so you can iterate without redeploying.
- Drag-and-drop tool binding
- Real-time preview pane
- Export to YAML or JSON
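As a rough illustration of what working with an exported configuration could look like, the sketch below parses a JSON export and checks that the expected settings are present. The field names (`system_prompt`, `tools`, `model`, `context_window`) and the tool names are assumptions made for this example, not AIRMY's documented export schema.

```python
import json

# Hypothetical export. The field names and values are illustrative
# assumptions, not AIRMY's documented schema.
EXPORTED_CONFIG = """
{
  "name": "data-engineer",
  "system_prompt": "You are a careful data engineering assistant.",
  "tools": ["sql_runner", "file_reader"],
  "model": {"temperature": 0.2, "max_output_tokens": 1024},
  "context_window": 128000
}
"""

REQUIRED_KEYS = {"name", "system_prompt", "tools", "model", "context_window"}


def load_config(raw: str) -> dict:
    """Parse an exported config and check that the expected keys exist."""
    config = json.loads(raw)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    return config


config = load_config(EXPORTED_CONFIG)
print(config["name"], config["tools"])
```

Keeping an export like this under version control is one way to review configuration changes outside the visual editor.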
02 / Interactive Test Harness
Catch regressions before production.
Run test suites against your agent directly in the browser and compare outputs across versions side by side. Every save runs your suite automatically, so regressions are caught before they reach production users.
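The idea behind a version-to-version regression check can be sketched in a few lines of Python. This is a minimal illustration, not the harness itself: `agent_v1` and `agent_v2` are hypothetical stand-ins for two saved agent versions, and the exact-match comparison is an assumption (a real suite may score outputs more loosely).

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for two saved versions of the same agent.
def agent_v1(prompt: str) -> str:
    return prompt.strip().upper()


def agent_v2(prompt: str) -> str:
    return prompt.upper()  # no longer strips surrounding whitespace


# Each test case pairs a prompt with the expected output.
TEST_SUITE: List[Tuple[str, str]] = [
    ("hello", "HELLO"),
    ("  padded  ", "PADDED"),
]


def find_regressions(
    old: Callable[[str], str],
    new: Callable[[str], str],
    suite: List[Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Return the cases that the old version passes and the new version fails."""
    regressions = []
    for prompt, expected in suite:
        old_out, new_out = old(prompt), new(prompt)
        if old_out == expected and new_out != expected:
            regressions.append(
                {"prompt": prompt, "expected": expected, "old": old_out, "new": new_out}
            )
    return regressions


regressions = find_regressions(agent_v1, agent_v2, TEST_SUITE)
for r in regressions:
    print(f"REGRESSION on {r['prompt']!r}: expected {r['expected']!r}, got {r['new']!r}")
```

Only cases that passed before and fail now are reported, which keeps the side-by-side view focused on what the latest change broke.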
03 / Fine-tuning Pipeline
Teach your agents new behaviours.
Provide example input/output pairs and AIRMY handles the fine-tuning job end to end. Track training loss live, preview model outputs at each checkpoint, and publish to production once the eval metrics meet your threshold.
- Upload CSV or JSONL training data
- Live training loss chart
- Checkpoint comparison
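Before uploading, it can help to check that a JSONL file really contains one input/output pair per line. The sketch below shows one way to do that in Python. The `input` and `output` key names are assumptions for illustration, so check the upload dialog for the field names AIRMY actually expects.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical training data. The "input"/"output" key names are assumed.
SAMPLE_JSONL = """\
{"input": "Summarise: revenue rose 4%.", "output": "Revenue increased by 4%."}
{"input": "Summarise: churn fell.", "output": "Churn decreased."}
{"input": "This record has no output field"}
"""


def parse_training_data(text: str) -> Tuple[List[Dict], List[Tuple[int, str]]]:
    """Split JSONL text into valid examples and (line_number, error) problems."""
    examples, problems = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict) or not {"input", "output"} <= record.keys():
            problems.append((number, "missing 'input' or 'output'"))
            continue
        examples.append(record)
    return examples, problems


examples, problems = parse_training_data(SAMPLE_JSONL)
print(f"{len(examples)} valid examples, {len(problems)} problems")
for number, error in problems:
    print(f"line {number}: {error}")
```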
Training Job — data-engineer-v2
04 / Prompt Template Library
200+ battle-tested system prompts.
Choose from more than 200 community-contributed and AIRMY-verified system prompt templates covering data analysis, code review, customer support, legal summarisation, and more. Every template is fully customisable: use it as a starting point or deploy it as-is.
05 / Evaluation Metrics Dashboard
Measure what matters.
Track accuracy, consistency, latency, and token usage across agent versions in a unified dashboard. AIRMY's built-in eval harness runs automated tests on every save, so you always have a fresh signal before pushing to production.
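To make the four metrics concrete, here is a minimal sketch of how they could be computed from raw test runs and compared across two versions. The record layout, the sample numbers, and the definition of consistency (the share of test cases whose repeated runs all produce the same output) are assumptions for illustration, not AIRMY's documented definitions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: (case_id, correct, output, latency_ms, tokens).
# Each case is run twice so that consistency can be measured.
RUNS = {
    "v1.4": [
        ("a", True, "42", 800, 400),
        ("a", True, "42", 900, 420),
        ("b", False, "no", 1000, 500),
        ("b", True, "yes", 1100, 480),
    ],
    "v1.5": [
        ("a", True, "42", 700, 380),
        ("a", True, "42", 720, 400),
        ("b", True, "yes", 900, 450),
        ("b", True, "yes", 880, 450),
    ],
}


def summarise(runs):
    """Aggregate per-run records into the four dashboard metrics."""
    outputs_by_case = defaultdict(set)
    for case_id, _correct, output, _latency, _tokens in runs:
        outputs_by_case[case_id].add(output)
    return {
        # Share of runs marked correct.
        "accuracy": mean(1.0 if r[1] else 0.0 for r in runs),
        # Share of cases whose repeated runs all gave the same output.
        "consistency": mean(
            1.0 if len(outs) == 1 else 0.0 for outs in outputs_by_case.values()
        ),
        "latency_ms": mean(r[3] for r in runs),
        "tokens": mean(r[4] for r in runs),
    }


summary = {version: summarise(runs) for version, runs in RUNS.items()}
for version, metrics in summary.items():
    print(version, metrics)
```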
Eval Metrics — v1.4 vs v1.5
Review Queue
Increased context window to 128k
Priya Nair · awaiting approval
Updated system prompt — tone adjustment
James D. · approved 1h ago
06 / Collaboration & Review
Ship together, safely.
Invite team members to collaborate on agent configuration. Review proposed changes in a Git-style diff view. Require approval from a designated reviewer before any change reaches production, so your critical agents stay stable.
Your agents deserve a proper workbench.
Open Agent Studio and start building in minutes.
Open Agent Studio