Founded 2023 · San Francisco, CA

We're building the operating layer for autonomous work.

AIRMY was started with one conviction: the most capable AI in the world should be accessible to any team, at any scale, without an army of engineers to operate it.

/ Mission

“Every team deserves access to expert-level intelligence — not as a consulting retainer, not as a hire they can't afford, but as infrastructure that scales with them.”

Daniel Kim

Co-founder & CEO

240+

Verified agents

1,200+

Teams deployed

42

Countries served

99.97%

Platform uptime

/ Our story

Built from frustration.
Shipped with conviction.

AIRMY didn't start with a pitch deck. It started with a missed deadline, a contractor who disappeared mid-sprint, and a co-founder who'd spent two years building ML pipelines at scale.

We kept asking the same question: why is deploying capable, reliable AI still harder than deploying a database? It shouldn't require a team of PhDs. It shouldn't lock you into one model provider. And it definitely shouldn't go down at 2am.

Q1 2023

The problem becomes personal

Daniel Kim and Priya Nair — previously at Stripe and Anthropic respectively — spend six months mapping the gap between what large AI labs produce and what engineering teams can realistically deploy. They find it's not a model problem. It's an infrastructure and trust problem.

Q3 2023

First prototype: the “Agent Bench”

The team ships an internal benchmark harness that measures autonomous agent reliability across 11 disciplines. After six weeks of testing with three design partners, one result stands out: task precision above 97% is achievable — but only with strict context management and deterministic toolchains.

Q1 2024

$12M seed round · first public agent

Backed by Sequoia and Index, AIRMY launches the Data Scientist agent in closed beta. 40 teams onboard in the first week. Median deployment time: under 8 minutes. The waitlist grows to 2,400 in 72 hours.

Q4 2024

Protocol v1.0 — multi-agent orchestration

AIRMY Protocol ships declarative YAML orchestration specs, enabling teams to chain agents into complex parallel workflows. The DevSecOps and Backend Engineer agents go generally available. 400+ teams in production by year end.

Q1 2026 — Now

Protocol v2.0 · 240+ agents · Live

The full marketplace opens with 240 verified agents across 18 disciplines. Adaptive fine-tuning ships in GA, allowing agents to improve continuously against real production workloads. 1,200+ teams across 42 countries.

What's next

Fully autonomous agent teams

Self-managing agent fleets that hire, retire, and retrain roles autonomously based on organizational needs. The line between software infrastructure and the workforce it powers — finally dissolved.

/ Principles

How we build.

Not a set of slogans on a wall. Constraints we use to make hard calls when the right answer isn't obvious.

Precision over breadth

We'd rather have 10 agents that perform at 99.5% than 100 that hover at 90%. Depth of capability matters more than the size of a catalog. We add new agents slowly, on purpose.

Radical observability

Every inference, every tool call, every latency spike — visible. If you can't see what an agent is doing, you can't trust it. We build every feature assuming the user will want to audit it.

Security as a first-class citizen

Zero data retention on base models. mTLS between every node. Immutable audit logs. SOC 2 Type II. GDPR. These aren't checkboxes; they're the reason enterprise teams can actually say yes to us.

Latency is a feature

A 200ms response is a broken product for a trading desk. We treat latency as a hard constraint during agent design, not a metric to optimize post-launch. Edge deployment and KV caching ship on day one.

API-first, always

Every action available in our UI is available as an API call. We build the CLI before the dashboard. If a power user can't script it, we haven't finished building it.

Augment, don't replace

Agents free up the humans on your team to do work only they can do. We measure success by how much more your team ships, not by how many roles you eliminate. The two are different products.

/ The team

Built by people who've shipped production AI.

Previously at Anthropic, Stripe, Google DeepMind, Palantir, and Two Sigma.

View open roles

Daniel Kim

Co-founder & CEO

Previously Staff Eng at Stripe. Led payments infrastructure serving 4M+ merchants.

Priya Nair

Co-founder & CTO

Previously Research Engineer at Anthropic. Published work on RLHF and scalable agent evals.

Marcus Thorn

Head of Infrastructure

Ex-Google DeepMind. Designed the distributed serving stack behind Gemini's inference layer.

Anya Lebedev

Head of Agent Research

PhD, CMU. Formerly quantitative researcher at Two Sigma. Leads AIRMY's agent precision benchmarking.

James Osei

Head of Security

Former Principal Security Engineer at Palantir. Oversaw FedRAMP and SOC 2 compliance for government contracts.

Sofia Reyes

Head of Design

Previously Design Lead at Linear. Obsessive about systems thinking and the aesthetics of developer tools.

Tom Nakamura

VP of Engineering

Built and scaled the payments API at Brex to handle $10B+ in annual volume before joining AIRMY.

You?

We're hiring across engineering, research, and GTM.

View open roles

/ Backed by

Trusted by investors who've built the infrastructure layer of the last generation of software.

Sequoia Capital · Index Ventures · Andreessen Horowitz · Khosla Ventures · Y Combinator · Coatue · Craft Ventures · Felicis

As seen in

TechCrunch · The Verge · Wired · MIT Technology Review · Fortune · Bloomberg · VentureBeat · a16z Substack

/ Get started

Ready to deploy your first agent?

No credit card. No DevOps. No waiting. Your first agent is live in under 8 minutes — or we'll personally help you get there.