Frontier AI Lab · Confidential · San Francisco

Research Engineer, Agent Systems

One of the most mission-driven organizations in AI is building the infrastructure that makes intelligent agents safe and reliable in the real world. This is not an application-layer role. You’ll work directly alongside researchers to build the execution layer that determines how agents reason, act, fail, recover, and improve in production.

$300K – $600K+ Total Comp + Equity
San Francisco · On-Site
Frontier AI Lab · Confidential
Highly Selective · 1 Engineer
No Visa Sponsorship

Apply via AIONIA

The Organization

Where the Frontier Is Actually Being Built

We’re partnering with a mission-driven frontier AI lab focused on building safe, reliable intelligent systems — not shipping products, not chasing growth metrics. The work is foundational. The team is small. The impact is real.

The organization operates with a founding team at the forefront of AI safety and superintelligence research, a deliberate research-first culture where engineering quality is non-negotiable, and a small team where every engineer shapes the direction of the system.

Frontier: Founding team from the top of the AI research world
Small: Every hire shapes system architecture, no passengers
Mission: Research-first culture where engineering quality is non-negotiable
$300K+: Top-of-market comp with meaningful equity for the right person

“This lab exists to get superintelligence right. If that motivates you, you’ll find this environment unlike anywhere else.”

The Work

What You’ll Build

This is a research engineering role at the frontier. You’ll translate model insights into production-grade systems — sitting at the boundary between what researchers discover and what actually runs reliably in the world.

Build and evolve agent execution frameworks used directly in research and production
Develop evaluation infrastructure that measures reliability, capability, and safety together
Design control-plane systems for routing, planning, and tool use with strong correctness guarantees
Build feedback loops that close the gap between offline evaluation and real-world behavior
Create observability and simulation systems that make failure modes visible and fixable
Work directly with researchers to turn experimental insights into stable, scalable systems
Contribute to sandboxed environments where agents can operate and self-validate safely
Continuously adapt orchestration systems as model capabilities evolve

Tech Stack

Stack & Tools

Python · Distributed Systems · Eval Frameworks · Agent Orchestration · Observability · Async Execution · ML Pipelines

Requirements

What They’re Looking For

Must-Have — Non-Negotiable

Experience building and operating complex systems in production and keeping them reliable under real-world pressure
Ability to debug complex systems and identify root causes of failures, not just symptoms
Comfort working in ambiguous, fast-moving environments where the problems are genuinely unsolved

Required

Familiarity with experimentation, evaluation, or data-driven product improvement loops
Experience working closely with researchers or data scientists — translating their needs into reliable infrastructure
Experience owning systems end-to-end, from design through production and iteration
Strong backend engineering foundation — distributed systems, async execution, observability

Even Better If

You’ve built or worked on agent harnesses, orchestration layers, or execution frameworks
You think in terms of control planes, feedback loops, and system-level optimization — not just features
You’re excited about diagnosing failure modes and iterating toward measurable improvements
You care deeply about production quality — not just making systems work, but making them reliable, safe, and scalable
You’re motivated by pushing the frontier of how intelligent systems behave in the real world

Why This Role

Why This Role Is Different

Safety is a first-class engineering concern. Not a compliance layer added at the end — a design constraint built into every system from the start.
You’ll work with researchers, not around them. This is a true research engineering role — your infrastructure directly enables the science.
Small team, outsized leverage. Every architectural decision you make shapes how intelligent systems behave in the world.
The mission is the point. This lab exists to get superintelligence right. If that drives you, there is no comparable environment.

Compensation

Compensation & Perks

$300,000 – $600,000+ total compensation depending on level, with meaningful equity at an organization of this caliber. This is not typical startup equity — this is a lab building toward one of the most consequential outcomes in technology.

Top-of-Market Base · Meaningful Equity · On-Site San Francisco · Small Elite Team

Process

Interview Process

Intro Call with Aionia: Role alignment, background overview, and candidate brief review
Technical Screen: Systems depth, distributed infrastructure, and agent/eval thinking
Practical Assessment: Systems design or project-based challenge relevant to the execution layer
Founder / Team Interview: Mission alignment, research culture fit, and vision conversation

Location	San Francisco, CA
Work Policy	On-Site · 5 days a week
Total Comp	$300K – $600K+
Equity	Meaningful
Experience	3–15 yrs · Backend / Infra
Openings	1 Engineer · Selective
Client	Confidential
Visa Sponsorship	Not available