Research Engineer, Agent Systems
One of the most mission-driven organizations in AI is building the infrastructure that makes intelligent agents safe and reliable in the real world. This is not an application-layer role. You’ll work directly alongside researchers to build the execution layer that determines how agents reason, act, fail, recover, and improve in production.
Where the Frontier Is Actually Being Built
We’re partnering with a mission-driven frontier AI lab focused on building safe, reliable intelligent systems — not shipping products, not chasing growth metrics. The work is foundational. The team is small. The impact is real.
The organization operates with a founding team at the forefront of AI safety and superintelligence research, a deliberate research-first culture where engineering quality is non-negotiable, and a small team where every engineer shapes the direction of the system.
What You’ll Build
This is a research engineering role at the frontier. You’ll translate model insights into production-grade systems — sitting at the boundary between what researchers discover and what actually runs reliably in the world.
- Build and evolve agent execution frameworks used directly in research and production
- Develop evaluation infrastructure that measures reliability, capability, and safety together
- Design control-plane systems for routing, planning, and tool use with strong correctness guarantees
- Build feedback loops that close the gap between offline evaluation and real-world behavior
- Create observability and simulation systems that make failure modes visible and fixable
- Work directly with researchers to turn experimental insights into stable, scalable systems
- Contribute to sandboxed environments where agents can operate and self-validate safely
- Continuously adapt orchestration systems as model capabilities evolve
What They’re Looking For
- Experience building and operating complex systems in production, and keeping them reliable under real-world pressure
- Ability to debug complex systems and identify root causes of failures, not just symptoms
- Comfort working in ambiguous, fast-moving environments where the problems are genuinely unsolved
- Familiarity with experimentation, evaluation, or data-driven product improvement loops
- Experience working closely with researchers or data scientists — translating their needs into reliable infrastructure
- Experience owning systems end-to-end, from design through production and iteration
- Strong backend engineering foundation — distributed systems, async execution, observability
- You’ve built or worked on agent harnesses, orchestration layers, or execution frameworks
- You think in terms of control planes, feedback loops, and system-level optimization — not just features
- You’re excited about diagnosing failure modes and iterating toward measurable improvements
- You care deeply about production quality — not just making systems work, but making them reliable, safe, and scalable
- You’re motivated by pushing the frontier of how intelligent systems behave in the real world
Why This Role Is Different
- Safety is a first-class engineering concern. Not a compliance layer added at the end — a design constraint built into every system from the start.
- You’ll work with researchers, not around them. This is a true research engineering role — your infrastructure directly enables the science.
- Small team, outsized leverage. Every architectural decision you make shapes how intelligent systems behave in the world.
- The mission is the point. This lab exists to get superintelligence right. If that drives you, there is no comparable environment.
Compensation & Perks
$300,000 – $600,000+ total compensation depending on level, with meaningful equity. This is not typical startup equity: the lab is building toward one of the most consequential outcomes in technology.
Interview Process
- Intro Call with Aionia: role alignment, background overview, and candidate brief review
- Technical Screen: systems depth, distributed infrastructure, and agent/eval thinking
- Practical Assessment: systems design or project-based challenge relevant to the execution layer
- Founder / Team Interview: mission alignment, research culture fit, and vision conversation
