Member of Technical Staff | AI Systems & Benchmarking | SF – AIONIA

Most AI roles focus on building AI systems. This one focuses on something more fundamental — understanding how AI systems actually perform in the real world, and defining how that performance is measured. You’ll operate at the intersection of systems, analysis, AI, and strategy.

$130K – $220K + Equity · On-Site · San Francisco · AI Systems & Benchmarking · Early-Stage Equity

Not Just Building AI — Understanding It

This is not a typical AI role. You won’t just build AI systems — you’ll work on the problems that define how modern AI is evaluated, understood, and deployed. You’ll shape the frameworks, datasets, and metrics that the industry uses to make sense of model and agent behavior at scale.

“You’ll operate at the intersection of systems, analysis, AI, and strategy — on problems that most engineers never get to touch.”

This Role Tends to Resonate With

This role attracts a specific type of person. You might be a fit if you:

  • Think deeply about how AI systems behave, not just how to build them
  • Have built real systems: APIs, pipelines, integrations, or products
  • Enjoy breaking down complex problems into structured, reusable frameworks
  • Are comfortable moving between technical detail and big-picture thinking
  • Have used AI tools in practice, including LLMs, agents, and automated workflows
  • Still enjoy coding and working hands-on at the implementation level

What You’ll Do

  • Design and build AI evaluation and benchmarking systems
  • Analyze how models and agents perform across real-world use cases
  • Develop frameworks, datasets, and metrics to measure AI capabilities
  • Translate complex system behavior into clear, actionable insights
  • Work closely with engineers, product teams, and external partners
  • Contribute directly to product direction and overall strategy

What Matters Most

Must-Have — Non-Negotiable
  • Strong Python proficiency with recent, hands-on production work
  • Experience with data analysis and building analytical frameworks
  • Ability to operate as a technical generalist across systems and domains
  • Clear, structured communication — you translate complexity into insight
  • High ownership and comfort operating in ambiguous environments

Backgrounds That Tend to Work Well
  • Product-minded software engineers with meaningful AI exposure
  • Engineers who’ve built systems involving LLMs, pipelines, or automation
  • Technical professionals from top-tier consulting firms with real coding ability
  • Founding or early engineers with broad, cross-functional ownership

Not a Fit If You’re Primarily

The following profiles are unlikely to thrive in this role.

  • Focused exclusively on training models or academic research
  • Removed from hands-on coding and implementation work
  • Purely management-focused without technical execution

Why This Role Is Different

Define how AI is measured, not just built

The work here shapes the frameworks and metrics the industry relies on to understand model and agent behavior — a rare, foundational problem space.

Direct exposure to cutting-edge AI systems

You’ll work with frontier models and real-world deployment contexts that most engineers don’t have visibility into.

High ownership in a small, fast-growing team

The team is expected to scale rapidly. Joining now means meaningful equity upside and the ability to shape how the function is built.

A rare blend of technical depth, strategy, and impact

This isn’t a pure engineering role or a pure strategy role. It’s both — for someone who can operate at that intersection.