Senior Site Reliability Engineer

Remote, USA

Posted Jun 13, 2026

Full-time

AI is transforming how every company operates, but most enterprises are stuck. They want to move fast with AI agents, tools, and workflows, but they can't do it safely. We're fixing that.

Our team built AI Actions for OpenAI, shipped Zapier Agents to millions of users, and launched the first remote MCP server with Anthropic. The co-creator of MCP is on our cap table. We helped establish the protocol, and now we're building the platform enterprises need to actually use it.

Runlayer is one platform for MCPs, Skills, and Agents. It combines purpose-built security, fine-grained governance, and complete observability so organizations can push AI forward across the entire company without the risk. We raised $11M from Khosla Ventures and Felicis, and our customers include Gusto, Instacart, and Opendoor.

We're a team of 25, mostly engineers, shipping fast. If you want to work at the center of how AI gets things done, this is the moment.

As our Senior Site Reliability Engineer, you'll own the reliability, performance, and scalability of Runlayer's infrastructure as we grow to serve enterprise customers across cloud and on-prem environments.

Why You'll Thrive Here

Impact: Build the infrastructure foundation for the enterprise MCP platform, directly enabling AI adoption at scale

Excellence: Work closely with founders and a small, senior engineering team shipping fast in a high-growth environment

Ownership: Own reliability end-to-end, from database performance to incident response to CI/CD pipelines

What You'll Do

Own reliability and performance of our cloud infrastructure across AWS (ECS, Aurora, CloudWatch) and GCP

Manage and optimize Kubernetes clusters and container orchestration

Drive database reliability engineering, including performance tuning and scaling

Build and maintain CI/CD pipelines for rapid, safe deployments

Run incident response and on-call rotations

Partner with product engineers to design scalable, resilient systems

What We're Looking For

Strong AWS experience, particularly ECS, Aurora, and CloudWatch

GCP experience as we expand cross-cloud

Kubernetes and container orchestration expertise

DBRE experience with database performance tuning

CI/CD pipeline ownership and incident response experience

Background at a B2B SaaS company serving enterprise customers, ideally in infrastructure

Bonus Qualifications

Experience deploying and supporting on-prem or hybrid environments

Python backend familiarity (our platform is Python-based)

Experience at an early-stage or high-growth company

What We Offer

We provide a competitive package designed to attract and retain top talent who can work effectively with enterprise customers.

Competitive salary and equity — compensation that reflects your expertise and customer-facing responsibilities.

Paid time off — 4 weeks paid vacation, paid sick leave, and paid parental leave.

Professional development — budget for conferences, courses, and certifications in AI, enterprise software, and customer success.

Top-tier equipment — your choice of laptop and accessories to create your ideal work environment.

Health benefits — comprehensive health, dental, and vision coverage.

Customer interaction opportunities — work directly with innovative companies and see the immediate impact of your work.

Not quite the right fit? Reach out to careers@runlayer.com with details about your experience and interests.
