Senior Manager, Infrastructure & Security
Reports to: Senior Director of Engineering, Platform & Infrastructure

What You'll Do

Lead Cloud Infrastructure & SRE
Lead Cloud Infrastructure & SRE: AWS (ECS, EKS, networking, VPCs, IAM), databases (PostgreSQL, Kafka, DynamoDB), IaC standards, SLOs, error budgets, DR/BCP, and a sustainable on-call program
Own CI/CD & Developer Experience: GitLab CI across ~150 repos (build speed, reliability, test signal, release safety, environments, and paved roads for new services)
Drive Observability & Cloud Cost: high-signal logging/metrics/tracing/alerting stack, SLO instrumentation, cloud cost attribution by team/service, FinOps practice
Lead Application & Cloud Security end-to-end: secure SDLC, SAST/DAST/SCA, vuln triage and SLAs, secrets hygiene, IAM, CSPM
Partner with IT to support the compliance program, customer security questionnaires, and pen-test remediation
Own Vendor & Tooling Portfolio: build-vs-buy decisions, vendor selection, POCs, and lifecycle across observability, CI/CD, security, and incident management

Grow and Develop the Team
Hire, onboard, and mentor engineers.
Actively participate in recruiting: interview candidates, provide calibrated feedback, and raise the bar.
Conduct regular 1:1s, provide actionable feedback, and support career development for each team member.
Build a culture of urgency, ownership, customer empathy, and continuous improvement.

What We're Looking For

7+ years of infrastructure or SRE engineering experience, with at least 3 years managing platform, infra, or SRE teams in a product-focused SaaS environment.
Production AWS ownership: you've run HA workloads on AWS at scale, covering high availability, governance, compliance, and a 24/7 on-call program.
Infrastructure as code: you've built and scaled IaC practices using Terraform, and know enough about packaging tooling (Helm, Kustomize, or similar) to make good architectural decisions and hold the team to a high bar.
CI/CD fluency: you've built or owned pipelines in GitLab CI, GitHub Actions, Jenkins, or similar, and know where they break under scale.
Observability ownership: you've built or evolved an observability stack and made it materially better: alerting, SLOs, and distributed tracing for a real production system.
Incident command: you've run customer-facing major incidents both as incident commander and as the manager supporting the team. You know the difference between those roles.
Application security: you've embedded security into the SDLC: SAST/DAST/SCA, vuln management with SLAs, and threat modeling as a team practice, not a checkbox.
Cloud security posture: you own IAM, secrets management, network segmentation, and CSPM. You've hardened an AWS environment and can explain the tradeoffs you made.
Compliance partnership: you've worked alongside IT on compliance programs (SOC 2, ISO 27001, or similar) and know how to translate requirements into engineering work.
Deeply hands-on: you've kept your technical edge as your teams have grown, and you're still close enough to the work to spot problems early and earn credibility with your engineers.
Bias for action: you default to doing, not delegating. When something is broken, you dig in. When a process isn't working, you fix it. You don't wait to be told.
Active AI adoption: you use AI-assisted tools (Claude Code, Copilot, Cursor, Windsurf, or similar) as part of your daily workflow for incident diagnosis, IaC authoring, runbook generation, or log and metric analysis. You can describe specifically how, not just that you do.

Preferred

B2B SaaS experience with Salesforce coupling: managed packages, org provisioning, integration patterns
CPQ, billing, or revenue domain: high-stakes correctness, audit-relevant data
Experience standing up a Platform Engineering practice or internal developer platform (Team Topologies, paved roads)

Tech Stack

Salesforce Platform: Apex, Lightning Web Components (LWC), SOQL, Flows, managed packages
Backend: Java (Spring Boot), Node.js, REST APIs, GraphQL
Frontend: React 16, TypeScript, Material-UI, Webpack
Database: PostgreSQL (AWS RDS), Salesforce SOQL/SOSL
Infrastructure: AWS (Lambda, SQS, S3), Kafka, Docker, GitLab CI/CD
AI Tooling: Claude Code, Windsurf, Copilot (used daily across engineering)
Collaboration: Jira, Slack, Gem, Fellow, Confluence