[Remote] VP of Cloud Engineering, Operations & Delivery
Note: This is a remote position open to candidates in the USA. EXL is seeking an experienced VP of Cloud Engineering, Operations & Delivery to lead its cloud practice across various industry verticals. This role involves driving strategy, client relationships, and organizational growth while ensuring high-performing teams deliver complex multi-cloud solutions. The ideal candidate will blend technical authority with executive leadership to transform cloud infrastructure using AI and intelligent automation.
Responsibilities
- Serve as the senior technical authority for cloud architecture and infrastructure decisions across AWS, Azure, and GCP
- Advance and mature our Infrastructure as Code (IaC) practices (GitHub, Jenkins, Terraform, Qualys, SonarQube, etc.), ensuring consistency, security, and scalability across client environments
- Provide meaningful technical guidance and architectural direction to engineering teams — going beyond high-level oversight to engage substantively on design decisions, standards, and delivery quality
- Guide adoption of cloud-native patterns including Kubernetes (EKS/AKS/GKE), serverless, CI/CD automation, and event-driven architecture
- Lead architecture reviews and serve as the escalation point for complex technical challenges
- Ensure security and compliance are embedded into infrastructure from the ground up — spanning IAM design, network segmentation, secrets management, and frameworks such as SOC 2, NIST, CIS, HIPAA, and PCI-DSS
- Champion the adoption of AI agents and multi-agent systems to transform how cloud infrastructure is built, operated, and optimized — moving teams from reactive, manual workflows to intelligent, autonomous execution
- Identify high-value opportunities to introduce agentic workflows into engineering operations — including infrastructure provisioning, incident detection and remediation, cost optimization, compliance monitoring, security response, and deployment pipelines
- Lead the evaluation and adoption of agentic AI frameworks and platforms (e.g., LangGraph, AutoGen, Amazon Bedrock Agents, Azure AI Agent Service, Vertex AI Agent Builder) to build purpose-built agents that extend the capabilities of our engineering teams
- Define governance, guardrails, and human-in-the-loop checkpoints for agentic systems operating in cloud environments — ensuring autonomous actions are safe, auditable, and aligned with client expectations
- Collaborate with engineering and solutions teams to design agentic delivery pipelines — where AI agents assist in code generation, IaC validation, drift detection, security scanning, and release orchestration
- Work with peer technology teams to identify process transformation opportunities, helping envision, roadmap, and execute an agentic future state for cloud operations and engineering workflows
- Own the operational health of cloud environments across the client portfolio — including availability, performance, security posture, and cost efficiency
- Mature SRE practices across the organization: SLOs, error budgets, incident management, and blameless postmortems
- Drive FinOps discipline — optimizing cloud spend through right-sizing, commitment strategies, tagging governance, and anomaly detection — increasingly augmented by AI-driven insights and autonomous recommendations
- Define and enforce observability standards across logging, metrics, and tracing using Datadog and CloudWatch — and explore how agentic monitoring can move teams from alert fatigue to autonomous resolution
- Lead end-to-end delivery of cloud engineering engagements — from technical discovery and architecture through deployment, cutover, and steady-state operations
- Build scalable delivery frameworks, runbooks, and IaC-driven playbooks that can be applied consistently across verticals and client environments — and actively work to make those playbooks AI-executable over time
- Proactively identify technical risks and drive resolution before they become client issues
- Build, mentor, and retain a high-performing team of cloud engineers, DevOps engineers, SREs, and delivery managers — cultivating a team culture that embraces AI-augmented workflows as a force multiplier, not a threat
- Define clear career ladders, engineering standards, and technical growth paths that attract and retain top talent — including emerging skills in AI/ML infrastructure, prompt engineering, and agentic system design
- Foster a culture of engineering excellence, continuous learning, and genuine curiosity about what AI agents can unlock
- Communicate cloud strategy, delivery status, and technical decisions clearly to executive stakeholders — both internally and with clients
- Help clients articulate and develop their agentic transformation roadmap — translating the potential of AI agents into concrete, phased busines