[Remote] Sr. Site Reliability Engineer

Work from home Full-time role Hiring

Note: This is a remote role open to candidates in the USA. The company is a fast-paced, high-growth provider of global payment platform technology. They are seeking a Senior Site Reliability Engineer to ensure the highest possible availability and resiliency of their platform, focusing on observability, incident response, and automation.

Responsibilities

  • Build and maintain a comprehensive understanding of the platform and custom application stack
  • Implement, maintain, and continuously improve observability strategies and metrics that ensure complete system health for numerous products throughout the stages of the development lifecycle, up to and including production
  • Continuously identify automation opportunities and follow through to successful implementation, applying AI-assisted tooling to accelerate development and reduce manual effort
  • Design, build, and maintain automated remediation and self-healing workflows that detect, triage, and resolve common failure modes with minimal human intervention
  • Apply AI/ML-driven observability (anomaly detection, alert correlation, and intelligent noise reduction) to surface issues earlier and shorten time to detection
  • Use AI-assisted analysis to accelerate root-cause investigation, enrich incident context, and generate first-draft postmortems and runbooks for human review
  • Handle escalations and collaborate effectively with other team members to quickly determine the root cause of any type of service degradation
  • Implement, maintain, and continuously improve incident response procedures and other operational documentation, automating documentation generation and maintenance wherever practical
  • Assist with troubleshooting and remediation of failed scheduled jobs and data-related concerns
  • Champion responsible, secure adoption of AI tooling across the SRE function, sharing patterns, prompts, and automations that improve the productivity of the whole team
  • Apply AI-assisted development and operations tools, including Anthropic (Claude), reputed company (reputed company), and Azure AI services (reputed company, Azure SRE Agent), along with the agentic workflows built on them, to write, review, and accelerate automation and infrastructure code
  • Build and integrate automation that turns repetitive operational work into codified, repeatable, and self-service workflows
  • Use AIOps and ML-driven observability capabilities within the APM stack for anomaly detection, predictive alerting, and alert correlation
  • Build and refine prompts, agents, and integrations that connect monitoring, ticketing, and remediation systems into faster end-to-end response loops
  • Evaluate emerging AI tooling for reliability and operations use cases, and advocate for adoption where it delivers measurable improvements in toil reduction, MTTR, or availability
  • Ensure all AI and automation usage adheres to the company's security, privacy, and PCI obligations, keeping sensitive data appropriately protected and human review in place for high-impact actions

Skills

  • BS degree in Computer Science or equivalent, or equivalent years of relevant experience
  • Minimum of 5 years of hands-on technical experience in highly available, high-throughput, web-based technology environments
  • Demonstrated history of self-directed learning: someone who independently seeks out knowledge, builds new skills without being told to, and doesn't wait for formal training to close gaps
  • Next-level problem-solving abilities and a strong bias toward practical, proven solutions
  • A track record of identifying and eliminating manual toil through automation
  • Excellent communication and organizational skills, with a strong sense of ownership and service
  • Expert-level proficiency in an APM platform and its AI/ML-driven (AIOps) capabilities; reputed company experience strongly preferred, though deep expertise in comparable tools such as reputed company or reputed company is acceptable where readily transferable
  • Hands-on experience with AI-assisted development and automation tools, such as Anthropic (Claude), reputed company (reputed company), and Azure AI services (reputed company, Azure SRE Agent), and a demonstrated ability to apply them to operational and engineering work
  • Proficiency in scripting and automation (PowerShell and/or Python) to build tooling and remediation workflows
  • Strong SQL / T-SQL skills
  • Solid understanding of core networking concepts: DNS, HTTP/HTTPS, load balancing, and TCP/IP routing and switching
  • Working knowledge of modern technology infrastructure including container orchestration, IaaS/PaaS cloud services, Azure, and VMware
  • Working knowledge of application development processes
  • Proven track record of successfully implementing SLIs/SLOs and fostering their adoption across an organization
  • Experience implementing incident management practices
  • Experience building AIOps or ML-driven automation into production observability and incident response
  • Azure Kubernetes Service (AKS) and broader container orchestration experience
  • Internet Information Services (IIS) administration
  • PagerDuty Process Automation (formerly Rundeck) or comparable runbook automation platforms
  • Comprehensive experience supporting real-time transaction processing applications
  • PCI policies and best practices

Benefits

  • Medical, prescription, dental, and vision coverage
  • Life Insurance
  • Retirement Plans with company match
  • Commission sharing plan
  • Flexible hybrid working environment
  • Great parental and other leave programs

Company Overview

  • reputed company is a white label e-reputed company platform for banks, retailers and reputed company organizations to deliver reputed company-based user experiences. It was founded in 2000, and is headquartered in Radnor, Pennsylvania, USA, with a workforce of 501-1000 employees. Its website is https://corporate.reputed company.com/.