Site Reliability Engineer (SRE) – Dynatrace Remote (New York, USA)
This is a fully remote job; the offer is available from: New York (USA)
Location: Remote (New York, USA)
Experience Required: 6–10 Years
Job Summary
We are seeking an experienced Site Reliability Engineer (SRE) with strong expertise in Dynatrace and modern observability practices. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of enterprise applications and infrastructure across cloud and hybrid environments. This role requires hands-on experience with monitoring, automation, cloud platforms, CI/CD pipelines, and containerized environments.
Key Responsibilities
- Design, implement, and manage end-to-end monitoring solutions using Dynatrace.
- Configure alerting, dashboards, and problem detection, and define performance optimization strategies.
- Monitor application health, infrastructure performance, and user experience across distributed systems.
- Troubleshoot production incidents and perform root cause analysis for system and application issues.
- Collaborate with DevOps, Cloud, and Engineering teams to improve system reliability and operational efficiency.
- Automate operational tasks and monitoring workflows using scripting languages such as Python, Bash, or Shell.
- Support and optimize cloud-based environments on AWS, Azure, or GCP.
- Manage and troubleshoot Linux/Unix-based systems.
- Work with containerization and orchestration technologies including Docker and Kubernetes.
- Build and maintain CI/CD pipelines using tools such as Jenkins, GitLab CI/CD, or Azure DevOps.
- Ensure observability best practices across microservices and distributed architectures.
- Participate in on-call support and incident response activities as needed.
Required Skills & Qualifications
- 6–10 years of experience in Site Reliability Engineering, DevOps, or Production Support roles.
- Strong hands-on expertise in Dynatrace including monitoring, alerting, dashboards, and problem analysis.
- Solid understanding of observability, logging, and monitoring frameworks.
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Strong knowledge of Linux/Unix systems administration and troubleshooting.
- Experience with Docker and Kubernetes in enterprise environments.
- Proficiency with CI/CD tools including Jenkins, GitLab, or Azure DevOps.
- Strong scripting and automation skills using Python, Bash, or Shell scripting.
- Understanding of microservices architecture and distributed systems.
- Excellent troubleshooting, analytical, and communication skills.
Preferred Qualifications
- Experience implementing SRE best practices and reliability engineering principles.
- Knowledge of Infrastructure as Code (Terraform, Ansible, etc.) is a plus.
- Exposure to enterprise-scale monitoring and cloud-native technologies.
- Relevant cloud or Dynatrace certifications are an advantage.
This offer from "Georgia IT, Inc." has been enriched by Jobgether.com and got a 72% flex score.