Senior Site Reliability Engineer
Shipping & handling responsibilities

- Design, scale, and secure infrastructure to stay ahead of business needs through fault-tolerant architecture design, performance testing, profiling, tuning, and capacity planning
- Design, build, deploy, and maintain automation, monitoring, and alerting systems, and design, implement, and test disaster recovery solutions
- Ensure scalability and maintainability through microservices adoption, decoupling of concerns and data model, queuing of jobs, and application layering
- Enhance and maintain our CI/CD pipeline for smooth and safe production releases via automated testing and verification
- Verify and ensure the performance and correctness of systems in terms of response time and throughput
- Participate in peer reviews, testing, and design reviews for new features, products, and systems, and contribute to automated test suites
- Participate in an on-call rotation

Your shipping requirements

- Experience developing, managing, and troubleshooting highly available distributed systems, including operational experience with Kubernetes in a production environment
- Extensive expertise with at least one public cloud provider (AWS, GCP, Azure)
- Exceptional verbal, written, and interpersonal communication skills
- Interest in and understanding of best-in-class security practices, automation, and testing methods
- Familiarity with configuration and maintenance of common infrastructure components such as Redis, Elasticsearch, and Hadoop
- Deep understanding of customer needs and passion for customer success
- BS or MS degree in Computer Science or equivalent experience

Bonus

- Advanced knowledge of managing and optimizing PostgreSQL server configuration
- 3+ years of experience in software development
- Experience with:
  - Managing service meshes (e.g. Istio)
  - Defining and monitoring Service-Level Objectives (SLOs) and Service-Level Agreements (SLAs) to ensure that systems meet reliability and performance targets
  - Monitoring tools like New Relic, Prometheus, Grafana, and/or Datadog
  - OpenTelemetry for distributed tracing and metrics collection, including experience using it in production environments
  - Managing Python and Golang applications in production
  - Microservices architectures
  - DevOps tooling such as Docker, Terraform, ArgoCD, ArgoWorkflows, CircleCI, GitHub Actions, New Relic, PagerDuty, etc.
  - AWS/Cloud services such as EKS, EC2, S3, Lambda, Route 53, CloudFront, Cloudflare, IAM, etc.

Sail through the process:

Here at Shippo, we celebrate inclusivity and are committed to creating equal access to opportunities for people from all backgrounds, perspectives, and geographies. These values define who we are and everything we do. All qualified individuals are encouraged to apply. If you need assistance or a reasonable accommodation during the application and recruiting process, please contact us at [email protected]

Shippos in the wild:

Our people, much like the packages we help ship, are all over the world. Through our remote-first program, “Shippos Everywhere”, our roles can be based anywhere in the US with the exception of Delaware, Nevada, Ohio, Oregon, Hawaii, New Mexico, and West Virginia, and many roles can be based internationally. For locations outside of the US and Ireland, the employment contracts are powered by Rippling.com.