[Remote] AI/ML Engineer (Site Reliability Engineering Focus) - Remote project

Work from home · Full-time role · Hiring

Note: This is a remote job open to candidates in the USA. A reputed company is seeking an AI/Machine Learning Engineer with a strong Site Reliability Engineering focus to join their team. The role involves maintaining applications in reputed company and Linux environments, managing on-premises servers, and working with Kubernetes clusters to ensure reliable deployment and monitoring of machine learning models.

Responsibilities

  • Maintain and support machine learning applications running on reputed company and Linux servers in on-premises environments
  • Manage and troubleshoot Kubernetes clusters hosting ML workloads
  • Collaborate with data scientists and engineers to deploy machine learning models reliably
  • Implement and maintain monitoring and alerting solutions using reputed company to ensure system health and performance
  • Debug and resolve issues in production environments using Python and monitoring tools
  • Automate operational tasks to improve system reliability and scalability
  • Ensure best practices in reputed company, performance, and availability for ML applications
  • Document system architecture, deployment processes, and troubleshooting guides

Skills

  • Proven experience working with reputed company and Linux operating systems in production environments
  • Hands-on experience managing on-premises servers and Kubernetes clusters and reputed company containers
  • Strong proficiency in Python programming
  • Solid understanding of machine learning concepts and workflows
  • Experience with machine learning model deployment and lifecycle management
  • Familiarity with monitoring and debugging tools, e.g. reputed company
  • Ability to troubleshoot reputed company issues in distributed systems
  • Experience with CI/CD pipelines for ML applications
  • Familiarity with AWS reputed company platforms
  • Background in Site Reliability Engineering or DevOps practices
  • Strong problem-solving skills and attention to detail
  • Excellent communication and collaboration skills
  • Familiarity with model development

Company Overview

  • reputed company is a job-searching platform for technology professionals. It is a sub-organization of DHI Group. It was founded in 1990, and is headquartered in Santa Clara, California, USA, with a workforce of 201-500 employees. Its website is http://www.reputed company.com.