Senior MLOps Engineer Job at DeepRec.ai, San Jose, CA

  • DeepRec.ai
  • San Jose, CA

Job Description

Senior MLOps Engineer

We are hiring an MLOps Engineer for a fast-moving AI startup that is building a world-class AI-powered video platform.

We are looking for a skilled and hands-on MLOps Engineer to join their growing team. You will play a critical role in deploying, scaling, and maintaining their machine learning infrastructure, supporting a range of tools that enable the controlled generation of high-quality animated videos.

Key Responsibilities

  • Design, deploy, and maintain scalable training and data-processing pipelines on distributed compute clusters (e.g., Slurm, Kubernetes, or cloud-native equivalents).
  • Optimize inference systems for latency and cost in a production setting.
  • Collaborate closely with ML researchers and engineers to productionize deep learning models.
  • Implement robust monitoring, logging, and alerting systems for model performance and infrastructure reliability.
  • Automate model testing, validation, and deployment processes across staging and production environments.
  • Ensure efficient usage of compute resources, including GPU clusters, and help identify bottlenecks or cost-saving opportunities.

Requirements

  • Proven experience in MLOps, ML infrastructure, or related roles.
  • Deep expertise in deploying and maintaining ML training pipelines on distributed systems.
  • Strong knowledge of inference optimization techniques, especially in reducing latency and cost at scale.
  • Proficiency with cloud platforms (AWS, GCP, Azure) and orchestration tools (Kubernetes, Docker).
  • Experience working with GPU scheduling, distributed training (e.g., PyTorch DDP), and model serving frameworks (e.g., Triton, TorchServe).
  • Familiarity with CI/CD for ML workflows.
  • Strong Python skills and experience with ML/DL frameworks like PyTorch or TensorFlow.

Bonus Points

  • Experience working in the creative media or animation industry.
  • Exposure to video processing, generative AI, or large-scale content production systems.
  • Experience collaborating with research teams or integrating research code into production pipelines.

Please apply for more information.
