ML Ops Engineer (100% Remote)

Seed-stage startup offering salary and generous equity - work remotely from anywhere!

  • San Francisco, CA
  • $150,000 - $200,000
  • Managed by Jobot Pro: Karyn Spies

A bit about us:

We are a company at the intersection of artificial intelligence (AI) and gaming, offering innovative solutions and immersive experiences to our users. Our AI character studio lets users create, interact with, and share AI characters with a range of skills, from characters that can see everything on your screen to characters you can interact with in AR/VR. We have been growing steadily, and now it's time to 100x our growth.

Why join us?

You'll be part of a creative team that's passionate about pushing the boundaries of AI and gaming, and our work has been covered by the leading press.

We offer a collaborative environment where your ideas and leadership can shape the future of our products and community. With us, you'll have the opportunity to make a significant impact in an exciting, fast-growing industry.

Job Details

Responsibilities:
  • Architecting and deploying open source models at scale.
  • Optimizing models and inference, and configuring and deploying servers, with an emphasis on stability, scalability, and speed of inference/generation.
  • Evaluating open source models.
  • Building fine-tuning datasets and LoRAs.
  • Providing best practices and executing proofs of concept (POCs) for automated and efficient model operations at scale.

Qualifications:
  • 5+ years of experience in ML Ops or a related field.
  • 3+ years of experience in managing machine learning projects end-to-end.
  • Recent focus (at least 18 months) on text generation models at scale.
  • Experience monitoring build and production systems using automated monitoring and alarm tools.
  • Knowledge of machine learning frameworks: TensorFlow, PyTorch, Keras, Scikit-Learn.
  • Experience running large-scale inference services: NVIDIA Triton, TGI, vLLM.
  • Experience with container technologies (Docker, Kubernetes, EKS, ECS).
  • Experience with multiple cloud providers (AWS, GCP, Azure, etc.).
  • Experience in distributed computing.
  • Experience with a wide range of ML models (Text Generation, Classification, OCR, Object Detection, Stable Diffusion).
  • Proven track record of building and managing ML pipelines, including data preparation, model training, deployment, and monitoring.
  • Solid foundation in DevOps principles and practices with experience in CI/CD pipelines for ML deployments.
  • Solid programming skills in at least some of the following: C, C++, Rust, Python, JavaScript.
  • Experience with model performance monitoring and troubleshooting.
  • Passion for ML and its potential to solve real-world problems.

Job Type: Permanent