MLOps Engineer (PyTorch/DevOps)
We're leading the way in AI compression, with the world's first AI codec running in real time on mobile devices. Our team has 50+ years of combined research experience and more than 40 filed patents.
Deep Render has recently raised its Series A funding round from top-tier investors and is looking to double or triple its current 20-person team. We're in commercial engagements with some of the largest Big Tech companies in the world and expect hundreds of millions of people to use the Deep Render AI Codec by 2024.
The MLOps team at Deep Render builds novel and efficient pipelines, platforms and tools to accelerate and simplify our researchers' complex machine-learning workflow. As an MLOps Engineer, you will play a pivotal role in keeping our machine learning infrastructure and pipelines running smoothly, maintaining and optimising critical infrastructure components. Your PyTorch expertise will be crucial as you troubleshoot and resolve PyTorch-related issues and bottlenecks. DevOps practices and tools such as Kubernetes and Docker will be second nature to you, and you will use them to improve quality control and distribute models efficiently. You will also support the existing ML stack, collaborating with researchers and engineers to improve operational efficiency and building custom tooling to streamline model development. The role sits at the intersection of infrastructure, PyTorch, DevOps, automation and custom tooling, making you the linchpin of a robust and efficient ML ecosystem.
- Infrastructure code: Maintain and optimise infrastructure components critical to research and production pipelines. Collaborate with cross-functional teams to ensure the scalability and reliability of the infrastructure.
- PyTorch maintenance: Be the subject matter expert in PyTorch, ensuring the smooth operation of PyTorch-based models. Troubleshoot and debug PyTorch-related issues and bottlenecks.
- DevOps: Implement DevOps practices to ensure quality control, testing, and the efficient distribution of machine learning models. Utilise DevOps tools and frameworks, including Kubernetes and Docker, to streamline deployment processes.
- Automation: Build automations to enhance efficiency in the development and deployment of machine learning models.
- ML Stack Support: Support the existing ML stack by providing technical expertise and ensuring its operational efficiency. Collaborate with research scientists and engineers to enhance the ML infrastructure and stack.
- Custom Tooling: Develop custom tools and utilities to streamline the model development and deployment process.
- MSc in Computer Science or a related field (Mathematics, Physics, Engineering)
- Experience writing production code in Python/PyTorch
- Knowledge of DevOps technologies and stacks, including Kubernetes and Docker
- Experience building and maintaining data pipelines for production-ready systems
- Experience working with machine learning models
- A minimum of 3 years of experience, ideally in a machine learning start-up environment