Dolby Laboratories
Sr. Staff ML Ops Technical Lead
Job Description
Join the leader in entertainment innovation and help us design the future. At Dolby, science meets art, and high tech means more than computer code. As a member of the Dolby team, you’ll see and hear the results of your work everywhere, from movie theaters to smartphones. We continue to revolutionize how people create, deliver, and enjoy entertainment worldwide. To do that, we need the absolute best talent. We’re big enough to give you all the resources you need, and small enough so you can make a real difference and earn recognition for your work. We offer a collegial culture, challenging projects, and excellent compensation and benefits, not to mention a Flex Work approach that is truly flexible to support where, when, and how you do your best work.
Dolby’s consumer entertainment and cinema businesses are bringing Dolby’s breakthrough technologies, powering the world’s top movies, TV shows, music, games, and live sports to more places around the world across a wider range of consumer experiences and devices.
We are seeking a talented Staff Machine Learning Operations Engineer to join the Consumer Entertainment Group, to help bring the next generation of spectacular audio and video experiences to market. You will partner closely with research and development to establish machine-learning best practices and tools that maximize training and use of resources.
Responsibilities
- Troubleshoot high-performance computing, storage and networks for machine-learning workloads.
- Collaborate with research, development and engineering to establish machine-learning and data management workflows and supporting tools and processes that maximize machine-learning activities and use of resources.
- Improve capabilities of data set exploration, transformation and overall data management of large to very large datasets.
- Partner with research and development to proactively iterate and fine-tune model training for best performance and efficient use of machine-learning resources.
- Collaborate with infrastructure teams and physical compute, storage and network infrastructure experts to improve on-premise and cloud infrastructure.
- Improve use of cloud compute and storage for global research teams and manage within budget.
Education and Experience
- BS or MS degree in Computer Science or equivalent experience.
- 6+ years of professional, hands-on experience in machine learning operations or equivalent.
- Comprehensive knowledge of AWS and infrastructure-as-code techniques.
- Advanced proficiency with Python, Terraform, CloudFormation, Ansible, Git and related tools.
- Experience leading a small team of machine-learning operations engineers with international distribution.
- Positive team leader with strong interpersonal skills to build team cohesion and rapport even from half a world away.
- Proficiency with machine learning and scaling workloads with both cloud and on-premise GPU server environments.
- Experience with managing and coordinating storage of large machine learning data sets.
- Proficiency in Kubernetes cluster design, deployment and management.
- Interest in and understanding of industry trends in machine-learning development techniques, tools and processes.
- Comprehensive knowledge of continuous integration and continuous release processes and tools.
Recommended
- Exceptional understanding of and practical experience in software and infrastructure configuration management for high-performance compute and storage, with a focus on maximizing high availability.
- Active collaborator to help build positive community with researchers, scientists and engineers around machine-learning operations and resources.
- AWS resource management and provisioning.
- Previous experience in system administration and infrastructure.
- Hands-on experience with:
  - Conda, Python
  - Ray cluster design, setup, provisioning and monitoring for high availability.
  - MLflow or similar
  - High-performance file systems (Lustre, BeeGFS, WEKA, or similar).
The Atlanta Area base salary range for this full-time position is $161,400-$197,200, which can vary if outside this location, plus bonus and benefits; some roles may also include equity. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, competencies, experience, market demands, internal parity, and relevant education or training. Your recruiter can share more about the specific salary range and perks and benefits for your location during the hiring process.
Dolby will consider qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco Police Code, Article 49, and Administrative Code, Article 12
Equal Employment Opportunity:
Dolby is proud to be an equal opportunity employer. Our success depends on the combined skills and talents of all our employees. We are committed to making employment decisions without regard to race, religious creed, color, age, sex, sexual orientation, gender identity, national origin, religion, marital status, family status, medical condition, disability, military service, pregnancy, childbirth and related medical conditions or any other classification protected by federal, state, and local laws and ordinances.