Scaleway

DevOps – Inference squad (AI Tribe)

21 May 2024
£50000 - £93000 / year

Job Description

About the job
At Scaleway, our AI Tribe is at the forefront of deploying cutting-edge AI technologies. Our Inference Squad specializes in delivering LLM-as-a-Service, facilitating both dedicated and mutualized GPU resources across various client applications. As a DevOps Engineer on this dynamic team, you will play a pivotal role in ensuring the seamless development, deployment, operation, and scaling of our AI products. You will collaborate with a team of engineers and developers dedicated to our AI service offerings, focusing on the infrastructure that supports large language models and various other ML models. Your expertise will drive the optimization and enhancement of our AI deployment pipelines and contribute significantly to our mission of providing robust AI solutions.

Minimum Qualifications

  • Strong coding skills in Golang.
  • Knowledge of GitOps best practices.
  • Good experience with Kubernetes and container orchestration systems.
  • Solid background in cloud computing and working with major cloud providers (AWS, Azure, GCP).
  • Experience in monitoring and ensuring the reliability of serverless architectures.
  • Demonstrated ability to maintain high standards of code quality and system security.
  • Effective communication skills in English.

Preferred Qualifications

  • Experience with AI model-serving technologies and frameworks.
  • Proficiency in infrastructure as code tools such as Terraform.
  • Knowledge of GPU-based computing and optimization.
  • Experience contributing to or managing open-source projects.
  • Advanced skills in network configuration, system administration, and optimization.
  • A very good command of French is an advantage.

Responsibilities

  • Develop and operate the backend side of our AI products portfolio.
  • Contribute to our products’ architecture design.
  • Design and implement robust CI/CD pipelines tailored for AI applications, focusing on automation, scalability, and security.
  • Monitor and optimize the performance of the AI services, ensuring they meet the strict demands of latency and throughput.
  • Lead the integration of new technologies and frameworks that support the advancement of our AI capabilities.
  • Collaborate across teams to enhance the scalability and reliability of our AI solutions.
  • Ensure the security and compliance of the infrastructure according to industry standards.
  • Actively participate in the planning and execution of infrastructure strategies, aiming for continuous improvement in all areas of deployment and operations.

Location

This position is based in our offices in Paris or Lille (France).

Recruitment Process

  • Screening call – 30 mins with the recruiter
  • Manager Interview – 45 mins
  • Home Assignment
  • Team Interview
  • HR Interview – 45 mins
  • Offer sent