Griot, Inc.

Artificial Intelligence Engineer

Deadline date: 25 April 2026

Job Description

About Us

We are an early-stage team building next-generation speech intelligence systems for underserved, oral-first language environments.

We are looking for a Founding AI Engineer who wants to build foundational systems from the ground up — not iterate on incremental features, but help define the architecture, training strategy, and production stack behind a new class of speech-native AI.

The Role

As our Founding AI Engineer, you will own the core speech modeling architecture and deployment strategy.

This is not a research-only position. This is end-to-end system ownership.

You will design, train, evaluate, and deploy models — and ensure they are production-ready, scalable, and appropriate for real-world use.

Key Responsibilities

  • Design, train, and evaluate text-to-speech models from scratch.
  • Build and optimize GPU-based training pipelines (single and multi-GPU).
  • Work directly with raw speech/audio datasets, including low-resource environments.
  • Improve pronunciation accuracy, tone, prosody, and perceptual naturalness.
  • Experiment with self-supervised acoustic encoders (HuBERT / wav2vec2-class models).
  • Explore discrete acoustic tokenization and speech-native representation learning.
  • Contribute to early-stage sound-to-sound architecture exploration.
  • Deploy trained models into production infrastructure.
  • Optimize models for inference latency, cost efficiency, and on-device execution.
  • Design thoughtful evaluation frameworks (MOS, perceptual scoring, robustness metrics).
  • Architect the broader system around the model — ensuring reliability, scalability, and maintainability.

Qualifications

Required

  • 3+ years of experience in Machine Learning.
  • Strong proficiency in PyTorch.
  • Experience training models on GPU infrastructure.
  • Experience deploying ML systems into production.
  • Strong system design instincts and architectural judgment.
  • Ability to make pragmatic engineering tradeoffs between research ambition and production reality.
  • Ability to build robust, production-ready systems — not just research prototypes.
  • Comfort operating independently in early-stage environments.

Preferred

  • Experience with speech models (ASR, TTS, speech language models).
  • Experience with self-supervised learning.
  • Experience with diffusion or flow-matching generative models.
  • Experience with discrete acoustic token pipelines.
  • Experience with ONNX, quantization, or model optimization.
  • Experience working with low-resource datasets.

A PhD is not required. A Big Tech background is not required. Ownership, judgment, and execution matter most.

Why Join

  • Foundational technical ownership from day one.
  • Meaningful early-stage equity.
  • Direct collaboration with founders.
  • Real-world production deployment.
  • Opportunity to help shape both the model and the system around it.

Application Process

Please send:

  • A link to your GitHub profile.
  • A short note outlining your ML experience.
  • Links to any relevant speech or audio-related projects.

We are reviewing candidates on a rolling basis and moving thoughtfully but quickly.