AI lab · Accra, Ghana

Models, voices, and data
for the languages AI left behind.

Neriqlabs is an AI research and data lab. We build speech-to-text, text-to-speech, and large language models for low-resource languages — and we collect, license, and clean the training data for the cultures and domains the global AI industry has overlooked.

What we do

Two engines that need each other. Models without their language's data are deaf; data without models stays unread.

Engine 01 · Models

Production-grade, efficient, deployable.

We design speech and language models for production deployment — small enough to run where customers are, fast enough for real conversation, and engineered for the languages and accents the largest models miss. Efficiency and accessibility are first-class principles, not afterthoughts.

Engine 02 · Data

The training corpus that doesn't exist yet.

For most of the world's languages, no usable training corpus has ever been built. We design the collection pipelines, partner with native speakers, and produce the licensed, cleaned, evaluation-graded data the global AI industry has skipped — for our own models, and for partners building in markets the global stack ignores.

Live research

First model out the door, real metrics, public weights.

First model live

MMS-Twi v1

Twi WER · 54.04

Twi CER · 17.69

Languages targeted · 5 → pan-Africa

MMS-Twi v1 is our first ASR model for Asante Twi. It beats publicly reported baselines on both word error rate (WER) and character error rate (CER), and streams in roughly 35 ms or less per segment. The model card, samples, and full evaluation methodology are available to qualified partners under licence.
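For readers new to these metrics, here is a minimal sketch of how WER and CER are conventionally computed: the edit distance between the reference transcript and the model output, divided by the reference length. This is not Neriqlabs' evaluation code. The function names, the example sentence, and the choices to report percentages and to count spaces as characters are all assumptions made for illustration; real evaluations usually also normalise casing and punctuation first.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: fewest substitutions, deletions, and insertions."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a percentage of reference words (assumed convention)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a percentage; spaces count as characters here."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one word is wrong because one character was dropped.
reference = "me din de Kofi"
hypothesis = "me din de Kof"
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

In this example a single dropped character makes one of four words wrong, so WER comes out much higher than CER. That is why the two figures are usually reported together.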

Request access → huggingface.co/neriqlabs →

Why

The global AI stack works in twelve languages.
Seven thousand languages are spoken on Earth.
That gap is the work.
