Speech models that understand what you mean

The next-generation speech lab building models that read tone, intent, and subtext, so machines finally respond to what people actually mean.

Evaluations

State of the art at understanding people

Oruk models are purpose-built for the human layer of speech. Here is how Resonance compares to frontier general-purpose systems on two public benchmarks.

EmotionBench

Affect recognition accuracy

Identifying emotion, sarcasm, and concealed intent across 12k human-labeled utterances.

Oruk Resonance: 94.2
GPT-4o Audio: 81.6
Gemini 2.0: 79.8
Qwen2-Audio: 76.3
SALMONN: 71.2
Open SOTA: 68.9

Prosody Bench

Prosodic feature F1

Decomposing intonation, intensity, rhythm, and stress against expert phonetician annotations.

Oruk Resonance: 91.7
GPT-4o Audio: 73.4
Gemini 2.0: 71.9
Qwen2-Audio: 70.1
SALMONN: 66.8
Open SOTA: 64.5
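
For readers unfamiliar with the metric, here is a minimal sketch of one common way to compute a per-feature F1 score and average it across features. The four feature names come from the benchmark description above. Everything else is an illustrative assumption: the counts are made up, `Counts`, `f1`, and `macro_f1` are hypothetical names, and macro-averaging is one plausible choice rather than Prosody Bench's documented scoring procedure.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Detection counts for one prosodic feature (hypothetical data)."""
    tp: int  # predicted events that match an expert annotation
    fp: int  # predicted events with no matching annotation
    fn: int  # annotated events the model missed


def f1(c: Counts) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no events."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def macro_f1(per_feature: dict[str, Counts]) -> float:
    """Unweighted mean of the per-feature F1 scores (assumed averaging scheme)."""
    scores = [f1(c) for c in per_feature.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up counts for illustration only; these are not benchmark results.
example = {
    "intonation": Counts(tp=90, fp=10, fn=10),
    "intensity": Counts(tp=80, fp=20, fn=20),
    "rhythm": Counts(tp=70, fp=30, fn=30),
    "stress": Counts(tp=60, fp=40, fn=40),
}
score = macro_f1(example)  # on a 0-1 scale; multiply by 100 for a percentage
```

Macro-averaging weights each feature equally, so a model cannot score well by excelling at one frequent feature while missing the others.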

Powering voice teams at

Relatefy, Stanford, Unit Aerospace, Hexbandit

Understanding, not transcription

Same words. Different meaning.

Transcription tells you what was said. Oruk tells you what was meant. The example below shows how the model reads beneath the surface.

Oruk analysis: Sarcasm · 0.92

Literal

Positive sentiment. The speaker is pleased about an update.

What they meant

Sarcastic frustration. The user is annoyed and likely overwhelmed.

Acoustic signal

Flat pitch, drawn-out vowels, downward final contour.

What we understand

We hear the sarcasm when the words mean their opposite

Capabilities

A complete model of how humans really speak

Emotion & affect

Detect joy, doubt, anger, warmth, and the subtle states in between, across speakers and cultures.

Sarcasm & subtext

Read the gap between words and intent, the hallmark of human conversation.

Real-time intent

Sub-200ms understanding so agents can respond in the rhythm of natural speech.

Context memory

Meaning that carries across a conversation through references, mood shifts, and history.

Prosody modeling

Pitch, pace, pauses, and stress decoded as first-class signal, not noise.

Private by design

On-device options and strict data controls. Your voice is never used for training without your permission.

The lineup

One lab. A model for every conversation.

Oruk Resonance · Flagship

Our most expressive model. Full emotional and contextual understanding for production voice agents.

  • Real-time streaming
  • 40+ languages
  • Context window: full call

Oruk Pulse · Fast

Low-latency intent and sentiment for high-volume routing, triage, and live assistance.

  • <120ms latency
  • Intent + sentiment
  • On-device ready

Oruk Atlas · Research

Frontier model for deep affect research, fine-grained prosody, and cross-cultural study.

  • Fine-grained affect
  • Prosody decomposition
  • Research preview

98.4% intent accuracy on spontaneous speech
<120ms time to first understanding
40+ languages and dialects
12M hours of expressive speech modeled