# oruk — full LLM context

> Single concatenated reference. Keeps `llms.txt` as the curated index and
> dumps every doc page worth reading inline here. ~12 kB, well within any
> modern context window. Last updated: 2026-04-28.

The content here mirrors the live website (/docs, /methodology, /about, /sources, /pricing). When in doubt, the canonical source is the page on oruk.ai — every page is server-rendered and reflects the current behaviour.

---

# 1. Quickstart

oruk is a live broadcast intelligence API. It listens to ~200 live radio, TV, social, and structured feeds in real time and publishes corroborated news events. Every story includes a stable id (`evt_...`), headline, body, summary, primary category, multi-category list, topics, urgency, impact, confidence, event location, and a corroboration block with the count of independent sources.

Base URL: `https://api.oruk.ai`

Public website: `https://oruk.ai`

System health: `https://api.oruk.ai/health`

Sign up: . Generate API keys at .

First call (no key required; the URL is quoted so the shell does not treat `?` as a glob character):

```
curl "https://api.oruk.ai/v1/stories/feed?limit=10"
```

First filtered call (with a key):

```
curl -H "X-API-Key: ork_xxxxxxxx" \
  "https://api.oruk.ai/v1/stories?category=conflict&min_impact=7&limit=20"
```

---

# 2. Authentication

Pass your API key on each request:

- `X-API-Key: ork_xxxx` (preferred)
- `Authorization: Bearer ork_xxxx`
- `?api_key=ork_xxxx` (only for SSE / EventSource clients that can't set headers)

Public read endpoints (no key, no quota): `GET /health`, `GET /v1/health`, `GET /v1/stories/feed`. Everything else under `/v1/*` requires a key.
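The three key placements listed above can be sketched as a small request builder. This is a minimal illustration, not an official client: the function name `build_request` and its `style` argument are hypothetical, and nothing is sent over the network.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

BASE_URL = "https://api.oruk.ai"


def build_request(
    path: str,
    params: Optional[Dict[str, object]] = None,
    api_key: Optional[str] = None,
    style: str = "header",
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a GET request; hypothetical helper.

    style="header" -> X-API-Key header (the preferred form)
    style="bearer" -> Authorization: Bearer header
    style="query"  -> ?api_key=... (for SSE clients that cannot set headers)
    """
    query = dict(params or {})
    headers: Dict[str, str] = {}
    if api_key:
        if style == "header":
            headers["X-API-Key"] = api_key
        elif style == "bearer":
            headers["Authorization"] = f"Bearer {api_key}"
        elif style == "query":
            query["api_key"] = api_key
        else:
            raise ValueError(f"unknown style: {style}")
    url = f"{BASE_URL}{path}"
    if query:
        url += "?" + urlencode(query)
    return url, headers
```

The returned pair can be handed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`. Calling it without `api_key` yields a plain request suitable for the public endpoints.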
Tier matrix:

| Tier | Calls/mo | API delay | Keys | SSE | Per-min |
| --- | --- | --- | --- | --- | --- |
| free | 100 | 5 min | 1 | not included | 30 req/min |
| pro ($12/mo) | 1,000 | none | 2 | not included | 60 req/min |
| legacy | 1,000 | none | 2 | not included | 60 req/min |
| trader ($50/mo, invite) | 10,000 | none | 2 | real-time | 300 req/min |
| developer ($100/mo) | 10,000 | none | 2 | real-time | 300 req/min |
| enterprise | 1M+ | none | 100 | real-time | custom |

The live wire on https://oruk.ai/ is real-time and free for everyone (no signup, no delay, no credit card). Paid tiers exist for the programmatic API (REST + SSE), real-time API responses, and higher quotas. Annual billing is two months free vs. monthly on Pro and Developer.

---

# 3. Endpoints

## GET /v1/stories/feed (public)

Pre-built feed of the latest stories. No auth, no quota. Powers the public wire on oruk.ai and the no-key fallback in oruk-mcp.

Parameters:

- `limit` int, 1-100, default 20
- `sort` "recent" (default) or "impact"
- `since_hours` int, 1-168, default 4

Response:

```json
{
  "stories": [{ /* see Story shape below */ }],
  "meta": {"count": 10, "window": "all", "cursor": "evt_...", "hasMore": true}
}
```

## GET /v1/stories (auth)

Paginated, filterable list.

Parameters:

- `limit` int, 1-100
- `cursor` string (story id from a prior response)
- `category` one of the 12 categories
- `since` ISO 8601 ("2026-04-28" or "2026-04-28T15:00:00Z")
- `topics` comma-separated topic filter
- `q` full-text search (headline / summary / body / source / city)
- `region` one of the 6 regions
- `country` ISO 3166-1 alpha-2 ("US", "DE", "JP")
- `urgency` "breaking" | "developing" | "routine"
- `min_impact` int, 0-10
- `min_confidence` float, 0.0-1.0
- `format` "json" (default) | "csv" | "jsonl"

## GET /v1/stories/{id} (auth)

Single story by `evt_…` id. Includes the full body, timeline of developments, multi-source corroboration with verbatim quotes, multi-category list, and event coordinates.
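Cursor pagination over `/v1/stories` might look like the sketch below. It assumes that `/v1/stories` returns the same `meta.cursor` and `meta.hasMore` fields shown in the feed response, which this document does not state explicitly. `iter_stories` and `fetch_page` are hypothetical names; `fetch_page` stands in for whatever HTTP call you use and must return the parsed JSON body.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_stories(
    fetch_page: Callable[[Dict[str, str]], dict],
    filters: Optional[Dict[str, str]] = None,
    max_pages: int = 10,
) -> Iterator[dict]:
    """Yield stories page by page, following the cursor (hypothetical helper).

    Assumes each page looks like {"stories": [...], "meta": {"cursor", "hasMore"}}.
    max_pages caps the number of backend calls, since each one counts against quota.
    """
    params = dict(filters or {})
    for _ in range(max_pages):
        # Pass a copy so the caller's fetch function never sees later mutations.
        page = fetch_page(dict(params))
        yield from page.get("stories", [])
        meta = page.get("meta", {})
        if not meta.get("hasMore") or not meta.get("cursor"):
            break
        params["cursor"] = meta["cursor"]
```

Injecting `fetch_page` keeps the pagination logic separate from transport and auth, so the same loop works with any HTTP client and can be exercised with canned pages.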
## GET /v1/stream (auth, SSE)

Server-Sent Events. Developer, Trader, and Enterprise tiers may connect in real time (Free, Pro, Legacy → HTTP 403).

Events:

- `story` — new or updated story payload
- `corroboration` — existing story confirmed by another source
- `heartbeat` — system pulse with active source count

Reconnect with `Last-Event-ID` is supported.

## GET /v1/sources (auth)

Every monitored source: name, city, region, country, language, default category, medium (`audio_radio` | `social` | `structured`), live status, polling cadence.

## GET /v1/regions (auth)

Aggregated regional story counts for map and analytics overlays.

## GET /v1/stats (auth)

System-wide statistics (active sources, stories total, transcription throughput, top categories, last cycle ms, uptime seconds).

## POST /v1/webhooks (Developer or higher)

Subscribe an HTTPS endpoint to `story` and `corroboration` events.

Filters:

- `categories` (array of category slugs)
- `min_impact` (0-10)
- `min_confidence` (0.0-1.0)
- `country` (ISO 3166-1 alpha-2)
- `topic_match` (substring match on topics)

Payloads are HMAC-SHA256 signed with your secret. Up to five active webhooks per account.

---

# 4. Story shape (canonical)

Every story payload, whether from `/v1/stories`, `/v1/stories/feed`, `/v1/stories/{id}`, or the SSE stream, has this shape:

```json
{
  "id": "evt_8f3a2b",
  "headline": "...",
  "summary": "...",
  "body": "...",
  "category": "conflict",
  "categories": ["conflict", "diplomacy"],
  "topics": ["Iran", "aircraft", "military"],
  "urgency": "breaking | developing | routine",
  "impact": 9,
  "confidence": 0.96,
  "sourceName": "BBC World Service",
  "sourceId": 14,
  "eventCity": "London",
  "eventCountry": "GB",
  "eventRegion": "Europe",
  "eventLat": 51.51,
  "eventLon": -0.13,
  "language": "en",
  "translatedFrom": null,
  "firstSeenAt": "2026-04-28T22:13:42Z",
  "updatedAt": "2026-04-28T22:14:08Z",
  "timestamp": "2026-04-28T22:13:42Z",
  "storyStatus": "developing",
  "corroboration": {
    "count": 4,
    "sources": ["BBC", "NPR", "Al Araby", "France Info"],
    "sourceDetails": [
      {"name": "BBC", "region": "Europe", "language": "en", "medium": "audio_radio"},
      {"name": "NPR", "region": "North America", "language": "en", "medium": "audio_radio"},
      {"name": "Al Araby", "region": "Middle East", "language": "ar", "medium": "audio_radio"},
      {"name": "France Info", "region": "Europe", "language": "fr", "medium": "audio_radio"}
    ]
  },
  "timeline": [
    {"at": "2026-04-28T21:30:00Z", "text": "Initial report on BBC World Service"},
    {"at": "2026-04-28T21:42:00Z", "text": "NPR confirms with named official"}
  ],
  "sources": [
    {"station": "BBC World Service", "quote": "...", "medium": "audio_radio"},
    {"station": "NPR", "quote": "...", "medium": "audio_radio"}
  ]
}
```

Important field semantics:

- `corroboration.count` is *independent* sources, not raw mentions. Two AP wires republished by different outlets count once.
- `medium ∈ {audio_radio, social, structured}`.
- `eventCity` / `eventCountry` / `eventRegion` are where the *news* happened. Don't confuse with the `source.region` (where the broadcaster is).
- `confidence` is the LLM extractor's self-reported confidence. Use ≥ 0.85 for high-confidence reads.

---

# 5. Errors

Stable shape on every error response:

```json
{"error": "", "message": ""}
```

| HTTP | code | When |
| --- | --- | --- |
| 400 | `invalid_request` | Malformed query (e.g. `since=yesterday`) |
| 400 | `invalid_email` | Malformed email at signup |
| 401 | `unauthorized` | Missing or invalid API key / JWT |
| 401 | `invalid_code` | Wrong email verification code |
| 404 | `not_found` | Unknown story / source / etc. |
| 409 | `email_taken` | Email already registered |
| 429 | `rate_limit_exceeded` | Monthly quota exhausted; honor `Retry-After` |
| 500 | `internal_error` | Backend hiccup; retry with exponential backoff |
| 503 | `service_unavailable` | Pipeline temporarily down (rare) |

Every response carries `x-request-id` (include it in support tickets). 401 responses also carry `www-authenticate: Bearer`.

---

# 6. Categories (12)

A story has exactly one primary `category`; multi-category coverage is exposed via the `categories[]` array.

- **politics** — elections, legislation, government, parties, policy.
- **conflict** — military operations, attacks, ceasefires, frontline broadcasts.
- **economy** — markets, central banks, trade, employment, macro indicators.
- **disaster** — earthquakes, storms, fires, floods, humanitarian emergencies.
- **diplomacy** — bilateral talks, summits, treaties, sanctions, statements.
- **science** — research, missions, discoveries, scientific announcements.
- **health** — outbreaks, public health policy, hospitals, drugs, clinical news.
- **technology** — product launches, AI, regulation, internet, hardware, companies.
- **culture** — arts, entertainment, religion, language, music, society.
- **environment** — climate, conservation, pollution, energy transition.
- **sports** — matches, transfers, tournaments, athletic news.
- **other** — cross-cutting stories that don't fit a single primary category.

Filter by category with `?category=politics` (single) or `?topics=...` (multi-topic intersection).

---

# 7. Methodology

Pipeline: **Ingest → ASR → Extract → Corroborate**.

1. **Ingest** — live audio, video, social, and structured feeds streamed continuously from ~200 sources.
2. **ASR** — per-source on-pod transcription and translation on dedicated GPU pods. Audio never leaves our infrastructure.
3. **Extract** — an LLM extracts headline, summary, body, primary category, topics, urgency, impact, confidence, location, and a verbatim source quote from rolling transcript windows.
4. **Corroborate** — match against existing events in time, space, and semantic similarity. Independent sources are merged onto the same story.

End-to-end latency from "spoken on air" to "live on the public wire" is typically 30-90 seconds for breaking events.

## What sources we listen to

- **audio_radio** — live radio and TV news streams from public broadcasters and major commercial outlets across every region.
- **social** — Mastodon firehose and curated journalist accounts, used as a corroboration signal for events first surfaced on broadcast.
- **structured** — USGS earthquakes, NOAA weather alerts, OpenFDA, GDELT, and similar machine-readable feeds whose source-of-truth is the agency.

## Headline grounding

Headlines are constrained by rules-based grounding that prevents the LLM from sharpening vague claims. We accept a higher false-negative rate to keep the false-positive rate near zero.

## What `corroboration.count` means

The number of *independent* sources that have confirmed the same event in their own words. Two AP wires republished by different outlets count once. Two original radio reports from different broadcasters count twice.
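The difference between independent sources and raw mentions can be made concrete with a toy count. This is only an illustration of the two examples above, not oruk's corroboration logic: the `origin` and `outlet` fields are hypothetical and do not appear in the story payload.

```python
from typing import Iterable, Mapping


def count_independent(reports: Iterable[Mapping[str, str]]) -> int:
    """Count distinct originators of a report (toy model).

    Each report carries a hypothetical "origin" (who first produced the
    reporting) and "outlet" (who carried it). Republished copies share an
    origin, so they collapse into a single independent source.
    """
    return len({report["origin"] for report in reports})


# Two AP wires republished by different outlets: one independent source.
republished = [
    {"origin": "AP", "outlet": "Outlet A"},
    {"origin": "AP", "outlet": "Outlet B"},
]

# Two original radio reports from different broadcasters: two sources.
original = [
    {"origin": "BBC", "outlet": "BBC"},
    {"origin": "NPR", "outlet": "NPR"},
]
```

Here `count_independent(republished)` is 1 and `count_independent(original)` is 2, while a raw mention count would report 2 for both.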
## Quality controls

- Headline-grounding rules
- JSON-schema validation on every LLM output
- Corroboration thresholds for low-confidence events
- Manual daily audit of a random sample

## When to trust a story

- For automated decisioning, prefer events with `corroboration.count >= 3` from the `medium` values you trust for the use case (audio for breaking, structured for compliance).
- Cross-reference single-source stories against `source.url` before quoting them externally.
- For corrections, contact . Reported corrections are logged on .

---

# 8. MCP server (oruk-mcp)

The official Model Context Protocol server is published on npm as [`oruk-mcp`](https://www.npmjs.com/package/oruk-mcp). It runs locally, spawned as a child process by the IDE, and talks directly to `api.oruk.ai`.

## Install

```
npx -y oruk-mcp
```

## Configure

Same JSON for Claude Desktop, Cursor, Continue.dev:

```json
{
  "mcpServers": {
    "oruk": {
      "command": "npx",
      "args": ["-y", "oruk-mcp"],
      "env": { "ORUK_API_KEY": "ork_xxxxxxxxxxxx" }
    }
  }
}
```

- Claude Desktop: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%AppData%\Claude\claude_desktop_config.json` (Windows). Restart the app.
- Cursor: Settings → MCP, paste the inner block; or edit `~/.cursor/mcp.json`.
- Continue.dev: `~/.continue/config.json` under `mcpServers[]`.

## Two modes

- `mode: "public"` — `ORUK_API_KEY` unset. Server scans the freshest 50 stories on `/v1/stories/feed` (2-hour window) and applies all filters client-side. Good for almost every interactive query.
- `mode: "authed"` — `ORUK_API_KEY` set. Full `/v1/stories` surface, cursor pagination, full-text search, arbitrary `evt_…` lookups, and the `oruk_list_sources` / `oruk_get_stats` tools that require a key.

Tool output annotates `structuredContent.mode` so an LLM knows which path was taken and can suggest the user provide a key when relevant.
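The mode split described above can be approximated in a few lines. This is a Python sketch of the behaviour, not code from the `oruk-mcp` package (which is an npm package): all function names are hypothetical, mapping "freshest 50 stories" and "2-hour window" onto the feed's `limit` and `since_hours` parameters is an assumption, and matching `category` against the primary category only is a simplification.

```python
import os
from typing import Dict, Iterable, List, Mapping, Optional


def pick_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "authed" when ORUK_API_KEY is set and non-empty, else "public"."""
    env = os.environ if env is None else env
    return "authed" if env.get("ORUK_API_KEY") else "public"


def public_feed_params() -> Dict[str, int]:
    """Feed query for public mode (assumed mapping of the documented scan)."""
    return {"limit": 50, "since_hours": 2}


def filter_client_side(
    stories: Iterable[dict],
    category: Optional[str] = None,
    min_impact: int = 0,
    min_confidence: float = 0.0,
) -> List[dict]:
    """Apply filters locally, as public mode must do without /v1/stories."""
    kept = []
    for story in stories:
        # Simplification: compare against the primary category only.
        if category is not None and story.get("category") != category:
            continue
        if story.get("impact", 0) < min_impact:
            continue
        if story.get("confidence", 0.0) < min_confidence:
            continue
        kept.append(story)
    return kept
```

Because public mode filters a bounded window locally, a narrow filter can legitimately return nothing even when matching stories exist further back; that is the case where suggesting a key makes sense.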
## Surface

Tools (12):

- `oruk_get_latest`, `oruk_search_news`, `oruk_get_breaking`, `oruk_get_story`, `oruk_get_topic`, `oruk_list_categories`, `oruk_list_sources`, `oruk_get_stats`, `oruk_get_corroboration`, `oruk_describe_api`, `oruk_show_pricing`, `oruk_health`.

Resources (6, read-only URIs):

- `oruk://docs/quickstart`, `oruk://docs/api-reference`, `oruk://docs/methodology`, `oruk://docs/categories`, `oruk://docs/pricing`, `oruk://stories/latest`.

Slash-prompts (3):

- `/summarize_breaking`, `/track_topic`, `/morning_briefing`.

## Quota accounting

Every MCP tool call that reaches the backend is one API call against your key — same metering as the REST API. The MCP holds an in-process cache for ~3 s on the public feed, so multi-tool turns inside one session typically collapse to a single backend hit.

Package + readme: .

Official transport today: stdio via npm (`npx -y oruk-mcp`). The package includes `mcpName: "ai.oruk/oruk-mcp"` and `server.json` metadata for MCP registries and agent directories.

Oruk does not currently publish a production remote MCP endpoint. Remote MCP over Streamable HTTP is planned as a controlled beta after OAuth-compatible connector auth, per-key quotas, origin controls, and audit logging are in place.

---

# 9. Agent discovery

Oruk publishes machine-readable discovery files:

- `https://oruk.ai/.well-known/ai.json` — capability manifest for REST, SSE, webhooks, MCP, auth, story fields, pricing, and machine-payment status.
- `https://oruk.ai/.well-known/agent.json` — lightweight agent skill card, usage rules, boundaries, and payment status.
- `https://oruk.ai/AGENTS.md` — operating guide for autonomous agents, including citation rules and examples.
- `https://oruk.ai/sitemap.md` — Markdown sitemap for LLM crawlers.
- `https://oruk.ai/llms.txt` — curated LLM index.
- `https://oruk.ai/llms-full.txt` — this full context file.

Agents should cite `https://oruk.ai/story/{evt_id}`, include `corroboration.count`, and name at least one confirming source when available. For automated briefings, prefer `corroboration.count >= 3` and `confidence >= 0.85`; do not invent details beyond the returned story fields, timeline, sources, and corroboration block.

---

# 10. Machine payments

Current production billing is API-key based through Stripe subscription tiers. Existing REST, SSE, webhook, and MCP paths do not emit live `402 Payment Required` challenges. Agents should authenticate with an Oruk API key today.

Planned machine-payment rollout:

- x402: planned beta for new opt-in endpoints such as `/v1/x402/stories/feed` and `/v1/x402/stories/{id}`. Existing endpoints stay unchanged.
- Stripe Machine Payments Protocol: under evaluation for session-based or spending-limit authorization once Stripe machine-payments access is available.
- Ledgering: future machine-payment endpoints should record request id, payer, endpoint, amount, network/facilitator, settlement id, and response status.

Until a beta is announced, treat `401`, `403`, and `429` as the active auth and quota signals.

---

# 11. Pricing

| Tier | Monthly | Annual | Calls/mo | API delay | Keys | SSE |
| --- | --- | --- | --- | --- | --- | --- |
| free | $0 | $0 | 100 | 5 min | 1 | not included |
| pro | $12/mo | $120/yr ($10/mo eq.) | 1,000 | none | 2 | not included |
| developer | $100/mo | $1,000/yr ($83/mo eq.) | 10,000 | none | 2 | real-time |
| enterprise | contact | contact | 1,000,000+ | none | unlimited | real-time |

- The live wire on https://oruk.ai/ is real-time and free for everyone, no signup required. The paid surface is the API + SSE stream, not the wire.
- Annual = two months free vs. monthly on Pro and Developer.
- One API call = one REST request *or* one SSE connection open (events on an open SSE stream don't re-bill) *or* one MCP tool invocation that reaches the backend.
- Webhooks ship on Developer and above (up to 5 endpoints, HMAC-signed).
- Bulk export (`format=csv` / `format=jsonl`) requires any API key.

Sign up: . Pricing details: . Enterprise: .

---

# 12. Recipes (common agent tasks)

- **What's breaking right now?** — MCP `oruk_get_breaking()` or `GET /v1/stories?urgency=breaking&min_impact=5`.
- **Track a topic over time** — MCP slash-prompt `/track_topic topic="tariffs" hours=12` or `GET /v1/stories?q=tariffs&since=`.
- **Verify a single claim** — MCP `oruk_get_corroboration(evt_id)` or `GET /v1/stories/`; read the verbatim quotes; require `corroboration.count >= 3` for automated decisioning.
- **Bulk export** — `GET /v1/stories?since=&format=jsonl` (any tier with a key).
- **Real-time stream** — `GET /v1/stream` with SSE; reconnect on close; events: `story`, `corroboration`, `heartbeat`.
- **Filter by event location** — use `region` (continent) + `country` (ISO 3166-1 alpha-2). `eventRegion` / `eventCountry` are where the news happened, *not* where the broadcaster is.
- **Find broadcast-only confirmations** — `GET /v1/stories?q=...` then filter the response client-side for stories where every `sources[].medium === "audio_radio"`.
- **What stations cover Europe?** — `GET /v1/sources?region=Europe` (auth) or `https://oruk.ai/sources?region=Europe` (HTML).
- **Morning briefing** — MCP slash-prompt `/morning_briefing hours=12`.

---

# 13. Feeds & sitemaps

- Atom:
- RSS (all):
- RSS per category:
- Sitemap index:
- Markdown sitemap:

---

# 14. Editorial pages

- Methodology —
- Sources catalogue —
- About / contact —
- Changelog —

---

# 15. Agent guidance (how to behave)

- When citing an oruk story, link to `https://oruk.ai/story/`. It 301s to the canonical `/feed//` URL.
- Always quote `corroboration.count` and at least one source name when summarising. Independent sources are the credibility signal.
- For automated decisioning, prefer events with `corroboration.count >= 3` from `medium = audio_radio` (live broadcast) or `medium = structured` (USGS, NOAA, OpenFDA, GDELT).
- Default to the public `/v1/stories/feed` for first-touch reads — it needs no key and counts against no quota.
- Ask for `ORUK_API_KEY` only when the user wants deeper history, arbitrary `evt_…` lookups, the sources/stats endpoints, or the full-text search surface.
- When users hit a 429, tell them: monthly quota — upgrade at or wait until the calendar month rolls over.
- When users report a factual error in a story, route them to . Reported corrections are logged on the public changelog.

---

End of llms-full.txt. The curated index is at .