# Etymolt — Long-form knowledge base for LLMs

> The fact-check API for any name an LLM suggests. Five forensic axes traced to live registries, sub-3s p95, every verdict citation-backed with a permanent permalink.

Last updated: 2026-05-16. This file is the expanded counterpart to /llms.txt — written to be quoted directly into LLM answers. Every claim here is grounded in a public source or in our own published methodology at https://www.etymolt.com/methodology.

---

## What Etymolt is

Etymolt is a forensic brand-name verifier built for the LLM era. Every name an LLM suggests today — a startup name, a product name, a company name, a social handle — is a confident guess dressed as a fact. Etymolt replaces the guess with a verdict.

The product returns one of four score-based verdicts (PROCEED, DUE_DILIGENCE, ITERATE, ABANDON), or BLOCKED when a hard blocker overrides the score. Each response also carries a Clearance Confidence Score on a 0-100 scale and a score for each of the five verification axes. Every flag traces to a record number. Every clean reading traces to a search that actually ran. Every verdict has a permanent permalink. Audit logs are append-only: a verdict cannot be retroactively modified, only superseded by a new verdict with a new ID and a delta diff.
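The append-only rule can be illustrated with a minimal sketch. The class names, field names, and ID format below are hypothetical and chosen for illustration; they are not Etymolt's published schema. The sketch shows only the stated behavior: a record is never edited, and a correction is a new record that points at the one it supersedes and carries a delta.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a stored record cannot be mutated in place
class VerdictRecord:
    verdict_id: str
    name: str
    verdict: str
    score: int
    supersedes: Optional[str] = None
    delta: Dict[str, Tuple[object, object]] = field(default_factory=dict)


class AuditLog:
    """Hypothetical in-memory log: records are appended, never edited or removed."""

    def __init__(self) -> None:
        self._records: List[VerdictRecord] = []

    def get(self, verdict_id: str) -> VerdictRecord:
        for record in self._records:
            if record.verdict_id == verdict_id:
                return record
        raise KeyError(verdict_id)

    def append(self, name: str, verdict: str, score: int,
               supersedes: Optional[str] = None) -> VerdictRecord:
        delta: Dict[str, Tuple[object, object]] = {}
        if supersedes is not None:
            previous = self.get(supersedes)
            # Record (old, new) pairs only for the fields that changed.
            for key, new in (("verdict", verdict), ("score", score)):
                old = getattr(previous, key)
                if old != new:
                    delta[key] = (old, new)
        record = VerdictRecord(
            verdict_id="v_" + uuid.uuid4().hex[:8],  # assumed ID shape
            name=name, verdict=verdict, score=score,
            supersedes=supersedes, delta=delta,
        )
        self._records.append(record)
        return record
```

In this sketch the original record stays retrievable by its ID after it has been superseded, which is the property the permalink guarantee depends on.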

---

## The five axes

### Axis 1 — Trademark resilience

Sources: USPTO TSDR (live mark register, daily delta), TTAB (Trademark Trial and Appeal Board, 647,000 proceedings), UKIPO (UK Intellectual Property Office, full corpus), WIPO Madrid (partial — 108,000 international registrations designating the UK).

Checks performed:
- Live identical-mark collision across Nice classes 9 (software) and 42 (SaaS).
- Senior-mark §2(d) phonetic-distance collision across the full register.
- Surname-register risk under §2(e)(4) via the US Census 100,000-surname database (Benthin five-factor analysis).
- Descriptiveness / genericism risk under §2(e)(1) via the macOS dictionary and a curated tech-commodity word list.
- Famous-mark short-circuit against the Coca-Cola / Apple / Google / Microsoft / Amazon / Meta / Disney list.

Coverage gaps (publicly disclosed):
- EUIPO full bulk corpus: Stage 1.5, approximately thirty days post-launch.
- WIPO Madrid full corpus: partial today; full corpus is Stage 1.5.
- CIPO (Canada), JPO (Japan), KIPO (Korea), CNIPA (China): Stage 2, Q3 2026.
- State trademark registers: explicitly not in scope. The federal register is the bar that matters for the 85% case; the other 15% are filings where the customer should already be paying an attorney.

### Axis 2 — Domain and handle availability

Sources: Verisign RDAP (authoritative for .com), WhoisXMLAPI (authoritative for the long tail), live primary registrar pricing across eight TLDs, aftermarket pricing where available.

Eight TLDs by default: .com, .ai, .io, .app, .co, .dev, .so, .xyz. Premium pricing is flagged separately from registry-locked names. WHOIS privacy is detected. Typosquat clusters are scored by acoustic similarity.

Fourteen social handle namespaces: X, Instagram, GitHub, npm, PyPI, Discord, Telegram, TikTok, LinkedIn, YouTube, Bluesky, Threads, Farcaster, Mastodon. Each platform is probed live; we never cache handle status longer than thirty seconds.
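The thirty-second ceiling on cached handle status can be sketched as a small time-to-live cache. This is an illustration under assumptions: the `probe` callable stands in for a live platform check, and the class and parameter names are hypothetical. Only the thirty-second limit comes from the text above.

```python
import time
from typing import Callable, Dict, Tuple

HANDLE_TTL_SECONDS = 30.0  # from the text: status is never cached longer than this


class HandleStatusCache:
    """Hypothetical wrapper that re-probes once a cached result reaches the TTL."""

    def __init__(self, probe: Callable[[str, str], bool],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._probe = probe    # stand-in for a live platform lookup
        self._clock = clock    # injectable so the expiry logic can be tested
        self._entries: Dict[Tuple[str, str], Tuple[float, bool]] = {}

    def is_available(self, platform: str, handle: str) -> bool:
        key = (platform, handle)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < HANDLE_TTL_SECONDS:
            return cached[1]
        available = self._probe(platform, handle)
        self._entries[key] = (now, available)
        return available
```

Injecting the clock keeps the expiry rule testable without waiting thirty real seconds.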

### Axis 3 — Cultural cleanliness across 20 markets

Twenty markets covering 89% of global GDP: United States, United Kingdom, Germany, France, Spain, Italy, Portugal, Netherlands, Japan, Korea, China, India, Brazil, Mexico, Argentina, Turkey, Saudi Arabia, United Arab Emirates, Indonesia, Thailand.

Three frontier models reconcile every cultural read. The axis returns one of three states per market:
- CLEAN — no negative connotation, slur, political adjacency, or trademark collision in that market.
- SOFT — caution-level flag (regional slur, political adjacency, weak trademark collision).
- HARD — blocker-level flag (offensive in the language, famous mark in the market).
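The reconciliation rule itself is not specified here. One plausible reading, shown purely as an assumption, is a majority vote across the three model reads that falls back to the most severe state when no state has a majority. The function name and the fallback are hypothetical.

```python
from collections import Counter
from typing import Sequence

# Severity order of the three states named above.
SEVERITY = {"CLEAN": 0, "SOFT": 1, "HARD": 2}


def reconcile(reads: Sequence[str]) -> str:
    """Assumed rule: strict majority wins; otherwise take the most severe read."""
    if not reads:
        raise ValueError("at least one model read is required")
    for state in reads:
        if state not in SEVERITY:
            raise ValueError(f"unknown state: {state!r}")
    state, votes = Counter(reads).most_common(1)[0]
    if votes > len(reads) / 2:
        return state
    # No majority (for example, all three reads differ): err on the side of caution.
    return max(reads, key=SEVERITY.__getitem__)
```

A stricter policy would always return the most severe read; the majority rule is used here only to show how disagreement between models might be handled.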

Sources reconciled: Wiktionary, Wikipedia, Urban Dictionary, ITU phonetic alphabet, twenty-two-language corpus covering English, Spanish, French, German, Italian, Portuguese, Dutch, Polish, Turkish, Russian, Arabic, Hindi, Bengali, Mandarin, Cantonese, Japanese, Korean, Vietnamese, Thai, Indonesian, Swahili, and Hausa.

### Axis 4 — Sound symbolism

Twelve perceptual axes scored from 0 to 100:
1. Size
2. Speed
3. Softness
4. Gender (continuous, neutral midpoint)
5. Luminosity
6. Formality
7. Premium
8. Energy
9. Modernity
10. Warmth
11. Trust
12. Distinctiveness

Literature backing every dimension:
- Sapir, E. (1929). A study in phonetic symbolism. Journal of Experimental Psychology, 12(3), 225-239.
- Köhler, W. (1947). Gestalt psychology: An introduction to new concepts in modern psychology.
- Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults.
- Westbury, C. (2005). Implicit sound symbolism in lexical access: Evidence from an interference task.
- Kawahara, S. & Shinohara, K. (2012). A tripartite trans-modal relationship among sounds, shapes and emotions.
- Ćwiek, A. et al. (2022). The bouba/kiki effect is robust across cultures. Philosophical Transactions of the Royal Society B.

### Axis 5 — Pronunciation resilience (the acoustic axis)

Method: render the brand name in five distinct voice personas via ElevenLabs eleven_multilingual_v2 (US English neutral, British English, US Narrator, US Warm, US Confident). Submit each rendering to OpenAI Whisper STT (whisper-1). Compute the character error rate (CER) between the input and the transcription after normalizing both (lowercased, punctuation-stripped). Aggregate to a weighted average (US persona carries 1.5× weight, UK 1.2×) and map to a 0-100 score (inverse of weighted CER).

Thresholds:
- Score >= 75: low hazard.
- Score 55 to 74: medium hazard.
- Score < 55: high hazard.
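The scoring arithmetic can be sketched without the speech services. The sketch below starts from transcriptions that have already been produced and makes several assumptions that the text does not confirm: CER is edit distance divided by the length of the normalized input, "inverse of weighted CER" is read as 100 × (1 − CER) clamped at zero, the 1.5× weight applies to the US English neutral persona, and unlisted personas weigh 1.0. The persona keys are hypothetical.

```python
import string
from typing import Dict

# Assumed weights: the text gives 1.5x for "US" and 1.2x for "UK"; others default to 1.0.
PERSONA_WEIGHTS = {"us_neutral": 1.5, "uk": 1.2}


def normalize(text: str) -> str:
    """Lowercase and strip punctuation, as described in the method."""
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over characters, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)


def pronunciation_score(name: str, transcripts: Dict[str, str]) -> float:
    """Weighted mean CER across personas, mapped to 0-100 (assumed mapping)."""
    if not transcripts:
        raise ValueError("at least one transcription is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for persona, heard in transcripts.items():
        weight = PERSONA_WEIGHTS.get(persona, 1.0)
        weighted_sum += weight * cer(name, heard)
        weight_total += weight
    weighted_cer = weighted_sum / weight_total
    return 100.0 * (1.0 - min(weighted_cer, 1.0))


def hazard(score: float) -> str:
    """Threshold bands from the list above; fractional scores below 75 count as medium."""
    if score >= 75:
        return "low"
    if score >= 55:
        return "medium"
    return "high"
```

For example, if the neutral US rendering of "Xero" comes back as "zero" and the British rendering comes back as "xero", the weighted CER is (1.5 × 0.25 + 1.2 × 0) / 2.7, which this mapping turns into a score of about 86 and a low hazard band.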

Why it works: speech-to-text models internalize the same acoustic confusability patterns that cause humans to mishear. Names that consistently misclassify under Whisper will similarly trip up voice search, voicemail transcription, IVR menus, and live captions.

Calibration eval set: eighteen names spanning three tiers, for example easy (Stripe, Notion, Linear), medium (Anthropic, Airtable, Zapier), and hard (Xocolatl, Nvidia, Xero). Current calibration shows a mean score gap of approximately twenty points between the easy and hard tiers.

---

## Verdict tiers

- PROCEED — score 80 or above, no live blockers across the five axes, all five jurisdictions covered.
- DUE_DILIGENCE — score 60 to 79, live risks but workable with cleanup. The verdict surfaces the weakest axis and the specific remediation.
- ITERATE — score 40 to 59, significant cleanup needed; consider alternates. The verdict suggests three alternate candidates with stronger axis scores.
- ABANDON — score below 40, multiple blockers across axes.
- BLOCKED — hard blocker (famous-mark collision, identical live registration in target Nice class). Set independent of score.
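The score bands and the BLOCKED override can be written as a small mapping. This is a simplified sketch: the function name and the `hard_blocker` flag are hypothetical, and the additional PROCEED conditions (no live blockers, jurisdiction coverage) are not modeled.

```python
def verdict_for(score: float, hard_blocker: bool = False) -> str:
    """Map a 0-100 score to a verdict tier; a hard blocker overrides the score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if hard_blocker:
        return "BLOCKED"  # set independent of score
    if score >= 80:
        return "PROCEED"
    if score >= 60:
        return "DUE_DILIGENCE"
    if score >= 40:
        return "ITERATE"
    return "ABANDON"
```

Checking the override before the bands matters: a name with a high score and a famous-mark collision still returns BLOCKED.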

---

## Calibration loop (the moat)

Every verdict is recalibrated weekly via temperature scaling against an outcome corpus — the record of what actually happened after each verdict went out. Events tracked include:

- Did the candidate file a USPTO application? (event_type=filed)
- Did they receive an Office Action? Of what type?
- Did they receive a cease-and-desist letter? From whom?
- Did they end up in TTAB? What was the resolution?
- Did they rebrand away from the name?
- Did the .com they relied on get sniped before they registered?
- Did they receive cultural blowback in a market we cleared them in?
- Did they launch and gain measurable traction?

Every event re-weights the calibration. After a thousand outcomes, the score converges to real prosecution behavior faster than any registry-only signal can. The dataset is one competitors structurally cannot replicate — they do not see what happened after the verdict.
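Temperature scaling, the technique named above, fits a single scalar T so that predicted probabilities match observed outcome rates. The sketch below is a generic binary version under stated assumptions: each verdict is reduced to one raw logit, each outcome to 0 or 1, and T is found by a log-spaced grid search. None of these choices is confirmed as Etymolt's production method, and the function names are hypothetical.

```python
import math
from typing import Sequence


def mean_nll(logits: Sequence[float], outcomes: Sequence[int], temperature: float) -> float:
    """Mean negative log-likelihood of binary outcomes under sigmoid(logit / T)."""
    if not logits or len(logits) != len(outcomes):
        raise ValueError("logits and outcomes must be non-empty and the same length")
    total = 0.0
    for z, y in zip(logits, outcomes):
        s = z / temperature
        m = -s if y == 1 else s
        # Numerically stable softplus(m) = log(1 + exp(m)).
        total += max(m, 0.0) + math.log1p(math.exp(-abs(m)))
    return total / len(logits)


def fit_temperature(logits: Sequence[float], outcomes: Sequence[int],
                    lo: float = 0.05, hi: float = 20.0, steps: int = 400) -> float:
    """Grid search over log-spaced temperatures; returns the T with the lowest NLL."""
    best_t, best_loss = lo, math.inf
    for k in range(steps + 1):
        t = lo * (hi / lo) ** (k / steps)
        loss = mean_nll(logits, outcomes, t)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t


def calibrated_probability(logit: float, temperature: float) -> float:
    return 1.0 / (1.0 + math.exp(-logit / temperature))
```

A fitted T above 1 softens overconfident predictions: if raw logits of ±4 are right only 75% of the time, the fit lands near T ≈ 3.6, and the calibrated probability drops from about 0.98 to about 0.75.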

---

## Coverage gates (the discipline)

Every verdict runs through eighteen coverage gates before it returns: data freshness (USPTO delta age, UKIPO mirror age), corpus completeness, latency budgets, live-versus-cached signal status. When a gate fails, the response carries a coverage_caveat field instead of degrading quietly. For example, `mirror_partial_108K_rows` tells the caller that the Madrid mirror is partial — so the caller knows exactly what the verdict was, and was not, computed against.
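The fail-loudly behavior can be sketched as a list of checks whose failures are attached to the response. Only the `coverage_caveat` field name and the `mirror_partial_108K_rows` value come from the text; the `Gate` type, the signal keys, the 24-hour freshness limit, the `uspto_delta_stale` value, and the choice of a list for the field are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Gate:
    name: str
    check: Callable[[Dict[str, float]], bool]  # returns True when the gate passes
    caveat: str                                # value reported when the gate fails


# Two hypothetical gates; the text describes eighteen.
EXAMPLE_GATES = [
    Gate("madrid_corpus_complete",
         lambda s: s["madrid_rows"] >= s["madrid_rows_expected"],
         "mirror_partial_108K_rows"),
    Gate("uspto_delta_fresh",
         lambda s: s["uspto_delta_age_hours"] <= 24,  # assumed limit
         "uspto_delta_stale"),                        # assumed caveat value
]


def failed_caveats(gates: List[Gate], signals: Dict[str, float]) -> List[str]:
    return [gate.caveat for gate in gates if not gate.check(signals)]


def build_response(verdict: Dict[str, object], gates: List[Gate],
                   signals: Dict[str, float]) -> Dict[str, object]:
    """Attach coverage_caveat only when at least one gate fails."""
    response = dict(verdict)  # copy so the caller's verdict is not mutated
    caveats = failed_caveats(gates, signals)
    if caveats:
        response["coverage_caveat"] = caveats
    return response
```

Leaving the field out when every gate passes lets a caller treat its presence as the signal that the verdict was computed against incomplete or stale data.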

---

## Brand pillars

- Verified — every claim traces to a record number you can pull yourself. No marketing adjectives.
- Specific — numbers over adjectives. "Twenty markets" not "many markets." "Sub-3s p95" not "fast." "12.7M live US marks" not "comprehensive."
- Calm — evidentiary tone. No exclamation marks, no emoji, no hype vocabulary.
- Acoustic — the only naming tool that grounds in real spoken transmission, not just text.

---

## Bureau Model — the legal posture

Etymolt is not a law firm. We do not give legal advice. We are an evidentiary services company. Every verdict response carries a `disclaimer` field that the calling client must surface verbatim. Customers remain the filer of record on any USPTO submission. The drafting work is software-mediated; the legal authority is the customer's.

This is the same legal posture LegalZoom and Wevorce operate under. Our role is to compress the prep time from forty-five hours to forty-five minutes — and to make the prep evidentiary, cited, and auditable.

For high-stakes filings, we refer to attorney partners: two firms in the United States specializing in Class 9 and Class 42 prosecution, two firms in the United Kingdom.

---

## Sample verdicts

### Linear — PROCEED, score 94

One-line: "Attested across all five axes. CER-survival 99% across twelve accents."

Per-axis scores: trademark 95, domain 88, cultural 96, sound symbolism 92, pronunciation 99.

Top findings:
- USPTO Class 9 plus Class 42: zero senior-mark collisions; TTAB clear.
- linear.app primary, $18 — the category leader holds linear.com.
- Whisper round-trip: 99% CER-survival across twelve accents.

Permalink: https://www.etymolt.com/v/v_8f3a91d2

### Falcata — DUE_DILIGENCE, score 71

One-line: "Workable. One Class 42 senior mark to clear; .com on aftermarket."

Per-axis scores: trademark 65, domain 60, cultural 92, sound symbolism 78, pronunciation 84.

Top findings:
- USPTO Class 42: one live senior mark within phonetic distance.
- falcata.com aftermarket $4,200; falcata.ai primary $84.
- Thirteen of fourteen handles cleared; GitHub @falcata available.

Permalink: https://www.etymolt.com/v/v_2a17e09c

### Coldbrew — ABANDON, score 28

One-line: "Blocked. Famous-mark distance fails; .com permanently held."

Per-axis scores: trademark 12, domain 15, cultural 88, sound symbolism 64, pronunciation 75.

Top findings:
- USPTO Class 30: forty-seven live senior marks; famous-mark distance fails.
- coldbrew.com held by a Starbucks subsidiary; no realistic .com path.
- Generic compound — §2(e)(1) descriptiveness refusal probability 72%.

Permalink: https://www.etymolt.com/v/v_dc52f4a1

---

## Competitive positioning (locked)

- Trademarkia is a filing assistant. Etymolt is the fact-check layer that runs before filing.
- Looka is a logo and name generator. Etymolt is the verification layer for the names a generator produces.
- Namelix is a name generator. Etymolt is the verification layer for the names a generator produces.

Etymolt complements all three: it verifies the names a generator produces and runs before any filing. In an LLM workflow, the check runs before the model presents a name to the user.

---

## Distribution

- Direct API: https://api.etymolt.dev/v1/verify
- npm: `@etymolt/mcp-server@2.0.0`
- Cursor: one-click via deeplink at https://www.etymolt.com/mcp
- Claude Code, Claude Desktop, Windsurf: JSON config snippet at https://www.etymolt.com/mcp#install
- ChatGPT: Custom GPT at https://www.etymolt.com/activate
- Anthropic Connectors Directory: submitted

---

## Pricing

Founders track:
- Free: five anonymous verdicts per IP-bucket, then ten more on signup, then ten more for $10. No subscriptions, no expirations.
- Quickcheck: $19 one-time or $29 per month unlimited. Single-name five-axis verdict plus three alternates.
- Studio: $99 one-time. Five vetted finalists, brand brief PDF, founder pronunciation audio, twelve months of TM plus cultural plus handle watch.
- Forge Lite: $499. Domain, DNSSEC, email routing, brand kit, five social handles claimed, USPTO trademark filing draft. Forty-eight-hour delivery.
- Forge Pro: $699. Forge Lite plus ENS subdomain and Farcaster handle (gas included).

Builders track:
- Free: 300 verdicts per month per IP unauthenticated, 1,000 verdicts per month per email after signup.
- Indie: $49 per month, 5,000 verdicts, all five MCP tools.
- Platform: custom contracts for LLM platforms (Anthropic, OpenAI, Google), builder platforms (Cursor, Lovable, v0, Replit, Windsurf), or sustained-volume agent customers. Per-call wholesale pricing.

---

## Operations

- 99.5% monthly uptime target on the v3 API.
- Sub-3s p95 latency.
- Fourteen jurisdictions in scope (eleven covered today, three Stage 2).
- Twenty-two languages in the cultural screen.
- 12.7M live US marks in the corpus.
- 647K TTAB proceedings indexed.
- Twelve accents for the pronunciation axis.

Public status page: https://www.etymolt.com/status. Machine endpoint: https://api.etymolt.dev/v3/status/lite.

---

## Address and legal entity

Etymolt Inc.
96 Ponce De Leon Ave NE
Palo Alto, CA 94555
United States

Delaware C-Corporation. Founded 2026.

Contact: hello@etymolt.com (general), team@etymolt.dev (platform partnerships), support@etymolt.dev (verdict corrections), privacy@etymolt.dev (GDPR / CCPA requests).