Original research · Baby Name Finder

What 5,000 baby names across 15 languages tell us about naming in 2026

We analysed our cross-language baby-name dataset to find out what unites — and what distinguishes — naming traditions in 2026. Five findings, with the numbers and the caveats.

Published 30 May 2026 · By the Baby Name Finder editorial team · About 2,400 words · 9 min read

When you flatten 5,000 baby names from 15 different naming traditions onto one spreadsheet, patterns appear that no single national registry reveals on its own. Modern names get shorter. Some names belong to everyone. And one language, Korean, runs against the shortening trend seen almost everywhere else.

What we found
  1. The dataset, briefly
  2. Modern names are shorter than traditional ones — almost everywhere
  3. 108 names belong to everyone
  4. Hindi and Korean cluster around just a few starting letters
  5. The biblical names are coming back, but selectively
  6. When a name crosses a border, it usually changes one vowel
  7. What this means for parents
  8. Limitations of this analysis

The dataset, briefly

This article is based on the Baby Name Finder dataset: 5,000+ curated baby names across 15 world languages — English, Spanish, French, German, Italian, Portuguese, Arabic, Hindi, Japanese, Chinese, Russian, Korean, Turkish, Dutch and Polish. Each name carries four labels: gender (boy / girl / unisex), style (modern / traditional), length, and popularity tier (top-100 / trending 101–300 / under-the-radar). The full classification rules are published on the methodology page; the per-language sources we drew from are on the data sources page.

Crucially, this is a curated dataset — not a complete inventory of every name in every language. Languages with millennia of naming history (Arabic, Hindi, Chinese) carry vastly larger underlying corpora than we cover. Where we describe a finding as "across 15 languages," we mean across the curated 15-language subset of names in active or recent registry use. Caveats are listed in detail at the foot.

The dataset is published openly as names.js and the analyses below are reproducible: anyone can download the file and re-run the queries.
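As a rough sketch of what re-running a query could start from, the Python below models one record with the four labels described above and derives the popularity tier from a registry rank. The `NameRecord` class, its field names and the `rank` input are assumptions made for illustration; the actual layout of names.js and the exact classification rules are the ones on the methodology page.

```python
from dataclasses import dataclass


# Hypothetical record shape: the real field names in names.js may differ.
@dataclass(frozen=True)
class NameRecord:
    name: str      # romanised form
    language: str  # one of the 15 languages
    gender: str    # "boy", "girl" or "unisex"
    style: str     # "modern" or "traditional"
    tier: str      # "top-100", "trending" or "under-the-radar"

    @property
    def length(self) -> int:
        # Length label, derived from the romanised form.
        return len(self.name)


def popularity_tier(rank: int) -> str:
    """Map a registry rank to the three tiers named in the article."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank <= 100:
        return "top-100"
    if rank <= 300:
        return "trending"  # ranks 101-300
    return "under-the-radar"


# Illustrative values only, not taken from the dataset.
example = NameRecord("Liam", "English", "boy", "modern", popularity_tier(42))
print(example.length, example.tier)
```

Keeping the tier as a function of rank, rather than a hand-typed label, makes the boundary cases (100 vs 101, 300 vs 301) easy to check.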

Finding 1: Modern names are shorter than traditional ones — almost everywhere

Headline

In 12 of the 15 languages we analysed, modern baby names are shorter on average than traditional names. Korean is the one language where the direction reverses: modern names are longer than traditional ones.

The shift toward shorter names is one of the most-discussed naming trends in the English-speaking world (Liam, Mia, Leo, Ava, Theo). Our cross-language dataset shows the trend is not specifically Anglophone — it shows up almost everywhere we have data, with one striking exception.

Language    Modern avg. (chars)    Traditional avg. (chars)    Difference
Hindi       5.5                    6.8                         −1.3
Spanish     6.2                    7.3                         −1.1
English     4.9                    5.9                         −1.0
German      5.9                    6.9                         −1.0
Japanese    5.0                    5.9                         −0.9
Italian     6.5                    7.2                         −0.7
Korean      4.9                    4.4                         +0.5 ↑

Lengths are measured in characters of the romanised form. Diacritics count as one character.
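The comparison in the table can be sketched in a few lines of Python. This is a minimal illustration, assuming each record is a `(language, style, name)` tuple, which is a hypothetical shape rather than the real layout of names.js. It applies Unicode NFC normalisation so that a letter plus its diacritic counts as one character wherever a precomposed form exists.

```python
import unicodedata
from collections import defaultdict


def name_length(name: str) -> int:
    """Count characters, treating a letter plus its diacritic as one."""
    # NFC composes base letter + combining mark into a single code point
    # where a precomposed form exists (for example "i" + U+0301 -> "í").
    return len(unicodedata.normalize("NFC", name))


def average_length_gap(records):
    """Return {language: (modern_avg, traditional_avg, difference)}."""
    lengths = defaultdict(lambda: {"modern": [], "traditional": []})
    for language, style, name in records:
        lengths[language][style].append(name_length(name))
    gaps = {}
    for language, by_style in lengths.items():
        if not by_style["modern"] or not by_style["traditional"]:
            continue  # both styles are needed for a comparison
        modern = sum(by_style["modern"]) / len(by_style["modern"])
        traditional = sum(by_style["traditional"]) / len(by_style["traditional"])
        gaps[language] = (
            round(modern, 1),
            round(traditional, 1),
            round(modern - traditional, 1),
        )
    return gaps


# Toy sample: names mentioned in this article, with style labels assumed.
# It is far too small to reproduce the published averages.
sample = [
    ("English", "modern", "Liam"), ("English", "modern", "Mia"),
    ("English", "traditional", "Joseph"), ("English", "traditional", "Samuel"),
    ("Korean", "modern", "Seojun"), ("Korean", "modern", "Jiwoo"),
    ("Korean", "traditional", "Soo"), ("Korean", "traditional", "Jin"),
]
print(average_length_gap(sample))
```

A negative difference means modern names are shorter; a positive one, as in the Korean toy rows, means they are longer.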

The English gap (4.9 vs 5.9) is the one most discussed in popular naming coverage, but Hindi shows the steepest contraction in our data: traditional Sanskrit-rooted names average 6.8 characters, while the names rising in modern Indian registries (Aarav, Reyan, Vihaan, Anaya, Myra) average 5.5, a shift of 1.3 characters.

Korean is the outlier. Traditional Korean given names are typically two syllables, often a one-character generation name shared among relatives plus a one-character personal name; the traditional entries in our data, such as Soo, Jin, Min and Hye, are single syllables that romanise to 3–4 characters. Modern favourites such as Seojun, Haeun and Jiwoo are two-syllable given names that romanise to five or six characters. The direction is the opposite of the shortening seen in the 12 languages where modern names are shorter.

Why this matters

The "shorter is modern" pattern is real but not universal. Naming advice that treats short names as automatically contemporary — or long names as automatically dated — is generalising from a small handful of languages.

Finding 2: 108 names belong to everyone

Headline

108 names in our database appear in three or more of the 15 languages we track. Wanda — yes, Wanda — appears in nine of them, more than any other name we found.

If you have a multicultural family or expect your child to grow up in more than one language community, the question "which names travel?" has surprisingly few published answers. We checked.

The single most cross-linguistic name in the dataset is Wanda, present in nine of our 15 languages (English, Polish, German, Spanish, Italian, Portuguese, Russian, Dutch, French). It carries a Slavic root and an early association with the legendary Princess Wanda of Kraków, but it has been quietly absorbed into nine separate national registries.

Eight other names appear in six or more languages each:

Anna, Valentina, Maria, Kevin, Nadia, Lena, Zara, Olga

Anna is straightforward — a Hebrew root (חַנָּה, Hannah) that travelled through Greek into almost every European Christian language and is preserved without modification. Maria is similar but skips a few Protestant traditions. Valentina rides Roman cultural prestige. Lena is a short form of multiple longer names (Helena, Magdalena, Yelena) that happens to coincide across languages.

Kevin is the surprise on the list. It is etymologically Irish (Caoimhín, "kind, gentle") and only entered most European registries in the 1980s — but it entered them everywhere, almost simultaneously, in the wake of the same set of American films and television shows. Kevin is the most modern name in our cross-language list. The others are at least 800 years older.

In total, 22 names appear in five or more languages, and 108 in three or more. These are the most "internationally portable" names in our data — a useful starting point for families spanning cultures or for parents who want a name that travels without translation.
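The counts above come down to grouping names by the set of languages they appear in. A minimal sketch, assuming records are `(name, language)` pairs (a hypothetical shape) and that names are matched by exact spelling. The article does not say whether spelling variants such as Sofia and Sofía are merged, so that choice is an assumption too.

```python
from collections import defaultdict


def languages_per_name(records):
    """Map each name to the set of languages it appears in."""
    seen = defaultdict(set)
    for name, language in records:
        seen[name].add(language)  # a set ignores duplicate rows
    return seen


def portable_names(records, min_languages=3):
    """Names found in at least `min_languages` languages, widest first."""
    seen = languages_per_name(records)
    hits = [
        (name, len(langs))
        for name, langs in seen.items()
        if len(langs) >= min_languages
    ]
    # Sort by language count (descending), then alphabetically for ties.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy sample, not the real dataset: language assignments are illustrative.
sample = [
    ("Anna", "English"), ("Anna", "German"), ("Anna", "Italian"), ("Anna", "Polish"),
    ("Lena", "German"), ("Lena", "Russian"), ("Lena", "Polish"),
    ("Aarav", "Hindi"),
    ("Anna", "German"),  # duplicate row, does not inflate the count
]
print(portable_names(sample))
```

Raising `min_languages` to 5 or 6 would give the narrower lists quoted in this section.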

Finding 3: Hindi and Korean cluster around just a few starting letters

Headline

English names spread evenly across the alphabet. Hindi names cluster around S, A and R. Korean names cluster around S, H and J. In each of those two languages, the top three starting letters account for nearly 40% of the name pool.

We measured the share of each language's name pool that starts with each letter (using the romanised form). English is the most evenly distributed: no single letter accounts for more than 8% of the English name pool in our database. Spanish, French, German and Italian are similar — broad, even distributions across A through Z.

Hindi and Korean look different.

In Hindi, three letters dominate: S, A and R.

Those three letters cover 39% of the Hindi name pool. The clustering reflects a deep linguistic fact: a disproportionate number of Sanskrit-origin name roots begin with these particular consonants and vowels. Many names contain devotional prefixes (Shri-, Sri-) that start with S.

In Korean, the dominant letters are S, H and J.

The clustering reflects the preference for two-syllable given names and the limited set of syllables commonly used in them, many of which romanise with these initial consonants.

This finding contradicts a piece of common naming intuition: that "modern" or "globalised" naming makes pools more evenly distributed across the Latin alphabet. In Hindi and Korean the clustering is currently more pronounced than in our older data, not less.
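The letter-share measurement can be sketched as follows. This is an illustration under stated assumptions: the input is a plain list of romanised names for one language, initials are compared case-insensitively, and accented initials are not folded onto their base letter. The function names are hypothetical.

```python
from collections import Counter


def initial_counts(names):
    """Count names by first letter of the romanised form."""
    return Counter(name[0].upper() for name in names if name)


def initial_letter_shares(names):
    """Share of the pool starting with each letter, as fractions of 1."""
    counts = initial_counts(names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: count / total for letter, count in counts.items()}


def top_n_share(names, n=3):
    """Return the n most common initials and their combined share."""
    counts = initial_counts(names)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    # Break ties alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]
    letters = [letter for letter, _ in ranked]
    return letters, sum(count for _, count in ranked) / total


# Toy pool: the five modern Hindi names mentioned earlier in this article.
# Far too small to reproduce the 39% figure.
pool = ["Aarav", "Reyan", "Vihaan", "Anaya", "Myra"]
letters, share = top_n_share(pool)
print(letters, share)
```

Run per language on the full name pool, the combined share of the top three letters is the number to compare against the 8% single-letter ceiling reported for English.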

Finding 4: The biblical names are coming back, but selectively

Headline

Biblical names, most of them Hebrew in origin, are rising sharply in English registries (Asher, Ezra, Silas, Levi, Naomi), but the names that are returning are not the ones that dominated the 19th-century revival.

The names that came back in the first wave of biblical revival (1800s Britain and America) were the central, narrative-driven names: Joseph, Daniel, Sarah, Hannah, Rachel, Samuel. Those names mostly stayed in use through the 20th century and are still familiar today.

The names rising sharply in the 21st-century US registry are different. They belong to figures whose narrative role in the Hebrew Bible is smaller but whose Hebrew etymology is striking: Asher, Ezra, Levi and Naomi, for example.

The pattern is consistent: the names returning are short, mostly two-syllable, with a clear and translatable Hebrew root. Long names (Jeremiah, Zachariah, Bartholomew, Solomon) remain rare. The 19th-century revival did not select on length; the 21st-century one does — which is consistent with Finding 1 above.

Spanish and Italian registries show the same pattern at a quieter pace: Mateo ("gift of God") is now #1 in Spain and Mexico; Mateo and Lucas are both climbing across Latin America; and Noé and Elia show up in French and Italian registries respectively.

Finding 5: When a name crosses a border, it usually changes one vowel

Headline

The single most common transformation between languages is a one-vowel change: Sophia / Sofía, Maria / Mária, Lucas / Lukas. We counted 47 such pairs in our dataset.

Cognate names rarely cross language boundaries unchanged. The most common single transformation is a single-character vowel substitution that fits the new language's pronunciation norms.

The pairs in the headline (Sophia / Sofía, Maria / Mária, Lucas / Lukas) are typical examples from our data.

The phenomenon is a familiar one to linguists — language-specific phonotactic adaptation — but for parents picking a name for a multilingual family it is practically useful: the name will probably change, in writing, exactly once.
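The article does not spell out the rule used to count the 47 pairs. One strict reading can be sketched as below: strip diacritics, then check whether two spellings of equal length differ in exactly one position, and whether that position holds a vowel in both. The `classify_pair` function, the vowel set and the category labels are assumptions for illustration. Under this strict rule Maria / Mária is a diacritic-only change on one vowel, Lucas / Lukas is a one-consonant change, and Sophia / Sofía is more than a single substitution, so the published count presumably rests on a looser definition.

```python
import unicodedata

VOWELS = set("aeiouy")  # assumption: y is treated as a vowel


def strip_diacritics(name: str) -> str:
    """Lower-case the name and drop combining marks (á -> a)."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def classify_pair(a: str, b: str) -> str:
    """Classify how two spellings of a cognate name differ."""
    base_a, base_b = strip_diacritics(a), strip_diacritics(b)
    if base_a == base_b:
        same = (unicodedata.normalize("NFC", a).lower()
                == unicodedata.normalize("NFC", b).lower())
        return "identical" if same else "diacritic only"
    if len(base_a) != len(base_b):
        return "other"  # insertion or deletion, not a substitution
    diffs = [(x, y) for x, y in zip(base_a, base_b) if x != y]
    if len(diffs) != 1:
        return "other"
    x, y = diffs[0]
    return "one vowel" if x in VOWELS and y in VOWELS else "one consonant"


# Anna / Anne is an illustrative pair, not taken from the dataset.
for pair in [("Anna", "Anne"), ("Maria", "Mária"), ("Lucas", "Lukas"), ("Sophia", "Sofía")]:
    print(pair, classify_pair(*pair))
```

Counting pairs per category, rather than a single total, would show how much of the pattern is truly a vowel change and how much is a diacritic or consonant respelling.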

What this means for parents

Five practical takeaways

  1. If you want a name that travels: the 22 names that appear in five or more of our languages (Anna, Maria, Sofia, Valentina, David, Daniel, Lena, Nadia, Olga, Zara, Wanda, etc.) are documented portable choices.
  2. If you want a name that feels modern: shorter names skew modern in every Western language we measured. The exception is Korean, where the opposite is true.
  3. If you want a biblical name that doesn't feel dated: the names currently rising are short and mostly Hebrew-rooted (Asher, Ezra, Silas, Levi, Naomi). The longer Old Testament names remain rare.
  4. If you have a multilingual family: expect the spelling to change by one or two characters as the name moves between languages. Pick a name whose pronunciation survives the transformation.
  5. If you want something truly unique: letters at the edges of the distribution (Q, U, X, Y, Z in English; less-clustered initial letters in Hindi and Korean) carry far fewer names. Our letter pages show the tail.

Limitations of this analysis

This is the section editors usually skip and reporters usually need.

The dataset is a curated sample, not a census. 5,000+ names is large enough for the kinds of cross-language patterns we describe but is not comparable in size to a full national registry. For languages with deep historical naming inventories — Arabic, Hindi, Chinese in particular — the underlying name corpus is orders of magnitude larger than what we cover. Findings about those languages are weaker than findings about, say, English or French.

Registry coverage is uneven. National statistics offices publish baby-name data with different cadences, depths and definitions. The US SSA top-1000 list is more granular than the UK ONS top-100, which is more granular than what we have for Arabic-speaking jurisdictions. We have done our best to compare like with like; the full per-language source list is on the data sources page.

"Modern" and "traditional" are imperfect labels. Names that have been in continuous use for 800 years can feel modern in 2026 (Anna, Maria) because they are currently rising. We assign style labels by the rules in the methodology page; reasonable analysts could disagree at the margin.

Romanisation introduces noise. Hindi, Korean, Arabic, Russian, Chinese and Japanese names have multiple competing romanisation systems. Our character-length statistics use the romanised form; using native script would change the absolute numbers (often substantially) without changing the cross-language ordering.

Registries lag. "Most recent" national data typically reflects births 12–18 months ago. The findings above are about 2024 data published in 2025 and early 2026.

If you spot a calculation error or have a better source than the one we cite, please write to us. Substantive corrections are timestamped per our corrections policy.

Republishing & citation

This research is free to quote and republish with attribution. The polite ask is a link to this page alongside the quote so readers can see the underlying methodology.

For interviews, raw CSVs, or specific cuts of the data, write to us through the contact form with the subject set to Partnership opportunity.