Names Aren’t Neutral: David J. Peterson on Creating a Fantasy Language


Photo by Mark Rasmuson

A common bit of advice given to writers is that story comes first; everything else comes second. With respect to fantasy, this advice is often employed to warn against the dangers of falling down the rabbit hole of world building. World building is great only insofar as it serves the story; anything else is a creative form of procrastination.

Ultimately, the author is responsible for every choice made in their book, however much care or research was invested in the effort, and this includes the setting and everything that entails—from the physical terrain to the architecture to the weather to the names of characters. Some details may be relatively unimportant with respect to the story, and so can be left to the reader’s imagination; some may be important in one story, and not in another (for example, the way seasons work is an important detail in George R. R. Martin’s A Song of Ice and Fire; not so important in J. R. R. Tolkien’s Lord of the Rings). But one detail that is never unimportant is language.

When setting a story in a world that isn’t ours, the question of language should be raised immediately. Naturally, the story will need to be told in a language a reader in our world can understand, dialogue included, and so we have the necessary fiction that whatever the story has been told in, it’s been “translated” to a language like English. But the same “translation” rarely happens with names. In other words, if one character can happily say to another “Hand me my sword!” rather than “Ve ilas su varikayet lek!”, why are the names left as Endiriel, Morelth, and Valator, rather than Andy, Mary, and Victor? Tolkien himself actually did a bit of name translation, giving us Bilbo instead of the proper Bilba, because he didn’t think readers would accept the latter as a male name, but otherwise the names tend to be otherworldly.

Of course we know that names aren’t neutral. Names that are commonly associated with one region or one era will evoke memories in the reader associated with that region or era. It’d be just as odd to name knights in a fantasy story Billy Bob and Brittany as it is for King Arthur et al. to come across a wizard named Tim in Monty Python and the Holy Grail. There’s nothing wrong with these names, by any means, and two different languages may coincidentally come up with the same form that has entirely opposite associations in each language, but the notion is if the author expects most of their audience to come from one linguistic tradition, they can rely on the background of that tradition in coming up with names—though at their peril if their work becomes famous internationally (cf. the term “bender” which is vital to the Avatar: The Last Airbender universe, but which has a meaning in Britain it doesn’t in the US).

Keeping in mind an English-speaking audience (and where there’s a distinction, an American English-speaking audience, since that’s the audience I’m a part of), what are the linguistic tropes we hold in our heads regarding names and language? To begin to unravel this question, I have to introduce a concept from linguistics: Phonotactics.

A language’s phonotactics is a collection of rules that define every detail having to do with sound in a language. Not merely a collection of the various sounds found in a language, the phonotactics of a language also define licit syllable shape, licit word shape, stress or tone (depending on the language), and intonation. These rules are descriptive rules, so they may or may not be inviolable, depending on the language and the rule in question. Think of them as tendencies that define the aural character of a language.

For example, in English, we have a lot of monosyllabic words and names (Tom, brick, sword, Sal, fist, etc.). While we can have just about any sound at the end of a word, in a name, we expect certain classes of sounds to come at the end, like the voiceless stops /p, t, k/ (Rip, Matt, Chuck); the nasals /m, n/ (Tom, Ann); and the voiceless fricatives /s, f, θ, ʃ/ (Russ, Jeff, Beth, Rush). While they are licit sounds of English, we don’t expect the voiced versions of the non-sibilant fricatives /v, ð/ at the end of a name (Rav, Midh—the latter rhyming with an antiquated pronunciation of “with”). We can get these sounds at the end of a word, but only after certain vowels—specifically, the type of vowels that we think of as “long” vowels. Thus, my nickname, Dave, ends with the sound /v/. Orthographically, it ends with one of English’s infamous silent e’s. So Dave is fine; Clive is fine; Rove is fine; Dav, Cliv, and Rov are not. In fact, English speakers are hard pressed to figure out how to even represent the sound of these names orthographically (is it Davv and Clivv, the way one could write Daff and Cliff and have them pronounced unambiguously?).

Historically, there’s a reason for this. In English, we only got sounds like /v/ and the /ð/ sound in “bathe” at the end of words because of two sound changes. First, those “silent” e’s didn’t use to be silent. In fact, they were pronounced as a full vowel. This is important, because English had a sound change by which the voiceless sounds /f/ and /θ/ and others were voiced between two vowels. So looking at “bathe” in comparison to “bath”, if you pronounce all the vowels in “bathe”, and you have to voice sounds in between vowels, then, it stands to reason, the “th” in “bathe” should be voiced (pronounced like the “th” in “this” not the “th” in “thin”). By contrast, nothing happens to the “th” in “bath”, because it’s not in between two vowels.

A bit later, English speakers stopped pronouncing a lot of these word-final e’s, rendering them silent (why we kept them there orthographically is beyond me), making “bathe” a monosyllabic word, but leaving it with a different “th” sound, and a different “a” sound than “bath”. The story behind the latter is that in the past, English lengthened vowels in open syllables. An open syllable is one that doesn’t end in a consonant. “Bath” is one syllable long and ends in the consonant /θ/, spelled “th”. In the original pronunciation, “bathe” was two syllables, the first something like /ba/, and the second something like /θe/, which later became /ðe/ as discussed (then later /ðǝ/ and finally just /ð/, where it then is forced to join the previous syllable). Since the first syllable of “bathe” in this pronunciation is open (it ends with a vowel), the vowel is lengthened, giving us something like /baː.ðe/ in the original pronunciation, and [beːð] in the modern pronunciation.
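The sequence of changes described above can be traced mechanically. What follows is only a rough sketch: the ordering of the rules, the simplified spellings /baθe/ and /baθ/, and the function names are assumptions made for illustration. It also skips the intermediate /ðǝ/ stage and the later shift of the long vowel to its modern value.

```python
import re

VOWELS = "aeiou"
VOICED = {"f": "v", "θ": "ð"}

def voice_between_vowels(word):
    # /f/ and /θ/ become /v/ and /ð/ when flanked by vowels.
    return re.sub(rf"(?<=[{VOWELS}])[fθ](?=[{VOWELS}])",
                  lambda m: VOICED[m.group(0)], word)

def lengthen_open_syllable(word):
    # A vowel followed by one consonant and another vowel sits in an open
    # syllable (ba.ðe), so it receives a length mark.
    return re.sub(rf"([{VOWELS}])(?=[^{VOWELS}ː][{VOWELS}])", r"\1ː", word)

def drop_final_e(word):
    # The word-final e falls silent.
    return re.sub(r"e$", "", word)

def derive(word):
    """Apply the three rules in order, recording each intermediate form."""
    stages = [word]
    for rule in (voice_between_vowels, lengthen_open_syllable, drop_final_e):
        stages.append(rule(stages[-1]))
    return stages

print(" > ".join(derive("baθe")))  # baθe > baðe > baːðe > baːð
print(" > ".join(derive("baθ")))   # baθ > baθ > baθ > baθ
```

Nothing ever applies to “bath”, because its one vowel is in a closed syllable and its “th” never sits between two vowels.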

Both the origin of English’s long vowels and the loss of word-final e give us part of the story for why a name like Dave looks common, but Dav looks foreign (or why Cliff looks common but Cliv, or Clivv, or Kliv, or Klivv all look impossibly foreign). But this is just one example amongst the many phonotactic constraints of English, and it’s only half the story. After all, Vestrellios looks foreign. So does Pukpuk. Why does each conjure up sharply contrasting associations?

When an English speaker evaluates a word, there’s not simply an on-off “native/foreign” judgment. In addition to regional judgments made about native terms and phrases (cf. freeway designations), non-native terms and languages are also evaluated. Terms of Romance origin are evaluated more positively probably because you can’t get through a sentence without using a word deriving from a Romance language, despite the fact that English is a Germanic language (for example, in this sentence, leading up to the parenthetical comment, 13 of the 34 words are Romance in origin). In addition, names that are from that tradition, or sound as if they could be, conjure up positive associations. We know that many names of Latin end in -us, -ius, -a, -ia, and so faux-Latin names like Kostrius, Valtus, Caula, and Helvia seem like pretty good names—or, at the very least, aren’t evaluated as ugly, inappropriate, fake-sounding, or childish. Though not a Romance language, the same holds for Greek names, with endings like -ys, -os, -ios, -ion, -iad, etc.

Other languages and language families don’t get the same privileged status, especially Semitic, East Asian, African, and Austronesian naming traditions. This has partly to do with lack of familiarity (not because they’re unfamiliar to English speakers, but because Western European names and languages are vastly overrepresented in daily life), but has more to do with the Western tradition of exoticism (read: racism), especially in literature (cf. William Beckford’s Vathek). The culture and mythology of Greece and Rome have long been revered in the West for purely cultural reasons, and so the names associated with those histories and myths conjure up positive associations. By contrast, the canon of the Middle East, China, and Japan, if it was read at all, was often read mostly by scholars who translated and presented that work to Western audiences, couched in the language and trappings of colonialism. These works were not afforded the same status as the “classics,” and so they were not treated with the same respect.

An upshot of this history is the way that early authors would use names supposedly of Eastern or African origin. These weren’t the names given to heroes, but to sidekicks, “mystics,” demons, wicked rulers, barbarians, villagers (“tribesmen”), or occasionally damsels in distress. So-called “Oriental” tales like Vathek were quite popular in their day, and drew lots of readers, which led to more works like these, which led to continual reinforcement of these stereotypes. Thus, you get heroes like Vontius, not Bongaluka.

At base, though, if a name is totally made up (i.e. the precise phonological form doesn’t exist as a name or a word in any other language, to the best of an author’s ability), it should be totally neutral. Theoretically, anyway. We don’t live in the theoretical world, though, so here’s some practical advice for fantasy authors who need to come up with a whole host of names for their epic fantasy series.

Naturally, an author cannot be expected to come up with one or more full languages for their work (though it’s not unprecedented, and, indeed, I’d encourage it, if the author knows what they’re doing). That doesn’t mean they can’t work with a conlanger who can do precisely that. There are thousands of language creators the world over, and several hundred who are able and ready to do high quality work for an author who hires them to do so (and, honestly, even a beginning conlanger is likely going to do a better job than an author who has no idea what they’re doing). Unlike in TV and film, though, where everyone involved is used to working as a team to get a high quality result, authors are notoriously stingy about collaborating on any aspect of their work—especially when it comes to anything so important as the names of characters. My advice to authors? Don’t be. That name you’re so attached to (e.g. Estriel or Drixxx or Vaurus)? Probably not that special (see above). Plus, good conlangers know all this stuff. They can work with you to get something you’re happy with that works. Plus, in addition to that one name, you can also get a ton of other person and place names along with it.

If you’re determined to go it alone, that’s fine, but here are the most important things to bear in mind when coming up with the linguistic background of a fantasy world.

First and foremost, if an author is going to go to the trouble of creating a huge fantasy world with different countries and regions, it is important to give some thought to the linguistic history of that world. Thus far, the only inhabited world we know of is ours, so we don’t have a lot of examples to draw from, but based on what we know about our world, it seems unlikely that there would be an entire world with only one language—or even only two languages. We have about 7,000 here, but for the purposes of a fantasy work, that’s a little misleading, as these aren’t 7,000 unrelated languages. They all come in more or less interrelated clumps, with isolates popping up here and there. And that’s just spoken languages. The sign languages of our world have an entirely separate history of their own (so while English is spoken in America and England and French is spoken in France, American Sign Language is related to French Sign Language, not British Sign Language, which is separate from both).

The amount of linguistic diversity you see in a region depends on the history of the region. If it’s just one group of people who more or less travel all over the region regularly and they’ve never had any contact with anyone outside that region, sure, you can get away with one language. If there’s immigration of any kind, though, you’ll have different people who bring with them different languages and come from different naming traditions. Do you have a large city? It probably draws many people from many regions and many different linguistic traditions. Was your region conquered in the distant past by some other region? Unless they totally wiped out everyone who was there, there’s at least two linguistic traditions there. Even if the conquerors were eventually repelled, if they were there for any considerable length of time, that linguistic tradition will leave its footprints on the region, just like French did with English as a result of the Norman Conquest.

You don’t need a language tree in the front of the book the way you have family trees or geographical maps (though this would be very cool!), but you need to take some notes on the linguistic diversity of your region and its linguistic history before proceeding to the next step: naming.

In coming up with names, both for people and places, you first need to know what language the name is coming from (it’s weird if you have four characters all from the same place and with the same linguistic backgrounds with names like Tolros, Mevelestemnia, Brightsilver, and Mukmuk). With that information, you have to come up with the languages themselves. Not in their fullest forms, mind, but as a sketch, or what Jeffrey Henning called a naming language.

A naming language requires a couple of different elements that are found in a full conlang, but not all of them. Crucially, you don’t need a full grammar or lexicon. You do, however, need a list of phonemes (the sounds of the language), a set of licit syllable shapes, ways in which the syllables fit together into words, and, for some added authenticity, the headedness of the language (I’ll get to this in a minute).
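For authors who like to keep their notes in a machine-readable form, the ingredients listed above fit in a very small data structure. The following is a sketch only: the class name, the field names, and the sample inventory are all hypothetical, and the random generator is just one possible way of putting syllables together into words.

```python
import random
from dataclasses import dataclass

@dataclass
class NamingLanguage:
    consonants: list          # phonemes usable in a "C" slot
    vowels: list              # phonemes usable in a "V" slot
    syllable_shapes: list     # licit shapes, e.g. ["CV", "CVC"]
    max_syllables: int = 3    # how many syllables a word may have
    head_final: bool = True   # order of elements in compounds

    def syllable(self, rng):
        # Pick a licit shape, then fill each slot from the inventory.
        shape = rng.choice(self.syllable_shapes)
        return "".join(
            rng.choice(self.consonants if slot == "C" else self.vowels)
            for slot in shape
        )

    def name(self, rng):
        count = rng.randint(1, self.max_syllables)
        return "".join(self.syllable(rng) for _ in range(count)).capitalize()

# Hypothetical sample language; the inventory is an assumption.
lang = NamingLanguage(list("ptksmnl"), list("aiu"), ["CV", "CVC"])
rng = random.Random(0)  # seeded so a run can be repeated
print([lang.name(rng) for _ in range(5)])
```

Every name produced this way uses only the listed sounds and only the listed syllable shapes, which is the whole point of writing the rules down first.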

Starting with sounds, all languages have a fixed set of sounds which can be used to create meaning. Sometimes sounds will only appear in foreign words (as with /f/ in Hindi or Tamil), or in certain very narrow circumstances (as with /ʒ/ in English, which occurs in borrowings, like the “g” in “genre”, or as a variant pronunciation of old /s/ before an old /j/ sound, as in “measure”, “leisure”, and “treasure”), but this will only be relevant if you have other languages to borrow from. If you’re lost about what sounds to use, here’s a minimal starter set:

All languages have most of these sounds; most languages have all of these sounds. From here, you can add others, or subtract some to give it a different character. An easy place to start is adding /e/ and /o/, as many (but not all) languages have the common five vowel system of Spanish. Some common approximants are some kind of /r/ (many languages have an /r/, but its precise character tends to differ from language to language), and then the semi-vowels /j/ (commonly spelled “y” in English) and /w/. The sound /f/ is also fairly common, as are the voiced versions of /p, t, k, s/—respectively, /b, d, g, z/. Moving beyond English, if you’re going to add a sound, it pays to research a language that has that sound, so you can see how it works. Note generally, though, that sounds come in bunches. If you decide you like the ejective /p’/ sound, for whatever reason, it’d be surprising to see just /p’/, and not also /t’/ and /k’/. It’d also be a little odd to have /p, t, k/ and then /b/ and /ɡ/, but not /d/. There are further generalizations than this, but they get a bit technical.
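The observation that sounds come in bunches can be turned into a quick sanity check on an inventory. This is a minimal sketch, assuming just three hypothetical series (the `SERIES` table and the function name are made up for illustration, and a real check would cover many more classes of sounds):

```python
# Hypothetical series: sounds in each tuple tend to appear together.
SERIES = {
    "voiceless stops": ("p", "t", "k"),
    "voiced stops": ("b", "d", "g"),
    "ejective stops": ("p'", "t'", "k'"),
}

def series_gaps(inventory):
    """Return {series name: missing sounds} for every partly filled series."""
    gaps = {}
    for name, members in SERIES.items():
        present = [s for s in members if s in inventory]
        missing = [s for s in members if s not in inventory]
        if present and missing:  # a series that is wholly absent is fine
            gaps[name] = missing
    return gaps

inventory = {"p", "t", "k", "b", "g", "s", "m", "n", "a", "i", "u"}
print(series_gaps(inventory))  # {'voiced stops': ['d']}
```

A reported gap isn't necessarily an error, since natural languages do have gaps, but it is worth a second look before committing to it.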

Once you have your sounds, the next step is to figure out how they’re put together into syllables. In English, our syllables are quite permissive. We have syllables as small as “a,” represented as V, and as large as “strengths,” CCCVCCC. Many languages have much narrower restrictions on what constitutes a syllable. Hawaiian is famous for allowing no consonant clusters and no coda consonants (i.e. consonants that close a syllable), but requiring no consonants in between vowels. This results in words like Honolulu, Kamehameha, and Maui. In English, it’s rare for two vowel sounds to come next to one another, though it happens in words like “react” and “seeing”.

In designing your language’s phonotactics, it pays to decide precisely what syllables will be allowed and to stick to it, so you don’t end up with a bunch of names like Manta, Kalu, Sevan, Embe, and Tanam, and then the one name Skrux, all of them supposedly from the same language. That’s weird.
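One way to stick to a decision like this is to write the syllable rule down and test every name against it. The sketch below assumes a hypothetical language whose only licit syllable shape is (C)V(C), with a made-up inventory wide enough to include every letter in the names above. It also treats each letter as one sound, which will not be true of every romanization.

```python
import re

CONSONANTS = "bklmnrstvx"  # assumed inventory, for illustration only
VOWELS = "aeiou"

# One syllable: optional consonant, one vowel, optional consonant.
SYLLABLE = f"[{CONSONANTS}]?[{VOWELS}][{CONSONANTS}]?"
WORD = re.compile(f"(?:{SYLLABLE})+")

def fits(name):
    """True if the whole name can be split into licit syllables."""
    return WORD.fullmatch(name.lower()) is not None

names = ["Manta", "Kalu", "Sevan", "Embe", "Tanam", "Skrux"]
for name in names:
    print(name, fits(name))
# Manta True, Kalu True, Sevan True, Embe True, Tanam True, Skrux False
```

Skrux fails even though all of its letters are in the inventory, because no (C)V(C) syllable can begin with three consonants in a row.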

It also brings us to the next point. Names and words that come from the same source should look like they belong together, exhibiting a reasonable amount of variation within a fixed set of parameters. George R. R. Martin is pretty good with this. Take a look at the following Dothraki and Valyrian names:

The Dothraki names (all male) are a bit simpler than the Valyrian names, but it’s clear what he’s doing. He’s identified sequences that recur frequently in each language and he’s put them together to form names, paying special attention to the endings which are important in each of these languages. The names have no meanings, and George R. R. Martin didn’t flesh out the languages themselves too much, but the names all look and feel like they belong together. This was done by keeping the phonemes consistent, and their arrangements consistent.

Moving beyond just the sounds of names, if you want to add some authenticity to them, you can give them meaning without expending too much effort. Most names either are words with some specific meaning, or are derived from words with some specific meaning. Unless you want every single name to derive from exactly one word, you’ll have to figure out how elements are arranged in a sentence, in order to figure out how elements will be arranged in a compound. In linguistics we refer to this as headedness. When dividing bits of a sentence into phrases, each phrase has a head (the thing the phrase is about) and a series of dependents. The ones relevant to us are noun phrases and verb phrases, the head of each of which is a noun and verb respectively.

Starting with a nominal example, “happy cat” is a noun phrase. It refers to one entity, and that entity is a cat, not a happy (i.e. we’re talking about a cat that is happy, not a happy that is cat). Thus “cat” is the head of the phrase “happy cat.” In English, the head of that phrase comes at the end, with its dependents or modifiers coming first. In Spanish, we see the opposite, where the translation of “happy cat” would be gato feliz, literally “cat happy.”

This ordering is relevant for names because we sometimes see place names that are simply an adjective and a noun. In California, we have Big Bear, Redlands, Rolling Hills, and many others. We also see noun-noun compounds, where the first noun modifies the second, as with Mountain View, Garden Grove, Fountain Valley, and Newport Beach (both types of compounds are in that last one!). To come up with names like these, all you need to do is come up with the words and decide what order they go in. In A Song of Ice and Fire, it’s clear that the order of elements in Dothraki is opposite from English, with place names like Vaes Dothrak and Vaes Tolorro.

The same strategy can be used for possession, which can help with last names and other place name strategies. For example, in Peterson, the possessor comes first, and the possessee second. The possessor is a kind of modifier (Peter’s son is a type of son, not a type of Peter). Same with something like Williamsburg, where “-burg” is an old suffix deriving from a word for mountain (hence iceberg). The order is flipped for languages that have a different order, such as MacDonald, where “Mac” is the part that means “son”.
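Once the headedness of a language is decided, building compounds is purely mechanical. Here is a small sketch of that step, using the English and Scottish examples above as stand-in vocabulary; the function name and its parameters are hypothetical.

```python
def compound(head, modifier, head_final, joiner=""):
    """Join a head and its modifier in the order the language requires."""
    parts = [modifier, head] if head_final else [head, modifier]
    # title() capitalizes each word and lowercases the remaining letters.
    return joiner.join(parts).title()

# Head-final, as in English: the modifier (or possessor) comes first.
print(compound("son", "peter", head_final=True))              # Peterson
print(compound("bear", "big", head_final=True, joiner=" "))   # Big Bear

# Head-initial: the word for "son" comes first.
print(compound("mac", "donald", head_final=False))            # Macdonald
```

The same function works for adjective-noun, noun-noun, and possessive compounds, so long as the order chosen for a given language stays the same throughout the book.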

And, of course, if you’re going this far, you can also take a moment to decide if “son” is something appropriate for a compound that’s used for names. In English, we just have “son.” In Icelandic, both the words for son and daughter are used in the same constructions (e.g. Einarsson and Einarsdóttir). But perhaps instead the word “child” is used, or just “of” or “from” (the latter is especially common in Romance with people and places, as in De Silva, Del Monte, De Anza, etc.).

Less common in English are compounds with verbs. In Spanish, though, there are words like sacapuntas, “pencil sharpener,” which literally means “removes ends.” In order to form a word like that, though, one has to know the order of the verb and its object (or the verb and its adverb), as they will appear in the same order. These can be used to form good descriptive names for characters given by parents who hope the name will inform the character of their offspring (e.g. “brings joy,” “defeats enemies,” “listens well,” etc.), and also different types of descriptions of places (e.g. “runs fast,” said of a river, or “grows wheat,” referring to the soil).

All of this information is useful for generating interesting and consistent names of people and places, and for giving realistic linguistic backgrounds to regions. It’s not sufficient for translating sentences. As minimal as they may seem, many of the choices above constrain the possible grammar of a language. Doing anything beyond this constrains it even further. With only a bunch of names, the language itself can still take on any character, provided it takes into account the phonology and phonotactic constraints, and the headedness present in compounds. That, though, says nothing about whether the language has cases or not, if verbs agree with a subject or object, what tense and aspect information is encoded on the verb, what auxiliaries there are, how subordination works, how relative clauses work, etc. There’s a lot yet to be done, but what’s there will be consistent. So, if the book is optioned for a movie, and a language creator is hired to fill in the blanks later, they won’t be tearing their hair out trying to figure out the mess that the author made of their language.

One very strong recommendation I will make, though, is that one should not begin with some specific language—especially a language one doesn’t speak—and kind of arbitrarily change bits and pieces of it to give it some kind of “aesthetic” specific to that region. In satire it’s done defensively for the sake of plausible deniability; in fantasy it smacks of the old 18th and 19th century exoticism. It can also be disorienting if readers familiar with the language in question see familiar bits of it mashed together in odd ways. Imagine a book set in the fictional land of Englerika with names like Ittler, Breakfas, Bighot, Streetson, and Laceingest (and then imagine non-English speakers gushing over what a beautiful name Laceingest is). If an author finds themself saying “I think I’ve altered this language enough that speakers of said language won’t be offended,” it’s a sure sign that the author should be doing something different.

As a final note, I’d like to discuss how words and names in created languages are written in a given text. To start with, I want to introduce some precise terminology. A writing system is the set of glyphs used by one or more languages to write their language down. For example, English, Spanish, German, Finnish, and Portuguese all use the Roman alphabet as their writing system. By contrast, Arabic and Farsi both use the Arabic abjad as their writing system. An orthography is the specific way a writing system is used for a given language. The American English orthography differs from the British English orthography in certain respects (e.g. the spelling of “color/colour”), despite the fact that both use the Roman alphabet. A romanization system is a way of transcribing a writing system that differs from the Roman alphabet. For example, tabemashita is a way of romanizing the Japanese word 食べました. In short:




Writing System: あいうえ etc.; A B C D etc.; А Б В Г etc.

If an author is going to the trouble of rendering all the non-English dialogue into English, they should do their readers the courtesy of rendering their names in a romanization system. This romanization system should be as uncreative as humanly possible, using only graphs and digraphs that will be fairly unambiguous for the majority of their readership. There should be no need for a page saying how each letter is pronounced. Where it’s important to render sounds not readily rendered in English, use digraphs that can easily be analogized (so t : d :: th : dh or s : z :: sh : zh). If most readers will be American, avoid diacritics at all costs, as these will be ignored. One would hope that many would be familiar with the pronunciations of German “ü” and “ö,” but it is not a guarantee. Apostrophes should be used to mark possession or contractions, as they are in English orthography. If they are necessary, they can be used as a consonant to indicate the glottal stop /ʔ/, similar to the okina of Hawaiian, which is always ‘, never ’. They can also indicate that a sound is an ejective, as that’s how they’re transcribed in the International Phonetic Alphabet. Otherwise, they should be avoided. Using an apostrophe to separate parts of a word is not clever: it is an abomination. Do’ing so in Engl’ish would give ever’y’one head’ache’s. Why would an author wish such violence upon their readers?
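An uncreative romanization of this kind amounts to a lookup table. The sketch below uses the analogized digraphs mentioned above (th/dh, sh/zh) and a glottal stop letter; the table and function names are hypothetical, and the choice of the Unicode character U+02BB for the glottal stop is an assumption.

```python
# Sounds with no obvious single letter get an analogized digraph:
# t : d :: th : dh and s : z :: sh : zh.
ROMANIZATION = {
    "θ": "th",
    "ð": "dh",
    "ʃ": "sh",
    "ʒ": "zh",
    "ʔ": "\u02bb",  # glottal stop, written with an okina-style letter
}

def romanize(phonemes):
    """Spell a list of phonemes; anything unlisted is written as itself."""
    return "".join(ROMANIZATION.get(p, p) for p in phonemes)

print(romanize(["ʒ", "a", "ð", "i", "n"]))  # zhadhin
print(romanize(["ʃ", "u", "k", "a"]))       # shuka
```

Because every sound always maps to the same spelling, a reader who works out one name can pronounce all the others.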

The good news is that it would take an author less time to come up with a naming language than it took to read this (and in the time it took me to write this, an author could probably come up with ten or more). Knowing what you’re doing takes a bit of time and effort, but not as much as it takes to learn to create an entire language. This is fairly simple, and can be done ahead of time. An enterprising author could sit down and come up with twenty or so naming languages in an afternoon for use in future work. It’s a minimal investment of time that will pay dividends, as it doesn’t rely on using other linguistic tropes, but actively creates brand new ones unique to the work in question.