If you were to ask Google how many languages there are, it would tell you that there are somewhere in the region of 6500 languages spoken in the world today.

If you were to ask us, you might expect a translation agency to have a more precise answer than that. But it’s actually a much trickier question than you might think.

There are the obvious ones we all already know about. English and Mandarin Chinese, for example, both have over a billion speakers. And then there are the surprisingly popular ones you may never have heard of, like Bhojpuri, which has somewhere in the region of four times as many speakers as Swedish.

But languages such as those are really obvious and easy to identify. They are clearly distinct and millions of people speak them every day.

Even if we stick to such criteria – being clearly distinct and having millions of speakers – it isn’t a simple question to answer.  This is because we don’t actually have any set criteria for determining when a way of communicating becomes classed as a language.

What is a language?

Let’s look at Nigerian Pidgin. The first question we have to ask is what is a ‘pidgin’? And is it a language?

Technically, a pidgin is a simplified form of language that allows two groups who do not share a common language to communicate.

But when it stops being used for that purpose, and starts being spoken as a primary means of communication within groups, does that not make it a language?

A creole is often defined as a mix of a European language and a “local” language, most commonly the African languages spoken by slaves when they were transported to the West Indies. So one could argue that a creole is a language in its own right. But at what point does a pidgin become a creole? And if over 30 million people speak a pidgin, as is the case for Nigerian Pidgin, is it still not a language? It certainly has no national recognition as a language, but it does have unique words not found anywhere else. So I think you could be forgiven for accepting that it is indeed a language. But equally you might forgive someone for claiming it isn’t.

So that gives us some idea as to why we can’t really know for definite how many languages there are – but we’ve only really scratched the surface.

What about variations within a language?

For a variety of social, political and economic reasons, a creole language can begin to drift more towards one of the original languages from which it is descended. But what that inevitably leads to is a number of different dialects within a creole. How different must they be to be considered their own language?

The linguist William Stewart suggested a continuum from “acrolect”, the most socially prestigious variety, down to “basilect”, the least so. But these distinctions are somewhat arbitrary: many communities switch between varieties of the creole in question depending on context, and often speak the original tongues as well.

As an example, in Guyanese Creole the phrases “aɪ ɡeɪv hɪm wʌn” and “mɪ bɪn ɡiː æm wan” are equally correct and mean exactly the same thing.

But you probably noticed that these are written in phonetic characters, which naturally leads to the question of whether a language needs to have a written form. And if not, how do you represent it in a digital format? Phonetic notation is not an effective way to communicate with most readers.

We’ve mentioned dialects within creoles, but what about dialects in more standard languages?

Are dialects languages?

While fewer and fewer people are speaking them with any consistency, there are still up to forty dialects in the UK alone that would be difficult for a non-local to understand. Many of them use different spellings and word structures and would be almost impossible to represent digitally. However, it is unlikely that these would be classed as languages. Even two hundred years ago, when they would have been almost unrecognisable to a modern English speaker, they were still classed as part of the same language.

And if that isn’t confusing enough, we then have methods of communication such as Silbo Gomero. That is only a language in the loosest sense of the word.

It has no spoken words.

It uses differences in the pitch of whistling to communicate. So does that class as a language?

Let’s assume for a moment that we can all agree on exactly what makes a method of communication a language. Even then, the confusion isn’t resolved, as some languages have died out and subsequently been revived.

Does a language need to have native speakers to be considered valid?

How about extinct languages?

Take Cornish, for example. It’s commonly accepted that the last native speaker of the Cornish language was Dolly Pentreath, who died in 1777. But in reality there is no way of knowing whether people that long ago were speaking true Cornish, or simply English with a heavy sprinkling of Cornish dialect mixed in.

What we know with certainty is that by around the year 1800 no one was speaking it any more. However, after a concerted revival effort there are now around 3000 people with some ability in the Cornish language, and 500 or so fluent speakers.

How about the many different forms that make up Latin? Does contemporary Latin count, and if so how about Medieval Latin, or even Classical Latin?

And if Latin does count, then how far back do we have to take English for it to count as a separate language? There are people perfectly able to converse in Middle English, or even Old English. How about Old Norse?

And then once we’ve decided against trying to find out if ancient languages count, how do we deal with newly created languages?

What about fictional languages?

What is a language?  Does Elvish count?

Tolkien’s Elvish has unique grammar, vocabulary, and even an alphabet of its own. And there are people who can speak it. Does that mean it counts?

How about Klingon?


So, if we go back to the original question of ‘how many languages are there?’, we can see that even the proposed answer of 6500 is nothing more than a guess based on a huge number of assumptions.

We can say one thing for certain, though. Here at STAR we have worked from 61 source languages, and translated into 89 target languages.

We haven’t yet been asked to work in a language we couldn’t translate into or from.

So when it comes down to it, does it really matter how many there are?