homepage of jouni filip maho


The Bantu languages constitute one of the largest groups of languages in Africa. They are spoken by more than 150 million people in central, eastern and southern Africa. The total number of Bantu languages is difficult to guess, but a rough estimate of 400-500 shouldn't be too far off.

[Figure: map showing the distribution of Bantu language groups]

During the 20th century, several classifications of the Bantu languages were published. Following Sir Harry Johnston, variously authoritative statements on how to classify the Bantu languages have been offered by Yvonne Bastin, Anthony Cope, Clement Doke, Bernd Heine, Malcolm Guthrie, Derek Nurse, Gérard Philippson, the Summer Institute of Linguistics (SIL), and many others. Some have tried to classify all of Bantu, while others have focused their attention on particular subgroups only.


Most classifications have employed some kind of coding system reflecting the internal structure of the classification itself. The best-known such classification-cum-coding system is that of Malcolm Guthrie, which was first presented in the late 1940s and later elaborated upon, especially during the late 1960s.

According to Guthrie's system, languages are assigned to zones, signified by upper-case letters, e.g. Zone A, B, C, etc. Within each zone, languages are further subgrouped into groups, signified by multiples of ten, e.g. group A10, A20, A30, etc. Each language is then given a number of its own within its group, e.g. Fang is coded A75, Kikongo is H16, Zulu is S42, etc. Occasionally dialects are distinguished by trailing lower-case letters, e.g. Mpongwe (being a dialect of Myene B11) is coded B11a.
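
The anatomy of such a code can be illustrated with a small Python sketch. It is only an illustration of the description above, not part of Guthrie's work or of NUGL: the names GuthrieCode and parse_guthrie_code are made up for the example, and it assumes the plain form of one zone letter, two digits and an optional dialect letter, so extended codes with three digits (such as K332 further down) are deliberately rejected.

```python
import re
from typing import NamedTuple, Optional


class GuthrieCode(NamedTuple):
    """Parts of a code such as B11a (hypothetical helper type)."""
    zone: str               # upper-case zone letter, e.g. "B"
    group: str              # zone plus the tens digit, e.g. "B10"
    language: str           # zone plus both digits, e.g. "B11"
    dialect: Optional[str]  # trailing lower-case letter, e.g. "a", else None


# Assumption: one zone letter, exactly two digits, at most one dialect letter.
_PATTERN = re.compile(r"([A-Z])(\d)(\d)([a-z]?)")


def parse_guthrie_code(code: str) -> GuthrieCode:
    """Split a two-digit Guthrie-style code into zone, group, language, dialect."""
    match = _PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a two-digit Guthrie-style code: {code!r}")
    zone, tens, units, dialect = match.groups()
    return GuthrieCode(
        zone=zone,
        group=f"{zone}{tens}0",
        language=f"{zone}{tens}{units}",
        dialect=dialect or None,
    )


print(parse_guthrie_code("B11a"))
# GuthrieCode(zone='B', group='B10', language='B11', dialect='a')
```

So Fang A75 falls in group A70 of Zone A, and Mpongwe B11a is dialect "a" of language B11 in group B10.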

Guthrie's original classification-cum-coding system is not without faults and inconsistencies. Still, the codes have become widely used in the literature on Bantu languages, and are among Guthrie's most pervasive contributions to the field of Bantuistics (among many other things, of course).


Many people have noted that, while Guthrie's coding system has become widely used when referring to individual Bantu languages, the structure of his classification has considerable flaws as a linguistic-genetic statement. That is, the historical validity of many of Guthrie's claimed subgroups is often doubtful and sometimes non-existent. For this reason, post-Guthrie classifications have sought to revise or even replace it. Some of these have done so while retaining Guthrie's original coding system, though suitably modified (e.g. SIL and Tervuren, cfr Maho 2002).

The unfortunate consequence is that there now exists a veritable mess when referring to individual languages or larger language groupings within Bantu, simply because (a) one and the same language or language group is referred to with different codes by different authors, and (b) one and the same code can refer to several different languages or language groups. Examples are given below.

  Language (with suggested NUGL code)   Guthrie's code   Tervuren's code   SIL's code   Other codes used
  -----------------------------------   --------------   ---------------   ----------   -----------------
  Gciriku (K332)                        ---              K39               K70          K38 (Möhlig 1983)
  Kumu (D23)                            D23              D37               D30          ---
  Lunda (L52)                           L52              K22               K30          ---
  Lwalu (L221)                          ---              L22b              L30          L39 (Maho 1999)
  Mbala (H41)                           H41              K51               K60          ---
  Mbole (D11)                           D11              C68               D10          ---
  Ngangela (K12b)                       K12b             K19               K20          ---
  Ntomba-Bikoro (C35a)                  C35a             C35               C70          C65 (Botne 1999)
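
To make point (a) concrete, the rows of the table above can be transcribed into a small Python structure and queried. This is only a sketch: the names SCHEMES, TABLE, distinct_codes and build_code_index are invented for the example, "Other" lumps together the codes attributed to individual authors, and None stands for "---".

```python
from collections import defaultdict

# Column order of each tuple in TABLE (labels chosen for this example).
SCHEMES = ("NUGL", "Guthrie", "Tervuren", "SIL", "Other")

# Rows transcribed from the table above; None stands for "---".
TABLE = {
    "Gciriku":       ("K332", None,   "K39",  "K70", "K38"),
    "Kumu":          ("D23",  "D23",  "D37",  "D30", None),
    "Lunda":         ("L52",  "L52",  "K22",  "K30", None),
    "Lwalu":         ("L221", None,   "L22b", "L30", "L39"),
    "Mbala":         ("H41",  "H41",  "K51",  "K60", None),
    "Mbole":         ("D11",  "D11",  "C68",  "D10", None),
    "Ngangela":      ("K12b", "K12b", "K19",  "K20", None),
    "Ntomba-Bikoro": ("C35a", "C35a", "C35",  "C70", "C65"),
}


def distinct_codes(language):
    """Return every distinct code under which a language appears in the table."""
    return {code for code in TABLE[language] if code is not None}


def build_code_index():
    """Map each code to the (scheme, language) pairs that use it."""
    index = defaultdict(set)
    for language, codes in TABLE.items():
        for scheme, code in zip(SCHEMES, codes):
            if code is not None:
                index[code].add((scheme, language))
    return dict(index)


index = build_code_index()
print(sorted(distinct_codes("Lunda")))  # ['K22', 'K30', 'L52']
print(sorted(index["K38"]))             # [('Other', 'Gciriku')]
```

Even in this small sample every language turns up under at least three different codes, which is exactly the situation a single stable reference code is meant to avoid.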


The main purpose of the New Updated Guthrie List (NUGL) is to preserve Guthrie's original codes and to provide simple and consistent principles for assigning new codes to languages missing from Guthrie's original classification, irrespective of whether or not we know their proper linguistic-genetic status within Bantu. NUGL is a referential classification, not a historical one.

The only reasonable thing we can do at this point is to stick to Guthrie's original classification (more specifically, his referential coding system) as an untouchable basis. But since that has many omissions, we need to update it. That is, we need to add new languages to it and these need to be assigned new codes. In order for this exercise to be useful, the new languages must be added to Guthrie's classification without making a mess of the old codings. The principles employed when new codes are assigned are simple and relatively straightforward (cfr Maho 2001, 2003, forthcoming).

(More to be added on this later.)


A question that keeps popping up now and then is: how many Bantu languages are there? Unfortunately, the answer has to remain rather vague. Most estimates range between 300 and 600. However, any figure or range will include a heavy dose of guesswork. The linguistic situation in many parts of the Bantu area is still only fragmentarily documented, so often we have merely a list of known language names, and these may or may not be part of larger dialect continua.

Also, the problem of what is a language and what is a dialect is not necessarily solved even when good linguistic documentation is available. Speech varieties are usually defined as languages or dialects by factors other than degree of linguistic similarity, difference, or even mutual intelligibility. Sometimes two mutually intelligible and similar speech varieties are regarded as two separate languages if, for instance, the speakers decide that that's how they want it, or if there exist separate standardisations for them, which is the case with Kwanyama and Ndonga in Namibia, as well as Setswana, Sesotho, and Northern Sotho in South Africa. Several differing speech varieties may be subsumed under an artificially created norm variety, as is the case with Shona in Zimbabwe. In short, linguistic factors do not always determine what is a language and what is a dialect, as the distinction is not primarily about linguistic differences or similarities but rather political decisions.

So, how many Bantu languages are there, then? In the current version of the NUGL, there are approximately 540 non-indented entries. These may cautiously be interpreted as languages. This is as good an answer as any, even though a range of, say, 400-500 would seem more appropriate. There may in fact be fewer, but at the current state of Bantu linguistics we simply cannot achieve any more accuracy than that.


Jouni F. Maho. 2001. The Bantu area: (towards clearing up) a mess. Africa & Asia: Göteborg working papers on Asian and African languages and literatures, n. 1, p. 40-49.

Jouni F. Maho. 2002. The Bantu line-up: comparative overview of three Bantu classifications. Dept of Oriental and African Languages, Göteborg University. Pp 59.

Jouni F. Maho. 2003. A classification of the Bantu languages: an update of Guthrie's referential system. In: The Bantu languages, p. 639-651. Edited by Derek Nurse & Gérard Philippson. Routledge language family series, n. 4. London & New York: Routledge.

Jouni F. Maho. 2008. Indices to Bantu languages (an accompanying volume to "The new updated Guthrie list"). Studies in African linguistics, v. 73. Munich: Lincom Europa. Pp 187.

Jouni F. Maho. 200x. The New Updated Guthrie List. Forthcoming. Pp 234.
