Phoneme frequencies within various languages

Particles the Greek
Lebom
Posts: 181
Joined: Tue Sep 17, 2013 1:48 am
Location: Between clauses

Post by Particles the Greek »

Google only turns up results for English, French, Dutch, and Spanish, and even then you have to poke around quite a bit. Does anybody know of any more, by any chance?

Note that I'm referring to *within*, not *across*, languages; for example the commonest consonant in English is /n/, but in French it's /r/.
Non fidendus est crocodilus quis posteriorem dentem acerbum conquetur.

Ketumak
Lebom
Posts: 231
Joined: Sun Feb 09, 2003 3:42 pm
Location: The Lost Land of Suburbia (a.k.a. Harrogate, UK)

Re: Phoneme frequencies within various languages

Post by Ketumak »

I found this:

http://www.letterfrequency.org/

It's very Eurocentric, and it's about letters, not phonemes, but it's a start.

CatDoom
Avisaru
Posts: 739
Joined: Fri Sep 20, 2013 1:12 am

Re: Phoneme frequencies within various languages

Post by CatDoom »

So I found this article from 1957, which has phoneme frequencies for nine kind-of-random languages (Maori, Hidatsa, Winnebago, Shawnee, Choctaw, Havasupai, Navajo, Chontal [not sure if this is Highland Chontal or Huamelultec], and Tarascan). I only skimmed the article (you can access it online with a free account), and I'm not sure what their dataset is, but they rank the phonemes of each language from most to least common as follows (a slash indicates a tie):

Maori: a, i, e, t, k, o, h, r, u, n, m/ŋ, p, w, ɸ

Maori evidently has vowel length, but most of the long vowels are apparently uncommon.

Hidatsa: a, n, i, h, k, ʔ, u, ʃ, m, iː, ts, íː, t, aː, p, e, í/x, o, á, áː, ú, ó, óː/úː, é/eː/éː, oː, uː

Winnebago: e, ã, i, a, g, n, h, r/ʒ, k, u, ĩ, ũ, dʒ/w, ʔ, o, ʃ, m, s, tʃ, x, p/t/z, b/ɣ

Shawnee: i, a, e, w, k, l, t, aː, n, ʔ, o/eː, tʃ, m, j, p, iː, ʃ, θ, oː

Choctaw: a, i, t, o, h, k, l/n, u/tʃ, m, p, j, g/s, e, ə/ʃ, b, f, ã/w, õ, x, θ, ĩ/eː, aː/iː, oː/ʔ

Wikipedia indicates that Choctaw only has [g] as an allophone of [k], and that [θ] is an allophone in free variation with [ɬ].

Havasupai: k, a, i, m, á, j, h, f, w, u, t, ts, n, s, í, p, ú/l, o, ʔ, é, e/θ, ó, q, íː, úː/ʈ

Navajo: a, i, ʔ, d, n, x, o, iː/t, b, ɬ, g/l, aː, eː, t', oː, e, j, s, z, ĩː, k, dʒ, ĩ, ɣ, ãː/k'/ʃ, ẽ/ts, ã, ẽː, tʃ'/ʒ, m/tɬ, tʃ/dz, ts', õː/kʷ/xʷ

This one's a little screwy. For one thing, they describe the language as having a contrast between voiced and voiceless stops and affricates, rather than having plain and aspirated series. For another, unlike in some of the other languages, tone isn't distinguished in the vowels. I'm not sure why they only include one lateral affricate, either.

Chontal: u, n, á, h, ʔ, t, k, í, i, ɨ, a, é, l, tʃ, b, ʃ, k', ó/p, o, m, ú, j, ɨ́, t', w, s, ts, ts'/r, e/p', tʃ', d

Tarascan: a, i, n, k, s, u, e, r, t, p, o/m, h, d, b, ə, tʃ, ɽ, ŋ, g/l/ts, ʃ, kʰ/tʰ, f, pʰ/tʃʰ

This inventory differs from that given in the Wikipedia article on Purépecha/Tarascan, but it sounds like there are a number of divergent varieties, so who knows.
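
In case anyone wants to compare these rankings programmatically, here's a rough Python sketch that turns one of the lines above into a phoneme-to-rank table. The function name parse_ranking and the way ties are numbered (tied phonemes share a rank and the next rank skips ahead, e.g. 1, 2, 2, 4) are my own assumptions, not anything taken from the article. Note that these are ordinal ranks only, since I haven't copied any actual counts.

```python
def parse_ranking(line):
    """Turn 'a, i, m/ŋ, p' into {phoneme: rank}; tied phonemes share a rank."""
    ranks = {}
    position = 1
    for group in line.split(","):
        # A slash separates phonemes that are tied at the same rank.
        tied = [p.strip() for p in group.split("/") if p.strip()]
        for phoneme in tied:
            ranks[phoneme] = position
        # Assumed tie handling: skip ahead by the size of the tied group.
        position += len(tied)
    return ranks


maori = parse_ranking("a, i, e, t, k, o, h, r, u, n, m/ŋ, p, w, ɸ")
shawnee = parse_ranking(
    "i, a, e, w, k, l, t, aː, n, ʔ, o/eː, tʃ, m, j, p, iː, ʃ, θ, oː"
)

# Where does /n/ fall within each language's own ranking?
for name, ranks in [("Maori", maori), ("Shawnee", shawnee)]:
    print(f"{name}: /n/ is ranked {ranks['n']} of {len(ranks)} phonemes")
```

This only compares positions within each list, so it says nothing about how much more frequent one phoneme is than the next.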
