Google only turns up results for English, French, Dutch, and Spanish, and even then you have to poke around quite a bit. Does anybody know of any more, by any chance?
Note that I'm referring to *within*, not *across*, languages; for example the commonest consonant in English is /n/, but in French it's /r/.
Phoneme frequencies within various languages
- Particles the Greek
- Lebom
- Posts: 181
- Joined: Tue Sep 17, 2013 1:48 am
- Location: Between clauses
Non fidendus est crocodilus quis posteriorem dentem acerbum conquetur.
- Ketumak
- Lebom
- Posts: 231
- Joined: Sun Feb 09, 2003 3:42 pm
- Location: The Lost Land of Suburbia (a.k.a. Harrogate, UK)
Re: Phoneme frequencies within various languages
I found this:
http://www.letterfrequency.org/
It's very Eurocentric and it's about letters not phonemes, but it's a start.
Re: Phoneme frequencies within various languages
So I found this article from 1957, which has phoneme frequencies for 9 kind-of-random languages (Maori, Hidatsa, Winnebago, Shawnee, Choctaw, Havasupai, Navajo, Chontal [not sure if this is Highland Chontal or Huamelultec], and Tarascan). I only skimmed the article (you can access it online with a free account), and I'm not sure what their dataset is, but they give the relative rankings for each phoneme of each language as follows, in order from most to least common (a slash indicates a tie):
Maori: a, i, e, t, k, o, h, r, u, n, m/ŋ, p, w, ɸ
Maori evidently has vowel length, but most of the long vowels are apparently uncommon.
Hidatsa: a, n, i, h, k, ʔ, u, ʃ, m, iː, ts, íː, t, aː, p, e, í/x, o, á, áː, ú, ó, óː/úː, é/eː/éː, oː, uː
Winnebago: e, ã, i, a, g, n, h, r/ʒ, k, u, ĩ, ũ, dʒ/w, ʔ, o, ʃ, m, s, tʃ, x, p/t/z, b/ɣ
Shawnee: i, a, e, w, k, l, t, aː, n, ʔ, o/eː, tʃ, m, j, p, iː, ʃ, θ, oː
Choctaw: a, i, t, o, h, k, l/n, u/tʃ, m, p, j, g/s, e, ə/ʃ, b, f, ã/w, õ, x, θ, ĩ/eː, aː/iː, oː/ʔ
Wikipedia indicates that Choctaw only has [g] as an allophone of [k], and that [θ] is an allophone in free variation with [ɬ].
Havasupai: k, a, i, m, á, j, h, f, w, u, t, ts, n, s, í, p, ú/l, o, ʔ, é, e/θ, ó, q, íː, úː/ʈ
Navajo: a, i, ʔ, d, n, x, o, iː/t, b, ɬ, g/l, aː, eː, t', oː, e, j, s, z, ĩː, k, dʒ, ĩ, ɣ, ãː/k'/ʃ, ẽ/ts, ã, ẽː, tʃ'/ʒ, m/tɬ, tʃ/dz, ts', õː/kʷ/xʷ
This one's a little screwy. For one thing, they describe the language as having a contrast between voiced and voiceless stops and affricates, rather than having plain and aspirated series. For another, unlike in some of the other languages, tone isn't distinguished in the vowels. I'm not sure why they only include one lateral affricate, either.
Chontal: u, n, á, h, ʔ, t, k, í, i, ɨ, a, é, l, tʃ, b, ʃ, k', ó/p, o, m, ú, j, ɨ́, t', w, s, ts, ts'/r, e/p', tʃ', d
Tarascan: a, i, n, k, s, u, e, r, t, p, o/m, h, d, b, ə, tʃ, ɽ, ŋ, g/l/ts, ʃ, kʰ/tʰ, f, pʰ/tʃʰ
This inventory differs from that given in the Wikipedia article on Purépecha/Tarascan, but it sounds like there are a number of divergent varieties, so who knows.
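In case anyone wants to play with these lists, here's a rough Python sketch for turning one of the lines above into numeric ranks. It only assumes the notation used in this post (commas separate ranks, a slash marks a tie). The function names `parse_ranking` and `ranks` are my own, and giving tied phonemes the same rank and then skipping ahead (so m/ŋ are both 11 and p is 13) is just one convention I picked, not something taken from the article.

```python
def parse_ranking(line):
    """Split a line like 'Maori: a, i, m/ŋ' into (name, list of tie groups)."""
    name, _, body = line.partition(":")
    # Each comma-separated item is one rank; a slash inside it marks a tie.
    groups = [[p.strip() for p in item.split("/")] for item in body.split(",")]
    return name.strip(), groups


def ranks(groups):
    """Map each phoneme to its rank; tied phonemes share the same rank.

    Assumption: after a tie of size n, the next rank skips ahead by n.
    """
    result = {}
    position = 1
    for group in groups:
        for phoneme in group:
            result[phoneme] = position
        position += len(group)
    return result


maori = "Maori: a, i, e, t, k, o, h, r, u, n, m/ŋ, p, w, ɸ"
name, groups = parse_ranking(maori)
maori_ranks = ranks(groups)
print(name, maori_ranks)
```

Keep in mind these are only rank orders, not actual frequencies, so about all you can do with the output is compare where a given phoneme falls in each language.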