Phonotactics and language identification
-
- Sanci
- Posts: 66
- Joined: Tue May 11, 2010 5:51 pm
I remember reading once somewhere that one of the ways we identify words as English or not involves phonotactics - like the whole gostak thing: we know that 'distims' and 'doshes' are both potential English words, even though we may not know them specifically. We also recognize a word like 'vlim' is definitely not English.
But my question is this - when faced with words that are clearly possible in English phonotactics, we can still place them fairly accurately in terms of origin. So, for example, the name Tunde Adebimpe is clearly African (probably Niger-Congo) even though with a bit of respelling, it fits the English paradigm. Similarly, 'doroshke' is clearly Slavic, though again, with a bit of respelling, it fits the English paradigm. What else is going on here?
[quote="TomHChappell"]I don't know if that answers your question; is English a natlang?[/quote]
Sound sequences aren't simply evaluated on a binary choice of "allowed" or "not allowed", but also on probability. Anyone familiar with African names knows that certain combinations are much more probable in Niger-Congo languages than they are in English, even if they fall within the realms of the possible in English.
[quote="Kai_DaiGoji"]So, for example, the name Tunde Adebimpe is clearly African (probably Niger-Congo) even though with a bit of respelling, it fits the English paradigm.[/quote]
Since when does English allow word-final short /E/?
[i]Linguistics will become a science when linguists begin standing on one another's shoulders instead of on one another's toes.[/i]
—Stephen R. Anderson
[i]Málin eru höfuðeinkenni þjóðanna.[/i]
—Séra Tómas Sæmundsson
-
- Sanci
- Posts: 66
- Joined: Tue May 11, 2010 5:51 pm
[quote="Echobeats"]Sound sequences aren't simply evaluated on a binary choice of "allowed" or "not allowed", but also on probability. Anyone familiar with African names knows that certain combinations are much more probable in Niger-Congo languages than they are in English, even if they fall within the realms of the possible in English.
[quote="Kai_DaiGoji"]So, for example, the name Tunde Adebimpe is clearly African (probably Niger-Congo) even though with a bit of respelling, it fits the English paradigm.[/quote]
Since when does English allow word-final short /E/?[/quote]
I was reading it as /eɪ/ which is allowed all the time.
[quote="TomHChappell"]I don't know if that answers your question; is English a natlang?[/quote]
[quote="Kai_DaiGoji"][quote="Echobeats"]Sound sequences aren't simply evaluated on a binary choice of "allowed" or "not allowed", but also on probability. Anyone familiar with African names knows that certain combinations are much more probable in Niger-Congo languages than they are in English, even if they fall within the realms of the possible in English.
[quote="Kai_DaiGoji"]So, for example, the name Tunde Adebimpe is clearly African (probably Niger-Congo) even though with a bit of respelling, it fits the English paradigm.[/quote]
Since when does English allow word-final short /E/?[/quote]
I was reading it as /eɪ/ which is allowed all the time.[/quote]
Final <e> is usually silent though in English. It's generally ethnic words/names that we pronounce with /eI/ or /i/. Like Enrique, karaoke, etc.
Also, there are certain names that can be misleading. The name Sam Rainsy at first glance looks like it could be a typical English or American name, but it actually is the name of a Cambodian politician.
Last edited by Silk on Mon Aug 23, 2010 2:41 pm, edited 1 time in total.
Echobeats makes an important point about probability and frequency.
The human brain seems to be able to recognize statistical patterns which can be formalized using Markov chains and similar probabilistic models, but beefed up by our knowledge of phonetic classes.
Given any sequence of letters and any set of Markov models, one of the models will assign that sequence a higher probability than the others do (ties aside). If the models are taken to be "what a given language looks like" then we will probably think the sequence "looks like a word in language X", where X is the language of the best-scoring model.
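To make the Markov idea concrete, here is a minimal sketch, not a claim about what the brain actually does: one character-bigram model per "language", with add-one smoothing, where a word is assigned to whichever model gives it the highest log-probability. The two word lists and their labels (`cluster_heavy`, `open_syllable`) are invented toy data, not samples of any real language, and treating letters as sounds is a simplifying assumption.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"  # word-boundary markers


def train(words):
    """Count character bigrams (including boundary markers) in a word list."""
    counts = Counter()
    for w in words:
        padded = START + w.lower() + END
        counts.update(zip(padded, padded[1:]))
    return counts


def log_likelihood(word, counts):
    """Log-probability of `word` under an add-one-smoothed bigram model."""
    vocab_size = len(ALPHABET) + 1  # any letter, or the end marker, can follow
    context_totals = Counter()
    for (first, _second), n in counts.items():
        context_totals[first] += n
    padded = START + word.lower() + END
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        total += math.log((counts[(a, b)] + 1) / (context_totals[a] + vocab_size))
    return total


def guess_language(word, models):
    """Return the name of the model most likely to have generated `word`."""
    return max(models, key=lambda name: log_likelihood(word, models[name]))


# Hypothetical toy training data, made up purely for illustration.
MODELS = {
    "cluster_heavy": train(
        ["strength", "twelfth", "scripts", "splints", "thrust", "sprawl"]
    ),
    "open_syllable": train(
        ["kolabi", "tumade", "banelo", "sifoka", "odemi", "lukaba"]
    ),
}

print(guess_language("malobe", MODELS))
print(guess_language("strints", MODELS))
```

With these toy lists, "malobe" comes out as `open_syllable` and "strints" as `cluster_heavy`, even though neither word appears in the training data. A more realistic version would work on phoneme transcriptions and, as suggested above, back off to phonetic classes when a specific pair of sounds has never been seen.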
- Ulrike Meinhof
- Avisaru
- Posts: 267
- Joined: Wed Apr 20, 2005 12:31 pm
- Location: Lund
Re: Phonotactics and language identification
[quote="Kai_DaiGoji"]We also recognize a word like 'vlim' is definitely not English.[/quote]
Though 'vlog' is.
Attention, je pelote !
I've read that it's probably just that there aren't any native words with 'vl', not that 'vl' is a disallowed sequence; hence we're actually fairly alright with neologisms like 'vlog' and names like 'Vlad'.
Speaking of which, what happened to Vlad? Did he just decamp to IRC and never look back; is he even still around on IRC, come to that?
- Guitarplayer II
- Lebom
- Posts: 76
- Joined: Wed Aug 23, 2006 4:44 pm
- Location: Marburg, Germany
Re: Phonotactics and language identification
[quote="Dingbats"][quote="Kai_DaiGoji"]We also recognize a word like 'vlim' is definitely not English.[/quote]
Though 'vlog' is.[/quote]
A lot of people prefer to pronounce it vee-log, though.
Re: Phonotactics and language identification
[quote="Silk"][quote="Dingbats"][quote="Kai_DaiGoji"]We also recognize a word like 'vlim' is definitely not English.[/quote]
Though 'vlog' is.[/quote]
A lot of people prefer to pronounce it vee-log, though.[/quote]
What?? Really???? That's horrible.
It's like 'flog' or 'blog' but with a 'v'. I hate people.
- Radius Solis
- Smeric
- Posts: 1248
- Joined: Tue Mar 30, 2004 5:40 pm
- Location: Si'ahl
There's also a commercial pickle brand, Vlasic, which, like "Vlad", nobody seems to have any trouble pronouncing. (In fact these two names are my main go-to evidence for when I argue with people about whether there's a difference between a phonology disallowing something and merely having a gap for it - some people conflate these ideas.)
-
- Lebom
- Posts: 196
- Joined: Tue May 11, 2010 5:50 pm
- Location: Berlin, Germany
- LinguistCat
- Avisaru
- Posts: 250
- Joined: Thu Apr 13, 2006 7:24 pm
- Location: Off on the side
[quote="Fanu"][quote="Radius Solis"](...) "Vlad" (...)[/quote]
Thinking of people pronouncing this like vee-lad made me laugh.[/quote]
I have heard people put a very short schwa between the v and the l... That or the v becomes syllabic. Either way, the <vl> is not pronounced as a consonant cluster...
The stars are an ocean. Your breasts, are also an ocean.
[quote="Radius Solis"]There's also a commercial pickle brand, Vlasic, which, like "Vlad", nobody seems to have any trouble pronouncing. (In fact these two names are my main go-to evidence for when I argue with people about whether there's a difference between a phonology disallowing something and merely having a gap for it - some people conflate these ideas.)[/quote]
It strikes me that proper names can sometimes behave a bit like interjections in terms of their phonotactics. For instance consider the fact that many people pronounce "LaTeX" with an [x] at the end despite having no such sound in their native language.
I agree with you that /vl/ is probably an accidental gap of sorts in English, but I think it's worth making a distinction between:
1. Phonotactically impossible clusters like, say, initial /kb/
2. Sequences of sounds that don't appear natively, but which most people have little trouble with -- /vl/ being a good example of this in English
3. Eminently plausible sequences of sounds that just so happen not to exist as words, like feg or brud (now someone's gonna tell me that both of these do exist in some obscure dialect)
The difference between (2) and (3) would be that while some speakers might pause when asked to read the word vlim, and possibly pronounce it something like [v@"lIm], no native English speaker would bat an eyelid at feg. As far as accidental gaps go, feg is "more accidental" than vlim. It might be best to imagine it as a continuum of phonotactic acceptability.
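The three-way distinction above can be roughed out in code as a coarse, three-step version of that continuum. This is only a sketch under loud assumptions: the mini-lexicon is a hypothetical toy list, letters stand in for sounds, the sound classes are a crude hand-made table, and syllable position is ignored. A word gets 3 if every adjacent pair of segments is attested in the lexicon, 2 if some pair is new but every pair of sound classes is attested, and 1 otherwise.

```python
# Crude, hand-made sound classes (an assumption; letters stand in for sounds).
CLASSES = {
    "stop": "pbtdkg",
    "fricative": "fvsz",
    "nasal": "mn",
    "liquid": "lr",
    "vowel": "aeiou",
}
CLASS_OF = {ch: name for name, chars in CLASSES.items() for ch in chars}
CLASS_OF.update({"^": "^", "$": "$"})  # word boundaries are their own class

# Hypothetical toy lexicon, chosen only to make the illustration work.
LEXICON = ["fed", "beg", "leg", "flog", "blog", "slim",
           "brim", "bud", "rub", "van", "vim", "kid"]


def bigrams(word):
    """Adjacent segment pairs, including the word boundaries."""
    padded = "^" + word + "$"
    return list(zip(padded, padded[1:]))


def class_bigrams(word):
    """The same pairs, with each segment replaced by its sound class."""
    return [(CLASS_OF[a], CLASS_OF[b]) for a, b in bigrams(word)]


SEEN_SEGMENTS = {pair for w in LEXICON for pair in bigrams(w)}
SEEN_CLASSES = {pair for w in LEXICON for pair in class_bigrams(w)}


def acceptability(word):
    """Return 1, 2 or 3, matching the numbered categories in the post."""
    if all(pair in SEEN_SEGMENTS for pair in bigrams(word)):
        return 3  # every sound pair attested: a plain accidental gap
    if all(pair in SEEN_CLASSES for pair in class_bigrams(word)):
        return 2  # new pair, but its class pattern is attested (v+l like f+l)
    return 1      # even the class pattern is unattested (stop + stop)


for w in ["feg", "brud", "vlim", "kbim"]:
    print(w, acceptability(w))
```

On this toy data feg and brud score 3, vlim scores 2 and kbim scores 1. Swapping the hard cut-offs for smoothed probabilities would turn the three steps into a proper continuum.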
I'd actually be tempted to say vee-log for vlog, just because there's a significant probability that it would be mis-heard as "blog" in context.
[quote="Magb"]It strikes me that proper names can sometimes behave a bit like interjections in terms of their phonotactics. For instance consider the fact that many people pronounce "LaTeX" with an [x] at the end despite having no such sound in their native language.[/quote]
Really? Everyone I've met who insists on pronouncing LaTeX the "correct" way says something like [lAtEk]. I don't think I've ever heard a monolingual speaker of AmE who wasn't also a language geek pronounce [x] correctly (i.e. not as [k] or [h]).
It's (broadly) [faɪ.ˈjuw.lɛ]
#define FEMALE
ConlangDictionary 0.3 3/15/14 (ZBB thread)
Quis vult in terra stare,
Cum possit volitare?
I didn't mean to start a debate about the pronunciation of LaTeX. I probably shouldn't have used the phrase "many people", but I have heard people use [x] in it. Apparently the pronunciation with [x] is Donald Knuth's pet project. I should've known.
The LaTeX thing was a bad example anyway. I take it back.