TaylorS wrote: Non-linguist morons keep pushing the Anatolian Hypothesis.
KathAveara wrote: Surprise, surprise...
Oh God, please shoot me now...
Good fucking grief, why does the AH keep rising from the dead like a zombie? It's always non-linguists who keep reviving it.
The Great Proto-Indo-European Thread
Re: The Great Proto-Indo-European Thread
WeepingElf wrote: I think the Anatolian hypothesis is so fashionable among non-specialists because people these days would prefer being descended from peaceful farmers rather than aggressive warriors.
Matrix wrote: All in all, the IE people were probably aggressive farmers.
The Horse, The Wheel, And Language says the Yamnaya people, the presumed speakers of Late PIE, were mainly herders of cattle, goats, and sheep and did very little farming. In fact the book claims that the transition from Sredny-Stog to Yamnaya, representing the horse being fully domesticated and wheeled carts becoming widespread, led to a general abandonment of agriculture on the Pontic Steppe as pastoralism became a better way of making a living on the steppes.
Re: The Great Proto-Indo-European Thread
finlay wrote: Basically, but cows are only female, technically. Cattle is the only word that strictly refers to the whole species, while you have other words like bulls, oxen and calves for different variations on sex and age.
R.Rusanov wrote: Fucking amazing. In my idiolect of circa a few hours ago cow referred to the species as a whole, and cattle was synonymous with livestock. And I considered myself a *fluent* english speaker :^/
Fuck, maybe my rural upbringing is showing, but there are dialects where "cattle" means any livestock? Do these people think chocolate milk comes from brown cows, too??? O_O
Re: The Great Proto-Indo-European Thread
Terra wrote: Ket has the same oddity (quoting Stefan Georg's grammar, page 128): "[131] Like in Russian (korova), English (cow), or German (Kuh), the female gender is taken as the default representative of the species." It's an odd semantic hole, now that I think about it.
Probably a result of the importance of cows as a source of milk?
Re: The Great Proto-Indo-European Thread
TaylorS wrote: Fuck, maybe my rural upbringing is showing, but there are dialects where "cattle" means any livestock? do these people think chocolate milk comes from brown cows, too??? O_O
Eh, he's not a native speaker I think, so it's more understandable.
And now Sunàqʷa the Sea Lamprey with our weather report:

Re: The Great Proto-Indo-European Thread
TaylorS wrote: probably a result of the importance of cows as a source of milk?
The Ket were nomadic hunter-gatherers until the mid 20th century, when the Russians forced them to settle in collectivized villages. The only domesticated animal they had was the dog. Some groups adopted reindeer herding from neighboring groups (Selkups, Ewenks, etc.). So they probably didn't drink much milk, cow's or any other animal's.
It's probably related to how most herds of cows have only a few bulls, because bulls tend to fight each other.
Re: The Great Proto-Indo-European Thread
I don't think most ranchers even keep bulls around, or at least not many; steers are generally easier to handle, and as far as I know the only disadvantage of gelding a bull is that you can't breed it. Are the Ket words relating to cattle loans? Do they have a word for "steer"?
Re: The Great Proto-Indo-European Thread
Quote: I don't think most ranchers even keep bulls around, or at least not many; steers are generally easier to handle, and as far as I know the only disadvantage of gelding a bull is that you can't breed it. Are the Ket words relating to cattle loans? Do they have a word for "steer"?
(1) I assume that the words are loans, but I don't find any explicit mention that they are. However, the word for "cow" does have a proper Ket-ish shape, so it could be inherited.
(2) I don't find a word for "steer", but I do find one for "castrated reindeer".
Also, fyi, the Ket word for "milk" is "mámul", which is a clear compound of "ma?m" (glottalized tone) (meaning "(female) breast") and "ûl" (high tone) (meaning "water").
- alynnidalar
Re: The Great Proto-Indo-European Thread
Terra wrote: Ket has the same oddity (quoting Stefan Georg's grammar, page 128): "[131] Like in Russian (korova), English (cow), or German (Kuh), the female gender is taken as the default representative of the species." It's an odd semantic hole, now that I think about it.
TaylorS wrote: probably a result of the importance of cows as a source of milk?
And yet the default word for "chicken" isn't "hen", which I've always thought was interesting.
I generally forget to say, so if it's relevant and I don't mention it--I'm from Southern Michigan and speak Inland North American English. Yes, I have the Northern Cities Vowel Shift; no, I don't have the cot-caught merger; and it is called pop.
Re: The Great Proto-Indo-European Thread
finlay wrote: Basically, but cows are only female, technically. Cattle is the only word that strictly refers to the whole species, while you have other words like bulls, oxen and calves for different variations on sex and age.
R.Rusanov wrote: Fucking amazing. In my idiolect of circa a few hours ago cow referred to the species as a whole, and cattle was synonymous with livestock. And I considered myself a *fluent* english speaker :^/
TaylorS wrote: Fuck, maybe my rural upbringing is showing, but there are dialects where "cattle" means any livestock? do these people think chocolate milk comes from brown cows, too??? O_O
Historically, "cattle" could indeed refer to any livestock; Dickens uses the word in this way.
Etymologically, of course, "cattle" is derived from a word meaning "movable property". The semantic blurriness between [plural of cow] and [wealth/property] is pretty well documented in IE (in an attempt to wrench the thread back on track...)
Salmoneus wrote:(NB Dewrad is behaving like an adult - a petty, sarcastic and uncharitable adult, admittedly, but none the less note the infinitely higher quality of flame)
Re: The Great Proto-Indo-European Thread
alynnidalar wrote: And yet the default word for "chicken" isn't "hen", which I've always thought was interesting.
It is in German: Huhn can mean chicken in general or the female, while the male is Hahn, and Henne (cognate to "hen") only means the female.
Re: The Great Proto-Indo-European Thread
alynnidalar wrote: And yet the default word for "chicken" isn't "hen", which I've always thought was interesting.
hwhatting wrote: It is in German - Huhn can mean chicken in Gerneral or the female, while the male is Hahn, and Henne (cognate to "hen") only means the female.
In (Northern) Dutch, the default word is "kip", which is derived from the obsolete verb "kippen", meaning "to hatch", itself from Old French "eschepir" or "esquepir" with the same meaning. It can be used as a collective for both the male ("haan") and female ("hoen") (as "kippen"), and is used as a mass noun (as "kip") when referring to chicken meat. It seems that the semantic shift from hatching > hatchlings > young chicken > grown chicken has occurred in more languages.
JAL
Re: The Great Proto-Indo-European Thread
jal wrote: In (Northern) Dutch, the default word is "kip", which is derived from the obsolete verb "kippen", meaning "to hatch", itself from Old French "eschepir" or "esquepir" with the same meaning. It can be used as a collective for both the male ("haan") and female ("hoen") (as "kippen"), and is used as mass noun (as "kip") when referring to chicken meat. It seems that the semantic shift from hatching > hatchlings > young chicken > grown chicken has occured in more languages.
JAL
Off topic: where did you find the proposed link to Old French? I was looking at etymologiebank but there I only see some other proposed etymologies.
χʁɵn̩
gʁonɛ̃g
gɾɪ̃slɑ̃
Re: The Great Proto-Indo-European Thread
If I read that correctly it means the form is reconstructed based on the Old French words you mentioned, but not derived from them. Anyway, thanks for the link. And I do realise I'm nitpicking right now.
Re: The Great Proto-Indo-European Thread
Grunnen wrote: If I read that correctly it means the form is reconstructed based on the Old French words you mentioned, but not derived from them. Anyway, thanks for the link. And I do realise I'm nitpicking right now.
Yeah, but since Old French and Old Dutch do not share a recent common ancestor, I took it to mean that they're derived from them.
JAL
Re: The Great Proto-Indo-European Thread
Grunnen wrote: If I read that correctly it means the form is reconstructed based on the Old French words you mentioned, but not derived from them. Anyway, thanks for the link. And I do realise I'm nitpicking right now.
jal wrote: Yeah, but since Old French and Old Dutch do not share a recent common ancestor, I took it to mean that they're derived from them.
It's also possible (and the reconstructed form makes that likely) that it's one of those Frankish loans into Old French, so Old French is used as a witness for the unattested Old Frankish ancestor of the Dutch forms.
Re: The Great Proto-Indo-European Thread
hwhatting wrote: It's also possible (and the reconstructed form makes that likely) that it's one of those Frankish loans into Old French, so Old French is used as a witness for the unattested Old Frankish ancestor of the Dutch forms.
Right, I hadn't thought of that, silly me. That indeed seems likely.
JAL
Re: The Great Proto-Indo-European Thread
hwhatting wrote: It's also possible (and the reconstructed form makes that likely) that it's one of those Frankish loans into Old French, so Old French is used as a witness for the unattested Old Frankish ancestor of the Dutch forms.
jal wrote: Right, I hadn't thought of that, silly me. That indeed seems likely.
JAL
I should've been clearer.
Re: The Great Proto-Indo-European Thread
-- Disclaimer: I do not have a degree in linguistics or in language reconstruction, so I may not be aware of the latest techniques. Sorry, this post is way too long. And Happy New Year --
About the Atkinson and Gray methodology and the Anatolian Hypothesis, I wanted to write a post saying that they did a good job with the methods. But the more I look into the details, the more shaky it looks to me.
~~~~ The articles ~~~~
"Language-tree divergence times support the Anatolian theory of Indo-European origin", in Nature (Letters to Nature, 27 November 2003), by Atkinson and Gray.
"How old is the Indo-European language family? Illumination or more moths to the flame?", chapter listed in Gray Lab Publications, 2006.
"An Indoeuropean classification: a lexicostatistical experiment", by Dyen I., Kruskal J.B., Black P., in Transactions of the American Philosophical Society, Volume 82, 1992.
I did not read their 2012 article, because I would have had to register in order to read it.
~~~~ Data ~~~~
The data come from the Indo-European cognacy database. I am not competent to judge the vocabulary used for the different languages. However, I have problems with the Oïl language family!
1. Language choice
They are using Modern French (a koiné of Oïl languages), Walloon (without specifying which dialect), and two Haitian creoles (creole C and D; I had a hard time finding out what C and D mean). Are creoles really legitimate here, since they incorporate a lot of foreign elements? Even Dyen et al. have to "cheat" in order to use them for their method!
They had plenty of Oïl patois they could have used: Gallo, Norman, Jèrriais, Bourguignon, Franc-Comtois, Picard, Poitevin, Angevin... Finding good lexicons is hard, even for a French speaker, but they exist. And when I see the diversity of phonology, I don't know what to expect if I were to build a Swadesh list of Oïl dialects.
And why these two creoles? You have creoles from Louisiana, Martinique, Réunion, Mauritius, French Guiana. I haven't checked these, but I also expect some divergence between them.
By the way, I have found some funny languages in the Greek family, such as Katharevousa Greek, which looks like a conlang to me.
2. Cognacy problem:
I have found some mistakes in the cognate coding. For instance, the creole word for "fear" is clearly a cognate of the French word "peur", but it is not coded as such. So some of the data may be wrong, and you may overestimate the rate of cognate replacement!
With these data, you end up with creoles that branched off 500 years ago and Walloon branching off 400 years ago, which is inconsistent, since Walloon was already a distinct language in the 13th century! Moreover, many of the trees the authors obtain with their method (43%) don't put French with Walloon, but probably with the two creoles.
~~~~ Methodology ~~~~
I do not have problems with the Bayesian and tree methods they used for classification; they are good methods. I have a problem with the way they used them.
1. Missing values
For each meaning of the Swadesh list, they record for each language which cognate sets are present and which are absent. When they don't find a word in the dictionary, they put 0. Doing this is bad from a statistical point of view: coding "this cognate is missing from the data" is not the same thing as "this cognate is not used." You then over-estimate the decay rate, and it becomes more likely that the original branching is placed too far back in the past.
About this, the authors say two opposite things. In 2003, they said that because of this coding, the branching of Hittite and Tocharian from IE was estimated at 10,400 BP when they did not constrain the time period of these languages. But in 2006, they say that recoding it does not change anything. This is, I think, a reason why they are not using Luwian or Gaulish for building the tree. I do not know how MrBayes (the software they used) handles missing data.
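To make the missing-value point concrete, here is a toy sketch in Python. It is not the authors' Bayesian model: it uses the classic glottochronology formula t = ln(c) / (2 ln(r)) as a stand-in, and the word lists, the function names and the retention constant 0.86 are all assumptions made up for illustration. It only shows the direction of the bias when an unknown entry (None) is counted as "cognate absent".

```python
import math

def shared_fraction(lang_a, lang_b, missing_as_absent):
    """Fraction of meanings for which two languages share a cognate class.

    None marks a meaning with no data. If missing_as_absent is True, such a
    meaning is counted as a mismatch (the "put 0" coding); otherwise it is
    simply left out of the comparison.
    """
    shared = 0
    compared = 0
    for a, b in zip(lang_a, lang_b):
        if a is None or b is None:
            if missing_as_absent:
                compared += 1  # counted, and treated as "not cognate"
            continue
        compared += 1
        if a == b:
            shared += 1
    return shared / compared

def naive_split_time(shared, retention=0.86):
    """Toy split time in millennia: t = ln(c) / (2 ln(r)).

    retention is an assumed per-millennium retention rate, for illustration.
    """
    return math.log(shared) / (2 * math.log(retention))

# Hypothetical cognate classes for ten meanings; None = word not found.
lang_a = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
lang_b = ["A", "B", "C", "D", "E", "F", "X", None, None, None]

as_absent = shared_fraction(lang_a, lang_b, missing_as_absent=True)
ignored = shared_fraction(lang_a, lang_b, missing_as_absent=False)

print("shared, missing coded as absent:", as_absent)
print("shared, missing ignored:        ", ignored)
print("toy split time, coded as absent:", naive_split_time(as_absent))
print("toy split time, ignored:        ", naive_split_time(ignored))
```

In this toy case the "coded as absent" variant always gives the lower shared fraction, hence the older split, which is the direction of error described above. How large the effect is in a real Bayesian analysis is a separate question.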
2. Bayesian model
I won't go into the details of the Bayesian model (way too much to write). After computation, they only use 1,000 trees. That's not a lot: usually you average over a few million iterations to get good estimates, and you repeat the process several times to make sure your algorithm gives the same answer each time. Yes, you can be stuck on a wrong value until the end; that's the game with Bayesian methods.
For each branch of these trees, they obtained replacement rates for each cognate (which can be wrong, e.g. for the creoles) and the estimated date of appearance of the branches (which can also be wrong).
The tree presented in the articles is what the majority of the thousand trees display.
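As a rough illustration of what "the majority of the trees display" means, here is a small Python sketch of majority-rule clade support. Each sampled tree is reduced to the set of groupings (clades) it contains, and a clade's support is the share of trees in which it appears. The five toy trees and the helper names are invented for the example; they are not taken from the articles.

```python
from collections import Counter

def fs(*names):
    """Shorthand for a clade: an unordered group of language names."""
    return frozenset(names)

def clade_support(trees):
    """Map each clade to the fraction of sampled trees that contain it."""
    counts = Counter()
    for clades in trees:
        counts.update(set(clades))  # count each clade once per tree
    return {clade: n / len(trees) for clade, n in counts.items()}

def majority_clades(trees, threshold=0.5):
    """Clades found in more than `threshold` of the trees."""
    return {c for c, s in clade_support(trees).items() if s > threshold}

# Five made-up sampled trees over four languages.
trees = [
    {fs("French", "Walloon"), fs("Creole C", "Creole D")},
    {fs("French", "Walloon"), fs("Creole C", "Creole D")},
    {fs("French", "Walloon"), fs("Creole C", "Creole D")},
    {fs("French", "Creole C", "Creole D"), fs("Creole C", "Creole D")},
    {fs("French", "Creole C", "Creole D"), fs("Creole C", "Creole D")},
]

support = clade_support(trees)
for clade, share in sorted(support.items(), key=lambda item: -item[1]):
    print(sorted(clade), share)
```

A clade with support only a little above one half still makes it into the summary tree, so the published tree alone hides how many sampled trees disagree with it.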
~~~~ Results ~~~~
1. About the Indo-European tree:
I do not know what to think of the final tree. Hittite and Tocharian are probably younger than what is shown (9th and 8th millennium BP). Only 40% of the trees separate Armeno-Greek from the rest of IE in the 7th millennium BP. That means that one of them may be poorly classified. Same thing with Albano-Iranian (36%) or Balto-Slavic (44%).
The Indo-Iranian family looks good (everything is 100% sure, which is also kind of weird; we are talking about stats). So this tree really looks shaky.
2. About the Urheimat:
By the way, these stats don't tell you that PIE comes from Anatolia; the authors are just inferring that from the timeline, which is probably wrong.
3. Other questions:
- Is counting the disappearance and replacement of cognates a good method for estimating the age of separation of languages?
- Is the Dyen et al. (1992) IELex a good reference for Indo-European cognates? (I found Wiktionary listed as a source)
~~~~ Conclusions ~~~~
1. Always have a field expert when you are doing stats! Or: never trust your stats blindly! A statistic alone may be misleading if you can't interpret it. And if it makes no sense to the expert, your method has a problem.
2. It would be interesting to test this method on a simulated case. The problem with raw data is that you never know the truth.
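A minimal sketch of what such a simulated test could look like, in Python. This is a toy, not the Bayesian method from the articles: two languages split at a known time, each keeps each word independently with probability retention ** time, and the split time is then estimated back from the shared fraction with t = ln(c) / (2 ln(r)). The retention rate 0.86, the function names and the assumption that replaced words never coincide by chance are all illustrative.

```python
import math
import random

def simulate_shared_fraction(true_time, n_meanings=200, retention=0.86, seed=0):
    """Simulate two languages that split `true_time` millennia ago.

    Each language keeps the ancestral word for a meaning with probability
    retention ** true_time, independently of the other language.
    """
    rng = random.Random(seed)
    keep = retention ** true_time
    shared = 0
    for _ in range(n_meanings):
        a_keeps = rng.random() < keep
        b_keeps = rng.random() < keep
        if a_keeps and b_keeps:
            shared += 1
    return shared / n_meanings

def estimate_time(shared, retention=0.86):
    """Invert the toy model: t = ln(c) / (2 ln(r)), in millennia."""
    return math.log(shared) / (2 * math.log(retention))

true_time = 2.0
for n in (200, 20000):
    c = simulate_shared_fraction(true_time, n_meanings=n)
    print(n, "meanings -> estimated split time:", estimate_time(c))
```

Because the true split time is known here, you can see directly how far the estimate drifts with a short word list, and you could then add miscoded cognates or missing entries to the simulated data to see how each one biases the dates.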
- WeepingElf
Re: The Great Proto-Indo-European Thread
Seconded.
(And welcome to the ZBB!)
...brought to you by the Weeping Elf
Tha cvastam émi cvastam santham amal phelsa. -- Friedrich Schiller
ESTAR-3SG:P human-OBJ only human-OBJ true-OBJ REL-LOC play-3SG:A
Re: The Great Proto-Indo-European Thread
Given that all satem languages exhibit the RUKI sound change, I've been considering that satem is a proper genetic grouping, as opposed to centum, which contains languages that split off before those sound changes. In fact I remember reading this somewhere, but can't recall exactly where.
My point is that if a Proto-Satem language (or language continuum) indeed existed, then centum languages don't need to ever have had palatovelars, which were an innovation of satem languages. Since *e, *i and *y were abundant, this would have caused palatalisation to occur in many places, with paradigmatic leveling spreading them even further. Then the labiovelars delabialised to fill in the gap of velars, which by then had become quite rare. This implies that PIE only had plain velars and labiovelars.
This, of course, raises the question of exactly which environments the palatalisation took place in.
Re: The Great Proto-Indo-European Thread
Most people I talk to agree that the distinction was not a velar-palatovelar one but rather uvular-velar... does that fit your theory?
Slava, čĭstŭ, hrabrostĭ!
- KathTheDragon
Re: The Great Proto-Indo-European Thread
If you go the two-dorsal route, you have problems like the numeral '8', which is of a shape that should inhibit palatalisation.




