Compounding and the structure of the lexicon
Posted: Tue Apr 10, 2012 1:33 pm
Hi all, as part of my work on Mɛdíṭṣai, I'm trying to understand better the structure of the lexicon in languages where a large proportion of the lexicon is made up of compounds or idiomatic combinations of roots, e.g. the various Chinese dialects, Vietnamese. Just from browsing in dictionaries it's clear that there is structure in many areas, and I wondered if anyone was aware of any major comparative work in this area? Or even an in-depth survey of compounding in an individual language.
Just to take an example, I have a Berlitz Vietnamese Concise Dictionary. I'm not quite sure where I got it; I probably found it in a cheap bookshop at some point. Anyway, it gives "đàn ông" as the translation for "man". Scanning down the dictionary, I notice that the translation it gives for "woman" is "đàn bà". So I guessed that "đàn" might mean "person" or similar, but on looking it up I find that it means "flock, herd, play music". "bà" means "grandmother, lady" and "ông" means "grandfather, gentleman".
So, from this little dictionary exploration:
1. the listed words for "man" and "woman" are not unanalysable, but are compounds with a fairly transparent second element and a first element (đàn) that seems to have no obvious semantic link with the meaning of the compounds, unless something is missing from the dictionary (entirely possible).
2. Even though man and woman are polymorphemic, "grandmother" and "grandfather" are monomorphemic. A number of other family terms also appear to be monomorphemic, e.g. mother, father, ...
So, interesting question number one: is there any kind of markedness in this area? Is it true, for example, that family terms are more likely to be monomorphemic than terms like "man" or "woman" in languages that have lots of compounding? Are there any general rules or common patterns?
Interesting question number two: clearly, some morphemes are more likely to be used in compounding than others. How do the frequencies of different morphemes compare? Do we have a power-law-like pattern, where there's a vast difference in frequency between a few very commonly compounded morphemes and the rest? If so, what are the most common meanings at the top, i.e. in the most frequent category?
Interesting question number three: how many compounds involve elements expressing meanings that they don't appear to have as independent morphemes (maybe "đàn" is like this in Vietnamese?). What are the most common kinds of semantic drift? How does this kind of meaning shift relate to the frequency of use, as mentioned in the preceding question?
And so on. What I'd love is something that goes into a lot of detail around these kinds of questions for languages where compounding is very frequent, as in many of the isolating languages of East Asia. Books or websites are both good. I've tried Googling myself, but if something detailed and free is out there then so far I've failed to find the right combination of magic words.
If not, then I guess I can just wander through dictionaries for a few different languages for different semantic areas (people, family terms, etc).
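For the dictionary-wandering route, a rough sketch of the tally behind question two might look like the following. It is only an illustration: the function names are made up, the word list contains just the two compounds mentioned above, and it assumes each space-separated syllable is one morpheme, which will not always hold.

```python
from collections import Counter


def morpheme_counts(compounds):
    """Count how many compounds each syllable appears in.

    Assumption: every space-separated syllable is treated as one morpheme.
    """
    counts = Counter()
    for compound in compounds:
        # set() so a syllable repeated inside one compound is counted once
        counts.update(set(compound.split()))
    return counts


def rank_frequency(counts):
    """Return (rank, morpheme, count) tuples, most frequent first."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, m, c) for rank, (m, c) in enumerate(ordered, start=1)]


# Only the two compounds from the dictionary example; a real list would
# have to be typed in or extracted from a dictionary by hand.
compounds = ["đàn ông", "đàn bà"]
counts = morpheme_counts(compounds)

for rank, morpheme, count in rank_frequency(counts):
    print(rank, morpheme, count)
```

With a few thousand compounds typed in, plotting count against rank on log-log axes would give a first impression of whether the distribution looks anything like a power law.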
EDIT: Some vaguely useful results from Googling.
"The Oxford Handbook of Compounding" might be useful, although the sections on specific languages only seem to be 10 or 20 pages each, so there probably isn't much detail about each:
http://www.amazon.co.uk/Oxford-Handbook ... ap_title_0
This is interesting from a general point of view but says little about the semantics of compounds, and doesn't contain any data about frequencies of use either:
http://www.lingref.com/cpp/decemb/5/paper1617.pdf
Wikipedia has a short list of common affixes in Vietnamese that might be related to compounding:
http://en.wikipedia.org/wiki/Vietnamese ... Affixation
There's a similar list here:
http://www.vietnamese-grammar.group.she ... 0&LANG=_en
I'll keep Googling and post any good links here as I go.