Compounding and the structure of the lexicon

chris_notts
Avisaru
Posts: 275
Joined: Wed Dec 15, 2004 9:05 am
Location: Nottingham, England

Compounding and the structure of the lexicon

Post by chris_notts »

Hi all, as part of my work on Mɛdíṭṣai, I'm trying to understand better the structure of the lexicon in languages where a large proportion of the lexicon is made up of compounds or idiomatic combinations of roots, e.g. the various Chinese dialects, Vietnamese. Just from browsing in dictionaries it's clear that there is structure in many areas, and I wondered if anyone was aware of any major comparative work in this area? Or even an in-depth survey of compounding in an individual language.

Just to take an example, I have a Berlitz Vietnamese Concise Dictionary. I'm not quite sure where I got it; I probably found it in a cheap bookshop at some point. But anyway, it gives "đàn ông" as the translation for man. Scanning down the dictionary, I notice that the translation it gives for woman is "đàn bà". So I guess that "đàn" might mean "person" or similar, but when I look it up I find that it means "flock, herd, play music". "bà" means "grandmother, lady" and "ông" means "grandfather, gentleman".

So, from this little dictionary exploration:

1. the listed words for "man" and "woman" are not unanalysable, but are compounds with a fairly transparent second element and a first element (đàn) that seems to have no obvious semantic link with the meaning of the compounds, unless something is missing from the dictionary (entirely possible).

2. Even though man and woman are polymorphemic, "grandmother" and "grandfather" are monomorphemic. A number of other family terms also appear to be monomorphemic, e.g. mother, father, ...

So, interesting question number one: is there any kind of markedness in this area? Is it true, for example, that family terms are more likely to be mono-morphemic than terms like "man" or "woman" in languages that have lots of compounding? Are there any general rules or common patterns?

Interesting question number two: clearly, some morphemes are more likely to be used in compounding than others. How do the frequencies of different morphemes compare? Do we have a power-law-like pattern, where there's a vast difference in frequency between a few very commonly compounded morphemes and the rest? If so, what are the most common meanings at the top, i.e. in the most frequent category?

Interesting question number three: How many compounds involve elements expressing meanings that they don't appear to have as independent morphemes (maybe "đàn" is like this in Vietnamese)? What are the most common kinds of semantic drift? How does this kind of meaning shift relate to the frequency of use, as mentioned in the preceding question?

And so on. What I'd love is something that goes into a lot of detail around these kinds of questions for languages where compounding is very frequent, as in many of the isolating languages of East Asia. Books or websites are both good. I've tried Googling myself, but if something detailed and free is out there then so far I've failed to find the right combination of magic words.

If not, then I guess I can just wander through dictionaries for a few different languages for different semantic areas (people, family terms, etc).
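
For question two in particular, a first pass over a dictionary could be automated once its compounds have been split into morphemes. The Python below is only a rough sketch under that assumption: the function names are invented for illustration, and the sample data is just the "đàn ông" / "đàn bà" pair mentioned above, which is far too little to show anything. It counts how many compounds each morpheme appears in, ranks the morphemes, and fits a slope to log(count) against log(rank). A roughly straight line with a negative slope would be suggestive of a power-law-like pattern, not proof of one.

```python
from collections import Counter
import math


def morpheme_counts(compounds):
    """Count in how many compounds each morpheme occurs."""
    counts = Counter()
    for parts in compounds:
        # set() so a morpheme repeated inside one compound counts once
        counts.update(set(parts))
    return counts


def rank_frequency(counts):
    """Return (rank, morpheme, count) rows, most frequent first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, m, c) for rank, (m, c) in enumerate(ordered, start=1)]


def loglog_slope(rows):
    """Least-squares slope of log(count) against log(rank)."""
    if len(rows) < 2:
        raise ValueError("need at least two ranked morphemes")
    xs = [math.log(rank) for rank, _, _ in rows]
    ys = [math.log(count) for _, _, count in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Sample data: only the two compounds quoted in this post.
# A real run would need thousands of segmented dictionary entries.
compounds = [("đàn", "ông"), ("đàn", "bà")]

rows = rank_frequency(morpheme_counts(compounds))
for rank, morpheme, count in rows:
    print(rank, morpheme, count)
print("log-log slope:", loglog_slope(rows))
```

Counting each morpheme once per compound (type frequency) is a design choice; weighting by how often each compound is actually used in text would need corpus counts, which a dictionary alone does not provide.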

EDIT: Some vaguely useful results from Googling.

Maybe "The Oxford Handbook of Compounding" might be useful, although the sections on specific languages only seem to be 10 or 20 pages each, so probably there isn't too much detail about each:

http://www.amazon.co.uk/Oxford-Handbook ... ap_title_0

This is interesting from a general point of view but says little about the semantics of compounds, and doesn't contain any data about frequencies of use either:

http://www.lingref.com/cpp/decemb/5/paper1617.pdf

Wikipedia has a short list of common affixes in Vietnamese that might be related to compounding:

http://en.wikipedia.org/wiki/Vietnamese ... Affixation

There's a similar list here:

http://www.vietnamese-grammar.group.she ... 0&LANG=_en

I'll keep Googling and post any good links here as I go.
Try the online version of the HaSC sound change applier: http://chrisdb.dyndns-at-home.com/HaSC

chris_notts
Avisaru
Posts: 275
Joined: Wed Dec 15, 2004 9:05 am
Location: Nottingham, England

Re: Compounding and the structure of the lexicon

Post by chris_notts »

I found this:

http://www.google.co.uk/url?sa=t&rct=j& ... _Q&cad=rja

It's 1116 pages about compounding. I think it's a pdf copy of the Oxford book I mentioned above.

chris_notts
Avisaru
Posts: 275
Joined: Wed Dec 15, 2004 9:05 am
Location: Nottingham, England

Re: Compounding and the structure of the lexicon

Post by chris_notts »

So far I've found nothing that great apart from the Oxford book, so I think I'll give up looking for now unless anyone else has anything.

clawgrip
Smeric
Posts: 1723
Joined: Wed Feb 29, 2012 8:21 am
Location: Tokyo

Re: Compounding and the structure of the lexicon

Post by clawgrip »

The book is certainly interesting. Thanks for finding it.

Japanese has an interesting situation regarding family terms. They appear monomorphemic, except not. Several are reduplicated, i.e. mother haha, father chichi, grandmother baba, grandfather jiji. Sibling words also appear to be polymorphemic, but whatever morphemes they incorporate seem to have disappeared in modern Japanese: cf. older sister ane, older brother ani, younger sister imōto, younger brother otōto. 'Wife' is the monomorphemic tsuma, and while 'husband', danna, is written with two kanji (旦那), the kanji have nothing to do with 'husband' and I imagine they are just ateji (characters used phonetically, regardless of meaning) and that danna is indeed monomorphemic as well.

The words for man and woman are also both monomorphemic: otoko and onna respectively.

All of these words are the most basic, plain forms. Polite versions are all polymorphemic.

chris_notts
Avisaru
Posts: 275
Joined: Wed Dec 15, 2004 9:05 am
Location: Nottingham, England

Re: Compounding and the structure of the lexicon

Post by chris_notts »

clawgrip wrote:The book is certainly interesting. Thanks for finding it.

Japanese has an interesting situation regarding family terms. They appear monomorphemic, except not. Several are reduplicated, i.e. mother haha, father chichi, grandmother baba, grandfather jiji.
Reduplication for kin-terms is quite interesting. Is this related to any other productive function of reduplication in Japanese?
Sibling words also appear to be polymorphemic, but whatever morphemes they incorporate seem to have disappeared in modern Japanese: cf. older sister ane, older brother ani, younger sister imōto, younger brother otōto.
Another possibility is that they were changed to look more like each other by analogy. This often happens with numbers, I think, because people tend to say them in order. They spot near-patterns and then regularise around them.
The words for man and woman are also both monomorphemic: otoko and onna respectively.
I forgot to mention Basque emakume, "woman". This is almost certainly an old compound, I think, since kume and ume both mean "child", and eme means "female", along with, according to the dictionary, "soft" and "smooth" (for some reason I find that quite funny). So I guess it was one of the following:

female child -> girl -> woman

or

woman-child (coordinative compound) -> woman (because women were often with children?)

linguoboy
Sanno
Posts: 3681
Joined: Tue Sep 17, 2002 9:00 am
Location: Rogers Park/Evanston

Re: Compounding and the structure of the lexicon

Post by linguoboy »

clawgrip wrote:Sibling words also appear to be polymorphemic, but whatever morphemes they incorporate seem to have disappeared in modern Japanese: cf. older sister ane, older brother ani, younger sister imōto, younger brother otōto.
S.E. Martin (I have his The Japanese Language Through Time) seems to think that the feminine forms are derived from the masculine by the addition of the morpheme *mina (contracted to me in the names of animals, e.g. mehitsuji "ewe").

Bob Johnson
Avisaru
Posts: 704
Joined: Fri Dec 03, 2010 9:41 am
Location: NY, USA

Re: Compounding and the structure of the lexicon

Post by Bob Johnson »

chris_notts wrote:Reduplication for kin-terms is quite interesting. Is this related to any other productive function of reduplication in Japanese?
Not in Modern; cf. <tori> "bird" <toridori> "lots of birds"; <toki> "time" <tokidoki> "from time to time"

I seem to recall reading that <otoko> "man" derived in part from the ancestor of <hito> "person" but maybe I'm thinking of <otouto> "little brother" instead.

spats
Lebom
Posts: 129
Joined: Wed Mar 21, 2007 10:43 pm
Location: Virginia, U.S.A

Re: Compounding and the structure of the lexicon

Post by spats »

chris_notts wrote: I forgot to mention Basque emakume, "woman". This is almost certainly an old compound, I think, since kume and ume both mean "child", and eme means "female", along with, according to the dictionary, "soft" and "smooth" (for some reason I find that quite funny). So I guess it was one of the following:

female child -> girl -> woman

or

woman-child (coordinative compound) -> woman (because women were often with children?)
It's probably simpler than that.

Consider "girl", which used to mean "child", then came to mean female child, then female non-adult, and is now used informally (at least in U.S. English) to refer to adult women.

I think there is an (unfortunate, quite pathological) tendency across cultures to infantilize women and that shows up in the language.

Though I can think of another explanation:

At least in the American English case, "woman" and "man" are often seen as too formal to use in casual language, so you get stuff like "girl" and "gal" and "chick" - but also "guy" and "dude". I think this is an effect of the next generation growing up and not seeing themselves (or not wanting to be) as "as old as" the previous one, so while their parents were "men and women", they are instead "guys and girls". This would especially be true in a culture where age is not venerated and which doesn't make as big of a deal about the transition to adulthood (which definitely describes modern America).

I think the first explanation is more relevant in the general case; we've been coming up with diminutive words for "woman" long before youth became king.

Bob Johnson
Avisaru
Posts: 704
Joined: Fri Dec 03, 2010 9:41 am
Location: NY, USA

Re: Compounding and the structure of the lexicon

Post by Bob Johnson »

spats wrote:I think the first explanation is more relevant in the general case; we've been coming up with diminutive words for "woman" long before youth became king.
How long ago was that exactly? A younger woman can bear more children; this isn't just some Gen X thing.

Semi-related: I recall reading that the -ko suffix in modern Japanese girl names (.. well, it's going out of style now) originated as a suffix like <san>; it means "child" or "small thing" (though the latter is less productive now).

linguoboy
Sanno
Posts: 3681
Joined: Tue Sep 17, 2002 9:00 am
Location: Rogers Park/Evanston

Re: Compounding and the structure of the lexicon

Post by linguoboy »

Bob Johnson wrote:Semi-related: I recall reading that the -ko suffix in modern Japanese girl names (.. well, it's going out of style now) originated as a suffix like <san>; it means "child" or "small thing" (though the latter is less productive now).
It's actually been out of style for some time now. The only -ko name in the top ten any more is Riko.

In older Japanese literature (i.e. Taishō and earlier), it seems common for both upper- and lower-class women to have names with the same basic root differentiated with affixes. So for instance 春 haru "spring" with the suffix 子 -ko is a lady's name; with the prefix 阿 o-, it's the name of her maid.

Hakaku
Lebom
Posts: 132
Joined: Sat Feb 03, 2007 12:55 pm
Location: 常世

Re: Compounding and the structure of the lexicon

Post by Hakaku »

sister imōto, younger brother otōto.
These two stem from the roots imo 'woman (who is close)' and oto 'man' followed by a fusion of hito "person", which had shifted to futo and then (w)uto. The word otoko also features the root oto-, meaning it's not monomorphemic, but bimorphemic. The second morpheme simply being the diminutive -ko (cf. onna < onnago < wominago "woman"). It's likely even trimorphemic if you consider the initial o- to be an honorific prefix.
Chances are it's Ryukyuan (Resources).

spats
Lebom
Posts: 129
Joined: Wed Mar 21, 2007 10:43 pm
Location: Virginia, U.S.A

Re: Compounding and the structure of the lexicon

Post by spats »

linguoboy wrote:
Bob Johnson wrote:Semi-related: I recall reading that the -ko suffix in modern Japanese girl names (.. well, it's going out of style now) originated as a suffix like <san>; it means "child" or "small thing" (though the latter is less productive now).
It's actually been out of style for some time now. The only -ko name in the top ten any more is Riko.
What are the top ten? Is there a particular affix or compound which is more popular now, or has the whole thing started to go out of style?

Japanese always struck me as a little odd in that despite having no grammatical gender, it had a lot of names that explicitly marked gender. Is that a common pattern in the world's languages, or just something cultural?

linguoboy
Sanno
Posts: 3681
Joined: Tue Sep 17, 2002 9:00 am
Location: Rogers Park/Evanston

Re: Compounding and the structure of the lexicon

Post by linguoboy »

spats wrote:What are the top ten? Is there a particular affix or compound which is more popular now, or has the whole thing started to go out of style?
According to this source, they are:
  1. Yua
  2. Yui
  3. Aoi
  4. Hina
  5. Riko
  6. Rin
  7. Sakura
  8. Yuna
  9. Miu
  10. Misaki
I remember hearing that the suffix -mi (美) "beauty" was formerly trendy but it seems less common these days.

Bob Johnson
Avisaru
Posts: 704
Joined: Fri Dec 03, 2010 9:41 am
Location: NY, USA

Re: Compounding and the structure of the lexicon

Post by Bob Johnson »

spats wrote:Japanese always struck me as a little odd in that despite having no grammatical gender, it had a lot of names that explicitly marked gender. Is that a common pattern in the world's languages, or just something cultural?
It has semantic gender, and that's part of the cause: 美しい <utsukushii> is "beautiful (as a woman is beautiful)", and so you don't get its kanji 美 <mi> in boys' names. You can get 春 <haru> "spring" or 桜 <sakura> "cherry blossom" in boys' names, but they're both fairly girly. (Sakura is #7 in that chart for girls). I knew a Chinese guy with given name 昴 (Ang, forgot which tone) "the Pleiades", which in Japanese is <subaru> and pretty girly in either. Meanwhile 太 <ta> "thick; fat" will show up more in boys' names (#3 and #8); likewise 大 (multiple readings) "large" (#2 and #9).
linguoboy wrote:Yua
what... that doesn't sound like a name, but 結愛 does look like one at least... though I want to read it as <yuai>. And it was 4th in 2009. Weird.

linguoboy
Sanno
Posts: 3681
Joined: Tue Sep 17, 2002 9:00 am
Location: Rogers Park/Evanston

Re: Compounding and the structure of the lexicon

Post by linguoboy »

Bob Johnson wrote:
linguoboy wrote:Yua
what... that doesn't sound like a name, but 結愛 does look like one at least... though I want to read it as <yuai>. And it was 4th in 2009. Weird.
Yeah, I was thrown by that one, too.

As you might expect in such a large and diverse country, the Chinese vary in their perceptions of "girliness". I have a Hakka friend from South China who says Northerners often mistakenly think he's female on account of one of the characters in his given name. (Can't remember what it is at the moment.)

clawgrip
Smeric
Posts: 1723
Joined: Wed Feb 29, 2012 8:21 am
Location: Tokyo

Re: Compounding and the structure of the lexicon

Post by clawgrip »

linguoboy wrote:
Bob Johnson wrote:Semi-related: I recall reading that the -ko suffix in modern Japanese girl names (.. well, it's going out of style now) originated as a suffix like <san>; it means "child" or "small thing" (though the latter is less productive now).
It's actually been out of style for some time now. The only -ko name in the top ten any more is Riko.

In older Japanese literature (i.e. Taishō and earlier), it seems common for both upper- and lower-class women to have names with the same basic root differentiated with affixes. So for instance 春 haru "spring" with the suffix 子 -ko is a lady's name; with the prefix 阿 o-, it's the name of her maid.
Judging from my experience, 子 seems to have been in its death throes somewhere in the 80s, though it does still appear from time to time even in younger people (since some parents still favour tradition over trend).
Hakaku wrote:
sister imōto, younger brother otōto.
These two stem from the roots imo 'woman (who is close)' and oto 'man' followed by a fusion of hito "person", which had shifted to futo and then (w)uto. The word otoko also features the root oto-, meaning it's not monomorphemic, but bimorphemic. The second morpheme simply being the diminutive -ko (cf. onna < onnago < wominago "woman"). It's likely even trimorphemic if you consider the initial o- to be an honorific prefix.
Very interesting...I should have picked up on the oto- similarity.
Bob Johnson wrote:Not in Modern; cf. <tori> "bird" <toridori> "lots of birds"; <toki> "time" <tokidoki> "from time to time"
I've never heard toridori before. Hitobito, kuniguni, sorezore, iroiro, samazama are a few other common reduplicated plurals or plural-like words. Reduplication also plays a major part in mimetic terms and onomatopoeia, making it still productive in modern Japanese, e.g. dandan, jojo, masumasu, gizagiza, nyokinyoki, gokugoku, hāhā, gyangyan, etc., though admittedly they only occur in reduplicated form, never in single form.
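
The surface pattern in the plural-like forms (tori > toridori, hito > hitobito, kuni > kuniguni, sore > sorezore) can be mimicked mechanically: copy the base and voice the first consonant of the copy. The Python below is a minimal sketch of that pattern only, working on romanised input. The table and function name are invented for illustration, and it makes no attempt to model when voicing actually applies in real words. It would also give the wrong result for the kin terms (haha, chichi) and for mimetics like gokugoku, which are plain copies.

```python
# Romanised initial consonant -> voiced counterpart in the second copy.
# Digraphs come first so "sh", "ch" and "ts" are tried before "s" and "t".
VOICING = [
    ("sh", "j"),
    ("ch", "j"),
    ("ts", "z"),
    ("k", "g"),
    ("s", "z"),
    ("t", "d"),
    ("h", "b"),
    ("f", "b"),
]


def reduplicate(base: str) -> str:
    """Return base + copy of base, voicing the copy's initial consonant."""
    for plain, voiced in VOICING:
        if base.startswith(plain):
            return base + voiced + base[len(plain):]
    # Vowel-initial or already voiced bases are copied unchanged.
    return base + base


for word in ["tori", "toki", "hito", "kuni", "sore", "sama", "iro"]:
    print(word, "->", reduplicate(word))
```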
Bob Johnson wrote:
spats wrote:Japanese always struck me as a little odd in that despite having no grammatical gender, it had a lot of names that explicitly marked gender. Is that a common pattern in the world's languages, or just something cultural?
It has semantic gender, and that's part of the cause: 美しい <utsukushii> is "beautiful (as a woman is beautiful)", and so you don't get its kanji 美 <mi> in boys' names. You can get 春 <haru> "spring" or 桜 <sakura> "cherry blossom" in boys' names, but they're both fairly girly. (Sakura is #7 in that chart for girls). I knew a Chinese guy with given name 昴 (Ang, forgot which tone) "the Pleiades", which in Japanese is <subaru> and pretty girly in either. Meanwhile 太 <ta> "thick; fat" will show up more in boys' names (#3 and #8); likewise 大 (multiple readings) "large" (#2 and #9).
linguoboy wrote:Yua
what... that doesn't sound like a name, but 結愛 does look like one at least... though I want to read it as <yuai>. And it was 4th in 2009. Weird.
美 does appear in male names sometimes...My brother-in-law, for one, has it in his name, and there are many other male names that use it as well.

As for 結愛 and so on, there seems to be a trend these days in using kanji creatively, purposely bending the acceptable limits of kanji pronunciation (dropping the i from 愛 ai in Yua, for example). These days there are lots of kids out there with weird names that are almost impossible to read without asking the pronunciation.

spats
Lebom
Posts: 129
Joined: Wed Mar 21, 2007 10:43 pm
Location: Virginia, U.S.A

Re: Compounding and the structure of the lexicon

Post by spats »

clawgrip wrote:As for 結愛 and so on, there seems to be a trend these days in using kanji creatively, purposely bending the acceptable limits of kanji pronunciation (dropping the i from 愛 ai in Yua, for example). These days there are lots of kids out there with weird names that are almost impossible to read without asking the pronunciation.
I wonder if the choice of -a rather than -ai was Western-influenced? Similarly, is the pattern of novel AAVE names starting with De-/Da-/Le-/La- Romance-influenced? (I am aware that Mormons/Utahans do this too, so it is not an isolated phenomenon.)

How many cultures at how many times, when picking names, just decide to make shit up? It seems that names typically come from one of a few places:
1. The name of an existing person (living or dead)/the pool of existing names in general.
2. Random words or compounds of words, possibly with a masculinizing or feminizing affix.
3. Random sounds that are pleasant or evocative and have no specific meaning.
4. ??? (is there a fourth?)

English mostly did (1) for a very long time, and most cultures do it to some extent. Hebrew and Arabic seem to strongly prefer (1). I was under the impression that Chinese and Japanese mostly did (2). How frequent is (3)? Is there another option?

linguoboy
Sanno
Posts: 3681
Joined: Tue Sep 17, 2002 9:00 am
Location: Rogers Park/Evanston

Re: Compounding and the structure of the lexicon

Post by linguoboy »

spats wrote:How many cultures at how many times, when picking names, just decide to make shit up? It seems that names typically come from one of a few places:
1. The name of an existing person (living or dead)/the pool of existing names in general.
2. Random words or compounds of words, possibly with a masculinizing or feminizing affix.
3. Random sounds that are pleasant or evocative and have no specific meaning.
4. ??? (is there a fourth?)

English mostly did (1) for a very long time, and most cultures do it to some extent. Hebrew and Arabic seem to strongly prefer (1). I was under the impression that Chinese and Japanese mostly did (2). How frequent is (3)? Is there another option?
This is a subject really deserving of a thread of its own. Traditionally, the Chinese (and some cultures they influenced heavily, like Korea) did something which doesn't really fit into any of your categories. A clan would have a "generational character" (sometimes decided by consensus, sometimes dictated by a poem, so hardly "random") which would form the first element in all given names. The second character would be chosen to fit well with the first and (in families where the same generational character was used for both sexes) to give some indication of sex. So, for instance, Mao Zedong (毛澤東) had as his surviving siblings Mao Zemin (毛澤民), Mao Zetan (毛澤覃), and an adopted sister Mao Zejian (毛澤建). His sons' names all contained the character 岸, whereas his daughters had one-character names.
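
Going the other way, the generational character can be recovered from a set of sibling names by looking for a position where every given name has the same character. The Python below is a small sketch of that check, using only the four given names quoted above with the surname 毛 stripped. The function name is made up for illustration, and it assumes the generational character sits at the same position in every name.

```python
def shared_characters(given_names):
    """Return {position: character} for positions where all names agree."""
    if not given_names:
        return {}
    first = given_names[0]
    shortest = min(len(name) for name in given_names)
    return {
        i: first[i]
        for i in range(shortest)
        if all(name[i] == first[i] for name in given_names)
    }


# Given names of Mao Zedong and the three siblings listed above.
siblings = ["澤東", "澤民", "澤覃", "澤建"]
print(shared_characters(siblings))  # {0: '澤'}
```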

Ran
Lebom
Posts: 145
Joined: Fri Sep 13, 2002 9:37 pm
Location: Winterfell / Lannisport / Highgarden

Re: Compounding and the structure of the lexicon

Post by Ran »

chris_notts wrote: I'll keep Googling and post any good links here as I go.
Here's something I found that you might find useful:

http://books.google.com/books?id=lUIj3O ... navlinks_s

You can also use search terms like "Chinese word formation" and "Chinese morphology" if you haven't tried those.
Winter is coming

clawgrip
Smeric
Posts: 1723
Joined: Wed Feb 29, 2012 8:21 am
Location: Tokyo

Re: Compounding and the structure of the lexicon

Post by clawgrip »

spats wrote:
clawgrip wrote:As for 結愛 and so on, there seems to be a trend these days in using kanji creatively, purposely bending the acceptable limits of kanji pronunciation (dropping the i from 愛 ai in Yua, for example). These days there are lots of kids out there with weird names that are almost impossible to read without asking the pronunciation.
I wonder if the choice of -a rather than -ai was Western-influenced?
I don't really see how it would be western-influenced. Actually 結愛 is probably not the best example because I think 愛 already has a somewhat established tradition of being pronounced a, but I would imagine people choose it because 愛 is cuter than say, 歩 (結歩 is one of many other possibilities for Yua).

In Japan people usually choose their children's names in one of two ways, either by deciding the name and then choosing kanji for it, or choosing a kanji and then choosing a name that uses it. The total stroke count is important for some people, and not for others, so that influences kanji choice as well. It is not usually based on some tradition, though it is not uncommon for people to have a common kanji in each of their children's names (e.g. 哲志 Satoshi, 志音 Shion, 正志 Masashi).

hwhatting
Smeric
Posts: 2315
Joined: Fri Sep 13, 2002 2:49 am
Location: Bonn, Germany

Re: Compounding and the structure of the lexicon

Post by hwhatting »

spats wrote:1. The name of an existing person (living or dead)/the pool of existing names in general.
2. Random words or compounds of words, possibly with a masculinizing or feminizing affix.
3. Random sounds that are pleasant or evocative and have no specific meaning.
4. ??? (is there a fourth?)
Method 2) is AFAIK the most widespread method historically; e.g. in Europe method 1) won out only several centuries after Christianisation. "Random words" is probably not the best way to describe it: these names seem to have been motivated by various things (wishes for the child's character or future, attributes of the child, circumstances of birth etc.). The process has been studied for compound names in older IE languages. Quite often one or both elements of the compound name are taken from names of relatives, and e.g. in Germanic you find alliterative name pairs like Hadubrant son of Hiltibrant.
Method 3) seems to have been in use in ancient Anatolia, where names of a structure (C)(V)C(C)V like Atta, Nana, etc. abound that mostly don't have a meaning in the attested Anatolian languages - although it's of course possible that they are inherited from a displaced non-IE language where they had meaning. And as long as we don't know the genealogy and social circle of a person or details of the naming process, we can never be sure whether a name that looks like 2) or 3) isn't actually formed by 1).

chris_notts
Avisaru
Posts: 275
Joined: Wed Dec 15, 2004 9:05 am
Location: Nottingham, England

Re: Compounding and the structure of the lexicon

Post by chris_notts »

I'll have to see if I can order a copy from the public library or something, since it's quite an expensive book to buy and obviously I can't read the whole thing on Google Books...
