8 - How Does Polysynthesis Arise?
Alright! So, we've covered a number of the basic traits that polysynthetic languages tend to have, which should hopefully help with creating your own! But, I suspect many of you are considering developing a polysynthetic conlang out of a non-poly language or conlang that you already have (or perhaps as a future descendant of some non-poly natlang). A natural question, then, is what are some pathways by which polysynthesis can develop? I'm not going to try to cover all the possibilities here, just a few of them (I'm not even sure how much research has been done in this area, to be honest), in roughly the same order as the various traits of polysynthetic languages were discussed in previous sections.
8.1 - Sidebar: Grammaticalization
Before getting into the discussion, though, it's important to know the basics of what grammaticalization is. Grammaticalization is the evolution of formerly independent words into grammatical markers -- often clitics or affixes. Roughly speaking, clitics are phonologically bound to a host word, but behave syntactically as though they were independent. For example, the English possessive 's is a clitic: it is phonologically part of the preceding word, but attaches to an entire noun phrase rather than necessarily to the possessor noun, as in [the Queen of England NP]='s crown (where the actual possessor is the Queen, not England). Clitics are normally set off with equals signs "=" in glosses. Several things tend to happen in the process of grammaticalization: there is often irregular phonological reduction/erosion of the word/clitic/affix, and semantic bleaching, where much of the specific meaning of the original word is lost as it comes to indicate broader grammatical relationships instead.
I'll give a classic example from English to demonstrate the basic ideas involved. Originally, the expression "going to" only had one meaning: the literal sense of "being in motion toward a goal". So, "I'm going to meet with him" meant quite literally, "I'm on my way to meet him". It's easy to see, though, the connection this sense has with an intentive/future sense: if I'm on my way to meet with someone, presumably I will actually be meeting him soon in the future. So "going to" began to be grammaticalized into a marker of future tense, and now the normal interpretation of "I'm going to meet him" is the same as "I will meet him." We can see in this process semantic bleaching: "going to" no longer has a full lexical meaning in this use, but rather expresses the grammatical category of future tense. We can also see phonological reduction: in normal speech, "going to" in its future sense is rarely pronounced as two full words, but rather as something like [ɡə̆nə] or just [nə], and usually cliticizes to the preceding word. In normal speech, I'd say "I'm'na meet with him" (for me, something like [aɪmnə miʔ wɪθɨm], aɪ=m=nə miʔ wɪθ=ɨm, 1sg.SUBJ=AUX:1sg=FUT meet with=3sg.MASC.OBJ). Similar processes of grammaticalization have operated many times in English (for instance the other common future form, the clitic ='ll, comes from "will", originally meaning "want to", a meaning still reflected in the corresponding noun "will") and in languages throughout the world. It is grammaticalization that can help to create new affixes, and thus greater synthesis, as we will see in the sections below.
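If you want to track this kind of erosion in your own conlang's history, the reduction of "going to" can be written down as an ordered list of changes. The following is only a rough sketch in Python: `apply_changes` is a made-up helper, and the intermediate stages are simplified assumptions for illustration, not an attested chronology of English.

```python
def apply_changes(form, changes):
    """Apply ordered (old, new) substring replacements, recording every stage."""
    stages = [form]
    for old, new in changes:
        form = form.replace(old, new)
        stages.append(form)
    return stages


# Hypothetical, simplified reduction steps for "going to" (illustration only).
GOING_TO = [
    ("ɡoʊɪŋ tu", "ɡoʊɪŋtə"),  # the two words fuse and the final vowel weakens
    ("ɡoʊɪŋtə", "ɡənə"),      # irregular reduction to something like "gonna"
    ("ɡənə", "nə"),           # further erosion to a clitic-sized form
]

if __name__ == "__main__":
    print(" > ".join(apply_changes("ɡoʊɪŋ tu", GOING_TO)))
```

Because each replacement targets the whole phrase rather than a single sound, this models the irregular, word-specific reduction typical of grammaticalization, not regular sound change across the lexicon.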
8.2 - Polypersonal Marking
The pathway by which polypersonal marking on verbs develops is quite straightforward: through grammaticalization, independent pronouns become cliticized to the verb root and eventually become inseparable affixes, often with some phonological reduction from their earlier form.
Various Romance languages actually offer good examples of this. Most Romance languages continue the Latin system of marking the subject with an inflection on the verb (recall the Spanish examples above: hablo, "I speak"; hablas, "you speak", etc.). However, Romance languages also mark objects on the verb, using pronominal clitics, though the placement of these clitics varies from language to language and depending on the exact situation. Some examples (again from Spanish) can help demonstrate this:
- no me lo digas, "don't tell me that" (more accurately: no me=lo=dig-as, don't to.me=it.OBJ=say:SUBJUNCTIVE-you.SUBJ). Here, the subject is marked with a verb suffix (-as), as usual in Romance, and both the indirect object (me=, "to me") and the direct object (lo=, "it, that") are indicated with clitics that are attached to the beginning of the verb.
- ayer la vi, "I saw her yesterday" (more accurately: ayer la=vi, yesterday her=I:saw). Here again, the verb is inflected to mark the subject, and the direct object (la=, "her") is marked with a clitic preposed to the verb.
- dámelo, "give it to me" (more accurately: dá=me=lo, give:2sg.SUBJ.IMPERATIVE=to.me=it.OBJ). Once again the verb inflects to mark its subject, and the object(s) are marked with clitics, though in this case they follow the verb (and by Spanish spelling convention, are written as one word).
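The segmentation convention used in these examples ("=" for a clitic boundary, "-" for an affix boundary) is simple enough to process mechanically, which can be handy if you keep your conlang's examples in a file. Here is a minimal sketch in Python; `split_morphemes` and `count_clitics` are hypothetical helper names, not part of any standard glossing tool.

```python
import re


def split_morphemes(word):
    """Split a segmented word into (morpheme, boundary) pairs.

    The boundary is the marker that follows the morpheme: "=" for a clitic
    boundary, "-" for an affix boundary, and "" at the end of the word.
    """
    # With a capture group, re.split alternates morphemes and boundary markers.
    parts = re.split(r"([=-])", word)
    morphemes = parts[0::2]
    boundaries = parts[1::2] + [""]
    return list(zip(morphemes, boundaries))


def count_clitics(word):
    """Each clitic adds exactly one "=" boundary, whichever side of the host it is on."""
    return sum(1 for _, boundary in split_morphemes(word) if boundary == "=")


if __name__ == "__main__":
    for example in ("me=lo=dig-as", "la=vi", "dá=me=lo"):
        print(example, split_morphemes(example), count_clitics(example))
```

Note that the sketch only counts boundaries. Deciding which morpheme is the host (dig- in the first example, dá in the last) still takes knowledge of the language.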
In fact, in a number of Spanish dialects these clitics are well on the way to being obligatory person markers, in that they often cooccur with a coreferent full NP, as in colloquial San Salvadoran Spanish: [NOTE 1]
- ya los leí los libros.
"I already read those books" (or more literally, "I already read-those those books")
(ya los=leí los libros, already them.MASC=I:read the.MASC.PL books)
Note that these object pronouns are reduced versions of the full pronouns of Latin: thus, lo for example is from Latin illum, "that", which has irregularly lost its initial syllable in the process of grammaticalization into an unstressed clitic. Note also that unlike in Latin, these Romance object clitic pronouns cannot freely occur in many different positions. Instead, there are a limited number of places within the clause where they can occur (generally either directly before or directly following the verb): this is another indication that they have become grammaticalized and are no longer completely independent pronouns, but rather are partly on the way to becoming verbal affixes marking person. Many Romance languages, thus, are a good demonstration of the beginning stages of the creation of polypersonal marking.
In fact, one Romance language is well known for having already developed true polypersonal marking, and is sometimes called "polysynthetic": French. In its evolution from Latin, French has undergone a number of phonological reductions, which have ultimately resulted in the French verb no longer effectively inflecting to mark the person of its subject, as other Romance languages are capable of (see here for an overview of some of these changes). As a result, French makes much more frequent use of personal pronouns than other Romance languages -- but in unmarked contexts these "pronouns", in fact, have become fused to the verb, both phonologically and morphosyntactically. Though they are still (sometimes) written as separate pronouns in the standard orthography, there are arguments for considering them true verbal affixes marking person.
Take the French sentence, je vais le lui donner, "I'm going to give it to him/her". Though written as several separate words, this is phonologically a single word, [ʒvɛləlɥidɔne]. There are also syntactic criteria for considering this a single word, though I'm not going to get into them here -- if you're curious, see this PDF for a number of arguments in favor of considering sentences like this as a single grammatical word in French. [NOTE 2] The point here, though, is not to prove whether spoken French should be considered "polysynthetic"; the point is that this provides an example of how polypersonal marking could arise, whether French's pronominal verbal markers are clitics or true affixes.
8.3 - Noun Incorporation
Marianne Mithun, in the same article proposing a typology of NI (which I used as a significant source in the section on noun incorporation above), discusses pathways by which the process of NI can develop. At its most basic level, of course, NI is simply a compound composed of a noun root + a verb root, and thus the development of NI is in many ways as simple as a language coming to permit noun-verb compounds as a productive process. Nonetheless, I'll note here a few ways in which some of the other common characteristics of NI can develop, and some examples of languages at various stages of developing productive NI.
Firstly, Mithun notes a common tendency, in many languages, for verbs "to coalesce with indefinite direct objects," and provides several Hungarian examples, "in which the referentiality and definiteness of the object affect the form of the predicate":
- Péter olvassa az újságot = "Peter is reading the newspaper"
(Peter reads-OBJ the newspaper)
(object is both referential and definite; object follows the verb and the verb is marked with the definite transitivity suffix -sa)
- Péter olvas egy újságot = "Peter is reading a [specific] newspaper"
(Peter reads a newspaper)
(object is referential but indefinite; no definite transitivity suffix appears on the verb)
- Péter újságot olvas = "Peter is reading a newspaper, Peter is newspaper-reading"
(Peter newspaper reads)
(object is nonreferential and indefinite; object precedes the verb and verb shows no definite transitivity suffix)
It's quite easy to see that we're well on the way here to true noun incorporation. When the object of a verb is not a clear, definite, referential object, but is rather indefinite and nonreferential (and thus serving more as a modifier of the verb, rather than an independent participant), it is more closely connected syntactically with the verb, which itself now lacks transitive marking. All that is needed to develop full NI is for such constructions to become lexicalized and to cease being merely a marker of definiteness.
Lahu, a Tibeto-Burman language, takes this a bit further. While the noun and verb remain distinct phonological words, in instances of "incorporation", the two are more closely tied syntactically, and have a difference in meaning from unincorporated examples. For instance, compare the following two examples:
- jɨ̀ thà’ dɔ̀ = "to drink (the) liquor" (="to drink the liquor in question, e.g., as opposed to something else")
(liquor OBJ drink)
- jɨ̀ dɔ̀ = "to drink liquor" (="to drink liquor in general")
(liquor drink)
Again, these remain two separate words, but we can see here that in the "incorporated" second example, the liquor is no longer marked as a direct object of the verb, but simply is acting to "qualify the type of drinking involved." Apparently, children are reinterpreting such structures as unitary syntactic words; for example, while adults normally place the negative particle mâ immediately before the verb (as in the first example below), children sometimes treat the noun-verb compound as a unitary verb and place the negative particle before the entire complex, as in the second example below:
- ni-ma mâ hā = "I'm not sad"
(heart not wretched)
- mâ ni-ma hā
(not heart wretched)
A similar case can be seen in some languages in Oceania. Take the following examples from Mokilese:
- ngoah kohkoa oaring-kai = "I am grinding these coconuts"
(I grind coconut-these)
- ngoah ko oaring = "I am coconut-grinding"
(I grind coconut)
Note that while the verb and its "incorporated" object are still separate phonological words, in the incorporation manifest in the second example here, the verb and noun are syntactically bound to one another, and behave as a single unit. In Oceanic languages with this sort of incorporation, furthermore, the verbs involved generally behave as though they are intransitive (recall that incorporation is generally a valence-reducing operation). The following example from Tongan can illustrate this well, because Tongan is an ergative language: that is, the subject of transitive verbs (the ergative participant) is marked differently from both the subject of intransitive verbs and the object of transitive verbs (both of which are marked the same as one another, as the absolutive participant):
- na‘e inu ‘a e kavá ‘é Sione = "John drank the kava"
(PAST drink ABS CONN kava ERG John)
- na‘e inu kava ‘a Sione = "John drank kava, John kava-drank"
(PAST drink kava ABS John)
Note how in the first sentence (without incorporation), the kava is marked as the absolutive (here, the object of a transitive verb) with the preceding particle ʻa, and John is marked as the subject of a transitive verb with the preceding ergative particle ʻé. In the second sentence (with syntactic, though not phonological, incorporation), John is now marked with the absolutive particle, and the kava is unmarked, thus indicating that the verb is now intransitive, with John -- now the subject of an intransitive rather than a transitive verb -- marked as absolutive.
Thus, we can see here several steps in the development of NI, from independent direct objects coalescing with a verb when they are indefinite and nonreferential, to such nouns ceasing to be verbal arguments at all, and becoming qualifiers of the verb. From here, we simply need phonological fusion of the verb and noun to have "classic" compounding and NI, of the kind described in section 4 above.
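The valence reduction seen in the Tongan pair above can be modelled in a few lines, which may help if you are working out how incorporation should interact with case marking in your own ergative conlang. This is only a sketch in Python: the `Clause` class and the `case_marking` and `incorporate` functions are invented for illustration, and they assume a strictly ergative-absolutive pattern with no other complications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    verb: str
    subject: str
    obj: Optional[str] = None           # None means the clause is intransitive
    incorporated: Optional[str] = None  # noun bound to the verb, not an argument


def case_marking(clause):
    """Return {participant: case} under an ergative-absolutive pattern."""
    if clause.obj is not None:
        # Transitive: the subject is ergative, the object is absolutive.
        return {clause.subject: "ERG", clause.obj: "ABS"}
    # Intransitive: the sole argument is absolutive.
    return {clause.subject: "ABS"}


def incorporate(clause):
    """Bind the object to the verb, leaving an intransitive clause."""
    if clause.obj is None:
        raise ValueError("nothing to incorporate")
    return Clause(verb=clause.verb, subject=clause.subject,
                  obj=None, incorporated=clause.obj)


if __name__ == "__main__":
    plain = Clause(verb="inu", subject="Sione", obj="kava")
    print(case_marking(plain))               # Sione is ERG, kava is ABS
    print(case_marking(incorporate(plain)))  # Sione is ABS, kava gets no case
```

The incorporated noun is stored separately from `obj` so that it no longer counts as an argument, which is exactly why the subject's marking switches from ergative to absolutive.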
8.4 - Other Affixes
I'll have a bit less to say on this topic. Partly this is because of the wide variety of things that fall under the umbrella of "other affixes." But this is also because the origin of such "other affixes" is often very straightforward. In cases where there is evidence on their origin, they are normally derived from older compounding: either noun-verb compounds (incorporation), or verb-verb compounds. I'll provide a few examples here, focusing especially on the languages exemplified in section 5 above, and with the goal of simply providing a demonstration of some of the many possibilities open to you via the grammaticalization of older compounds.
Instrumental affixes in Numic. These were not discussed in section 5, but they are parallel to many of the sorts of affixes that were discussed there. In many cases, the reconstructed Proto-Numic instrumental affixes have clear similarities to reconstructed Proto-Numic (PN) or Proto-Uto-Aztecan (PUA) independent noun or verb roots (NB: I don't know how current some of these reconstructions are, and they don't mark any of the geminating, spirantizing, etc. characteristics of Numic morphemes):
- *ma-, "with the hand" (cf. PUA *mai, PN *moʔo, "hand")
- *ca-, "with the hand, grasping" (cf. PN *caʔi, "grasp")
- *ta-, "with the foot" (cf. PUA *tannah, "foot")
- *kɨ-, "with the teeth" (cf. PUA *kɨʔi, "bite")
- *mu-, "with the nose" (cf. PUA/PN *mupi, "nose")
- *co-, "with the head, shoulder" (cf. PUA *cohŋi, "head, shoulder")
- *ko-, "top, face" (cf. PN *koba-i, "face")
- *su-, "with mental activity" (cf. PUA *sunna, "heart", *suuwa, "believe")
- *pi-, "buttocks, back" (cf. PUA *pih, "back")
- *ku-, "with heat" (cf. PUA *kuh, "fire")
- *sɨ-, "with cold" (cf. PUA *sup, "cold")
- *ta-, "with sun" (cf. PN *taba, "sun")
- *wɨ-, "with long object/with force" (cf. (PUA? PN?) *wɨ, "penis")
- *ci-, "with the point of a long object" (cf. (PUA? PN?) *ci-a, "rose")
Yup'ik Postbases (see section 5.3). Most postbases have no known connection to corresponding roots with similar meaning. However, there are a few postbases where the etymology seems clear, and the ultimate origin of most of the root-postbase combinations seems clearly to be from old compounds. A few postbases with identifiable sources are:
- -tur = "to eat, to use" (cf. atur-, "to use")
- -carte = "to hit in the (body part)" (cf. qacarte-, "to hit or slap with the hand")
- -ngirte = "injured, be injured in the (body part)" (cf. akngirte-, "to hurt, get hurt")
Lexical Affixes (see section 5.4). Although most of the lexical suffixes in Northwest Coast languages have no obvious cognates in independent roots, there are several which do, and which indicate that root+lexical suffix combinations originated in compounds. The examples below are from Spokane, a Salishan language (this presentation glosses over the complex specifics of how exactly the lexical affixes are derived from independent roots; essentially, roots that became lexical suffixes lost their initial consonant, and both roots and/or suffixes could sometimes be reanalyzed as containing a connective affix -ł- used in compounds, or the nominalizer s-):
- -eneʔ, "ear, surface" (cf. t̓éneʔ, "ear")
- -úlixʷ, "ground, dirt, earth" (cf. st̓úlixʷ, "ibid.")
- -elixʷ, "person" (cf. sqélixʷ, "ibid.")
- -ic̓eʔ, "skin, hide" (cf. síc̓-m, "blanket" (with -m, middle voice))
- -łčey̓, "urine" (cf. tčéy̓, "to urinate")
- -łq̓ey̓t, "shoulder" (cf. łáq̓-t, "it is wide" (with -t, durative))
- -ey̓, "sickness" (cf. wéyt, "he's sick" (with -t, durative))
- -ewł, "conveyance, boat" (cf. séwłkł [sic?], "water")
- -eslk̓ʷ, "wood" (cf. luk̓ʷ, "stick of wood")
- -asq̓l̓, "roaster" (cf. q̓ʷl, "to roast")
- -esšn̓, "knobbed object, rounded object, berry, fruit, rock, forehead" (cf. šsén̓s, "stone")
- -epł, "buttock" (cf. pł, "thick")
- -cin, "mouth, food, words, language, edge, shore" (cf. cn, "to hum, speak softly") (?)
Koasati (see section 5.5). The origins of a number of the Koasati verbal affixes I described can be determined. I'll discuss them in the order in which they were presented in section 5.5.
- The indefinite prefix aat-, "someone" (slot 9) is connected to the independent noun ááti, "person". As far as I can tell, its use as an indefinite prefix began with a form of incorporation (person-[verb], meaning "[to verb] someone/people"), a process that can be easily seen in this prefix's use in nominalizations: aatasíhka, "policeman" (lit., "person-tier", from asíhkan, "to tie up"); atholló, "witch" (lit., "person-dangerous", from hollon, "to be dangerous"); aatistahoobachilká, "camera" (lit., "people-photographer", from stahobaachin, "to photograph").
- The directional and instrumental prefixes of slots 7 and 8 have their origin in earlier free verbs used in clause chains. The final -t- of all of these prefixes, in fact, was at one point the same-subject marker (Muskogean languages have a switch reference system, where, basically, verb suffixes identify whether the following verb has the same subject or a different subject from the verb to which the suffix is appended). So, for example, the origin of the general instrumental prefix (i)s(t)- is in Proto-Muskogean constructions involving the verb *isi, "to take", e.g., *isi-t aya-n (= "take and go") > stááyan (= "to carry (i.e., to go with something)"). The directionals have similar origins: oht- "go and..." is from Proto-Muskogean *oNa-t..., "arrive there and..." (with *oNa, "reach"); iit- "come and..." is from Proto-Muskogean *ila-t..., "arrive here and..." (with *ila, "come").
- The distributive and iterative prefixes (slot 6) seem to date back to the Proto-Muskogean plural/impersonal proclitic *oho=.
- A number of the specific locative prefixes (slot 3) can be seen to derive from older incorporated noun roots:
  - itta-, "action on the ground or in fire" may be connected to the Mikasuki noun i:ti, "fire."
  - oo(w)-, "action in water" is connected to the independent noun okí, "water."
  - paa-, "action on a raised, artificial, or non-ground surface" is connected to the postposition páána, "on top of."
  - ibii-, "action on the human face" seems to be related to the nouns ibitáála, "face"; ibisááni, "nose"; and ibithkaní, "nasal mucus."
  - ichoo-, "action on the mouth" I presume is derived from the Proto-Muskogean noun *i-¢okV, "mouth."
  - nok-, "action on the human throat" is from the Proto-Muskogean noun *nok-, "neck, throat."
- Many of the adverbial suffixes (slot 1) can be derived from formerly free words that were at some point incorporated into the verb. Thus, for instance, aahoosi-, "very", is cognate with Choctaw ahosi, "most, almost, near, nearly"; and Chickasaw ao’si, "almost."
- The ability suffix (slot 5) -halpiisa probably derives from an earlier clause chain construction, as with the instrumental and directional prefixes. The chain would have consisted of the main verb, with the suffix -h, a subordinating connector, followed by a verb related to the modern verb stalpíísan, "to be enough" (this form has the instrumental prefix st-, so the original verb would have been alpíísan).
-------------------------- NOTES ---------------------------
1) Thanks to Serafín for this example and for bringing this phenomenon to my attention!
2) Unfortunately, I don't speak any French, so I can't really evaluate the claims there. Also unfortunately, there are several obvious errors in the phonetic transcriptions there, but hopefully that's not a sign of general sloppiness in analysis...