Modern Polish phonotactics finally deciphered!
Posted: Thu Sep 25, 2014 5:58 pm
Well, I analyzed a sample of over 100 000 Polish words this evening, and this is what I came up with.
(Please note that I have no academic linguistic education and I do not attempt to make this thread a scientific work in any way.)
First, let's be reminded what the Polish phoneme inventory looks like.
The consonants, sorted by their place of articulation:
Labial: /m p b f v/
Alveolar: /n t d ts dz s z r l/
Post-alveolar: /ts̠ dz̠ s̠ z̠/
Palatal: /ɲ tɕ dʑ ɕ ʑ j/
Velar: /ŋ k ɡ x w/
You could sometimes see this set enlarged with these two series:
Labio-palatal: /mʲ pʲ bʲ fʲ vʲ/
Velo-palatal: /kʲ ɡʲ xʲ/
These, however, contrast with plain consonants only before vowels and can be reanalyzed as sequences of the plain consonant and a diphthong onglide.
The vowels:
Front: /i ɛ/
Central: /ɨ a/
Back: /u ɔ/
Additionally, four of them, when they appear after palatalized consonants, can be reanalyzed as the following diphthongs: /iɛ ia iu iɔ/. Under that assumption, plain labials and velars are in complementary distribution with their palatalized counterparts: the latter occur only before /i/ and the former elsewhere.
That approach will greatly simplify the description of the consonant clusters permitted in onsets. As can be deduced, phonologically and phonetically labio-palatals and velo-palatals do not appear in codas.
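To make the reanalysis concrete, here is a small Python sketch that rewrites a transcription using the palatalized series into one using plain consonants plus /i/-initial diphthongs. The names (PALATALIZED, ONGLIDE, reanalyze) are made up for this example, the input is assumed to be already split into segments, and the last diphthong is written with /ɔ/ to match the vowel inventory.

```python
# Palatalized consonant -> plain counterpart (the two extra series above).
PALATALIZED = {
    "mʲ": "m", "pʲ": "p", "bʲ": "b", "fʲ": "f", "vʲ": "v",
    "kʲ": "k", "ɡʲ": "ɡ", "xʲ": "x",
}
# Plain vowel -> diphthong with an /i/ onglide.
ONGLIDE = {"ɛ": "iɛ", "a": "ia", "u": "iu", "ɔ": "iɔ"}


def reanalyze(segments):
    """Replace each palatalized consonant with its plain counterpart and
    move the palatal element onto the following vowel as an onglide."""
    out = []
    i = 0
    while i < len(segments):
        seg = segments[i]
        if seg in PALATALIZED:
            out.append(PALATALIZED[seg])
            nxt = segments[i + 1] if i + 1 < len(segments) else None
            if nxt in ONGLIDE:
                # Consume the vowel too and emit it as a diphthong.
                out.append(ONGLIDE[nxt])
                i += 2
                continue
            # Before /i/ (or anything else) only the consonant changes.
        else:
            out.append(seg)
        i += 1
    return out


print(reanalyze(["pʲ", "a", "s", "ɛ", "k"]))  # piasek
print(reanalyze(["kʲ", "i", "j"]))            # kij
```

After this rewrite every labial or velar is plain, and whether it "was" palatalized can be read off from whether the next segment starts with /i/.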
With these in mind, we can start describing the phonotactics:
The onset:
The template of onset clusters can be viewed as (F)(S)(C₁)(R₁)(C₂)(R₂)-, where:
· F stands for fricatives /f v/
· S stands for sibilants /s z ɕ ʑ/
· C₁ stands for labials /m p b/, alveolars /t d ts dz/, a post-alveolar /ts̠/, palatals /tɕ dʑ/, velars /k ɡ x/
· R₁ stands for labials /f v/, alveolars /s z r l/, post-alveolars /s̠ z̠/, palatals /ɕ ʑ/ and a velar /w/
· C₂ stands for labials /p b/, alveolars /t d ts dz s z/, post-alveolars /ts̠ dz̠ s̠ z̠/, palatals /tɕ dʑ ɕ ʑ/, velars /k ɡ x/
· R₂ stands for labials /m f v/, alveolars /n r l/, post-alveolars /s̠ z̠/, palatals /ɲ j/ and a velar /w/
There are some other conditions to be fulfilled as well:
· a cluster cannot contain both voiced and voiceless obstruents, e.g. /kf ɡv km ɡm/ are permitted and /kv ɡf/ are not.
· C₁ and C₂ cannot share the same place of articulation; some inflected words violate this rule for morphological reasons, e.g. /ts̠ts̠ɔŋ/, an inflected form of /ts̠tɕitɕ/.
· R₁ cannot be a labial if any of C₁, C₂ and R₂ is a labial.
· R₁ cannot be a sibilant /s z ɕ ʑ/ if C₁ is an alveolar, post-alveolar or palatal.
· R₁ cannot be /r l/ if R₂ is any of /r l s̠ z̠ w/.
· R₁ cannot be /w/ if there is any R₂.
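The template and the six conditions above can be checked mechanically. What follows is only a rough Python sketch, not the procedure used on the word sample. Assumptions of mine: all names are made up; the voicing classes are not listed in this post, so I filled in the usual ones and treat sonorants as neutral (which fits /km ɡm/ being permitted); and because the slot sets overlap, the code tries every left-to-right slot assignment and accepts a cluster if any assignment passes all conditions.

```python
from itertools import combinations

SLOTS = ["F", "S", "C1", "R1", "C2", "R2"]
ONSET = {
    "F": {"f", "v"},
    "S": {"s", "z", "ɕ", "ʑ"},
    "C1": {"m", "p", "b", "t", "d", "ts", "dz", "ts̠", "tɕ", "dʑ", "k", "ɡ", "x"},
    "R1": {"f", "v", "s", "z", "r", "l", "s̠", "z̠", "ɕ", "ʑ", "w"},
    "C2": {"p", "b", "t", "d", "ts", "dz", "s", "z", "ts̠", "dz̠", "s̠", "z̠",
           "tɕ", "dʑ", "ɕ", "ʑ", "k", "ɡ", "x"},
    "R2": {"m", "f", "v", "n", "r", "l", "s̠", "z̠", "ɲ", "j", "w"},
}
# Place of articulation, taken from the inventory at the top of the post.
PLACE = {}
for place, segs in {
    "labial": "m p b f v",
    "alveolar": "n t d ts dz s z r l",
    "post-alveolar": "ts̠ dz̠ s̠ z̠",
    "palatal": "ɲ tɕ dʑ ɕ ʑ j",
    "velar": "ŋ k ɡ x w",
}.items():
    for seg in segs.split():
        PLACE[seg] = place
# Assumed voicing classes for obstruents; sonorants are in neither set.
VOICED = set("b v d dz z dz̠ z̠ dʑ ʑ ɡ".split())
VOICELESS = set("p f t ts s ts̠ s̠ tɕ ɕ k x".split())
SIBILANTS = {"s", "z", "ɕ", "ʑ"}


def parses(cluster):
    """Yield every in-order assignment of the segments to template slots."""
    for slots in combinations(SLOTS, len(cluster)):
        if all(seg in ONSET[slot] for seg, slot in zip(cluster, slots)):
            yield dict(zip(slots, cluster))


def satisfies_conditions(parse):
    c1, r1, c2, r2 = (parse.get(k) for k in ("C1", "R1", "C2", "R2"))
    segs = set(parse.values())
    if segs & VOICED and segs & VOICELESS:           # condition 1
        return False
    if c1 and c2 and PLACE[c1] == PLACE[c2]:         # condition 2
        return False
    if r1:
        others = [x for x in (c1, c2, r2) if x]
        if PLACE[r1] == "labial" and any(PLACE[x] == "labial" for x in others):
            return False                             # condition 3
        if r1 in SIBILANTS and c1 and PLACE[c1] in {"alveolar", "post-alveolar", "palatal"}:
            return False                             # condition 4
        if r1 in {"r", "l"} and r2 in {"r", "l", "s̠", "z̠", "w"}:
            return False                             # condition 5
        if r1 == "w" and r2:                         # condition 6
            return False
    return True


def is_valid_onset(cluster):
    """cluster is a list of segments, e.g. ["f", "s", "k", "r"]."""
    return any(satisfies_conditions(p) for p in parses(cluster))


for cluster in (["k", "f"], ["k", "v"], ["ɡ", "m"], ["ts̠", "ts̠"]):
    print("".join(cluster), is_valid_onset(cluster))
```

Note that /ts̠ts̠/ is rejected by this check, because the sketch does not model the morphological exception mentioned under the second condition.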
The coda:
The template of coda clusters can be viewed as -(N)(C₁)(S)(C₂)(R), where:
· N stands for labials /m f v/, alveolars /n r l/, palatals /ɲ j/ and velars /ŋ w/
· C₁ stands for labials /p b/, alveolars /t d ts dz/, a post-alveolar /ts̠/, palatals /tɕ dʑ/ and velars /k ɡ x/
· S stands for sibilants /s z ɕ ʑ/
· C₂ stands for labials /p b/, alveolars /t d ts dz/, post-alveolars /ts̠ dz̠/, palatals /tɕ dʑ/ and velars /k ɡ x/
· R stands for labials /m f v/, alveolars /n r l/, post-alveolars /s̠ z̠/, a palatal /ɲ/ and a velar /w/
There are some constraints as well (like nasals almost always agreeing in their place of articulation with following stops). In particular, the first two rules from the onset section apply.
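A minimal Python sketch of the coda side, under my own assumptions: it only tests whether a cluster fits the template and separately lists nasal + stop pairs that disagree in place. Since that agreement holds "almost always", it is reported rather than enforced. The voicing and place rules carried over from the onset are left out to keep it short, "stop" is taken to mean /p b t d k ɡ/ only, and all names are invented for the example.

```python
from itertools import combinations

CODA_SLOTS = ["N", "C1", "S", "C2", "R"]
CODA = {
    "N": {"m", "f", "v", "n", "r", "l", "ɲ", "j", "ŋ", "w"},
    "C1": {"p", "b", "t", "d", "ts", "dz", "ts̠", "tɕ", "dʑ", "k", "ɡ", "x"},
    "S": {"s", "z", "ɕ", "ʑ"},
    "C2": {"p", "b", "t", "d", "ts", "dz", "ts̠", "dz̠", "tɕ", "dʑ", "k", "ɡ", "x"},
    "R": {"m", "f", "v", "n", "r", "l", "s̠", "z̠", "ɲ", "w"},
}
NASAL_PLACE = {"m": "labial", "n": "alveolar", "ɲ": "palatal", "ŋ": "velar"}
STOP_PLACE = {"p": "labial", "b": "labial", "t": "alveolar",
              "d": "alveolar", "k": "velar", "ɡ": "velar"}


def fits_coda_template(cluster):
    """True if the segments can be assigned, in order, to template slots."""
    return any(
        all(seg in CODA[slot] for seg, slot in zip(cluster, slots))
        for slots in combinations(CODA_SLOTS, len(cluster))
    )


def nasal_stop_mismatches(cluster):
    """List adjacent (nasal, stop) pairs whose places of articulation differ."""
    return [
        (a, b)
        for a, b in zip(cluster, cluster[1:])
        if a in NASAL_PLACE and b in STOP_PLACE
        and NASAL_PLACE[a] != STOP_PLACE[b]
    ]


for cluster in (["ŋ", "k"], ["n", "k"], ["r", "s", "t"]):
    print("".join(cluster), fits_coda_template(cluster),
          nasal_stop_mismatches(cluster))
```

A cluster like /nk/ fits the template but gets flagged, which is the kind of case the "almost always" is about.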
(Also thanks to Whatever-Her-Nick-Is-At-The-Moment Meilani for helping me determine the rules.)