I'll try this again - essential binary features
This thread supersedes, and renders obsolete, the thread I started recently in C&C.
The Big Question is:
"Which binary features are necessary to cover all of the segmental symbols on the IPA chart?"
I've yet to find a satisfactory answer; maybe there isn't one, but there's no harm in asking. From what I've gathered after looking at various sites, the answer seems to be at least the following:
For sonority: syllabic, consonantal/vocalic, approximant, sonorant, continuant
For place of articulation: round, anterior, distributed, high, low, back, tense, ATR, voice, spread glottis, closed glottis
For manner of articulation: continuant again, strident, lateral, delayed release, dental
For vowels: front and back, high and low, round, tense
This leads to at least three Lesser Questions:
1. What have I overlooked?
2. How does [ə] (schwa) differ from [ɜ] or [ɘ] (mid central unrounded vowels)?
3. How does [kʲ] differ from [c]?
More will doubtless follow!
Suprasegmentals aren't so difficult, but any comments would be appreciated here too.
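One way to test whether a candidate feature set "covers" a set of symbols is to check mechanically for symbols that end up with identical feature bundles. The following is only a minimal sketch: the table `FEATURES`, its feature names, and all of its values are illustrative assumptions rather than an authoritative analysis, and [ə] and [ɜ] are deliberately given identical values to show what a collision (as in Lesser Question 2) looks like.

```python
from itertools import combinations

# Hypothetical toy feature table: True = '+', False = '-'.
# All values are illustrative assumptions, not an authoritative analysis.
# [ə] and [ɜ] are deliberately identical to demonstrate a collision.
FEATURES = {
    "i": {"high": True,  "low": False, "back": False, "round": False},
    "u": {"high": True,  "low": False, "back": True,  "round": True},
    "a": {"high": False, "low": True,  "back": True,  "round": False},
    "ə": {"high": False, "low": False, "back": False, "round": False},
    "ɜ": {"high": False, "low": False, "back": False, "round": False},
}

def undistinguished_pairs(table):
    """Return every pair of symbols whose feature bundles are identical."""
    return [
        (a, b)
        for a, b in combinations(table, 2)
        if table[a] == table[b]
    ]

collisions = undistinguished_pairs(FEATURES)
print(collisions)  # pairs that the feature set cannot tell apart
```

Any pair reported here needs either an extra feature or a decision that the two symbols are equivalent for the SCA's purposes.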
Zompist's Markov generator wrote:it was labelled" orange marmalade," but that is unutterably hideous.
Re: I'll try this again - essential binary features
This is indeed a good question, and from looking at the usages of binary features I have seen, I myself have wondered how they can cover the full range of possible phones, especially in the case of vowels.
Dibotahamdn duthma jallni agaynni ra hgitn lakrhmi.
Amuhawr jalla vowa vta hlakrhi hdm duthmi xaja.
Irdro. Irdro. Irdro. Irdro. Irdro. Irdro. Irdro.
Re: I'll try this again - essential binary features
There's a good discussion of this in Roger Lass's Phonology, with his own proposed set of binary features (an adaptation of Chomsky & Halle).
Having established the system in chapter 5, he then casts doubt on it in chapter 6. He points out that it's rather arbitrary, that there is no particular reason that a set of features should be binary or even integral. He plays a bit with integral systems (these go well with vowel openness or backness).
For vowels, the obvious problem is that neither openness nor backness is binary. There are vowel systems with five degrees of height, and ones with four degrees of backness. One could simply add binary features— though you'll need 3 of them to handle those five-height systems— but what's the advantage over simply numbering the heights?
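The arithmetic behind "you'll need 3 of them" can be sketched as follows. The function names and the feature names `height1`, `height2`, ... are hypothetical, chosen only for illustration; the sketch simply writes a numbered height in binary, which also shows that three binary features give eight combinations, of which a five-height system leaves three unused.

```python
def bits_needed(levels):
    """Minimum number of binary features that can distinguish `levels` values."""
    return (levels - 1).bit_length()

def to_binary_features(value, levels, name="height"):
    """Encode an integer in range(levels) as hypothetical features name1, name2, ..."""
    if not 0 <= value < levels:
        raise ValueError("value out of range")
    width = bits_needed(levels)
    # Most significant bit first: name1 is the coarsest distinction.
    return {
        f"{name}{i + 1}": bool((value >> (width - 1 - i)) & 1)
        for i in range(width)
    }

five_heights = bits_needed(5)
top_height = to_binary_features(4, 5)
print(five_heights, top_height)
```

The encoding is lossless, but the individual bits have no obvious phonetic meaning, which is one way of restating the "why not just number the heights?" objection.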
Similarly, what do you do with Estonian's three vowel lengths? Again, yes, you can just add another binary feature, but why bother?
When you look at acoustic phonetics, or the detailed description of ongoing sound changes (e.g. in Labov), then it's normal to use real numbers instead— e.g. the F1 and F2 formants. In Principles of Linguistic Change: Internal Factors Labov talks about some vocalic sound changes that are easy to explain in terms of two-dimensional formant space, but hard to explain in terms of binary features.
Anyway, I suspect that the whole process is more "analyzing the IPA" than analyzing human language. Not that respected linguists haven't spent a lot of effort trying.
BTW, without looking too closely at your list, you've left out non-pulmonic airstreams, and nasal vowels, and I don't know how you handle co-articulation.
Re: I'll try this again - essential binary features
Wikipedia suggests that high-low and front-back are not as phonetically natural as a system with three directions: front, raised, and retracted. Then again, that seems to somewhat disregard the common phonological similarities in the behavior of the high vowels /i/ and /u/.
Re: I'll try this again - essential binary features
Much of this was prompted by this chart, which doesn't mention any special features for clicks. The intention is that I can collect a set of binary features large enough to be useful for an SCA, which is why I mentioned the IPA; if the features can describe (most of) the IPA, there are probably enough of them. Of course, this doesn't address the question of whether it's necessary to do everything with binary features in the first place; but integral values can be represented in binary anyway.
Perhaps I'm still asking the wrong question, and it should be "what is the best way to represent phonemes/phones in an SCA which can handle both IPA text and featural analysis?". It's very unlikely that I'd need to worry about more than three degrees of vowel backness, for example.
And I forgot to mention a "nasal" feature, yes. Presumably the features of - say - coarticulated [kp] aren't simply the features of [k] plus those of [p]?
Re: I'll try this again - essential binary features
alice wrote: 2. How does [ə] (schwa) differ from [ɜ] or [ɘ] (mid central unrounded vowels)?
I think in practice [ə] is often used as the lax counterpart of either [ɜ] or [ɘ].
alice wrote: 3. How does [kʲ] differ from [c]?
Normatively, in both cases the central part of the tongue is raised towards the palate. However, for [c] this raising itself forms the plosive closure, whereas for [kʲ] the closure occurs farther back, with the back of the tongue touching the velum.
In practice, these two are often used interchangeably.
The conlanger formerly known as “the conlanger formerly known as Pole, the”.
If we don't study the mistakes of the future we're doomed to repeat them for the first time.
Re: I'll try this again - essential binary features
alice wrote: 3. How does [kʲ] differ from [c]?
Also, Wikipedia says that [c] may also be used to represent an alveolo-palatal [t̠ʲ] (for example, in Hungarian, where it is spelled "ty", and where it is historically derived in some cases from the cluster /tj/, but not as far as I know from /kj/). There's a similar ambiguity with the symbol ɲ, which is often used to represent an alveolo-palatal.
(Personally, I find it unnecessary to use the distinct letter [c] just to represent the fronted allophone of /k/ in languages like French and Italian, but the letter has been used this way.)
Re: I'll try this again - essential binary features
alice wrote: Presumably the features of - say - coarticulated [kp] aren't simply the features of [k] plus those of [p]?
Given that chart, I think features([kp]) = features([k]) + features([p]) works. Have you noticed that feature cont has 3 values - '+', '-' and '±'? The affricates are a bit more awkward - some of the features are taken from the fricative part rather than combined. Consider t͡s v. t͡θ! How would you handle presigmatised stops (e.g. [ˢt])? Would you use the same encoding as t͡s, but with a fourth value of cont, namely '∓'?
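To make the "features of [k] plus features of [p]" idea concrete, here is a rough sketch of one possible reading. Everything in it is an assumption for illustration: the function `combine`, the feature names, and the bundles themselves are not taken from the chart. It treats place features as simply present or absent, takes the union of the two bundles, and records a conflict as '±' when '-' comes first and as the speculative '∓' when '+' comes first.

```python
def combine(first, second):
    """Merge two feature bundles whose values are '+' or '-'.

    Features present in only one bundle are copied over; a '-' followed
    by a '+' becomes '±', and a '+' followed by a '-' becomes '∓'.
    """
    merged = dict(first)
    for name, value in second.items():
        if name not in merged or merged[name] == value:
            merged[name] = value
        elif merged[name] == "-" and value == "+":
            merged[name] = "±"
        else:
            merged[name] = "∓"
    return merged

# Hypothetical, heavily simplified bundles (not taken from any chart).
K = {"cont": "-", "dorsal": "+"}
P = {"cont": "-", "labial": "+"}
T = {"cont": "-", "coronal": "+"}
S = {"cont": "+", "coronal": "+", "strident": "+"}

kp = combine(K, P)   # doubly articulated stop
ts = combine(T, S)   # affricate: stop followed by fricative
st = combine(S, T)   # presigmatised stop: fricative followed by stop
print(kp, ts, st)
```

Note that a plain merge like this cannot tell t͡s from t͡θ unless the fricative half contributes a feature such as strident, which is the awkwardness with affricates mentioned above.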
Are you planning to handle tones just as strings of pitches? (I presume tone features such as length and glottalisation will be incorporated in the segmentals, along with creakiness.)
I presume you propose to attempt to support the phonemic domain as well as the phonetic. For that you will need things like syllable boundaries - Thai final stop + syllable-boundary + liquid is not the same as any of the three similar combinations of syllable-boundary + Thai stop + liquid. (The minimal pairs for the orthography and for phonetics are different, so I haven't any examples to hand.) The liquid disappears from the latter but not the former in excited speech.
Re: I'll try this again - essential binary features
You may wish to consider supporting some abstract tone or pitch designations. For example, some diachronic descriptions will start with tones A, B and C, and some descriptions report generic tones 1 to 8, leaving the phonetics for the language-specific notes. If I were playing with Slavic accent developments, I might want to start with abstract 'H' and 'L' and come up with the standard accent symbols rather than worry about their precise realisations.
Have you enough flexibility to handle a rule like, "Move stress from a prefix to the first syllable after the prefix"? It's a rule one will need if generating a Romance language from a form of Latin close to Classical Latin. One wouldn't have to mark the Romance stress oneself.
Do you have a set of tricky sound changes ready for the next stage? Doing West Germanic consonant gemination without a brute force list of changes and doing the Sanskrit conditional change n > ɳ would be a good test.
Re: I'll try this again - essential binary features
Since you ask...
I'm trying to do something more detailed than phonemic, but not going so far into the phonetic that my head explodes. (Which is not very far.)
Richard W wrote: Have you noticed that feature cont has 3 values - '+', '-' and '±'?
Yes, and I worked out how to get by with just 2.
Richard W wrote: The affricates are a bit more awkward - some of the features are taken from the fricative part rather than combined. Consider t͡s v. t͡θ!
There's a tradeoff here between doing everything possible and doing (almost) everything useful.
Richard W wrote: How would you handle presigmatised stops (e.g. [ˢt]?). Would you use the same encoding as t͡s, but with a fourth value of cont, namely '∓'?
Unless there's good reason not to, this can be treated as /st/.
Richard W wrote: Are you planning to handle tones just as strings of pitches? (I presume tone features such as length and glottalisation will be incorporated in the segmentals, along with creakiness.)
No.
Richard W wrote: I presume you propose to attempt to support the phonemic domain as well as the phonetic. For that you will need things like syllable boundaries - Thai final stop + syllable-boundary + liquid is not the same as any of the three similar combinations of syllable-boundary + Thai stop + liquid. (The minimal pairs for the orthography and for phonetics are different, so I haven't any examples to hand.) The liquid disappears from the latter but not the former in excited speech.
"attempt" is probably correct.
Richard W wrote: You may wish to consider supporting some abstract tone or pitch designations. For example, some diachronic descriptions will start with tones A, B and C, and some descriptions report generic tones 1 to 8, leaving the phonetics for the language-specific notes. If I were playing with Slavic accent developments, I might want to start with abstract 'H' and 'L' and come up with the standard accent symbols rather than worry about their precise realisations.
I've already considered something very like this.
Richard W wrote: Have you enough flexibility to handle a rule like, "Move stress from a prefix to the first syllable after the prefix"? It's a rule one will need if generating a Romance language from a form of Latin close to Classical Latin. One wouldn't have to mark the Romance stress oneself.
That depends on how an unstressable prefix is identified, and I'd be very interested to know if any other SCAs can do this.
Richard W wrote: Do you have a set of tricky sound changes ready for the next stage? Doing West Germanic consonant gemination without a brute force list of changes and doing the Sanskrit conditional change n > ɳ would be a good test.
Very much yes. The first of these is easy; the second might not be too difficult.
Re: I'll try this again - essential binary features
Richard W wrote: Have you enough flexibility to handle a rule like, "Move stress from a prefix to the first syllable after the prefix"? It's a rule one will need if generating a Romance language from a form of Latin close to Classical Latin. One wouldn't have to mark the Romance stress oneself.
alice wrote: That depends on how an unstressable prefix is identified, and I'd be very interested to know if any other SCAs can do this.
For a tool that is purely a specialised string editor, one just adds a boundary marker '#' between the prefix and the rest of the word. Of course, handling junctures systematically gets a bit more complicated.
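As a rough illustration of the string-editor approach, the stress rule might be sketched like this. The notation is an assumption made for the example: the IPA stress mark 'ˈ' is written before the stressed syllable, syllables are separated by '.', and '#' marks the end of the prefix. The function name is hypothetical and not taken from any existing SCA.

```python
import re

STRESS = "ˈ"    # IPA primary stress mark, written before the stressed syllable
BOUNDARY = "#"  # prefix boundary marker, as described above

# A stress mark followed by the rest of the prefix and then the boundary.
_STRESSED_PREFIX = re.compile(f"{STRESS}([^{BOUNDARY}{STRESS}]*){BOUNDARY}")

def shift_stress_off_prefix(word):
    """Move stress from inside a prefix to the first syllable after '#'.

    Words whose stress already falls after the boundary, or which have
    no boundary at all, are returned unchanged.
    """
    return _STRESSED_PREFIX.sub(rf"\1{BOUNDARY}{STRESS}", word)

shifted = shift_stress_off_prefix("ˈre#ki.pit")
print(shifted)
```

The sketch only works because the boundary was inserted by hand; deciding automatically where '#' belongs is the harder part of handling junctures.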



