Some comments about gen and SCA²

Post by MysteryMan23 »

Hey everyone!

I've been using SCA² and gen for my conlangs, and they've proven quite useful. However, I've encountered a few problems that hamper their usefulness.
  1. In gen, sometimes when I'm generating words using the "output table" function, the program breaks down and outputs a really long word. In the other modes, on the other hand, the program seems to work just fine. I've been testing it, and I've found that when I remove the rewrite rules, the problem goes away. Also, there are cases where the rewrite rules don't cause any problems at all. It appears that the problem happens when I try to replace single characters with digraphs, but it happens in other cases too, and I can't seem to find a real pattern. In any case, this is most definitely a bug.
  2. In SCA², I can't make even basic vowel harmony happen. When I try, the changes only end up applying to the first vowel that breaks the rules, but not anything beyond that. In fact, it seems to apply each rule one at a time, such that pikukupi comes out like "pikikupu" instead of the desired "pikikipi". I don't know if it's a bug, or if I'm just doing something wrong, but I'd kind of like to be able to do vowel harmony with this program, for a number of reasons.
I'll provide examples for each of my problems below, so you can experiment for yourself.

Code:

For gen:
C=ptkPTK
V=aeiou
P|ph
T|th
K|kh
CV
V

Set the output type to "Wordlist (as table)" to see the bug. You should get a really long word instead of a real table of words.

For SCA²:
F=ie
B=uo
C=ptkbdg
F/B/B(C)_
B/F/F(C)_

Test with words of 3 or more (C)V syllables. You should see failure to produce the expected vowel harmony, i.e. something like "pitoku" turning into "piteku" instead of "piteki".

Post by Radius Solis »

MysteryMan23 wrote:
  • In SCA², I can't make even basic vowel harmony happen. When I try, the changes only end up applying to the first vowel that breaks the rules, but not anything beyond that. In fact, it seems to apply each rule one at a time, such that pikukupi comes out like "pikikupu" instead of the desired "pikikipi". I don't know if it's a bug, or if I'm just doing something wrong, but I'd kind of like to be able to do vowel harmony with this program, for a number of reasons.
The program is clearly applying each sound change exactly once.

It might be nice if it automatically recursed, but SCAs generally don't: it's more effort for the programmer to implement that than it is for you to work around it by repeating the rule. For instance, in your example, change it to read

Code:

F=ie
B=uo
C=ptkbdg
F/B/B(C)_
F/B/B(C)_
F/B/B(C)_
B/F/F(C)_
B/F/F(C)_
B/F/F(C)_
and magically each rule will apply to up to three syllables in a row. Increase the repeats if more are needed.
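The effect of repeating the rules can be sketched in Python. This is only a model of the behaviour reported in this thread, not SCA²'s actual code: it assumes that one application of a rule reads its contexts from the word as it stood before the pass, which is what the "pikikupu" output suggests. The names `apply_once` and `harmonize` are made up for the illustration.

```python
# Categories from the example above.
F, B, C = "ie", "uo", "ptkbdg"

def apply_once(word, source, target, trigger, optional=C):
    """One pass of a rule like F/B/B(C)_ (hypothetical model).

    Each `source` vowel becomes the matching `target` vowel when it is
    preceded by a `trigger` vowel, with at most one consonant in between.
    Assumption: contexts are read from the word as it was before the pass,
    so a change made in this pass cannot trigger another one.
    """
    out = list(word)
    for i, ch in enumerate(word):
        if ch not in source:
            continue
        j = i - 1
        if j >= 0 and word[j] in optional:   # skip the optional consonant
            j -= 1
        if j >= 0 and word[j] in trigger:
            out[i] = target[source.index(ch)]
    return "".join(out)

def harmonize(word, repeats):
    """Apply F/B/B(C)_ `repeats` times, then B/F/F(C)_ `repeats` times."""
    for _ in range(repeats):
        word = apply_once(word, F, B, B)   # F/B/B(C)_
    for _ in range(repeats):
        word = apply_once(word, B, F, F)   # B/F/F(C)_
    return word

print(harmonize("pikukupi", 1))   # each rule once
print(harmonize("pikukupi", 3))   # each rule three times
```

Under these assumptions, one application of each rule reproduces the reported "pikikupu" and "piteku", while three applications give "pikikipi" and "piteki".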

Post by Cedh »

MysteryMan23 wrote:
In gen, sometimes when I'm generating words using the "output table" function, the program breaks down and outputs a really long word. In the other modes, on the other hand, the program seems to work just fine. I've been testing it, and I've found that when I remove the rewrite rules, the problem goes away. Also, there are cases where the rewrite rules don't cause any problems at all. It appears that the problem happens when I try to replace single characters with digraphs, but it happens in other cases too, and I can't seem to find a real pattern. In any case, this is most definitely a bug.
I also came across this bug a while ago. And I managed to figure out what the error is: The rewrite rules are applied after generating the output HTML, so any time you include a rewrite rule that affects a character which is used in the HTML output, the document structure will be messed up. Here's some detail about my observations:
cedh audmanh (in an e-mail to Zompist) wrote:
My rewrite rules (which include b|p’ and d|t’, using voiced plosive graphemes as shorthand symbols for ejectives) affect not just the words but also the generated HTML source code, which comes out as something like the following with the output option "Wordlist (as table)" (copied from Firebug):

Code:

<tap’le>
<tt’>mūsūa</tt’>
<tt’>wīrka</tt’>
<tt’>tə</tt’>
<tt’>əā</tt’>
<tt’>mulu</tt’>
<tt’>sət</tt’>
<tt’>wat’una</tt’>
</tap’le>
A very similar thing happens when the output is set to "Big-ass wordlist":

Code:

<div id="mytext">
tātəlik’ə
<p’r/>
məkwu
<p’r/>
yīttī
<p’r/>
tai
</div>

Also, Gen doesn't appear to distinguish between upper and lower case letters within words, so it's not possible to use X-SAMPA. MysteryMan23, you would need to replace your category definition C=ptkPTK by something like C=ptkɸθx to keep the two sets of consonants distinct in the generated words. (Incidentally, this would work as a temporary workaround against the bug described above too).
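The ordering problem with the rewrite rules can be reproduced in a few lines of Python. This is a sketch of the mechanism as described in this post, not Gen's actual JavaScript; the function names and the exact table markup are assumptions made for the illustration.

```python
# Cedh's rewrite rules b|p’ and d|t’ as (source, replacement) pairs.
REWRITES = [("b", "p’"), ("d", "t’")]

def rewrite(text, rules=REWRITES):
    """Apply every rewrite rule to `text`, in order."""
    for source, replacement in rules:
        text = text.replace(source, replacement)
    return text

def table_html(words):
    """Wrap words in a minimal HTML table (assumed markup)."""
    rows = "".join(f"<tr><td>{w}</td></tr>" for w in words)
    return f"<table>{rows}</table>"

def rewrite_after_markup(words):
    """Buggy order: the rules also hit the letters inside the tags."""
    return rewrite(table_html(words))

def rewrite_per_word(words):
    """Safe order: rewrite each word first, then build the markup."""
    return table_html([rewrite(w) for w in words])

words = ["waduna", "mulu"]
print(rewrite_after_markup(words))   # tags come out as <tap’le>, <tt’>, ...
print(rewrite_per_word(words))       # tags are left intact
```

Rewriting each word before it is wrapped in markup keeps the rules away from the tag names, which is why any rule whose source is a letter used in the tags only causes trouble in the first ordering.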

Post by Pole, the »

Well, gen and SCA² are great, but buggy. For example:
• the applier doesn't treat \t (the tab character) as whitespace;
• category symbols are only recognised at the start of a rule, so you can't apply rules such as CS/\\/_; the workaround is to list each combination, which is kinda pointless;
• you can use wildcards only once per rule, so [sc]t/\\/_ and s[ptk]/\\/_ will work correctly, whereas [sc][ptk]/\\/_ won't; the workaround is the same as above;
• the same applies when you try to use both at once: P[rl]/B/_ won't work either.
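For comparison, here is a hedged Python sketch of how a target could be compiled so that category symbols and bracketed sets are accepted anywhere and any number of times. It says nothing about how SCA² is implemented. It assumes that `\\` as a replacement means metathesis (reversing the matched target), and the categories `C` and `S` and the function names are hypothetical.

```python
import re

def target_to_regex(target, categories):
    """Translate an SCA-style target into a compiled regex.

    Category letters are expanded wherever they occur, [..] sets are
    passed through as character classes, anything else is a literal.
    """
    parts, i = [], 0
    while i < len(target):
        ch = target[i]
        if ch == "[":
            end = target.index("]", i)
            parts.append("[" + re.escape(target[i + 1:end]) + "]")
            i = end + 1
        elif ch in categories:
            parts.append("[" + re.escape(categories[ch]) + "]")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))

def metathesize(word, target, categories):
    """Reverse every match of `target` (assumed meaning of `\\\\`)."""
    pattern = target_to_regex(target, categories)
    return pattern.sub(lambda m: m.group(0)[::-1], word)

cats = {"C": "ptk", "S": "sc"}                 # hypothetical categories
print(metathesize("aksa", "CS", cats))         # like CS/\\/_
print(metathesize("aspa", "[sc][ptk]", cats))  # like [sc][ptk]/\\/_
print(target_to_regex("mP", {"P": "ptc"}).pattern)
```

Expanding each symbol into a character class sidesteps the need to list every combination by hand. Mapping a matched member of one category to the corresponding member of another, as in P[rl]/B/_, would need extra bookkeeping that this sketch leaves out.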
The conlanger formerly known as “the conlanger formerly known as Pole, the”.

If we don't study the mistakes of the future we're doomed to repeat them for the first time.

Post by Terra »

Note: I've also built a sound change applier. It can be found here: http://jc.tech-galaxy.com/apps/sound_ch ... plier.html It doesn't do everything that Zomp's can, but I think that the layout is neater and prettier, at least.
Radius Solis wrote:
It might be nice if it automatically recursed, but SCAs generally don't: it's more effort for the programmer to implement that than it is for you to work around it by repeating the rule.
It's hard to do, because then you (the programmer) have to check whether the sound change will be applied over and over again forever, which in some cases is equivalent to solving the halting problem (which can't be done).
Pole, the wrote:
the applier doesn't see \t (the tabulator sign) as a whitespace;
Well damn, mine doesn't either. I'll go and fix it in a bit...
Pole, the wrote:
the categories symbols are only recognised when at start of a rule, making you unable of applying such rules as CS/\\/_, the workaround is simply listing each combination, which is kinda pointless;
But the following works in SCA2:

Code:

V=aeiou
P=ptc
B=bdg
P/B/V_V
Cedh wrote:
The rewrite rules are applied after generating the output HTML
Why don't you just move your rewrite rules into the "sound changes" category? I don't understand the point of the rewrite category in the first place.
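On the point about rules that might apply forever: a common pragmatic compromise, sketched below in Python, is to re-apply a rule until the word stops changing, with a cap on the number of passes and a check for words already seen. This does not decide termination in general; it just gives up safely. The function names, the cap of 20, and the simplified harmony pass (exactly one consonant between the vowels) are all assumptions for the illustration.

```python
import re

FRONT = {"u": "i", "o": "e"}

def front_harmony_pass(word):
    """One pass of B/F/F C_: back vowel -> front vowel after a front
    vowel plus one consonant (simplified; contexts come from the input)."""
    return re.sub(r"(?<=[ie][ptkbdg])[uo]", lambda m: FRONT[m.group(0)], word)

def apply_until_stable(word, step, max_passes=20):
    """Re-apply `step` until nothing changes.

    Returns (word, converged). Stops early if a previously seen word
    comes back (a cycle) or if `max_passes` is exhausted.
    """
    seen = {word}
    for _ in range(max_passes):
        new = step(word)
        if new == word:
            return word, True        # fixed point reached
        if new in seen:
            return new, False        # the rule is cycling
        seen.add(new)
        word = new
    return word, False               # gave up: cap reached

swap = lambda w: w.translate(str.maketrans("ab", "ba"))   # cycles forever
grow = lambda w: w + "a"                                   # never settles

print(apply_until_stable("pitoku", front_harmony_pass))
print(apply_until_stable("ab", swap))
print(apply_until_stable("a", grow, max_passes=5))
```

The cap trades completeness for safety: a legitimate change that needs more passes than the cap would be cut short, so the limit would have to be chosen with the longest expected word in mind.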

Post by Drydic »

Terra wrote:
The rewrite rules are applied after generating the output HTML
Why don't you just move your rewrite rules into the "sound changes" category? I don't understand the point of the rewrite category in the first place.
Generally when single characters are not part of the orthography, but some symbol is (in this case, b ~ p' etc.), to avoid using digraphs, which SCAs tend to misinterpret or scream at, unless they're specifically included somehow.
Common Zein Scratchpad & other Stuffs! OMG AN ACTUAL CONPOST WTFBBQ

Formerly known as Drydic.

Post by Pole, the »

Well yes, P/B/_ works, however mP/B/_ doesn't.

Post by Terra »

Drydic wrote:
Generally when single characters are not part of the orthography, but some symbol is (in this case, b ~ p' etc.), to avoid using digraphs, which SCAs tend to misinterpret or scream at, unless they're specifically included somehow.
Oh, so the issue is that the apostrophe means something special in a sound rule in Zomp's SCA. Okay, I guess.

Post by Cedh »

Terra wrote:
Generally when single characters are not part of the orthography, but some symbol is (in this case, b ~ p' etc.), to avoid using digraphs, which SCAs tend to misinterpret or scream at, unless they're specifically included somehow.
Oh, so the issue is that the apostrophe means something special in a sound rule in Zomp's SCA. Okay, I guess.
No, the issue is that you can't define digraphs in Gen, so you have to use a single (placeholder) symbol for each sound in the Categories field, and change that symbol to the appropriate digraph with a rewrite rule. That's a limitation of the design, but you can deal with it - unless, in the current version of Gen, you use a character in the source part of a rewrite rule which is also used in the relevant part of the HTML source (i.e. one of <a b d e l r t>).

Post by Terra »

cedh audmanh wrote:
No, the issue is that you can't define digraphs in Gen, so you have to use a single (placeholder) symbol for each sound in the Categories field, and change that symbol to the appropriate digraph with a rewrite rule. That's a limitation of the design, but you can deal with it - unless, in the current version of Gen, you use a character in the source part of a rewrite rule which is also used in the relevant part of the HTML source (i.e. one of <a b d e l r t>).
So why can't you just put the rewrite rules last in the sound changes column then?

Post by Pole, the »

Terra wrote:
So why can't you just put the rewrite rules last in the sound changes column then?
Because there is no such field in the Gen.

Post by Terra »

Pole wrote:
Because there is no such field in the Gen.
Sorry, I was thinking about SCA2, which does have such a field.

Post by zompist »

I rarely seem to be in the mood to look at JavaScript :? , but I did fix up gen today and I'll look at SCA2 later.

Changes:

1. Fixed the rewrite rules. I figured it would save processing time to apply them all at once, but it turns out to take no appreciable time to apply them word by word. So that should stop them interfering with the HTML.

2. Removed the case insensitivity. I forget why I did that and I can't think of a good reason now. If you were writing KH Kh kh indifferently, it won't work right anymore. If it's really a problem I'll add a checkbox.

3. Added an "Always" option for monosyllabicity. I originally avoided this because I thought it was misleading for running text, but it's much more useful for the word list options, so there you go.

Post by MysteryMan23 »

Erm...I'm not quite sure what you did, zompist, but gen no longer deletes letters when you leave the right side of the rewrite rule blank. This pretty much breaks the pseudo-Japanese example given by the guide. You might wish to look into that. Thanks.
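For reference, the deletion behaviour described here (an empty right side removes the matched text) can be sketched in Python as follows. This is not Gen's code and makes no claim about what actually went wrong in the update; one common way for deletions to get lost is a truthiness check such as `if replacement:` that skips empty strings, but that is only a guess. The rule `X|` and the function names are made up for the illustration.

```python
def parse_rewrite(line):
    """Split 'source|replacement'; an empty replacement means deletion."""
    source, sep, replacement = line.partition("|")
    if not sep or not source:
        raise ValueError(f"not a rewrite rule: {line!r}")
    return source, replacement

def apply_rewrites(word, lines):
    """Apply rewrite rules in order, word by word."""
    for line in lines:
        source, replacement = parse_rewrite(line)
        # No `if replacement:` guard here: an empty replacement is valid
        # and deletes every occurrence of `source`.
        word = word.replace(source, replacement)
    return word

print(apply_rewrites("kaXta", ["X|", "K|kh"]))   # placeholder X is deleted
print(apply_rewrites("Kata", ["X|", "K|kh"]))    # K is expanded to kh
```

The only distinction the parser needs to make is between a missing `|` (not a rule at all) and an empty right side (a deletion).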
