Haedus SCA - Bugfix (01/24)

R.Rusanov · Post by **R.Rusanov** » Mon Dec 30, 2013 12:51 am

I don't know what this is but it made me misremember 'haedus' instead of 'haemus' causing one of my conlangs to develop a word for 'blood' with no reasonable etymology! This makes me rather incensed.

Zju · Post by **Zju** » Thu Jan 02, 2014 11:45 am

Many thanks for this useful tool!
There is only one thing I don't know how to do: how do I break one word up into two? The reason I want to do that is to mark the word class that arises from segment loss, e.g. lorem → lor N, ipsum → ips N, but lore → lor C, ipsu → ips C, where C and N are genders.

edit:
also for some reason the simple rule

q > qn

causes the program to enter an endless loop or something, and after I define

T = pʲ p t k

redefining it as

T = p t k

seems to have no effect.

Morrígan · Post by **Morrígan** » Thu Jan 02, 2014 11:56 am

Zju wrote:Many thanks for this useful tool!
There is only one thing I don't know how to do: how do I break one word up into two? The reason I want to do that is to mark the word class that arises from segment loss, e.g. lorem → lor N, ipsum → ips N, but lore → lor C, ipsu → ips C, where C and N are genders.

I guess you could stick a hyphen or clitic marker in there, but since I designed the system to work on the basis of phonetic segments, there really is no "word boundary" pseudo-segment in the current model.

chris_notts · Post by **chris_notts** » Thu Jan 02, 2014 12:26 pm

Zju wrote: also for some reason the simple rule

q > qn

causes the program to enter an endless loop or something

Probably the tool reapplies the same rule from the beginning of the word repeatedly. This is necessary if multiple replacements need to be made or if a rule feeds itself, e.g. if you have some kind of harmony rule in action. But it does mean that you can get infinite loops with poorly formulated rules.

Since your output contains the input (i.e. input = q, output = qn), you end up adding an infinite number of ns after q. E.g.

qa -> qna
qna -> qnna
qnna -> qnnna
...

I only use my own SCA, so I don't know what the best fix is. Perhaps you can fix it by adding some context? E.g. apply the rule q -> qn only before a vowel, or before a non-nasal.

EDIT: in my SCA HaSC, my solution was:

1. apply an iteration limit, and stop with an error if the word hasn't converged after 1000 iterations, so the user knows what is happening
2. by default force sound changes to advance through the word, so infinite insertion is not possible. There is a way to remove this restriction (replace -> by -->), but the default prevents unintended accidental problems of this kind
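The first safeguard in the list above can be sketched in a few lines of Python. This is only an illustration, not HaSC's actual code: the names `apply_until_stable` and `NonConvergenceError` are made up for the example, and the rule is reduced to a plain string replacement that restarts from the beginning of the word each time.

```python
MAX_ITERATIONS = 1000  # the limit mentioned in the post


class NonConvergenceError(Exception):
    """Raised when a rule keeps changing the word (hypothetical name)."""


def apply_until_stable(word, source, target, limit=MAX_ITERATIONS):
    """Reapply source > target from the start of the word until nothing changes."""
    for _ in range(limit):
        # Replace only the first match, then rescan from the beginning.
        changed = word.replace(source, target, 1)
        if changed == word:
            return word  # converged: the rule no longer applies
        word = changed
    raise NonConvergenceError(
        f"'{source} > {target}' did not converge after {limit} iterations"
    )


print(apply_until_stable("aka", "k", "g"))  # aga
try:
    apply_until_stable("qa", "q", "qn")
except NonConvergenceError as err:
    print("error:", err)
```

With a limit in place, a self-feeding rule like q > qn ends in a readable error instead of a hang.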

Zju · Post by **Zju** » Thu Jan 02, 2014 12:32 pm

Oh, I see, thanks. That was a minor issue, actually. My bigger problem is this:

Zju wrote:after I define

T = pʲ p t k

redefining it as

T = p t k

seems to have no effect.

This effectively means that after I make a simple change to the phoneme inventory (e.g. adding a palatal to the plain stop series), I have to abandon the previous group name altogether and start using a new one. This isn't very convenient.

Morrígan · Post by **Morrígan** » Thu Jan 02, 2014 12:45 pm

Zju wrote:also for some reason the simple rule

q > qn

causes the program to enter an endless loop or something

This was a bug in the initial release that was fixed, but I haven't gotten around to releasing the fixed version. I designed the rule application to avoid loops by moving a pointer along the word - when a rule is applied, it was supposed to jump ahead so that, after "aqa" is changed to "aqna", the pointer will be at the second "a".
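A minimal Python sketch of the pointer idea described above, assuming words are plain strings of single-character segments and a rule is a literal source/target pair. The function name `apply_rule` is made up for the example; this is not the Haedus implementation.

```python
def apply_rule(word, source, target):
    """Apply source > target in a single left-to-right pass over word."""
    if not source:
        raise ValueError("source must not be empty")
    output = []
    pointer = 0
    while pointer < len(word):
        if word.startswith(source, pointer):
            output.append(target)
            # Jump past the matched input, so the text just written
            # is never scanned again and the rule cannot feed itself.
            pointer += len(source)
        else:
            output.append(word[pointer])
            pointer += 1
    return "".join(output)


print(apply_rule("aqa", "q", "qn"))         # aqna
print(apply_rule("oxoxoxoxox", "ox", "l"))  # lllll
```

Because the pointer only ever moves forward, the pass always terminates, at the cost of not letting a rule apply to its own output.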

Zju wrote: T = pʲ p t k
redefining it as
T = p t k
seems to have no effect.

This I will have to look into further. EDIT: never mind, found the problem. I should probably prepare a release soon. I'm nearly done adding support for negatives, so I'd prefer to get that in before I release.

chris_notts · Post by **chris_notts** » Thu Jan 02, 2014 12:51 pm

Morrígan wrote:This was a bug in the initial release that was fixed, but I haven't gotten around to releasing the fixed version. I designed the rule application to avoid loops by moving a pointer along the word - when a rule is applied, it was supposed to jump ahead so that, after "aqa" is changed to "aqna", the pointer will be at the second "a".

How do you handle backwards sound changes that feed themselves, e.g. back-front vowel harmony from the final vowel of the word to the initial vowel? Does it rely on greedy regex operators, to get around the forced forward movement of sound changes?

Morrígan · Post by **Morrígan** » Thu Jan 02, 2014 12:56 pm

chris_notts wrote:
Morrígan wrote:This was a bug in the initial release that was fixed, but I haven't gotten around to releasing the fixed version. I designed the rule application to avoid loops by moving a pointer along the word - when a rule is applied, it was supposed to jump ahead so that, after "aqa" is changed to "aqna", the pointer will be at the second "a".
How do you handle backwards sound changes that feed themselves, e.g. back-front vowel harmony from the final vowel of the word to the initial vowel? Does it rely on greedy regex operators, to get around the forced forward movement of sound changes?

I don't really, yet. I plan to have rules which just apply right-to-left instead of left-to-right. That's also something I'd like to get in place really soon, since it shouldn't be too much work, based on the code that's currently written.
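To illustrate why direction matters for a self-feeding rule, here is a rough Python sketch of back-to-front harmony spreading from the final vowel. The vowel inventory and both function names are invented for the example and say nothing about how Haedus will actually implement right-to-left rules.

```python
# Hypothetical inventory: back vowels and their front counterparts.
FRONTED = {"a": "ä", "o": "ö", "u": "ü"}
FRONT_VOWELS = set(FRONTED.values()) | {"e", "i"}
VOWELS = set(FRONTED) | FRONT_VOWELS


def harmonize_right_to_left(word):
    """Front a back vowel when the next vowel to its right is (now) front."""
    segments = list(word)
    next_vowel = None  # nearest vowel to the right, after any change
    for i in range(len(segments) - 1, -1, -1):
        if segments[i] not in VOWELS:
            continue
        if next_vowel in FRONT_VOWELS and segments[i] in FRONTED:
            segments[i] = FRONTED[segments[i]]
        next_vowel = segments[i]
    return "".join(segments)


def harmonize_left_to_right(word):
    """Same rule in one forward pass; each vowel sees an unchanged right context."""
    segments = list(word)
    for i, seg in enumerate(segments):
        if seg in FRONTED:
            following = next((s for s in segments[i + 1:] if s in VOWELS), None)
            if following in FRONT_VOWELS:
                segments[i] = FRONTED[seg]
    return "".join(segments)


print(harmonize_right_to_left("katuli"))  # kätüli
print(harmonize_left_to_right("katuli"))  # katüli
```

Scanning from the right lets each change feed the next vowel to the left in a single pass, which the forward pass cannot do without reapplying the rule.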

Zju · Post by **Zju** » Thu Jan 02, 2014 1:38 pm

Also, it'd be nice if you supported curly brackets on the left side of the sound change. Currently it doesn't throw an error, but the results are pretty weird and seemingly random.

Morrígan · Post by **Morrígan** » Fri Jan 03, 2014 9:54 am

Zju wrote:Also, it'd be nice if you supported curly brackets on the left side of the sound change. Currently it doesn't throw an error, but the results are pretty weird and seemingly random.

Yeah, I think right now it's going to interpret them as literals. I might add this functionality, since it's basically the same as using variables.

I do need to write up a better user guide, preferably one which is clear (this is the hard part) about what the capabilities and limitations are. As HTS becomes more powerful, this will become more important.

I'm very close to having negative conditions implemented, but I'm having some issues with Subversion and want to make sure this is taken care of before I move forward. I also need to make sure I appreciate the relevant parsing steps, and have tests to ensure that the changes are working as expected.

Zju · Post by **Zju** » Sun Jan 05, 2014 6:48 am

Just a little trick I felt like sharing. Say you want to palatalize t d s z to tʃ dʒ ʃ ʒ before front vowels and then change the vowels into something else. Well, if you write

T = t d s z
CH =  tʃ dʒ ʃ ʒ
E = e i y j
A = o u u w
TE > CHA

it won't work - try it and you'll see. If you instead write

T = t d s z
CH =  tʃ dʒ ʃ ʒ
E = e i y j
A = o u u w
E > A / T_
T > CH / _A

this will also change e.g. so to ʃo. What you need is a special letter that you don't use anywhere else, say q. Then this will do the job:

T = t d s z
CH =  tʃ dʒ ʃ ʒ
E = e i y j
A = o u u w
E > qA / T_
Tq > CH

The reverse direction also works. You may not want to delete any leftover q's (q > 0), so that you can see whether every instance of TE has been changed to CHA.
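For readers who want to see the placeholder trick outside the SCA, here is a small Python simulation of the two rules. It assumes single-character input segments and that q occurs nowhere else in the lexicon; the function names are invented for the example.

```python
T = ["t", "d", "s", "z"]
CH = ["tʃ", "dʒ", "ʃ", "ʒ"]
E = ["e", "i", "y", "j"]
A = ["o", "u", "u", "w"]
MARK = "q"  # placeholder assumed to be unused elsewhere


def mark_and_back_vowels(word):
    """E > qA / T_ : replace a front vowel after T by the marker plus its A partner."""
    out = []
    for i, seg in enumerate(word):
        if seg in E and i > 0 and word[i - 1] in T:
            out.append(MARK + A[E.index(seg)])
        else:
            out.append(seg)
    return "".join(out)


def palatalize_marked(word):
    """Tq > CH : palatalize only the consonants that carry the marker."""
    for plain, palatal in zip(T, CH):
        word = word.replace(plain + MARK, palatal)
    return word


for w in ["site", "so", "soto"]:
    print(w, "->", palatalize_marked(mark_and_back_vowels(w)))
```

The marker records which consonants originally stood before a front vowel, so an original so is left alone even though site becomes ʃutʃo.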

Zju · Post by **Zju** » Sun Jan 05, 2014 8:09 am

Just found another bug. When you apply the rule ox > l to

oxoxoxoxox
moxmoxmoxmoxmox
mmoxmmoxmmoxmmoxmmox

it yields

loxloxl
mlmoxmlmoxml
mmlmmlmmlmmlmml

instead of

lllll
mlmlmlmlml
mmlmmlmmlmmlmml

Morrígan · Post by **Morrígan** » Mon Jan 06, 2014 12:14 am

That's partly the same bug, and partly a new one, so thanks for finding that. I actually got "loxl", but the same code is responsible.

I'll release a bug-fix tomorrow, and continue working on the negatives - I realized I was doing it in the wrong way and thought of a much easier way to do it, so that should be done in only a few hours of work, assuming I actually sit down to do it.

Morrígan · Post by **Morrígan** » Mon Jan 06, 2014 12:49 pm

Bugfix Release: Download Here (2013.01.06)

Version with new features to follow in a few days, as well as improved documentation.

Buran · Post by **Buran** » Tue Jan 07, 2014 3:39 am

I have very, very little experience working with this kind of software. Could you write an idiot/newb-proof guide/user manual for this SCA? Thanks.

Morrígan · Post by **Morrígan** » Fri Jan 17, 2014 11:04 am

Adjective Recoil wrote:I have very, very little experience working with this kind of software. Could you write an idiot/newb-proof guide/user manual for this SCA? Thanks.

I'd definitely like to; the current one is pretty haphazard. What are the opinions of people as to what would be a useful structure? I'm thinking maybe something along the lines of a how-to tutorial, with a technical appendix.

Also, I've been working on negatives on and off, but personal stuff (and the fact that I program 8 hours a day at work) has kept my productivity down. The negatives are close to being implemented, I just keep changing my mind with respect to the way they are implemented, and I think I have that settled now.

Any opinions on notation? At the moment the notation uses "!" with other expressions

1. !a       # Accepts any symbol that is not "a"
2. !(abc)   # Accepts symbol sequence that is not "abc"
3. !{a b c} # Accepts any symbol that is not "a", not "b", and not "c"
4. !V       # Accepts any sequence that is not contained in the variable "V"

In the case of (2), I'm fairly certain that !(abc) = ab!c

I could use "^", "~" or another symbol instead of "!", whatever avoids using up characters people already use. Thoughts?

Zju · Post by **Zju** » Fri Jan 17, 2014 2:17 pm

I find using ! as a negation symbol very intuitive.

In the case of (2), I'm fairly certain that !(abc) = ab!c

Do you mean that ab!c will behave as !(abc) or that !(abc) will accept sequences of ab that aren't followed by c?

Morrígan · Post by **Morrígan** » Fri Jan 17, 2014 2:25 pm

Zju wrote:I find using ! as a negation symbol very intuitive.

In the case of (2), I'm fairly certain that !(abc) = ab!c
Do you mean that ab!c will behave as !(abc) or that !(abc) will accept sequences of ab that aren't followed by c?

The latter; unlike other markers, ! is a prefix.

chris_notts · Post by **chris_notts** » Sat Jan 18, 2014 1:36 pm

Morrígan wrote: In the case of (2), I'm fairly certain that !(abc) = ab!c

I could use "^", "~" or another symbol instead of "!", whatever avoids using up characters people already use. Thoughts?

In my own SCA I use ~ as a suffix operator.

I think that interpreting !(abc) as ab!c is confusing. If I understand your terminology correctly, then this means:

SEQUENCE  MATCH
abc       N
bbc       N
aac       N
abb       Y

Why, for example, should "bbc" not be a match for the pattern !(abc)? !(abc) looks like it should mean "match any three input tokens that together don't form the sequence abc". But that's not the same as ab!c, which looks like it should mean "match ab then any token that isn't c".
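The difference between the two readings can be checked mechanically. The Python sketch below encodes each reading as a predicate over three-token strings; both function names are invented for the example, and the three-token reading of !(abc) is the one proposed in this post, not a confirmed Haedus behaviour.

```python
def matches_not_abc(tokens):
    """!(abc) read as: any three tokens that do not spell a, b, c."""
    return len(tokens) == 3 and tokens != "abc"


def matches_ab_not_c(tokens):
    """ab!c read as: a, then b, then any single token other than c."""
    return len(tokens) == 3 and tokens[:2] == "ab" and tokens[2] != "c"


for seq in ["abc", "bbc", "aac", "abb"]:
    print(seq, matches_not_abc(seq), matches_ab_not_c(seq))
```

Only "abb" satisfies both predicates; "bbc" and "aac" satisfy the first but not the second, which is the mismatch the table above points at.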

Morrígan · Post by **Morrígan** » Sat Jan 18, 2014 1:52 pm

chris_notts wrote: I think that interpreting !(abc) as ab!c is confusing. If I understand your terminology correctly, then this means:
...
Why, for example, should "bbc" not be a match for the pattern !(abc)? !(abc) looks like it should mean "match any three input tokens that together don't form the sequence abc". But that's not the same as ab!c, which looks like it should mean "match ab then any token that isn't c".

It's not an interpretation, it was something I was wondering about. But you are right, they are not equivalent. I was thinking about it the wrong way and hadn't gotten a chance to sit down and draw some machines.

Morrígan · Post by **Morrígan** » Mon Jan 20, 2014 11:29 am

So, upon further not-terribly-detailed analysis, it seems to be the case that

!(abc) = { !abc a!bc ab!c }

in which case, I may choose not to support the use of negation on groups !(...), at least not for a while.
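Whether that expansion holds depends on how the pieces are read. As a quick check, the Python sketch below assumes chris_notts's three-token reading of !(abc) and reads each alternative in the braces as a three-token pattern with exactly one negated position; the alphabet and function names are made up for the example.

```python
from itertools import product

ALPHABET = "abcx"  # small hypothetical token inventory


def not_abc(s):
    """Three tokens that do not spell abc."""
    return s != "abc"


def expansion(s):
    """{ !abc a!bc ab!c }, each alternative negating exactly one position."""
    return (
        (s[0] != "a" and s[1:] == "bc")
        or (s[0] == "a" and s[1] != "b" and s[2] == "c")
        or (s[:2] == "ab" and s[2] != "c")
    )


strings = ["".join(p) for p in product(ALPHABET, repeat=3)]
missed = [s for s in strings if not_abc(s) and not expansion(s)]
print(len(strings), len(missed), "xxx" in missed)
```

Under those assumptions the expansion only covers strings that differ from abc in exactly one position, so a string like xxx is accepted by the first reading and missed by the second. The two would agree if the positions after the negated one were wildcards instead.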

I have negated terminals working, and just need to attach a dead-state to get negated sets working. After that, I intend to make sure that you can put flags in the rules file to give the user control over segmentation and normalization.
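One possible reading of "attaching a dead-state", sketched in Python: a tiny automaton for a negated set that accepts exactly one symbol outside the set and sends everything else to a state it can never leave. The state names and both functions are invented for the example and are not taken from the Haedus source.

```python
START, ACCEPT, DEAD = "start", "accept", "dead"


def build_negated_set_dfa(excluded):
    """Return a transition function for !{...} over single symbols."""
    def step(state, symbol):
        if state == START and symbol not in excluded:
            return ACCEPT
        # An excluded symbol, or any extra input, leads to the dead state.
        return DEAD
    return step


def accepts(step, symbols):
    """Run the automaton; the dead state can never reach acceptance."""
    state = START
    for symbol in symbols:
        state = step(state, symbol)
        if state == DEAD:
            return False
    return state == ACCEPT


not_abc_set = build_negated_set_dfa({"a", "b", "c"})
print(accepts(not_abc_set, ["d"]))       # True
print(accepts(not_abc_set, ["a"]))       # False
print(accepts(not_abc_set, ["d", "e"]))  # False
```

A negated terminal such as !a is then just the special case where the excluded set has one member.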

Cedh · Post by **Cedh** » Mon Jan 20, 2014 2:46 pm

I think it would be good to have two different types of group negation, which would of course need different syntax to distinguish between them:

chris_notts wrote:!(abc) looks like it should mean "match any three input tokens that together don't form the sequence abc".

This is the first type. The number of tokens in the brackets gives the length of the string to be matched, and strings should be matched only if they're not identical to the string in brackets. This might be more clearly written as !("abc"), although the quotes probably aren't necessary.

The second type is that which is written as <-abc> in Geoff's SCA. It means "match any single input token that's neither a nor b nor c, i.e. not contained in the group".

Morrígan · Post by **Morrígan** » Mon Jan 20, 2014 2:48 pm

Cedh wrote:The second type is that which is written as <-abc> in Geoff's SCA. It means "match any single input token that's neither a nor b nor c, i.e. not contained in the group".

!{a b c}

chris_notts · Post by **chris_notts** » Mon Jan 20, 2014 4:50 pm

Are !(abc) and !{a b c} intended to be special cases, or can you put ! before any (complex) pattern and get a result? In HaSC you can negate more or less any pattern, so e.g. the following would be fine:

(CC(a|e|i)(k|m)i~)~

= match a sequence that isn't: two consonants + a, e, or i + k or m + not i

Why you'd want to do that I don't know, but the functionality exists.

Morrígan · Post by **Morrígan** » Mon Jan 20, 2014 5:47 pm

chris_notts wrote:Are !(abc) and !{a b c} intended to be special cases, or can you put ! before any (complex) pattern and get a result? In HaSC you can negate more or less any pattern, so e.g. the following would be fine:

(CC(a|e|i)(k|m)i~)~

= match a sequence that isn't: two consonants + a, e, or i + k or m + not i

Why you'd want to do that I don't know, but the functionality exists.

!{a b c} and !a are supported, but !(abc) is not, and may not be. I don't really want to even be in the business of adding functionality to this, but neither am I really in the mood to work on other projects unfortunately.

I'm also considering abandoning using DFAs and moving to something which is recursively enumerable.

zompist bboard
