An Extended Sound Change Applier

Post by bradrn » Tue Mar 15, 2016 1:39 am

EDIT: The link given is outdated; you can get the latest version from here

First of all, I know that there are many, many sound changers online already. The goal of this project is not to make the best sound change applier, but to make a sound change applier that supports lots of features and is also compatible with zompist's SCA2. As mentioned above, there are many very good sound changers already, but their format is quite different from the SCA's (e.g. going from P/B/_ *reaction -> reagdion to * <ustop> <vstop> _ ! reaction -> reagdion), and it could be hard to completely change how you write sound change rules.

Since I had some time, I decided to create a sound change applier that does exactly that. Thus, I present the exSCA (Extended Sound Change Applier), which you can get from here (no Mac version yet, though I'm working on it - sorry!). It supports:

Everything that the SCA2 does (except the wildcard)
Syllabification using regexes
Automatic affixer
Categories within categories
Syntax highlighting
Opening and saving sound changes and lexicon
Writing custom sound changes in Python
Everything is typable using an ordinary computer keyboard - ² is replaced by > (I haven't added glosses or the wildcard yet)

And some things it does not support:

The wildcard (in the meantime you can implement it using Python - at the bottom of the post is some code you can copy into the program)
Backtracking (so a/b/_(C)C would not apply to the word 'ap' because it would interpret the 'p' as part of the (C))
The Edit menu (in the meantime, use Ctrl-C, Ctrl-X and Ctrl-V).
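
To make the backtracking limitation above concrete, here is a minimal Python sketch. It is not the exSCA's actual code: the category definition C=ptkbdg, the (category, optional) token list and both function names are assumptions made purely for illustration. It matches the environment tail (C)C against the part of the word after the 'a', once without backtracking and once with it.

```python
CATEGORIES = {"C": "ptkbdg"}  # assumed category definition

# The environment tail "(C)C" as (category, is_optional) pairs (hypothetical format).
ENV = [("C", True), ("C", False)]

def fits(category, text, pos):
    """True if the character at pos exists and belongs to the category."""
    return pos < len(text) and text[pos] in CATEGORIES[category]

def match_no_backtracking(tokens, text, pos):
    """Optional tokens always consume a character when they can, and never give it back."""
    for category, optional in tokens:
        if fits(category, text, pos):
            pos += 1
        elif not optional:
            return False
    return True

def match_with_backtracking(tokens, text, pos):
    """Try consuming first; if the rest then fails, retry with the optional token skipped."""
    if not tokens:
        return True
    (category, optional), rest = tokens[0], tokens[1:]
    if fits(category, text, pos) and match_with_backtracking(rest, text, pos + 1):
        return True
    return optional and match_with_backtracking(rest, text, pos)

# Position 1 is the character right after the 'a'.
print(match_no_backtracking(ENV, "ap", 1))    # False: 'p' was taken by the (C)
print(match_with_backtracking(ENV, "ap", 1))  # True: the (C) is skipped on retry
print(match_no_backtracking(ENV, "app", 1))   # True: enough consonants for both
```

So without backtracking the rule only fires when there are enough consonants to fill the optional slot as well as the required one.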


New Features
ASCII
The SCA2 uses ² to represent duplication. The exSCA replaces this with the character >, which can be typed much more easily.

Syllabification
Many people have complained on this forum about the lack of syllabification in the SCA and SCA2. In the exSCA, I have included regex syllabification, accessed by putting an 'x' in front of the rule, separated from it by a single space, which adds a . at each syllable boundary. Its main purpose is probably to stop sound changes from working across syllable boundaries (e.g. st/ss/_ x, when applied to a word like aste, would syllabify it as as.te and then match nothing, because the sequence of characters is s.t, not st).
Syllabification regexes
The syllabification algorithm operates using regexes: the program finds all the parts of the word that match the regex, then joins them together with a . between each pair. (Note: One consequence of the way the algorithm is implemented is that if you provide a regex that doesn't match all of the word, the word will get scrambled!) The regexes themselves are ordinary .NET Framework regexes, with one exception: you can use category names in them (e.g. using the definitions C=ptkbdg and V=aeiou, the regex C?V will be turned into [ptkbdg]?[aeiou]). Category names in the regex are shown with a beige background.
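
The find-and-join idea described above can be sketched in a few lines of Python. This is only an illustration using Python's re module rather than .NET, and the function names expand_categories and syllabify are made up for the example; the exSCA's real implementation may differ in its details.

```python
import re

def expand_categories(pattern, categories):
    """Replace every category name in the pattern with a character class."""
    return "".join(
        "[" + re.escape(categories[ch]) + "]" if ch in categories else ch
        for ch in pattern
    )

def syllabify(word, pattern, categories):
    """Find every match of the expanded regex and join the matches with '.'."""
    regex = re.compile(expand_categories(pattern, categories))
    pieces = [m.group(0) for m in regex.finditer(word) if m.group(0)]
    return ".".join(pieces)

CATEGORIES = {"C": "ptkbdg", "V": "aeiou"}  # the definitions used in the text

print(expand_categories("C?V", CATEGORIES))    # [ptkbdg]?[aeiou]
print(syllabify("pataka", "C?V", CATEGORIES))  # pa.ta.ka
print(syllabify("patka", "C?V", CATEGORIES))   # pa.ka
```

The last line shows why the regex has to cover the whole word: the t in patka cannot belong to any (C)V syllable, so in this sketch it is silently dropped. The real program may mangle such a word differently, but the cause is the same.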
The default regex is an expression that matches (C)V(C(C)) words, leveraging the regex capabilities of the .NET Framework. Below I provide some regexes for several different syllable structures:

(C)V: C?V

(C)V(C): C?V(((?=CC))C|(((?=C$))C|))

Onset-Rime-Coda (as in Chinese): There are two variations of this one, depending on how you want to parse a VCVC word (e.g. anaŋ).
If you want it to parse as VC-VC (e.g. an-aŋ): O?RC?
If you want it to parse as V-CVC (e.g. a-naŋ): O?R(((?=O))|(((?=C))C|))

(C)(C)V(C): (CC?)?V(((?=CC))C|(((?=C$))C|))
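
If you want to check how the patterns listed above split a word before using them, here is a rough Python sketch. It uses Python's re module, which as far as I know treats optional items, alternation and lookahead the same way .NET does for patterns like these. The helper name syllabify and the category sets for the Onset-Rime-Coda case (O=n, R=a, C=nŋ) are assumptions chosen just for this example.

```python
import re

def syllabify(word, pattern, categories):
    """Expand category names to character classes, then join all matches with '.'."""
    expanded = "".join(
        "[" + re.escape(categories[ch]) + "]" if ch in categories else ch
        for ch in pattern
    )
    pieces = [m.group(0) for m in re.finditer(expanded, word) if m.group(0)]
    return ".".join(pieces)

CV = {"C": "ptkbdg", "V": "aeiou"}
ORC = {"O": "n", "R": "a", "C": "nŋ"}  # assumed sets for the anaŋ example

CVC_PATTERN = "C?V(((?=CC))C|(((?=C$))C|))"   # (C)V(C)
V_CVC_PATTERN = "O?R(((?=O))|(((?=C))C|))"    # Onset-Rime-Coda, V-CVC parse

print(syllabify("pakta", CVC_PATTERN, CV))    # pak.ta
print(syllabify("patak", CVC_PATTERN, CV))    # pa.tak
print(syllabify("anaŋ", "O?RC?", ORC))        # an.aŋ
print(syllabify("anaŋ", V_CVC_PATTERN, ORC))  # a.naŋ
```

The lookaheads are what decide where a consonant goes: in the (C)V(C) pattern a consonant is only taken as a coda when it is followed by another consonant or by the end of the word.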

Affixer (Decliner/Conjugator)
The automatic affixer (called the Decliner/Conjugator in the program, and accessed via Tools->Decliner/Conjugator or Ctrl-J) is a utility I created to help run declensions or conjugations through the exSCA quickly. When you open it up there are three textboxes: a prefixes one, a stems one and a suffixes one. (No infix support yet, sorry!)
The affixer works by using slots. Each line you put into a textbox counts as one slot. Each slot consists of several alternatives, separated by single spaces. When you press OK, the words textbox in the main window is filled with all combinations of the different slots, with exactly one affix from each slot in each word. You can use an * (asterisk) to denote that the slot is optional - or, to be precise, to denote the empty affix. Each textbox (except the stems textbox) contains one * (asterisk) by default. If you want to leave a textbox empty, you shouldn't remove the asterisk. (You can, but then the affixer won't work.) Putting this all together, here is some sample input to the affixer and its output: