ASCA v0.1.6 - NEW

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

As a few of you may have heard, I'm working on a new sound change applier, written in Java. This is partly for my own edification, since I don't have a huge amount of practical programming experience. I'd like to release this as an executable and JAR package, run through the command line and with debug support.

I'm working on some basic features right now, namely Optionals, Variables, and Sets, which I'll explain presently.

The basic rule format is intended to look more like sound-change rules, as they are written normally:

Code: Select all

1.    a b c > x y z / P_S
You probably recognize how this works if you've ever used an SCA before. One feature I've already added is how you can handle unconditioned rules:

Code: Select all

2.    a b c > x y z / _
3.    a b c > x y z
4.  **a b c > z y z /
Rules 2 and 3 do the same thing; rule 4 will result in an error. The Condition can be omitted, but cannot be empty. So far, this stuff works.
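
As a rough illustration only (ASCA itself is written in Java, and this is not its code), the rule layout described above could be parsed along these lines in Python. The function name parse_rule and the exact error messages are made up for this sketch, and it assumes the leading "1." style labels have already been stripped.

```python
def parse_rule(line):
    """Split a rule into (initials, finals, precondition, postcondition)."""
    # Drop an inline '##' comment, if present.
    line = line.split("##", 1)[0].strip()
    if "/" in line:
        transformation, condition = (part.strip() for part in line.split("/", 1))
        if not condition:
            # Mirrors rule 4: the Condition may be omitted, but not left empty.
            raise ValueError("empty Condition: " + line)
    else:
        # Mirrors rule 3: no Condition at all means "everywhere".
        transformation, condition = line, "_"
    if condition.count("_") != 1:
        raise ValueError("Condition needs exactly one '_': " + line)
    initials, finals = (side.split() for side in transformation.split(">", 1))
    pre, post = condition.split("_")
    return initials, finals, pre, post
```

With this sketch, rules 2 and 3 parse to the same result, and rule 4 raises an error instead of being silently accepted.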

Actually, this still doesn't support word boundaries. However, it does allow comment lines starting with # and even inline comments starting with ##.

My current task is to correctly implement Variables; I did a sloppy job yesterday, and only the first variable in a rule actually got processed. Variables are capital letters and/or numbers preceded by @.

Code: Select all

5.    @C       = p t k b d g r l m n
6.    @V       = a e i o u
7.    @PLOSIVE = p t k b d g
All of these definitions should be valid; if all goes according to plan, you should also be able to put variables into variable definitions:

Code: Select all

8.    @NASAL     = m n
9.    @LIQUID    = r l
10.   @SPIRANT   = s z
11.   @CONSONANT = @PLOSIVE @SPIRANT @LIQUID @NASAL
There is another feature that I think will be both interesting and useful, but will require a lot of debugging: the way things are written now, Rules and Variables are read together, in a single pass of the Rules file. In principle, you should be able to define and redefine Variables on the fly throughout the file, which might be good if you need to redefine @CONSONANT to add /θ ð/ from the lenition of /t̪ d̪/, especially because these have no need to be in the file from the beginning. If this works, I may also permit you to delete variables once you are done with them.
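
A minimal Python sketch of how nested definitions and on-the-fly redefinition might behave, assuming definitions are expanded eagerly at the moment they are read. The helper define_variable is hypothetical and is not taken from ASCA's source.

```python
def define_variable(line, variables):
    """Store '@NAME = item item ...', expanding any @NAME already defined."""
    name, _, body = line.partition("=")
    items = []
    for token in body.split():
        if token.startswith("@"):
            # A KeyError here means the variable is used before it is defined.
            items.extend(variables[token])
        else:
            items.append(token)
    variables[name.strip()] = items

variables = {}
for line in [
    "@PLOSIVE = p t k b d g",
    "@NASAL = m n",
    "@LIQUID = r l",
    "@SPIRANT = s z",
    "@CONSONANT = @PLOSIVE @SPIRANT @LIQUID @NASAL",
    # Later in the file, once lenition has produced the new segments:
    "@SPIRANT = @SPIRANT θ ð",
    "@CONSONANT = @PLOSIVE @SPIRANT @LIQUID @NASAL",
]:
    define_variable(line, variables)
```

One consequence of eager expansion is that redefining @SPIRANT does not change @CONSONANT until @CONSONANT is itself redefined, which is why the last line is repeated here.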

Optionals are strings enclosed by parens (...) indicating that the enclosed string may be present or omitted, as is the standard practice. Unlike in some other SCAs, these can be included in both the Initial and Condition, not just the Condition. An Optional in the Final will produce an error.

Finally, I'd like to allow ad hoc Sets, which basically work like Variables, but are not stored in memory. These are useful where you would otherwise use a Variable, but perhaps only once.

I would like to allow Optionals and Sets in the Initials, but this is inviting bugs.

Some other features I'd like to add are related to some of these others. One is Many-to-One replacement:

Code: Select all

12.   a e o > ə
I might also add an Audit feature, which inspects your input lexicon and rules and finds rules which are never used (this usually happens with two or more variables in a digraph or in the Condition). I might also include a Stats mode to do cluster counts. I already have the software written, but I'd need to add it to the command line.

Also, some real-world rules are hard to run using the standard find-and-replace method that this and other SCAs seem to use. Among these are things like vowel syncopation (I know Jeff was having trouble with this in MUBA's VSCA) because they require large strings, which means a huge search-space. Another such problem is Grassmann's law in eastern Indo-European, where an aspirated stop in a root is de-aspirated if another aspirated stop occurs later in the root. Distance assimilation/dissimilation will probably always be a problem. Here are the rules from my VSCA code for doing Grassmann's law in Kuma-Koban:

Code: Select all

VS=aeiou[ə]
VL=āēīōū[ə̄]
V=<VS><VL>
N=mn
R=rl
IU=iu
CH=[pʰ][tʰ][cʰ][kʰ]

[pʰ][tʰ][cʰ][kʰ]/bdɟg/_V(N)<CH>          576
[pʰ][tʰ][cʰ][kʰ]/bdɟg/(R)_V<IU><CH>     1152
[pʰ][tʰ][cʰ][kʰ]/bdɟg/_VR(s)<CH>         768
                                        2496
The numbers to the right indicate the number of individual rules generated by each search, which I actually trimmed down substantially, because it was such a runtime hog. It means the program has to search each word for 2496 different strings.
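
The counts can be checked by multiplying the sizes of the slots in each rule, with an optional slot counting one extra for "absent". The Python sketch below does that arithmetic; the helper names optional and count_patterns are invented for the illustration, and it assumes the bare V in the rules stands for the V variable.

```python
from math import prod

VS = ["a", "e", "i", "o", "u", "ə"]
VL = ["ā", "ē", "ī", "ō", "ū", "ə̄"]
V = VS + VL
N = ["m", "n"]
R = ["r", "l"]
IU = ["i", "u"]
CH = ["pʰ", "tʰ", "cʰ", "kʰ"]

def optional(items):
    """An optional slot matches any of its items, or nothing at all."""
    return items + [""]

def count_patterns(initials, slots):
    """Number of literal search strings: one per Initial per slot combination."""
    return len(initials) * prod(len(slot) for slot in slots)

counts = [
    count_patterns(CH, [V, optional(N), CH]),         # _V(N)<CH>
    count_patterns(CH, [optional(R), V, IU, CH]),     # (R)_V<IU><CH>
    count_patterns(CH, [V, R, optional(["s"]), CH]),  # _VR(s)<CH>
]
print(counts, sum(counts))  # prints [576, 1152, 768] 2496
```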

I think I know how I could potentially solve this using regular expressions (built in, of course; I want this to be relatively user-friendly), but this will require a totally different processing mechanism than the other rules and will at least have to wait until I finish getting the basics done.
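
The post does not say what the regular-expression approach would look like, so the following is only one guess, written in Python rather than Java. It handles the first VSCA rule above, _V(N)<CH>, with a lookahead so that the following aspirate is checked but not consumed. The names GRASSMANN and deaspirate are hypothetical.

```python
import re

ASPIRATES = {"pʰ": "b", "tʰ": "d", "cʰ": "ɟ", "kʰ": "g"}
CH = "(?:pʰ|tʰ|cʰ|kʰ)"
# A short vowel with an optional combining macron, or a precomposed long vowel.
V = "(?:[aeiouə]\u0304?|[āēīōū])"
N = "[mn]"

# An aspirate followed (without consuming it) by V, an optional N,
# and another aspirate.
GRASSMANN = re.compile(f"({CH})(?={V}{N}?{CH})")

def deaspirate(word):
    """Replace each aspirate that stands before V(N) plus another aspirate."""
    return GRASSMANN.sub(lambda m: ASPIRATES[m.group(1)], word)
```

Under this approach the engine scans each word once per rule instead of searching for hundreds of literal strings, which is the main attraction; the cost is that such rules need their own code path.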

I'll try to make an alpha of some kind available as soon as I can.
Last edited by Morrígan on Sat Feb 26, 2011 5:19 pm, edited 23 times in total.

vohpenonomae
N'guny
Posts: 91
Joined: Sat Nov 02, 2002 4:23 am

Re: ASCA - A Sound Change Applier

Post by vohpenonomae »

This is a worthy project, but it comes too late for me to make use of, unfortunately; I recently completed all the sound change files for the Central Mountain family using a revised version of Zomp's Sounds. They're long and use many work-arounds, but at this point I'd rather not re-translate them to another SCA format.

There's no rule I've been unable to render into Sounds format; a baroque combination of variables, placeholders and work-arounds suffices for even my most complex changes. But an easier way to do complicated changes would certainly be welcomed by others, I imagine.
"On that island lies the flesh and bone of the Great Charging Bear, for as long as the grass grows and water runs," he said. "Where his spirit dwells, no one can say."

Mbwa
Lebom
Posts: 142
Joined: Sat Aug 18, 2007 1:48 pm

Post by Mbwa »

Looks awesome.

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Ok, while I still can't release any version of this program yet, I do have a few things to report.

First, I am still working on implementing variables correctly; I need to sit down and think about how to do this cleanly and correctly.

Second, I have added Many-to-One rule support, so that any number of Initials can be replaced by the same Final:

Code: Select all

a b c > d
Third, I have also added support for Zero, allowing for both deletion and insertion:

Code: Select all

a b c > 0              ## Many-to-Zero unconditioned Rule
a b c > 0 0 0          ## Basic unconditioned Deletion Rule
a b c > 0 v w / x_y    ## Basic conditioned Deletion Rule
0 > a / x_y            ## Conditioned Insertion
All of the above rules appear to execute as expected. If an Insertion rule lacks a Condition, it should do nothing at all, but I will add checks to ensure this is reported as an invalid rule.
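
To make the Zero behaviour concrete, here is a small Python sketch of literal find-and-replace with 0 treated as the empty string. It is an assumption about the general technique, not ASCA's implementation, and apply_rule is a made-up name. Because it runs the pairs one after another, the output of an early pair can feed a later one, which a real applier would probably want to avoid.

```python
def apply_rule(word, initials, finals, pre="", post=""):
    """Apply a literal rule, treating '0' as the empty string."""
    if len(finals) == 1:
        finals = finals * len(initials)  # Many-to-One or Many-to-Zero
    if len(initials) != len(finals):
        raise ValueError("Initials and Finals differ in length")
    for initial, final in zip(initials, finals):
        initial = "" if initial == "0" else initial
        final = "" if final == "0" else final
        if not initial and not (pre or post):
            # Unconditioned insertion has no anchor; report it, don't guess.
            raise ValueError("Insertion rule needs a Condition")
        word = word.replace(pre + initial + post, pre + final + post)
    return word
```

For example, apply_rule("xy", ["0"], ["a"], "x", "y") gives "xay", while the same call without a Condition raises an error.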

I'd also like to ask if anyone has particular features they would like to see in an SCA (other than a Distinctive or Binary Features model, which I might add, if I ever get one built for another program).

vohpenonomae
N'guny
Posts: 91
Joined: Sat Nov 02, 2002 4:23 am

Post by vohpenonomae »

TheGoatMan wrote:I'd also like to ask if anyone has particular features they would like to see in an SCA (other than a Distinctive or Binary Features model, which I might add, if I ever get one built for another program).
My only suggestion would be to avoid having the whole thing work on a standard phonetic notation, the way IPA Zounds requires IPA input. This is a major hassle when you (like I) use many Americanist symbols, sometimes mixed with IPA. Instead, let the full range of Unicode characters be allowed for input and output.
"On that island lies the flesh and bone of the Great Charging Bear, for as long as the grass grows and water runs," he said. "Where his spirit dwells, no one can say."

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

vohpenonomae wrote: My only suggestion would be to avoid having the whole thing work on a standard phonetic notation, the way IPA Zounds requires IPA input. This is a major hassle when you (like I) use many Americanist symbols, sometimes mixed with IPA. Instead, let the full range of Unicode characters be allowed for input and output.
Oh, right; that is why I hate IPA Zounds. I couldn't recall. Like the fact that it doesn't even have full support for IPA symbols like ɕ ʑ and æ.

Currently, it should work on anything that isn't pre-defined as a delimiter or other special character, in any encoding and from any Unicode range. Actually, with respect to that, I think I might allow the user to define certain delimiters, like the Variable marker @ or the Condition delimiter /, should one be using X-SAMPA. Maybe I should use $ and | (the pipe) by default.

Boşkoventi
Lebom
Posts: 157
Joined: Mon Aug 14, 2006 4:22 pm
Location: Somewhere north of Dixieland

Post by Boşkoventi »

TheGoatMan wrote:Oh, right; that is why I hate IPA Zounds. I couldn't recall. Like the fact that it doesn't even have full support for IPA symbols like ɕ ʑ and æ.
Meh. I 'just' altered one of the program's files to include those (and a number of other symbols). Actually not as easy as you might think (or as it should be), but doable.
TheGoatMan wrote:I'd also like to ask if anyone has particular features they would like to see in an SCA (other than a Distinctive or Binary Features model, which I might add, if I ever get one built for another program).
"Except / Unless" - e.g. r>0/!_V meaning "delete r except before a vowel".

I don't know if you have this already, or how common it is, but the lack of it is the biggest complaint I have about IPA Zounds.
Radius Solis wrote:The scientific method! It works, bitches.
Είναι όλα Ελληνικά για μένα.

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Boskobènet wrote:"Except / Unless" - e.g. r>0/!_V meaning "delete r except before a vowel".

I don't know if you have this already, or how common it is, but the lack of it is the biggest complaint I have about IPA Zounds.
This is one of the things I like about MUBA's VSCA. It makes some rules a lot easier. I was debating whether to allow this or not, because (AFAIK) one does not normally write sound-change rules like this "in the real world", by which I mean the one phonetics class I took. OTOH, it should be very easy to implement, though I'm not going to try until after I finish implementing Sets.
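
One way an "except" condition such as r>0/!_V could be expressed, shown here as a Python sketch using a negative lookahead. This is an assumption about a possible technique rather than anything ASCA or VSCA is known to do, and delete_unless_before is a hypothetical helper.

```python
import re

def delete_unless_before(word, target, blockers):
    """Delete `target` everywhere except directly before one of `blockers`."""
    blocked = "|".join(re.escape(b) for b in blockers)
    pattern = re.compile(f"{re.escape(target)}(?!{blocked})")
    return pattern.sub("", word)

VOWELS = ["a", "e", "i", "o", "u"]
```

Here delete_unless_before("karta", "r", VOWELS) deletes the r, while the r in "kara" survives because a vowel follows it.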

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

[edit]Bad news everyone, ASCA isn't really working at the moment; the version I released somehow wasn't handling Conditions properly. I'm working on that particular problem. This is what I get for working too hard and not sleeping enough[/edit]

Good news everyone. ASCA works well enough that I can distribute it! Mind you, I have no idea how stable or unstable it is. I will be distributing the current version in both .exe and .jar formats, and will be posting an information and download page on my website (which I hope works).

The files can be found here. To summarize with the previous posts in mind, these are the things you must know to operate this program.

First, my presumption is that you will still need the JRE installed on your system, whether you are using the JAR or EXE version. You ought to be using version 1.6.2 or better.

To use this program, you must provide, in the command line or through a batch file, three file paths:

Code: Select all

-f or --file
-o or --out
-r or --rules
These specify, respectively, the input lexicon, the output lexicon, and the rules to be applied to derive the latter from the former.

There are two optional commands:

Code: Select all

-p or --print
-h or --help
Where -h prints this same type of information, while -p is a command to print a Rule Audit to the console. This will show each rule and the associated find-and-replace patterns.

One option takes a file path also:

Code: Select all

-a or --audit
This is like -p but will write the report to the specified file.
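
For readers who want to see the option layout at a glance, this is how the same interface might be declared with Python's argparse. It is only a mirror of the flags listed above, not ASCA's Java code; the program name "asca", the file names, and the attribute name print_audit are assumptions made for the example.

```python
import argparse

def build_parser():
    # argparse adds -h/--help automatically.
    parser = argparse.ArgumentParser(prog="asca")
    parser.add_argument("-f", "--file", required=True, help="input lexicon")
    parser.add_argument("-o", "--out", required=True, help="output lexicon")
    parser.add_argument("-r", "--rules", required=True, help="rules to apply")
    parser.add_argument("-p", "--print", action="store_true", dest="print_audit",
                        help="print a Rule Audit to the console")
    parser.add_argument("-a", "--audit", metavar="PATH",
                        help="write the Rule Audit to PATH")
    return parser

# Hypothetical file names, passed as a list instead of reading sys.argv.
args = build_parser().parse_args(
    ["-f", "lexicon.txt", "-o", "out.txt", "-r", "rules.txt", "-p"]
)
```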

RULES
A rule consists of two main parts: Transformation and Condition. The Transformation further consists of a list of Initials and Finals, while the Condition consists of a Precondition and Postcondition.

Code: Select all

1.   a b c > d e f / x_y
In the above example (1), a b c > d e f is the Transformation, and x_y is the Condition; these are separated by the slash character /. a b c is the list of Initials while d e f is the list of Finals, separated by a right angle-bracket >. Note also that the Initials and Finals are space-delimited lists. Finally, the Precondition and Postcondition are separated by an underscore _ which, as is the custom, represents the location of the Initials and Finals in the search-string.

One may also choose to write an unconditioned rule, which can be done in either of two ways:

Code: Select all

2.   a b c > d e f / _
3.   a b c > d e f
(2) is more typical, but I have added support for (3) for the sake of brevity.

ZEROES
I have also added support for Zeroes, by which I mean the use of the symbol 0 to represent a null-string or the empty set. These can be used to delete or insert strings:

Code: Select all

4.   a b c > 0 0 0
5.   0 > b / a_c
(4) will delete a, b, and c unconditionally. (5) will insert a b wherever the string ac occurs. If you fail to add a condition, the rule will do nothing. I should add code to report this error, should it occur, but I have not yet.

VARIABLES
Variables are, of course, the real meat-and-potatoes of any Sound-Change-Applier. Their power, however, makes them problematic, and difficult to implement cleanly. I have made a substantial effort to catch any potential errors involving variable parsing, but cannot make any guarantees at the moment.
A variable name may be a string of any length (preferably in capital letters and numbers) beginning with the identifier sigil @. Variables are defined in the following manner:

Code: Select all

6.   @C = p t k b d g
7.   @V = a e i o u
As you can see from (6) and (7), this is done as in other SCAs, but with a space-delimited list.

There are some important rules for using variables, and you will receive errors if not used correctly:

Code: Select all

8.   @A @B > @C @D
9.   @A@B > @C@D
In a rule like (8) or (9), the corresponding variables on both sides of the Transformation must match in length: @A and @C must contain the same number of elements, as must @B and @D.
You are not permitted to have different numbers of variables in the Initials and Finals. And as I just said, these must also have the same number of elements. You may define variables using other variables, but be mindful of the length of the resulting composite if it is to be used in the Transformation.
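
The two length checks described here could look something like the Python sketch below. The variable contents are invented for the example, and pair_variables is a hypothetical helper, not part of ASCA.

```python
def pair_variables(initials, finals, variables):
    """Map each Initial element to the Final element at the same position."""
    if len(initials) != len(finals):
        raise ValueError("different numbers of variables in Initials and Finals")
    mapping = []
    for source, target in zip(initials, finals):
        sources, targets = variables[source], variables[target]
        if len(sources) != len(targets):
            raise ValueError(f"{source} and {target} differ in length")
        mapping.append(dict(zip(sources, targets)))
    return mapping

# Made-up definitions, purely for illustration.
variables = {
    "@A": ["p", "t", "k"], "@B": ["a", "i"],
    "@C": ["b", "d", "g"], "@D": ["e", "o"],
}
```

With these definitions, @A @B > @C @D pairs p with b, t with d, k with g, a with e, and i with o, while @A > @D is rejected because the lists differ in length.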

SINGLE FINALS
It is also possible to build rules with only a single Final. This means that all Initials will be turned into this one Final. This can be done with Zeroes or any string literal in the Final, and with a Variable in the initial.

Code: Select all

10.  a b c > d
11.  a b c > 0
12.  @A > 0
13.  @B > b
14.  @A @B > @X
You can see from (10-13) that this may be used with Zeroes as well as Variables.

Well hell, I hope that's it. If you have any problems, do report them and be sure to include the text of the Rule Audit.

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Ok, this is a minor update, so nobody get too excited. First, I apologize for my premature release. I got overly excited and-

Let me rephrase. I've been overworking myself trying to get a working version of this thing, and got hasty. I didn't even bother to actually field-test ASCA; I was just using basic test sets. Apparently I deleted two lines, which caused ASCA to completely ignore any Conditions given.

On the other hand, I've already added support for Optionals in the Condition, so you can use parens to optionally omit strings:

Code: Select all

## GRASSMANN'S LAW
@CH = pʰ tʰ cʰ kʰ 
pʰ tʰ cʰ kʰ > b d ɟ g / _@V(@N)@CH
pʰ tʰ cʰ kʰ > b d ɟ g / _(@R)@V@IU@CH
pʰ tʰ cʰ kʰ > b d ɟ g / _@V(@R)(s)@CH
I need to fix some variable handling: it borks the strings if you have a variable name like @PLOSIVE and another like @P. I thought this might happen, so rewriting it should be easy, and there might be another solution I can look into. It should also be easy to add an OR keyword to let you write rules that apply under multiple conditions; basically a union of the sets of conditions. I have a notion of how I might add support for exceptions (like VSCA's UNLESS statement), but that is farther down on my to-do list.
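
The post does not say what causes the @PLOSIVE / @P clash, but a plausible culprit is plain substring replacement, where @P matches the start of @PLOSIVE. A Python sketch of one common fix, trying longer names first in a single pass (the helper substitute and the variable contents are assumptions):

```python
import re

def substitute(text, variables):
    """Replace each @NAME with a group listing its elements."""
    # Longest names first, so @PLOSIVE is tried before @P.
    names = sorted(variables, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return pattern.sub(lambda m: "(" + "|".join(variables[m.group(0)]) + ")", text)

variables = {"@P": ["p", "b"], "@PLOSIVE": ["p", "t", "k"]}
naive = "_@PLOSIVE@P".replace("@P", "(p|b)")   # mangles @PLOSIVE
fixed = substitute("_@PLOSIVE@P", variables)
```

The naive version turns @PLOSIVE into (p|b)LOSIVE, while the single-pass version keeps the two names apart.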

I also realized I don't have support for word boundaries... woops. That should be easy to fix too.

Another thing I might add in the future is the ability to handle CSV files. To do that in VSCA, I had to use a hack where I used the word delimiter " in the CSV in place of a word boundary.

Bear with me; I think we should be there soon.

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

OK. You can find a zip package containing both jar and exe versions of ASCA along with some sample rules here.

It works pretty much just as well as any other SCA, as near as I can tell; I translated the PIE-to-PKK rules for VSCA into ASCA format, and they seem to work quite well (not perfectly mind you; this will take some debugging).

The main caveat right now is that ASCA doesn't support word-boundaries, so _# won't work. I'm trying to fix this. I'm also not sure that Optionals work 100%, so I'll need to do testing.

In spite of the current shortcomings, I do have some very very good news: ASCA is much faster (~7.5x) than VSCA, probably owing to the fact that the latter was written in Perl, which is an interpreted, rather than compiled, language.

So, I wish everyone luck. And do report any bugs you find.

[edit]There is a bug in how variables are handled, which I've identified using Audit; I'll take a look at it, but I should probably get some sleep soon.[/edit].

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

I have a new release of version 0.0.2 of ASCA available for download.

New Features:
  * Word Boundaries
  * Optionals
  * Sets
The archive contains an example file. I still need to work up a full manual on my website.

You can now put Optionals and Sets in the Condition:

Code: Select all

1.   kʷʰ kʷ gʷ > kʰ k g / _{@ROUND @OBSTRUENT}
2.   pʰ tʰ cʰ kʰ > b d ɟ g / _(@R)@V(@N)@CH
You can also put Sets and Variables inside of Optionals:

Code: Select all

3.   i u > y w / (@AE)_@AE
4.   pʰ tʰ cʰ kʰ > b d ɟ g / _@V({r l})(s)@CH
And you can put word-boundaries inside of sets:

Code: Select all

5.   @VS  > @VL  / _@X{@C #}
Making use of these features, I was able to run my ASCA rules file in about 1/8th the time it took for the same VSCA file. We are talking about a greater than 8-fold improvement.
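
To show how Sets, Optionals, and Variables can nest, here is a small recursive expander in Python. It is a sketch of the idea only: the tokenizer pattern, the function name expand, and the variable contents are all assumptions, and it only handles one level of parentheses and braces.

```python
import re
from itertools import product

# A variable, a {set}, an (optional), or any single character.
TOKEN = re.compile(r"@[A-Z0-9]+|\{[^{}]*\}|\([^()]*\)|.")

def expand(text, variables):
    """Return every literal string that `text` can stand for."""
    slots = []
    for token in TOKEN.findall(text):
        if token.startswith("@"):
            slots.append(variables[token])
        elif token.startswith("{"):
            # A Set: space-delimited items, each expanded in turn.
            slots.append([s for item in token[1:-1].split()
                          for s in expand(item, variables)])
        elif token.startswith("("):
            # An Optional: its contents, or nothing.
            slots.append(expand(token[1:-1], variables) + [""])
        else:
            slots.append([token])
    return ["".join(combo) for combo in product(*slots)]

# Made-up, shortened variable definitions.
variables = {"@V": ["a", "i"], "@CH": ["pʰ", "kʰ"], "@C": ["t"]}
post = expand("@V({r l})(s)@CH", variables)
```

With these toy definitions the postcondition of rule 4 expands to 2 x 3 x 2 x 2 = 24 strings, and expand("{@C #}", variables) gives t and # as alternatives.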

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

http://www.acsu.buffalo.edu/~sgmccabe/ASCA/index.html

Added a User Manual to the ASCA page.

I'm also looking into what parts of the code can be streamlined.

tezcatlip0ca
Avisaru
Posts: 385
Joined: Fri Mar 12, 2010 6:30 pm

Post by tezcatlip0ca »

I'd like a probability function, like "change /@/ to /u/ after labialized consonants in 85% of words"...

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Aid'os wrote:I'd like a probability function, like "change /@/ to /u/ after labialized consonants in 85% of words"...
This came up in Bricka's thread a few months ago. I forget exactly what the consensus came to, but it didn't seem like a very good idea, given that sound changes really don't happen that way. Or, they don't happen that way on moderately large time scales.

Even when some sound changes are probabilistic, word frequency is the biggest factor.

Torco
Smeric
Posts: 2372
Joined: Thu Aug 30, 2007 10:45 pm
Location: Santiago de Chile

Post by Torco »

TheGoatMan wrote:
Aid'os wrote:I'd like a probability function, like "change /@/ to /u/ after labialized consonants in 85% of words"...
This came up in Bricka's thread a few months ago. I forget exactly what the consensus came to, but it didn't seem like a very good idea, given that sound changes really don't happen that way. Or, they don't happen that way on moderately large time scales.

Even when some sound changes are probabilistic, word frequency is the biggest factor.
But some changes overlap with others, right? like during the Great Lenition of /t/ into /T/, some plosives got aspirated, so some /t/s turn to [T] and others to [t_h]. Or not ? :D

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Torco wrote:But some changes overlap with others, right? like during the Great Lenition of /t/ into /T/, some plosives got aspirated, so some /t/s turn to [T] and others to [t_h]. Or not ? :D
I'm not sure what you mean. As I understand it, certain sound changes might occur in limited contexts (phonetically) and in certain frequent words. These changes can then spread throughout a population of speakers, and through a lexicon.

This is what happens with /s/ > [h] lenition in some varieties of American Spanish (I know this happens in Puerto Rico and the Dominican Republic), but TMK it isn't a regular sound law.

s > h / _Σ (or however you write a syllable boundary)

doesn't occur in 100% of words; maybe not even most. But I do recall reading that it occurs several times more frequently in the most common lexemes.

This also happens in Chicano English, where /t d/ are elided word-finally. Again, this is a lexical frequency phenomenon.

Cedh
Sanno
Posts: 938
Joined: Tue Nov 14, 2006 10:30 am
Location: Tübingen, Germany

Post by Cedh »

I think what Torco means is that two sound changes can start in different closely related dialects at the same time, eventually both applying to all words in the language, but not in the same order for all words. A natlang example can be found in the history of French, where the loss of unstressed medial vowels interfered with intervocalic voicing of obstruents. MANICA > manche did not undergo voicing of its medial *kʲ because the vowel was dropped before lenition could apply. In the phonologically parallel GRANICA > grange, however, *kʲ was still found between vowels at this time, and so it became *gʲ.

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

cedh audmanh wrote:I think what Torco means is that two sound changes can start in different closely related dialects at the same time, eventually both applying to all words in the language, but not in the same order for all words. A natlang example can be found in the history of French, where the loss of unstressed medial vowels interfered with intervocalic voicing of obstruents. MANICA > manche did not undergo voicing of its medial *kʲ because the vowel was dropped before lenition could apply. In the phonologically parallel GRANICA > grange, however, *kʲ was still found between vowels at this time, and so it became *gʲ.
Fascinating.

[edit]
I don't feel like bumping the thread; I've released v0.0.3 with a few minor fixes, and a little bit of streamlining. It runs a couple of seconds faster on the PKK rules.
[/edit]

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

v0.0.4 now released with CSV support!


Just use the -c or --csv command-line switch to parse your lexicon as a CSV rather than a normal text file. No need to change your rules for compatibility; #_ and _# should work as expected.
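
How ASCA reads CSV internally is not described, so the following Python sketch is only a guess at the general shape: treat every cell as its own word, so that word-boundary conditions line up with cell edges, and write the cells back out in the same layout. The function apply_to_csv is hypothetical.

```python
import csv
import io

def apply_to_csv(text, change):
    """Run `change` over every cell of CSV text and return the new CSV text."""
    rows = csv.reader(io.StringIO(text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in rows:
        # Each cell is handed to the sound changes as a separate word.
        writer.writerow([change(cell) for cell in row])
    return output.getvalue()

result = apply_to_csv('pata,"ta,pa"\n', lambda word: word.replace("p", "b"))
```

Using a real CSV reader instead of splitting on commas keeps quoted cells such as "ta,pa" intact.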

See the ASCA page

or

Direct Download

Also, could someone respond to this thread and tell me that I didn't waste my time developing this monstrosity?

vohpenonomae
N'guny
Posts: 91
Joined: Sat Nov 02, 2002 4:23 am

Post by vohpenonomae »

TheGoatMan wrote:Also, could someone respond to this thread and tell me that I didn't waste my time developing this monstrosity?
If you'd done it 6 months ago, I'd have used it for Central Mountain. I ended up using Mark's Sounds, for two reasons--(1) No other SCA had all the features I needed, so they were all basically equivalent in terms of function to me; and (2) all other SCAs had bugs, making them unreliable. The more complicated the programming, the more bugs; and I needed something that was rock-solid, and nothing is solider than Sounds. I ended up having to alter Sounds slightly (enlarging buffers and the like) and re-compiling it, but it works great; I'd have loved not to have had to use so many placeholders and lengthy work-arounds, but I got what I wanted in the end.

Still, I might have a use for your project. Sounds can't handle Unicode, and all my orthographies use Unicode symbols; I've been using small VSCA transformation files to translate from my orthographies to Sounds-readable symbols and back again, but VSCA won't let me use [ and ] as characters, and I use these for n_yw and m_yw in Kapakwonak's Sounds file; I've been translating [ and ] to other things before orthographical translations are applied. But I'd welcome a chance to omit this step; I could easily rewrite my translation files to your format.
"On that island lies the flesh and bone of the Great Charging Bear, for as long as the grass grows and water runs," he said. "Where his spirit dwells, no one can say."

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

vohpenonomae wrote:If you'd done it 6 months ago, I'd have used it for Central Mountain.
Unfortunately, I was too busy with school to make any good progress, though I'd already begun. I think I ran into some kind of problem and didn't get back to fix it.

VSCA was a little buggy, but only with variable handling, as far as I knew. I discovered this early on and learned to just write my rules more carefully.

IIRC, a rule like

Code: Select all

[iX][uX]/īū/_
gets run as

Code: Select all

[iX][uX]/ūū/_
so I just got into the habit of writing

Code: Select all

[iX]/ī/_
[uX]/ū/_
which was an admitted pain, but that appeared to be the only bug. Other rules ran correctly.

My biggest concern with ASCA currently is that any rule involving "p" will also change the "p" in "pʰ" unless there is a postcondition. With one, a rule such as p > b / _@V generates a change list like this:

Code: Select all

pa > ba
pe > be
pi > bi
po > bo
pu > bu
This will be easy to fix with UNLESS statements. But as you correctly observe, the more I build into this system, the more likely something is to go horribly wrong.

There is another way around this problem, but I'll have to give it some thought. I have already written an algorithm for splitting IPA strings into arrays of strings based on their codepoints so that diacritics get grouped with the right character, and I think I know how to use this with ASCA, but it could easily have undesirable results, and could make the program much more rigid; were I to add this feature, it would have to be switched on.
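
The splitting algorithm itself is not shown in the thread. As a stand-in, this Python sketch groups characters by Unicode general category, attaching combining marks (Mn) and modifier letters (Lm, which covers ʰ and ʷ) to the preceding base character. Both function names are hypothetical, and things like tie bars or prefixed diacritics are not handled.

```python
import unicodedata

def segment(word):
    """Split a word into segments, attaching diacritics to the preceding base."""
    segments = []
    for char in word:
        attaches = unicodedata.category(char) in ("Mn", "Lm")
        if attaches and segments:
            segments[-1] += char
        else:
            segments.append(char)
    return segments

def replace_segment(word, old, new):
    """Replace whole segments only, so 'p' never matches inside 'pʰ'."""
    return "".join(new if seg == old else seg for seg in segment(word))
```

Here "pʰapa".replace("p", "b") yields "bʰaba", whereas replace_segment("pʰapa", "p", "b") yields "pʰaba".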

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Nothing new to report, really. I'm currently working on another version of ASCA.

v0.0.4 remains fundamentally stable, but can't handle complex conditions; or, it can, but if a condition corresponds to a large number of possible strings, it will take forever to run. As long as you don't run rules that span more than one (maybe two) syllables, it should be fine. Remember, if there are 1,000 unique syllables possible in a language, then a one-syllable rule will have to search 1,000 strings; two syllables means 1,000,000, and so on.

ASCA works by generating all of these combinations as soon as the rule gets read, and then using the compiled literals to do find-and-replace operations. With small rules, it is very fast, since there is very little overhead.
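
A compact Python sketch of that compile-then-replace idea, using the p > b / _@V rule from the earlier post as the example. The function compile_rule and the slot representation (a list of alternatives per position) are assumptions made for the illustration, not ASCA's data structures.

```python
from itertools import product

def compile_rule(initials, finals, pre_slots, post_slots):
    """Expand one rule into a list of literal (search, replacement) pairs."""
    changes = []
    for initial, final in zip(initials, finals):
        for pre in product(*pre_slots):
            for post in product(*post_slots):
                before, after = "".join(pre), "".join(post)
                changes.append((before + initial + after, before + final + after))
    return changes

V = ["a", "e", "i", "o", "u"]
changes = compile_rule(["p"], ["b"], [], [V])   # p > b / _@V

# The list grows multiplicatively with every extra slot in the Condition.
syllables = 1000
growth = [syllables ** n for n in (1, 2, 3)]
```

Each extra slot multiplies the list length by the size of that slot, which is why conditions spanning several syllables blow up so quickly.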

I don't think this is how other SCAs work, and probably for good reason. For an analogy, compare a linear function with a higher-degree polynomial. At first, while x < 1, the linear function has the greater value, but above 1, the polynomial's values explode. I think something like that is happening here.

I'm working on a different algorithm that will have more balanced performance, but this will probably take me a while, so I might not have any update for at least a week.

Fajrulo
Niš
Posts: 1
Joined: Fri Jul 09, 2010 5:05 pm
Location: CO, USA

Post by Fajrulo »

TheGoatMan wrote:I'm working on a different algorithm that will have more balanced performance, but this will probably take me a while, so I might not have any update for at least a week.
I excitedly await this revamp. ASCA has proven delightful thus far, but I relish the chance to de-comment some of my changes that presuppose optional clusters of 3 or 4 consonants.
"OK, I think I've figured this thing out. We can go up and down, but not side to side or back in time."

--Homer Simpson

Morrígan
Avisaru
Posts: 396
Joined: Thu Sep 09, 2004 9:33 am
Location: Wizard Tower

Post by Morrígan »

Fajrulo wrote:I excitedly await this revamp. ASCA has proven delightful thus far, but I relish the chance to de-comment some of my changes that presuppose optional clusters of 3 or 4 consonants.
Yeah, exactly the problem I'm trying to deal with. I'm hoping to have version 0.0.5 out over the weekend, but unforeseen bugs could always get in the way.

I think I will also be taking restrictions off of what variables can be named. As long as you are careful, I'm going to allow the use of any characters not already reserved, with the recommendation that names fall into three types:

One-Character variables (V, C, N, ...)
Short names with @ (@VS, @VL, @RES)
Long names with [] ([+Obstruent], [Vowel], [Consonant])

I originally used the @ sigil because I wanted variable names to be longer than one character, but needed to differentiate between two that appeared next to each other.

Then I realized you could still use this system even if I didn't enforce it in the code, so why not make the variable-naming more open?
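
If the three recommended naming styles are followed, adjacent names can still be told apart mechanically. The Python sketch below shows one way, under the assumption that bracketed names contain no nested brackets and @ names use capitals and digits; the pattern NAME and the function tokenize are invented for this example.

```python
import re

# A [bracketed] name, an @NAME, or any single character.
NAME = re.compile(r"\[[^\[\]]+\]|@[A-Z0-9]+|.")

def tokenize(text, variables):
    """Split text into (kind, token) pairs, recognising defined variables."""
    result = []
    for token in NAME.findall(text):
        kind = "variable" if token in variables else "literal"
        result.append((kind, token))
    return result

variables = {"V", "@VS", "[Vowel]"}
tokens = tokenize("_V@VS[Vowel]a", variables)
```

The trade-off of one-character names shows up here too: once V is defined as a variable, a literal V can no longer appear in a rule.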

Anyway, back to work for me.
