How do you store your lexicon?
Re: How do you store your lexicon?
Chuma wrote: I'm not sure I understand your way of doing it - why does tbl_pattern have a root_id?
I thought in the wrong direction there, I think. Rather, you'd link the matching pattern_id to the root. You'd still somehow need a list of which patterns can apply to which root, though. A mediating table of the format translation(translation_id, root_id, pattern_id, translation) would be a suitable strategy, I think. Now I need to read up on what a NATURAL JOIN is.
Re: How do you store your lexicon?
I've never done any database administration before, but SQLite seemed like a good place to start (that way I don't need to worry as much about the complex overhead of MySQL or something like that). Come to think of it, my programming skills have kind of gone out the window since I quit university; LaTeX and HTML are all I do these days, ignoring my use of the CLI on my Arch Linux system. So much for aspiring to be a geek.
Anyhoo... how about:
Code: Select all
tbl_root (root_id, root, semantic_meaning)
tbl_pattern (pattern_id, pattern, type)
tbl_translation (translation_id, root_id, pattern_id, translation, example, note)
I think that will cover most cases; perhaps a date_added field may be useful for documenting the growth of the lexicon. And I'm not sure whether pronunciation would be necessary, but it could be useful as well.
Can a table be extended with extra columns after creation? Or would I have to allocate a new table?
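On the extra-columns question: SQLite does let you append columns to an existing table with ALTER TABLE ... ADD COLUMN, so a new table shouldn't be needed. Below is a minimal sketch in Python (its standard sqlite3 module) rather than PHP, using the three tables listed above. The column types, the NOT NULL constraints and the helper names build_lexicon and column_names are my own assumptions, not part of the original plan.

```python
import sqlite3

def build_lexicon(conn):
    """Create the three tables sketched above (column types are assumptions)."""
    conn.executescript("""
        CREATE TABLE tbl_root (
            root_id INTEGER PRIMARY KEY,
            root TEXT NOT NULL,
            semantic_meaning TEXT
        );
        CREATE TABLE tbl_pattern (
            pattern_id INTEGER PRIMARY KEY,
            pattern TEXT NOT NULL,
            type TEXT
        );
        CREATE TABLE tbl_translation (
            translation_id INTEGER PRIMARY KEY,
            root_id INTEGER NOT NULL REFERENCES tbl_root(root_id),
            pattern_id INTEGER NOT NULL REFERENCES tbl_pattern(pattern_id),
            translation TEXT,
            example TEXT,
            note TEXT
        );
    """)

def column_names(conn, table):
    """List a table's columns; PRAGMA table_info puts the name at index 1."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
build_lexicon(conn)

# Columns can be appended after creation; existing rows get NULL in them.
conn.execute("ALTER TABLE tbl_root ADD COLUMN date_added TEXT")
conn.execute("ALTER TABLE tbl_translation ADD COLUMN pronunciation TEXT")

print(column_names(conn, "tbl_root"))
```

New columns always land at the end of the table, which is fine here since nothing depends on column order.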
Re: How do you store your lexicon?
What does "type" mean? And what sort of frontend did you have in mind to query the database?
Re: How do you store your lexicon?
"Type" describes whether the pattern is nominal, adjectival, etc, but in more specific detail. So, for example, k-s-t is the root for words representing writing, and -a-oi- is the pattern for a person that performs the verb; putting them together yields kasoit "writer/author/...". The type field would list what the pattern is used for, in its broadest sense.
I was thinking of writing a PHP front-end, which allows searching and modifying the dictionary. Later on if I even decided to host my conlangs I could always write a read-only front-end, and keep the local one for editing the contents.
Re: How do you store your lexicon?
I wish I knew any programming beyond html and css. I feel useless reading all of this. I want a small app that has four fields:
[original word] [part of speech] [definition(s)] [etymology]
I want it to give me an XML document that looks like this:
<entry>
<word>crap</word><pos>n.</pos><def>kind of poop</def><etym>17th Century France</etym>
</entry>
Or even just an HTML document with entry turned into <p>, word into <span class="word">, pos into <span class="pos">, etc. These can then be styled with CSS later or in InDesign.
Handling this kind of code is a nightmare through a text editor and browser or Dreamweaver.
And I want it to work on my Mac.
I have a feeling this is entirely doable and not even hard. I just need to find out how.
That's all.
vec
Re: How do you store your lexicon?
Ooh ooh. And then it can also export an xml that only has the original words for inputting into a Sound Change applier. And even better: the Sound Change applier is built in.
How does this sound?
vec
Re: How do you store your lexicon?
vecfaranti wrote: I want a small app that has four fields:
[original word] [part of speech] [definition(s)] [etymology]
Is a text file with tab-separated fields okay?
Pardon my French, or rather Perl:
Code: Select all
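# Read tab-separated lines from the files named on the command line (or STDIN).
while (<>) {
    chomp;
    # Capture the four fields and wrap each one in its tag ($1..$4 are the captures).
    s%^([^\t]+)\t+([^\t]+)\t+([^\t]+)\t+([^\t]+)$%<word>$1</word><pos>$2</pos><def>$3</def><etym>$4</etym>%;
    print "<entry>\n" . $_ . "\n</entry>\n\n";
}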
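For anyone who would rather not read Perl, here is a rough Python sketch of the same idea, under a few assumptions of mine: exactly one tab between fields (the Perl above tolerates several), the four tag names from vecfaranti's example, and XML escaping of the field text so a stray & or < doesn't break the output. The function names line_to_entry and words_only are made up for illustration; words_only covers the "export only the original words" wish for feeding a sound change applier.

```python
from xml.sax.saxutils import escape

# Tag names taken from the <entry> example earlier in the thread.
FIELDS = ("word", "pos", "def", "etym")

def line_to_entry(line):
    """Turn one tab-separated line into an <entry> element string."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    # escape() replaces &, < and > so the field text is safe inside XML.
    inner = "".join(f"<{tag}>{escape(value)}</{tag}>"
                    for tag, value in zip(FIELDS, values))
    return f"<entry>\n{inner}\n</entry>"

def words_only(lines):
    """Return just the first column of every non-blank line."""
    return [line.split("\t", 1)[0] for line in lines if line.strip()]

sample = ["crap\tn.\tkind of poop\t17th Century France\n"]
for line in sample:
    print(line_to_entry(line))
print(words_only(sample))
```

Saved as a script, this could read a real file instead of the sample list; it should run on a Mac as-is since it only uses the standard library.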
Re: How do you store your lexicon?
vecfaranti wrote: I wish I knew any programming beyond html and css.
I'm not far beyond that; I studied a bit of programming in university but other than that don't get it nearly as well as I might appear. The hardest part of this (or indeed, any project) is trying to work out exactly what the requirements are; I've hit what feels like a chicken-and-egg scenario.
Re: How do you store your lexicon?
hito wrote: Perl
Looks good. Or like this:
Code: Select all
for(<>){s%(.*)\t(.*)\t(.*)\t(.*)\n%<entry><word>\1</word><pos>\2</pos><def>\3</def><etym>\4</etym></entry>\n%;print}
Code: Select all
for(<>){
chomp;
@ARGV=split /\t/; # split on tabs only, so multi-word definitions stay in one field
print '<entry><word>',shift,'</word><pos>',shift,'</pos><def>',shift,'</def><etym>',shift,"</etym></entry>\n"}
I think this might also work:
Code: Select all
for(<>){
@ARGV=("</etym></entry>\n",'</def><etym>','</pos><def>','</word><pos>','<entry><word>');
s/^/pop/e;
s/\t/pop/e;
s/\t/pop/e;
s/\t/pop/e;
s/\n/pop/e;
print;
}
Re: How do you store your lexicon?
Perl = Pathologically Eclectic Rubbish Lister
In all seriousness though, I might be able to make a start on this dictionary database soon. All I need to do is work out the basic set of patterns I need (I should be able to extend it later though), and fix up my Nginx configuration so my localhost site is better organised. Setting the document root to ~/srv/www/ was a shortcut I'm really wishing I hadn't taken now. Ugh. I hate configuring servers; Nginx is fast but the documentation is poor.
Re: How do you store your lexicon?
I use Apache, made a folder /var/www/myusername, chmodded that to 777 and made a shortcut there in ~/ so that I can access stuff from http://localhost/myusername now.
Re: How do you store your lexicon?
Guitarplayer wrote: I use Apache, made a folder /var/www/myusername, chmodded that to 777 and made a shortcut there in ~/ so that I can access stuff from http://localhost/myusername now.
Mine is the inverse: ~/srv is my www-directory, and I've symlinked it to /srv/http/‹myusername› so Nginx will find it. My main site is in ~/srv/www/zgplace which has been linked to /srv/http/zgplace so the absolute links (used for url rewriting) will work; symlinking is easier than manually editing however many files I have to the correct links. The PHP script I use to stitch together my site works well enough, although I suspect it has a few holes in it that allow someone to peek inside directories not usually accessible o_O so I will definitely fix that before I host it to the world.
If you're only using your Apache server for a local site, Nginx might be worth a look, as it's lightweight and relatively fast. I don't know how well it handles mySQL databases though, as I've never used mySQL for anything, although chances are it works quite well. SQLite has a PHP module so I can use it directly without any complicated client-server setup.
Re: How do you store your lexicon?
See what I've just found out: http://zbb.spinnwebe.com/viewtopic.php? ... 28#p847028
Re: How do you store your lexicon?
That's cool. I wonder if SQLite has something similar to this as well?
I'm still stuck on the planning/implementation stage. I had to work out the vowel patterns and root forms before I could even begin to code something, and I've almost got that worked out, but trying to work out the tables I'll need will be a headache.
Each root has (in theory) up to nine forms, which are secondary patterns. Then the nominalisations and common forms leave me with at least 36 patterns, and I've not added all of them to the grammar yet. I think I have enough though, as a new pattern will simply be another row in the table.
The roots table(s) will be the simplest:
Code: Select all
tbl_root (root_id, root, length, semantic_meaning, r_date_added)
The length field specifies the number of consonants, two or three. I've decided to avoid quadriliteral roots for the time being, but I guess I can always add them later.
The pattern table shouldn't be too difficult either:
Code: Select all
tbl_pattern (pattern_id, pattern_triliteral, pattern_biliteral, root_form, p_type, p_date_added)
I need two patterns, as biliteral and triliteral roots have slightly different forms. If I decide to add a set of quadriliteral patterns at a later date, I'd need to create a new table containing the set of patterns. Alternatively, I could split the pattern table into:
Code: Select all
tbl_pattern (pattern_id, root_form, p_type, p_date_added)
tbl_biliteral (pattern_id, pattern_biliteral)
tbl_triliteral (pattern_id, pattern_triliteral)
...
This may make implementation easier, as I can break a word down into its constituent parts. It's also extensible.
But combining them all with a translation is going to be difficult:
Code: Select all
tbl_word (word_id, word, root_id, pattern_id, root_form, translation, example, note, t_date_added)
I really want to skip having a word field: surely you can do some kind of regex to get the computer to assemble the roots+patterns into words? I mean, if pattern_triliteral contains 1a2oi3 and root contains ntl, I could split the root into three characters and simply replace 1,2,3 with n,t,l.
The join alone for selecting the root+pattern combination is going to be complicated. And I know no SQL whatsoever. Erk.
This is so typically me. Over-engineering a task and making it so overwhelmingly complex that I never get around to implementing it because I don't even know the basics. Part of the reason I never finish anything.
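The root+pattern assembly with 1,2,3 as placeholders is small enough to sketch. The version below is Python rather than PHP or a regex, and it assumes every root consonant is a single character (digraphs would need the root stored as a list instead of a string). The function name apply_pattern is hypothetical.

```python
def apply_pattern(root, pattern):
    """Replace each digit n in the pattern with the n-th consonant of the root."""
    pieces = []
    for ch in pattern:
        if ch in "123456789":
            index = int(ch) - 1
            if index >= len(root):
                raise ValueError(f"pattern {pattern!r} needs consonant {ch}, "
                                 f"but root {root!r} is too short")
            pieces.append(root[index])
        else:
            # Vowels and any other pattern material are copied through unchanged.
            pieces.append(ch)
    return "".join(pieces)

# The two roots mentioned in the thread, with the 1a2oi3 pattern.
print(apply_pattern("ntl", "1a2oi3"))
print(apply_pattern("kst", "1a2oi3"))
```

With something like this, the word column in tbl_word could be computed on demand instead of stored, at the cost of not being able to search it directly in SQL.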
Re: How do you store your lexicon?
Zoqaeski wrote: This is so typically me. Over-engineering a task and making it so overwhelmingly complex that I never get around to implementing it because I don't even know the basics. Part of the reason I never finish anything.
Ditto, me -_-
Re: How do you store your lexicon?
*Today we are cancelling the Apocalypse*
Last edited by Izambri on Thu Nov 28, 2013 6:09 pm, edited 1 time in total.
Un llapis mai dibuixa sense una mà.
Re: How do you store your lexicon?
Argh. Three hours of playing around with PostgreSQL and I have no idea what I'm doing. I just can't plan it right; every time I write something I change my mind because I come up with a "better way".
Still no luck. How can I have "sub-tables"? patterns_biliteral and patterns_triliteral inherit patterns so if I add a new row to one it should show up in all three. Then I can just fill in the appropriate blanks. But it doesn't.
Re: How do you store your lexicon?
So I asked a couple of questions on Stack Overflow and it turns out that this will be a lot easier than I'd thought. After reading the documentation much more thoroughly I figured out how I can link my tables to one another as required, e.g. tbl_patterns_triliteral depends on tbl_patterns for the pattern_id. The best solution I've come up with (after modifying an answer from Stack Overflow) for splicing the roots and patterns together is this:
Code: Select all
SELECT
root,
root_i as pattern_i,
-- string literals take single quotes; "123" would be read as a column name
translate(root_i, '123', array_to_string(root,'')) as word_i
FROM tbl_roots
NATURAL JOIN tbl_patterns
NATURAL JOIN tbl_patterns_triliteral
WHERE root IS NOT NULL AND root_i IS NOT NULL;
If I can somehow turn the above into a function of some sort which will dynamically select column root_i–root_ix from the value of root_form (1–9) and tbl_patterns_biliteral or tbl_patterns_triliteral from either the length of root or the value of root_length, I'll have about half of the database code planning finished. The next step will be searching the database for roots/words/patterns and that'll probably be much more difficult to work out, as some sort of text-searching ability will be required.
The front-end will be written in PHP, so coding a parser to output lexical entries will require a lot of thought as to representation of the data in a meaningful way. I quite like how @Guitarplayer's Ayeri dictionary displays the search results.
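As a point of comparison, here is a rough sketch of the same splice done in SQLite from Python instead of PostgreSQL. This is a swapped-in technique, not the query above: as far as I know SQLite has no built-in translate() or array types, so the sketch registers a small Python function (splice, a made-up name) and picks the biliteral or triliteral pattern with a CASE on the root's length. The table layout follows the earlier tbl_root / tbl_pattern / tbl_word plan in simplified form with explicit joins; the biliteral root mn, the pattern 1aoi2 and its gloss are invented purely to exercise the CASE branch.

```python
import sqlite3

def splice(root, pattern):
    """Fill the digit slots of a pattern with the consonants of the root."""
    return "".join(root[int(c) - 1] if c in "123" else c for c in pattern)

conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL as splice(root, pattern).
conn.create_function("splice", 2, splice)

conn.executescript("""
    CREATE TABLE tbl_root (root_id INTEGER PRIMARY KEY, root TEXT, semantic_meaning TEXT);
    CREATE TABLE tbl_pattern (pattern_id INTEGER PRIMARY KEY,
                              pattern_triliteral TEXT, pattern_biliteral TEXT, p_type TEXT);
    CREATE TABLE tbl_word (word_id INTEGER PRIMARY KEY,
                           root_id INTEGER, pattern_id INTEGER, translation TEXT);
    INSERT INTO tbl_root VALUES (1, 'kst', 'writing');
    INSERT INTO tbl_root VALUES (2, 'mn', 'made-up biliteral root');
    INSERT INTO tbl_pattern VALUES (1, '1a2oi3', '1aoi2', 'agent noun');
    INSERT INTO tbl_word VALUES (1, 1, 1, 'writer');
    INSERT INTO tbl_word VALUES (2, 2, 1, 'made-up gloss');
""")

rows = conn.execute("""
    SELECT r.root,
           splice(r.root,
                  CASE length(r.root)
                      WHEN 2 THEN p.pattern_biliteral
                      ELSE p.pattern_triliteral
                  END) AS word,
           w.translation
    FROM tbl_word AS w
    JOIN tbl_root AS r ON r.root_id = w.root_id
    JOIN tbl_pattern AS p ON p.pattern_id = w.pattern_id
    ORDER BY w.word_id
""").fetchall()

for row in rows:
    print(row)
```

Explicit JOIN ... ON conditions are used here instead of NATURAL JOIN so that the join doesn't silently change if two tables ever gain another column with the same name.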
Re: How do you store your lexicon?
Zoqaeski wrote: I quite like how @Guitarplayer's Ayeri dictionary displays the search results.
That has lots of dirty coding to do that, though. I'm currently thinking about how to simplify it. Even LEO or dict.cc don't bother with collapsing multiple translations into a single displayed record, but instead both use tables. I think I'll do that, too. That's not as elegant, but more straightforward and more adaptable than reading everything into a huge-ass array and then manipulating that with PHP.
Also, @tags don't work here; phpBB is not so "socialized" yet.