How do you store your lexicon?

Cockroach
Lebom
Posts: 154
Joined: Thu Jun 23, 2005 9:26 pm
Location: Seattle Metropolitan Area

Re: How do you store your lexicon?

Post by Cockroach »

Word

Jipí
Smeric
Posts: 1128
Joined: Sat Apr 12, 2003 1:48 pm
Location: Litareng, Keynami

Re: How do you store your lexicon?

Post by Jipí »

Chuma wrote: I'm not sure I understand your way of doing it - why does tbl_pattern have a root_id?
I think I was thinking in the wrong direction there. Rather, you'd link the matching pattern_id to the root. You'd still need a list of which patterns can apply to which root, though. A mediating table of the form translation(translation_id, root_id, pattern_id, translation) would be a suitable strategy, I think. Now I need to read up on what a NATURAL JOIN is.
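
For anyone else wondering the same thing, here is a rough sketch of the mediating-table idea using Python's built-in sqlite3 module. The tbl_ names follow the naming used elsewhere in this thread, and the sample rows are placeholders rather than real lexicon data. A NATURAL JOIN simply joins on every column name the two tables share, which here means root_id and pattern_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_root (root_id INTEGER PRIMARY KEY, root TEXT);
    CREATE TABLE tbl_pattern (pattern_id INTEGER PRIMARY KEY, pattern TEXT);
    -- Mediating table: one row per root/pattern pair that actually exists.
    CREATE TABLE tbl_translation (
        translation_id INTEGER PRIMARY KEY,
        root_id INTEGER REFERENCES tbl_root(root_id),
        pattern_id INTEGER REFERENCES tbl_pattern(pattern_id),
        translation TEXT
    );
    -- Placeholder rows, for illustration only.
    INSERT INTO tbl_root VALUES (1, 'kst');
    INSERT INTO tbl_pattern VALUES (1, '-a-oi-');
    INSERT INTO tbl_translation VALUES (1, 1, 1, 'writer');
""")

# NATURAL JOIN matches on the shared column names (root_id, then pattern_id).
rows = conn.execute("""
    SELECT root, pattern, translation
    FROM tbl_translation
        NATURAL JOIN tbl_root
        NATURAL JOIN tbl_pattern
""").fetchall()
print(rows)
```

The catch with NATURAL JOIN is that any accidentally shared column name (say, a note column in two tables) silently becomes part of the join condition, so explicit ON clauses may be the safer habit.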

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

I've never done any database administration before, but SQLite seemed like a good place to start (that way I don't need to worry as much about the overhead of MySQL or something like that). Come to think of it, my programming skills have kind of gone out the window since I quit university; LaTeX and HTML are all I do these days, ignoring my use of the CLI on my Arch Linux system. So much for aspiring to be a geek.

Anyhoo... how about:

Code:

tbl_root (root_id, root, semantic_meaning)
tbl_pattern (pattern_id, pattern, type)
tbl_translation (translation_id, root_id, pattern_id, translation, example, note)
I think that will cover most cases; perhaps a date_added field may be useful for documenting the growth of the lexicon. And I'm not sure whether pronunciation would be necessary, but it could be useful as well.

Can a table be extended with extra columns after creation? Or would I have to allocate a new table?
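
On the question of extending a table: SQLite does support ALTER TABLE ... ADD COLUMN, so a new table should not be needed for something like date_added. A small sketch with Python's sqlite3 module follows; the sample row is made up, and existing rows simply get NULL in the new column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_root ("
    "root_id INTEGER PRIMARY KEY, root TEXT, semantic_meaning TEXT)"
)
# Made-up sample row, added before the table is extended.
conn.execute("INSERT INTO tbl_root (root, semantic_meaning) VALUES ('kst', 'writing')")

# Extend the existing table in place with an extra column.
conn.execute("ALTER TABLE tbl_root ADD COLUMN date_added TEXT")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(tbl_root)")]
existing = conn.execute("SELECT root, date_added FROM tbl_root").fetchall()
print(columns)
print(existing)
```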
Zoqaëski : /θoˈca.jes.ki/
Last.fm
Twitter

Chuma
Avisaru
Posts: 387
Joined: Sat Oct 28, 2006 9:01 pm
Location: Hyperborea

Re: How do you store your lexicon?

Post by Chuma »

What does "type" mean? And what sort of frontend did you have in mind to query the database?

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

"Type" describes whether the pattern is nominal, adjectival, etc, but in more specific detail. So, for example, k-s-t is the root for words representing writing, and -a-oi- is the pattern for a person that performs the verb; putting them together yields kasoit "writer/author/...". The type field would list what the pattern is used for, in its broadest sense.

I was thinking of writing a PHP front-end which allows searching and modifying the dictionary. Later on, if I ever decide to host my conlangs, I could always write a read-only front-end and keep the local one for editing the contents.
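
To make the root-plus-pattern step concrete (k-s-t plus -a-oi- giving kasoit), here is a minimal Python sketch. The function name apply_pattern is made up for illustration, and it assumes every hyphen in the pattern is a slot for the next consonant of the root.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill each '-' slot in the pattern with the next consonant of the root."""
    consonants = [c for c in root if c != "-"]
    if pattern.count("-") != len(consonants):
        raise ValueError("pattern slots and root consonants do not match")
    remaining = iter(consonants)
    # Copy vowels through unchanged; replace each slot with a consonant.
    return "".join(next(remaining) if ch == "-" else ch for ch in pattern)


word = apply_pattern("k-s-t", "-a-oi-")
print(word)
```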

vec
Avisaru
Posts: 639
Joined: Tue Sep 16, 2003 10:42 am
Location: Reykjavík, Iceland

Re: How do you store your lexicon?

Post by vec »

I wish I knew any programming beyond html and css. I feel useless reading all of this. I want a small app that has four fields:

[original word] [part of speech] [definition(s)] [etymology]

I want it to give me an XML document that looks like this:

<entry>
<word>crap</word><pos>n.</pos><def>kind of poop</def><etym>17th Century France</etym>
</entry>

Or even just an HTML document with entry turned into <p>, word into <span class="word">, pos into <span class="pos">, etc. These can then be styled with CSS later or in InDesign.
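
As a rough idea of how little code the HTML variant needs, here is a Python sketch. The function name entry_to_html and the class names "def" and "etym" are assumptions (only "word" and "pos" are spelled out above), and html.escape is there so that characters like & or < in a definition don't break the markup.

```python
from html import escape


def entry_to_html(word: str, pos: str, definition: str, etymology: str) -> str:
    """Render one lexicon entry as a <p> containing one <span> per field."""
    fields = (("word", word), ("pos", pos), ("def", definition), ("etym", etymology))
    spans = " ".join(
        f'<span class="{css_class}">{escape(text)}</span>'
        for css_class, text in fields
    )
    return f"<p>{spans}</p>"


html_line = entry_to_html("crap", "n.", "kind of poop", "17th Century France")
print(html_line)
```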

Handling this kind of code is a nightmare through a text editor and browser or Dreamweaver.

And I want it to work on my Mac.

I have a feeling this is entirely doable and not even hard. I just need to find out how.

That's all.
vec

vec
Avisaru
Posts: 639
Joined: Tue Sep 16, 2003 10:42 am
Location: Reykjavík, Iceland

Re: How do you store your lexicon?

Post by vec »

Ooh ooh. And then it can also export an XML file that only has the original words, for input into a sound change applier. And even better: the sound change applier is built in.

How does this sound?

Bob Johnson
Avisaru
Posts: 704
Joined: Fri Dec 03, 2010 9:41 am
Location: NY, USA

Re: How do you store your lexicon?

Post by Bob Johnson »

vecfaranti wrote:I want a small app that has four fields:

[original word] [part of speech] [definition(s)] [etymology]
Is a text file with tab-separated fields okay?

Pardon my French, or rather Perl:

Code:

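# Read tab-separated lines (word, part of speech, definition, etymology)
# from standard input or from the files named on the command line.
while (<>) {
  chomp;
  # Wrap the four fields in XML tags; $1..$4 are the captured fields.
  s%^([^\t]+)\t+([^\t]+)\t+([^\t]+)\t+([^\t]+)$%<word>$1</word><pos>$2</pos><def>$3</def><etym>$4</etym>%;
  print "<entry>\n" . $_ . "\n</entry>\n\n";
}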
Perl was designed to mangle text files, not to be read.
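
For comparison, the same tab-separated-to-XML conversion can be sketched in Python. This is not a translation of the Perl above: it assumes exactly one tab between fields, skips any line that doesn't have four fields, and wraps everything in a <lexicon> root element, which is an addition of this sketch. The function name tsv_to_xml is made up. Letting xml.etree.ElementTree build the tags also takes care of escaping & and < in the text.

```python
import xml.etree.ElementTree as ET

TAGS = ("word", "pos", "def", "etym")


def tsv_to_xml(tsv_text: str) -> str:
    """Convert tab-separated lexicon lines into one XML string."""
    lexicon = ET.Element("lexicon")  # wrapper element (an assumption)
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) != len(TAGS):
            continue  # skip blank or malformed lines
        entry = ET.SubElement(lexicon, "entry")
        for tag, text in zip(TAGS, fields):
            ET.SubElement(entry, tag).text = text
    return ET.tostring(lexicon, encoding="unicode")


xml_text = tsv_to_xml("crap\tn.\tkind of poop\t17th Century France\n")
print(xml_text)
```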

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

vecfaranti wrote:I wish I knew any programming beyond html and css.
I'm not far beyond that; I studied a bit of programming at university, but other than that I don't get it nearly as well as it might appear. The hardest part of this (or indeed, any project) is trying to work out exactly what the requirements are; I've hit what feels like a chicken-and-egg scenario.

Chuma
Avisaru
Posts: 387
Joined: Sat Oct 28, 2006 9:01 pm
Location: Hyperborea

Re: How do you store your lexicon?

Post by Chuma »

hito wrote:Perl
Looks good. Or like this:

Code:

for(<>){s%(.*)\t(.*)\t(.*)\t(.*)\n%<entry><word>\1</word><pos>\2</pos><def>\3</def><etym>\4</etym></entry>\n%;print}
Here's another variant:

Code:

for(<>){
@ARGV=split;
print '<entry><word>',shift,'</word><pos>',shift,'</pos><def>',shift,'</def><etym>',shift,"</etym></entry>\n"}
That would split on spaces, so you might want "split /\t/;" instead.

I think this might also work:

Code:

for(<>){
@ARGV=("</etym></entry>\n",'</def><etym>','</pos><def>','</word><pos>','<entry><word>');
s/^/pop/e;
s/\t/pop/e;
s/\t/pop/e;
s/\t/pop/e;
s/\n/pop/e;
print;
}
Yay perl! :D

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

Perl = Pathologically Eclectic Rubbish Lister 8)

In all seriousness though, I might be able to make a start on this dictionary database soon. All I need to do is work out the basic set of patterns I need (I should be able to extend it later though), and fix up my Nginx configuration so my localhost site is better organised. Setting the document root to ~/srv/www/ was a shortcut I'm really wishing I hadn't taken now. Ugh. I hate configuring servers; Nginx is fast but the documentation is poor.

Jipí
Smeric
Posts: 1128
Joined: Sat Apr 12, 2003 1:48 pm
Location: Litareng, Keynami

Re: How do you store your lexicon?

Post by Jipí »

I use Apache, made a folder /var/www/myusername, chmodded that to 777 and made a shortcut there in ~/ so that I can access stuff from http://localhost/myusername now.

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

Guitarplayer wrote:I use Apache, made a folder /var/www/myusername, chmodded that to 777 and made a shortcut there in ~/ so that I can access stuff from http://localhost/myusername now.
Mine is the inverse: ~/srv is my www-directory, and I've symlinked it to /srv/http/‹myusername› so Nginx will find it. My main site is in ~/srv/www/zgplace, which has been linked to /srv/http/zgplace so the absolute links (used for URL rewriting) will work; symlinking is easier than manually editing however many files I have to use the correct links. The PHP script I use to stitch together my site works well enough, although I suspect it has a few holes in it that allow someone to peek inside directories not usually accessible o_O so I will definitely fix that before I make it public.

If you're only using your Apache server for a local site, Nginx might be worth a look, as it's lightweight and relatively fast. I don't know how well it works alongside MySQL databases though, as I've never used MySQL for anything, although chances are it works quite well. SQLite has a PHP module, so I can use it directly without any complicated client-server setup.

Jipí
Smeric
Posts: 1128
Joined: Sat Apr 12, 2003 1:48 pm
Location: Litareng, Keynami

Re: How do you store your lexicon?

Post by Jipí »


zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

That's cool. I wonder if SQLite has something similar to this as well?

I'm still stuck on the planning/implementation stage. I had to work out the vowel patterns and root forms before I could even begin to code something, and I've almost got that worked out but trying to work out the tables I'll need will be a headache.

Each root has (in theory) up to nine forms, which are secondary patterns. Then the nominalisations and common forms leave me with at least 36 patterns, and I've not added all of them to the grammar yet. I think I have enough though, as a new pattern will simply be another row in the table.

The roots table(s) will be the simplest:

Code:

tbl_root (root_id, root, length, semantic_meaning, r_date_added)
The length field specifies the number of consonants, two or three. I've decided to avoid quadriliteral roots for the time being, but I guess I can always add them later.

The pattern table shouldn't be too difficult either:

Code:

tbl_pattern (pattern_id, pattern_triliteral, pattern_biliteral, root_form, p_type, p_date_added)
I need two patterns, as biliteral and triliteral roots have slightly different forms. If I decide to add a set of quadriliteral patterns at a later date, I'd need to create a new table containing the set of patterns. Alternatively, I could split the pattern table into:

Code:

tbl_pattern (pattern_id, root_form, p_type, p_date_added)
tbl_biliteral (pattern_id, pattern_biliteral)
tbl_triliteral (pattern_id, pattern_triliteral)
...
This may make implementation easier, as I can break a word down into its constituent parts. It's also extensible :)
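
A sketch of how the split layout could look in practice, using Python's sqlite3 module. The column types, the sample rows and the 'agent noun' value for p_type are assumptions for illustration only. Each per-length table reuses pattern_id as its key, and a LEFT JOIN pulls in whichever forms exist for a pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_pattern (
        pattern_id INTEGER PRIMARY KEY,
        root_form INTEGER,
        p_type TEXT,
        p_date_added TEXT
    );
    -- One table per root length; each row points back at tbl_pattern.
    CREATE TABLE tbl_biliteral (
        pattern_id INTEGER PRIMARY KEY REFERENCES tbl_pattern(pattern_id),
        pattern_biliteral TEXT
    );
    CREATE TABLE tbl_triliteral (
        pattern_id INTEGER PRIMARY KEY REFERENCES tbl_pattern(pattern_id),
        pattern_triliteral TEXT
    );
    -- Sample data: a pattern that so far only has a triliteral form.
    INSERT INTO tbl_pattern (pattern_id, root_form, p_type) VALUES (1, 1, 'agent noun');
    INSERT INTO tbl_triliteral VALUES (1, '1a2oi3');
""")

# LEFT JOIN keeps the pattern row even where one of the forms is missing.
rows = conn.execute("""
    SELECT p.pattern_id, p.p_type, b.pattern_biliteral, t.pattern_triliteral
    FROM tbl_pattern AS p
        LEFT JOIN tbl_biliteral AS b ON b.pattern_id = p.pattern_id
        LEFT JOIN tbl_triliteral AS t ON t.pattern_id = p.pattern_id
""").fetchall()
print(rows)
```

Adding quadriliteral patterns later would then mean one more table of the same shape, with tbl_pattern left untouched.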

But combining them all with a translation is going to be difficult:

Code:

tbl_word (word_id, word, root_id, pattern_id, root_form, translation, example, note, t_date_added)
I really want to skip having a word field: surely you can do some kind of regex to get the computer to assemble the roots+patterns into words? I mean, if pattern_triliteral contains 1a2oi3 and root contains ntl, I could split the root into three characters and simply replace 1,2,3 with n,t,l.
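
That substitution doesn't even need a regex. A minimal Python sketch, assuming slots are written as the digits 1 to 9 and each root consonant is a single character (the function name assemble is made up):

```python
def assemble(root: str, pattern: str) -> str:
    """Replace slot digits 1, 2, 3, ... in the pattern with the root's consonants."""
    slots = "123456789"[: len(root)]
    # Map '1' to the first consonant, '2' to the second, and so on.
    return pattern.translate(str.maketrans(slots, root))


word = assemble("ntl", "1a2oi3")
print(word)
```

A biliteral root works the same way, as long as its pattern only uses slots 1 and 2. Consonants written with more than one character would need a different representation, such as a list of strings.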

The join alone for selecting the root+pattern combination is going to be complicated. And I know no SQL whatsoever. Erk.

This is so typically me. Over-engineering a task and making it so overwhelmingly complex that I never get around to implementing it because I don't even know the basics. Part of the reason I never finish anything.

Jipí
Smeric
Posts: 1128
Joined: Sat Apr 12, 2003 1:48 pm
Location: Litareng, Keynami

Re: How do you store your lexicon?

Post by Jipí »

Zoqaeski wrote:This is so typically me. Over-engineering a task and making it so overwhelmingly complex that I never get around to implementing it because I don't even know the basics. Part of the reason I never finish anything.
Ditto, me -_-

Izambri
Smeric
Posts: 1556
Joined: Sun Apr 04, 2004 4:27 pm
Location: Catalonia

Re: How do you store your lexicon?

Post by Izambri »

*Today we are cancelling the Apocalypse*
Last edited by Izambri on Thu Nov 28, 2013 6:09 pm, edited 1 time in total.
Un llapis mai dibuixa sense una mà.

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

Argh. Three hours of playing around with PostgreSQL and I have no idea what I'm doing. I just can't plan it right; every time I write something I change my mind because I come up with a "better way".

Still no luck. How can I have "sub-tables"? patterns_biliteral and patterns_triliteral inherit patterns so if I add a new row to one it should show up in all three. Then I can just fill in the appropriate blanks. But it doesn't.

zoqaëski
Sanci
Posts: 28
Joined: Tue Apr 05, 2005 7:01 am
Location: Hiding

Re: How do you store your lexicon?

Post by zoqaëski »

So I asked a couple of questions on Stack Overflow and it turns out that this will be a lot easier than I'd thought. After reading the documentation much more thoroughly, I figured out how I can link my tables to one another as required, e.g. tbl_patterns_triliteral depends on tbl_patterns for the pattern_id. The best solution I've come up with (after modifying an answer from Stack Overflow) for splicing the roots and patterns together is this:

Code:

SELECT
    root,
    root_i AS pattern_i,
    translate(root_i, '123', array_to_string(root, '')) AS word_i
FROM tbl_roots
    NATURAL JOIN tbl_patterns
    NATURAL JOIN tbl_patterns_triliteral
WHERE root IS NOT NULL AND root_i IS NOT NULL;
If I can somehow turn the above into a function of some sort which will dynamically select a column from root_i to root_ix based on the value of root_form (1–9), and tbl_patterns_biliteral or tbl_patterns_triliteral based on either the length of root or the value of root_length, I'll have about half of the database code planning finished. The next step will be searching the database for roots/words/patterns, and that'll probably be much more difficult to work out, as some sort of text-searching ability will be required.
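
One way such a function might look, sketched in Python with sqlite3 rather than as a PostgreSQL function. Everything here is an assumption layered on the query above: the helper names pattern_query and words_for are made up, the columns are assumed to be named root_i to root_ix, the root is a plain string instead of an array, and the splicing happens in Python because SQLite has no translate(). Column and table names can't be passed as bound parameters, so they are picked from fixed lists instead of being built from user input.

```python
import sqlite3

ROMAN = ("i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix")
TABLES = {2: "tbl_patterns_biliteral", 3: "tbl_patterns_triliteral"}


def pattern_query(root: str, root_form: int) -> str:
    """Build the SELECT for one root form, picking column and table by whitelist."""
    if not 1 <= root_form <= len(ROMAN):
        raise ValueError("root_form must be between 1 and 9")
    if len(root) not in TABLES:
        raise ValueError("only biliteral and triliteral roots are supported")
    column = "root_" + ROMAN[root_form - 1]
    table = TABLES[len(root)]
    return f"SELECT pattern_id, {column} FROM {table} WHERE {column} IS NOT NULL"


def words_for(conn: sqlite3.Connection, root: str, root_form: int):
    """Return (pattern_id, word) pairs for every pattern defined for this form."""
    mapping = str.maketrans("123"[: len(root)], root)
    query = pattern_query(root, root_form)
    return [(pid, pattern.translate(mapping)) for pid, pattern in conn.execute(query)]


# Tiny demo database with made-up data: only form I is filled in.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patterns_triliteral ("
    "pattern_id INTEGER PRIMARY KEY, root_i TEXT, root_ii TEXT)"
)
conn.execute("INSERT INTO tbl_patterns_triliteral VALUES (1, '1a2oi3', NULL)")

words = words_for(conn, "ntl", 1)
print(words)
```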

The front-end will be written in PHP, so coding a parser to output lexical entries will require a lot of thought as to representation of the data in a meaningful way. I quite like how @Guitarplayer's Ayeri dictionary displays the search results.

Jipí
Smeric
Posts: 1128
Joined: Sat Apr 12, 2003 1:48 pm
Location: Litareng, Keynami

Re: How do you store your lexicon?

Post by Jipí »

Zoqaeski wrote:I quite like how @Guitarplayer's Ayeri dictionary displays the search results.
That relies on a lot of dirty coding, though. I'm currently thinking about how to simplify it. Even LEO and dict.cc don't bother with collapsing multiple translations into a single displayed record; both use tables instead. I think I'll do that, too. It's not as elegant, but it's more straightforward and more adaptable than reading everything into a huge-ass array and then manipulating that with PHP.


Also, @tags don't work here; phpBB is not so "socialized" yet.
