How do you store your lexicon?

Posted: Sun Jan 23, 2011 5:48 am
by zoqaëski
I'm pretty sure a number of threads on this topic have sprung up recently, but a search didn't turn them up, so my apologies if this is a duplicate.

How do you store your conlang's lexicon?

I'm up to the stage with my Qevesa grammar that I need to start building up a lexicon so I can participate in TCs to further refine the grammar. Problem is, I've got no idea how to store it, or even where to start; I've experimented with Text Documents or CSVs in OpenOffice, XHTML definition lists, plain text files, and none of them are satisfactory. Faiuwle's ConlangDictionary looks interesting but I'm not sure if it would work well with what I have in mind.

Ideally I'd like to create a searchable database accessible from my browser, using a SQLite or similar backend, that's simple to maintain and extensible. I quite like Guitarplayer's dictionary for Ayeri on his TayBenung site, and there are a few others I've seen with a well-designed interface (although I have no idea how well they work internally).

The main problem is that Qevesa is based around consonantal roots, so I really need a means of storing all the derivative forms and searching them all, as well as being able to edit them with ease. The resulting system could be visualised as a 3D table (X=vowel patterns / Y=roots / Z=meanings), so a database would probably be ideal, but developing this seems like an insurmountable task. Where should I start?

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 6:50 am
by Jipí
Aw do we have to have that thread A G A I N ? (Sorry!) What you may want to have a look at is SQL syntax, Codd's 12 Rules, and how inner joins work. Also, how to make queries and fetch them with e.g. PHP. Yes, it requires some work, but it's not unworkable.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 12:00 pm
by Aurora Rossa
I use a Microsoft Word file with lists of lexemes, place names, personal names, and so forth.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 3:10 pm
by bulbaquil
Excel.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 3:14 pm
by Risla
Google Docs spreadsheet. I've also got a notebook where I write down new words, but the spreadsheet is much easier to organize.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 3:17 pm
by Acid Badger
Excel.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 3:54 pm
by WeepingElf
A simple but well-formatted HTML file, in which each lexeme sits on a line for itself.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 5:59 pm
by makvas
Google Docs spreadsheet, since it's a collaborative project. You can have multiple pages in one spreadsheet file, which is handy for organization.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 7:19 pm
by personak
Lexique Pro is great.

Re: How do you store your lexicon?

Posted: Sun Jan 23, 2011 9:36 pm
by Bedelato
For a long time I used Excel, but I'm currently in the process of migrating everything to Lexique Pro.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 12:16 am
by Bristel
Pages or Numbers.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 12:27 am
by Ashroot
I used to use 3x5 cards for the few words I had, but now I am moving to Lexique Pro.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 1:26 am
by Foolster41
I have a Word doc, and a dictionary on my wiki.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 1:30 am
by WanderlustKoko
Open Office at the moment. I believe they have something similar to Excel in it as well.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 7:19 am
by alice
In smallish glass jars. I find almond vinegar preserves it best.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 5:53 pm
by Chuma
As for me, I use a Numbers file. I don't need anything more complicated.

I took a course in SQL a while back, but I didn't find it particularly useful. I don't think you would need anything like that for your problem.

I might be able to help you, if you can a) explain exactly what you need, and b) run Perl programs.

I'm not sure why you would need a rank 3 array. I would have guessed roots on one axis, patterns on one axis, and for each entry the English word (and the conlang word, if that's not completely obvious from root+pattern). Then you could search for either an English word, a conlang word, or a root+pattern, and see all matching entries, or you could search for a root, and see a list of all patterns which are possible with that root (and perhaps their whole entries).
Is that what you need, or are there any other searches you need to be able to do?
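
A minimal sketch of the two-axis layout described here, assuming an in-memory Python dict keyed by (root, pattern). The roots, patterns, glosses and function names are all placeholders made up for illustration; none of them are real Qevesa data.

```python
# Lexicon keyed by (root, pattern); each cell holds that entry's fields.
# Every root, pattern and gloss below is a made-up placeholder.
LEXICON = {
    ("k-t-b", "-a-i-"): {"word": "katib", "english": ["writer"]},
    ("k-t-b", "-i-a-"): {"word": "kitab", "english": ["book"]},
    ("s-l-m", "-a-i-"): {"word": "salim", "english": ["safe", "whole"]},
}


def search_english(term):
    """Return the (root, pattern) keys whose English glosses include term."""
    return [key for key, entry in LEXICON.items() if term in entry["english"]]


def search_word(word):
    """Return the (root, pattern) keys whose conlang word equals word."""
    return [key for key, entry in LEXICON.items() if entry["word"] == word]


def patterns_for_root(root):
    """Return, in sorted order, every pattern attested with the given root."""
    return sorted(pattern for (r, pattern) in LEXICON if r == root)


print(search_english("book"))
print(search_word("salim"))
print(patterns_for_root("k-t-b"))
```

For a lexicon of a few thousand entries a linear scan like this should be fast enough; an index only starts to matter for much larger wordlists.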

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 6:00 pm
by Jipí
Chuma, I think you're suggesting basically what a Join does, except you'd deal with dynamically generated results instead of a static list:

Code:

-- join the three tables on the shared root_id
SELECT root, pattern, translation
FROM tbl_root, tbl_pattern, tbl_translation
WHERE root LIKE 'x-y-z'
   AND tbl_root.root_id = tbl_pattern.root_id
   AND tbl_root.root_id = tbl_translation.root_id;
Something like this. At least that's kind of what I'm doing for my database. And for the purpose of a dictionary IMNSHO a properly set-up relational database is not useless or overkill. And you can generate derivative lists in different formats easily.

Re: How do you store your lexicon?

Posted: Mon Jan 24, 2011 7:37 pm
by Z500
in individual text files for easy feeding to IPA Zounds

Re: How do you store your lexicon?

Posted: Tue Jan 25, 2011 2:59 am
by zoqaëski
Chuma wrote:I'm not sure why you would need a rank 3 array. I would have guessed roots on one axis, patterns on one axis, and for each entry the English word (and the conlang word, if that's not completely obvious from root+pattern). Then you could search for either an English word, a conlang word, or a root+pattern, and see all matching entries, or you could search for a root, and see a list of all patterns which are possible with that root (and perhaps their whole entries).
Is that what you need, or are there any other searches you need to be able to do?
That's exactly what I'm looking to do. Now that I've got an unlimited net connection, I'm going to start reading up on SQLite so I don't have the overhead of something as big as MySQL or whatever. Basically, the table keys would be root (c1-c2-c3) × pattern (v0-v1-v2-v3), with the translation and other information in the matrix cells (the 3rd dimension could list fields such as conlang word, pronunciation, English meaning(s), etc.), although the actual conlang word could be generated on the fly from some basic regex, I guess.
Guitarplayer wrote:Chuma, I think you're suggesting basically what a Join does, except you'd deal with dynamically generated results instead of a static list:

Code:

-- join the three tables on the shared root_id
SELECT root, pattern, translation
FROM tbl_root, tbl_pattern, tbl_translation
WHERE root LIKE 'x-y-z'
   AND tbl_root.root_id = tbl_pattern.root_id
   AND tbl_root.root_id = tbl_translation.root_id;
Something like this. At least that's kind of what I'm doing for my database. And for the purpose of a dictionary IMNSHO a properly set-up relational database is not useless or overkill. And you can generate derivative lists in different formats easily.
That's the idea. Presumably I could use a SQLite DB for the backend, and generate static lists as I need for export, extracting the useful information I want.

The trick is to keep this whole scheme simple (i.e. KISS principle) but extensible and easy-to-maintain.
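
A rough sketch of generating the word on the fly from root + pattern. It assumes the slots interleave as v0 c1 v1 c2 v2 c3 v3 and that an unused vowel slot is written as an empty string between hyphens; the thread does not spell out either point, so treat both as guesses. It uses plain string splitting rather than a regex. The function name, root and patterns are hypothetical placeholders.

```python
from itertools import zip_longest


def build_word(root, pattern):
    """Interleave a consonantal root with a vowel pattern.

    Assumed slot order (not confirmed by the thread): v0 c1 v1 c2 v2 c3 v3.
    Empty vowel slots are written as empty strings, e.g. "-a-i-".
    """
    consonants = root.split("-")
    vowels = pattern.split("-")
    pieces = []
    # Pair each vowel slot with the consonant that follows it; the final
    # vowel slot has no following consonant, so it is padded with "".
    for vowel, consonant in zip_longest(vowels, consonants, fillvalue=""):
        pieces.append(vowel)
        pieces.append(consonant)
    return "".join(pieces)


print(build_word("k-t-b", "-a-i-"))   # katib
print(build_word("k-t-b", "a-e--o"))  # aketbo
```

If the word is always derivable like this, it does not need to be stored in the database at all, only the root, the pattern and the translation.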

Re: How do you store your lexicon?

Posted: Tue Jan 25, 2011 6:40 am
by masako
Image

Re: How do you store your lexicon?

Posted: Tue Jan 25, 2011 6:07 pm
by Chuma
Zoqaeski wrote:the 3rd dimension could list fields such as conlang word, pronunciation, English meaning(s), etc
Oh, I see. Yeah, that could be thought of as a third dimension in a table, depending on the application.
Guitarplayer wrote:Something like this.
Looks sensible.

Phew, I feel like I've forgotten all about SQL. Let's see... what tables might we have?

tbl_root (root_id, root)
tbl_pattern (pattern_id, pattern)
tbl_translation (translation_id, translation, root_id, pattern_id)

Then you should be able to do

Code:

SELECT root, pattern, translation
FROM tbl_root NATURAL JOIN tbl_pattern NATURAL JOIN tbl_translation
WHERE root = 'x-y-z';
I'm not sure I understand your way of doing it - why does tbl_pattern have a root_id?

Anyway, that should work, but it doesn't sound like the simplest way to me. Not sure how clever the SQL interpreter is, but if you do a straight join it would have a huge list to look through, so that's hardly efficient (not that it should matter unless you have a really huge lexicon).

Basically all you need is one ordinary table. I also assume that you have a finite constant number of patterns, which makes things easier if you store the table in a file. You just make a list, where each line has the different patterns for a single root. That seems much easier to me than making an SQL database just for this one wordlist. But hey, do as you like.
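
One way the single-table file could look, as a sketch only: a tab-separated text with a header row naming the patterns and one line per root. The file layout, the sample data and the `load_table` name are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical file contents: a header row of patterns, then one line per
# root. An empty cell means the root is not used with that pattern.
SAMPLE = (
    "root\t-a-i-\t-i-a-\n"
    "k-t-b\twriter\tbook\n"
    "s-l-m\tsafe\t\n"
)


def load_table(text):
    """Parse the tab-separated table into {root: {pattern: gloss}}."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    patterns = next(reader)[1:]  # skip the "root" column header
    table = {}
    for row in reader:
        root, glosses = row[0], row[1:]
        # Keep only the cells that actually contain a gloss.
        table[root] = {p: g for p, g in zip(patterns, glosses) if g}
    return table


table = load_table(SAMPLE)
print(table["k-t-b"]["-i-a-"])  # book
print(sorted(table["s-l-m"]))   # ['-a-i-']
```

A file like this also opens directly in any spreadsheet program, which fits with what most people in this thread are already using.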

Re: How do you store your lexicon?

Posted: Tue Jan 25, 2011 8:03 pm
by justin
personak wrote:Lexique Pro is great.
You wouldn't happen to know how to get Lexique Pro to treat capital letters the same as lowercase when it sorts, would you? I have proper nouns in my wordlist, and they sort before the ones that aren't capitalized.

Re: How do you store your lexicon?

Posted: Tue Jan 25, 2011 8:44 pm
by zoqaëski
Chuma wrote: Phew, I feel like I've forgotten all about SQL. Let's see... what tables might we have?

tbl_root (root_id, root)
tbl_pattern (pattern_id, pattern)
tbl_translation (translation_id, translation, root_id, pattern_id)

Then you should be able to do

Code:

SELECT root, pattern, translation
FROM tbl_root NATURAL JOIN tbl_pattern NATURAL JOIN tbl_translation
WHERE root = 'x-y-z';
I'm not sure I understand your way of doing it - why does tbl_pattern have a root_id?

Anyway, that should work, but it doesn't sound like the simplest way to me. Not sure how clever the SQL interpreter is, but if you do a straight join it would have a huge list to look through, so that's hardly efficient (not that it should matter unless you have a really huge lexicon).

Basically all you need is one ordinary table. I also assume that you have a finite constant number of patterns, which makes things easier if you store the table in a file. You just make a list, where each line has the different patterns for a single root. That seems much easier to me than making an SQL database just for this one wordlist. But hey, do as you like.
Both biliteral and triliteral roots are permissible, and there are 22 consonant phonemes. So that makes 22^2 + 22^3 = 11132 possible roots... not that I think I'll use all of them, of course. I'm only going to store nominal and adjectival forms and the verbal infinitive, because that will keep the number of patterns down to a manageable level. Besides, there's no point in having a dictionary entry for every aspectual conjugation, is there?
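
The count can be checked by brute force. This sketch uses 22 stand-in symbols, since the actual consonant inventory is not listed here, and like the arithmetic above it counts roots that repeat a consonant.

```python
from itertools import product

# Placeholder inventory: 22 stand-in symbols (C1..C22), not real phonemes.
CONSONANTS = [f"C{i}" for i in range(1, 23)]

# Every ordered pair and triple of consonants, repeats allowed.
biliteral = list(product(CONSONANTS, repeat=2))
triliteral = list(product(CONSONANTS, repeat=3))

total = len(biliteral) + len(triliteral)
print(len(biliteral), len(triliteral), total)  # 484 10648 11132
```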

What I don't really understand is whether I can actually store the database as a table reminiscent of an addition or multiplication table, because logically root + pattern = word. If I give IDs to every root and pattern, the cells could be identified by some kind of mathematical relationship, and each cell could then consist of a "document" containing the necessary information fields.

EDIT: Reading some of the SQLite (and SQL in general) documentation, it seems that the best idea will be to have separate tables for roots and patterns, and searching performs a join. The earlier idea of having a table with a column for each pattern might be impossible to implement, because the actual table will need to resize, and I don't think that's possible.
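
A minimal sketch of that separate-tables layout, using Python's built-in sqlite3 module and the table names Chuma listed (tbl_root, tbl_pattern, tbl_translation). The column constraints, the sample rows and the `lookup_root` helper are assumptions added for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_root (root_id INTEGER PRIMARY KEY, root TEXT UNIQUE);
    CREATE TABLE tbl_pattern (pattern_id INTEGER PRIMARY KEY, pattern TEXT UNIQUE);
    CREATE TABLE tbl_translation (
        translation_id INTEGER PRIMARY KEY,
        translation TEXT NOT NULL,
        root_id INTEGER NOT NULL REFERENCES tbl_root(root_id),
        pattern_id INTEGER NOT NULL REFERENCES tbl_pattern(pattern_id)
    );
    -- Placeholder data, not real Qevesa vocabulary.
    INSERT INTO tbl_root VALUES (1, 'k-t-b'), (2, 's-l-m');
    INSERT INTO tbl_pattern VALUES (1, '-a-i-'), (2, '-i-a-');
    INSERT INTO tbl_translation VALUES
        (1, 'writer', 1, 1), (2, 'book', 1, 2), (3, 'safe', 2, 1);
""")


def lookup_root(root):
    """Return (root, pattern, translation) rows for one root."""
    return conn.execute(
        """
        SELECT r.root, p.pattern, t.translation
        FROM tbl_translation AS t
        JOIN tbl_root AS r ON r.root_id = t.root_id
        JOIN tbl_pattern AS p ON p.pattern_id = t.pattern_id
        WHERE r.root = ?
        ORDER BY p.pattern
        """,
        (root,),
    ).fetchall()


rows = lookup_root("k-t-b")
print(rows)
```

With this layout a new pattern is just another row in tbl_pattern, so nothing has to be resized when the list of patterns grows.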

Re: How do you store your lexicon?

Posted: Tue Jan 25, 2011 9:54 pm
by Bedelato
sano wrote:Image
What the...

EXCEL 97 !!? :o :o :o :o :o

Re: How do you store your lexicon?

Posted: Wed Jan 26, 2011 5:23 am
by Chuma
Zoqaeski wrote:What I don't really understand is whether I can actually store the database as a table reminiscent of an addition or multiplication table, because logically root + pattern = word. If I give IDs to every root and pattern, the cells could be identified by some kind of mathematical relationship, and each cell could then consist of a "document" containing the necessary information fields.
That's pretty much it, yeah. In some languages you would need numerical IDs to identify the roots, in others you could use the root itself as key. But that's all pretty basic programming, and won't affect the finished program.
Zoqaeski wrote:Reading some of the SQLite (and SQL in general) documentation, it seems that the best idea will be to have separate tables for roots and patterns, and searching performs a join.
Yes, that would be the normal way to do things in SQL, I think. Not in other programs.