(Maybe this should be in C&C. Mods, do as you please.)
The idea
Way back in the seventies, computer interpretation of natural languages relied on grammatical rules given by the programmer. As it turned out, that didn't work very well; real human language had so many exceptions that the rules failed most of the time. So instead, researchers started using machine learning. They would annotate a huge text by hand, analysing how the words related to each other, and then the computer would compile statistics from it and use them to guess the analysis of new sentences. This is the method usually used today, and it often analyses 80-90% of the words correctly. That's not entirely satisfying, particularly since it might mean that only 50% of the sentences come out completely correct, and a great deal of research is being done on increasing this accuracy.
So I thought: What if we could simplify human language so that it becomes possible to analyse with formal rules? Would it necessarily be a far too limited language, or could we actually get it to look quite similar to natural language?
You might wonder where such a language would be used. In some applications we have to use normal human language, since we can't expect people to just go and change their grammar. In other applications, such as programming, we are happy to use a completely unnatural language. There might be situations where a compromise could come in handy, maybe with future household robots, robot toys, or voice-controlled operating systems for disabled people.
Anyway, that's all speculation, and hopefully I don't need to justify conlanging for you guys.
I'm also making a little 3D-program which demonstrates the language. It's not very impressive, but if anyone wants I can send it. It requires Java and Perl.
Formal grammar
Unlike most languages, this one can be neatly written on a single page. This is only the most basic version of the grammar. I'm working on extending it.
clause {
    phrase*
}
phrase {
      noun_phrase
    | pronoun_phrase
    | verb_phrase
}
noun_phrase {
    noun_article attribute attribute*
}
pronoun_phrase {
    pronoun attribute*
}
verb_phrase {
    copula attribute*
}
attribute {
    [para_article] <property> meta_attribute*
}
meta_attribute {
      adverbial
    | secondary_predicate
}
adverbial {
    BEGINADV attribute* ENDADV
}
secondary_predicate {
    BEGINSP clause ENDSP
}
noun_article {
      ART.DEF.PLU.ABS
    | ART.DEF.PLU.ERG
    | ART.DEF.SING.ABS
    | ART.DEF.SING.ERG
    | ART.IND.PLU.ABS
    | ART.IND.PLU.ERG
    | ART.IND.SING.ABS
    | ART.IND.SING.ERG
}
pronoun {
      PRON.1P.PLU.ABS
    | PRON.1P.PLU.ERG
    | PRON.1P.SING.ABS
    | PRON.1P.SING.ERG
    | PRON.2P.PLU.ABS
    | PRON.2P.PLU.ERG
    | PRON.2P.SING.ABS
    | PRON.2P.SING.ERG
    | PRON.3P.PLU.ABS
    | PRON.3P.PLU.ERG
    | PRON.3P.NEUT.ABS
    | PRON.3P.NEUT.ERG
    | PRON.3P.MASC.ABS
    | PRON.3P.MASC.ERG
    | PRON.3P.FEM.ABS
    | PRON.3P.FEM.ERG
}
copula {
      COP.PRES
    | COP.PAST
    | COP.IMP
}
para_article {
      PARA.ABS
    | PARA.ERG
}
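To show that the grammar really can be analysed with plain formal rules, here is a rough recursive-descent parser sketch in Python. It is only an illustration, not the Java/Perl program mentioned above. The function names and the tuple-based tree layout are made up for this example, and it assumes that any token which is not a form word counts as a <property>.

```python
# Form words, spelled exactly as in the grammar listing.
CASES = ("ABS", "ERG")
NOUN_ARTICLES = {f"ART.{d}.{n}.{c}" for d in ("DEF", "IND")
                 for n in ("PLU", "SING") for c in CASES}
PRONOUNS = {f"PRON.{p}.{c}" for c in CASES
            for p in ("1P.PLU", "1P.SING", "2P.PLU", "2P.SING",
                      "3P.PLU", "3P.NEUT", "3P.MASC", "3P.FEM")}
COPULAS = {"COP.PRES", "COP.PAST", "COP.IMP"}
PARA_ARTICLES = {"PARA.ABS", "PARA.ERG"}
BRACKETS = {"BEGINADV", "ENDADV", "BEGINSP", "ENDSP"}
HEADS = NOUN_ARTICLES | PRONOUNS | COPULAS
FORM_WORDS = HEADS | PARA_ARTICLES | BRACKETS  # anything else is a property


def parse(tokens):
    """Parse a whole token list as one clause; reject trailing tokens."""
    tree, pos = parse_clause(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return tree


def parse_clause(toks, i):
    # clause { phrase* } ; every phrase starts with an article, pronoun or copula
    phrases = []
    while i < len(toks) and toks[i] in HEADS:
        head = toks[i]
        if head in NOUN_ARTICLES:
            kind = "noun_phrase"
        elif head in PRONOUNS:
            kind = "pronoun_phrase"
        else:
            kind = "verb_phrase"
        attrs, i = parse_attributes(toks, i + 1)
        if kind == "noun_phrase" and not attrs:
            raise SyntaxError(f"{head} needs at least one attribute")
        phrases.append((kind, head, attrs))
    return ("clause", phrases), i


def parse_attributes(toks, i):
    # attribute* ; an attribute starts with a para-article or a property
    attrs = []
    while i < len(toks) and (toks[i] in PARA_ARTICLES or toks[i] not in FORM_WORDS):
        attr, i = parse_attribute(toks, i)
        attrs.append(attr)
    return attrs, i


def parse_attribute(toks, i):
    # attribute { [para_article] <property> meta_attribute* }
    para = None
    if toks[i] in PARA_ARTICLES:
        para, i = toks[i], i + 1
    if i >= len(toks) or toks[i] in FORM_WORDS:
        raise SyntaxError("expected a property word")
    prop, i = toks[i], i + 1
    metas = []
    while i < len(toks) and toks[i] in ("BEGINADV", "BEGINSP"):
        if toks[i] == "BEGINADV":
            inner, i = parse_attributes(toks, i + 1)
            kind, closer = "adverbial", "ENDADV"
        else:
            inner, i = parse_clause(toks, i + 1)
            kind, closer = "secondary_predicate", "ENDSP"
        if i >= len(toks) or toks[i] != closer:
            raise SyntaxError(f"expected {closer}")
        metas.append((kind, inner))
        i += 1
    return ("attribute", para, prop, metas), i


tree = parse("ART.DEF.SING.ERG mouse COP.PRES eat ART.DEF.SING.ABS cheese".split())
print([kind for kind, head, attrs in tree[1]])
# ['noun_phrase', 'verb_phrase', 'noun_phrase']
```

Since every phrase and every meta starts with an unambiguous form word, the parser never has to guess or backtrack; one token of lookahead is enough.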
Explanations
The language is rigidly head-initial, with free word order whenever possible. It can be thought of as units made up of a head word and a number of dependent units, with the head first and the dependents in any order.
A clause can be a whole sentence, but some sentences are written as several clauses. The phrases which make up a clause can be put in any order. The final version will have special types of clauses, marked with a clause head.
A normal clause has exactly one verb phrase, and one or two (or in rare cases zero) pronoun or noun phrases. The language technically allows multiples, so you can choose whether the program should accept them.
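The "normal clause" restriction is not part of the grammar itself, so a program would have to check it separately. A minimal sketch of such a check, assuming a flat list of tokens and assuming that no property word starts with "ART." or "PRON." (the function names are made up for illustration):

```python
from collections import Counter

COPULAS = {"COP.PRES", "COP.PAST", "COP.IMP"}


def phrase_counts(tokens):
    """Count top-level phrase heads, skipping anything inside BEGINSP ... ENDSP."""
    counts = Counter()
    depth = 0
    for tok in tokens:
        if tok == "BEGINSP":
            depth += 1
        elif tok == "ENDSP":
            depth -= 1
        elif depth == 0:
            if tok in COPULAS:
                counts["verb"] += 1
            elif tok.startswith(("ART.", "PRON.")):
                counts["nominal"] += 1
    return counts


def is_normal_clause(tokens):
    """Exactly one verb phrase and at most two noun or pronoun phrases."""
    counts = phrase_counts(tokens)
    return counts["verb"] == 1 and counts["nominal"] <= 2


print(is_normal_clause("ART.DEF.SING.ERG mouse COP.PRES eat ART.DEF.SING.ABS cheese".split()))
# True
print(is_normal_clause("ART.DEF.SING.ABS mouse run".split()))
# False (a bare noun phrase, no verb phrase)
```

Phrases nested inside a secondary predicate belong to the inner clause, which is why the check ignores them.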
All content words are formally the same part of speech, "properties". Add a noun article and you have a noun phrase, add a copula and you have a verb phrase, add nothing and you have an "attribute", similar to an adjective or adverb.
A noun phrase needs at least one attribute, but they can be of any type and in any order. You can say "the red" rather than "the red one". Saying "the giant German" is exactly the same as "the German giant".
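One way to picture the order-independence is to store a noun phrase as its article plus an unordered collection of attributes. A small sketch, assuming flat noun phrases without metas or para-articles (the helper name is hypothetical):

```python
from collections import Counter


def noun_phrase_key(tokens):
    """Return (article, multiset of attributes) for a flat noun phrase."""
    article, *attributes = tokens
    if not attributes:
        raise ValueError("a noun phrase needs at least one attribute")
    return article, Counter(attributes)


giant_german = noun_phrase_key(["ART.DEF.SING.ABS", "giant", "German"])
german_giant = noun_phrase_key(["ART.DEF.SING.ABS", "German", "giant"])
print(giant_german == german_giant)
# True
```

A Counter is used rather than a set so that a repeated attribute would not silently disappear.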
A meta-attribute (or just "meta" for short) is simply the attribute of an attribute, as opposed to the attribute of a noun or verb. In "the very tall man runs quite fast", "very", "quite" and "fast" are metas, but "tall", "man", and "runs" are plain attributes.
Adverbials are just ordinary attributes which apply to another attribute, whereas a secondary predicate is a whole (partial) clause specifying in which way the object relates to the predicate which defines it. In this simplified version they can only give the other argument to a defining transitive predicate, such as "bus" in "bus driver".
Cases, as usual, describe the relation of the noun to the current predicate. Absolutive is the patient/experiencer, either the single noun relating to an intransitive verb, or the object of a transitive verb. Ergative is the agent/instigator, the subject of a transitive verb (but can be used on its own, which would give something like an antipassive voice).
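Because the case is carried by the article or pronoun, a program can read off who does what without caring about phrase order. A rough sketch, assuming flat clauses (no metas or para-articles) and using the made-up role labels "agent", "patient" and "predicate":

```python
COPULAS = {"COP.PRES", "COP.PAST", "COP.IMP"}
ROLE_OF_CASE = {"ERG": "agent", "ABS": "patient"}


def roles(tokens):
    """Group the content words of a flat clause by the role of their head word."""
    result = {}
    current = None
    for tok in tokens:
        if tok in COPULAS:
            current = result.setdefault("predicate", [])
        elif tok.startswith(("ART.", "PRON.")):
            case = tok.rsplit(".", 1)[1]  # last field of the form word: ABS or ERG
            current = result.setdefault(ROLE_OF_CASE[case], [])
            if tok.startswith("PRON."):
                current.append(tok)  # a pronoun is itself the content
        elif current is None:
            raise ValueError(f"property {tok!r} has no head word before it")
        else:
            current.append(tok)
    return result


a = roles("ART.DEF.SING.ERG mouse COP.PRES eat ART.DEF.SING.ABS cheese".split())
b = roles("COP.PRES eat ART.DEF.SING.ABS cheese ART.DEF.SING.ERG mouse".split())
print(a == b)
# True
print("patient" in roles("ART.DEF.SING.ERG mouse COP.PRES eat".split()))
# False (ergative on its own, the antipassive-like reading)
```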
Para-articles describe the case relation of the noun to the defining predicate. A para-article is normally left out if it is absolutive. They work a lot like the "-er" and "-ee" suffixes in English.
Examples
ART.DEF.SING.ERG mouse COP.PRES eat ART.DEF.SING.ABS cheese
"The mouse eats the cheese."
ART.DEF.SING.ERG mouse COP.PRES eat
"The mouse eats."
COP.PRES eat ART.DEF.SING.ABS cheese
"The cheese is eaten."
ART.DEF.SING.(?) cheese eat
"the cheese which is eaten"
ART.DEF.SING.(?) mouse PARA.ERG eat
"the mouse which eats"
ART.DEF.SING.(?) mouse run
"the mouse which runs"
ART.DEF.SING.(?) mouse run fast
"the fast mouse which runs"
ART.DEF.SING.(?) mouse run BEGINADV fast ENDADV
"the mouse which runs fast"
ART.DEF.SING.(?) mouse PARA.ERG eat BEGINSP ART.DEF.SING.ABS cheese ENDSP
"the mouse which eats the cheese"
What I would like to know
- General thoughts. (Although you don't need to tell me it's a bad idea for a thesis - I've been working on it all year already.)
- What grammatical things would be reasonable to add next? Perhaps I don't need the entire complexity of natural language, but it should probably be a bit more advanced than this.
- Specifically, how should those metas be handled? The whole begin-end deal seems horribly unnatural. I have some ideas for changing them.
- Also, what about prepositions? How could they fit into the system? I consider them to be content words, but apart from that I don't know. Locative case? Clause-level prepositional phrases, or prepositions as attributes?
- Ideas for an example text. It should be short and not too grammatically complex. I tried "the north wind and the sun", but it has some slightly tricky grammar. It would also be good if it could easily be represented graphically, so I can show it in my little 3D-program.
- Reasonable translations for the form words, so that I can write it like almost-normal English. Some of those words have more or less obvious English equivalents, such as ART.DEF.SING.ABS = "the", but others don't. Take the para-articles, for example: what would be suitable English words for those?
Any comments are greatly appreciated.