Help me build lexical databases for romance and english

Posted: Sun Oct 16, 2016 12:03 pm
by Radagast the Third
Hi folks, I am back in a new incarnation to ask for some help with the menial labor of compiling some comparative wordlists that will serve all kinds of purposes - most specifically as a baseline for various attempts to create software that can classify languages. The database will be open for anyone to use, so if you have anything you might use something like this for, you can benefit from the work when it is done.

I am beginning a hobby project of creating a program that can evaluate similarities and differences between related languages. But to make the program work I need databases that I can use to evaluate how accurate it is. To do this I was hoping you all might help me crowdsource wordlists for Romance languages and English dialects to use as a baseline.

If you would like to participate, you can follow these links to a couple of Google spreadsheets.

Please type in a broad IPA-based transcription - preferably phonemic. Feel free to add varieties if you have good data for them.

Please also mention which source you are working from by typing it into the first row where the language name is.

This one is for English dialects:
https://docs.google.com/spreadsheets/d/ ... p=sharing

This one is for Romance languages:
https://docs.google.com/spreadsheets/d/ ... sp=sharing

Re: Help me build lexical databases for romance and english

Posted: Sun Oct 16, 2016 1:04 pm
by mèþru
Welcome back! Have some pickles and tea!
This might interest you.

Re: Help me build lexical databases for romance and english

Posted: Sun Oct 16, 2016 2:02 pm
by Ser
Eh, only 216 items. I'll help you fill out the Spanish, French and Latin columns.

I'll use accusative forms of words for Latin though, e.g. nigrum for 'black' instead of niger, since that's the form that Romance words (almost always) come from.

I sent you a request for getting editing powers.

Re: Help me build lexical databases for romance and english

Posted: Sun Oct 16, 2016 2:07 pm
by Radagast the Third
Yes, only 216 items, so it is doable. If you think specific items that would be illuminating are missing, you are free to add them as well.

Thanks for wanting to help. I have granted your access request.

Re: Help me build lexical databases for romance and english

Posted: Sun Oct 16, 2016 2:47 pm
by Salmoneus
If you don't mind my asking, why does the English one focus on the relatively minor differences between the accents of different parts of the US, while ignoring the dialect differences in the UK (outside of London)? I know most of those dialects don't have that many millions of speakers, relatively speaking... but then neither does Jamaican English.

You may also want to clarify exactly what you mean by terms like 'Cockney' and 'RP'. It's debatable whether anybody speaks either of them any more. So for 'RP', do you mean actual RP, or do you mean modern SSBE? And for 'Cockney' do you mean 19th and early 20th century Cockney proper, or modern MLE (the sociological successor to Cockney), or the subdialects of MLE that retain most of their Cockney heritage and mostly avoid 'contamination' from black speakers?

I'd be happy to give you SSBE pronunciations, if you want.

However, you may want to note that all of that (and more) is already on Wikipedia.

Re: Help me build lexical databases for romance and english

Posted: Sun Oct 16, 2016 3:43 pm
by Ser
I also don't think it's feasible to have a column for Vulgar Latin. Vulgar Latin basically refers to the spoken sociolects of Latin across various centuries (especially between the 1st c. BC and roughly the 8th century) and places, a concept particularly useful when a word is attested in Romance in multiple places (especially early on) but not in written Latin, like Old Spanish señero/a and Old Galician-Portuguese senlleiro/a, from vulg. Lat. *singularius/a/um...

Re: Help me build lexical databases for romance and english

Posted: Mon Oct 17, 2016 2:17 am
by Radagast the Third
Good points.

As for English, I want to have the most divergent varieties represented. I did want to include West Country dialects, Geordie, etc., but I am not sure there is enough data available. If you have data for other UK dialects you are more than welcome to add them.

RP is of course not spoken by a lot of people anymore, but there is a lot of data - but I guess SSBE would be equally useful (so please do add it Salmoneus). For Cockney I would prefer the most divergent forms (i.e. probably 18th/19th century).

For each list it is a case of the more the better: the more varieties that can be included, the better. We want maximum diversity in the lists so that they work well as a baseline for comparison with other language groups.


By the way, here is a lot of data for Romance: http://www.soundcomparisons.com/#/en/Ro ... ht/Lgs_All

Re: Help me build lexical databases for romance and english

Posted: Mon Oct 17, 2016 4:47 am
by Salmoneus
To clarify, are you interested in phonology only, or also lexicon?

Re: Help me build lexical databases for romance and english

Posted: Mon Oct 17, 2016 4:54 am
by Radagast the Third
Both, definitely. And not too fine phonetic distinctions, only distinctions that can reasonably be considered phonemic.

Re: Help me build lexical databases for romance and english

Posted: Thu Oct 20, 2016 6:50 am
by Znex
Alright, I'll add in Australian English stuff.

Re: Help me build lexical databases for romance and english

Posted: Thu Oct 20, 2016 4:55 pm
by Salmoneus
Editing of the document seems to be lost. I'd have to create a google account, tell you about my google account, and ask you to give my google account permission to edit the document, and then log into the google account in order to do anything...

Re: Help me build lexical databases for romance and english

Posted: Sat Oct 22, 2016 9:54 am
by Radagast the Third
You edited something and it was lost? Or the editing function is not working for you? I didn't think a Google account was necessary when someone shares the editing link. I can try to share a personal link with you in a private message.