IPA-to-speech?

Discussion of natural languages, or language in general.
CGreathouse
Sanci
Posts: 15
Joined: Thu Feb 09, 2006 1:44 pm
Location: Eastern USA

IPA-to-speech?

Post by CGreathouse »

I was wondering if there was a good IPA-to-sound converter available somewhere, whether online or as a download. Just something where I can type "ˌfoʊnəˈtɪʃən" and get some semi-reasonable approximation of the word.

I know of systems (e.g., MBROLA) that let you control output with very fine controls (number of milliseconds, pitch, and all that) but this is more than I'm looking for -- I want something that is simple to use. (In a pinch maybe I could use such a system if I wrote a conversion layer first, but that seems like something that has probably already been done better.)

I know of
http://www2.research.att.com/~ttsweb/tts/demo.php
which can be used the way I want by typing something like

Code:

<phoneme alphabet="ipa" ph="ˌfoʊnəˈtɪʃən"> </phoneme>
but it's pretty flaky -- I had to try this one four times to make it work. (Also, I suspect it will fail with some of the more interesting sounds, but I'll take what I can get in that regard; even a system that only does basic English and Romance sounds would be useful to me.)
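For reference, the `<phoneme>` element with `alphabet="ipa"` and a `ph` attribute is standard SSML, so the markup can be generated rather than typed by hand. A minimal sketch, assuming nothing about the demo site itself; the function name `ipa_to_ssml` and the fallback-text argument are made up for illustration:

```python
from xml.sax.saxutils import escape, quoteattr


def ipa_to_ssml(ipa: str, fallback_text: str = "") -> str:
    """Wrap an IPA string in an SSML <phoneme> element.

    ipa_to_ssml is a hypothetical helper, not part of any TTS service.
    """
    # quoteattr adds the surrounding quotes and escapes &, < and > in the value.
    # escape does the same for the element's text content.
    return f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>{escape(fallback_text)}</phoneme>"


print(ipa_to_ssml("ˌfoʊnəˈtɪʃən", "phonetician"))
```

The fallback text is what an engine would read if it ignored the `ph` attribute; whether a given engine honours the attribute is something to test per service.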

And yes, I know of Paul Meier's IPA chart, but since that doesn't let you string sounds together it wouldn't be useful for me.

Bonus points for a system that lets you pass a string in a URL (or in an HTTP POST variable). Further bonus points for a system with liberal licensing. :D
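On the URL side, the only real wrinkle is that IPA symbols are non-ASCII and have to be percent-encoded. A small sketch of what that looks like; the endpoint `https://tts.example/speak` and the parameter name `ipa` are placeholders, not a real service:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and parameter name, used only for illustration.
BASE_URL = "https://tts.example/speak"


def build_request_url(ipa: str) -> str:
    """Return a GET URL carrying the IPA string as a query parameter."""
    # urlencode percent-encodes the UTF-8 bytes of each non-ASCII symbol.
    return f"{BASE_URL}?{urlencode({'ipa': ipa})}"


url = build_request_url("ˌfoʊnəˈtɪʃən")
print(url)

# Decoding the query string recovers the original transcription.
print(parse_qs(urlsplit(url).query)["ipa"][0])
```

The same `urlencode` output could be sent as the body of a form-encoded POST instead.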

finlay
Sumerul
Posts: 3600
Joined: Mon Dec 22, 2003 12:35 pm
Location: Tokyo

Re: IPA-to-speech?

Post by finlay »

Doesn't exist – what you've posted is closer than anything I've ever seen before, even though it clearly only works for English.

The main problem is the sheer size of such a project – without even getting started on the diacritics, the offglides and transitions from consonant to vowel are very subtle and could sound very wrong if you combined them wrongly. Basically, imagine you have to record [t] differently for each vowel it could occur next to...

treskro
Avisaru
Posts: 306
Joined: Sun Nov 14, 2010 9:33 pm
Location: オデュッセウスの家

Re: IPA-to-speech?

Post by treskro »

I don't think it would be that difficult. Sure, stops would have to be recorded separately for different vowels, but for something basic, just to get the flavor of a new language, a one-to-one regurgitation of sounds would suffice.
axhiuk.

看蝦米

Bristel
Smeric
Posts: 1258
Joined: Mon Jun 01, 2009 3:07 pm
Location: Miracle, Inc. Headquarters

Re: IPA-to-speech?

Post by Bristel »

treskro3 wrote:I don't think it would be that difficult. Sure, stops would have to be recorded separately for different vowels, but for something basic, just to get the flavor of a new language, a one-to-one regurgitation of sounds would suffice.
Stops would have to be separately recorded for every secondary articulation, plus vowels, consonant clusters, etc.
[bɹ̠ˤʷɪs.təɫ]
Nōn quālibet inīquā cupiditāte illectus hoc agō
Yo te pongo en tu lugar...
Taisc mach Daró

Soap
Smeric
Posts: 1228
Joined: Sun Feb 16, 2003 2:57 pm
Location: Scattered disc

Re: IPA-to-speech?

Post by Soap »

Eh, why would we need to go into that level of detail? The text-to-speech software that already exists doesn't do that, and it may not be perfectly normal sounding but it gets us by.

makvas
Avisaru
Posts: 251
Joined: Wed Jul 19, 2006 6:13 pm
Location: The Southland

Re: IPA-to-speech?

Post by makvas »

Soap wrote:Eh, why would we need to go into that level of detail? The text-to-speech software that already exists doesn't do that, and it may not be perfectly normal sounding but it gets us by.
Why are you asking why? I mean, wouldn't it be cool to be able to produce sounds from arbitrary languages just using IPA input? This would mean we could generate sounds from "unsupported" languages just by knowing the IPA. It's certainly interesting and possible, but yes, a large undertaking.

masako
Smeric
Posts: 1731
Joined: Sat Nov 06, 2004 4:31 pm
Location: 가매

Re: IPA-to-speech?

Post by masako »

finlay wrote:Doesn't exist – what you've posted is closer than anything I've ever seen before, even though it clearly only works for English.
and Spanish, Italian, German and French.

I use that site to make my sound samples for Kala because the phonology fits with Spanish.

Jetboy
Avisaru
Posts: 270
Joined: Sat Apr 17, 2010 6:49 pm

Re: IPA-to-speech?

Post by Jetboy »

If not actual strings of segments, it would be neat to have something that could combine features– what if one wants to hear a nasalized /s/, instead of just a nasalized /t/? Or maybe something like one of PIE's aspirated labiovelars, /gʷʰ/?
"A positive attitude may not solve all your problems, but it will annoy enough people to make it worth the effort."
–Herm Albright
Even better than a proto-conlang, it's the *kondn̥ǵʰwéh₂s

Travis B.
Sumerul
Posts: 3570
Joined: Mon Jun 20, 2005 12:47 pm
Location: Milwaukee, US

Re: IPA-to-speech?

Post by Travis B. »

An issue with IPA-to-speech is that to accurately convert IPA for a given language into audio, one would need a lot of detailed phonetic information that most transcriptions, even my infamous transcriptions of English, simply do not provide. IPA actually has a significant range of ambiguity in how it is used, and while human readers of IPA normally know what is meant by a given transcription and do not necessarily care about all the phonetic details omitted, generating audio accurately for that transcription would require knowing all these details.
Dibotahamdn duthma jallni agaynni ra hgitn lakrhmi.
Amuhawr jalla vowa vta hlakrhi hdm duthmi xaja.
Irdro. Irdro. Irdro. Irdro. Irdro. Irdro. Irdro.

Bedelato
Lebom
Posts: 193
Joined: Sat Oct 30, 2010 1:13 pm
Location: Another place

Re: IPA-to-speech?

Post by Bedelato »

He's not asking for perfectly precise renderings.

All we need is a system that will take basic IPA (or X-SAMPA, or some other phonetic notation) and convert the segments to audio without bias toward a specific language. But you don't need to use separate recordings for every permutation of features, and really the amount of information in even broad transcriptions should be enough to give a reasonable rendering, even when interpreted literally. It won't be perfect, but it would be nice to have a language-independent system.

Actually a better idea would be to use the IPA symbols as shorthand for the parameters of an articulatory synthesis engine, or something, instead of using prerecorded audio. Yes, you would still need data for each symbol, but in this case you're using CPU time instead of RAM.
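To make the "symbols as shorthand for parameters" idea concrete, here is a rough sketch of the lookup step only. The table covers just the symbols in the example word from the first post, the feature names are ordinary IPA chart labels, and `to_parameters` and the frame layout are invented for illustration; an actual articulatory synthesizer would need far richer parameters than this:

```python
# Illustrative subset only; a usable table would cover the whole IPA chart.
FEATURES = {
    "f": {"voicing": "voiceless", "place": "labiodental", "manner": "fricative"},
    "n": {"voicing": "voiced", "place": "alveolar", "manner": "nasal"},
    "t": {"voicing": "voiceless", "place": "alveolar", "manner": "plosive"},
    "ʃ": {"voicing": "voiceless", "place": "postalveolar", "manner": "fricative"},
    "o": {"height": "close-mid", "backness": "back", "rounded": True},
    "ʊ": {"height": "near-close", "backness": "near-back", "rounded": True},
    "ɪ": {"height": "near-close", "backness": "near-front", "rounded": False},
    "ə": {"height": "mid", "backness": "central", "rounded": None},
}
STRESS = {"ˈ": "primary", "ˌ": "secondary"}


def to_parameters(ipa: str) -> list[dict]:
    """Turn an IPA string into one parameter frame per segment.

    Simplification: a stress mark is attached to the first segment after it,
    although stress really belongs to the whole syllable.
    """
    frames = []
    pending_stress = None
    for ch in ipa:
        if ch in STRESS:
            pending_stress = STRESS[ch]
        elif ch in FEATURES:
            frames.append({"symbol": ch, "stress": pending_stress, **FEATURES[ch]})
            pending_stress = None
        else:
            raise ValueError(f"no parameters for symbol {ch!r}")
    return frames


for frame in to_parameters("ˌfoʊnəˈtɪʃən"):
    print(frame)
```

The table is small because it stores a description per symbol rather than a recording per context; the cost moves into whatever engine turns these frames into sound.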

Imagine, they could get away with so much more by using a phonetic alphabet instead of dedicating hundreds of lines of code to parsing a particular language's orthography.
At, casteda dus des ometh coisen at tusta o diédem thum čisbugan. Ai, thiosa če sane búem mos sil, ne?
Also, I broke all your metal ropes and used them to feed the cheeseburgers. Yes, today just keeps getting better, doesn't it?

Radius Solis
Smeric
Posts: 1248
Joined: Tue Mar 30, 2004 5:40 pm
Location: Si'ahl

Re: IPA-to-speech?

Post by Radius Solis »

Bedelato wrote:He's not asking for perfectly precise renderings.

All we need is a system that will take basic IPA (or X-SAMPA, or some other phonetic notation) and convert the segments to audio without bias toward a specific language.
This is still based on the assumption that individual phone sounds pronounced in sequence by a computer will sound anything remotely like human speech. An assumption that is probably false. The major reason for this becomes clear when you start looking at spectrograms: the target waveforms for each phone take up a minority of speech production time, the majority being spent in transition between targets. Delete all that transitional time and the result is not likely to sound much like human speech.

So an IPA-to-speech program really does need to know how to transition from each phone's target position to that of each other phone; you can't realistically ignore that and still have a useful program. And it's a great deal easier said than done: if N is the number of phones your program knows about, it has to handle N^2 transition patterns... so if we ignore diacritics and just assume there are 135 basic IPA symbols (I counted up the charts but can't find a listed figure), there are 135*135 = 18,225 possible sequences of one IPA symbol with another. And that is ignoring the fact that languages differ substantially on how they transition from one phone to the next! You'd have to either pick one (which would result in a language-biased program) or else examine how each transition is handled in multiple languages and average the patterns somehow.
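The N^2 figure is easy to check by enumerating ordered pairs. A quick sketch; the three-phone inventory is a toy example, and 135 is the hand count mentioned above, not an official number:

```python
from itertools import product


def ordered_pairs(phones):
    """All ordered (first, second) pairs, including a phone followed by itself."""
    return list(product(phones, repeat=2))


# Toy inventory: 3 phones give 3 * 3 ordered pairs.
toy = ordered_pairs(["t", "a", "s"])
print(len(toy))

# The hand-counted 135 basic symbols, with diacritics ignored.
n_symbols = 135
print(n_symbols ** 2)  # 18225
```

Pairs are ordered because the transition from [t] to [a] is not the same as the transition from [a] to [t].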

Bedelato
Lebom
Posts: 193
Joined: Sat Oct 30, 2010 1:13 pm
Location: Another place

Re: IPA-to-speech?

Post by Bedelato »

But couldn't you use, like, articulatory synthesis or something?

I swear I said this above :x
At, casteda dus des ometh coisen at tusta o diédem thum čisbugan. Ai, thiosa če sane búem mos sil, ne?
Also, I broke all your metal ropes and used them to feed the cheeseburgers. Yes, today just keeps getting better, doesn't it?

CGreathouse
Sanci
Posts: 15
Joined: Thu Feb 09, 2006 1:44 pm
Location: Eastern USA

Re: IPA-to-speech?

Post by CGreathouse »

Bedelato wrote:He's not asking for perfectly precise renderings.
Exactly. If I needed precise renderings of a language into audio, I'd ask a native speaker... or the creator, if it was a conlang. :mrgreen:

But something that could do basic phonetics would be really nice. 50 common phones + 100 common diphones would be great.
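To put rough numbers on that: 50 phones allow 50 * 50 = 2,500 ordered pairs, so 100 diphones would cover 4% of them, and the rest would need some fallback. A sketch of how one might check a given word against such an inventory; the function name and the tiny inventory below are invented for illustration:

```python
def transition_coverage(phones, diphone_inventory):
    """Split a word's adjacent phone pairs into covered and missing transitions."""
    pairs = list(zip(phones, phones[1:]))
    covered = [p for p in pairs if p in diphone_inventory]
    missing = [p for p in pairs if p not in diphone_inventory]
    return covered, missing


# Share of all possible transitions that 100 diphones cover for 50 phones.
print(100 / 50 ** 2)  # 0.04

# Hypothetical inventory holding two recorded transitions.
inventory = {("f", "o"), ("o", "ʊ")}
covered, missing = transition_coverage(["f", "o", "ʊ", "n"], inventory)
print(covered)
print(missing)
```

How well 100 diphones work in practice would depend on how often the chosen pairs actually occur in the words being rendered.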
