October 2004

Text-to-speech (Was Shift in time) Mike Rozak

Arnau Rosselló Castelló wrote:

> I'd like to ask, then, about phoneme recognition and synthesis. My
> idea is to translate voice input into a string of phonemes (with
> pitch, speed and volume, hopefully), without translating it to
> words. ...

> Would such a thing be feasible?

Yes and no.

If you know WHAT words the user spoke, and have a recording of what
they said, I already have a tool (at www.mxac.com.au/m3d) that will
extract the phonemes, volume, timing, and pitch. It's called
"transplanted prosody" by speech-type people. An example of a
verbose form of transplanted prosody is:

<TransPros>
<OrigText>This is a sample recording whose phonemes have been
generated using speech recognition.</OrigText>
<Word text="This" ph="dh ih1 s" TPPitchRel=0.92,1.16, TPDurAbs=0.04,0.17,0.17/>
<Break time=39ms/>
<Word text="is" ph="ih1 z" TPPitchRel=0.97,0.91 TPDurAbs=0.08,0.05/>
<Word text="a" ph="ah0" TPPitchRel=0.92 TPDurAbs=0.11/>
<Word text="sample" ph="s ae1 m p ah0 l" TPPitchRel=,1.07,1.24,1.33,1.35,1.28 TPDurAbs=0.15,0.09,0.03,0.10,0.03,0.05/>
<Word text="recording" ph="r iy0 k ao1 r d iy0 ng" TPPitchRel=1.22,1.34,,0.90,0.86,1.03,0.84,1.14 TPDurAbs=0.03,0.11,0.14,0.07,0.05,0.04,0.09,0.10/>
<Break time=279ms/>
<Word text="whose" ph="hh uw1 z" TPPitchRel=,0.88, TPDurAbs=0.05,0.10,0.11/>
<Break time=59ms/>
<Word text="phonemes" ph="f ow1 n iy1 m z" TPPitchRel=,1.15,0.97,0.86,0.90, TPDurAbs=0.04,0.12,0.05,0.07,0.19,0.07/>
<Word text="have" ph="hh ae1 v" TPPitchRel=,0.90,0.84 TPDurAbs=0.07,0.11,0.05/>
<Break time=39ms/>
<Word text="been" ph="b iy0 n" TPPitchRel=1.02,0.91,0.90 TPDurAbs=0.02,0.05,0.11/>
<Word text="generated" ph="jh eh1 n er0 ey1 t ih0 d" TPPitchRel=1.11,0.97,1.06,0.91,0.83,,0.92, TPDurAbs=0.06,0.08,0.05,0.11,0.08,0.06,0.05,0.04/>
<Break time=59ms/>
<Word text="using" ph="y uw1 z iy0 ng" TPPitchRel=0.86,0.89,0.88,0.97,1.00 TPDurAbs=0.10,0.06,0.07,0.05,0.08/>
<Word text="speech" ph="s p iy1 ch" TPPitchRel=,,1.08, TPDurAbs=0.09,0.08,0.09,0.10/>
<Break time=59ms/>
<Word text="recognition" ph="r eh1 k ih0 g n ih1 sh ih0 n" TPPitchRel=0.95,0.87,,0.83,,0.85,0.85,,0.73, TPDurAbs=0.05,0.06,0.14,0.03,0.03,0.07,0.05,0.11,0.03,0.06/>
<Punct text="." DurAbs=0.01/>
</TransPros>

(This example uses relative pitch so I can translate the prosody
onto a female voice, if I wish. I can also produce absolute pitch,
which can be good for singing, although if you really want singing
there are specialized TTS engines just for that. This example
doesn't include volume information because it doesn't make that much
difference to the quality of the transplanted prosody.)
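
A minimal sketch of how one <Word> line from the example above could be
read back into per-phoneme values. The function name and the regular
expression are illustrative assumptions, not part of the tool described
here; the sketch also assumes that an empty entry in TPPitchRel means no
pitch value was recorded for that phoneme.

```python
import re

# Matches name=value pairs where the value is either quoted or a bare
# comma-separated list (the example leaves numeric attributes unquoted).
ATTR = re.compile(r'(\w+)=("[^"]*"|[^\s/>]*)')


def _to_float(s):
    """Convert a list entry to float; empty entries become None."""
    return float(s) if s else None


def parse_word(line):
    """Return (text, [(phoneme, rel_pitch_or_None, duration_sec), ...])."""
    attrs = {k: v.strip('"') for k, v in ATTR.findall(line)}
    phonemes = attrs["ph"].split()
    pitches = attrs.get("TPPitchRel", "").split(",")
    durations = attrs.get("TPDurAbs", "").split(",")
    # Pad so that a short or missing list still lines up with the phonemes.
    pitches += [""] * (len(phonemes) - len(pitches))
    durations += [""] * (len(phonemes) - len(durations))
    return attrs["text"], [
        (ph, _to_float(p), _to_float(d))
        for ph, p, d in zip(phonemes, pitches, durations)
    ]


line = '<Word text="whose" ph="hh uw1 z" TPPitchRel=,0.88, TPDurAbs=0.05,0.10,0.11/>'
text, phones = parse_word(line)
print(text, phones)
```

Each phoneme ends up as one (phoneme, relative pitch, duration) triple,
which is the unit the rest of the discussion treats as the "phoneme
stream".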

> How much bandwidth would the phoneme stream need (well, I could
> time myself saying something and then count... but maybe someone
> has more elaborate approximations)?

Wrap it up into a binary format and it will only be a few times
larger than text.
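
To make "a few times larger than text" concrete, here is a rough sketch
of one possible binary layout. The three-byte record (phoneme id, pitch,
duration), the quantization steps, and the tiny phoneme table are all
assumptions made for illustration; no actual wire format is described
in this post.

```python
import struct

# Hypothetical phoneme inventory; a real engine would define its own table.
PHONEMES = ["ah0", "ih1", "z", "dh", "s"]
PHONEME_ID = {p: i for i, p in enumerate(PHONEMES)}


def pack_phoneme(ph, rel_pitch, dur_sec):
    """Pack one phoneme into 3 bytes: id, pitch, duration."""
    # Relative pitch is stored in steps of 0.01 (0 means "no pitch"),
    # duration in 10 ms steps; both are clamped to one unsigned byte.
    pitch_byte = 0 if rel_pitch is None else min(255, round(rel_pitch * 100))
    dur_byte = min(255, round(dur_sec * 100))
    return struct.pack("BBB", PHONEME_ID[ph], pitch_byte, dur_byte)


def pack_word(phonemes):
    """Concatenate the packed records for every phoneme in a word."""
    return b"".join(pack_phoneme(*p) for p in phonemes)


# The word "is" from the example: two phonemes with pitch and duration.
word_is = [("ih1", 0.97, 0.08), ("z", 0.91, 0.05)]
packed = pack_word(word_is)
print(len(packed), "bytes versus", len("is"), "bytes of plain text")
```

With this layout the two-phoneme word "is" takes 6 bytes against 2
characters of text, which is in line with the estimate above.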

> You could also swap phonemes in the stream to create accents
> and/or unintelligible languages. These are off the top of my head,
> I'm sure there are many more interesting things that can be done
> with it.

I have code to do the accents, as you mentioned. (My web site gives
an example. I can also change phoneme duration and pitch contours to
create different voices.) The other technique for generating new
languages you talked about is written into the VW I'm working on.
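
As a rough illustration of the phoneme-swapping idea, the sketch below
applies a substitution table to a phoneme string. The table is a made-up
example, not a tuned accent model and not the code mentioned above.

```python
# Hypothetical substitution table: swap "dh" (as in "this") for "z" and
# "th" for "s". An illustrative mapping only.
ACCENT = {"dh": "z", "th": "s"}


def apply_accent(phonemes, table):
    """Replace phonemes according to `table`, keeping stress digits."""
    out = []
    for ph in phonemes:
        # Separate the trailing stress digit so vowels keep their stress.
        if ph[-1].isdigit():
            base, stress = ph[:-1], ph[-1]
        else:
            base, stress = ph, ""
        out.append(table.get(base, base) + stress)
    return out


print(apply_accent("dh ih1 s".split(), ACCENT))
```

Because the swap happens on the phoneme stream, the timing and pitch
values attached to each phoneme can be carried over unchanged.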

Back to the changing-one's-voice issue:

The problem is getting a transcription of the speech. To do that,
you need speech recognition to basically do dictation. Current
dictation systems are claimed to be only 96%-98% accurate... assuming
the right phase of the moon and whatever other finagling marketing
people do. The real error rate is usually 2x what marketing claims
(so 92%-96% accurate). That means roughly 1 in 12-25 words is wrong,
often in confusing ways that make for humorous anecdotes. I was
testing the dictation system at Microsoft and dictated, "I gave a
peanut to the squirrel." It came up with "I gave a penis to the
squirrel." After that we made sure to get rid of potentially
insulting/offensive words so they didn't accidentally get dictated.
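
Taking the doubling rule of thumb literally, the arithmetic behind
those numbers can be checked with a few lines. The function name is
just for illustration.

```python
def words_per_error(accuracy):
    """Average number of words between errors for a given word accuracy."""
    return 1.0 / (1.0 - accuracy)


for claimed in (0.98, 0.96):
    claimed_err = 1.0 - claimed
    real_err = 2 * claimed_err  # rule of thumb: real error is 2x the claim
    real_acc = 1.0 - real_err
    print(f"claimed {claimed:.0%} -> real {real_acc:.0%}, "
          f"about 1 error every {words_per_error(real_acc):.1f} words")
```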

There are ways to improve the accuracy:

1) Have the VW log EVERYTHING ever typed in chat, and use it to
create a "language model"... which is speech-guy terminology for
predicting what words come next. A good language model is produced
from 1 billion+ words, creating a database called a tri-gram. A
tri-gram just remembers, "If I heard word X followed by word Y,
what is the chance of word Z being spoken?". Thus, in a dictation
system, if you speak "new york" it'll be expecting you to say
"city". If you say "pity" instead, it will think it heard
incorrectly and write out "city" anyway. All speech recognition
engine writers have internal tools for generating language models;
I don't know how many of them allow 3rd parties to use them.

2) Dynamically update the language model based on the
context... if there are orcs nearby, the user is more likely to be
speaking about "orcs" than "forks". This requires some coding on
the part of the VW so it knows what words will be in context. I'm
not sure how many SR engine providers support the dynamic language
model.
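
The two ideas above can be sketched in a few lines: a tri-gram table
built from logged chat, plus an optional per-word boost for words that
are currently in context. This is a toy, assuming add-one smoothing and
a simple multiplicative boost; real SR engines use far larger corpora
and more careful smoothing, and the class and parameter names here are
made up.

```python
from collections import Counter, defaultdict


class TrigramModel:
    def __init__(self):
        # counts[(x, y)][z] = number of times word z followed the pair (x, y)
        self.counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, sentences):
        """Count every three-word sequence in the logged sentences."""
        for sentence in sentences:
            words = sentence.lower().split()
            self.vocab.update(words)
            for x, y, z in zip(words, words[1:], words[2:]):
                self.counts[(x, y)][z] += 1

    def prob(self, x, y, z, boost=None):
        """Estimate P(z | x, y); `boost` maps in-context words to a multiplier."""
        boost = boost or {}
        following = self.counts.get((x, y), Counter())

        def weight(w):
            # Add-one smoothing, scaled up for words the game says are nearby.
            return (following[w] + 1) * boost.get(w, 1.0)

        total = sum(weight(w) for w in self.vocab | {z})
        return weight(z) / total


model = TrigramModel()
model.train([
    "i love new york city", "new york city is big", "what a pity",
    "i see the forks", "i see the forks", "i see the orcs",
])
print(model.prob("new", "york", "city"), model.prob("new", "york", "pity"))

# With orcs nearby, the VW tells the recognizer to favour "orcs".
nearby = {"orcs": 5.0}
print(model.prob("see", "the", "forks", nearby),
      model.prob("see", "the", "orcs", nearby))
```

After "new york" the model prefers "city" over "pity", and the boost
flips the "forks"/"orcs" preference even though "forks" was seen more
often in the log.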

> How could such a system deal with ambient noise, or tapping the
> mic?

The player has to wear a headset microphone. They have to talk to it
in a calm and measured voice (SR accuracy goes down when you scream
at it). Tapping on the microphone can easily be ignored.

> Is there any kind of public project (I know about Festival only)
> creating these kinds of programs? Papers, maybe?

Festival does TTS. I have my own TTS, up on my web site. AT&T has
the best TTS I've heard recently, but the best apparently takes
several hundred MB of RAM.

There are public SR engines. I think you can download the Sphinx
recognizer. Microsoft has a recognizer. IBM has one. Dragon has
another. Etc.

Papers - I think I saw some books on how to create your own TTS
engine. You're probably better off with a book than an academic
paper since the papers are usually based on prior knowledge and are
more esoteric. There was one book by Dennis Klatt ("From text to
speech" MIT press???) that I read about 10 years ago. TTS is no
longer synthesized that way, but it's a start in case you can't find
anything more recent.

The other approach is to use algorithms to change the voice without
knowing what was spoken. These can change from male to female, but
won't get rid of a player's Bronx accent. And by the way, there are
better algorithms than what you've probably heard, but they take
more CPU than most games want to surrender. My software will do
algorithmic voice mucking too (a non-technical term), although its
ability to go from my male voice into female is questionable. Male
to male is easy. Male to midget/giant (what you've probably heard)
is trivial. (My web site, http://www.mxac.com.au/m3d/tts.htm, has a
sample of my "female" TTS voice, derived from my voice model. My
wave editor, included in the 3d app, has tools for doing this.) If
you search on the web you'll find research for gender changing.
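
For the "trivial" midget/giant case, one common approach (assumed here;
the post does not say which algorithm is meant) is plain resampling:
play the samples back faster or slower, which shifts pitch and duration
together. A minimal sketch with linear interpolation:

```python
import math


def resample(samples, factor):
    """Play back `samples` `factor` times faster using linear interpolation.

    factor > 1 raises the pitch and shortens the clip ("midget");
    factor < 1 lowers the pitch and lengthens it ("giant").
    """
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # Blend the two nearest input samples.
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


rate = 8000
tone = [math.sin(2 * math.pi * 100 * t / rate) for t in range(rate)]
higher = resample(tone, 1.5)   # higher pitch, shorter clip
lower = resample(tone, 0.5)    # lower pitch, longer clip
print(len(tone), len(higher), len(lower))
```

Because everything is scaled together, the result sounds like a smaller
or larger speaker rather than a different person, which is why
convincing male-to-female conversion needs the more expensive
algorithms mentioned above.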

Mike Rozak
http://www.mxac.com.au

MUD-Dev Archive
