You should consider contacting a professor of Speech and Language
Pathology about this. As well as doing research in the diagnosis and
treatment of speech disorders, people in that field do a lot of work
helping people learn to use a particular accent. As an aside, they
would probably turn out to know something about throat microphones,
cochlear implants, and other applied issues that overlap with some
common goals in wearables.
Brad
-Bradley A. Singletary, ICQ# 36654824
On Fri, 4 Jun 1999, Ben Houston wrote:
> Tristram,
>
> Cool idea. Although, it is slower for a human to do the translation from
> phonemes to coherent words than it is just to read correctly represented
> words -- if you understand what I mean. I bet you can read this phrase
> better th an u c an r ee d th es w un. But that fact aside...
>
> Did you know that they already use spectrographs to help teach deaf people
> to speak? The spectrograph gives a form of visual feedback to a
> speaker. A deaf person can see how they are speaking and compare it against
> another speaker in the visual domain.
>
> I once did a presentation in my linguistics class on how speech recognition
> systems work. Afterwards my professors suggested that speech recognition
> systems could be used in foreign language instruction. It sounds too
> logical to not exist in a research lab -- thus I bet there is already
> research in this domain?
>
> Cheers,
> -ben houston
> http://chat.carleton.ca/~bhouston
>
>
>
>
>
> ----- Original Message -----
> From: Tristram W. Metcalfe III
> Sent: Friday, June 04, 1999 2:35 AM
> Subject: Re: Linux Voice to Text
>
>
> I hope this has the potential to SIMPLIFY your key question <voice to text>;
> additionally there may be benefits in lightening the RAM load etc. in voice.
>
> What if (inserting my ignorance here), within the first step of the Speech
> To Text task, you were to add *Words* to the IBM / Linux / other? database
> that were simply the *Phonemes*: the basic 44+ new *Words* we use all the
> time, that are already in the front end of the recognition phase.
>
> If so, then the actual human voice recognition speed might go (way?) up,
> while offering a "true data record" (for any later retrieval apps) of what
> was actually said, avoiding the misspelled, misconstrued, misjudged
> "recognition" that slows and frustrates the state-of-the-art STT 'training'
> in ASR software.
>
> My short presentation to ddlinux of "Phonetic Text Display" ran into a
> problem pointed out by Sean Wheeler at MIT Speech Interfaces (attached in
> case not on list), where he said phoneme recognition is improved by bundling
> phonemes into larger words: the context, or the best voiced (most easily
> recognized) phonemes, may then lead to a better grasp of a particular
> unrecognized phoneme, i.e. 'separate phonemes may be harder to recognize'.
> This Phonetic Text Display should not slow or disable the timing of the
> current word recognition process, whose result, once done, is displayed (as
> per the user's wishes).
>
> My (vacuous?) thought is that basically this may not be that large an
> issue: we could simply insert generic blank text phoneme/words, similar to
> the flat-grey blanks in closed captioning. These P.T.D. blanks could
> additionally show (prior to their true recognition) volume, pitch, timbre,
> and stress, displaying 'real time' pauses etc., aiding the user's guess,
> while the better spelled recognition happens down the ('split second') road,
> where context recognition can edit the real-time text that is appearing up
> front. This leaves a TRUE record of what was actually said.
>
> This would all greatly benefit a very specific community of millions of
> users, if it could work in real time (using friendly wc/hmds!!). The deaf
> and hard of hearing would then have access to the whole world of actual,
> truly accented, spoken human languages.
>
> This is additionally of great value to those who have been largely cut off
> from the human sound stream, and so get no instant feedback to self-correct
> their voice. Learning lip reading and sign are the best means of
> communication to date, but real-time subjective voice feedback would allow
> their own voice to stay fine-tuned for their objective speech to and from
> the rest of the world.
>
> The hope is that a 'streamlined?' IBM voice I/O such as this could benefit
> all. The reading of phonetic text may have other benefits for teaching
> spoken language. It would help the speed and/or reading by aiding the
> separation of phonetic sound events, adding characters to the alphabet as
> found in doubled-up letters, i.e. the familiar <ae> we see written as <one
> joined character> (not possible in html?).
>
> Enough rant,
> Sorry if this is too unfeasible,
>
> h ah v ah g oo d w ee k e n d l oo k s l ie k i t s s s s n ah t g u n
> ah r ae n
> <seconds.later>
> have a good weekend (it) looks like its not going to rain!
>
> The learning curve on reading it could be quick if you can't wait for the
> state of the art to catch up to you!
>
> fww
> tris
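
One way to picture the "Phonetic Text Display" idea described above is a
display buffer that shows phonemes the moment they are heard and swaps in
spelled words once recognition catches up, while keeping the raw phoneme
record untouched. The sketch below is only an illustration under that reading
of the message; the class name PhoneticDisplay and its methods are
hypothetical and not taken from any speech SDK, and the phonemes are typed in
by hand rather than produced by a recognizer.

```python
from dataclasses import dataclass, field


@dataclass
class PhoneticDisplay:
    """Hypothetical display buffer: phonemes first, spelled words later."""

    phonemes: list = field(default_factory=list)  # permanent record of what was heard
    words: dict = field(default_factory=dict)     # start index -> (end index, word)

    def hear(self, phoneme: str) -> None:
        """Append a phoneme as soon as the front end reports it."""
        self.phonemes.append(phoneme)

    def recognize(self, start: int, end: int, word: str) -> None:
        """Record that phonemes[start:end] were later recognized as `word`."""
        if not 0 <= start < end <= len(self.phonemes):
            raise ValueError("span must cover at least one heard phoneme")
        self.words[start] = (end, word)

    def render(self) -> str:
        """Show words where recognition has finished, raw phonemes elsewhere."""
        out, i = [], 0
        while i < len(self.phonemes):
            if i in self.words:
                end, word = self.words[i]
                out.append(word)
                i = end
            else:
                out.append(self.phonemes[i])
                i += 1
        return " ".join(out)


if __name__ == "__main__":
    display = PhoneticDisplay()
    for p in "h ah v ah g oo d".split():
        display.hear(p)
    print(display.render())           # raw phonemes, shown immediately
    display.recognize(0, 3, "have")   # word recognition arrives later
    display.recognize(3, 4, "a")
    display.recognize(4, 7, "good")
    print(display.render())           # spelled words replace the phonemes
    print(display.phonemes)           # the phoneme record is still intact
```

The point of keeping `phonemes` separate from `words` is that a later,
better guess can overwrite the spelling without losing what was actually
heard.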
>
>
>
>
>
>
> Rusty Foster wrote:
>
> > On Thu, 27 May 1999, Ansel Sermersheim wrote:
> > > >>>>> "Tim" == Tim Gray writes:
> > >
> > > > check out www.zachary.com/creemer/xvoice.html It sounds like we
> > > > might have a V2T app for wearables that might actually work! it
> > > > when one uses the IBM viavoice SDK for linux.
> >
> > I also d/l'd this, but I haven't gotten to actually try it yet (waiting
> > on a good microphone). I did start it up and click the buttons, though :-)
> >
> > Has anyone considered the possibility of hacking one of IBM's demos
> > into a simple "black box" kind of interface? I don't know any C++, so I'm
> > kind of helpless at the moment to do it myself, but what I have in mind is
> > basically just a console app that takes in some audio and spits back some
> > text. This could be glued into, say, a perl interface that then does stuff
> > based on the text (this part I can do!). Perhaps it could be a simple
> > daemon that listens for mic input, and when it gets some, converts it to
> > text and sends it out on port xxx. Then you write yourself an interface
> > that acts like a client, connects to port whatever, and does something
> > when it receives text from the socket. Whammo-- you have a voice-shell.
> >
> > I have some more concrete ideas on how this shell could be set up, but
> > it's all pretty academic until I can find a way to convert the voice into
> > text. (C'mon, isn't that the easy part? ;-)). Anyone have any suggestions?
> >
> > -Rusty
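
The daemon-plus-client layout Rusty describes can be sketched roughly as
below. This is only a sketch under stated assumptions: recognize() is a
hypothetical stand-in that pretends the "audio" bytes are already text (a
real version would call into a speech engine, which is not shown), the port
is picked by the OS rather than being a fixed "port xxx", and the command
table is made up for the demo.

```python
import socket
import threading


def recognize(audio: bytes) -> str:
    """Hypothetical stand-in for a speech engine: treats the bytes as text."""
    return audio.decode("ascii", errors="replace")


def serve_once(server: socket.socket, utterances) -> None:
    """Accept one client and send each recognized utterance as a line of text."""
    conn, _ = server.accept()
    with conn:
        for audio in utterances:
            conn.sendall(recognize(audio).encode("utf-8") + b"\n")


def run_client(port: int, handlers: dict) -> list:
    """Connect, read lines until the daemon closes, and act on each one."""
    results = []
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("r", encoding="utf-8") as lines:
            for line in lines:
                text = line.strip()
                action = handlers.get(text)
                results.append(action() if action else f"unknown: {text}")
    return results


def demo() -> list:
    """Run the daemon side in a thread and the client side in the caller."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]
    fake_audio = [b"open mail", b"what time is it"]  # made-up input
    daemon = threading.Thread(target=serve_once, args=(server, fake_audio))
    daemon.start()
    handlers = {"open mail": lambda: "launching mail reader"}  # made-up command
    results = run_client(port, handlers)
    daemon.join()
    server.close()
    return results


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Sending one utterance per line keeps the client trivial: anything that can
read lines from a socket (perl included) can play the voice-shell role.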
> >
> > --
> > Subscription/unsubscription/info requests: send e-mail with subject of
> > "subscribe", "unsubscribe", or "info" to
> > Wear-Hard Mailing List Archive (searchable): http://wearables.blu.org
>
From Wear-Hard Mailing list Archive (WH)
Maintained by R. Paul McCarty
Archive created with babymail