If you have used an interactive voice response (IVR) system, you may
well have used a VoiceXML application without knowing what technology
was behind it. VoiceXML is an XML application whose tag set is designed
for handling spoken interaction with the user.
The speech-to-text engine and the text-to-speech engine play an
important role in such applications. A good VoiceXML server supports
multiple languages.
Knowing the basic VoiceXML tags helps you understand what is going on
behind the scenes. For example, a <prompt> tag under a <menu> tag
plays a message to the user. That message can be plain text or a
recording referenced by an <audio> tag.
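
As a rough illustration, the Python sketch below holds a small
hypothetical VoiceXML menu and prints the text a text-to-speech engine
would have to speak if the recording could not be played. The wording,
the file name welcome.wav, and the helper name prompt_text are
assumptions made for this example and are not taken from any real
application.

```python
import xml.etree.ElementTree as ET

# VoiceXML elements live in this namespace.
NS = {"v": "http://www.w3.org/2001/vxml"}

# Hypothetical menu: the prompt wording, file name, and ids are made up.
VXML = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>
      <audio src="welcome.wav">Welcome to the help line.</audio>
      Say billing or support.
    </prompt>
    <choice next="#billing">billing</choice>
    <choice next="#support">support</choice>
  </menu>
</vxml>
"""

def prompt_text(menu):
    """Return the prompt text, including the fallback text inside <audio>."""
    prompt = menu.find("v:prompt", NS)
    # itertext() yields the text of <prompt> and its children in document
    # order; split/join collapses the indentation whitespace.
    return " ".join(" ".join(prompt.itertext()).split())

root = ET.fromstring(VXML)
menu = root.find("v:menu", NS)
print(prompt_text(menu))
```

A real VoiceXML server would play welcome.wav when it is available and
use the text inside <audio> only as a fallback. This sketch only
extracts the text and does no audio work.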
The text-to-speech engine converts the text of the prompt into audio,
and the user hears that message. When a recording is given in an
<audio> tag, the text inside that tag is spoken only if the recording
cannot be played. If you respond by speaking, the speech-to-text
engine converts your response to text and compares it with the
available choices. The matching <choice> tag sends you to another part
of the application, where you hear voice prompts related to your
response. The <form> and <block> tags are used in VoiceXML to present
information to the user.
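
The routing described above can be sketched in Python as follows. This
is only a simulation under stated assumptions: the recognized words
arrive as a plain string, the document and its ids are hypothetical,
and route is a made-up helper name. A real VoiceXML platform does the
matching through its speech recognizer, not through string comparison.

```python
import xml.etree.ElementTree as ET

NS = {"v": "http://www.w3.org/2001/vxml"}

# Hypothetical document: one menu and two forms it can lead to.
VXML = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>Say billing or support.</prompt>
    <choice next="#billing">billing</choice>
    <choice next="#support">support</choice>
  </menu>
  <form id="billing">
    <block>Your balance is available online.</block>
  </form>
  <form id="support">
    <block>Please hold for an agent.</block>
  </form>
</vxml>
"""

def route(root, recognized_text):
    """Return the <block> text of the form selected by the recognized words."""
    heard = recognized_text.strip().lower()
    for choice in root.iterfind("v:menu/v:choice", NS):
        if (choice.text or "").strip().lower() == heard:
            # next="#billing" points at the form whose id is "billing".
            form_id = choice.get("next").lstrip("#")
            form = root.find(f"v:form[@id='{form_id}']", NS)
            return form.find("v:block", NS).text.strip()
    return None  # nothing matched; a real platform would typically reprompt

root = ET.fromstring(VXML)
print(route(root, "Support"))
```

Returning None for an unmatched response keeps the sketch short. How a
given server handles that case is outside what this example shows.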