| David Portabella | Miroslav Melichar | Martin Rajman |
|---|---|---|
The goal of this practical session is to illustrate a concrete implementation of the methodology for rapid dialogue prototyping (RDPM) that has been presented in the course. For this purpose, you will be invited to test and improve a dialogue-enabled application for controlling the devices of a Smart Home.
RDPM is a framework to deploy computer interaction applications. It can run in standalone mode or in Wizard of Oz (Wizard) mode. In Wizard mode, a person supervises the system. Note that the user should believe that he/she is interacting with a fully automatic system and that no human is supervising it.
The Wizard can do several tasks, among others:
We are going to use the RDPM for a Smart Home application, which aims at controlling the devices of a room. This tutorial is meant to be done in teams of two people. One person will be the User that is inside the room and interacts with the system, and the second person will be the Wizard of Oz (Wizard), who will supervise the system.
As this room is not prepared to control the devices, we are going to use a Room Simulator: on one computer there will be a screen with a picture of a room containing some devices (some lights, a fan, the blinds, a TV and an answering machine). The User is supposed to imagine that he/she is in the room and interacts with the system while looking at the device he/she wishes to control.
The Wizard will sit in front of another computer and supervise the system. As the Speech Recognition Engine is not plugged into the system, in all cases the Wizard will at least have to understand the user query, either by typing its transcription or by producing the corresponding semantic pairs.
This tutorial is to be done following the slides of the course.
The organization of the session is as follows:
Before starting, you will need to download Tidt-rdp and unzip it to your directory, e.g. c:\Tidp-rdp (first create the directory C:\Tidp-rdp, then unzip the files there). If you are sharing the same account with different groups, be sure not to use the same names for your directories.
The zip file contains the RDPM framework and the application for a Smart Home.
RDPM requires Java JDK 1.4.1 (or higher). The zip file already includes a JDK for MS Windows. If you are using another operating system such as Linux, please make sure that Java JDK 1.4.1 (or higher) is installed.
Depending on the number of computers available, you can use 2 computers for each team (one for the Room Simulator, another one for the Wizard), or you can display both windows on just one computer. In the latter case, the User shouldn't cheat by looking at the Wizard window.
How to run:
To run the system without the Wizard Interface Control:
c:\tidt-rdp\runSmartHome
RunEnvSim.bat
c:\tidt-rdp\runSmartHome
RunEnvSimWOz.bat
c:\Tidp-rdp\runSmartHome\RunEnvSimRemote.bat
c:\Tidp-rdp\runSmartHome\RunVNCClient.bat
c:\Tidp-rdp\runSmartHome\RunSinEnv.bat
In this first case, there was no Wizard supervising the system.
1.3: The Smart Home task model
The task is described in the form of a set of relational tables (frames), where the rows correspond to the possible functions (also called "solutions" or "targets") and the columns are the attributes needed to uniquely identify each function and to invoke it.
The task model of our Smart Home application corresponds to a simplified version of the Inspire project task model:
| DeviceName | DeviceLocation | ActionName | Day | TVShowName | Channel | TVShowAction |
|---|---|---|---|---|---|---|
| Fan | - | SwitchOn | - | - | - | - |
| Light | Left | Down | - | - | - | - |
| TV | - | Record | Mon | Tomorrow | TSR | Record |
| ... | ... | ... | ... | ... | ... | ... |
Note: The task model is defined in the file Resources/solutions.txt.
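To make the idea of rows as solutions and columns as attributes more concrete, here is a minimal sketch in Python. It is not the RDPM implementation: the in-memory table, the function name `filter_solutions` and the rule that a semantic pair must equal the cell value exactly are all assumptions made for illustration. Only the three rows shown in the table above are used.

```python
# Hypothetical in-memory copy of the three task model rows shown above.
ATTRIBUTES = ["DeviceName", "DeviceLocation", "ActionName", "Day",
              "TVShowName", "Channel", "TVShowAction"]

SOLUTIONS = [
    dict(zip(ATTRIBUTES, row)) for row in [
        ("Fan", "-", "SwitchOn", "-", "-", "-", "-"),
        ("Light", "Left", "Down", "-", "-", "-", "-"),
        ("TV", "-", "Record", "Mon", "Tomorrow", "TSR", "Record"),
    ]
]


def filter_solutions(solutions, semantic_pairs):
    """Keep the rows that agree with every (attribute, value) pair.

    Assumption: a pair matches only if the cell holds exactly that value.
    """
    return [
        row for row in solutions
        if all(row.get(attr) == value for attr, value in semantic_pairs.items())
    ]


remaining = filter_solutions(SOLUTIONS, {"DeviceName": "Fan"})
print(len(remaining), remaining[0]["ActionName"])
```

With the single pair DeviceName=Fan only one row remains, so the function to invoke is identified without asking for further attributes.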
Now the Wizard is going to do the same task as before, but at the same time we'll explain step by step what is happening in the system. No User needed here.
Start the system using the Room Simulator and the Wizard on just one computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat
Quick description of the Wizard Control Interface (screenshot1):
The system says:
The screen shows that:
Start GDN,
In the Input Simulator screen, check the "Autosubmit" option.
In the InputSRE simulator, type the text "
DeviceName GDN.
Provided that the user recognizes that the fan is what he/she is looking for, he/she can now complete the request,
consequently type "the fan please" + RETURN.
The system communicates with the user using the prompts (i.e. the spoken text) and expects an answer covered by a predefined grammar.
Each GDN defines a set of prompts (we'll see afterwards that there are different types of prompts).
To see this strategy operating:
Note: The prompts are defined for each GDN in the file Resources/config.txt, under the nodes config/gdns/node/prompts/
Imagine a situation like the following:
the user says "
ActionName=SwitchOn, DeviceName=Light.
To avoid this situation, the system could ask for confirmation at every step:
the user says "
We propose instead an implicit confirmation strategy. The system informs the user about the understood concepts and formulates the next question at the same time. If the system misunderstood a concept, the user can notice it quickly and correct it, without having to confirm at every step.
To see this strategy operating:
Note: How the implicit confirmation is built is defined in the file Resources/config.txt, under the node config/globalPrompts/implicitConfirmation.
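The implicit confirmation strategy described above can be sketched as a small prompt builder. This is only an illustration under stated assumptions: the function name `implicit_confirmation`, the wording "I understood ..." and the example question are hypothetical, and the real prompt template is the one defined in the configuration file.

```python
def implicit_confirmation(understood, next_question):
    """Prefix the next question with a summary of the understood concepts.

    understood: dict mapping attribute name to the value the system understood.
    The wording is a hypothetical template, not the one used by RDPM.
    """
    if not understood:
        return next_question
    summary = " and ".join(
        f"{value} as {attr}" for attr, value in understood.items()
    )
    return f"I understood {summary}. {next_question}"


prompt = implicit_confirmation(
    {"ActionName": "SwitchOn"},
    "Which device do you want to control?",
)
print(prompt)
```

Because the summary and the question are delivered in one turn, the user only needs to react when the summary is wrong.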
To see this strategy operating:
Note: This extra help prompt is defined with the following condition coded in JavaScript:
java.isEstablished('DeviceName','TV').
Note: The condition for the prompts can be defined for each type of prompt, for each GDN.
See the file Resources/config.txt, the cond attribute under the node config/gdns/node[@id='ActionName']/prompts/prompt[@type='help']/
The Natural Language Understanding module is responsible for translating a natural language utterance into a set of semantic pairs.
In the current system we are using regular expressions. For instance, the semantic pair {ActionName=PlayMsg} can be produced from an utterance matched by the regular expression "play (message|msg)" (so, "play message" or "play msg").
The grammar is defined in the file Resources/mappingNLU.xml. Look at it and try it with the system.
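As an illustration of how regular expressions can produce semantic pairs, here is a minimal Python sketch. The first rule is the "play (message|msg)" example given above; the second rule (for the fan), the rule table layout and the function name `understand` are assumptions made for this sketch and do not reflect the actual format of the grammar file.

```python
import re

# Each rule: (compiled pattern, attribute, value).
# The first pattern comes from the text above; the second is hypothetical.
RULES = [
    (re.compile(r"\bplay (message|msg)\b"), "ActionName", "PlayMsg"),
    (re.compile(r"\bfan\b"), "DeviceName", "Fan"),
]


def understand(utterance):
    """Return the semantic pairs produced by every rule that matches."""
    text = utterance.lower()
    pairs = {}
    for pattern, attribute, value in RULES:
        if pattern.search(text):
            pairs[attribute] = value
    return pairs


print(understand("Please play message"))
print(understand("the fan please"))
```

An utterance that no rule covers yields an empty set of pairs, which is the situation where a Wizard can supply the pairs by hand.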
To see this strategy operating:
Note: The other active GDNs are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/otherActiveNodes/
List Processing GDNs allow the user to browse through a list of values and select one of them through the number identifying its position in the list. This type of GDN is particularly useful in the case of a large number of possible values for an attribute, when the values are linguistically complex or when the range of the values is not known in advance. Reducing the interaction vocabulary to numerals adds robustness to the speech recognition component, potentially improving its recognition rate substantially.
To see this strategy operating:
Note: A List Processing GDN is defined by setting type="list" in the GDN definition, and there is no grammar associated with its values.
See the file Resources/config.txt, under the node config/gdns/node[@id='TVShowName']
mappingNLU/map with <cv name='_LIST'/>
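The selection step of a List Processing GDN, where the user names a position instead of a value, can be sketched as follows. The function name `select_from_list` and the example show names are hypothetical; the sketch only assumes that positions are counted from 1, as a user would say them.

```python
def select_from_list(values, spoken_number):
    """Return the value at the 1-based position named by the user.

    Returns None when the input is not a number or is out of range,
    so the caller can fall back to a no-match prompt.
    """
    try:
        position = int(spoken_number)
    except ValueError:
        return None
    if 1 <= position <= len(values):
        return values[position - 1]
    return None


# Hypothetical list of TV show names, for illustration only.
shows = ["News", "Movie", "Sport"]
print(select_from_list(shows, "2"))
```

Only numerals have to be recognized, whatever the values in the list are, which is why no grammar is needed for the values themselves.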
The RDP methodology was explained in the course: we start by producing a model of the targeted application (the task model), and afterwards we derive an initial dialogue model from the produced task model.
In this phase, the system typically has some limitations, among others:
These limitations can be corrected. We will now give the Wizard the task of correcting or producing the semantic pairs. If there is a basic grammar, it can always be used during the Wizard experiment to produce some initial semantic pairs. However, the Wizard can correct them, or produce them if the grammar is missing or incomplete. The results of the Wizard's work will be saved in a log file to help improve the grammar after the experiment.
Start the system using the Room Simulator and the Wizard on just one computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat
In the Input Simulator screen, make sure that the "Autosubmit" option is deselected.
In the InputSRE simulator, type the text "switch on the ventilator" + RETURN.
What happened?
In the next section we are going to learn about the different strategies used in the RDP.
To see this strategy operating:
Start the system using the Room Simulator and the Wizard on just one computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat
Note: The prompts are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/prompts/prompt[@type='help']
Note: the grammar is defined in the file Resources/mappingNLU.txt, under the node mappingNLU/map with <cv name='_HELP' value='_HELP'/>
To see this strategy operating:
Note: the grammar is defined in the file Resources/mappingNLU.txt, under a map node with <cv name='_REPEAT' value='_REPEAT'/>
Question: The GDN contains the definition of different prompt types: main prompt, help prompt, no match prompt...
Should the repeat prompt be defined there as well?
To see this strategy operating:
Note: The prompts are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/prompts/prompt[@type='noinput']
To see this strategy operating:
Note: The prompts are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/prompts/prompt[@type='nomatch']
It is the acquire-filter-(propagate)-activate cycle explained in the course.
To see this strategy operating:
Start the system using the Room Simulator and the Wizard on just one computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat
Question: Should the implicit confirmation also inform about the propagated values?
We are here interested in the third case explained in the course:
The number of solutions is higher than 1 AND the branching strategy is unable to find another GDN to select (i.e. all the slots are filled in or their value is propagated). This typically happens when the user provided the "ANY" value for some of the attributes.
To see this strategy operating:
A dead end is a dialogue situation where no solution can be found that corresponds to the request of the user. To cope with this, we use the following relaxation strategy:
Let's try the first case:
The second case:
With your constraints Light as Device and Next as Operation I cannot find any solution. Could you specify something else than Light for Device. Yes or No?
The system asks for an explicit confirmation before executing irreversible actions.
To see this strategy operating:
Note: The need for an explicit confirmation is defined for each solution in the solutions file using the needsConfirmation attribute.
See the file Resources/solutions.txt, the value of the field needsConfirmation for the solution:
solutionId#DeviceName|Location|ActionName|Day|TVShowName|Channel|TVShowAction#needsConfirmation?#list of possible action feedback messages, separated by '|'
054#AnsweringMachine|-|Delete|-|-|-|-#true#054fail|054_1|054_2|054_3
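The line format shown above (fields separated by '#', attribute values and feedback messages separated by '|') can be read with a few lines of Python. This is a sketch, not the parser used by RDPM: the names `Solution` and `parse_solution` are hypothetical, the attribute names are copied from the header line above, and it assumes that every data line has exactly four '#'-separated fields.

```python
from typing import NamedTuple

# Attribute names as they appear in the header line above.
ATTRIBUTE_NAMES = ["DeviceName", "Location", "ActionName", "Day",
                   "TVShowName", "Channel", "TVShowAction"]


class Solution(NamedTuple):
    solution_id: str
    attributes: dict
    needs_confirmation: bool
    feedback_messages: list


def parse_solution(line):
    """Split one solutions line into its four '#'-separated fields."""
    solution_id, values, needs_confirmation, messages = line.strip().split("#")
    return Solution(
        solution_id=solution_id,
        attributes=dict(zip(ATTRIBUTE_NAMES, values.split("|"))),
        needs_confirmation=(needs_confirmation == "true"),
        feedback_messages=messages.split("|"),
    )


solution = parse_solution(
    "054#AnsweringMachine|-|Delete|-|-|-|-#true#054fail|054_1|054_2|054_3"
)
print(solution.attributes["ActionName"], solution.needs_confirmation)
```

For the example line, the parsed flag is true, which marks deleting answering machine messages as an action that requires explicit confirmation.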
Note: The prompts for the confirmation GDN are defined in the file Resources/config.txt, under the node config/gdns/node[@id='Confirmation']/prompts/prompt.
We call it an incoherence when multiple incompatible values have been provided for the same GDN.
What should the right strategy be? Some systems only take the last value into consideration.
We propose to inform the user about it, so that he/she can provide the right value.
To see this strategy operating:
Note: How the incoherence prompt is built is defined in the file Resources/config.txt, under the node config/globalPrompts/incoherence.
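Detecting an incoherence amounts to checking whether one attribute received more than one distinct value. The following Python sketch shows one way to do this; the function name `find_incoherences` and the representation of the input as a list of (attribute, value) tuples are assumptions made for illustration.

```python
def find_incoherences(pairs):
    """Return the attributes that received several distinct values.

    pairs: list of (attribute, value) tuples collected for one user turn.
    The result maps each conflicting attribute to its sorted values.
    """
    seen = {}
    for attribute, value in pairs:
        seen.setdefault(attribute, set()).add(value)
    return {
        attribute: sorted(values)
        for attribute, values in seen.items()
        if len(values) > 1
    }


conflicts = find_incoherences([
    ("DeviceName", "Fan"),
    ("DeviceName", "Light"),
    ("ActionName", "SwitchOn"),
])
print(conflicts)
```

Keeping all conflicting values, rather than only the last one, is what allows the system to report them to the user.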
The user can ask the system to change the focus.
To see this strategy operating:
Note: The grammar is defined in the file Resources/mappingNLU.txt, under the node mappingNLU/map with <cv name='_NEWFOCUS'/>
You are now ready to improve the system:
mappingNLU/map with <cv name='DeviceName' value='Fan'/>
Now that your system is ready, it is important to evaluate it in order to have some "objective" measure of its quality. To do so, you will perform a field-test during which your system will be used by different test people. In your case, the test users will be the other students. Hereafter, we describe how a typical field-test works:
In your case, you will start directly at step 3. You will have to let several test persons interact with your system and fill in the SQ. Then, on the basis of the obtained answers, produce simple statistics (average values for each question), and try to draw some conclusions about your system.
What are the results? What is your feeling about them? Any idea on how to improve your system?
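The average value for each question mentioned above could be computed with a short Python script such as the one below. The question identifiers (Q1, Q2) and the scores are placeholder values invented for the example, not results; the sketch assumes that each answer is a number and that a test person may skip a question.

```python
from statistics import mean


def average_per_question(answers):
    """Average the scores per question.

    answers: one dict per test person, mapping question id to a numeric score.
    Questions a person did not answer are ignored for that person.
    """
    questions = sorted({q for person in answers for q in person})
    return {
        q: mean(person[q] for person in answers if q in person)
        for q in questions
    }


# Placeholder answers from three hypothetical test persons.
answers = [
    {"Q1": 4, "Q2": 2},
    {"Q1": 5, "Q2": 3},
    {"Q1": 3},
]
averages = average_per_question(answers)
print(averages)
```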