Computational Processing of Textual Data (TIDT)

Natural Language Processing Course

Practical work: Rapid Dialogue Prototyping

David Portabella, Miroslav Melichar, Martin Rajman


Introduction

The goal of this practical session is to illustrate a concrete implementation of the methodology for rapid dialogue prototyping (RDPM) that has been presented in the course. For this purpose, you will be invited to test and improve a dialogue-enabled application for controlling the devices of a Smart Home.

RDPM is a framework for deploying computer interaction applications. It can run in standalone mode or in Wizard of Oz (Wizard) mode. In Wizard mode, a person supervises the system. Note that the user should believe that he/she is interacting with a fully automatic system, with no human supervising it.
The Wizard can do several tasks, among others:

We are going to use the RDPM for a Smart Home application, which aims at controlling the devices of a room. This tutorial is meant to be done in teams of two people. One person will be the User that is inside the room and interacts with the system, and the second person will be the Wizard of Oz (Wizard), who will supervise the system.

As this room is not equipped to control the devices, we are going to use a Room Simulator: on one computer there will be a screen with a picture of a room containing some devices (some lights, a fan, the blinds, a TV and an answering machine). The User is supposed to imagine that he/she is in the room and interacts with the system while looking at the device he/she wishes to control.
The Wizard will sit in front of another computer, supervising the system. As the Speech Recognition Engine is not plugged into the system, the Wizard will in all cases have to at least interpret the user query, either by typing its transcription or by producing the corresponding semantic pairs.

This tutorial is to be done following the slides of the course.

The organization of the session is as follows:

  1. Installation and first try
  2. Understanding RDPM
  3. Improving the system
  4. Field-Test

1: Installation and first try

  1. Installation: Installing the system
  2. Quick hands-on: We play with the system. The Wizard only transcribes the user's query
  3. Task modeling: The Smart Home application
  4. A better understanding: The Wizard also looks at what is happening
  5. Prompts and grammars
  6. List Processing GDNs
  7. The role of the Wizard

1.1: Installation

Before starting, you will need to download Tidt-rdp and unzip it to your directory, e.g. c:\Tidp-rdp
(first create the directory C:\Tidp-rdp and then unzip the files into it).
If several groups share the same account, make sure that your directories have different names.

The zip file contains the RDPM framework and the application for a Smart Home.

RDPM requires Java JDK 1.4.1 (or higher). The zip file already includes a JDK for MS Windows. If you are using another operating system, such as Linux, please make sure that Java JDK 1.4.1 (or higher) is installed.

Depending on the number of computers available, you can use 2 computers for each team (one for the Room Simulator, another one for the Wizard), or you can display both windows on a single computer. In the latter case, the User shouldn't cheat by looking at the Wizard window.

How to run:
To run the system without the Wizard Control Interface:

To run the Wizard Control Interface and the Room Simulator on a single computer:

To use two computers, one for the Wizard Control Interface and another one for the Room Simulator:

1.2: Quick hands-on

  1. Start the system by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnv.bat

  2. User: you are in the room and the lights are switched off. Ask the system to switch them on, by typing in the InputSRE simulator.
  3. User: If you do not succeed, ask the system for help.
  4. User: What devices do you see in the room? Try to operate them. What is your feeling about the system?

In this first case, there was no Wizard supervising the system.

1.3: The Smart Home task model

The task is described in the form of a set of relational tables (frames), where the rows correspond to the possible functions (also called "solutions" or "targets") and the columns are the attributes needed to uniquely identify each of the functions and to invoke it.

The task model of our Smart Home application corresponds to a simplified version of the Inspire project task model:

DeviceName  DeviceLocation  ActionName  Day  TVShowName  Channel  TVShowAction
Fan         -               SwitchOn    -    -           -        -
Light       Left            Down        -    -           -        -
TV          -               Record      Mon  Tomorrow    TSR      Record
...

Note: The task model is defined in the file Resources/solutions.txt.
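
As a minimal sketch of this representation (in Python, purely for illustration; it is not the RDPM code, and the helper name find_solutions is hypothetical), each row can be held as a dictionary from attribute to value, with "-" marking an attribute that is not relevant for that function. The rows are the three examples above, with the cell boundaries as we read them from the table:

```python
# Attribute names taken from the header of the task model table.
ATTRIBUTES = ["DeviceName", "DeviceLocation", "ActionName", "Day",
              "TVShowName", "Channel", "TVShowAction"]

# The three example rows of the table; "-" means "not relevant".
ROWS = [
    ["Fan", "-", "SwitchOn", "-", "-", "-", "-"],
    ["Light", "Left", "Down", "-", "-", "-", "-"],
    ["TV", "-", "Record", "Mon", "Tomorrow", "TSR", "Record"],
]

# One dictionary per solution (row): attribute name -> value.
SOLUTIONS = [dict(zip(ATTRIBUTES, row)) for row in ROWS]


def find_solutions(solutions, **known):
    """Return the rows whose attributes match every known value."""
    return [s for s in solutions
            if all(s[attr] == value for attr, value in known.items())]


if __name__ == "__main__":
    for solution in find_solutions(SOLUTIONS, DeviceName="TV"):
        print(solution)
```

A function is uniquely identified once the known attribute values leave exactly one row.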

1.4: A better understanding

Now the Wizard is going to do the same task as before, but at the same time we'll explain step by step what is happening in the system. No User needed here.

Start the system using the Room Simulator and the Wizard on a single computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat

Quick description of the Wizard Control Interface (screenshot1):

The system says:

Welcome to the Smart Home system. What can I do for you?

The screen shows that:

  1. the current GDN is the Start GDN,
  2. there are other open GDNs: DeviceName, Location, ActionName, Day, Channel, and TVShowAction,
  3. the system can accept a request for help, a request for repetition, a request for new focus and detect a failure situation.

In the Input Simulator screen, check the "Autosubmit" option.
In the InputSRE simulator, type the text "help please" + RETURN.
What happened?
  1. The system has parsed the input text with the active grammar and has understood a Request For Help situation.
  2. The system has searched the help prompt defined in the Start GDN, and played it (the current GDN remains the Start GDN):
    The system allows you to control the following devices: Fan, Blinds, Lights, TV, and Answering machine. You can also stop the system by clicking on the "Stop" button.

Now type "switch on the ventilator" + RETURN.
  1. From the currently defined grammar, the system has successfully mapped "switch on" to the semantic pair ActionName=SwitchOn, but the word "ventilator" is not defined in the grammar and so it has been treated as noise and ignored.
  2. In the DialogStateInfo window, the value of ActionName is shown in black. The grey text indicates that the only possible values for DeviceName are now Fan, Light and TV (so, if the user asked for the Blinds or the Answering Machine, the Dialog Manager would run into an incoherence state).
  3. In the same way, you can see that there are no compatible values for Day, TVShowName, Channel and TVShowAction as these GDNs are only used for the device TV.
  4. The branching logic selects the next GDN (generally, the first one that is empty), in this case the DeviceName GDN.
  5. It then produces the prompt, which is an implicit confirmation of the last understood semantic pairs followed by the main prompt of the new current GDN (DeviceName):
    I understood SwitchOn as Operation. What is the device you want to operate?
Generally, there can be 3 reasons why the system does not understand the keyword "ventilator":
  1. The system is not prepared to operate such a device,
  2. The system is prepared to operate the device, but the grammar supports only a synonym (e.g. fan) and not the keyword used,
  3. The system is prepared to operate the device, the grammar supports the keyword, but there was a speech recognition error.
In any case, a possible strategy for the user is to ask for help,
so type "help" or "what are the possibilities?" + RETURN.
In this case:
  1. The system has understood a Request For Help situation, and therefore plays the help prompt associated with the current GDN (DeviceName):
    The options are Fan, Blinds, Light, TV or Answering machine.

Provided that the user recognizes that the fan is what he/she is looking for, he/she can now complete the request,
consequently type "the fan please" + RETURN.

  1. Using the active grammar, the system has successfully mapped "fan" to the semantic pair {DeviceName=Fan}.
  2. The DialogStateInfo window shows that all the attributes have just one possible value (or zero, if the attribute is not relevant for the current solution), so you can deduce that the system has found exactly 1 solution.
  3. Now the Dialog Manager will ask the Action Performer to execute the action associated with the solution (to switch on the fan).
  4. The Action Performer will then return feedback about the action to the Dialog Manager.
    In this case, it could report that the fan was switched on successfully, that it was already on, or that switching it on failed.
  5. The system will communicate the feedback to the User and restart the dialogue.
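
The filtering displayed in the DialogStateInfo window during this walkthrough can be sketched as follows. This is an illustration in Python under stated assumptions, not the RDPM code: the table is a hypothetical excerpt with only two attributes, and filter_solutions and compatible_values are invented helper names.

```python
# Hypothetical excerpt of a task model with two attributes only.
SOLUTIONS = [
    {"DeviceName": "Fan", "ActionName": "SwitchOn"},
    {"DeviceName": "Fan", "ActionName": "SwitchOff"},
    {"DeviceName": "Light", "ActionName": "SwitchOn"},
    {"DeviceName": "TV", "ActionName": "SwitchOn"},
    {"DeviceName": "Blinds", "ActionName": "Open"},
    {"DeviceName": "AnsweringMachine", "ActionName": "PlayMsg"},
]


def filter_solutions(solutions, acquired):
    """Keep the solutions that agree with every acquired value."""
    return [s for s in solutions
            if all(s[a] == v for a, v in acquired.items())]


def compatible_values(solutions, acquired, attribute):
    """Values of `attribute` still possible given the acquired pairs."""
    remaining = filter_solutions(solutions, acquired)
    return sorted({s[attribute] for s in remaining})


if __name__ == "__main__":
    acquired = {"ActionName": "SwitchOn"}
    # Devices still compatible after "switch on the ventilator".
    print(compatible_values(SOLUTIONS, acquired, "DeviceName"))
    # After "the fan please", a single solution remains.
    acquired["DeviceName"] = "Fan"
    print(len(filter_solutions(SOLUTIONS, acquired)))
```

With this toy table, acquiring ActionName=SwitchOn leaves Fan, Light and TV as compatible devices, and adding DeviceName=Fan leaves a single solution, which is when the action can be executed.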

1.5: Prompts and grammars

The system communicates with the user using the prompts (i.e. the spoken text) and expects an answer covered by a predefined grammar.

1.5.1: Prompts

Each GDN defines a set of prompts (we'll see later that there are different types of prompts).

To see this strategy operating:

Note: The prompts are defined for each GDN in the file Resources/config.txt, under the nodes config/gdns/node/prompts/

1.5.2: Implicit Confirmation

Imagine a situation like the following:
the user says "Switch on the TV", but the system understands ActionName=SwitchOn, DeviceName=Light.
The system would then ask "Which lamp", and the user could be quite confused (do I need a lamp to switch on the TV?).

To avoid this situation, the system could ask for confirmation at every step:
the user says "Switch on the TV" and the system explicitly asks "I understood Switch On as Operation and TV as device. Is it correct?",
but this would be quite annoying.

We propose instead an implicit confirmation strategy. The system informs the user about the understood concepts and formulates the next question at the same time. If the system misunderstood a concept, the user can notice it immediately, without having to confirm at every step.

To see this strategy operating:

Note: How the implicit confirmation is built is defined in the file Resources/config.txt, under the node config/globalPrompts/implicitConfirmation.
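
To make the idea concrete, here is a minimal sketch in Python of how such a prompt could be assembled. It is an assumption-laden illustration, not the way RDPM builds it: the function name implicit_confirmation and the LABELS table are hypothetical, and only the wording of the resulting sentence follows the example prompts of this tutorial.

```python
# Hypothetical display labels for the attributes (the wording follows
# the example prompts quoted in this tutorial).
LABELS = {"ActionName": "Operation", "DeviceName": "device"}


def implicit_confirmation(pairs, main_prompt):
    """Prefix the next question with the pairs that were just understood.

    `pairs` is a list of (attribute, value) semantic pairs.
    """
    if not pairs:
        return main_prompt
    understood = " and ".join(
        f"{value} as {LABELS.get(attr, attr)}" for attr, value in pairs)
    return f"I understood {understood}. {main_prompt}"


if __name__ == "__main__":
    print(implicit_confirmation(
        [("ActionName", "SwitchOn")],
        "What is the device you want to operate?"))
```

The user hears what was understood and the next question in a single turn, so a misunderstanding is visible without an extra yes/no exchange.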

1.5.3: Conditional prompts

To see this strategy operating:

Note: This extra help prompt is defined with the following condition, coded in JavaScript:
java.isEstablished('DeviceName','TV').

Note: A condition can be defined for each type of prompt, for each GDN.
See the file Resources/config.txt, the cond attribute under the node config/gdns/node[@id='ActionName']/prompts/prompt[@type='help']/
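
The effect of such a condition can be sketched as follows. This is a Python illustration only (the real conditions are JavaScript expressions evaluated by the framework); is_established mimics the isEstablished call above, while select_prompts, the prompt texts and the data layout are hypothetical.

```python
def is_established(state, attribute, value):
    """True if `attribute` has been acquired with exactly `value`."""
    return state.get(attribute) == value


# Hypothetical help prompts for the ActionName GDN; a prompt whose
# "cond" is None is unconditional. The texts are invented.
HELP_PROMPTS = [
    {"cond": lambda st: is_established(st, "DeviceName", "TV"),
     "text": "For the TV you can also record a show."},
    {"cond": None,
     "text": "Please tell me the operation you want to perform."},
]


def select_prompts(prompts, state):
    """Return the texts of all prompts whose condition holds."""
    return [p["text"] for p in prompts
            if p["cond"] is None or p["cond"](state)]


if __name__ == "__main__":
    print(select_prompts(HELP_PROMPTS, {"DeviceName": "TV"}))
    print(select_prompts(HELP_PROMPTS, {"DeviceName": "Fan"}))
```

The extra help text is only played once the dialogue state contains DeviceName=TV; otherwise only the unconditional prompt is used.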

1.5.4: Grammars

The Natural Language Understanding module is responsible for translating a natural language utterance into a set of semantic pairs.
In the current system we are using regular expressions. For instance, the semantic pair {ActionName=PlayMsg} can be produced from an utterance covered with the regular expression "play (message|msg)" (so, "play message" or "play msg").

The grammar is defined in the file Resources/mappingNLU.xml. Look at it and try it with the system.
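
The following Python sketch illustrates the principle of mapping an utterance to semantic pairs with regular expressions. It does not reproduce the format of the mapping file: the RULES list and the function understand are hypothetical, and only the "play (message|msg)" expression is taken from the text above.

```python
import re

# Hypothetical rules: each regular expression produces one semantic
# pair (attribute, value) when it is found in the utterance.
RULES = [
    (r"play (message|msg)", ("ActionName", "PlayMsg")),
    (r"switch on", ("ActionName", "SwitchOn")),
    (r"\bfan\b", ("DeviceName", "Fan")),
]


def understand(utterance):
    """Return the semantic pairs found in the utterance.

    Words that are not covered by any rule are treated as noise
    and simply ignored.
    """
    text = utterance.lower()
    return [pair for pattern, pair in RULES if re.search(pattern, text)]


if __name__ == "__main__":
    print(understand("switch on the ventilator"))
    print(understand("the fan please"))
```

With these rules, "switch on the ventilator" yields only the pair for ActionName, because "ventilator" matches no rule, which is the behaviour observed in section 1.4.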

1.5.5: Mixed initiative

To see this strategy operating:

Note: The other active GDNs are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/otherActiveNodes/

1.6: List Processing GDNs

List Processing GDNs allow the user to browse through a list of values and select one of them through the number identifying its position in the list. This type of GDN is particularly useful in the case of a large number of possible values for an attribute, when the values are linguistically complex or when the range of the values is not known in advance. Reducing the interaction vocabulary to numerals adds robustness to the speech recognition component, potentially substantially improving its recognition rate.

To see this strategy operating:

Note: A List Processing GDN is defined by setting type="list" in the GDN definition, and there is no grammar associated with its values.
See the file Resources/config.txt, under the node config/gdns/node[@id='TVShowName']

The grammar is defined in the file Resources/mappingNLU.txt, under the node mappingNLU/map with <cv name='_LIST'/>
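
A minimal sketch of the selection-by-position idea, in Python and under stated assumptions: the function names list_prompt and select_by_number and the TV show names are invented for the example, and the real GDN also supports browsing through the list, which is not shown here.

```python
def list_prompt(values):
    """Number the values so that the user can answer with a position."""
    return " ".join(f"{i}: {v}." for i, v in enumerate(values, start=1))


def select_by_number(values, answer):
    """Return the value at the 1-based position given by the user,
    or None if the answer is not a valid position."""
    try:
        position = int(answer.strip())
    except ValueError:
        return None
    if 1 <= position <= len(values):
        return values[position - 1]
    return None


if __name__ == "__main__":
    shows = ["News", "Weather", "Movie"]  # hypothetical TV show names
    print(list_prompt(shows))
    print(select_by_number(shows, "2"))
```

Only numerals need to be recognized, whatever the values in the list are, which is why this kind of GDN needs no grammar for the values themselves.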

1.7: The role of the Wizard

The RDP methodology was explained in the course: we start by producing a model of the targeted application (the task model), and afterwards we derive an initial dialogue model from the produced task model.

In this phase, the system typically has some limitations, among others:

These limitations can be corrected. We will now give the Wizard the task of correcting or producing the semantic pairs. If there is a basic grammar, it can always be used during the Wizard experiment to produce some initial semantic pairs. However, the Wizard can correct them, or produce them if the grammar is missing or incomplete. The results of the Wizard's work will be saved in a log file to help improve the grammar after the experiment.

Start the system using the Room Simulator and the Wizard on a single computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat

In the Input Simulator screen, make sure that the "Autosubmit" option is deselected.

In the InputSRE simulator, type the text "switch on the ventilator" + RETURN.
What happened?

In the next section we are going to learn about the different strategies used in the RDP.

2: Understanding RDPM

  1. Local dialogue management strategy
  2. Global dialogue management strategy

2.1: Local dialogue management strategy

  1. Help
  2. Repetition
  3. No input
  4. No match

2.1.1: Help

To see this strategy operating:

Start the system using the Room Simulator and the Wizard on a single computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat

Note: The prompts are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/prompts/prompt[@type='help']

Note: the grammar is defined in the file Resources/mappingNLU.txt, under the node mappingNLU/map with <cv name='_HELP' value='_HELP'/>

2.1.2: Repetition

To see this strategy operating:

Note: the grammar is defined in the file Resources/mappingNLU.txt, under a map node with <cv name='_REPEAT' value='_REPEAT'/>

Question: The GDN contains the definition of different prompt types: main prompt, help prompt, no match prompt...
Should the repeat prompt be defined there as well?

2.1.3: No input

To see this strategy operating:

Note: The prompts are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/prompts/prompt[@type='noinput']

2.1.4: No match

To see this strategy operating:

Note: The prompts are defined for each GDN. See the file Resources/config.txt, under the node config/gdns/node[@id='DeviceName']/prompts/prompt[@type='nomatch']
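
The four local strategies can be summarized by the following Python sketch. It is one possible reading, not the RDPM implementation: the function local_strategy, the way the last prompt is replayed, and the noinput and nomatch texts are assumptions; the concept names _HELP and _REPEAT and the prompt types come from the notes above.

```python
# Prompts of one GDN, indexed by prompt type. The help text is the one
# heard in section 1.4; the other two texts are hypothetical.
PROMPTS = {
    "help": "The options are Fan, Blinds, Light, TV or Answering machine.",
    "noinput": "Sorry, I did not hear anything.",
    "nomatch": "Sorry, I did not understand.",
}


def local_strategy(pairs, gdn_prompts, last_prompt):
    """Pick the system reply for one user turn within the current GDN.

    `pairs` is the list of semantic pairs produced for the turn, or
    None when the user said nothing at all.
    """
    if pairs is None:
        return gdn_prompts["noinput"]
    names = [name for name, _ in pairs]
    if "_HELP" in names:
        return gdn_prompts["help"]
    if "_REPEAT" in names:
        return last_prompt  # assumption: replay whatever was said last
    if not pairs:
        return gdn_prompts["nomatch"]
    return None  # real content: left to the global strategy


if __name__ == "__main__":
    print(local_strategy([("_HELP", "_HELP")], PROMPTS, "What device?"))
    print(local_strategy([], PROMPTS, "What device?"))
```

In all four cases the current GDN stays the same; only a turn that carries real semantic pairs moves the dialogue forward.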

2.2: Global dialogue management strategy

  1. Branching Logic
  2. Dialogue Termination
  3. Dead Ends
  4. Explicit Confirmation
  5. Incoherencies
  6. Change focus

2.2.1: Branching Logic

It is the acquire-filter-(propagate)-activate cycle explained in the course.
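
As a reminder, one step of this cycle can be sketched in Python as below. The sketch rests on assumptions that should be checked against the course slides: a value is considered propagated when only one compatible value remains for an attribute, and the next GDN is the first one, in a fixed order, that is neither acquired nor propagated. The table and the function name cycle are hypothetical.

```python
GDN_ORDER = ["DeviceName", "Location", "ActionName"]

# Hypothetical task model excerpt; "-" marks a non-relevant attribute.
SOLUTIONS = [
    {"DeviceName": "Fan", "Location": "-", "ActionName": "SwitchOn"},
    {"DeviceName": "Fan", "Location": "-", "ActionName": "SwitchOff"},
    {"DeviceName": "Light", "Location": "Left", "ActionName": "SwitchOn"},
    {"DeviceName": "Light", "Location": "Right", "ActionName": "SwitchOn"},
]


def cycle(acquired, new_pairs):
    """Run one acquire-filter-propagate-activate step.

    Returns (remaining solutions, propagated values, next GDN or None).
    """
    # Acquire: store the new semantic pairs.
    acquired.update(new_pairs)
    # Filter: keep the solutions compatible with all acquired values.
    remaining = [s for s in SOLUTIONS
                 if all(s[a] == v for a, v in acquired.items())]
    # Propagate: fill attributes that have a single compatible value.
    propagated = {}
    for attr in GDN_ORDER:
        values = {s[attr] for s in remaining}
        if attr not in acquired and len(values) == 1:
            propagated[attr] = values.pop()
    # Activate: first GDN that is still empty, if any.
    next_gdn = next((a for a in GDN_ORDER
                     if a not in acquired and a not in propagated), None)
    return remaining, propagated, next_gdn


if __name__ == "__main__":
    print(cycle({}, {"DeviceName": "Fan"}))
```

With this toy table, acquiring DeviceName=Fan propagates Location (not relevant for the fan) and activates the ActionName GDN.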

To see this strategy operating:

Start the system using the Room Simulator and the Wizard on a single computer,
by executing the batch file: c:\Tidp-rdp\runSmartHome\RunSinEnvWOz.bat

Question: Should the implicit confirmation also inform about the propagated values?

2.2.2: Dialogue Termination Logic

We are here interested in the third case explained in the course:
The number of solutions is higher than 1 AND the branching strategy is unable to find another GDN to select (i.e. all the slots are filled in or their value is propagated). This typically happens when the user provided the "ANY" value for some of the attributes.

To see this strategy operating:

2.2.3: Dead ends

A dead end is a dialogue situation where no solution can be found that corresponds to the request of the user. To cope with this, we use the following relaxation strategy:

  1. determine the total number of solutions that are compatible with all but one of the values that have been explicitly acquired (i.e. not propagated);
  2. remove the selected value, re-propagate from the remaining ones, and activate a yes/no choice GDN to get the user's decision about the relaxation;
  3. if the user agrees with the relaxation, activate the next GDN according to the standard activation rule, otherwise choose another relaxable attribute and go to step 2;
  4. if the user rejects all relaxation possibilities, reset the dialogue.
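
Step 1 of this strategy can be sketched in Python as follows. The sketch is illustrative only: the table is hypothetical, relaxation_candidates is an invented name, and the candidates are simply listed in acquisition order, since the text does not say how the attribute to relax is chosen.

```python
# Hypothetical task model excerpt.
SOLUTIONS = [
    {"DeviceName": "Fan", "ActionName": "SwitchOn"},
    {"DeviceName": "Fan", "ActionName": "SwitchOff"},
    {"DeviceName": "Blinds", "ActionName": "Open"},
    {"DeviceName": "Blinds", "ActionName": "Close"},
]


def relaxation_candidates(solutions, acquired):
    """For each explicitly acquired attribute, count the solutions that
    match all the other acquired values (i.e. with that one dropped).

    Returns (attribute, count) pairs for the attributes whose removal
    leaves at least one solution, in acquisition order.
    """
    candidates = []
    for dropped in acquired:
        kept = {a: v for a, v in acquired.items() if a != dropped}
        count = sum(all(s[a] == v for a, v in kept.items())
                    for s in solutions)
        if count > 0:
            candidates.append((dropped, count))
    return candidates


if __name__ == "__main__":
    # "Switch on the blinds" is a dead end in this toy table.
    dead_end = {"DeviceName": "Blinds", "ActionName": "SwitchOn"}
    print(relaxation_candidates(SOLUTIONS, dead_end))
```

Each candidate corresponds to one yes/no question to the user, for example whether he/she agrees to give up the requested device or the requested action.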

Let's try the first case:

The second case:

2.2.4: Explicit confirmation

The system asks for an explicit confirmation before executing irreversible actions.

To see this strategy operating:

Note: The need for an explicit confirmation is defined for each solution in the solutions file using the needsConfirmation attribute.
See the file Resources/solutions.txt, the value of the field needsConfirmation for the solution:
solutionId#DeviceName|Location|ActionName|Day|TVShowName|Channel|TVShowAction#needsConfirmation?#list of possible action feedback messages, separated by '|'
054#AnsweringMachine|-|Delete|-|-|-|-#true#054fail|054_1|054_2|054_3
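
To illustrate the layout of such a line, here is a small Python sketch that splits it into its four fields. It assumes that every line has exactly four '#'-separated fields and that the confirmation flag is written as the literal true; the function name parse_solution and the keys of the returned dictionary are our own choices, not part of RDPM.

```python
# Attribute names, in the order given by the header line above.
ATTRIBUTES = ["DeviceName", "Location", "ActionName", "Day",
              "TVShowName", "Channel", "TVShowAction"]


def parse_solution(line):
    """Split one '#'-separated solution line into its four fields."""
    solution_id, values, needs_confirmation, feedback = \
        line.strip().split("#")
    return {
        "id": solution_id,
        "attributes": dict(zip(ATTRIBUTES, values.split("|"))),
        "needsConfirmation": needs_confirmation == "true",
        "feedback": feedback.split("|"),
    }


if __name__ == "__main__":
    line = ("054#AnsweringMachine|-|Delete|-|-|-|-"
            "#true#054fail|054_1|054_2|054_3")
    print(parse_solution(line))
```

For solution 054 the flag is true, so the confirmation GDN is activated before the messages of the answering machine are deleted.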

Note: The prompts for the confirmation GDN are defined in the file Resources/config.txt, under the node config/gdns/node[@id='Confirmation']/prompts/prompt.

2.2.5: Incoherencies

An incoherence is a situation in which multiple incompatible values have been provided for the same GDN.
What should the right strategy be? Some systems only take the last value into consideration.
We propose to inform the user about it so that he/she can provide the right one.

To see this strategy operating:

Note: How the incoherence prompt is built is defined in the file Resources/config.txt, under the node config/globalPrompts/incoherence.
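
Detecting this kind of incoherence can be sketched in Python as below. This is a simplified illustration with an invented function name (find_incoherences); it only looks for several distinct values given for the same attribute and does not build the incoherence prompt itself.

```python
def find_incoherences(pairs):
    """Return {attribute: [values]} for the attributes that received
    more than one distinct value among the given semantic pairs."""
    seen = {}
    for attr, value in pairs:
        values = seen.setdefault(attr, [])
        if value not in values:
            values.append(value)
    return {a: vs for a, vs in seen.items() if len(vs) > 1}


if __name__ == "__main__":
    pairs = [("ActionName", "SwitchOn"),
             ("DeviceName", "Fan"),
             ("DeviceName", "TV")]
    print(find_incoherences(pairs))
```

Instead of silently keeping the last value, the system can use the returned values to tell the user what it understood and ask which one is right.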

2.2.6: Change focus

The user can ask the system to change the focus.

To see this strategy operating:

Note: The grammar is defined in the file Resources/mappingNLU.txt, under the node mappingNLU/map with <cv name='_NEWFOCUS'/>

 

 

3: Improving the system

You are now ready to improve the system:

 

 

4: Field-Test

Now that your system is ready, it is important to evaluate it in order to have some "objective" measure of its quality. To do so, you will perform a field-test during which your system will be used by different test people. In your case, the test users will be the other students. Hereafter, we describe how a typical field-test works:

  1. Each person participating in the evaluation is informed about the purpose of the system;
  2. A short scenario is given to the participants in order to provide a concrete context for their interaction with the system; additional general instructions concerning the interaction with the system might also be provided;
  3. The user is exposed to the system and performs the interaction;
  4. After the interaction, a "satisfaction questionnaire" (SQ) is submitted to the user. An SQ is composed of different types of questions, and its general goal is to gather information about the opinions of the users concerning different aspects of the system (efficiency, ergonomics, ease of use, and so on). Then, on the basis of the gathered results, statistics can be computed and some retrospective and prospective analysis of the system can be performed.
You can find here an example of an SQ (extracted from the InfoVox project).

In your case, you will start directly at step 3. You will have to let several test persons interact with your system and have them fill in the SQ. Then, on the basis of the obtained answers, produce simple statistics (average values for each question), and try to draw some conclusions about your system.
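
If the answers are numeric, the averages can be computed with a few lines of Python, as in the sketch below. The question identifiers and the scores are hypothetical, and average_per_question is just a suggested helper, not part of the provided material.

```python
from statistics import mean


def average_per_question(answers):
    """`answers` holds one dict per test user: question id -> score.

    Returns question id -> average score over the users who answered.
    """
    questions = sorted({q for a in answers for q in a})
    return {q: mean(a[q] for a in answers if q in a) for q in questions}


if __name__ == "__main__":
    # Hypothetical answers of three test users on a numeric scale.
    answers = [
        {"Q1": 4, "Q2": 2},
        {"Q1": 2, "Q2": 3},
        {"Q1": 3},  # this user skipped Q2
    ]
    for question, average in average_per_question(answers).items():
        print(question, average)
```

A question skipped by a user is left out of the average for that question instead of being counted as zero.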

What are the results? What is your feeling about them? Any idea on how to improve your system?

 

 


June 2005, David Portabella