PALS0039 Introduction to Deep Learning for Speech and Language Processing

Week 9 - Human-Machine Dialogue Systems

In which we discuss neural network systems for holding a conversation and answering questions.

Learning Objectives

By the end of the session the student will be able to:

describe the general architecture of conversational agents
outline the processing that takes place in neural speech recognition and neural speech synthesis systems
describe a neural system for the conversion of text to logical forms
outline how questions can be answered using a knowledge base
use Keras to implement a simple automated assistant

Outline

Today's lecture is given by Christos Christodoulopoulos from Amazon Cambridge.

History of dialogue systems

We present a brief history of dialogue systems from early text understanding in micro-worlds through to chatbots and conversational agents like Alexa and Siri. We discuss the 'Turing test' as a gold standard for human-machine dialogue.

Terry Winograd's SHRDLU natural language understanding system, 1970. Wikipedia.
D. Bobrow, R. Kaplan, M. Kay, D. Norman, H. Thompson, T. Winograd, GUS, a frame-driven dialog system. Artificial Intelligence, 1977.
The Turing Test. Stanford Encyclopedia of Philosophy.
Kuki chatbot

Dialogue system architecture

We discuss the components of a typical dialogue system: speech recognition, natural language understanding, dialogue management, natural language generation and text-to-speech synthesis.

Natural Language Understanding with Alexa Skills.
C. Khatri, et al, Advancing the state of the art of open domain dialog systems through the Alexa prize. arXiv 2018.
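
As a rough illustration of how the five components listed above chain together, here is a minimal Python sketch. Every function name, intent label and response template in it is a hypothetical stand-in invented for this example. A real system would replace each stage with a trained model, and the "audio" here is simply a transcript string.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DialogueState:
    """Conversation memory kept by the dialogue manager (hypothetical structure)."""
    history: List[str] = field(default_factory=list)


def recognise_speech(audio: str) -> str:
    # Stand-in for speech recognition: the "audio" is already a transcript.
    return audio.lower().strip()


def understand(text: str) -> Dict[str, str]:
    # Stand-in for natural language understanding: keyword-based intent detection.
    if "weather" in text:
        return {"intent": "get_weather"}
    if "hello" in text or "hi" in text.split():
        return {"intent": "greet"}
    return {"intent": "unknown"}


def manage_dialogue(frame: Dict[str, str], state: DialogueState) -> Dict[str, str]:
    # Stand-in for the dialogue manager: record the intent and choose a system action.
    state.history.append(frame["intent"])
    actions = {"get_weather": "inform_weather", "greet": "greet_back"}
    return {"action": actions.get(frame["intent"], "ask_repeat")}


def generate_text(act: Dict[str, str]) -> str:
    # Stand-in for natural language generation: fixed templates per action.
    templates = {
        "inform_weather": "I cannot check the weather in this sketch.",
        "greet_back": "Hello! How can I help?",
        "ask_repeat": "Sorry, could you rephrase that?",
    }
    return templates[act["action"]]


def synthesise_speech(text: str) -> str:
    # Stand-in for text-to-speech: wrap the text instead of producing audio.
    return f"<speech>{text}</speech>"


def respond(audio: str, state: DialogueState) -> str:
    """Run one user turn through the whole pipeline."""
    text = recognise_speech(audio)
    frame = understand(text)
    act = manage_dialogue(frame, state)
    reply = generate_text(act)
    return synthesise_speech(reply)
```

Calling respond("Hello there", DialogueState()) passes the utterance through each stage in turn. Keeping the stages as separate functions mirrors the modular architecture: any one of them can be swapped for a neural model without changing the others.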

Question Answering

We address the problem of finding answers to users' questions in a knowledge base. First, each query is converted to a logical form through semantic parsing; the logical form is then used to retrieve matching data from a structured database of facts.

Wikidata query service.
S. Rongali, L. Soldaini, E. Monti, W. Hamza, Don't parse, generate! A sequence to sequence architecture for task-oriented semantic parsing. arXiv 2020.
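
The two-step process described above (parse the question to a logical form, then look the logical form up in a fact store) can be sketched in a few lines of Python. This is a toy under stated assumptions: the regular-expression "parser", the relation names and the three hand-written facts are all invented for illustration. It stands in for, and is much simpler than, the neural sequence-to-sequence parsers discussed in the lecture.

```python
import re
from typing import Optional, Tuple

# Hypothetical knowledge base: (relation, entity) -> answer.
FACTS = {
    ("capital_of", "france"): "Paris",
    ("capital_of", "japan"): "Tokyo",
    ("author_of", "hamlet"): "William Shakespeare",
}

# Hypothetical question patterns, each paired with the relation it expresses.
PATTERNS = [
    (re.compile(r"what is the capital of (.+?)\??$"), "capital_of"),
    (re.compile(r"who wrote (.+?)\??$"), "author_of"),
]


def parse(question: str) -> Optional[Tuple[str, str]]:
    """Semantic parsing step: map a question to a (relation, entity) logical form."""
    text = question.lower().strip()
    for pattern, relation in PATTERNS:
        match = pattern.match(text)
        if match:
            return (relation, match.group(1).strip())
    return None  # the question is outside the patterns this toy parser knows


def answer(question: str) -> Optional[str]:
    """Retrieval step: use the logical form as a key into the fact store."""
    logical_form = parse(question)
    if logical_form is None:
        return None
    return FACTS.get(logical_form)
```

For example, parse("Who wrote Hamlet?") yields the logical form ("author_of", "hamlet"), which answer() then uses to retrieve the stored fact. Separating parse() from answer() keeps the language-dependent step apart from the database lookup, so the parser could be replaced by a learned model without touching the retrieval code.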

Semantic parsing of disfluent speech

Natural speech is full of disfluencies. We discuss how the intention of the user can be established despite disfluency.

P. Sen, I. Groves, Semantic parsing of disfluent speech. Amazon Science, 2021.
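
To make the idea of disfluency concrete, the Python sketch below strips two simple kinds from a transcript before it would be passed to a parser: filled pauses and immediately repeated words. It is a toy rule-based filter written for this page, not the method of the paper above. The list of filled pauses is an assumption, and self-corrections such as "to Boston, I mean Denver" are not handled.

```python
# Hypothetical list of filled pauses; a real system would learn these from data.
FILLED_PAUSES = {"um", "uh", "er", "erm"}


def remove_simple_disfluencies(utterance: str) -> str:
    """Drop filled pauses and immediate word repetitions from a transcript."""
    cleaned = []
    for token in utterance.lower().split():
        if token in FILLED_PAUSES:
            continue  # skip filled pauses such as "um"
        if cleaned and cleaned[-1] == token:
            continue  # skip a word that repeats the last word kept
        cleaned.append(token)
    return " ".join(cleaned)
```

Applied to "um play play uh some jazz", the filter keeps only "play some jazz". Note that it would also collapse legitimate repetitions such as "that that", which is one reason learned approaches are preferred.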

End-to-end neural data-to-text generation

We discuss emerging methods for neural text generation which operate directly from source data.

H. Harkous, I. Groves, A. Saffari, Have your text and use it too! End-to-end neural data-to-text generation with semantic fidelity. arXiv 2020.

Fact verification

We discuss the ongoing challenge of verifying whether statements made by users are supported or unsupported by evidence.

J. Thorne, A. Vlachos, C. Christodoulopoulos, FEVER: a large scale dataset for fact extraction and verification. arXiv 2018.

Research Paper of the Week

A. Venkatesh, et al, On Evaluating and Comparing Open Domain Dialog Systems, NIPS Workshop Conversational AI, 2017.

Web Resources

Get started with the Alexa Skills Kit, learn how to give Alexa a new skill.

Readings

Be sure to read one or more of these discussions of dialogue system construction:

Chatbots and Dialogue Systems, chapter from Jurafsky & Martin's Speech and Language Processing.
A survey on human machine dialogue systems, Mallios, Bourbakis, 2016.
Introduction to Semantic Parsing, Kilian Evang, 2018.
L. Dong, M. Lapata, Language to Logical Form with Neural Attention, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016. (Last week's Research Paper)

Exercises

Implement answers to the problems described in the notebooks below. Save your completed notebooks into your personal Google Drive account.

To be announced...

Last modified: 22:45 11-Mar-2022.