Week 8 - Sequence to Sequence Models
In which we discuss networks that transform between sequences, with particular interest in machine translation.
Learning Objectives
By the end of the session the student will be able to:
- outline the history of machine translation
- describe metrics used for evaluation of machine translation systems
- explain the operation of sequence-to-sequence models
- discuss how limitations of the simple seq2seq model are overcome through the use of attention
- outline how limitations of the recurrent seq2seq model can be addressed using the transformer architecture
- describe the application of sequence-to-sequence models to machine translation
- use Keras to implement and train a sequence-to-sequence model
Outline
- History of machine translation
We briefly describe the long history of machine translation of human languages, contrasting rule-based, example-based, statistical and neural machine translation approaches. We also describe some commonly-used metrics for measuring the performance of translation systems, including the BLEU and METEOR scores (a short scoring example follows the reading list below).
- A history of machine translation from the Cold War to deep learning
- B. Vauquois, "A Survey of Formal Grammars and Algorithms for Recognition and Translation", 1968
- Systran software description.
- H. Somers, Example-based Machine Translation, J. Machine Translation, 1999.
- Overview of the MOSES statistical machine translation system.
- K. Vashee, Understanding MT Quality: BLEU scores.
- The METEOR automatic machine translation evaluation system.
- State of the art performance of machine translation.
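As a concrete illustration of the metrics above, here is a minimal sketch of sentence-level BLEU scoring using NLTK's sentence_bleu. The choice of NLTK, the example sentences, and the smoothing method are assumptions made for illustration; serious evaluations typically score a whole test corpus, for example with sacreBLEU.

```python
# Minimal sketch of sentence-level BLEU scoring with NLTK (assumes
# `pip install nltk`). Example sentences are invented for illustration.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the", "cat", "sat", "on", "the", "mat"]   # human translation (tokenised)
hypothesis = ["the", "cat", "is", "on", "the", "mat"]   # system output (tokenised)

# BLEU combines modified n-gram precisions (n = 1..4 by default) with a
# brevity penalty; smoothing avoids zero scores on short sentences.
smooth = SmoothingFunction().method1
score = sentence_bleu([reference], hypothesis, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```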
- Sequence to sequence models
We describe how deep learning can be applied to the problem of sequence-to-sequence conversion. We present the basic architecture, then discuss how the introduction of attention mechanisms and the transformer architecture enhances the performance and practicality of the approach (a minimal Keras sketch follows the reading list below).
- I. Sutskever, O. Vinyals, Q. Le, Sequence to sequence learning with neural networks, NIPS 2014.
- K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio, Learning phrase representations using RNN encoder-decoder for statistical machine translation, EMNLP 2014.
- D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, ICLR 2015.
- M. Luong, H. Pham, C. Manning, Effective approaches to attention-based neural machine translation, EMNLP 2015.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, NIPS 2017.
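To make the basic encoder-decoder architecture concrete, here is a minimal Keras sketch in the spirit of Sutskever et al. (2014). The vocabulary sizes and latent dimension are placeholder assumptions, and attention is omitted for brevity; the lab notebooks are the authoritative implementation for this course.

```python
# Minimal Keras encoder-decoder sketch for sequence-to-sequence learning.
# Vocabulary sizes and the latent dimension are placeholder assumptions.
from tensorflow.keras import layers, Model

num_encoder_tokens = 1000   # source vocabulary size (assumed)
num_decoder_tokens = 1200   # target vocabulary size (assumed)
latent_dim = 256            # LSTM state size (assumed)

# Encoder: consume the source sequence, keep only the final LSTM states.
encoder_inputs = layers.Input(shape=(None,), dtype="int32", name="source_tokens")
enc_emb = layers.Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: generate the target sequence, initialised with the encoder states.
decoder_inputs = layers.Input(shape=(None,), dtype="int32", name="target_tokens")
dec_emb = layers.Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
decoder_outputs, _, _ = layers.LSTM(
    latent_dim, return_sequences=True, return_state=True
)(dec_emb, initial_state=[state_h, state_c])
predictions = layers.Dense(num_decoder_tokens, activation="softmax")(decoder_outputs)

# Trained with teacher forcing: decoder inputs are the target shifted right.
model = Model([encoder_inputs, decoder_inputs], predictions)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```

Note that at inference time the decoder is instead run one step at a time, feeding each predicted token back in as the next input; the single fixed-size state vector passed from encoder to decoder is exactly the bottleneck that attention was introduced to relieve.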
- Multilingual neural machine translation
We discuss the multilingual translation problem, in which there is a large number of source and target languages. We describe one approach that claims to be able to translate between pairs of languages for which no paired corpus is available (the token-prepending trick it relies on is sketched after the reading list below).
- M. Johnson, M. Schuster, Q. Le, M. Krikun, Y. Wu, Z. Chen, N. Thorat, F. Viegas, M. Wattenberg, G. Corrado, M. Hughes, J. Dean, Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, arXiv, 2017.
- N. Arivazhagan, A. Bapna, O. Firat, et al., Massively multilingual neural machine translation in the wild: findings and challenges, arXiv, 2019.
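The key device in Johnson et al.'s system is simple: a single shared model is told the desired target language by an artificial token prepended to each source sentence. The sketch below shows that data-preparation step (the token spelling follows the paper; the sentences are invented for illustration):

```python
# Data-preparation trick behind Google's multilingual NMT (Johnson et al.):
# one shared model, with the target language signalled by a prepended token.
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend a target-language token, e.g. '<2es>' for Spanish."""
    return f"<2{target_lang}> {source_sentence}"

# Training pairs might cover English->Spanish and Portuguese->English ...
print(add_target_token("How are you?", "es"))      # <2es> How are you?
print(add_target_token("Como você está?", "en"))   # <2en> Como você está?

# ... yet the same model can be asked for zero-shot Portuguese->Spanish,
# a direction for which it saw no parallel corpus during training.
print(add_target_token("Como você está?", "es"))   # <2es> Como você está?
```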
- Other Applications of seq2seq Models
We briefly introduce some other speech and language processing problems that have been addressed using the seq2seq approach.
Research Paper of the Week
- L. Dong, M. Lapata, Language to Logical Form with Neural Attention, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.
Web Resources
- The illustrated seq2seq with attention. A visual guide to how attention works in seq2seq models.
- The illustrated transformer. A visual guide to how transformers work.
- How to Make a Language Translator. A lively video introduction.
- OpenNMT: An open source neural machine translation system.
Readings
Be sure to read one or more of these discussions of neural machine translation:
- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
- Creating A Language Translation Model Using Sequence To Sequence Learning Approach
- The Shallowness of Google Translate by Douglas Hofstadter.
Exercises
Implement answers to the problems described in the notebooks below. Save your completed notebooks into your personal Google Drive account.
Last modified: 21:31 13-Mar-2022.