Philipp Koehn (Part 2) - How Neural Networks have Transformed Machine Translation
Episode 214th October 2020 • Data Science Conversations • Damien Deighan and Philipp Diesinger


This is Part 2 of our conversation with Professor Philipp Koehn of Johns Hopkins University. Professor Koehn is one of the world’s leading experts in the field of Machine Translation & NLP.

In this episode we delve into commercial applications of machine translation, the open source tools available, and what to expect from the field in the future.

Episode Summary:


  • Typical datasets used for training models
  • The role of infrastructure and technology in Machine Translation
  • How academic research in Machine Translation has made its way into industry applications
  • Overview of the open source tools available for Machine Translation
  • The future of Machine Translation and whether it can pass a Turing test




Philipp Koehn's latest book - Neural Machine Translation - Amazon link:


Omniscien Technologies - Leading Enterprise Provider of machine translation services:


Open Source tools:


- Fairseq

- Marian

- OpenNMT

- Sockeye


Translated texts (parallel data) for training:



- ParaCrawl


Two papers mentioned in connection with the excessive use of computing power to train NLP models:


- GPT-3

- RoBERTa