Austin Matthews

5850 Centre Ave #404 Pittsburgh, PA 15206


Carnegie Mellon University, Language Technologies Institute – Pittsburgh, PA

Ph.D. in Language Technologies, November 2019

M.S. in Language Technologies, August 2013

University of North Carolina at Chapel Hill – Chapel Hill, NC

B.S. in Computer Science, with distinction, May 2011

B.S. in Mathematics, with distinction, May 2011

Minor in Linguistics, May 2011

Awarded Carolina Scholar 2007-2011

North Carolina School of Science and Mathematics – Durham, NC

High School Diploma, June 2007


My Ph.D. thesis “Linguistic Knowledge for Neural Language Generation and Machine Translation”, published November 2019

First author on “Comparing Top-down and Bottom-up Neural Generative Dependency Models”, presented at CoNLL 2019

Co-inventor of “Optimized SMT System with Rapid Adaptation Capability”

First author on “Incorporating Morphological Knowledge in Open-Vocabulary Neural Language Models”, presented at NAACL 2018

Co-author on “XNMT: The eXtensible Neural Machine Translation Toolkit”

Co-author on “DyNet: The Dynamic Neural Network Toolkit”

First author on “Synthesizing German Compound Words for Machine Translation”, presented at ACL 2016

Co-author of “Transition-Based Dependency Parsing with Stack Long Short-Term Memory”, presented at ACL 2015

First author and organizer of “The CMU Machine Translation Systems at WMT 2014”, presented at WMT 2014

First author of “Tree Transduction Tools for cdec”, presented at Machine Translation Workshop 2014

First author on “A Bayesian Model for Sense Induction through Translation”, submitted to TACL in December 2013

Co-author of “The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References”, presented at WMT 2013

Co-author of “Phonotactic Reconstruction of Encrypted VoIP Conversations: Hookt on fon-iks”, presented at the 2011 IEEE Symposium on Security and Privacy

  – Awarded Best Paper at the 2011 IEEE Symposium on Security and Privacy


Professional Experience

Unbabel, November 2019 - Present

Research scientist improving core machine translation technology

Supervised by Alon Lavie

Research Experience

CMU MT Group Research Assistant, May 2012 - November 2019

Incorporation of syntactic information into neural machine translation

Synthetic generation of morphological variants and compounds

Bayesian methods for sense induction and word alignment

Dialect adaptation in English-Arabic syntax-based machine translation

Tree-to-tree syntax-based architectures for machine translation

Optimization and parallelization of tree-to-tree translation architectures

Advised by Chris Dyer and Graham Neubig

Research Internship

DeepMind, May 2017 - August 2017

Neural models of quantitative reasoning

Grounded language acquisition in simulated 3-D environments

Advised by Stephen Clark

Research Internship

University of Helsinki, October 2016 - December 2016

Incorporating morphological information into attentional models of machine translation

Information transfer between neural MT systems for related language pairs

Advised by Jörg Tiedemann

Research Internship

Nara Institute of Science and Technology, August 2015 - October 2015

Attentional neural models for tree-to-string translation targeting diverse language pairs

Advised by Graham Neubig


Jelinek Summer Workshop on Speech and Language Technology, June 2015 - August 2015

Attentional neural models for continuous wide-band machine translation

Source-conditioned neural n-gram-based language models

Team led by Trevor Cohn

Research Internship

Microsoft Research, May 2014 - August 2014

Speech-to-speech machine translation pipelines for Skype Translator

Capturing long-range dependencies with neural language models

Advised by Jonathan Clark

Professional Experience

Safaba Translation Solutions Developer, September 2011 - June 2015

Developed translation systems for over a dozen language pairs for commercial clients

Developed new methods for correctly mapping structured text format annotations from the source to the target language, significantly improving formatting fidelity in MT output

Developed software to integrate Safaba's translation systems into existing commercial workflow systems

Professional Experience

EvoApp Senior Developer, May 2011 - August 2011

Developed algorithms to find keywords describing large data sets

Developed software to analyze emotion in microblog messages

Developed a query-based retrieval system for large bodies of short text

Research Experience

UNC Computer Science Research Assistant, June 2010 - May 2011

Developed algorithms for reconstruction of natural language data in highly noisy environments

Researched linguistically motivated methods to evaluate phonetic errors in ASR systems

Advised by Fabian Monrose

Professional Experience

EvoApp Senior Developer, January 2009 - August 2009

Developed web-based customer relationship management software in Silverlight and C#

Study Abroad Experience

Keio University's International Program, September 2008 - August 2009

Studied Japanese language and culture at Keio University in Tokyo, Japan


International Business Machines (IBM), May 2008 - September 2008

Developed a web-based drag-and-drop WYSIWYG form builder UI in JavaScript to accompany survey software

Worked on an internal HTML form rendering engine in Java


Blue Lizard Technologies, June 2007 - August 2007

Developed an RSS reader in C# .NET