RNN-T ASR Systems and Enabling Contextualization for RNN-T ASR Systems

Speaker: Mahaveer Jain (Facebook)

Date and Time: Friday, June 18 at 10am CT


In this talk, Mahaveer will first discuss the general theory behind building Recurrent Neural Network Transducer (RNN-T) Automatic Speech Recognition (ASR) systems. RNN-T-based end-to-end ASR systems have become a popular choice in industry for building compact, accurate ASR systems that can be deployed on-device. Next, Mahaveer will discuss methods to enable contextualization for RNN-T ASR systems. Contextualization allows an ASR system to use utterance-specific context.


Mahaveer Jain is a Software Engineer at Facebook, Inc. Previously, he was a graduate research assistant at the Language Technologies Institute (LTI) at Carnegie Mellon, where he completed his master’s in language technologies. Mahaveer has worked extensively on building production-ready RNN-T ASR systems at Facebook. His current focus is enabling contextualization for end-to-end ASR systems such as RNN-T. His work has been published at leading ASR conferences such as Interspeech and ICASSP. Mahaveer enjoys teaching and has given invited tutorial talks on RNN-T ASR systems in deep learning classes at Georgia Tech and Carnegie Mellon.