Zihang Dai


About me

I'm a final-year PhD student at the Language Technologies Institute (LTI), CMU.


My research interests include deep learning for natural language processing and deep generative models.


Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing.
Zihang Dai*, Guokun Lai*, Yiming Yang, Quoc V. Le.
(*: equal contribution)
Preprint 2020. [arXiv] [code]

XLNet: Generalized Autoregressive Pretraining for Language Understanding.
Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
(*: equal contribution)
NeurIPS 2019 (oral). [arXiv] [code]

Re-examination of the Role of Latent Variables in Sequence Modeling.
Guokun Lai*, Zihang Dai*, Yiming Yang, Shinjae Yoo.
(*: equal contribution)
NeurIPS 2019. [arXiv] [code]

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context.
Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
(*: equal contribution)
ACL 2019 (oral). [arXiv] [code]

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction.
Zihang Dai*, Qizhe Xie*, Eduard Hovy.
(*: equal contribution)
ACL 2018 (long paper). [arXiv] [code]

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model.
Zhilin Yang*, Zihang Dai*, Ruslan Salakhutdinov, William W. Cohen.
(*: equal contribution)
ICLR 2018 (oral). [arXiv] [OpenReview] [code]

Good Semi-supervised Learning that Requires a Bad GAN.
Zihang Dai*, Zhilin Yang*, Fan Yang, William W. Cohen, Ruslan Salakhutdinov.
(*: equal contribution)
NIPS 2017. [arXiv] [code]

Controllable Invariance through Adversarial Feature Learning.
Qizhe Xie, Zihang Dai, Yulun Du, Eduard Hovy, Graham Neubig.
NIPS 2017. [arXiv]

An Interpretable Knowledge Transfer Model for Knowledge Base Completion.
Qizhe Xie, Xuezhe Ma, Zihang Dai, and Eduard Hovy.
ACL 2017. [arXiv]

Calibrating Energy-based Generative Adversarial Networks.
Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville.
ICLR 2017. [arXiv] [OpenReview] [code]

CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases.
Zihang Dai, Lei Li, Wei Xu.
ACL 2016 (long paper). [arXiv] [code]