Zihang Dai

GitHub

About me

I'm a 4th-year PhD student at the Language Technologies Institute (LTI), Carnegie Mellon University.

Research

I'm interested in Deep Learning and Language Understanding.

Publications

XLNet: Generalized Autoregressive Pretraining for Language Understanding.
Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
(*: equal contribution)
NeurIPS 2019 (oral). [arXiv] [code]

Re-examination of the Role of Latent Variables in Sequence Modeling.
Guokun Lai*, Zihang Dai*, Yiming Yang, Shinjae Yoo.
(*: equal contribution)
NeurIPS 2019. [arXiv] [code]

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context.
Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
(*: equal contribution)
ACL 2019 (oral). [arXiv] [code]

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction.
Zihang Dai*, Qizhe Xie*, Eduard Hovy.
(*: equal contribution)
ACL 2018 (long paper). [arXiv] [code]

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model.
Zhilin Yang*, Zihang Dai*, Ruslan Salakhutdinov, William W. Cohen.
(*: equal contribution)
ICLR 2018 (oral). [arXiv] [OpenReview] [code]

Good Semi-supervised Learning that Requires a Bad GAN.
Zihang Dai*, Zhilin Yang*, Fan Yang, William W. Cohen, Ruslan Salakhutdinov.
(*: equal contribution)
NIPS 2017. [arXiv] [code]

Controllable Invariance through Adversarial Feature Learning.
Qizhe Xie, Zihang Dai, Yulun Du, Eduard Hovy, Graham Neubig.
NIPS 2017. [arXiv]

An Interpretable Knowledge Transfer Model for Knowledge Base Completion.
Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy.
ACL 2017. [arXiv]

Calibrating Energy-based Generative Adversarial Networks.
Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville.
ICLR 2017. [arXiv] [OpenReview] [code]

CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases.
Zihang Dai, Lei Li, Wei Xu.
ACL 2016 (long paper). [arXiv] [code]