1. Semi-supervised classification with graph convolutional networks;kipf;ArXiv Preprint,2016
2. Attention is all you need;vaswani;Advances in neural information processing systems,2017
3. Adam: A method for stochastic optimization;kingma;ArXiv Preprint,2014
4. BERT: Pre-training of deep bidirectional transformers for language understanding;devlin;ArXiv Preprint,2018
5. The Penn Chinese TreeBank: Phrase structure annotation of a large corpus