CEG: A joint model for causal commonsense events enhanced story ending generation
Author:
Zhang Yushi, Yang Yan, Gu Ming, Gao Feng, Chen Chengcai, He Liang
Abstract
With the success of pre-trained language models, the performance of story ending generation has improved dramatically, yet the task remains challenging due to the lack of commonsense reasoning ability. Most previous works focus on using commonsense knowledge to enhance the implicit correlations between words but ignore the hidden causality between sentences or events. In this paper, we propose a Causal commonsense Enhanced joint model for story ending Generation (CEG), which incorporates knowledge of causal commonsense events to generate a reasonable story ending. Specifically, we first develop a commonsense events inference model trained on GLUCOSE, which converts static knowledge into a dynamic generation model capable of discovering unseen knowledge. It uses prompts to produce various commonsense events behind the stories as pseudo-labels for the dataset. Then, we propose a joint model for the causal events inference task and the story ending generation task that injects inference knowledge into generation; it consists of a shared encoder, an inference decoder, and a generation decoder. In the causal events inference task, the shared encoder and the inference decoder reason about the causal events behind each sentence of the story context, helping the model better understand the story and providing long-distance dependencies for story ending generation. In story ending generation, we combine the hidden states of the causal events with the story context and generate the ending with the shared encoder and the generation decoder. We train the model jointly on the two tasks so that the generation decoder produces endings that better match the clues. Experimental results on the ROCStories dataset show that our model outperforms previous works, demonstrating the effectiveness of the joint model and the generated causal events.
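The joint training scheme described in the abstract can be sketched in miniature: a shared encoder feeds two heads, one scoring causal-event tokens (inference task) and one scoring ending tokens from context states fused with the inference states, with the two losses summed. This is a hypothetical toy illustration, not the paper's implementation; all names, sizes, and the fusion-by-concatenation step are assumptions.

```python
import numpy as np

# Toy sketch of a two-task objective with a shared encoder (illustrative
# assumptions throughout; not the actual CEG architecture).
rng = np.random.default_rng(0)
D, V = 8, 16                                   # hidden size, toy vocabulary

enc_W = rng.standard_normal((D, D)) * 0.1      # shared encoder weights
inf_W = rng.standard_normal((D, V)) * 0.1      # inference-decoder head
gen_W = rng.standard_normal((2 * D, V)) * 0.1  # generation-decoder head

def encode(x):
    """Shared representation used by both tasks."""
    return np.tanh(x @ enc_W)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def nll(probs, targets):
    """Mean negative log-likelihood of the target token per position."""
    return -np.log(probs[np.arange(len(targets)), targets]).mean()

def joint_loss(context, event_targets, ending_targets, lam=1.0):
    h = encode(context)                        # shared encoder states
    # Task 1: infer the causal event behind each context sentence.
    l_inf = nll(softmax(h @ inf_W), event_targets)
    # Task 2: generate the ending from context states fused (here: simply
    # concatenated) with the inference hidden states.
    fused = np.concatenate([h, np.tanh(h @ enc_W)], axis=-1)
    l_gen = nll(softmax(fused @ gen_W), ending_targets)
    return l_inf + lam * l_gen                 # jointly optimized total

context = rng.standard_normal((4, D))          # 4 context-sentence vectors
loss = joint_loss(context, [1, 2, 3, 4], [5, 6, 7, 8])
```

Because both heads backpropagate through `encode`, minimizing the summed loss pushes the shared representation to serve inference and generation at once, which is the intuition behind the joint training the abstract describes.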
Publisher
Public Library of Science (PLoS)
Subject
Multidisciplinary