Abstract
A basic requirement of text summarization is that the facts in the summary must be consistent with those in the original text. At present, most summarization models introduce factual information in the decoding stage. As the text grows longer, their ability to process factual information weakens, which leads to factual consistency errors. From the perspective of data fusion at the input stage, this paper proposes LTSum-FTL (Long Text Summarization model with Fact Triples Labeling) to improve factual consistency and help readers obtain more accurate information. First, fact triples are used to represent the factual information of the original text. Then, the three attributes of each triple are annotated, and the annotation information is vectorized and fused into the input vector. Finally, an improved masking mechanism masks or replaces the triple attributes in the input to strengthen the model's summarization ability. The experimental results show that the proposed model effectively reduces the probability of factual consistency errors: it scores at least 2.4%, 1.1%, and 0.3 higher than the comparison models on the Pre1, Pre2, and FactCC-Score metrics, respectively.
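As an illustration only, the three steps described above (triple labeling, fusion of the label vectors into the input, and masking of triple attributes) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the LTSum-FTL implementation: the role tags (`SUB`, `REL`, `OBJ`, `O`), the function names, exact phrase matching for labeling, element-wise addition as the fusion operation, and the `[MASK]` token are all hypothetical choices made for this example. Only masking is shown; the replacement variant is omitted.

```python
import random

# Hypothetical role tags: O = outside any triple, SUB/REL/OBJ = triple attributes.
ROLES = ("O", "SUB", "REL", "OBJ")


def label_tokens(tokens, triples):
    """Tag each token with the triple attribute it belongs to (assumed exact match)."""
    labels = ["O"] * len(tokens)
    for subj, rel, obj in triples:
        for role, phrase in (("SUB", subj), ("REL", rel), ("OBJ", obj)):
            words = phrase.split()
            n = len(words)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == words:
                    for j in range(i, i + n):
                        # Keep the first tag if two attributes overlap.
                        if labels[j] == "O":
                            labels[j] = role
    return labels


def fuse(token_vecs, labels, role_vecs):
    """Fuse label information into the input by adding a role vector to each token vector."""
    return [
        [t + r for t, r in zip(vec, role_vecs[lab])]
        for vec, lab in zip(token_vecs, labels)
    ]


def mask_triple_tokens(tokens, labels, prob, rng, mask_token="[MASK]"):
    """Mask only tokens that belong to a triple attribute, each with probability `prob`."""
    return [
        mask_token if lab != "O" and rng.random() < prob else tok
        for tok, lab in zip(tokens, labels)
    ]


if __name__ == "__main__":
    tokens = "the company acquired a startup in 2020".split()
    triples = [("the company", "acquired", "a startup")]  # toy fact triple
    labels = label_tokens(tokens, triples)
    print(list(zip(tokens, labels)))
    print(mask_triple_tokens(tokens, labels, prob=0.5, rng=random.Random(0)))
```

In a neural model the role vectors would typically be learned embeddings added to the token and position embeddings; the addition here stands in for that step.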