Affiliation:
1. School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science & Technology, China
2. School of Software, Nanjing University of Information Science & Technology, China
3. School of Software, Nanjing University of Information Science & Technology, China and School of Computer Engineering, Jiangsu Ocean University, China
4. Texas Tech University, USA
5. Auburn University, USA
6. King Saud University, Kingdom of Saudi Arabia
Abstract
Online shopping has become a crucial channel of daily consumption, and user-generated, or crowdsourced, product comments offer a broad range of feedback on e-commerce products. Integrating the critical opinions or major attitudes in these crowdsourced comments can therefore provide valuable input for marketing-strategy adjustment and product-quality monitoring. Unfortunately, the scarcity of annotated ground truth for integrated comments, i.e., the limited gold integration references, makes conventional supervised-learning-based comment integration infeasible. To resolve this problem, inspired by the principle of transfer learning, we propose a three-stage transferable and generative crowdsourced comment integration framework (TTGCIF) based on zero- and few-shot learning with the support of domain distribution alignment. The framework generates abstractive integrated comments in the target domain via an enhanced neural text generation model, referring to the integration resources available in related source domains so as to avoid exhaustive annotation effort in the target domain. Specifically, at the first stage, to enhance domain transferability, representations of the crowdsourced comments are aligned between the source and target domains by minimizing the domain distribution discrepancy in a kernel space.
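The article's implementation is not shown here; purely as an illustrative sketch, the kernel-space discrepancy minimization described above is commonly instantiated as a Maximum Mean Discrepancy (MMD) penalty with a Gaussian kernel. The PyTorch snippet below reflects that general technique under stated assumptions, not TTGCIF's actual objective; `src_feat` and `tgt_feat` are hypothetical encoder outputs.

```python
import torch

def gaussian_mmd(src_feat: torch.Tensor, tgt_feat: torch.Tensor,
                 bandwidth: float = 1.0) -> torch.Tensor:
    """Maximum Mean Discrepancy between two batches of comment
    representations, computed with a Gaussian (RBF) kernel."""
    def rbf(a, b):
        # Pairwise squared Euclidean distances -> Gaussian kernel matrix.
        dist = torch.cdist(a, b) ** 2
        return torch.exp(-dist / (2.0 * bandwidth ** 2))

    # MMD^2 = E[k(s,s')] - 2 E[k(s,t)] + E[k(t,t')]
    return (rbf(src_feat, src_feat).mean()
            - 2.0 * rbf(src_feat, tgt_feat).mean()
            + rbf(tgt_feat, tgt_feat).mean())

# Usage: add the discrepancy to the generation loss so that source- and
# target-domain comment representations are pulled into alignment.
src_feat = torch.randn(32, 768)   # hypothetical encoder outputs, source domain
tgt_feat = torch.randn(32, 768)   # hypothetical encoder outputs, target domain
loss_align = gaussian_mmd(src_feat, tgt_feat)
```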
At the second stage, a zero-shot comment integration mechanism handles the case in which none of the gold integration references are available in the target domain. Taking sample-level semantic prototypes as input, the enhanced neural text generation model in TTGCIF is trained to learn semantic associations among data from different domains via semantic prototype transduction, so that "unlabeled" crowdsourced comments in the target domain can be associated with existing integration references in related source domains.
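The abstract describes semantic prototype transduction only at a high level. One plausible, hypothetical reading is sketched below: a sample-level prototype (here simply a mean-pooled embedding, an assumed choice) lets an unlabeled target-domain sample borrow the gold integration reference of its most similar source-domain sample. All function and variable names are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def sample_prototype(token_embeddings: torch.Tensor) -> torch.Tensor:
    """Collapse a (num_tokens, dim) comment-sample embedding into a
    single sample-level semantic prototype by mean pooling (assumption)."""
    return token_embeddings.mean(dim=0)

def transduce_reference(tgt_proto: torch.Tensor,
                        src_protos: torch.Tensor,
                        src_references: list) -> str:
    """Associate an unlabeled target sample with the integration
    reference of its most similar source-domain sample."""
    sims = F.cosine_similarity(tgt_proto.unsqueeze(0), src_protos, dim=1)
    return src_references[int(sims.argmax())]

# Usage with toy tensors: three labeled source samples, one target sample.
src_protos = torch.randn(3, 768)
src_references = ["ref A", "ref B", "ref C"]
tgt_proto = sample_prototype(torch.randn(20, 768))
pseudo_reference = transduce_reference(tgt_proto, src_protos, src_references)
```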
At the third stage, starting from the parameters trained at the second stage, a few-shot fast domain adaptation mechanism seeks the most promising parameters along gradient directions constrained by instances from multiple source domains. In this way, the parameters of TTGCIF remain sensitive to any alteration of the training data, ensuring that even if only a few annotated resources in the target domain are available for fine-tuning, TTGCIF can still react promptly to achieve effective target-domain adaptation.
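Seeking promising parameters along gradient directions constrained by instances across multiple source domains reads like a MAML-family meta-learning update. As a hedged sketch only, the snippet below implements a first-order (Reptile-style) variant in PyTorch; TTGCIF's actual update rule may differ, and all names here are hypothetical.

```python
import copy
import torch

def reptile_meta_step(model, domain_batches, loss_fn,
                      inner_lr=1e-3, meta_lr=1e-3):
    """One first-order meta-update (Reptile-style, an assumption): adapt
    a copy of the model on each source-domain batch, then move the shared
    parameters toward the average of the adapted parameters, keeping them
    sensitive to small amounts of new training data."""
    meta_state = {n: p.detach().clone() for n, p in model.named_parameters()}
    deltas = {n: torch.zeros_like(v) for n, v in meta_state.items()}

    for inputs, targets in domain_batches:        # one batch per source domain
        clone = copy.deepcopy(model)              # inner-loop adaptation copy
        opt = torch.optim.SGD(clone.parameters(), lr=inner_lr)
        opt.zero_grad()
        loss_fn(clone(inputs), targets).backward()
        opt.step()
        with torch.no_grad():                     # accumulate parameter shift
            for n, p in clone.named_parameters():
                deltas[n] += p - meta_state[n]

    with torch.no_grad():                         # meta-update on shared model
        for n, p in model.named_parameters():
            p += meta_lr * deltas[n] / len(domain_batches)
```

Because the meta-update averages adaptation directions over several source domains, a handful of target-domain examples then suffices to fine-tune the shared parameters quickly, which matches the few-shot behavior the abstract describes.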
According to the experimental results, TTGCIF achieves the best transferable product comment integration performance in the target domain, with fast and stable domain adaptation using no more than 10% of the annotated resources in the target domain. More importantly, even when TTGCIF has not been fine-tuned on the target domain, the integrated comments it generates there, by referring to the integration resources available in related source domains, are still superior to those generated by models already fine-tuned on the target domain.
Funder
National Science Foundation of China
Jiangsu Natural Science Foundation
National Key Research and Development Program
Research Center of the Female Scientific and Medical Colleges, Deanship of Scientific Research, King Saud University, Saudi Arabia
Publisher
Association for Computing Machinery (ACM)