Abstract
This paper investigates the use of LangChain, a language-model-powered framework, to automate data analysis in the SaaS sector. To automate data analysis, a method based on LangChain agents was proposed and implemented in software, and data analysis indicators were studied using SaaS sales data as a case study. Agents were set up for exploratory, univariate, and bivariate analyses, as well as hypothesis testing, turning extensive data into natural-language answers. Experiments with GPT-3.5 LLM agents on the Amazon AWS SaaS Sales Dataset confirmed the effectiveness of the proposed method and demonstrated the capability of LangChain agents to automate data analysis processes in the SaaS industry. The experiments also identified deficiencies in handling complex queries and producing comprehensive reports, and these need to be addressed. Future research prospects include improving the method for complex queries, providing more detailed information about companies and business models, creating report templates, training the model to solve complex questions, and expanding the application to more complex data, a larger number of data questions, and pre-trained LLMs.
Publisher
National Academy of Sciences of Ukraine (Co. LTD Ukrinformnauka) (Publications)