Affiliation:
1. School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India
2. Department of CSE, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
3. Graphic Era Hill University, Dehradun, Uttarakhand, India
Abstract
Recent research suggests that by 2023, data production will exceed 300 exabytes per month, a figure more than 60 times that of human verbal communication. This exponential growth underscores the need for platforms to adapt in areas such as data analysis and storage. Efficient data organization is crucial given the growing scarcity of time and space resources. While manual sorting may suffice for the small datasets of smaller organizations, large corporations dealing with millions or billions of documents require advanced tools to streamline storage, sorting, and analysis. In response to this need, this research introduces a novel architecture called Slick, designed to enhance sorting, filtering, organization, and analysis capabilities for any storage service. The proposed architecture incorporates two innovative techniques, Degree of Importance (DOI) and amortized clustering, along with established natural language processing methods such as Topic Modelling, Summarization, and Tonal Analysis. Additionally, a new methodology for keyword extraction and document grouping is presented, resulting in significantly improved response times. Slick offers a searchable platform where users can use succinct keywords, lengthy text passages, or complete documents to find the information they seek. Experimental findings demonstrate a nearly 46 percent reduction in average response time compared to existing methods in the literature.