Affiliation:
1. Department of Industrial Engineering and Engineering Management, National Tsing Hua University, Hsinchu 300044, Taiwan
2. Center of Mathematics, Computing, and Cognition, Federal University of ABC, Santo André 09280-560, São Paulo, Brazil
Abstract
Data structures such as sets, lists, and arrays are fundamental in mathematics and computer science and play a crucial role in numerous real-life applications. These structures represent a variety of entities, including solutions, conditions, and objectives. In scenarios involving large datasets, eliminating duplicate elements is essential to reduce complexity and enhance performance. This paper introduces a novel algorithm that uses logarithmic prime numbers to efficiently sort data structures and remove duplicates. We establish the algorithm's correctness mathematically and provide a thorough analysis of its time complexity. To demonstrate its practicality and effectiveness, we compare our method with existing algorithms, highlighting its superior speed and accuracy. An extensive experimental analysis across one thousand random test problems shows that our approach significantly outperforms two alternative techniques from the literature. We discuss potential applications of the proposed algorithm in various domains, including computer science, engineering, and data management, and illustrate its adaptability through two practical examples in which our algorithm solves the problem more than 3×10⁴ and 7×10⁴ times faster than the existing algorithms in the literature. The results of these examples demonstrate that the superiority of our algorithm becomes increasingly pronounced with larger problem sizes.
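The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a logarithmic-prime key could drive sorting and duplicate removal, not the authors' method. It assumes the inputs are equal-length vectors of non-negative integers and assigns each vector the scalar key sum(x_i * log(p_i)), where p_i is the i-th prime; by unique factorization, distinct vectors have distinct exact keys. The function names `first_primes` and `dedupe_by_log_prime_key` are hypothetical, and because floating-point keys can collide, the sketch still compares the vectors themselves before discarding anything.

```python
import math


def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no known prime up to sqrt(candidate) divides it
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def dedupe_by_log_prime_key(vectors):
    """Sort vectors by a log-prime key and drop exact duplicates.

    Assumption: all vectors have the same length and hold non-negative integers.
    """
    if not vectors:
        return []
    dim = len(vectors[0])
    log_primes = [math.log(p) for p in first_primes(dim)]
    # Pair each vector with its scalar key; ties on the key fall back to the tuple,
    # so identical vectors always end up adjacent after sorting.
    keyed = sorted(
        (sum(x * w for x, w in zip(v, log_primes)), tuple(v)) for v in vectors
    )
    unique = []
    for _key, vec in keyed:
        # Compare the vectors, not the float keys, to stay safe against rounding.
        if not unique or unique[-1] != vec:
            unique.append(vec)
    return unique


print(dedupe_by_log_prime_key([[1, 0, 2], [0, 1, 1], [1, 0, 2], [0, 0, 0]]))
```

The final vector comparison costs extra time and is a conservative choice made for this sketch; whether the published algorithm needs it depends on details not given in the abstract.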
Funder
NTHU
Ministry of Science and Technology
FAPESP