Abstract
In the context of Web 3.0, classification is a data mining task that assigns a label or class to each document. This chapter explores the two main types of automatic classification: categorization and clustering. It provides an in-depth study of supervised and unsupervised learning algorithms from the literature, including KNN, k-means, and hierarchical clustering (both agglomerative and divisive). It also examines the practical applications and challenges of the classification task in light of the evolving Web 3.0 landscape. In decentralized networks and distributed information repositories, document classification is crucial for helping users access and navigate these complex digital environments. Classification algorithms face new challenges there, such as managing heterogeneous data, taking into account metadata on document provenance and integrity, and adapting to dynamic changes in content.