Abstract
ABSTRACTArchaeologists tend to produce slow data that is contextually rich but often difficult to generalize. An example is the analysis of lithic microdebitage, or knapping debris, that is smaller than 6.3 mm (0.25 in.). So far, scholars have relied on manual approaches that are prone to intra- and interobserver errors. In the following, we present a machine learning–based alternative together with experimental archaeology and dynamic image analysis. We use a dynamic image particle analyzer to measure each particle in experimentally produced lithic microdebitage (N = 5,299) as well as an archaeological soil sample (N = 73,313). We have developed four machine learning models based on Naïve Bayes, glmnet (generalized linear regression), random forest, and XGBoost (“Extreme Gradient Boost[ing]”) algorithms. Hyperparameter tuning optimized each model. A random forest model performed best with a sensitivity of 83.5%. It misclassified only 28 or 0.9% of lithic microdebitage. XGBoost models reached a sensitivity of 67.3%, whereas Naïve Bayes and glmnet models stayed below 50%. Except for glmnet models, transparency proved to be the most critical variable to distinguish microdebitage. Our approach objectifies and standardizes microdebitage analysis. Machine learning allows studying much larger sample sizes. Algorithms differ, though, and a random forest model offers the best performance so far.
Publisher
Cambridge University Press (CUP)
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献