The AI community building the future? A quantitative analysis of development activity on Hugging Face Hub

Published: 2024-06-24
ISSN: 2432-2717
Container-title: Journal of Computational Social Science
Language: en
Short-container-title: J Comput Soc Sc

Author:

Osborne Cailean, Ding Jennifer, Kirk Hannah Rose

Abstract

Open model developers have emerged as key actors in the political economy of artificial intelligence (AI), but we still have a limited understanding of collaborative practices in the open AI ecosystem. This paper responds to this gap with a three-part quantitative analysis of development activity on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating models. First, various types of activity across 348,181 model, 65,761 dataset, and 156,642 space repositories exhibit right-skewed distributions. Activity is extremely imbalanced between repositories; for example, over 70% of models have 0 downloads, while 1% account for 99% of downloads. Furthermore, licenses matter: there are statistically significant differences in collaboration patterns in model repositories with permissive, restrictive, and no licenses. Second, we analyse a snapshot of the social network structure of collaboration in model repositories, finding that the community has a core-periphery structure, with a core of prolific developers and a majority of isolate developers (89%). Upon removing these isolates from the network, collaboration is characterised by high reciprocity regardless of developers’ network positions. Third, we examine model adoption through the lens of model usage in spaces, finding that a minority of models, developed by a handful of companies, are widely used on the HF Hub. Overall, the findings show that various types of activity across the HF Hub are characterised by Pareto distributions, congruent with open source software development patterns on platforms like GitHub. We conclude with recommendations for researchers and practitioners to advance our understanding of open AI development.

Funder

Economic and Social Research Council

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s42001-024-00300-8.pdf

Cited by 1 article.

1. Correction to: The AI community building the future? A quantitative analysis of development activity on Hugging Face Hub. Journal of Computational Social Science, 2024-07-23.