Abstract
Open model developers have emerged as key actors in the political economy of artificial intelligence (AI), but we still have a limited understanding of collaborative practices in the open AI ecosystem. This paper responds to this gap with a three-part quantitative analysis of development activity on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating models. First, various types of activity across 348,181 model, 65,761 dataset, and 156,642 space repositories exhibit right-skewed distributions. Activity is extremely imbalanced between repositories; for example, over 70% of models have 0 downloads, while 1% account for 99% of downloads. Furthermore, licenses matter: there are statistically significant differences in collaboration patterns between model repositories with permissive, restrictive, and no licenses. Second, we analyse a snapshot of the social network structure of collaboration in model repositories, finding that the community has a core-periphery structure, with a core of prolific developers and a majority of isolate developers (89%). Once these isolates are removed from the network, collaboration is characterised by high reciprocity regardless of developers’ network positions. Third, we examine model adoption through the lens of model usage in spaces, finding that a minority of models, developed by a handful of companies, are widely used on the HF Hub. Overall, the findings show that various types of activity across the HF Hub are characterised by Pareto distributions, congruent with open source software development patterns on platforms like GitHub. We conclude with recommendations for researchers and practitioners to advance our understanding of open AI development.
Funder
Economic and Social Research Council
Publisher
Springer Science and Business Media LLC