Author:
Ramana Reddy Bussu Venkata
Abstract
Databricks, a unified analytics platform, has emerged at the forefront of the evolution of big data analytics, offering scalable cloud-based solutions for data science and machine learning (ML) applications. This article explores how Databricks enables data-driven decision-making through advanced analytics techniques. From its roots in Apache Spark to its current status as a leading platform for data engineering, data science, and machine learning, Databricks has continuously evolved to meet the growing demands of modern enterprises. The article examines the progression of data science and ML applications in Databricks, tracing their development from initial implementation to current state-of-the-art techniques and their integration within the platform. It first describes the inception of Databricks, focusing on its architecture and the early adoption of Apache Spark for big data processing, and explores how the platform's native support for multiple programming languages and its unified analytics engine facilitated the early stages of intelligent application development. It then discusses the implications of these advancements for the future of data science and artificial intelligence (AI) within Databricks and the broader analytics ecosystem, highlighting the potential for further integration of AI and ML technologies, such as automated machine learning (AutoML) and real-time analytics, to enhance decision-making and operational efficiency across industries. The evolution of data science in Databricks has played a pivotal role in advancing big data analytics, offering scalable, efficient, and user-friendly solutions. This study charts the historical development of these applications within Databricks and provides insights into future trends and potential areas for innovation. As data continues to grow in volume and complexity, platforms like Databricks will be instrumental in harnessing data science and ML to drive insights and value across sectors.
Publisher
International Journal of Innovative Science and Research Technology
Cited by
1 article.