Affiliation:
1. Department of Artificial Intelligence and Machine Learning, New Horizon College of Engineering, Bangalore, India
Abstract
This paper discusses data preparation, starting from a common observation. A familiar adage holds that a data scientist spends 80% of his or her time preparing data and only 20% actually working with it, particularly when it comes to visualization. The paper concentrates on data preparation, including its most common issues, solutions, and developments. Data must be put into the proper form before it can be analyzed, and manipulating and organizing data are steps in preparing it for analysis. Data preparation is the iterative transformation of unstructured, chaotic raw data into a more organized, practical form that is ready for further analysis. Data profiling, cleaning, integration, and transformation are among the primary activities (or tasks) that make up the preparation process.
References (9 articles)
1. Zhang, Z., C. Zhang, and S. Zhang. 2003. An agent-based hybrid framework for database mining. Applied Artificial Intelligence 17(5–6):383–398.
2. Zhang, C., and S. Zhang. 2002. Association Rules Mining: Models and Algorithms. In Lecture Notes in Artificial Intelligence, volume 2307, page 243, Springer-Verlag.
3. Zhang, H., and C. Ling. 2003. Numeric mapping and learnability of Naïve Bayes. Applied Artificial Intelligence 17(5–6):507–518.
4. Yang, Q., T. Li, and K. Wang. 2003. Web-log cleaning for constructing sequential classifiers. Applied Artificial Intelligence 17(5–6):431–441.
5. Tseng, S., K. Wang, and C. Lee. 2003. A pre-processing method to deal with missing values by integrating clustering and regression techniques. Applied Artificial Intelligence 17(5–6):535–544.