Abstract
The British luxury steamship Titanic sank in 1912 after striking an iceberg, resulting in a severe loss of life and property. This research analyzes which passenger characteristics were most strongly correlated with survival in the Titanic disaster. It also demonstrates how to predict survival using two interpretable machine learning algorithms, Decision Tree and Random Forest, and compares them to determine which model predicts survival better. The feature importance of each model is visualized and shows that a passenger's age, sex and ticket class are the three features most strongly associated with survival. Ticket class reflects a passenger's socioeconomic class, which suggests that the physical space of the cabins on the Titanic played an important role in surviving the disaster. After data cleaning and model building, the accuracy scores indicate that the Decision Tree algorithm performs better than Random Forest.
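The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn and NumPy, uses a small synthetic dataset (not the real passenger records) with hypothetical columns `age`, `sex` and `pclass`, and the hyperparameters are arbitrary choices. The numbers it prints say nothing about the paper's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: NOT the real Titanic records.
rng = np.random.default_rng(0)
n = 800
age = rng.uniform(1, 80, n)
sex = rng.integers(0, 2, n)       # assumed encoding: 0 = male, 1 = female
pclass = rng.integers(1, 4, n)    # ticket class 1, 2 or 3

# Invented survival rule, used only so the models have a signal to learn.
logit = 2.0 * sex - 0.8 * (pclass - 2) - 0.02 * (age - 30) - 0.5
survived = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

feature_names = ["age", "sex", "pclass"]
X = np.column_stack([age, sex, pclass])
X_train, X_test, y_train, y_test = train_test_split(
    X, survived, test_size=0.25, random_state=0
)

# Hyperparameters are illustrative assumptions, not the paper's settings.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    importances = dict(zip(feature_names, model.feature_importances_))
    results[name] = (accuracy, importances)
    print(f"{name}: accuracy={accuracy:.3f}")
    # List features from most to least important for this model.
    for feature, score in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(f"  {feature}: {score:.3f}")
```

Both models are scored on the same held-out split so that the accuracy values are directly comparable, and `feature_importances_` gives the impurity-based importance that such a visualization would typically plot.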