Abstract
Rainfall plays a crucial role in the water cycle and is a direct input to agriculture and water resource management. However, its patterns vary significantly across regions, creating challenges for sustainable water use. This study focuses on the Rangpur district in northwestern Bangladesh, where irrigation relies heavily on unpredictable rainfall. To address this, we applied three machine learning regression methods (Random Forest, Support Vector Machine, and Gradient Boosting Machine) to historical annual rainfall data from 1990 to 2020. The analysis was conducted in Google Colab, a cloud-hosted Python notebook environment. Hyperparameters of all three models were tuned by grid search to maximize prediction accuracy. The Random Forest model proved the most accurate for rainfall prediction in the Rangpur district. During the testing phase, it achieved an R-squared value of 0.75, indicating that the model explained about 75% of the variance in observed rainfall. Interestingly, Gradient Boosting Machine outperformed Random Forest in the training phase, which highlights the importance of considering both training and testing performance when selecting a model. Additionally, Random Forest regression yielded the highest correlation between predicted and observed rainfall (97%), confirming the strong relationship between the two. This study demonstrates the effectiveness of Random Forest regression for forecasting rainfall in the Rangpur district. This knowledge can contribute to resilient water management strategies, enabling farmers and authorities to adapt irrigation practices and optimize resource allocation in response to predicted precipitation patterns. Future research could incorporate additional environmental variables into the model and explore other ensemble learning techniques to further improve prediction accuracy.
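The model comparison described above can be sketched roughly as follows. This is a minimal illustration and not the study's actual code: the rainfall series is synthetic, the lag-based features, the chronological 80/20 split, the 3-fold cross-validation, and the small parameter grids are all assumptions made for the example, and only standard scikit-learn calls are used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for 31 annual rainfall totals (1990-2020), in mm.
# These values are NOT the Rangpur data; they only make the sketch runnable.
rng = np.random.default_rng(0)
rainfall = 2200 + rng.normal(scale=300, size=31)

# Assumed feature design: predict each year from the previous three years.
n_lags = 3
X = np.column_stack(
    [rainfall[i : len(rainfall) - n_lags + i] for i in range(n_lags)]
)
y = rainfall[n_lags:]

# Chronological split (no shuffling), so the test years follow the training years.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Illustrative grids only; the grids used in the study are not given in the text.
candidates = {
    "random_forest": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [2, 4]},
    ),
    "svm": (
        make_pipeline(StandardScaler(), SVR()),  # SVR is sensitive to feature scale
        {"svr__C": [1, 10, 100], "svr__epsilon": [0.1, 1.0]},
    ),
    "gradient_boosting": (
        GradientBoostingRegressor(random_state=0),
        {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Exhaustive grid search, scored by R-squared on each cross-validation fold.
    search = GridSearchCV(estimator, grid, cv=3, scoring="r2")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "train_r2": r2_score(y_train, search.predict(X_train)),
        "test_r2": r2_score(y_test, search.predict(X_test)),
    }

for name, res in results.items():
    print(name, res["best_params"],
          f"train R2={res['train_r2']:.2f}", f"test R2={res['test_r2']:.2f}")
```

Reporting training and testing R-squared side by side makes it easy to spot a model that fits the training years well but generalizes poorly, which is the situation the abstract describes for Gradient Boosting Machine relative to Random Forest. On the synthetic series used here the scores carry no meaning; they only show the shape of the comparison.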