Biometric recognition can be combined with user authentication on a smartphone to improve accuracy, reliability, and ease of use, and to help prevent fraud. Single-modality biometric authentication is limited by environmental degradation, sensor noise, and a single point of failure, whereas multimodal biometric authentication can mitigate these limitations, yielding more robust systems and improving identification and recognition accuracy. This research proposes a cloud-based facial and speech authentication system that supports web-based examinations. The system registers students' biometrics, recognizes students, and reports the recognition results. Performance is evaluated with a confusion matrix of positive and negative detections, from which the accuracy, precision, and recall are derived. Future research should design and evaluate adaptive multimodal biometric authentication that uses the optimal weight for each biometric.
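As an illustration of the evaluation described above, the following minimal sketch shows how accuracy, precision, and recall are conventionally derived from the four cells of a binary confusion matrix. The class name `ConfusionCounts`, the function names, and the counts in `example` are hypothetical and chosen only for illustration; they are not the system's code or its reported results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # genuine students correctly accepted
    fp: int  # impostors wrongly accepted
    tn: int  # impostors correctly rejected
    fn: int  # genuine students wrongly rejected


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all decisions that were correct."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Fraction of accepted attempts that were genuine."""
    accepted = c.tp + c.fp
    return c.tp / accepted if accepted else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of genuine attempts that were accepted."""
    genuine = c.tp + c.fn
    return c.tp / genuine if genuine else 0.0


# Made-up counts, for illustration only (not results from the study).
example = ConfusionCounts(tp=45, fp=5, tn=40, fn=10)
print(f"accuracy={accuracy(example):.3f}",
      f"precision={precision(example):.3f}",
      f"recall={recall(example):.3f}")
```

Each function returns 0.0 when its denominator is zero, which is one common convention; an evaluation could equally treat that case as undefined.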