BACKGROUND
Understanding the core principles of nutrition is essential in today’s world of abundant, often contradictory dietary advice. It empowers individuals to make informed dietary choices, which is crucial for maintaining a proper diet and managing diet-related non-communicable diseases (NCDs). Artificial Intelligence (AI) systems play an increasingly prominent role in providing nutritional information, but their reliability in this domain is not yet well established.
OBJECTIVE
This study compares the nutrition knowledge of state-of-the-art AI systems (ChatGPT-4, Bard, Copilot, and ChatGPT-3.5) with that of human subjects with different levels of nutrition knowledge.
METHODS
The “General Nutrition Knowledge Questionnaire–Revised” (GNKQ-R) was administered to the four AI systems and to human subjects. The AI systems were tested using zero-shot prompts. Responses were scored per the GNKQ-R guidelines across four sections: “Dietary Recommendations”; “Food Groups”; “Healthy Food Choices”; and “Diet, Disease and Weight Management”. Human subjects were grouped by academic background (dietetics vs English students), age, sex/gender, education level, and health status.
RESULTS
The average score across the four AI systems was 77.3±5.1 out of 88, which was comparable to that of the dietetics students and significantly higher than that of the English students. ChatGPT-4 scored highest among the AI systems (82/88), surpassing both groups of students (dietetics: 79.3/88, English: 67.7/88) as well as all other demographic groups. In “Dietary Recommendations”, ChatGPT-4 and ChatGPT-3.5 nearly matched the dietetics students. ChatGPT-4 excelled in “Food Groups”, outperforming all human groups. In “Healthy Food Choices”, ChatGPT-4 achieved a perfect score, indicating a deep understanding of this area. ChatGPT-3.5 excelled in “Diet, Disease and Weight Management”. The performance of the AI systems varied across sections, suggesting knowledge gaps in certain areas. AI systems, particularly ChatGPT-4 and ChatGPT-3.5, showed proficiency in nutrition knowledge, rivaling or surpassing dietetics students in certain sections. This indicates their potential utility in nutritional guidance. However, there are nuances and specific details in which AI systems fall short of specialized human education. The study highlights the potential of AI in public health and educational settings but also underscores the value of expert human judgment.
CONCLUSIONS
AI systems show promise in understanding complex subjects like nutrition and can be a valuable adjunct educational tool. However, specialized human education and expertise remain irreplaceable, which underscores the need for an approach that combines insights from AI systems with expert human judgment in nutrition and dietetics.