‘First impressions’ are a popular topic in social psychology. They are researched because initial judgments of others are consequential in everyday life (for example, in job interviews, first dates, and justice outcomes). In the context of broader concerns about the credibility of psychological science, first impressions research has developed commendable initiatives for improving reliability (open stimulus databases, international collaborations, replication studies, and reanalyses). However, these initiatives can affect the validity of research on how people form first impressions. There is a long history of critiquing the usefulness of passive-observer judgments of controlled, reduced presentations of people, and these concerns remain relevant today. Here, we highlight the praiseworthy practices improving reliability in first impressions research, before identifying persistent methodological concerns in the field. These include inadequate stimulus sampling and diversity, constrained participant response options, limited consideration of study context, and the limitations of atomised presentations of target people. We identify how these methodological limitations impact theory development, how we might be over- or underestimating everyday experience, and how we might even be misunderstanding social differences in autism and mental health. Finally, we identify opportunities for methodological reform, focusing on codifying instead of controlling interactions, promoting inductive, participant-led methodologies, and asking for stronger theory development and clarity on ‘can’ vs. ‘do’ research questions. Overall, we praise reforms for improving the reliability of first impressions research, but improving scientific predictions about first impressions requires renewed consideration of validity.