Large-scale research on social movements has required more detailed, recent, and specific data about protest events. Analyses of these data allow for new insights into movement emergence, consequences, and tactical innovation and adaptation. One obstacle to this kind of analysis, however, is that generating event data is very costly. Human coders must pore through news sources, looking for instances of protest and coding many variables by hand. Because of the high labor costs, projects are typically limited to one or two newspapers per country. This, in turn, exacerbates issues of selection and description bias. This article aims to address the problem through the development, validation, and application of a system for automating the generation of protest event data. This system, called the Machine-Learning Protest Event Data System (MPEDS), is the first of its kind to come from within the social movement community. MPEDS uses recent innovations from machine learning and natural language processing to generate protest event data with little to no human intervention. The system aims to increase the speed and reduce the labor costs of identifying and coding collective action events in news sources, thus increasing the timeliness of protest data and reducing the biases that arise from relying on too few news sources. Work on MPEDS is ongoing; the system will be open, available for replication, and extensible by future social movement researchers and by social and computational scientists.