The Power of Characters: Evaluating Machine Learning-Modified Bayesian Improved Surname Geocoding Inference of Race in Redistricting

Published: 2024-05-22
Page: 1-22
ISSN: 1532-4400
Container-title: State Politics & Policy Quarterly
Language: en
Short-container-title: State Politics Policy Q.

Author:

Curiel John A., DeLuca Kevin

Abstract

Identifying racial disparities in policy and politics is a pressing area of research within the United States. Where early work relied on potentially noisy correlations between county or precinct demographics and election outcomes, the advent of Bayesian Improved Surname Geocoding (BISG) vastly improved estimation of race by employing voter lists. Machine Learning (ML)-modified BISG in turn offers accuracy gains over the static – and potentially outdated – surname dictionaries present in traditional BISG. However, the extent to which ML might substantively alter the policy and political implications of redistricting is unclear given its improvements in voter race estimation. Therefore, we ascertain the potential gains of ML-modified BISG in improving the estimation of race for the purpose of redistricting majority-minority districts. We evaluate an ML-modified BISG program against traditional BISG estimates in correctly estimating the race of voters for creating majority-minority congressional districts within North Carolina and Georgia, and in state assembly districts in Wisconsin. Our results demonstrate that ML-modified BISG offers substantive gains over traditional BISG, especially in diverse political geographic units. Further, we find meaningful improvements in accuracy when estimating majority-minority district racial composition. We conclude with recommendations on when and how to use the two methods, in addition to how to ensure transparency and confidence in BISG-related research.

Publisher

Cambridge University Press (CUP)
