There is a great deal of interest in the role that partisanship, and cross-party animosity in particular, plays in interactions on social media. Most prior research, however, must infer users’ judgments of others’ posts from engagement data. Here, we leverage data from Birdwatch, Twitter’s crowdsourced fact-checking pilot program, to directly measure judgments of whether other users’ tweets are misleading, and whether other users’ free-text evaluations of third-party tweets are helpful. For both sets of judgments, we find that contextual features – in particular, the partisanship of the users – are far more predictive of judgments than content features. Specifically, users are more likely to write negative evaluations of tweets from counter-partisans, and more likely to rate evaluations from counter-partisans as unhelpful. Our findings provide clear evidence that users systematically reject content from those with whom they disagree politically. Platform designers must consider the ramifications of partisanship when implementing crowdsourcing programs.