In Course 2 (Module 7), we studied the North Carolina and Texas redistricting cascades from a legal perspective, tracking the timeline of maps enacted, struck down, and replaced. Now we examine the same court cases from the perspective of the data scientist. By studying the expert witness reports filed in Harper v. Hall (NC) and LULAC v. Abbott (TX), we can see how the ReCom methods from the previous module were turned into evidence on the witness stand.
In This Module
- Covers: Expert witness testimony utilizing MCMC ensembles in North Carolina and Texas redistricting litigation.
- Why it matters: Theory is useless if the judge doesn't understand it. Practitioners must learn how to present a 100,000-map ensemble as a simple, compelling visual "bell curve" that a non-technical judge can use to strike down a legislature's work.
- After this module, the reader can: Deconstruct an expert witness report and explain how a computational outlier supports a claim of a legal violation.
Reading List
Conceptual
-
A masterclass in presentation. Dr. Chen generated 1,000 random North Carolina maps constrained only by the state's required non-partisan criteria (compactness, equal population, county preservation). He then plotted the Republican legislature's enacted map against the 1,000 simulated ones. The resulting chart showed that the enacted map was more hostile to Democratic voters than 99.9% of the neutrally drawn maps. Read it to understand how algorithms are used to answer legal questions of intent.
Methods
-
An analysis of how MCMC ensembles are applied to questions of race. In Texas, the state claimed that it was impossible to draw a certain number of Hispanic-majority districts while keeping them geographically compact. Dr. Duchin's ensemble showed the opposite: the algorithm readily generated thousands of maps that were both highly compact and proportional to the growing Hispanic population, which supports the inference that the state intentionally cracked the demographic.
Technical Reference
-
The transparency requirement. In litigation, if you use an algorithm, you must provide your script and your seed files to the opposition so they can attempt to reproduce your test. Reviewing the open-source repositories from these trials shows exactly how the GerryChain parameters (discussed in Module 7) were hard-coded for the courtroom.
Key Concepts
How did Dr. Jowei Chen prove partisan intent in Harper v. Hall (North Carolina)?
Dr. Chen generated 1,000 random North Carolina maps using only non-partisan criteria: compactness, equal population, and county preservation. The enacted map was more hostile to Democratic voters than 99.9% of the randomly generated alternatives. An outlier that extreme is strong statistical evidence that the partisan outcome did not arise by chance from neutral criteria, and it is the basis for the inference of intent.
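The outlier comparison described above reduces to a simple count: what share of simulated plans is at least as extreme as the enacted plan? The following is a minimal sketch under stated assumptions, not Dr. Chen's actual code. The seat counts are synthetic, the enacted value is invented for illustration, and the function name `share_at_least_as_extreme` is hypothetical.

```python
import random


def share_at_least_as_extreme(ensemble_values, enacted_value):
    """Return the fraction of ensemble plans whose statistic is >= the enacted plan's."""
    hits = sum(1 for value in ensemble_values if value >= enacted_value)
    return hits / len(ensemble_values)


# Synthetic stand-in for an ensemble: Republican-leaning seat counts for 1,000 plans.
# These numbers are invented for illustration and are not data from any report.
rng = random.Random(2022)  # fixed seed so the illustration is repeatable
ensemble_seats = [rng.choice([6, 7, 7, 8, 8, 8, 9, 9]) for _ in range(1000)]

enacted_seats = 10  # hypothetical enacted-plan value

share = share_at_least_as_extreme(ensemble_seats, enacted_seats)
print(f"{share:.1%} of simulated plans are at least as extreme as the enacted plan")
```

A small share is what makes the enacted plan an outlier. The same function works for any one-number summary of a plan, as long as "more extreme" means "larger" for that statistic.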
How did Dr. Duchin disprove Texas's claim that Hispanic-majority districts were geometrically impossible?
In LULAC v. Abbott, Dr. Duchin's ensemble readily generated thousands of maps that were both highly compact and proportional to the growing Hispanic population. Because the algorithm produced what the state claimed was impossible, it undercut the state's defense and supported the inference that the state intentionally cracked the Hispanic population to dilute its voting power.
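A feasibility argument of this kind amounts to filtering the ensemble for plans that satisfy both conditions at once. The sketch below assumes each plan has already been summarized by a mean compactness score and a count of Hispanic-majority districts. The `PlanSummary` type, the thresholds, and the four sample plans are all hypothetical and are not taken from the Texas reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanSummary:
    plan_id: int
    mean_compactness: float           # e.g. an average Polsby-Popper score in [0, 1]
    hispanic_majority_districts: int  # districts above the chosen majority threshold


def plans_meeting_both(plans, min_compactness, min_districts):
    """Keep only plans that are compact enough AND have enough majority districts."""
    return [
        plan
        for plan in plans
        if plan.mean_compactness >= min_compactness
        and plan.hispanic_majority_districts >= min_districts
    ]


# Invented sample summaries, for illustration only.
plans = [
    PlanSummary(1, 0.31, 7),
    PlanSummary(2, 0.27, 9),
    PlanSummary(3, 0.35, 8),
    PlanSummary(4, 0.22, 8),
]

qualifying = plans_meeting_both(plans, min_compactness=0.25, min_districts=8)
print([plan.plan_id for plan in qualifying])
```

A single qualifying plan is enough to rebut a claim of impossibility. A large number of them shows that the result does not depend on one carefully hand-drawn map.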
Why must redistricting algorithms and seed files be open-source for litigation?
In litigation, an expert's algorithm must be fully reproducible by the opposing side. The MGGG Lab and Princeton Gerrymandering Project publish their GerryChain configurations, ensemble parameters, and seed data on GitHub. This transparency requirement is non-negotiable. Reviewing these repositories shows how academic theory is hard-coded for courtroom deployment.
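One way to make a run reproducible is to record every parameter, including the random seed, in a manifest and to fingerprint that manifest so any later change is detectable. The sketch below uses only the Python standard library. The `toy_chain` function is a stand-in for a real seeded chain run, not GerryChain's API, and the parameter names and values are hypothetical.

```python
import hashlib
import json
import random


def run_manifest(params):
    """Serialize run parameters deterministically and attach a SHA-256 fingerprint."""
    payload = json.dumps(params, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"params": params, "sha256": digest}


def toy_chain(seed, steps):
    """Stand-in for a seeded chain: the same seed and step count give the same draws."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(steps)]


# Hypothetical parameter values, for illustration only.
params = {"seed": 12345, "steps": 1000, "population_tolerance": 0.01}

manifest = run_manifest(params)
first = toy_chain(params["seed"], params["steps"])
second = toy_chain(params["seed"], params["steps"])

# The opposing expert should get an identical sequence from the disclosed seed.
print(manifest["sha256"][:12], first == second)
```

Disclosing the manifest with the script lets the opposing expert confirm they are rerunning the same configuration before they dispute the result.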
Goal: Finalize the Ensemble section of your Methodology Portfolio by drafting the visual deliverable.
Once you have run the analysis, you must present the finding to the legal team.
- Design the Deliverable: Describe the exact visualization you will present to the judge. (e.g., "A histogram plotting the number of Republican-leaning districts on the X-axis and frequency of algorithmically generated maps on the Y-axis. A red line will indicate the enacted map's extreme location outside the standard distribution curve.")
- Prepare for Rebuttal: The state will hire its own expert to attack your model. What is the most likely vulnerability in your parameterization? (e.g., "The state will likely argue that our algorithm excessively prioritized racial thresholds over county boundary preservation.") Write your defense into the Portfolio.
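Before committing to a chart design, it can help to prototype the histogram from the Design the Deliverable step in plain text. The sketch below is one possible prototype. It uses synthetic seat counts and a hypothetical enacted value, and it is rotated relative to the described figure: seat counts run down the left side and bar length stands in for frequency. The function name `histogram_rows` is an assumption for illustration.

```python
from collections import Counter
import random


def histogram_rows(ensemble_values, enacted_value, width=40):
    """Build text rows, one per seat count, and flag the enacted plan's row."""
    counts = Counter(ensemble_values)
    lowest = min(min(counts), enacted_value)
    highest = max(max(counts), enacted_value)
    peak = max(counts.values())
    rows = []
    for seats in range(lowest, highest + 1):
        # Scale each bar so the most common seat count fills the full width.
        bar = "#" * round(width * counts.get(seats, 0) / peak)
        marker = "  <-- enacted plan" if seats == enacted_value else ""
        rows.append(f"{seats:>2} | {bar}{marker}")
    return rows


# Synthetic ensemble of Republican-leaning seat counts, invented for illustration.
rng = random.Random(7)
ensemble = [round(rng.gauss(8, 0.8)) for _ in range(1000)]
enacted = 11  # hypothetical enacted-plan value

for row in histogram_rows(ensemble, enacted):
    print(row)
```

If the flagged row sits far from the bulk of the bars, the final graphic will make the same point at a glance. If it does not, the text version exposes that before the exhibit is drafted.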