This module is the technical core of Course 3. Politicians have long defended highly skewed maps as the natural result of geographic constraints: Democrats cluster in cities ("political geography"), the argument went, so proportional representation was geometrically impossible. Ensemble analysis undercuts that defense. Using Markov Chain Monte Carlo (MCMC) algorithms, mathematicians can program a computer to randomly generate 100,000 neutral maps that follow all state rules. If the politician's map is more extreme than 99,999 of those random maps, the "accident of geography" defense collapses: geography and the state's rules alone do not explain the result, which is strong statistical evidence that the map was drawn to rig the outcome.

In This Module

  • Covers: Markov Chain Monte Carlo algorithms, the Recombination (ReCom) method, and building a mathematical baseline of fair districting outcomes.
  • Why it matters: This is currently the most devastating weapon in the civil rights data scientist's arsenal. If you do not understand ensemble modeling, you cannot lead a major redistricting challenge.
  • After this module, the reader can: Explain how algorithms systematically reassign voting precincts to generate large ensembles of alternative maps, and how those ensembles expose gerrymandered outcomes as statistical outliers.

Reading List

Conceptual

  • Conceptual [Scale lens]
    A high-level explanation of the baseline problem. Duchin argues that you cannot simply compare a map to strict proportionality (e.g., "50% vote should equal 50% seats") because physical geography limits what is possible. Instead, you must compare a map to the universe of possible maps for that specific state. The ensemble of maps the algorithm generates therefore acts as the baseline for fairness.
  • 2. Gregory Herschlag, Jonathan Mattingly, et al., Quantifying Gerrymandering in North Carolina
    Conceptual
    An accessible summary from the Duke mathematics team that pioneered much of the use of MCMC ensembles as evidence before state supreme courts. It introduces the concept of the "bell curve" of maps: the authors plot 24,000 random district configurations and show that the map enacted by the NC legislature falls entirely off the far edge of the curve, visually demonstrating how statistically unlikely the result is.

Methods

  • 3. DeFord, Duchin, and Solomon, Recombination: A family of Markov chains for redistricting (2019)
    Methods
    The hard mechanics. Older MCMC methods swapped single precincts one at a time on the border of a district, which often led to wildly non-compact shapes. The ReCom method solves this by fusing two adjacent districts together, drawing a random spanning tree through the fused super-district, and cutting one edge of that tree to split it back into two population-balanced pieces that tend to be compact. ReCom is now the standard approach for modern ensemble modeling.

Technical Reference

  • Technical Reference
    GerryChain is the open-source Python ecosystem developed specifically for running ReCom ensembles. As a technical practitioner, you must review the documentation to understand the programmatic structure of a Markov Chain run: defining the initial partition (seed map), setting the constraints (population limits, VRA limits), and declaring the updaters (tracking partisan shifts).

Key Concepts

Why must gerrymandering analysis compare an enacted map to the universe of possible maps?

Moon Duchin argues that strict proportionality ("50% vote = 50% seats") is not a valid baseline because physical geography constrains possible configurations. Instead, analysts must generate thousands of random neutral maps following all state rules. The resulting ensemble becomes the baseline for fairness, allowing courts to determine whether the enacted map is a statistical outlier.

How did Duke mathematicians use MCMC ensembles to prove gerrymandering in North Carolina?

Herschlag, Mattingly, and colleagues generated 24,000 random district configurations using only North Carolina's non-partisan criteria. The enacted legislature's map fell entirely off the far edge of the bell curve: it was more hostile to the minority party than 99.9% of neutrally drawn alternatives. That is strong statistical evidence that the outcome was not an accident of geography.
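
The outlier comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the Duke team's code: the function name `outlier_share`, the 13-district setup, and the enacted seat count are hypothetical, and the "ensemble" is synthetic placeholder data standing in for the seat counts a real chain would produce.

```python
import random


def outlier_share(ensemble_values, enacted_value):
    """Share of ensemble maps whose statistic is at least as extreme
    (here: at least as large) as the enacted map's statistic."""
    as_extreme = sum(1 for value in ensemble_values if value >= enacted_value)
    return as_extreme / len(ensemble_values)


# SYNTHETIC placeholder data, not real election results: each entry is the
# number of seats (out of a hypothetical 13) that one party wins under one
# neutrally generated map. A real analysis would read these from a chain run.
rng = random.Random(0)
ensemble_seats = [
    sum(rng.random() < 0.45 for _ in range(13)) for _ in range(10_000)
]

enacted_seats = 10  # hypothetical seat count under the enacted map

share = outlier_share(ensemble_seats, enacted_seats)
print(f"{share:.2%} of ensemble maps give at least {enacted_seats} seats")
```

A small share means few neutral maps are as favorable to that party as the enacted one. The choice of statistic (seats won, vote share in the most competitive district, and so on) and of a one-sided comparison are assumptions an expert would need to justify.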

How does the ReCom method improve upon older MCMC redistricting algorithms?

Older MCMC methods swapped single precincts at district borders, producing non-compact shapes. ReCom fuses two adjacent districts into a super-district, draws a random spanning tree through the combined geography, and cuts one edge of that tree to split it back into two population-balanced pieces that tend to be compact. This produces more realistic, legally defensible map samples and has become the standard approach.
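
The merge, tree, and cut steps can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not the authors' implementation: the 4x4 grid of unit-population precincts, the helper names, and the use of a randomly ordered Kruskal pass to draw the spanning tree (one of several possible ways to sample a tree) are all choices made here for brevity.

```python
import random


def grid_graph(rows, cols):
    """Adjacency dict for a rows x cols grid of precincts."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in list(adj):
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj


def random_spanning_tree(nodes, adj, rng):
    """Kruskal's algorithm over randomly ordered edges (union-find)."""
    nodes = set(nodes)
    edges = sorted((u, v) for u in nodes for v in adj[u] if v in nodes and u < v)
    rng.shuffle(edges)
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = {n: [] for n in nodes}
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two components, so keep it
            parent[ru] = rv
            tree[u].append(v)
            tree[v].append(u)
    return tree


def balanced_cut(tree, pop, rng, tolerance):
    """Nodes on one side of a tree edge whose removal balances population,
    or None if this tree has no such edge."""
    root = min(tree)
    total = sum(pop[n] for n in tree)
    parent, order, stack = {root: None}, [], [root]
    while stack:  # depth-first preorder: parents appear before children
        node = stack.pop()
        order.append(node)
        for nb in tree[node]:
            if nb not in parent:
                parent[nb] = node
                stack.append(nb)
    subtree = {n: pop[n] for n in tree}
    for node in reversed(order):  # accumulate from the leaves upward
        if parent[node] is not None:
            subtree[parent[node]] += subtree[node]
    half = total / 2
    candidates = [n for n in order if parent[n] is not None
                  and abs(subtree[n] - half) <= tolerance * half]
    if not candidates:
        return None
    cut = rng.choice(candidates)  # cut the edge between `cut` and its parent
    side, stack = set(), [cut]
    while stack:
        node = stack.pop()
        side.add(node)
        stack.extend(nb for nb in tree[node] if parent[nb] == node)
    return side


def recom_step(assignment, adj, pop, a, b, rng, tolerance=0.1, max_tries=100):
    """Merge districts a and b, draw a tree, and re-split at a balanced cut."""
    merged = [n for n, d in assignment.items() if d in (a, b)]
    for _ in range(max_tries):
        tree = random_spanning_tree(merged, adj, rng)
        side = balanced_cut(tree, pop, rng, tolerance)
        if side is not None:
            new_assignment = dict(assignment)
            for n in merged:
                new_assignment[n] = a if n in side else b
            return new_assignment
    return assignment  # no balanced cut found; keep the current map


adj = grid_graph(4, 4)
pop = {n: 1 for n in adj}                        # unit populations (toy data)
seed = {n: 0 if n[1] < 2 else 1 for n in adj}    # left half vs. right half
new_map = recom_step(seed, adj, pop, 0, 1, random.Random(1))
```

Because each piece is a connected subtree, both new districts are contiguous by construction; the retry loop exists because not every spanning tree contains a population-balanced edge.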

What is GerryChain and how is it used for redistricting litigation?

GerryChain is an open-source Python library for running ReCom ensemble simulations. Practitioners define the initial partition, set population constraints and VRA thresholds, and declare "updaters" tracking partisan metrics across each generated map. A typical litigation run generates 10,000–100,000 samples, producing a statistical distribution against which the enacted map is measured. Because the code is open source, opposing experts and the court can rerun the analysis, which supports the reproducibility expected of expert testimony.
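
A run with that structure might look roughly like the sketch below. It assumes GerryChain is installed and that a precinct adjacency graph has been saved as JSON; the file path and the column names `TOTPOP`, `DISTRICT`, `SEN_D`, and `SEN_R` are hypothetical placeholders for whatever the real data uses, and the function names are illustrative. A VRA-related constraint would be an additional custom function in the constraints list and is omitted here.

```python
from functools import partial


def build_recom_chain(graph_path, total_steps=10_000, epsilon=0.02):
    """Assemble a ReCom chain: seed partition, constraints, updaters."""
    # Imported inside the function so this file loads without GerryChain.
    from gerrychain import (Election, Graph, MarkovChain, Partition,
                            accept, constraints, updaters)
    from gerrychain.proposals import recom

    graph = Graph.from_json(graph_path)  # precinct adjacency graph

    # Updaters: quantities recomputed for every map the chain visits.
    chain_updaters = {
        "population": updaters.Tally("TOTPOP", alias="population"),
        "SEN": Election("SEN", {"Dem": "SEN_D", "Rep": "SEN_R"}),
    }

    # Initial partition (seed map): the plan stored in the DISTRICT column.
    seed = Partition(graph, assignment="DISTRICT", updaters=chain_updaters)
    ideal_population = sum(seed["population"].values()) / len(seed)

    # Proposal: one ReCom step, keeping districts near the ideal population.
    proposal = partial(recom, pop_col="TOTPOP", pop_target=ideal_population,
                       epsilon=epsilon, node_repeats=2)

    return MarkovChain(
        proposal=proposal,
        constraints=[
            constraints.within_percent_of_ideal_population(seed, epsilon)
        ],
        accept=accept.always_accept,
        initial_state=seed,
        total_steps=total_steps,
    )


def dem_seat_counts(chain):
    """Seats won by the 'Dem' column under each map in the chain."""
    return [partition["SEN"].wins("Dem") for partition in chain]


# Hypothetical usage (requires GerryChain and a real graph file):
# chain = build_recom_chain("precincts.json", total_steps=1_000)
# seats = dem_seat_counts(chain)
```

The list returned by `dem_seat_counts` is the kind of distribution against which the enacted map's seat count would be compared.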