The (very) tricky math of detecting gerrymandering in election districts
Keith Devlin @KeithDevlin@fediscience.org, @profkeithdevlin.bsky.social
with Ellen Veomett https://ellenveomett.com
Today is Election Day in the United States. Which, among other things, means we are likely in for a series of lawsuits, many of which will involve mathematical testimony regarding the organizing, running, and counting of votes. (From the perspective of fair representation, the US Presidential election is a mathematical disaster area.)
Moreover, the local elections going on all over the nation bring in the mathematical treasure trove of how to determine fair districts, with advanced math used both to rig the districts in favor of one party and for the losing party to detect (and hopefully prove) that such has occurred.
I’ve written about election math in two earlier Devlin’s Angle posts, November 2000 and more recently November 2016. This time, I want to take a look at the mathematically fascinating topic of detecting party bias in districting.
In particular, I want to give more air to recent studies showing just how tricky it can be to detect bias in an era where party loyalists have access to considerable mathematical skills and big budgets for computing with large datasets.
NOTE: The thumbnail image for this post shows the original “Gerry mander” district, created in 1812 by Elbridge Gerry to redraw the Massachusetts state senate district near Boston. By carving up Essex County, a political stronghold for the Federalists, it resulted in the election of three Democratic-Republicans, leaving just two Federalist senators. Elbridge Gerry, the governor who signed the bill creating the new “salamander looking” district, was one of America’s Founding Fathers: a signer of the Declaration of Independence, a congressman, a confidant of John Adams, and the nation’s fifth vice-president. The word “gerrymander” was apparently coined at a Boston dinner party hosted by a prominent Federalist, after illustrator Elkanah Tisdale drew a map of the district, embellished to look like a monster salamander.
Since this area of mathematics is outside my area of expertise, I asked one of the leading experts in the field to write the story.
Dr Ellen Veomett is an Associate Professor in the Computer Science Department at the University of San Francisco, whose research focuses on redistricting and fairness in machine learning. Professor Veomett has a Master’s in Computer Science from the University of Illinois at Urbana-Champaign and a PhD in Mathematics from the University of Michigan.
She started looking at election issues in 2017, after she read a paper about the Efficiency Gap (and how it satisfied the Efficiency Principle). She didn't believe it was correct, and when she looked into it she found a counterexample to the main claim.
For any mathematician, that kind of thing can (and usually does) fire an obsession to understand why. In Ellen’s case, it led to her first paper in the area: The Efficiency Gap, Voter Turnout, and the Efficiency Principle (citation at end). She has, she tells me, been fascinated with this area of study ever since.
She has worked on metrics to detect gerrymandering, redistricting protocols, multimember districts, ballot generation, and the Markov Chain process to create the ensemble of potential districting maps. Some of that work is published, some is submitted, and some is work in progress. I give a list of her publications in this area at the end.
GUEST POST
Metrics to detect gerrymandering
by Ellen Veomett
In case you haven’t noticed yet, today is election day! If you can vote, you’ve been asked to weigh in on your representative for your U.S. House District, State Senate District, City Council District, or any number of other districts that you may live in. So, you may also be curious how much your district vote “counts.” In particular, you may be wondering how gerrymandered your districting map is.
In the past 10 years or so, study of the mathematics of redistricting has exploded, and some innovative and revealing techniques have been developed. To see a few examples, I encourage you to explore the websites of the Metric Geometry Gerrymandering Group and the Quantifying Gerrymandering Group, to name just two.
In this blog post, I’d like to focus on metrics to detect gerrymandering, and in particular on two that are arguably the oldest and most established. [Due to modern computing techniques, pretty much everyone agrees that metrics focused on district shapes can’t detect gerrymandering anymore, so I won’t consider those.]
The metrics I’ll hold a microscope to are the Partisan Bias and the Mean-Median Difference. I’m going to try to convince you that these two metrics have avoided proper scrutiny for far too long. Perhaps we, as a community, can support democracy by helping to explain what these metrics truly can and cannot do.
Both are examples of what are called “symmetry metrics” because, in a way, they measure how the two political parties (the Democratic and Republican parties) are treated differently. The premise behind these metrics is that the two parties should be treated equally.
“Wait, what’s that?” you say. “Isn’t that a good thing? Isn’t fairness based in being treated equally?” For sure, the ideals on which these symmetry metrics are based are appealing, and generally accepted as appropriate. Which explains, I think, at least a portion of their lasting power.
But as I’ll show, while the story behind the design of these metrics sounds good, they cannot detect maps that are biased towards one party.
Figure 1 shows you how these metrics can be calculated.
We can get another point in the V-S plane by using the assumption of “uniform partisan swing.” Under this assumption, when the vote share swings towards the Democratic (Republican) party, it does so uniformly across all districts. See Figure 2.
Figure 3 gives a (fictional) example of a seats-votes curve constructed using the uniform partisan swing assumption.
This fictional state has 5 districts, with estimated Democratic vote shares of 20%, 30%, 55%, 60%, and 65%.
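As a rough sketch of how such a curve can be traced, the snippet below applies the same swing to every district of the fictional state and records the resulting (V, S) point. The function names are made up for illustration, V is taken to be the mean of the district vote shares (which matches the statewide vote share only if every district has the same turnout), and a district sitting at exactly 50% is counted as not won. These are assumptions of the sketch, not part of any official definition.

```python
from fractions import Fraction

# Democratic vote shares in the five districts of the fictional state.
SHARES = [Fraction(p, 100) for p in (20, 30, 55, 60, 65)]
HALF = Fraction(1, 2)


def seat_share_after_swing(shares, swing):
    """Democratic seat share after adding the same swing to every district."""
    # Assumption: a district at exactly 50% is not counted as a win.
    won = sum(1 for v in shares if v + swing > HALF)
    return Fraction(won, len(shares))


def seats_votes_point(shares, swing):
    """Return the (V, S) point reached by a uniform swing.

    V is the mean district vote share after the swing (an equal-turnout
    simplification); S is the resulting seat share.
    """
    mean_share = sum(shares) / len(shares)
    return mean_share + swing, seat_share_after_swing(shares, swing)


# Trace a few points of the seats-votes curve.
for pct in (-20, -10, 0, 4, 10, 20, 30):
    v, s = seats_votes_point(SHARES, Fraction(pct, 100))
    print(f"V = {float(v):.2f}  S = {float(s):.1f}")
```

Exact fractions are used instead of floats so that a swing landing a district on exactly 50% is handled predictably.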
If we assume that uniform partisan swing is reasonable, then the height of this curve at V = 0.5 would indicate the seat share that the Democratic party would receive, if they received a 50% vote share.
Here is where the ideal of partisan symmetry comes into play: if this height is not at 0.5, then the two parties are not receiving the same seat share when their vote shares are both 50%.
Thus, measuring how far this height is from 0.5 is a measure of partisan asymmetry. When this height is calculated from the seats-votes curve under the assumption of uniform partisan swing, it corresponds to the Partisan Bias, which can be calculated as shown in Figure 4.
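Following the verbal description above, a minimal sketch of the Partisan Bias shifts every district by the same amount until the average vote share is 50%, counts the seats won, and subtracts 0.5. The helper name `partisan_bias` is hypothetical, the average is an unweighted mean over districts (an equal-turnout assumption), and exact ties at 50% are counted as losses here, whereas some treatments count them as half a seat.

```python
from fractions import Fraction


def partisan_bias(shares):
    """Seat share at V = 0.5 under uniform partisan swing, minus 0.5."""
    half = Fraction(1, 2)
    mean_share = sum(shares) / len(shares)
    swing = half - mean_share  # uniform shift that moves the mean to 50%
    # Assumption: a district at exactly 50% after the shift is not a win.
    won = sum(1 for v in shares if v + swing > half)
    return Fraction(won, len(shares)) - half


# The fictional five-district state from the text.
shares = [Fraction(p, 100) for p in (20, 30, 55, 60, 65)]
print(float(partisan_bias(shares)))  # 0.1
```

For the fictional state, the mean share is 46%, so each district shifts up by 4 points; Democrats would then hold 3 of 5 seats, giving 0.6 − 0.5 = 0.1.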
Similarly, if we assume that uniform partisan swing is reasonable, then the horizontal location of this curve at S = 0.5 would indicate the vote share that the Democratic party must achieve in order to receive a 50% seat share.
Again, if this horizontal location is not at V = 0.5, then the two parties need different vote shares in order to achieve 50% of the seat share. This horizontal distance is another measure of asymmetry, and it corresponds to the Mean-Median Difference:
MM = median(V1, V2, …, Vn) − mean(V1, V2, …, Vn).
For our fictional state, this works out to be 0.09, which is the distance that we see labeled as MM in Figure 3.
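The arithmetic can be checked in a few lines. This is only a sketch using Python's standard `statistics` module; the function name is made up for illustration.

```python
from statistics import mean, median


def mean_median_difference(shares):
    """Median district vote share minus mean district vote share."""
    return median(shares) - mean(shares)


# Democratic vote shares in the fictional five-district state.
shares = [0.20, 0.30, 0.55, 0.60, 0.65]
mm = mean_median_difference(shares)
print(round(mm, 2))  # 0.09 (median 0.55 minus mean 0.46)
```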
These metrics are based in a very reasonable idea: partisan symmetry. And they are relatively easy to calculate and understand. They are also available to the public at sites like Dave’s Redistricting App (DRA), which allows users to “create, analyze and share district maps.”
Members of the public do exactly that. Indeed, during the last redistricting cycle, plenty of citizens tried their hands at redistricting their states with the help of DRA.
Some, like Matthew Petering of Wisconsin, even wrote amicus briefs to the court regarding their maps, as well as other maps drawn for the state.
Dr. Petering and other citizens cited values like the Mean-Median Difference and Partisan Bias of the maps, and repeated the statement from DRA that larger positive values indicate bias towards the Democratic party, and negative values that are larger in absolute value indicate bias towards the Republican party.
While I applaud the public’s interest and involvement in the redistricting process, I have to say I cringe at those claims, because they are simply not true. Uniform partisan swing is an assumption, not a description of how swings really happen, and it rests on a fixed map with changing district vote shares.
We could instead use fixed partisan data (such as a fixed statewide election result), create different districting maps for that state (as the public is encouraged to do on DRA), and empirically test whether it is indeed the case that larger positive values indicate bias towards the Democratic party.
As recent work by me and others, currently pending publication, shows (citations at end), the values of the Mean-Median Difference and Partisan Bias do not generally increase as the number of Democratic-won districts increases, and they do not generally decrease as the number of Republican-won districts increases. Take, for example, the Massachusetts State House. One can use a hill-climbing method to look for districting maps for the MA State House with a large number of districts won by a particular party.
Suppose it were true that large positive values of a metric indicated bias benefitting the Democratic party. Then, when we look at maps where Democrats would win m districts, we would expect the metric values on those maps to increase as m increases.
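To make the expectation concrete, here is a toy comparison, not the empirical study itself. It takes the fictional five-district state from earlier and one hypothetical alternative map (`map_b`, invented for this illustration) with the same average vote share, so the underlying partisan data can be thought of as fixed under an equal-turnout assumption. It then lists seats won next to the Mean-Median Difference.

```python
from statistics import mean, median


def dem_seats(shares):
    """Number of districts where the Democratic vote share exceeds 50%."""
    return sum(1 for v in shares if v > 0.5)


def mean_median_difference(shares):
    """Median district vote share minus mean district vote share."""
    return median(shares) - mean(shares)


maps = {
    "map_a": [0.20, 0.30, 0.55, 0.60, 0.65],  # fictional state from the text
    "map_b": [0.10, 0.52, 0.52, 0.58, 0.58],  # hypothetical alternative map
}

# List the maps in order of Democratic seats won.
for name, shares in sorted(maps.items(), key=lambda kv: dem_seats(kv[1])):
    mm = mean_median_difference(shares)
    print(f"{name}: seats = {dem_seats(shares)}, MM = {mm:.2f}")
```

In this made-up pair, `map_b` gives Democrats 4 seats instead of 3, yet its Mean-Median Difference is smaller (0.06 versus 0.09). Two hand-built maps prove nothing about real states; they only show that the expected monotone relationship is not automatic.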
Indeed, this is the case for some metrics such as the Efficiency Gap, the Declination, and the GEO gap. But it is not true for the Mean-Median Difference and the Partisan Bias. We can easily see this effect in Figure 5. In this figure, the horizontal axis corresponds to the number of districts which would have been won by Democrats (using the 2018 Senate race as partisan data), and the bars correspond to metric values for the corresponding maps. We can see that the Mean-Median Difference and Partisan Bias do not generally increase, while the other metrics do.
Moreover, and perhaps surprisingly, this does not seem to be unique to a particular state or map. Indeed, for all of the 18 maps that my collaborators and I examined, we saw a similar effect: extreme values of the Mean-Median Difference and Partisan Bias did not correspond to maps with an extreme number of districts won by a particular party.
These results matter. When well-established and otherwise excellent sites like Dave’s Redistricting App tell the public that “larger values mean more bias,” the public has no reason to believe otherwise.
Beyond the general public, commissions such as Michigan’s Independent Citizens Redistricting Commission were told that a positive Mean-Median Difference “is evidence that the party had an electoral advantage from the redistricting scheme.” If we take “electoral advantage” to mean more districts won, we can see that this simply isn’t true either.
While all of this may come across as a cautionary tale, I see it also as a place for optimism. Hopefully, these empirical results can convince sites like Dave’s Redistricting App to be a bit more precise in their descriptions of what metrics like the Mean-Median Difference and Partisan Bias can and cannot do. And hopefully we as a community can inform redistricting commissions that limiting the Mean-Median Difference and Partisan Bias to values near 0 will not guarantee that the resulting map is less biased.
More generally, I’m optimistic that the mathematical community will continue to study and develop better tools for courts and commissions to use. As citizens we can use our voting power; as mathematicians we can also use our mathematical and computational skills to improve the redistricting process, and thereby improve the democratic process. This is not just an interesting math problem with a feel-good application. Our democracy may depend on it.
Ellen Veomett: published research on election mathematics
“The Geography and Election Outcome (GEO) Metric: An Introduction” with M. Campisi, T. Ratliff, and S. Somersille, Election Law Journal (2022) https://www.liebertpub.com/doi/10.1089/elj.2021.0054
“Declination as a Metric to Detect Partisan Gerrymandering” with M. Campisi, A. Padilla, and T. Ratliff, Election Law Journal (2019) https://doi.org/10.1089/elj.2019.0562
“The Efficiency Gap, Voter Turnout, and the Efficiency Principle” Election Law Journal (2018) https://doi.org/10.1089/elj.2018.0488
Currently in review:
“Bounds and Bugs: The Limits of Symmetry Metrics to Detect Partisan Gerrymandering” with D. DeFord (submitted 2024)
“Don’t Trust a Single Gerrymandering Metric” with T. Ratliff and S. Somersille (submitted 2024)
“Connected Recursive Bijection and Perfect Hierarchical Matchings” with I. Ludden, K. Chandrasekaran, and S.H. Jacobson (submitted 2023)