The Washington Post made an admirable attempt to illustrate the forces leading to the shutdown. However, their graphic fails on two important levels: the chart doesn’t support their thesis and their thesis is wrong.
Let’s address the technical issues first.
The vast majority of the graphic is a cartogram of the U.S. House districts. Maps may be a good tool to draw readers in, but they’re inappropriate for anything but a geographic thesis. If their argument were that the Northeast and Pacific coast are Democratic strongholds and the South and West Republican ones, it would be clearly demonstrated by the Post’s map (though it’s pretty much common knowledge these days). But they sought to prove the polarization of the parties, not their geographical distribution. It’s a mistake to devote so much ink to an inappropriate tool.
My next concern centers on the legend. The Post has categorized each district by the winner’s party and as safe or competitive based on the share of the vote obtained by the winner. Notwithstanding binning concerns (conclusions can be very sensitive to the choice of the bin cutoffs), this two-axis categorization poses a problem when graphed on a single axis. The problem rears its ugly head with the legend’s center point label: 47%.
Due to our first-past-the-post voting system, candidates can be elected with less than 50% of the vote, so it’s unwise to situate the Republican and Democratic vote share on two ends of a diverging scale. If the leftmost point is 100% Democrat (0% Republican) and the rightmost point is 100% Republican (0% Democrat), then the center point must be 50/50. Anything else is nonsense. Fortunately, fixing this issue is easy.
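One way to get a scale whose center really is 50/50 is to plot each district by its share of the two-party vote rather than by the winner's share of all votes. Here is a rough Python sketch of that idea. It is only an illustration under that assumption, not necessarily the fix applied here; the function name and the sample numbers are made up.

```python
def republican_two_party_share(dem_votes, rep_votes):
    """Return the Republican share of the two-party vote, in percent.

    0 means every major-party vote went to the Democrat, 100 means every
    one went to the Republican, and 50 is an exact tie, regardless of
    how many votes went to third-party candidates.
    """
    total = dem_votes + rep_votes
    if total <= 0:
        raise ValueError("need at least one major-party vote")
    return 100.0 * rep_votes / total


# Hypothetical three-way race: the Democrat wins with only 45% of all
# votes cast, yet sits clearly left of center on the two-party scale.
print(republican_two_party_share(dem_votes=45, rep_votes=30))  # 40.0
```

On this scale a winner with 47% of the total vote still lands on the correct side of the midpoint, so the legend never needs a center label other than 50.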
My biggest concern with the Post’s visualization is that it represents only a single moment in time. Attempting to explain an unprecedented situation without any historical context is just lunacy. If this situation is unique, the chart must show that.
So I went and grabbed the Federal Election Commission data. The first thing I did was build a chart of the distribution of the House over time, binned into the same ranges as the Washington Post’s graphic (c02e9fc).
Immediately I noticed something: the trend the Post claimed simply does not exist. We can see that over the past decade, the share of representatives in “safe” districts has varied considerably, but with no distinct trend. The Post argued that Congress is more partisan than usual because “just 31 Republicans and 31 Democrats won their seats with 54 percent of the vote or less.” But compare those 62 to election year 2002, when a mere 36 representatives total were elected in “close” races. Something is missing in the Post’s analysis: their thesis cannot fully explain the current political quagmire.
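The tabulation behind a comparison like this is simple enough to sketch in Python. The 54 percent cutoff comes from the Post's own wording; everything else (the record layout, the function names, the sample rows) is invented for illustration and is not the FEC data or the code behind the chart.

```python
from collections import Counter

# Percent; taken from the Post's "54 percent of the vote or less".
CLOSE_CUTOFF = 54.0


def classify(winner_share):
    """Label a race by the winner's vote share in percent."""
    return "close" if winner_share <= CLOSE_CUTOFF else "safe"


def count_by_year(races):
    """Count close and safe races per election year.

    `races` is an iterable of (year, winner_share) pairs.
    """
    counts = {}
    for year, share in races:
        counts.setdefault(year, Counter())[classify(share)] += 1
    return counts


# Made-up sample rows, not real election results.
sample = [(2002, 51.2), (2002, 68.0), (2012, 53.9), (2012, 54.0), (2012, 71.5)]
for year, tally in sorted(count_by_year(sample).items()):
    print(year, dict(tally))
```

Keeping the cutoff in a single constant makes it easy to check how sensitive the counts are to the choice of bin edge.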
I decided a chart with more nuance would be needed. Since we want to examine the distribution of vote share, a natural chart type is a cumulative distribution chart. These can convey a significant amount of information in very little space. However, they are a bit unintuitive and can be hard for many to read. Such a chart is rarely the right option for a final graphic, but it can be a good tool for exploring a dataset.
The first cumulative distribution chart I built was simply the overall distribution broken down by years (a663246ab5). This is a bit hard to read, but shows a clear trend. Look just at the horizontal 50% line. This represents the median vote share, and the leftward year-on-year movement shows that it has been steadily decreasing over the past decade. This pretty directly refutes the Post’s thesis that gerrymandering has created safe districts.
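For anyone who hasn't built one, here is a minimal Python sketch of how a cumulative distribution curve and its median can be computed. The vote shares below are made up, and the helper name is mine; this is not the code behind the chart.

```python
from statistics import median


def ecdf(values):
    """Return one (value, cumulative fraction) point per observation,
    sorted by value. Plotted as a step line, this is the empirical CDF."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Made-up winner vote shares (percent) for two election years.
shares_by_year = {
    2002: [55.0, 62.0, 70.0, 81.0, 100.0],
    2012: [51.0, 57.0, 63.0, 72.0, 90.0],
}

for year, shares in sorted(shares_by_year.items()):
    curve = ecdf(shares)
    # The median is where the curve crosses the horizontal 50% line.
    print(year, median(shares), curve[0], curve[-1])
```

A curve that sits further left at the 50% line has a lower median, which is the year-on-year movement to look for.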
But gerrymandering is a complex operation, and we’ll need even more nuance to see the full effect. The most common strategy is known as pack-and-crack. Opposition strongholds are cracked, distributing voters among favorable districts to eliminate strong, coordinated opposition. Small groups of opposition voters are packed into a single district to eliminate their effect on polls elsewhere. I’m not a political scientist, but my guess is that the net result would be to decrease the vote share of incumbents (who are more likely to win anyway) and increase the vote share of new representatives (who are carried into office on the back of a gerrymander).
To get a clearer idea of this I split the graphic into two charts, one limited to incumbents and the other showing freshmen representatives (e586a87690). I also only plot the years before and after redistricting, since these should most effectively show the gerrymander.
Look at the second chart, showing non-incumbent representatives. Note that in elections before redistricting, they tend to be elected with 55% of the vote. After redistricting, the median freshman representative is elected with 58% of the vote. Gerrymandering is alive and well in the United States.
I also split it up by party but there wasn’t a discernible trend. I think it would be interesting to split based on control of the state government, since the states are ultimately responsible for redistricting. Perhaps I’ll look into that in the future, but for now I want to focus on the shutdown.
If gerrymandering isn’t enough to explain the political mess, we need to expand our search. One of my influences while working on this project was an xkcd cartoon showing the history of Congress. Go take a look at it; it’s awesome.
In the fine print, the cartoon notes that its data comes from a group of political scientists: Poole, Rosenthal et al. These folks have a two-dimensional coordinate system for mapping congressional representatives called DW-NOMINATE. In the modern era, the first dimension roughly corresponds to the liberal-conservative spectrum of politics. One advantage of their model is that it is based solely on voting records, and thus this liberal-conservative axis is an emergent property, rather than being specified in advance. That lends some credibility to their analysis. I’d recommend looking into their work; it’s very interesting.
Fortunately, they make available datasets of their supercomputer-crunched coordinate system for every Congress throughout history. Since senators and representatives serve overlapping terms, they can put successive Congresses on the same scale (allowing for individuals to evolve their views, i.e., move their position). So I built a chart showing the evolution of this liberal-conservative dimension in the modern era. My first take simply plotted every individual member of the House (9def35ddee). As you can see, this is pretty tough to read meaningfully, and it’s slow to render, too.
The natural thing is to do a little statistical mumbo-jumbo. It makes sense in particular to smooth this dataset, since we’re not all that convinced that the individual scores are exactly right. A standard box plot shows the median and the 25th and 75th percentiles, with whiskers marking an outer pair of percentiles such as the 1st and 99th or the 5th and 95th. A natural extension of the box plot over many time periods is an area chart like the one here. This is somewhat inspired by the ideas of Stephen Few.
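The numbers feeding a chart like that are just a handful of percentiles per party per Congress. Below is a minimal Python sketch of the summary step. The 5th/25th/50th/75th/95th choice, the row layout, and the sample scores are all assumptions for illustration; they are not real DW-NOMINATE values or the code behind the chart.

```python
from collections import defaultdict
from statistics import quantiles


def band(scores):
    """Summarize one group's scores as 5th, 25th, 50th, 75th and 95th
    percentiles. Needs at least two scores."""
    # quantiles(n=20) returns the 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(scores, n=20, method="inclusive")
    return {
        "p5": cuts[0],
        "p25": cuts[4],
        "p50": cuts[9],
        "p75": cuts[14],
        "p95": cuts[18],
    }


def bands_by_group(members):
    """Group (congress, party, score) rows and summarize each group."""
    groups = defaultdict(list)
    for congress, party, score in members:
        groups[(congress, party)].append(score)
    return {key: band(vals) for key, vals in groups.items()}


# Made-up first-dimension scores for a single Congress.
sample = [
    (112, "D", -0.6), (112, "D", -0.4), (112, "D", -0.2),
    (112, "R", 0.3), (112, "R", 0.5), (112, "R", 0.7),
]
for key, summary in sorted(bands_by_group(sample).items()):
    print(key, summary["p50"])
```

Drawing the p25 to p75 range as a dark band, the p5 to p95 range as a lighter one, and the p50 values as a line gives one box plot per Congress, joined along the time axis.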
Here we see quite plainly a growing divide between the parties. For three decades, between 1952 and 1982, some Democrats were as conservative as the bulk of Republicans, and on several occasions more conservative than the median Republican. For sixty years, from 1930 to 1990, there was significant overlap in the ideologies of the members of the two parties. Since 1976, as the Democrats have grown only modestly more liberal, the Republicans have seen a steady, significant rightward drift.
And so today the difference between the median position of the parties is larger than it’s ever been in the modern era.
Check out the interactive version here: http://couchand.github.io/polarization.