Design Notes on the Health Coverage Viz

I enjoy reading behind-the-scenes discussions of the design process, such as this one from Jerome Cukier, so here’s the background for my submission Who Shouldn’t Receive Health Coverage? to the Tableau Interactive Political Viz Contest. Congratulations to Adam McCann for his NYTimes-style electoral map and statistical model visualization!

Inspiration

This summer I read a blog post by Matt Miller at the Washington Post remarking that the number of people without health coverage in the US is roughly equal to the combined populations of Alaska, Arkansas, Connecticut, Delaware, Hawaii, Iowa, Kansas, Kentucky, Maine, Mississippi, Montana, Nebraska, Nevada, New Hampshire, New Mexico, North Dakota, Oklahoma, Oregon, Rhode Island, South Dakota, Utah, Vermont, West Virginia, and Wyoming. That’s half the states in the country, 49.9 million people as of the 2010 census – 48.6 million in the latest count, just out this week.

The total population of all the red states is equal to the number of people without health coverage in the US.

My thought was to create a visualization that would generate a cognitive dissonance in the user, a dissonance between our highest ethical values of caring for one another, of loving thy neighbor, of treating one another as we would want to be treated, and the moral choices inherent in our socioeconomic system that threaten people with the loss of their job and their own health coverage if they take a day off to take care of a loved one, that shorten lives through lack of primary care coverage, that curtail freedom by keeping people in awful jobs for fear of losing coverage due to a “pre-existing condition”. By having the interactor actually make the choices about who gets health coverage and who doesn’t, perhaps they’d consider their own position.

Rather than frame the choice of who gets health coverage and who doesn’t around states, I thought demographic breakdowns would ultimately be more compelling, since all of us can put ourselves in any number of those categories. By offering the user combinations of gender, age, race, income, ethnicity, sexuality, and religion to get to the ~50 million without coverage, the dissonance can be heightened. Having those choices represented by familiar people icons on a map of the lower 48 (sorry, Alaska and Hawaii, Tableau wants to put you too far away) connects the user to place and quantity:

In addition, the exceptional nature of our healthcare system is highlighted in the comparison bar chart that shows how many millions of people are excluded from coverage in the US while other notable countries ensure universal coverage:

The Data

The real question here was the dimensionality of the data, since I wanted the user to be able to pick and choose among the different demographic categories. I looked around for data sets that combined the demographic categories and couldn’t find anything freely available that met my requirements. I started diving more deeply into the 2010 Census data and ran into two limitations: time and size. I didn’t have the time to put together the aggregations I’d need, and in any case Tableau Public is limited to 100,000 rows. The Census Bureau puts out a wide variety of briefs on different dimensions (race, age, religion, etc.), but each covers only a single dimension, or at most a pair like age and sex, and doesn’t cut across the others. In addition, some demographics, such as single moms and dads, are only reported at a household level. So whatever I did, I was going to have to do some reshaping of the data.

I contemplated creating a solution that would pick the “best” set of dimensional data for the chosen demographics in the view, but it seemed like that would be a massive amount of work for very little gain in accuracy in the final view. For example, I knew that on the map I wasn’t going to be able to display more than several hundred points, so the resolution was going to be one mark per 100K people or more. So, I went with the less-accurate option of gathering data for each category as a proportion of the whole US population. Where data was reported at a household level, I used the data sources to determine the average size of the household in order to get that back to individual totals, and put together an Excel spreadsheet:

And here’s the compiled data: health coverage.xlsx
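As a rough illustration of the household-to-individual conversion described above, here is a minimal Python sketch (the original work was done in Excel). The function names and the example household figures are hypothetical; only the 2010 Census total population is a real number.

```python
# Sketch only: converts a household-level count into an individual-level
# share of the US population. Names and sample inputs are hypothetical.

US_POPULATION_2010 = 308_745_538  # 2010 Census resident population


def individuals_from_households(households: int, avg_household_size: float) -> int:
    """Estimate individuals by scaling a household count by average household size."""
    return round(households * avg_household_size)


def share_of_population(individuals: int, total: int = US_POPULATION_2010) -> float:
    """Express a category as a proportion of the whole US population."""
    return individuals / total


if __name__ == "__main__":
    # Hypothetical category: 10 million households averaging 3.0 people each.
    people = individuals_from_households(10_000_000, 3.0)
    print(people, f"{share_of_population(people):.2%}")
```

Treating every category as a share of the whole population keeps the spreadsheet small, at the cost of ignoring how the categories overlap.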

The Selector and The Cross Product

The plan was to have a section of the viz where the user could pick a set of demographic categories. What I really wanted was a bunch of checkboxes with some buttons, but Tableau doesn’t work that way. For this kind of interactivity, Tableau offers Parameters, Quick Filters, and Action Filters. I explored Parameters a bit, but they didn’t work because there are no “checkbox” style parameters and no multi-select parameters. Neither Quick Filters nor Action Filters were ideal because they reduce the data available, when what I really wanted was for the user to pick the demographic categories in one part of the view and have that drive the display in the other part.

That led me to think about the map. For Tableau to have the data to draw up to 500 points, it was going to need 500 rows available no matter what filters were chosen. That meant a cross product, using Custom SQL, between the points and the demographics. I decided to use an Action Filter based on a worksheet, both for performance and because it’s more controllable in terms of formatting and which worksheets it can act on. Action Filters also have the advantage of letting the user click and drag to select a number of rows at once.
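The cross product itself was built with Custom SQL that isn’t shown in this post. As a stand-in, this Python sketch pairs every map point with every demographic category, so that any filter on categories still leaves a full set of points. The field names and placeholder data are assumptions for illustration.

```python
from itertools import product


def cross_product(points, categories):
    """Pair every map point with every demographic category (one row per pair)."""
    return [
        {
            "point_id": p["point_id"],
            "lat": p["lat"],
            "lon": p["lon"],
            "demographic": c["demographic"],
            "category": c["category"],
        }
        for p, c in product(points, categories)
    ]


if __name__ == "__main__":
    # Placeholder inputs: 500 points and 70 categories, as in the viz.
    points = [{"point_id": i, "lat": 0.0, "lon": 0.0} for i in range(1, 501)]
    categories = [{"demographic": "demo", "category": f"cat{i}"} for i in range(70)]
    rows = cross_product(points, categories)
    print(len(rows))  # 500 * 70 rows
```

Because each point appears once per category, filtering down to any non-empty set of categories still returns all 500 points, with duplicates that have to be dealt with later.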

Making the Map

After initially looking into working from census data, or a random batch of zip codes, I did the following:

  1. In Excel, generated 1200 random latitude/longitude combinations between 24 and 49 degrees latitude and -67 to -124 longitude.
  2. Then I loaded that into Tableau.
  3. Manually excluded from the view all the points that were over water, in Mexico or Canada, and then the extra points to get to 500.
  4. Duplicated the worksheet to a crosstab.
  5. Exported the crosstab back to Excel where it could be used as a data source in the cross product.
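Step 1 was done in Excel, but the same idea can be sketched in Python: draw uniform random coordinates inside the bounding box given above. The function name and the seed parameter are hypothetical, and the manual clean-up in step 3 (removing points over water or outside the US) is not reproduced here.

```python
import random

# Bounding box from step 1: roughly the lower 48 states.
LAT_RANGE = (24.0, 49.0)
LON_RANGE = (-124.0, -67.0)


def random_points(n, seed=None):
    """Return n (latitude, longitude) pairs drawn uniformly from the bounding box."""
    rng = random.Random(seed)
    return [
        (round(rng.uniform(*LAT_RANGE), 4), round(rng.uniform(*LON_RANGE), 4))
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Generate more points than needed, since many will be discarded by hand.
    for lat, lon in random_points(5, seed=42):
        print(lat, lon)
```

Generating 1200 candidates for 500 final points leaves plenty of room for the points that fall in the ocean, Mexico, or Canada.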

Color and Shape

Color-wise, I tried out a number of combinations and settled on red, white, and blue, partly because those are the colors of the US flag, but mostly because red is the color of the Red Cross, blood, and the Republican Party, with blue as a nice counterpoint. I also tried a number of shapes and built some custom shapes before settling on the familiar Tableau unisex icon.

Table Calculation Goodness

With 70 demographic categories and 500 points, that’s 35,000 rows. Since the calculations to determine the intersection of the chosen demographic categories needed to run at the level of the demographic and demographic category while the points were in the view, that called for table calculations.

Five table calculations really drive the dashboard, and two of them are filters. The Map view uses a filter with a nested table calc four levels deep to calculate the total number of points to show based on the user’s set of demographic category selections. Since the user is likely to have picked more than one category, a second table calc filter gets rid of all the duplicate marks created by the cross product. In addition, there are other calculations based on those to show legends and labels. Keeping track of the Compute Using settings was rather nightmarish at times as I was tweaking the view.
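The actual Tableau table calculations aren’t reproduced in this post, so the following Python sketch only imitates the effect of the two filters: work out how many marks to show, then keep one row per point. It assumes the total number of people selected has already been computed (how the categories are intersected is left out), and it uses one mark per 100,000 people with a cap of 500 marks. The function and field names are hypothetical.

```python
def visible_marks(rows, selected, total_people, people_per_mark=100_000, max_marks=500):
    """Imitate the two filters: limit the mark count, then drop duplicate points.

    rows         -- cross-product rows with "point_id" (1-based) and "category"
    selected     -- set of category names the user picked
    total_people -- people covered by the selection (computed elsewhere)
    """
    n_marks = min(max_marks, round(total_people / people_per_mark))
    seen = set()
    visible = []
    for row in rows:
        if row["category"] not in selected:
            continue  # removed by the category selection
        if row["point_id"] > n_marks or row["point_id"] in seen:
            continue  # beyond the mark count, or a duplicate of a point already kept
        seen.add(row["point_id"])
        visible.append(row)
    return visible


if __name__ == "__main__":
    rows = [
        {"point_id": p, "category": c}
        for p in range(1, 501)
        for c in ("cat_a", "cat_b", "cat_c")
    ]
    marks = visible_marks(rows, {"cat_a", "cat_b"}, total_people=12_300_000)
    print(len(marks))  # one mark per 100,000 people selected
```

With this cap, any selection above 50,000,000 people fills the map with all 500 marks and then stops changing.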

Map Legend

There’s all this white space on the map (ok, the white space is Canada, home to coins named after my favorite bird), and given the quantity of text and data in the view, I thought I’d see what I could do to just draw another mark on the map and label it. I tried dual axes with latitude and longitude as calculated fields and got a lot of duplicate marks. Then I tried blended data, and that didn’t work. Then I got a great assist from Shawn Wallwork and went with adding a point to the returned data, actually 70 of them, one for each demographic category, because that was easiest to work with the filters and table calcs. Some additional calculated fields could then identify the mark for coloring and labeling.

Annotating the Bar Chart

The bar chart had a big column for the US results and a series of zeros for the UK, Canada, and Mexico, all countries that have universal health coverage. That led to a lot of white space there, and I thought it would be good to add some informational annotations to the bar chart as different selections were made. For example, the Congressional Budget Office predicts that even with the Patient Protection and Affordable Care Act of 2010, 23 million people in the US would still not have health coverage. Area and Point Annotations wouldn’t work for these kinds of messages because they don’t have access to all the calculations in the view, and Mark Annotations fail because each time the view updates a new Mark is drawn. I ended up using a dual axis to draw an invisible Gantt Bar, since Gantt Bars give the most control over where the Mark Labels go. The Size of the invisible bar also changes depending on how many people are selected, which moves the Mark Label.

I discovered a Tableau bug along the way. I’d alias the axis heading of the invisible bar to “ ” (space) to white it out, and then when Tableau refreshed the view (or uploaded to Public) it would replace that with the name of the calculated field. That’s why there’s the “—” in the lower right-hand corner of the Bar Chart.

Magically Multiplying Action Filters

At this point I was in the homestretch, except for an issue I still haven’t figured out: the Action Filter on the dashboard would magically multiply instances of the Filter Sets on the Filter Shelves in the worksheets. This would affect one or more of the table calculation results, and then I’d be trying to get rid of the Filter, rebuild it, and make sure all the calculations still worked. To use the Maine dialect, that was wicked painful.

My Opinion of the Viz

There are a few things that aren’t working as well as I’d like:

  • The population calculations could definitely be more accurate. For example, almost nobody under age 65 is eligible for Medicare, but the selector would allow the user to pick minors on Medicare.
  • I’d intended for there to be one unisex icon per 100,000 people, and then if the user selected more than 50,000,000 people the map would change in some way, such as using a choropleth view to change the background. However, after the dual-axis mess I ran into trying to get a legend on the map, I gave up. The map only draws 500 points, and it’s fairly easy for the user to make a set of selections that goes over 50,000,000 people and fills up the map, and then not much happens from there.
  • I ran out of creativity in terms of tooltips over the map, and just got rid of all of them.
  • The table calcs can be slow, especially if the user clears everything from the Action Filter. There are likely some optimizations that can be done there.

All in all, though, I’m pretty happy with the viz. I’m a little proud of the particular hack of figuring out how to get Tableau to use filtered data along two dimensions (demographic and category) to perform a complex series of calculations to generate potentially hundreds of records along a different dimension (the map points). I’m also hoping that now that the viz is in the wild it might trigger some new thoughts about the state of health coverage in this country.

Thanks for reading, and as always comments are welcome!
