Census 2010

Housing vacancy rate in Wisconsin, 2010 | Jan Willem Tulp

What works

The “Ghost Counties” interactive visualization by Jan Willem Tulp that I review in this post won the Eyeo Festival challenge at the Walker Art Center last year. The challenge, set forth by the Eyeo Festival committee in 2011 (for the Festival happening in 2012), was to create a visualization of Census 2010 data that did not rely on maps…or, if it did rely on maps, used them in a highly innovative way. This is an excellent design program – maps are over-used. Yet it’s one thing to assert that maps are over-used and another thing to produce an innovative graphic representation that is not a map.

Tulp does a great job of leaving the map behind. He also does a phenomenal job of incorporating a large dataset (8 MB of data serve the images in the interactive graphic from which the stills in this post were captured). The graphic has a snappy response time once it has loaded, and his work makes a solid case for the beautiful union of large data and its clear representation.

The color scheme is great and reveals itself without a key. Counties with low vacancy are teal, those in the middle are grey-green, and those with high vacancy are maroon. The background is light, but not white. White would have been too stark – like an anesthetized space. He experimented with darker backgrounds (see his other options at his Flickr stream here) but those ended up presenting an outer space feel. The background color he settled on was (and is) the best choice. Background colors set the tone for the entire graphic, along with the font color, and Tulp’s work is positive evidence of the value of carefully considering them.

Pie charts might be better than circles-in-circles

The dot within a dot is difficult for the eye to measure. Pie charts – which I only recommend if there are very few wedges – would have worked well with this type of data because there are only two wedges (see here for an example of a two-wedged pie chart). I just finished reading Alberto Cairo’s important new book The Functional Art, and he had a solid critique of the circle-in-circle approach that helped me realize what’s so appealing, but just plain wrong, about circles-in-circles:

“Bubbles are misleading. They make you underestimate difference…. If the bubbles have no functional purpose, why not design a simple and honest table? Because circles look good.” (emphasis in original)

In this case, a wedge in a pie chart could have represented the percent of total housing units occupied.
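To make the comparison concrete, here is a minimal sketch in Python. The housing counts are invented for illustration (they are not from Tulp's dataset), and the sketch assumes a circle-in-circle that encodes the share by area, which is an assumption about that kind of chart rather than a confirmed detail of Tulp's encoding. The function names are hypothetical.

```python
import math

def occupied_share(occupied_units, total_units):
    """Fraction of housing units that are occupied (1 minus the vacancy rate)."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return occupied_units / total_units

def pie_wedge_degrees(share):
    """Angle of the pie wedge that encodes the share."""
    return 360.0 * share

def inner_radius_ratio(share):
    """Inner radius divided by outer radius when the share is encoded as area."""
    return math.sqrt(share)

# Hypothetical county: 10,000 housing units, 7,500 of them occupied.
share = occupied_share(7_500, 10_000)
print(f"occupied share: {share:.0%}")
print(f"pie wedge: {pie_wedge_degrees(share):.0f} degrees")
print(f"inner/outer radius: {inner_radius_ratio(share):.2f}")
```

With these made-up numbers, a quarter of the units are vacant. The pie shows that as a quarter of the circle, while the area-scaled inner circle reaches roughly 87 percent of the outer radius, leaving only a thin ring to stand for the vacant quarter.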

Why is it so hard to ‘see’ rural vs. urban?

The x-axis is a log scale for population size. Given what we know about the general trend towards urbanization, we would expect urban areas to have lower vacancy rates than rural areas. Even in 1990 – two censuses before the 2010 data that was used here – the New York Times ran a story about the population decline in rural America and there has been widespread coverage of the trend towards urbanization by both journalists and academics (the LSE Cities program does nice work).
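For readers less familiar with log scales, here is a rough sketch of how a log axis places places of different sizes. The populations and axis limits are hypothetical round numbers, not values from the Census file, and the function name is my own.

```python
import math

def log_axis_position(population, min_pop, max_pop):
    """Position from 0.0 (left edge) to 1.0 (right edge) on a log10 axis."""
    if min(population, min_pop, max_pop) <= 0 or min_pop >= max_pop:
        raise ValueError("populations must be positive and min_pop < max_pop")
    span = math.log10(max_pop) - math.log10(min_pop)
    return (math.log10(population) - math.log10(min_pop)) / span

# Hypothetical populations on an axis running from 1,000 to 1,000,000.
for pop in (1_000, 10_000, 100_000, 1_000_000):
    position = log_axis_position(pop, 1_000, 1_000_000)
    print(f"{pop:>9,} -> {position:.2f}")
```

Each tenfold jump in population moves a point the same distance along the axis, so small places are spread out and the biggest ones are compressed toward the right edge.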

Housing vacancy rate in Minnesota, 2010 | Jan Willem Tulp
Housing vacancy rate in New York, 2010 | Jan Willem Tulp

The two states shown here – New York and Minnesota – both have some big cities and a whole lot of small cities in rural areas. Some small cities are also in suburban areas. That’s a problem with this visualization: the distinctions that have been established in academic literature between rural, suburban, ex-urban, and urban are difficult to pick out of this visual scheme. While it would be difficult to find a sociologist who could wrangle the data to produce this kind of visualization, I imagine many of my intellectual kin would be confused by this visual scheme and demand to return to a map-based graphic because at least in that case they could see patterns associated with the rural-urban spectrum the old-fashioned way. I am not wedded to the notion that a map is the only way to “see” the rural-urban spectrum, but the current configuration makes it difficult to think with the existing literature about housing patterns even though an attempt to distinguish counties by population size was built into the graphic on the x-axis. Population size is not always a great proxy for urban vs. rural, so it is a weak operationalization of spatial concepts social scientists have found to be meaningful. For instance, a small, exclusive ex-urban area filled with wealthy folks and their swimming pools is conceptually much different from a small, depopulating rural town even if they have roughly similar population sizes.

It is important in a research community to build on good existing work and reveal the weaknesses of existing work where it’s falling short. Either way, it is a bad idea to ignore existing work. Where a project does not relate to existing work – neither building momentum in a positive direction nor steering intellectual growth away from blind alleys – it will likely become an orphan. In this case, the project is only an orphan with respect to urban scholarship. As a computational challenge, it most definitely advanced the field of web-based interactive visualization of large datasets. As a visual representation, it adhered to a design aesthetic that I would like to see more of in academic work. But as a sociological analysis, it’s nearly impossible to ‘see’ clearly or with new eyes any of the existing questions around housing patterns. It is also my opinion – and this is far more easily contested – that it does not raise new important questions about housing patterns in urban, suburban, or rural America either.

My critique here is not that all data visualization is pretty but useless and that we should stick to our maps because they tie us to our existing disciplines and silos of knowledge. Rather, my critique is that in order for data visualization to become a useful tool in the analytical and communication toolkits of social scientists, the work of social science is going to have to find a way into the data visualization community. As anyone who has tried to use Census data knows, looking at piles of data is not synonymous with analysis. While Tulp’s graphics certainly present an analysis, that analysis seems to have turned its back on a fairly sizable swath of journalism on urbanization, not to mention the hefty body of academic work on the same set of topics.

Graphic Sociology exists in part to find a way to keep social scientists motivated to produce higher quality infographics and data visualizations than what is currently standard in our field. But the blog is equally good for sharing a social scientific perspective with computer scientists and designers who are ahead of us with respect to the visual analysis and display of social data. There is a way to bring the strengths of these fields together in a meaningful, positive way. We are not there yet.


Cairo, Alberto. (2013) “The Functional Art: An introduction to information graphics and visualization.” Berkeley: New Riders.

Eyeo Festival.

Tulp, Jan Willem. (2011) “Ghost Counties” [Interactive Visualization] Submitted to Eyeo Festival and selected as the winner in 2012.

Growth of the population of Hispanic and Asian children in America, 2000-2010 | Wall Street Journal

What works

It’s nice to see all of the Census 2010 data coming out and generating infographics. This one comes from the Wall Street Journal, which distilled the above panel of stills from an interactive graphic that also has maps for white and black kids and detailed tables by race and geography.

Though the two stills here do not do a good job of demonstrating the claim in the headline, that there are fewer white kids, the bar graph on the right and the interactive graphics do, in fact, back up the headline claim. We could quibble about the flipside to the headline – rather than saying there are fewer white kids, should it have pointed out that there are more Hispanic and Asian kids? – but quibbling about headlines isn’t my concern here. Other news outlets did take that spin on the same set of information.

What I like here is that the graphs did not try to show everything all at once – each of the four racial categories included in this series gets its own graph. Yes, there are more than four racial categories and yes, it would be nice to see where other racial categories fit. But inasmuch as I am concerned with the overuse of mapping data, especially when those maps get layered up with all sorts of information that makes them illegible, I am happy to report that these folks had the common sense to generate one map for each of the racial categories they decided to depict.

One of the incidental facts portrayed here is that the country continues to tip towards the southwest. The big red ‘decrease’ blobs appear in the northeast for whites and blacks and are not compensated for by blue ‘increase’ blobs among Hispanic and Asian births. Because I wouldn’t necessarily have picked this up from looking at a table, I think it’s fair to say that the use of maps was justified in this case because at least part of the story is geographic in nature.

What needs work

I have a tough time with the blob maps. I can get an overview but I have a tough time doing additions, let alone additions and subtractions. The bar graph that appears in the stills helps present the same information in a different way. In this case, the maps can only display the big picture. The bar graph is necessary to help understand how all these blobs add up. In particular, the top graph shows a large increase in the number of mixed-race kids by percentage, but this group is still so small that the absolute numbers wouldn’t even register on the blob maps.
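The gap between percentage growth and absolute growth is easy to see with a little arithmetic. The sketch below uses invented counts, not figures from the Wall Street Journal graphic or the Census, and the group labels and function name are placeholders.

```python
def change(before, after):
    """Return (absolute change, percent change) between two counts."""
    if before <= 0:
        raise ValueError("before must be positive")
    absolute = after - before
    percent = 100.0 * absolute / before
    return absolute, percent

# Hypothetical child counts at two censuses: (earlier, later).
groups = {
    "small group": (100_000, 150_000),
    "large group": (10_000_000, 10_500_000),
}

for name, (before, after) in groups.items():
    absolute, percent = change(before, after)
    print(f"{name}: {absolute:+,} children ({percent:+.0f}%)")
```

In this made-up case the small group grows by 50 percent but adds only 50,000 children, while the large group grows by 5 percent and adds 500,000. A map scaled to absolute numbers would barely register the first group, which is why a percentage bar graph is a useful companion.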

Food for thought

The second, vertical, bar graph is my favorite part because it ties all of the rest of the information together. We see that white kids still make up more than half of the children born in the US, though it appears that this may not be the case in 2020. We see most clearly that the number of Hispanic kids is growing faster than that of any other category of kids. I’m going to take this moment to note that Hispanic-ness is an ethnicity, not a race, and that many Hispanic kids are considered white. Remember that Central and South America were colonies of Spain and Portugal and we tend to consider Spanish and Portuguese people white. I’m not prepared to get into a discussion about what it takes to be white in America, just pointing out that Hispanic people are, in many cases, racially white even though they may consider themselves to be ethnically Hispanic. It is possible to hold both of those identities at the same time. Furthermore, if we look back in history there was a time when Irish and Italian immigrants were considered non-white. I have wondered if today’s Hispanics are similar to yesterday’s Irish and Italian immigrants in the sense that they will eventually come to be seen as white ethnics.

This is a debate I’m hardly qualified to comment on and I welcome others who are more qualified to take up this issue in the comments. In particular, I’m wondering how the numbers matter. If there are more and more Hispanics born in the US, will that mean that they are not under pressure to assimilate to mainstream white-ness and will have more opportunities to maintain a distinct identity? Or will the decreasing number of white folks mean that there is pressure to recruit new populations into the white identity as part of our one-drop anti-black legacy? I don’t know what this all means, but I do feel like the numerical balance is meaningful.


Frey, William H. (2011) Brookings Institution analysis of 2010 Census Data.

Dougherty, Conor. (6 April 2011) New Faces of Childhood: Census Shows Hispanic and Asian Children Surging as Whites, Blacks Shrink. Wall Street Journal.