Archive: 2009

Example of a wordle

What Works

Wordles are generated by feeding a block of text into an algorithm that filters out common words like ‘the’ and ‘they’ and then picks out frequently used but relatively uncommon words like ‘Mexico’ and ‘resistance’. These images get used occasionally on academic websites where people want an image to represent a talk or paper but have no images that go with it, only an abstract, transcript, or full text. In that sense, a Wordle is a relevant image to stick in the hole reserved for images. In another sense, I encourage you to go out and find public use images that are available, relevant to the talk/paper, and intellectually provocative rather than creating a Wordle.
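For the curious, the filtering-and-counting step can be sketched in a few lines of Python. This is a minimal sketch and not Feinberg's actual algorithm: the tiny stop-word list, the sample sentence, and the function name word_weights are all assumptions made for illustration, and a real Wordle also handles layout, font sizing, and color.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; real tools use much longer ones.
STOP_WORDS = {"the", "they", "a", "an", "and", "of", "to", "in", "is", "that", "it", "was"}

def word_weights(text, top_n=5):
    """Return the top_n most frequent non-stop-words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Made-up sample text, used only to exercise the function.
sample = "The resistance in Mexico was the story: they said Mexico and the resistance, Mexico again."
print(word_weights(sample, top_n=2))
```

The counts would then drive the font size of each word in the image, which is exactly where the context of the original writing gets lost.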

What Needs Work

Wordles appear to tell you something you didn’t know by revealing patterns. In fact, Wordles are bizarre artifacts that sit somewhere between images and text. As images, they are fairly ugly agglomerations. As text, they don’t make much sense. In fact, I think Wordles are excellent when referred to as icons of the dystopic side of instantly available, decontextualized factoids that anchor the downside of the internet era. With but a click a block of carefully crafted (well, maybe it was carefully crafted) writing is blown apart and reconstructed as a brightly colored lettered blob that is somehow supposed to indicate the essential components of the piece of writing. A bit insulting to the person who wrote the text, if nothing else. A good abstract or even list of works cited says more than a Wordle in a clearer fashion.

Relevant Resources

Jonathan Feinberg of IBM Research created the Wordle Generator.

key

Web Map of thesocietypages.org

What Works

This is a map of a website.

Let’s reflect on that seemingly straightforward sentence for a moment. This is a map of something that does not exist in space. Baudrillard comes to mind here: “Abstraction today is no longer that of the map, the double, the mirror or the concept. Simulation is no longer that of a territory, a referential being or a substance. It is the generation by models of a real without origin or reality: a hyperreal.” (Baudrillard, Simulacra and Simulations) I’ll let you decide whether or not you want to accept the notion that there is such a thing as the hyperreal without any further digression down that rabbit hole.

The visual elegance here cannot be overstated. It’s a simple non-Cartesian network map with absolutely no frills or labels, nothing besides a hint of color. As a graphic, what works here is that, if you happen to have a basic understanding of how websites are built, you can quickly see what kind of site you’re looking at. Lots of blue means lots of links, lots of green means the designer is using a lot of CSS, lots of red (tables) is kind of old-school (not in a good way), and so on. But it does require some knowledge of how websites are put together to decode this representation. That being said, it’s a brilliant way to reveal the skeleton supporting the visual skin of the websites you visit. See the links at the bottom to be taken to the applet that will allow you to map out the structure of any site you like.
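To make the color logic concrete, here is a rough Python sketch that tallies a page's tags by the color they would get. It is not the applet's code: the tag-to-color groups are my assumptions based on the description above (div stands in for CSS-driven layout), and the names TAG_COLORS and color_profile are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumption: tag-to-color groups inferred from the description above
# (blue = links, red = tables, green = div as a stand-in for CSS-driven layout);
# every other tag falls into the gray "other" category.
TAG_COLORS = {"a": "blue", "table": "red", "tr": "red", "td": "red", "div": "green"}

class TagColorCounter(HTMLParser):
    """Count opening tags by the color they would get in the map."""

    def __init__(self):
        super().__init__()
        self.colors = Counter()

    def handle_starttag(self, tag, attrs):
        self.colors[TAG_COLORS.get(tag, "gray")] += 1

def color_profile(html):
    """Return a dict of color -> number of tags for an HTML string."""
    parser = TagColorCounter()
    parser.feed(html)
    return dict(parser.colors)

# A tiny made-up page, used only to exercise the function.
page = ("<html><body><div><a href='#'>x</a><a href='#'>y</a></div>"
        "<table><tr><td>1</td></tr></table></body></html>")
print(color_profile(page))
```

A profile dominated by red would flag a table-heavy layout before you ever look at the source.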

Though this may not at first glance appear to have anything to do with my post earlier this week about John Snow, both Snow and the Aharef web-map generator represent tools for the examination of patterns. Pattern recognition is an undersung analytic tool in the social sciences.

What Needs Work

I wouldn’t mind a little more color to break out the grey “other” category. I would also love a color that indicated use of JavaScript and Flash, but I understand that would be a different technical hurdle altogether. If this kind of map could be combined with page traffic information, we’d really have an amazing graphic. Just imagine that the traffic following each link could be mapped, say by making the node larger or smaller based on flow (or we could stick with the color thing, and lighter hues would indicate less traffic while darker ones indicate more traffic). It would also be nice to get some meaning out of the distance between nodes. Right now that distance seems fairly arbitrary, constrained by the size of the viewing window.
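One way the traffic idea could work, sketched in Python under stated assumptions: the function node_radius, its default sizes, and the page-view numbers are all hypothetical. The sketch scales the node's area (not its radius) with traffic, since readers tend to judge circles by area.

```python
import math

def node_radius(visits, max_visits, min_r=2.0, max_r=20.0):
    """Interpolate the node's area linearly with its share of the busiest page's traffic."""
    if max_visits <= 0:
        return min_r
    share = max(0.0, min(1.0, visits / max_visits))
    # Work in squared radii so that area, not radius, is linear in traffic share.
    return math.sqrt(min_r ** 2 + (max_r ** 2 - min_r ** 2) * share)

# Hypothetical page-view counts, made up purely for illustration.
traffic = {"home": 1000, "about": 250, "contact": 0}
busiest = max(traffic.values())
for page_name, visits in traffic.items():
    print(page_name, round(node_radius(visits, busiest), 1))
```

The same share value could just as easily pick a lighter or darker hue instead of a size.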

Relevant Links

Generate your own webmap for any site

Original post about this applet tool by its creator Aharef on Aharef

Baudrillard, Jean. (1998) Simulacra and Simulations from Jean Baudrillard, Selected Writings, ed. Mark Poster.

USA Today Flash Animated Graphic accompanying the headline “Deaths Down on America’s Roads”

What Works

Nothing is working here, and I’m not just saying that because it’s Flash and I can’t repost it. Please link through for a hot minute and look at it anyway.

What Needs Work

My problem with this graphic is that it is ONLY a map of the US, except for the few seconds when you roll your mouse over it. Even then, you don’t end up seeing a pattern, you just see little pop up windows with some numbers in them. Information graphics need to artfully, intelligently, dare I say cleverly weave the information into the graphic so that the two become greater than the sum of their parts. None of that happens here.

The map of the US is still just a map of the US. No shading, no numbers, no way to tell that we’re talking about traffic deaths. Even just mapping out the interstate highway system would have given a hint of a visual clue to tell us what we’re talking about. In the previous post, Snow stacked bars to indicate dead bodies. Maybe it’s a little over the top, but if we are addressing the notion of a change in body count, I would like to see some visual representation either of bodies or of change (change is more abstract and probably more appropriate for USA Today than a visual representation of a body count). Furthermore, I want to know if there really is a relationship between gas prices and body counts, which *could* be explored by looking across states. States tax gas at different rates, resulting in price variations from one state to the next. Sensitively factor in income and unemployment and we might be able to get a sense of how much gas prices impact mortality on the roads. Even more interesting would be whether it’s the fact that people aren’t on the road at all that prevents them from dying out there (no gas = no go) or if it is somewhat more subtle – perhaps people drive slower to be more fuel efficient rather than staying home and it is the slowdown, not the no-go, that keeps people alive. The more likely scenario would also point out that cars continue to get safer and that seatbelt laws work. If we could look at the data over time, we’d have a better idea how much more quickly traffic fatalities dropped in 2008 than in other recent years, which would help factor in the cars-are-safer-now + more-states-have-seatbelt-laws effects.

This graphic falls woefully short of even hinting at any of these questions. I wish they had left it out altogether, forcing everyone to read the article in full.

John Snow - Mapping Cholera 1854

What Works

This is a combination of a map and a chart whose creation helped epidemiologists understand that cholera was not caused by a ‘miasma’ carried by the fog from the river, but rather was a germ carried in the water. It’s one of my personal favorite early examples of information graphics as a tool not of publication, but of analysis and discovery. Snow mapped the area around the Broad Street pump and then represented deaths with bars (not dots as some later cartographers have done when re-presenting Snow’s maps). The bars end up looking like stacked bodies, reinserting the gravity of the situation into the fairly sterile context of the map as info graphic.

The pattern is imperfect, but clear. The closer a household was to this well, the higher its mortality risk. The point of this entry is to encourage the use of information graphics not only in the publication stage of the research process, but also in the analysis stage. Granted, epidemiology isn’t a social science, but this is a classic example that sets the scene for contemporary examples of graphics as tools of analysis.

What Needs Work

There are other more comprehensive maps of the whole neighborhood that show the patterns even more clearly. What I have here is just a close up, probably a mistake on my part. The full version is here as a pdf. The romantic in me wanted to restrict this post to the original grainy, scanned map* drawn by Snow himself.

The realist in me notes that even though I believe the creation of information graphics can be used as an analytic tool, the story in the John Snow case isn’t a perfect fit. An article by Brody et al. in The Lancet points out that, “Snow developed and tested his hypothesis well before he drew his map. The map did not give rise to the insight, but rather it tended to confirm theories already held by the various investigators.” So Snow didn’t get his brilliant insight just by examining the map, but he did use the map as an analytical tool later in the process to help confirm his hunches. It wasn’t like he just threw the map/chart together to present at a conference or while he was writing up an article, which is how I feel many social scientists end up using info graphics.

*This version is actually the second version, though its main difference from the very first map is that the pump has moved just slightly off from the exact corner of Broad Street, closer to the house of 18 deaths.

Relevant Resources

John Snow website at UCLA School of Public Health where I found many maps.

Brody, H., M. R. Rip, et al. (2000) “Map-making and myth-making in Broad Street: The London cholera epidemic, 1854.” The Lancet 356 (9223): p64-68.

Problem Set at Princeton - Marriage Patterns in France from 1968 - 1987

What Works

This example comes, a bit unexpectedly, from a Princeton problem set for Research Methods in Demography. What works is that the gently swooping shape is elegantly intriguing – an eye grabber that gets more interesting the harder you look at it. There is something to be said for beautiful forms, but unless there’s substance, info graphics that are only beautiful disappoint like vinyl siding. The fact that this one happens to generate such a fetching shape that it has been repeated throughout branded America is a real triumph.

Each line represents one cohort. The slope indicates the coherence within that cohort to age of first marriage – the steeper the line the more quickly the entire cohort goes from being single to being married. Later cohorts produce flatter slopes, indicating that there is a wider spread across ages of first marriage. It’s also easy to see that the age of first marriage slowly creeps up over time.
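The two readings above (steepness as spread, rightward drift as rising age) can be put into numbers with a small Python sketch. The ages below are synthetic and invented for illustration only; they are NOT the French data in the chart, and the helper first_marriage_summary is a hypothetical name. A narrow middle-50% spread corresponds to a steep line, a wide one to a flat line.

```python
from statistics import median, quantiles

def first_marriage_summary(ages):
    """Return (median age, interquartile range) for one cohort's ages at first marriage."""
    q1, _, q3 = quantiles(ages, n=4)
    return median(ages), q3 - q1

# Synthetic ages, invented for illustration only; NOT the French data in the chart.
earlier_cohort = [20, 21, 21, 22, 22, 22, 23, 23, 24]
later_cohort = [21, 23, 24, 26, 27, 29, 31, 33, 36]

for name, ages in [("earlier", earlier_cohort), ("later", later_cohort)]:
    mid, spread = first_marriage_summary(ages)
    print(name, "median:", mid, "middle-50% spread:", spread)
```

In this toy example the later cohort has both a higher median and a wider spread, which is the pattern the chart shows as flatter lines sitting further to the right.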

Note the popularity of this shape elsewhere:

New York Philharmonic Logo

What Needs Work

It’s not clear just which line goes with which cohort. Sure, demographers and pop culturists alike know that age of first marriage has been increasing over time and will assume that the cohorts who marry later are, in fact, the later cohorts. But if we had different data that didn’t show such a smooth trend from year to year, it could be difficult to pick out which line represented which cohort. Say there was suddenly a $10,000 incentive attached to getting married by age 22. Get married before your 22nd birthday and a giant $10,000 check arrives. That would pull the collective age at first marriage back down, but in this chart that line would just get buried among the earlier cohorts, or so I would predict. In this case, I might have recommended adding a year marker to every fifth line or so, just to reassure me that the pattern is smooth over time.

In 1900 the median age at first marriage was 21.9 for women and 25.9 for men and then these ages dropped til 1957 when they started rising again. Just saying. Age at marriage doesn’t have to keep going up.

Relevant Resources

German Rodriguez (2006) Office of Population Research, Princeton University. Problem Set 4: Marriage in France Research Methods in Demography.

US Census Bureau (2004) Estimated Median Age at First Marriage, by Sex: 1890 to Present in table format

USAID map of the area in and around Darfur
BBC map of Gaza 4 January 2009

What Works

The best thing about the map of the camps around Darfur is that it exists at all. After looking at some of the elaborate maps that have been part of the news coverage of Gaza (see the one here, click through on the caption for a larger image) and earlier, of the bombings in Mumbai, I assumed I would be able to find something of similar quality related to the camps around Darfur to sate my curiosity about how big the camps are, where they are, how they are supplied, whether or not they are targets, and so on. But this map from USAID is one of the only things I could find around these interwebs that presented a basic map narrative of the camps in Darfur. Notably, I found many graphics promoting concerts that were fundraisers or awareness-raisers for the people in Darfur. Some of these concert posters and t-shirts got around the (apparently) tricky question of where Darfur is by just using an outline of the continent of Africa.

What Needs Work

The lack of a decent map-narrative around the problems in Sudan/Darfur indicates an uncomfortable fissure in the epistemology of crisis. I’m willing to conjecture that there may be an inverse relationship between perceived cultural differences and the production of ‘fact’ based information around crises. There isn’t an easy way to measure social/cultural difference, but it seems that the greater the degree of “otherness” of the people undergoing a crisis, the more likely the story is to be covered not with an onslaught of ‘hard facts’ that can be diagrammed, mapped, combed, regressed, permuted, computed, etc. but rather the story will be covered by emotive tools like first person narratives, photographs, and even awareness raising concerts, vigils, and that sort of thing.

I would love to hear what readers think about this theory of mine and I’ll continue to look for examples of differences in the use of information graphics across seemingly similar data sets.

Relevant Resources

BBC Map of Gaza Offensive – Week One (5 January 2009) with narrative time line.

NYTimes.com Israel and Hamas: Conflict in Gaza (4 January 2009) with narrative time line.

USAID map of camps in Sudan

USAID page on Sudan

The United States Holocaust Museum Mapping Initiatives Crisis in Darfur. This is a plug-in to Google Earth that layers photos, videos, quotes, and a bit of 2004 information about the camps on the Google Earth map of Sudan/Chad.

Piled Higher and Deeper - PhD Humor

What Works

Humor is a slippery animal, indeed. I like to think of it as the pinnacle of culture, not in a high culture kind of way, but in a cultural development kind of way. Just think of trying to learn a foreign language. When you can intentionally, subtly be humorous in that language, you know you’re really getting somewhere. If you have never gotten to that point in a foreign language, just listen to kids try to tell jokes. They kind of suck. You end up laughing along because they’re kids and kids telling jokes is funny in itself, not because what they are saying is actually humorous. This is a fairly long winded way to point out that one indicator that telling stories with graphics is thick culture (thanks, Geertz) is that things like the above image are actually funny in a way that they couldn’t be funny in another format. If you had to say to someone, “man, professors spend lots of time on service activities, but the administration really doesn’t reward that or even notice” nobody would laugh. They might sigh and wish the economy were better so they could find a job that didn’t involve sitting on committees.

Bottom line: this works because we have been immersed in graphic storytelling. We get it. It doesn’t work in any other format.

Relevant Resources

Piled Higher and Deeper, a comic strip by Jorge Cham online. If you are a student or professor and haven’t discovered this, I’ll warn you that it could suck away an hour or two of your day if you click through right now.

Higher Education Research Institute at UCLA. The HERI Faculty Survey. There are fees associated with accessing the data but you can get an overview of how data about faculty time commitments is gathered.

This 2006 Obituary of Clifford Geertz in the New York Times does a good job of summarizing his life and work, for those who want to follow up on my parenthetical. His book “The Interpretation of Cultures” is a good place to start. If you want something shorter than a book, “Deep Play: Notes on a Balinese Cockfight” is worth a read.

Link to Bigger Map of Remittances from US to Mexico

What Works

This map does a great job of demonstrating the granularity of the flow of remittances from particular cities in the US to particular cities in Mexico. It does a very good job of using a single characteristic – financial flow from the US – to illustrate a larger pattern of migration between sister cities in two countries. I talked to a restaurant chef-owner in New York on Monday and she said all of her cooks are from Puebla. This graphic could have told me about the same thing, though it wouldn’t have been able to tell me to look in the kitchen.

The graphic makes great use of color – picking one basic color for each country and increasing that color’s intensity to indicate concentrations of migration activity.

Credit for the article from which this was drawn goes to Raúl Hernández-Coss and credit for the graphic goes to Ryan Morris (I think, it’s really hard to read the fine print).

What Needs Work

Even in the bigger version of the graphic I can’t read the text in the boxes very well. I’m sure this looked good in print, but it didn’t translate well to digital. Still, even without being able to read the explanatory text, the basic point is obvious and legible.

Relevant Resources

Raúl Hernández-Coss. (2007) World Bank Working Paper 47 The U.S.-Mexico Remittance Corridor

Julie Watson (27 January 2009) Yearly Mexican Remittances Drop for the First Time in the Washington Post.

Matthew Quirk (2007) The Mexican Connection in The Atlantic. (This is where I first spotted the graphic)

Bigger Version of the Graph

Click Here to View the Animation by Aaron Koblin

What Works

Click on the link in the caption to go to Aaron Koblin’s site and watch the animation. It’s mesmerizing and I ended up watching it more than once, trying to pick out the patterns. And, in fact, what works about this approach is its ability to help quickly identify patterns. Generally speaking, data that is dynamic (usually the change is happening over time, as in this case) is data that may lend itself to this sort of pattern recognition analysis via visualization.

As you watch the whole visualization, you’ll see that Aaron Koblin experimented with three different ways of displaying the same data. He starts with impermanent white lines over a dark background, then globs of oil-ish substance over a white background, then he applies color to the original white-on-black version. I like the political implication of using the oily blobs – that is what we are collectively doing when we’re flying – burning up vast quantities of fossil fuels by using just about the least fuel-efficient form of transportation we’ve got. Vehicles for traveling outside earth’s orbit are even less efficient. I still think the white-on-black version works best because I couldn’t figure out what the colors represented.

I love the total flight counter and the running clock. Adds a great deal of contextual information very subtly.

What Needs Work

I think this animation does a great job of showing what it sets out to show – the flight patterns in the US over a 24 hour period. If there was an intention to include data about the environmental cost, I would have liked something that isn’t quite as subtle as showing the patterns using blobs of oil-like substance. But modeling that sort of data would be even more complicated than what was done here because it would count on knowing how big each plane was – jets use more fuel than smaller planes – and some estimate of how heavy it was – full flights use more fuel than empty ones.

I also wanted to know whether this represents all passenger and cargo flights or just passenger flights.

Relevant Resources

Aaron Koblin’s website and a link to the specific animation related to this post.

For more on globalisation, see Saskia Sassen who was interviewed about her work by John Sutherland at the Guardian in 2004.

For more on the relationship between aircraft and climate change see this slightly outdated 2001 report from the Intergovernmental Panel on Climate Change (UNEP)

NYtimes.com - College Endowments Loss Is Worst Drop Since ’70s
Remixed College Endowment Graphic

What Works

Stories like this one that cover a data driven report should always include an info graphic. But, of course, this is coming from an avid fan of info graphics. Kudos to the NYTimes for including a graphic and for including not only the punchline – the huge drop in college endowments in the very recent past months – but also some context about what college endowments were doing before. I would have liked even more context because fiscal year ’08 was already seeing some of the downturn in the market. Total movement for all endowments, or endowments divided into fewer categories, since ’00 would have been even better.

What Needs Work

It is intuitive to portray data that “drops” (according to the headline) or rises with the change along the y-axis. I did a little remix just to show you what I mean. In the first glance at the data, the increase or decrease is going to be more legible when it’s happening on the vertical axis. It’s just the way we learn to read charts and graphs. Before that, I suppose our tendency to associate the vertical axis with things rising and falling came from gravity. The laws of physics aren’t going to change – stick dropping/rising data on the y-axis until gravity causes changes in the x-axis.

The other thing I might have changed was the choice of categorization. What is gained by splitting the data into the uneven increments that appear here? First, increments should either be even or should have some reason for being uneven. We’ve got a $500m range, a $400m range, a $50m range…it’s all very unclear why these are the important categories, especially when there is no immediately obvious significant difference between them. They all seem to have been more or less flat in FY’08. Then they all plummeted ~21-22% between July and November of ’08. I would have opted for more historical context and fewer categories.

The Wall Street Journal is running basically the same story with a different graphic, though they still stick with the horizontal arrangement. I like theirs even less because they

Relevant Resources

Katie Zezima (27 January 2009) Data Show College Endowments Loss Is Worst Drop Since ’70s at the NYTimes.com

John Hechinger (27 January 2009) College Endowments Plunge Wall Street Journal Online

National Association of College and University Business Officers 2008 NACUBO Endowment Study Available for Purchase.