
Geography of twitter in a tree map. Graphic created using TreeMappa. Data gathered by Devin Gaffney; analysis by Monica Stephens and Mark Graham.

What works

1. Boxes-in-a-box diagrams aka treemaps achieve an efficient use of space without sacrificing granularity of information.

2. Intelligent grouping – all of the countries from each continent are grouped together in boxes that fit neatly into the perimeter of the overall boundary.

3. The color in the boxes does not obscure the labels. Also, the font size is well chosen.

4. The TreeMappa algorithm rigorously adheres to scale, which is critical for visual analysis. Scale communicates across language barriers, and that’s one of the reasons visual communication has advantages over text-only communication.

5. TreeMappa is free to use.

What needs work

1a. The legend is kept small, in line with the efficiency goals of the design, but it doesn’t explain what numerical value underlies the High, Medium, and Low internet users-to-tweets ratio. The blog post accompanying the graphic does not describe this ratio either, though I would imagine it is discussed in the as-yet-unpublished manuscript “Where in the world are you? Geolocation and language identification in Twitter” listed (but not linked) in the references. We’ll have to wait for formal publication.

1b. We also can’t tell what the scale is with respect to the activity comparisons between countries. Scale is extremely important for interpretation [see number 4 above]. It’s critical to include numbers in the legend so that viewers can calculate ratios. [For instance, I would like to know if they’re using a log or a linear scale but without a numerical legend I can’t tell…]

2. The biggest problem with this graphic is a problem I have been contemplating about many different information graphics: information graphics are consumed as hermetically sealed information objects that offer a kind of apolitical truthiness. Within the social science tradition – and within most scientific traditions – it’s incredibly important to make the messiness of research transparent. In this particular case, the blog author does an excellent job of representing the dubious validity of this research in the blog post that accompanies this image when he writes:

As a first step, we decided to collect all georeferenced tweets sent between March 5 and March 13, 2012. It is important to point out that georeferenced tweets comprise fewer than 1% of all tweets and it is possible that significant geographic biases exist in where and how people georeference their content.

So should we trust that the numbers above are representative of the actual geolocation of tweets? Well, we should only assume that this is a good representation if we believe that there is no systematic geographical difference between users who include geolocation data with their tweets and those who do not. I am not a twitter expert, but it’s hard for me to swallow the idea that users of twitter have the same attitudes about privacy and competence with privacy settings the world over.
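
To make that worry concrete, here is a minimal sketch in Python. Every number in it is made up for illustration: the two countries, their tweet volumes, and their georeferencing rates are hypothetical and do not come from the study. It only shows how unequal georeferencing habits could distort the shares a map like this displays.

```python
# Hypothetical numbers for illustration only; none come from the study.
# Two countries send the same number of tweets...
true_tweets = {"Country A": 1_000_000, "Country B": 1_000_000}
# ...but users in Country A georeference three times as often
# (assumed rates, expressed as georeferenced tweets per 1,000 tweets).
geotagged_per_1000 = {"Country A": 12, "Country B": 4}


def shares(counts):
    """Return each country's fraction of the total count."""
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()}


# Only georeferenced tweets are visible to the researchers.
geotagged = {
    country: true_tweets[country] * geotagged_per_1000[country] // 1000
    for country in true_tweets
}

true_share = shares(true_tweets)
observed_share = shares(geotagged)

for country in true_tweets:
    print(
        f"{country}: true share {true_share[country]:.0%}, "
        f"share among georeferenced tweets {observed_share[country]:.0%}"
    )
```

Under these assumed rates, both countries produce half of all tweets, yet Country A accounts for three quarters of the georeferenced ones. A graphic sized by georeferenced tweets would draw its box three times larger than Country B's.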

The tyranny of beauty

The bigger question for information graphics, though, is how can we ensure that the graphics themselves reveal their own messiness, incompleteness, and methodological underpinnings? If information graphics are to become legitimate components of (social) scientific practice, they need to find ways to include the kinds of doubts, disclaimers, and methodological difficulties that appear in the discussion section of academic papers.

I struggle with this immensely in the graphics I make. I’ve found that the designerly desire that graphics be beautiful in order that they communicate instantaneously through first impressions leads to a tyranny of aesthetics in which graphics that are deemed “good” are those that specifically avoid messiness and present a sanitized, sealed, image-as-object that deliberately obscures many of the problems that remain open questions. The graphic presents itself as an answer. In text, it is possible to differentiate between the elements of questions that are leaning towards answers. In photographs, interpretations can be meaningfully multiple. But in information graphics, the image is often so tightly bounded that it leaves no invitation to skepticism.

Boxes-in-a-box diagrams like the tree map above are a particularly clear illustration of the larger tension in which information graphics are asked to present clear and complex information at the same time that academic requirements ask that they make their messiness and unknowns obvious. The graphics were created in order to present information efficiently at first glance and then reveal granular detail upon further inspection. This is a worthy set of goals and the boxes-in-a-box tree map diagrams deliver on both of those goals. I would argue that those goals satisfy only one side of the problem – to communicate what is known in a clear, compelling fashion – leaving aside the notion that much remains unknown, that many other relationships have been left out, and that even the things we think we know rely on sound methodology which may or may not be possible. Social science research has always been blessed/plagued with the challenges of drawing meaning from incomplete, intersecting, and incommensurable information.

This is an issue I’ll continue to explore and I encourage both designers and social scientists to share thoughts about the benefits and drawbacks of ‘beautiful’ information.

References

Graham, Mark; Stephens, Monica; and Gaffney, Devin. (2012) “A Geography of Twitter” [blog post] Visualizing Data blog. Oxford, UK: University of Oxford, Oxford Internet Institute.

TreeMappa [free online graphic creation tool]

Figures 1 and 2 from "Who Gives a Tweet?" by André, Bernstein, and Luther CSCW paper

What works

A new study by researchers in Human-Computer Interaction and Social Computing, to be presented at CSCW in a couple of weeks, used 43,000 ratings of tweets to explain what content twitter readers find useful.

In short, worthwhile tweets:
1. Are informative NOT boring
2. Are funny
3. Are concise (even shorter than 140 characters!)
4. Are hyper-timely
5. Avoid whining and navel gazing (Tweets about meals past, present, or future are ‘boring’)
6. Avoid using too much twitter mark-up (@ replies, hashtags, multiple links)

The graphics do a good job of providing a visual overview of the study’s findings. With my brief textual synopsis and the two graphics here I bet many of you reading this will feel like there is no need to go read the study itself. Just in case that’s true, you should know that in the authors’ discussion section, they note that their raters were volunteers who were not randomly chosen and skewed towards the tech crowd. Perhaps there’s reason to believe that tech people would be more likely to appreciate informative tweets? Not sure. But I can say from my own research that there is a noticeable portion of the twitterverse that appreciates food-related tweets. Even within that sub-group, people tend to appreciate tweets about recipes or with pictures over tweets that just say, “I had a great #sandwich at lunch! Fresh mozzarella rocks.” A recipe is informative. A recounting of lunch or a whiny tweet about missing lunch is boring at best and annoying at worst.

The thing I like best about this piece is that many of the findings apply to communication in general, not just tweets. Folks, it’s probably true that whether you are tweeting or talking, nobody wants to know what you had for lunch unless they want to have what you’re having. And if they do, they’ll probably ask. No need to volunteer. Also: brevity is the soul of wit; and wit is wonderful.

As an aesthetic point, I think they got the colors about right. Red represents the not-worthy or bad votes that ought to stop; blue represents the neutral position; and green represents the good tweets tweeps should go for.

What needs work

This graphic came without a title and I added “Which tweets are worth reading?” because it was really hard to interpret the graphs at first glance without a title. There is enough information for interpretation in the caption, but I think a caption should not stand in for a title.

The title is the first thing we see.
The graph is the second thing we see.
The caption is the third thing we see.
In order to understand the graph, then, it’s logical to have a title first so that readers don’t get frustrated that they have no idea what these colorful bars represent (the axes only get us halfway there in this case).

The title follows their own recommendations: questions work well as tweets. I figured I would try it here as a title, see what happens.

References

P. André, M. Bernstein, and K. Luther. (In press). “Who Gives A Tweet: Evaluating Microblog Content Value.” To appear in CSCW ’12: Proceedings of the 2012 ACM Conference on Computer Supported Cooperative Work. (Best Paper Award honorable mention; top 5% of submissions)