
The functional art book cover

Cairo, Alberto. (2013) The Functional Art: An introduction to information graphics and visualization. Berkeley: New Riders, a division of Pearson.

Overview

The Functional Art is a book divided into four parts, but really it is easier to understand as only two parts. The first part is a sustained and convincing argument that information graphics and data visualizations are technologies, not art, and that there are good reasons to follow certain guiding principles when reading and designing them. It is written by Alberto Cairo, a professor of journalism at the University of Miami and an information graphics journalist who has had the not always pleasant experience of trying to apply functional rules in organizational structures that occasionally prefer formal rules.

Sketch of “The Transatlantic Superhighway” by John Grimwade which was originally for Conde Nast Traveler and reprinted in The Functional Art. Click for the full interview with Grimwade.

The second part of the book is a series of interviews with journalists, designers, and artists about graphics and the work required to make good ones. This part of the book is as much about the organizational culture of art and design and specifically of graphics desks in newsrooms as it is about graphic design processes. The process drawings are fantastic. I’ve included two of them here. The first by John Grimwade is multi-layered, full of color and dynamic vitality. These qualities were carried through into the final graphic but are often very difficult to build into computer-generated images. I wondered if the graphic would have been as dynamic if it had come from a less well-developed hand sketch (or no sketch at all).

Photo of clay model of Gobekli Tepe by Juan Velasco with Fernando Baptista for National Geographic. Click for a video of the model building process.

The second is a set of photographs taken of a clay model by Juan Velasco and Fernando Baptista of National Geographic that was used to recreate an ancient dwelling place called Gobekli Tepe in what is now Turkey. Both of these examples lead me to the iceberg hypothesis of graphic design – the more the design that shows up in the newspaper or magazine is just the tip of an iceberg of research, development, and creative work, the more accurate and engaging it is likely to be.

As a sociologist I am accustomed to reading interviews and am fascinated by the convergence and divergence in the opinions represented. In this case, I especially appreciated that Cairo’s interview questions touched on the organizational structures and working arrangements, as did his own anecdotes throughout the book, to provide an understanding of the opportunities and constraints journalists and information graphic designers face. Their work is massively collaborative and the book works to reveal the bureaucratic structures that come to promote and impinge upon design processes and products.

There is a fifth part to the book, too, a DVD of Cairo presenting the material covered in the first three chapters of the book. I admit, I have rarely been a large fan of DVD inclusions. They are easy to lose, scratch and/or break. But assuming the DVD is intact and accessible, I never know when I ought to stop reading and start watching. And even if the book has annotations indicating that an obedient reader should stop reading and start watching the DVD, this assumes the reader is willing and able to put down the book and fire up the computer. The only time I can imagine using the DVD is as a teaching aid in class to give the students a break from having to listen to me all the time. Unfortunately, that is prohibited by Pearson.

Still, it is worth watching because Cairo has a great voice and he is able to discuss interactive content/design in a way that is not easy in the pages of the book. While some of the discussion repeats themes from the first part of the book, there are new examples from additional designers, including some who have been Cairo’s students, which might be of interest to people thinking of signing up for his online course.

What does this book do well?

"Brazilian population grows more in prisons" graphic
“Brazilian population grows more in prisons” by Alberto Cairo originally in Epoca magazine November 2010, reprinted in “The Functional Art” by Alberto Cairo in 2013.

The book does a great job of explaining the decision making behind graphic design. The sketches, process drawings, and accounts of the conversations that went on in editorial meetings gave important depth of context. The organizational culture and day-to-day expectations of the newsroom tend to encourage the use of templates and discourage exuberant creativity. Cairo explained that this Brazilian prison graphic, which eventually won the Malofiej design award, also won him a reprimand from his boss, who proclaimed it to be “ugly”. In practice, conceptual distinctions between art and technologies for comprehension are made rigid by bureaucratic structures in which “the infographics director is subordinate to the art director, who is usually a graphic designer,” an arrangement that “can lead to damaging misunderstandings.”

The more prominent argument follows from these peeks into the backstage of journalism. Infographics and visualizations are technologies, not illustrations. Cairo writes that:

The first and main goal of any graphic and visualization is to be a tool for your eyes and brain to perceive what lies beyond their natural reach….The form of a technological object must depend on the tasks it should help with….the form should be constrained by the functions of your presentation….the better defined the goals of an artifact, the narrower the variety of forms it can adopt.

One of the writing techniques that Cairo uses is summarizing his take-away points from previous paragraphs in quick lists of pointers or key questions. Cairo incorporated these quick lists gracefully into the writing style and I never felt like I was reading a textbook. Still, the quick lists make it easy to use the book as a reference. The index, bibliography and detailed table of contents add strength to the book as a reference source, too. Note to the publisher: I found it frustrating that the book did not include a list of figures, especially given the subject matter.

"Home and Factory Weaving in England, 1820-1880" graphic
“Home and Factory Weaving in England, 1820-1880” Otto and Marie Neurath Isotype Collection, University of Reading as seen in The Functional Art by Alberto Cairo.

Diversity

One of the greatest strengths of this book is the diversity of sources from which Cairo draws his material. Yes, he uses graphics he has developed in many cases which is hugely valuable because he is able to provide insights into the development processes. However, he also draws from graphics old and new [see an old one he pulled out of an archive at the University of Reading about weaving in the industrial revolution], from magazines, newspapers, and the internet, made by freelancers, in-house designers, and students, and in languages other than English (some of which are translated, some of which impressively need little translation). My favorite graphic in the book was one I never would have come across that uses pieces of fruit to describe the surgical procedures used to achieve sexual reassignment.

“How sex change surgeries work.” by Renata Steffen, William Vieira, Alex Silva and Sergio Gwercman in Superinteressante magazine (Brazil). Part 1 of 2.
“How sex change surgeries work.” by Renata Steffen, William Vieira, Alex Silva and Sergio Gwercman in Superinteressante magazine (Brazil). Part 2 of 2.

This diversity serves as an example of the breadth of Cairo’s experience in the world of journalistic information graphics. It is also a testament to his real joy in the subject. Many authors of design books are happy to fill the pages with their own work. Cairo is surely talented enough to have done so. Instead, he chose to showcase an incredible range of designers and styles. This diversity, combined with the accessibility of the writing, is cause enough to recommend this book to anyone who is curious about graphics and journalism, especially journalism students.

What doesn’t this book do well?

The most curious shortcoming – given the incredible diversity of designers, styles, countries, and publication types represented – is the scarcity of women designers. There are thirteen designers profiled in part IV of the book; only two are women. There were forty-seven graphics reprinted; five were designed by women. With respect to the reprints, Cairo is completely justified in reprinting his own work more often than the work of others because he knows how the design process unfolded in those cases. Since he is a man, this inflates the masculine contribution to the reprinted graphics category. Still, many of the graphics he worked on were collaborative efforts and his collaborators could have been women in a more ideal world. But mostly, they were men.

Because the information graphics world is relatively interdisciplinary and (so far as I know) has no specific professional organization whose membership includes a representative sample of practicing information graphics and data visualization professionals, it is hard to tell if the gendered pattern in Cairo’s book is due to some oversight on his part or the underlying gendered make-up of the industry or a combination of both. Even if the industry is dominated by men, it is important for people who write and edit textbooks to ensure that women are represented or they run the risk of sending the message that women may not be welcome or well-rewarded if they choose to pursue data visualization. That is unacceptable. The graphics world will lose out on half its talent pool and women might avoid careers that could have been satisfying and rewarding for them. Notably, the kinds of graphic design that require coding – like data visualization and interactive design – are better compensated than illustration and static design so it’s possible that women are being subtly nudged into the less well-compensated areas of graphic design along the line. It would have been nice if this textbook that is so diverse in so many other ways could have pushed the gender boundary and included more women.

The book also over-promises in the cognition section. The first chapter on cognition was too basic. The second and third chapters in this section had more that was directly applicable to design. All three chapters could have been condensed into one. It is certainly true that perception and cognition ought to be included and there were some useful applications derived from the three chapters, but there was too much review and too few clear applications of the basic principles of cognition and perception to graphic design.

Here are the pointers I did find useful, if you happen to want to buy the book and skip those chapters:

+ If you want viewers to estimate changes by visually comparing elements, you will have the best luck if those changes are depicted using elements of the smallest number of dimensions possible. For instance, viewers will have an easier time coming up with an accurate estimate of the difference in size between two lines (1D) than between two circles or squares (2D). It’s best to avoid 3D comparisons altogether. I would also add that regular objects like circles and squares are cognitively easier to think with than irregular objects like polygons other than squares.

+ The less frequently a color appears in nature, the more likely it is to draw the eye. Reserve the use of colors like red, pink, purple, orange, teal, and yellow for elements that are meant to draw attention.

+ Humans cannot focus on multiple elements at the same time. Design graphics that have one focal point or clear hierarchies of focal points. Do this by eliminating unnecessary use of bright color, chart junk like grid lines that aren’t absolutely necessary, and by establishing a logical information hierarchy in the page layout.

+ Landscapes have horizon lines. Humans are used to encountering the world this way. This is one reason why it is easier to make comparisons using bar graphs (where all the elements start from a common horizon line) rather than pie charts (where there is no shared horizon).

+ Eyes are good at detecting motion and they will focus attention on moving objects. Try not to ask viewers to read text and simultaneously watch a moving element in interactive graphics.

+ Human brains are good at picking out patterns. Often, fairly small changes to a graphic layout that strengthen the appearance of grouping or other types of patterns will add to the ability of the graphic to deliver an instant impression or overview of the message being communicated. For instance, changing the spacing of the bars in a bar graph so that every fourth bar has twice as much space after it as all the rest will make the graph appear to have groups of 4-bar units.

+ Interposition – placing one object in front of another so they overlap – is a good way to add depth. If objects never overlap, the opportunity for the illusion of depth is lost.
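The spacing trick from the pattern bullet above is easy to try out in code. Here is a minimal Python sketch of my own (not something from the book); the function name, bar width, and gap size are all assumptions chosen for illustration. It computes where each bar starts when every fourth bar is followed by twice the usual gap.

```python
def grouped_bar_positions(n_bars, group_size=4, bar_width=1.0, gap=0.5):
    """Return the left x-coordinate of each bar.

    Every bar is followed by `gap`; the last bar of each group is
    followed by twice that gap, which visually separates the groups.
    """
    positions = []
    x = 0.0
    for i in range(n_bars):
        positions.append(x)
        # Double the gap after every `group_size`-th bar.
        extra = gap if (i + 1) % group_size == 0 else 0.0
        x += bar_width + gap + extra
    return positions


positions = grouped_bar_positions(8)
# Distance between consecutive left edges: 1.5 inside a group, 2.0 between groups.
steps = [b - a for a, b in zip(positions, positions[1:])]
print(steps)  # [1.5, 1.5, 1.5, 2.0, 1.5, 1.5, 1.5]
```

The positions could be handed to whatever plotting tool you prefer; the point is only that one slightly larger step is enough to make eight bars read as two groups of four.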

Summary

Overall, the book was well-written, included valuable insight into the process underlying the creation of strong, successful information graphics and visualizations, and would be a solid textbook for use in journalism departments. The representation of women designers was disappointingly low and the segment on cognition could be condensed or otherwise improved. Cairo is clearly a talented designer and teacher. This book meaningfully combines both of those strengths and is an important contribution to undergraduate and graduate education in the emerging sub-discipline of information visualization and design.

I am sending you out with one of the graphics I was most impressed by, in part because the graphic is good, but mostly because Cairo helped me to see why a rather average looking graphic is in fact rather brilliant. It is by Hannah Fairfield of the New York Times graphic desk and it shows that the driving behavior of Americans is sensitive to changes in the economy. During the 2005 recession when gas prices were high but the economy was struggling overall, Americans drove fewer miles. This pattern had only one historical precedent – the 1970s. The graphic depicts this by having a timeline that appears to walk backwards during those two periods in history, a broken pattern your pattern-loving mind is likely to fixate on once you realize this is not your average line graph. Smart.

"Driving shifts into reverse" graphic
“Driving shifts into reverse” by Hannah Fairfield originally published in the New York Times, May 2010; reprinted in “The Functional Art” by Alberto Cairo, 2013.

References

Cairo, Alberto. (2013) The Functional Art: An introduction to information graphics and visualization. Berkeley: New Riders.

Fairfield, Hannah. (2010) Driving Shifts into Reverse. New York: New York Times.

Grimwade, John. (1996) The Transatlantic Superhighway. [information graphic]. New York: Conde Nast Traveler.

Steffen, Renata; Vieira, William; Silva, Alex and Gwercman, Sergio. “How sex change surgeries work.” Superinteressante magazine. Brazil.

Velasco, Juan and Fernando Baptista. () “Gobekli Tepe Process Shots”. National Geographic Magazine. In Cairo, Alberto (2013) The Functional Art p. 238.

Hot Dog Eating Contest Graph

Preface to the book review series

There are two ideal types of infographics books. One ideal type is the how-to manual, a guide that explains which tools to use and what to do with them (for more on ideal types, see Max Weber). The other ideal type is the critical analysis of information graphics as a particular type of visual communications device that relies on a shared, though often tacit, set of encoding and decoding devices. The book reviews I proposed to write for Graphic Sociology include some of each kind of book, though they lean more towards the how-to manuals simply because more of that type have come out lately. As with all ideal types, none of the books will be wholly how-to or wholly critical analysis.

I meant to review two of Edward Tufte’s books first so that we would start off with a good grounding in the analytical tools that would help us figure out which parts of the how-to manuals were likely to lead to graphics that do not commit various information visualization sins. However, I have spent the past six weeks at a field site (a graphic design studio, no less) and it rapidly became completely impractical to lug the two oversized, hard cover Tufte books around with me. I found Nathan Yau’s paperback “Visualize This” to be much more portable so it skipped to the head of the line and will be the first review in the series.

The Tufte review is next up.

Review of Visualize This by Nathan Yau

Visualize This book cover

Yau, Nathan. (2011) Visualize This: The FlowingData guide to data, visualization, and statistics. Indianapolis: Wiley.

Visualize This is a how-to data visualization manual written by statistician Nathan Yau who is also the author of the popular data visualization blog flowingdata.com. The book does not repeat the blog’s greatest hits or otherwise revisit much familiar territory. Rather, this was Yau’s first attempt to offer his readers (and others) a process for building a toolkit for visualizing data. The field of data visualization is not centralized in any kind of way that I have been able to discern and Yau’s book is a great way to build fundamental skills in visualization that use tools spanning a range of fields.

The three primary tools that Yau introduces in the book are two programming languages – R and Python – and the Adobe Illustrator design software. Both R and Python are free and supported by a bevy of programmers in the open source world. R is a programming language developed for statistics. Python has a much broader appeal. Both of them can produce data visualizations. Adobe Illustrator is neither free nor open source but it is worth the investment if you are planning to do just about any kind of graphic design whatsoever, including data visualizations. Yau mentions free alternatives, and there are some, but none have all of the features Illustrator has.

Much of the book starts readers off building the basic bones of a visualization in R or Python, based on a comma-separated values (CSV) data file that has already been compiled for us by Yau. He notes that getting the data structured properly often takes up more than half the time he spends on a graphic, but the book does not dwell much on the tedium of cleaning up messy data sources. Fine by me. One of the first examples in the book is a graphic built and explored in R, then tidied up and annotated in Illustrator using data from Nathan’s Hot Dog Eating contest.

This process is repeated throughout:
   1. start visualizing data with programming;
   2. try to find patterns with programming;
   3. tidy up and annotate output from program in Illustrator.
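To give a feel for step 1, here is a rough sketch in Python using only the standard library. It is not Yau's code, and the three rows of data are made-up placeholders rather than figures from the book. It reads a tiny CSV and draws a quick text bar chart, which is about as rough as a first look at the data gets.

```python
import csv
import io

# Made-up placeholder data, not figures from the book.
CSV_TEXT = """year,value
2001,10
2002,25
2003,18
"""


def read_rows(text):
    """Parse CSV text into (label, number) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["year"], float(row["value"])) for row in reader]


def text_bars(rows, width=20):
    """Draw one line of '#' per row, scaled so the largest value fills `width`.

    Assumes at least one row has a positive value.
    """
    peak = max(value for _, value in rows)
    lines = []
    for label, value in rows:
        bar = "#" * round(value / peak * width)
        lines.append(f"{label} {bar} {value:g}")
    return lines


rows = read_rows(CSV_TEXT)
for line in text_bars(rows):
    print(line)
```

Output like this is for the analyst, not the audience, which is exactly why the third step of the process exists.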

The panel below shows you what R can do with just a few lines of code. Hopefully, it also becomes clear why it is necessary to take the output from R into Illustrator before making it public.

Visualize This – example from chapter 4

Great tips

There are hints and tips sprinkled throughout the book covering everything from where to find the best datasets to how to convert them into something manageable to how to resize circles to get them to accurately represent scale changes. This last tip is one of my favorites. When we visualize data and use circles of varying sizes to represent the size of populations (or some other numerical value) what we are looking at is the area of the circle. When we want to represent a population that is twice as big as the size of some other population, we need to resize the circle so its area is twice as big, not its circumference.

How to scale circles for data visualization
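The arithmetic behind the circle tip is short enough to write down. This is my own sketch in Python, with a hypothetical function name and starting radius. Because area grows with the square of the radius, the radius should be multiplied by the square root of the change in value.

```python
import math


def scaled_radius(base_radius, base_value, new_value):
    """Radius of a circle whose AREA is proportional to new_value / base_value."""
    return base_radius * math.sqrt(new_value / base_value)


# A population twice as big gets a radius about 1.41 times as large...
r = scaled_radius(10.0, base_value=100, new_value=200)
area_ratio = (math.pi * r ** 2) / (math.pi * 10.0 ** 2)

# ...whereas doubling the radius (or circumference) would quadruple the area.
wrong_ratio = (math.pi * 20.0 ** 2) / (math.pi * 10.0 ** 2)

print(round(r, 2), round(area_ratio, 2), round(wrong_ratio, 2))
```

The last number shows the distortion: a circle with a doubled radius looks four times as big, not twice as big.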

More great tips:
1. First, love the data. Next, visualize the data.*
2. Always cite your data sources. Go ahead and give yourself some credit, too.
3. Label your axes and include a legend.
4. Annotate your graphics with a sentence or two to frame and/or bolster the narrative.

*Love the data means take an interest in the stories the data can tell, get comfortable with the relationships in the data, and clean up any goofs in the dataset.

Pastry graphics: Pie and donut charts

Yau’s advice about pie charts diverges from mine. I say: use them only when you have four or fewer wedges because human eyes really have trouble comparing the area of one wedge to another wedge, especially when they do not share a common axis. Yau acknowledges my stubborn avoidance of pie charts but advises a slightly different attitude:

Pie charts have developed a stigma for not being as accurate as bar charts or position-based visuals, so some think you should avoid them completely. It’s easier to judge length than it is to judge areas and angles. That doesn’t mean you have to completely avoid them though. You can use the pie chart without any problems just as long as you know its limitations. It’s simple. Keep your data organized, and don’t put too many wedges in one pie.

Then Yau explains how to visualize the responses to a survey he distributed to his own readers at FlowingData to see what they’d say they were most interested in reading about. He showed the readers of the book a table with the blog readers’ responses which I’ve recreated below [Option A]. I think the data is easier to read in the table than in either the pie chart or the closely related donut chart [Option(s) B]. In life as in visualization, a steady stream of pies and donuts is fun but dumb. Use sparingly.

Visualize This example from chapter 5

Interactive graphics

Learning about pie charts was great fun even though I don’t like pie charts because Yau taught us how to use Protovis, a JavaScript library that yields interactive graphics. We built a pie chart just like the one(s) in Option B that popped up values when the mouse hovered over the wedges. Protovis was developed at Stanford and has now morphed into the d3.js library. The packages developed in Protovis are still stable and usable. I highly recommend this exercise for anyone who wants to make infographics for the web. It helps to have a basic understanding of HTML going in.

What needs work

The overarching problem I had with Visualize This is that it spent relatively little time generating different types of graphics using the same data. We saw a little bit of that above when Yau used both a pie chart and a donut chart to visualize the same survey responses, but since donut charts are just variations on pie charts, it was not the best example in the book. The best example came when Yau visualized the age structure of the American population from 1860 – 2005 (I updated the end date to 2010 since I had access to 2010 census data).

First, Yau shows readers how to make this lovely stacked area graph in Illustrator. That’s right. No R. No Python. Just Illustrator.

Aging Americans | Stacked area graph version

Then Yau admits that the stacked area chart has some general limitations:

One of the drawbacks to using stacked area charts is that they become hard to read and practically useless when you have a lot of categories and data points. The chart type worked for age breakdowns because there were only five categories. Start adding more, and the layers start to look like thin strips. Likewise, if you have one category that has relatively small counts, it can easily get dwarfed by the more prominent categories.

I tend to disagree that the stacked area chart ‘worked’ for displaying the age structure of the US population, but not because there were too many categories. I’ll get to why I don’t think the stacked area graph worked shortly, but first, let’s have a look at the same data represented in a line graph. This was Yau’s idea, and it was a good one. What we can see by looking at the data in a line graph rather than a stacked graph is the size ordering of these age slices. Yeah, I can kind of see that the 20-44 group was the biggest group in the stacked graph. But I had to think about it. In the line graph, I don’t wonder for a second which group was biggest. The 20-44 group is on top. The axes in line graphs just make more sense. I admit that the line graph is not an aesthetic marvel the way the area graph was. But, you know, you can figure out your own priorities. If you want pretty, go with the area graph and get smart about colors (with the wrong color scheme, any graphic can look awful. See also: what Excel generates automatically). If you want a graphic for thinking with, avoid stacked area graphs.

Aging Americans | Line graph version
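One way to see why the line graph makes the size ordering obvious is to compare what the y-axis actually encodes in each chart. The sketch below is mine, in Python, and the four groups and their shares are invented placeholders, not the census figures. In a stacked chart the visible height of a layer is a running total, so whichever layer is drawn last always ends up on top, however small it is.

```python
from itertools import accumulate

# Made-up shares (percent of population) for a single year, bottom layer first.
# The group names are placeholders, not the book's age categories.
shares = {"group A": 30, "group B": 35, "group C": 22, "group D": 13}

# Stacked area chart: each layer sits on top of the ones below it,
# so the height you read off the y-axis is a running total.
stack_tops = dict(zip(shares, accumulate(shares.values())))

# Line chart: every group is plotted from the same zero baseline,
# so the y-axis value is the group's own share.
line_heights = dict(shares)

highest_in_stack = max(stack_tops, key=stack_tops.get)
highest_in_lines = max(line_heights, key=line_heights.get)

print(stack_tops)        # {'group A': 30, 'group B': 65, 'group C': 87, 'group D': 100}
print(highest_in_stack)  # group D, the smallest group
print(highest_in_lines)  # group B, the largest group
```

To recover a group's size from the stacked version, a reader has to subtract the layer below from the layer above, which is the extra thinking I complained about.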

Coming back to what I think about visualizing the age structure of the American population. Call me old-fashioned, say that I adore my elders too much, I’ll just tell you we all stand on the backs of geniuses. I like the age pyramids for visualizing the age structure of a population. Here’s one I plucked from the Census website.

Population Aging in the United States | Traditional age pyramid graphic

The pyramid has these advantages:
   1. It shows gender differences. Males are on the left. Females are on the right.
   2. This graphic does a better job of showing the structure of the population because the older people appear to balance on the younger people. This is useful because the older people actually do kind of balance on the younger people when it comes to things like Social Security. The structure of the population does not come through in the area graph or the line graph. Both of those show us that there are more old people now than there were before but displaying more is a less sophisticated visual message than showing us just how many older people and how much older and how these things have changed over time. See all those and’s in the previous sentence? Yeah. That’s how much better the pyramid is.
   3. It is possible to see both the forest and the trees in this age pyramid. What do I mean? Well, the stacked area graph and the line graph had to lump rather large (and disproportionately sized) groups of ages together. In the age pyramid, the slices are even at every five years and if you happen to want to figure out just how the 20-24 year olds are changing over time, you can. But this granularity does not make it difficult to understand the overall structure of the pyramid.

To summarize my larger disappointment, I wish that Yau had gone through a number of examples of displaying the same data with different graphics in order to teach readers how to choose the best graphic. To his credit, he did visualize crime data with a bunch of different graphics, but I didn’t like any of the graphic types. I’m including the one I liked most, but it’s mostly for historical reasons. This type of chart, with its weird fanned-out pie wedges, is called a Nightingale chart and was developed in part by Florence Nightingale way back when information graphics didn’t exist. He visualized this same crime data with Chernoff faces and with star graphics, neither of which was interpretable, in my opinion.

US Crime Rates by State – Nightingale charts

Heatmaps

Unlike Chernoff faces, star charts, and Nightingale charts which I think are totally useless, heatmaps have promise as data visualizations. This is a good example of how I wished Yau would have started working hard to get the data to lash up better with the visualization. This is his final version of the heatmap of a whole bunch of different basketball game statistics with the players who were responsible for scoring, assisting, and rebounding (among many other things). I am a basketball fan. I went linsane last season. But I just do not get excited when I look at this heatmap because the visualization does not reveal any patterns. Ask yourself: would I rather have this information in a table? If the answer is yes, well, then you know there’s at least one other kind of representation besides this one that you would prefer if this is the data you are trying to display.

NBA heatmap via FlowingData

So what would I do? Well, I’d do a couple things. First, I would probably try restricting this heatmap to the top ten players or even to my favorite players. Throwing in 50 players and about 20 statistics per player without condensing anything means we are looking at 1000 data points. Ooof. So…if not cutting down the number of players, maybe put the scoring statistics in a different heatmap than all the other statistics (playtime, games played, rebounds, steals, blocks, turnovers, and so on). Maybe strip out the “attempts” and just leave the completed free throws, field goals, and three-pointers. I do not know if these things would have revealed patterns, I just know that the current graphic is still looking like a data soup to me.
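The trimming I have in mind could be sketched like this in Python. It is only an illustration: the table is filled with placeholder numbers, and the player and statistic names are hypothetical stand-ins rather than the actual NBA data.

```python
def trim_table(rows, rank_by, top_n, keep):
    """Keep the top_n rows ranked by one column, and only the `keep` columns."""
    ranked = sorted(rows, key=lambda row: row[rank_by], reverse=True)[:top_n]
    return [
        {"player": row["player"], **{col: row[col] for col in keep}}
        for row in ranked
    ]


def cell_count(rows):
    """Number of colored cells a heatmap would need (label column excluded)."""
    return sum(len(row) - 1 for row in rows)


# Placeholder table: 50 hypothetical players x 20 hypothetical statistics.
table = [
    {"player": f"player_{i}", **{f"stat_{j}": i * j for j in range(1, 21)}}
    for i in range(1, 51)
]

# Top ten players by one statistic, with only three columns kept.
trimmed = trim_table(table, rank_by="stat_1", top_n=10,
                     keep=["stat_1", "stat_2", "stat_3"])

print(cell_count(table), cell_count(trimmed))  # 1000 30
```

Going from 1000 cells to 30 does not guarantee a pattern will show up, but it at least gives the eye a fighting chance.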

Maps triumphant

Overall, this was a great how-to for data visualization and I want to end on an appropriately high note. One of the biggest wins in the book was Chapter 8 in which Yau walks us through the most meticulous and involved demo in the book. The payoff is big. He shows us how to use Google Maps and FIPS codes to make choropleths (these are large maps in which colors mated with numerical values fill in small, politically bounded units, usually counties but sometimes census tracts). He does not use ArcGIS, which is one of the reigning mapping tools on the market. But ArcGIS is expensive. And Yau shows us how to generate maps without spending a dime. You will have to spend some time. If you are a cartography geek or you follow the unemployment rate, you’ve probably already seen this graphic because it was widely circulated, for good reason.
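The core idea of a choropleth, mating a color with a numerical value for each bounded unit, fits in a few lines. This Python sketch is my own and not Yau's method: the class breaks, the palette, and the rates are all invented for illustration. The FIPS codes are kept as strings so the leading zeros survive.

```python
from bisect import bisect_right

# Hypothetical class breaks (percent) and a six-step light-to-dark palette.
BREAKS = [2.0, 4.0, 6.0, 8.0, 10.0]
PALETTE = ["#f2f0f7", "#dadaeb", "#bcbddc", "#9e9ac8", "#756bb1", "#54278f"]


def color_for(rate):
    """Pick the palette entry for the class that `rate` falls into."""
    return PALETTE[bisect_right(BREAKS, rate)]


# Made-up rates keyed by five-digit county FIPS codes (stored as strings).
rates_by_fips = {"01001": 3.5, "01003": 9.9, "01005": 12.1}

colors = {fips: color_for(rate) for fips, rate in rates_by_fips.items()}
print(colors)
```

Each county shape on the map would then be filled with the color looked up under its FIPS code; drawing the shapes themselves is the meticulous part that the chapter walks through.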

Unemployment map via FlowingData
Gender ratio of recent US graduates by degree | Laura Norén | click caption for pdf

Is higher education “dominated” by women?

There has been plenty of news coverage recently about the rise of women and the decline of men. While I have always disliked the irrational use of zero-sum language – why do we have to frame this discussion as men who are losing because women are making some gains? – I thought it would be worth taking a closer look at the gender ratio in higher education. I found many text-heavy stories (the Guardian, the New York Times, the Chronicle of Higher Ed, Huffington Post, The Atlantic, and many others) about female students earning more bachelors but surprisingly few graphics.

Graphics can do an excellent job of summarizing the gender gaps as they have developed over time within bachelors, masters, and professional+doctoral degrees. One graphic, quite thought provoking. All three degrees were more likely to be earned by men in 1970. Then between 1970 and 1980 women made rapid gains which continued through the 1980s. The gains for women slowed down once they hit the 50/50 mark for both bachelors and masters degrees and I predict they will also slow down for PhD and professional degrees. Though it’s hard to tell by looking at the graphic, women are earning the largest proportion of masters degrees (projected to be 61% in 2020) which is slightly more than the 58% of bachelors degrees they are projected to earn in 2020.

Why aren’t women earning more if they are so well educated?

There is still a pay gap in earnings between men and women. Within the university, male faculty members tend to make slightly more than female faculty members. Overall, the most powerful explanation for pay gaps is not so much a failure to pay men and women equally for the same job. Rather, women are more likely to get degrees that lead to positions which are paid less than the positions men are more likely to get following their collegiate specializations. More women end up in education and nursing; more men end up in engineering and computer science. Education and nursing are not as likely to be lucrative as jobs that require engineering and computer science degrees.

To answer the question about women “dominating” higher education it is clear from the numbers that there are more female students at every level, though some majors still tilt towards men. What’s perhaps more important, women may or may not go on to match the earning potential of men, in part because they may not always choose the majors that lead to the most lucrative careers. Some argue that earning potential should drive choice-of-major but I’m still of the mind that going to school is not all about (or even primarily about) producing good workers. Going to school is about taking the time to explore different ways of thinking in depth and without undue concern for their ability to produce economic return. I’m glad that we have gotten to the point where there is enough gender parity to return to conversations about what school is for rather than who school is for…

Does the gender gap in graduation rates vary by race/ethnicity?

Graphic: 2009 US bachelors degrees by race/ethnicity and gender
2009 US bachelors degrees and gender gaps by race/ethnicity | Laura Norén | click caption for pdf

…but on the other hand, there are still critical gaps in access to higher education and degree completion that trend along racial/ethnic lines (class lines, too, but I didn’t get into that in this post). The graphic above displays the share of bachelors degrees going to different racial/ethnic groups in 2009. In order to provide a relevant framework for comparison, I plotted the share of degrees earned next to the share of the total population of 18-24 year olds constituted by each racial group. There are some missing categories – mixed race people, for instance – but I couldn’t find graduation rates broken down any further than the five traditional racial/ethnic categories. Asians and Pacific Islanders made up only 4% of the population but they earned 7% of the bachelors degrees in 2009, and their gender gap that year was only 10%. Whites were similarly over-represented among degree-earners and had a similar gender gap of 12%. But then things got interesting. The gender gaps for American Indians and Hispanics were much higher at 22%, and the gender gap for blacks/African Americans was higher still at 32%.

Especially when it comes to studying gender which is often constructed as a binary in which both groups make up about 50% of the whole, it is important to realize that analytical rigor might be increased by further segmenting these gender categories by some other key analytical variable. In this case, adding vectors for race/ethnicity provided a new perspective, one that might be a decent proxy for class.

References

Norén, Laura. (4 September 2012) Gender ratio of recent US graduates [infographic] New York.

Norén, Laura. (4 September 2012) US bachelors degrees by race/ethnicity [infographic] New York.

National Center for Education Statistics. (2011) Table 283: Degrees conferred by degree-granting institutions, by level of degree and sex of student: Selected years, 1869-70 through 2020-21 [Available in html and xls] US Department of Education.

National Center for Education Statistics. (2011) Table 300: Bachelor’s degrees conferred by degree-granting institutions, by race/ethnicity and sex of student: Selected years, 1976-77 through 2009-10 US Department of Education.

US Census Bureau. (2012) Table 10: Resident Population by Race, Hispanic Origin, and Age: 2000 and 2009 In The 2012 US Statistical Abstract. [Available in pdf and xls]

US food security map | Gallup via Marketplace.org

What works

Food insecurity – worrying about having enough money to buy food – is an extremely important problem. Gallup came up with new poll numbers on the prevalence of food insecurity in the US just this week and spokesman Frank Newport did an interview on the findings with Tess Vigeland of the radio show Marketplace. Marketplace ran the map graphic above on their website which is somewhat rare for a radio program given that graphics just do not have much of a place on the radio.

The survey question was:

Has there been one time in the last 12 months when you did not have enough money to buy the food that you or your family need? And overall, 18 percent of Americans so far this year — the first half of the year — said yes, there has been at least one time.

The graphic makes clear that the problem of food insecurity has a north-south pattern to it. People in the South have “high” levels of reporting food insecurity while people in the middle and on the west coast have “moderate” levels of food insecurity and folks in the north have “low” levels of food insecurity. But…

What needs work

…where are the numbers? What ranges are represented by the “low”, “moderate”, and “high” levels of reported food insecurity? This information should be in the graphic. Legends matter.

What we can infer from the interview is that the states in the “high” range have 20% of their poll respondents reporting that they’ve had trouble paying for the food they need in the last 12 months. The “low” level of insecurity includes states like North Dakota where 10% of people reported having trouble paying for food. That still seems high given how wealthy Americans are on the whole. This food insecurity data is one way to think about just how economic inequality plays out in the US. Folks cannot even afford the food they need.

Here’s another graphic to think about, the rate of the use of food stamps (SNAP):

Food stamp program participation 1970-2010

Understanding food insecurity is one of those things that is going to require more than a single map based on a single survey question asked at one point in time. Well-designed graphics can and should aim to depict complexity and nuance…kind of like any other representation of critical analysis (writing, reporting, etc).

References

Vigeland, Tess. (23 August 2012) Americans struggle to feed their families. [Interview with Frank Newport] marketplace.org

Global smoking rates by gender chart
Global smoking rates by gender | The Lancet via The Economist Daily Charts

What works

The Economist put together an infographic using data from a study published last week in The Lancet. The data were collected by an impressively large team of researchers from three different institutions in three different countries (The World Health Organisation, America’s Centres for Disease Control and the Canadian Public Health Association). The article in The Lancet has much more detailed data about all sorts of smoking traits that did not make it into this chart, but the chart succeeds in portraying two gendered vectors of smoking behavior: the different rates of smoking between men and women and the difference in the number of cigarettes smoked between the two genders.

Globally speaking, it is safe to say that smoking is a masculine activity. There is no country in which more women than men are smokers. That particular take-away is made extremely clear in the chart. Just a glance is enough exposure to the data to absorb the idea that smoking is somehow masculine.

What needs work

The graphic designers at the Economist try to expand on the notion that smoking is “somehow masculine” by layering another set of findings onto the basic rates of smoking by men and women. Way off to the right they have what is essentially two columns of a table that report the average number of cigarettes smoked per day by men and women. My fuzzy and addled brain wants this little table to be more like a bar chart in which the length of the bars corresponds to the number of smokes. Countries where smokers go through the most cigarettes would have longer bars. Countries where smokers go through fewer would have shorter bars. Visually, the impact would increase dramatically if the size of the bar corresponded to the number of cigarettes smoked.

Importantly for the point about the gendered nature of smoking, we could see another way in which smoking is gendered by looking at how many cigarettes are smoked by each gender. Some countries have dramatic differences: in Russia and Turkey men smoke about 1.5 times as many cigarettes as women. This is a marked contrast to the other end of the spectrum where in India, women who smoke (and there are very few women who smoke in India), smoke 7 cigarettes per day while the smoking men only smoke 6.1 cigarettes per day. If that part of the graphic had been given more space, it would have been easier to quickly absorb that pattern. As it is, only a careful reading of that table yields insight; we might as well just look at the data in Excel.
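To give a rough sense of what I mean, here is a minimal sketch in Python of a text-only bar chart in which bar length tracks cigarettes per day. It uses only the India figures quoted above. The `text_bar` helper and its `scale` and `width` parameters are hypothetical names of my own, not anything from The Economist's workflow.

```python
def text_bar(label, value, scale=4, width=8):
    """Render one horizontal bar; each '#' stands for 1/scale cigarettes per day."""
    bar = "#" * round(value * scale)
    return f"{label:<{width}}{bar} {value}"


# Average cigarettes per day among smokers in India, as quoted in the text.
india = {"women": 7.0, "men": 6.1}

lines = [text_bar(gender, cigarettes) for gender, cigarettes in india.items()]
print("\n".join(lines))
```

Even in plain text, the longer bar for women is visible at a glance, which is the pattern the table buries in its two right-hand columns.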

The other change I would order up for this graphic is to make the blue horizontal bars that run the full length of the graphic a different color than the male icon. My best option would have been to make the horizontal bars grey and truncate them after the male icon. There’s no need for them to go all the way across and it makes the table slightly harder to read. I realize that changing the horizontal bars to grey would then give the whole table a gridlike look due to the presence of the vertical bars. I would just shorten the vertical bars to tick marks at the top and tick marks at the bottom (it is a tall chart so tick marks only at the top or only at the bottom would be invisible to people who have to scroll to see the whole graphic).

I like the coral color used for the female icons. I would have turned the men navy because coral and navy are complementary colors and look especially good together.

I wasn’t able to add the bar graphs out to the side or to fully eliminate the baby blue, but I did make some of the changes I suggested on the jpg below for your viewing ease.

Remix of The Economist Daily Chart from 20 August 2012 - Puffed Out: Daily cigarette smoking by men and women

References

The Economist. (20 August 2012) Puffed Out: Daily cigarette smoking by men and women The Economist: Daily Charts. [graphic design]

Giovino, Gary, et al. (18 August 2012) Tobacco use in 3 billion individuals from 16 countries: an analysis of nationally representative cross-sectional household surveys. The Lancet, Volume 380, Issue 9842, Pages 668 – 679, doi:10.1016/S0140-6736(12)61085-X

Food blog content characteristics and frequency of use | The Food Blog Study

What works

I conducted a web-based survey of food bloggers last summer as a doctoral intern at Microsoft Research in the Social Media Collective. I am now analyzing the mountains of data that I gathered in the interviews (N=30), survey (N=303), and web crawler (N=30,000) and getting ready to send out papers for publication. I thought it would be nice to share some of the findings here in advance of the slow academic publishing process.

Since I made the graphic and since I am modest, I’ll just say that I like the colors and I like that I was able to find a way to keep all of the granular detail of tabular data while adding visual impact.

If you would rather hear about the substance of the study than about the struggles I had while creating the graphic, skip to the bottom third of the post and the “What surprised me” heading.

What needs work

Since I have the benefit of having seen the data I can say that two things certainly need work. First, the survey asked about many more behaviors than I have decided to depict in this graphic. I left out data mostly because I want to be able to publish it and publishers are not keen on accepting already-published material. Some of them are not too bothered if bits and pieces of the findings are blogged about here and there. Some of them are hugely bothered and will not accept submissions that have been written about on blogs at all. There are good reasons for subjecting the findings to peer-review – like having smart people verify that the findings are not fabricated from thin air or otherwise constituted by complete rubbish. All that being said, my biggest problem with this graphic is that it is just the tip of the iceberg in terms of what the survey had to say about the characteristics of food blog content.

The second big problem with this is that I had a very difficult time dealing with proportional data in the rows and the columns. In case you still haven’t figured out what this graphic is saying – and I don’t blame you if you find it hard to digest – the graphic is depicting the frequency with which about 300 food bloggers (303 to be exact) reported using the listed types of content. For example, 96% of food bloggers report using video 20% of the time or less. Video just is not all that common on food blogs and most food bloggers hardly ever use it. Images, on the other hand, are included in food blog posts most of the time by most food bloggers. Seventy-four percent of food bloggers use photos 80% of the time or more. Reviews of restaurants, cookbooks, and kitchen gear fall in between: 11% of food bloggers post reviews very frequently (80% or more of their posts contain reviews) while fully half of food bloggers hardly ever post reviews (20% or fewer of their posts contain reviews).

Most food bloggers like to mix things up at least a little. Hardly anyone has such a firmly established template for their blog content that 100% of their posts contain recipes and photos while 0% of their posts contain videos or discussion of non-food content (which would include mentions of important life events like getting a book contract, having a child, getting married, or getting cancer). With content, then, I wanted to let food bloggers report how often they posted a variety of different kinds of content. But then I had this difficulty of having proportions in both the rows and the columns of the graphic, which makes it difficult to interpret. Believe me, the tabular data without the blocks changing sizes and colors was even harder to interpret, so turning this information into a visual did help the analysis along by making the patterns clearer.

What surprised me

I was expecting many more bloggers to report including recipes more often. Only 37% said that 80% or more of their posts contained recipes. From what I gathered in the interviews, having someone else make your recipe and then leave a comment about it is one of the routine gratifications associated with food blogging. Web traffic to the site from google.com and on mini-search engines within the site is generally related to recipes, as well. So whether food bloggers care about the deeper meaning associated with food blogging and being part of a community or the hard-nosed economics and web traffic side of writing a blog, from the interviews, I was expecting recipes to be a bigger part of reported content than what I found in the survey. Recipes are one of the main activities around which both creativity and community are wound. They also draw a lot of traffic. On blogs, traffic often equals money (though not all that much money, which is why I think the meaning associated with recipes is more interesting than the money associated with recipes).

I was not at all surprised that most bloggers ignore nutritional information but I think that people who have never done much with food blogs would be surprised to see that three-quarters of bloggers mention nutrition and nutritional information 20% of the time or less. Food blogging gets its meaning and importance through practices of creating and community-making, not because the blogs are used as archives or tracking devices for those trying to lose weight or achieve other health goals. There are blogging communities organized around those things, but generally speaking, folks in those communities do not identify with the term ‘food blogger’.

Reference

Norén, Laura. (2012) Infographic: The Content of Food Blogs. The Food Blog Study. [www.foodblogstudy.info/findings.html]

Sleep-wake graph of Danielle Carrick's week, May 1st, 2012
Sleep-wake graph of Danielle Carrick’s week, May 1st, 2012 | via daniellecarrick.com

What works

This simple graph is visually nothing all that unique but conceptually it makes a very smart use of the bar graph trope to display information. Sleeping and waking hours are taken to be each other’s opposites (and are assumed to happen in unbroken spans – no daytime naps for Danielle).

What needs work

I might have toyed with placing the waking hours on top and the sleeping hours on the bottom. Or, better yet, I might have flipped the axis and put waking hours on the right and sleeping hours on the left. But that’s simply a matter of taste. Flipping the axis doesn’t change the concept.

Quantified self

As we see more and more applications and products that aim to reveal patterns about individuals to individuals, we’ll see more and more of ourselves reflected back to us in information graphics like this one. I’m curious to find out how the visualization of the data shapes the way people use the data.

There will slowly be more on the quantified self theme here.

References

Carrick, Danielle. (May 2012) “Week of May 1st | Sleeping” [infographic].

Saveur food blog award nominees and winners by gender, 2010-2012

Gender in food blogging

Last summer I conducted a survey of food bloggers (N=283) which found that 85% of food bloggers are women (see here for more demographic statistics from the survey). I also conducted interviews with food bloggers and started to get the impression that food blogging is a community dominated by women in which the relatively few men end up being disproportionately successful. This kind of gender disparity – a group that is overwhelmingly women in which men are more likely to occupy positions of power or prestige – has been written about in the sociological literature with respect to elementary school teaching and nursing. In elementary schools, for example, the majority of the teachers were women but administrators (like the principal and vice principals) were disproportionately likely to be men. This gender disparity in the schools is no longer as pronounced as it once was. Women now occupy more of the administrative positions but men have not moved in to occupy more teaching positions. If food blogging follows the same trajectory, we can expect women to occupy more of the most prominent food blogging positions over time.

But what is a ‘prominent food blogging position’?

Since food bloggers are not working professionals within a clear hierarchy like teachers and nurses, I decided to look at food blog awards data as a proxy for success in the food blog world. The magazine Saveur hosts the longest running, most extensive set of food blogging awards of any organization. I used their awards nominees and winners to pull together the graphic above and find out how gender and success in food blogging interact.

Using the Saveur awards data, it is clear that there is a pattern of disproportionate male success within the food blog nominees and winners. In a perfectly gender-neutral world, we would expect that when 15% of the food blogs are written by men, 15% of the food blogging awards will be distributed to men. In fact, 26% of the nominees (chosen by Saveur) were men and 36% of the winners (voted on by the internet audience) were men. In other words, both the Saveur selections and the internet-audience voters were inclined to select men more often than strict chance would have predicted.
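The size of that over-representation can be put into a single number. Here is a minimal sketch in Python that divides men's observed share of nominees and winners by the share we would expect if awards tracked the makeup of the blogging population. The function name is my own, and the sketch assumes the 15% survey figure is a fair baseline for the pool Saveur drew from, which the data cannot confirm.

```python
def representation_ratio(observed_share, expected_share):
    """How many times more often a group appears than its baseline share predicts."""
    return observed_share / expected_share


# Shares quoted in the text.
men_share_of_bloggers = 0.15  # from the survey; assumed baseline
men_share_of_nominees = 0.26  # chosen by Saveur
men_share_of_winners = 0.36   # voted on by the internet audience

nominee_ratio = representation_ratio(men_share_of_nominees, men_share_of_bloggers)
winner_ratio = representation_ratio(men_share_of_winners, men_share_of_bloggers)

print(f"nominees: {nominee_ratio:.2f}x expected")
print(f"winners:  {winner_ratio:.2f}x expected")
```

A ratio of 1.0 would mean men were nominated or won exactly as often as their share of bloggers predicts. Under the assumption above, men show up among nominees about 1.7 times as often as expected and among winners about 2.4 times as often.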

My interviews indicated that there could be a few explanations for this kind of pattern. However, I’m curious to hear what food bloggers – especially those who voted for or won Saveur‘s awards – have to say.

The comments are open.

Methodological note

N=194

I removed blogs whose writers’ genders were not revealed and blogs written by couples or other mixed-gender groups. I also removed blogs that did not meet my original definition of a food blog, including those in the two categories for blogs about alcohol and the category for blogs about kitchen tools/gadgets.

References

Saveur Food Blog Awards 2012.
Saveur Food Blog Awards 2011.
Saveur Food Blog Awards 2010.

Norén, Laura. (2012) Saveur food blog award nominees and winners by gender, 2010-2012. [Blog post] Graphic Sociology blog.

New York Times 100 Notable Books - Authors' Academic Affiliations

What works

Using the New York Times’ list of 100 Notable Books of 2011 that ran over the weekend as part of their Holiday Gift Guide, I created the graph above. As an almost-academic, I am interested in the scope of academic work and found it interesting that fewer than half of the notable books were written by people with academic affiliations. Michael Burawoy and Craig Calhoun have both called for new roles for scholarship and the university, emphasizing that an academy unhitched from the public sphere is not a viable model and might very well be considered irresponsible, given the scale and scope of social, scientific, and technological challenges facing the globe right now and for the foreseeable future.

So what does it mean that non-academics are writing more of the notable books than are academics?

I cannot answer that question definitively, but I can offer three possible avenues for exploration. First, it could be that academics are irresponsible or lazy and that they have either failed to write well or to address relevant topics. They are off publishing pedantic articles in academic journals that nobody reads to fill out their CVs. This scenario is grave. There is an element of truth to it.

An alternative explanation would be that, in part because this is a *gift* suggestion list, these books are not necessarily the most important, but they are the most well written. If that is the case, then the fact that so many non-academic voices make the list indicates that writing itself is an art, one that is spread much more judiciously across the American populace than are academic positions. It also suggests that thinking clearly and writing well are going on in all sorts of places, not just the ivory tower. This is encouraging. There is an element of truth to it.

A third version of this story begins where the second one left off and suggests that, in fact, if academic books do appear on holiday gift lists of notable books, those academics are shirking their duties as academics. Any book with broad public appeal probably is NOT doing much to advance a field. It’s probably just regurgitating existing research in a kind of “Research Thought X for dummies” kind of way. [Many of the people who adhere to this line of thinking have deep and abiding negative thoughts about Malcolm Gladwell.] The view from this perspective argues that asking academics to be responsible to public audiences is akin to asking people to text and drive. It’s dangerous. It takes one’s eye off the critically important field of action and reorients it, likely towards one’s own navel. The primary activity – analytical research and publishing – will suffer, perhaps taking down innocent bystanders along the way. This is a fairly rigid understanding of the best practice for academic research. There is an element of truth to it.

I invite debate in the comments on the points I mentioned and on those that I have overlooked.

What needs work

This graphic is not as elegant as I would like. There are far too many words.

I am fascinated with the nitty gritty details of the schools at which those with academic appointments are working. Including the names of so many schools made the endnotes lengthy. I am of two minds on that. Like I said, I enjoy knowing the details, especially when it comes to fleshing out a category like “Elite.” It’s important to know just how eliteness has been defined. In this case, I used US News and World Report. With respect to most of the schools – Princeton, Harvard, Yale, Oxford, Cambridge, Columbia – I think there is widespread agreement that these schools are at the top of the academic heap and have been for a while. Some might quibble about Pomona and Williams.

The point I was trying to illustrate was that those in academia who have books on the notables list could be seen to be public intellectuals or at least they are doing better at making their work accessible to the public than their colleagues who never make it to such lists. It is especially important that the professors in elite institutions make their work accessible because, unlike their colleagues at public schools or less exclusive private schools, the metaphor about the ivory tower as a mechanism of separation is apt. Very few of us have access to elite institutions. Some have argued that those in academia have some responsibility for making their work accessible to broader publics.

References

Burawoy, Michael. (2005 [2004]) “For Public Sociology” [Presidential Keynote Address at the Annual Meeting of the American Sociological Association] American Sociological Review Vol. 70. http://ccfi.educ.ubc.ca/Courses_Reading_Materials/ccfi502/Burawoy.pdf

Calhoun, Craig. (2006) “The University and the Public Good” Thesis Eleven Vol. 84(7).

New York Times, Sunday Magazine. (November 2011) “100 Notable Books of 2011” [Holiday Gift Guide]

Time and Newsweek Circulation Figures | Graphic by Laura Norén
Newsweek and Time Circulation Figures | Graphic by Yolanda Cuomo

Which one works?

These two graphics portray some of the same information – household income, median age, audience and circulation – though the first one does not break down information between genders. Though it probably goes without saying, I like the one I designed best. The second one has some tantalizing shapes – I applaud the visual appeal – but it does nothing to aid people’s eyes as they try to compare relative sizes between the salient categories. I also happen to think it is easier to understand the complexity of the difference between audience and circulation with the textual explanation provided in the first one. I find the white-font-on-dark-background of the Time and Newsweek labels hard to read. It’s also a known graphic design no-no, especially with a small font size like this: it is easier for the human eye to grok the contrast with dark text on a light background than with light text on a dark background.

From a sociological perspective, comparing the readership of Time and Newsweek not only to each other but also to national averages provides a much deeper sense of context. The second graphic was built from the first though I never had a chance to meet with any of the writing or design team to understand why the national averages were removed.

There are other elements I dislike in the second one. I dislike, for instance, the need to repeat certain elements of text over and over again: “readers per copy” and “Total adult population” and even the “Time” and “Newsweek” headings. One of my closest friends and colleagues spends a lot of his time writing code. The best lesson I have learned from him is that where elements or actions have to be repeated over and over, there is inefficiency in the system. A better design is possible.

I would love to hear from my readers on this comparison. Am I suffering from too much ego investment in the graphic I made? Is the second graphic an improvement on the first? If so, how?

References

Norén, Laura. (2010) “Appendix: Data and Methods” in first draft of Dill, Nandi and Telesca, Jen Imagining Emergencies. [Information graphic].

Cuomo, Yolanda. (2011) “Readership Data Time and Newsweek 2008” in final draft of Dill, Nandi and Telesca, Jen Imagining Emergencies. [Information graphic].