Tag Archives: evaluation

Should the Pie Chart Be Retired?

The ability to create and interpret visual representations has been an important part of the human experience since we began drawing on cave walls at Chauvet.

Today, that ability—what I call visualcy—has even greater importance.  We use visuals to discover how the world works, communicate our discoveries, plan efforts to improve the world, and document the success of our efforts.

In short, visualcy affects every aspect of program design and evaluation.

The evolution of our common visual language, sadly, has been shaped by the default settings of popular software, the norms of the conference room, and the desire to attract attention.  It is not a language constructed to advance our greater purposes.  In fact, much of our common language works against our greater purposes.

An example of a counterproductive element of our visual language is the pie chart.

Consider this curious example from the New York Times Magazine (1/15/2012).

This pie chart has a humble purpose—to summarize reader responses to an article on obesity in the US.  It fails that purpose stunningly.  Here are some reasons why.

(1) Three-dimensionality reduces accuracy: Not only are 3-D graphs harder to read accurately, but popular software can construct them inaccurately.  The problem—for eye and machine—arises from the translation of values in 1-D or 2-D space into values in 3-D space.  This is a substantial problem with pie charts (imagine computing the area of a pie slice while taking its 3-D perspective into account) as well as other types of graphs.  Read Stephanie Evergreen’s blog post on the perils of 3-D to see a good example.
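For readers who like to see the geometry, here is a minimal sketch (my own, not taken from the NYT graphic or any particular charting package) of one way the 3-D translation misleads the eye.  Readers judge pie slices largely by angle, and under the simple orthographic tilt that most 3-D pie renderings approximate, two slices representing the same 10% of the whole can span very different apparent angles depending on where they sit around the ellipse.

```python
import math

def apparent_angle(theta_deg, tilt_deg):
    """Apparent polar angle of a direction on the pie after the pie is
    tilted: the y-axis is squashed by cos(tilt), which is roughly what
    a 3-D pie rendering does to the circle."""
    t = math.radians(theta_deg)
    squash = math.cos(math.radians(tilt_deg))
    return math.degrees(math.atan2(math.sin(t) * squash, math.cos(t)))

def apparent_extent(start_deg, end_deg, tilt_deg):
    """Apparent angular width of a slice spanning [start_deg, end_deg]."""
    return apparent_angle(end_deg, tilt_deg) - apparent_angle(start_deg, tilt_deg)

tilt = 60  # a strong but not unusual 3-D tilt

# Two slices that each hold exactly 10% of the pie (36 degrees of arc):
at_3_oclock = apparent_extent(-18, 18, tilt)   # centered on the horizontal axis
at_12_oclock = apparent_extent(72, 108, tilt)  # centered on the squashed vertical axis

print(f"10% slice at 3 o'clock looks like {at_3_oclock:.0f} degrees")    # ~18
print(f"10% slice at 12 o'clock looks like {at_12_oclock:.0f} degrees")  # ~66
```

Under a pure orthographic tilt the projected areas of the slices stay proportional to the data, so it is the distorted angles (plus the extra rim that 3-D renderings draw on the near slices) that throw the eye off.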

(2) Pie charts impede comparisons: People have trouble comparing pie slices by eye.  Think you can?  Here is a simple pie chart I constructed from the data in the NYT Magazine graph.  Which slice is larger—the orange or the blue?

The same data presented as a bar chart is much clearer.

Note that the Y axis ranges from 0% to 100%.  That is what makes the bar chart a substitute for the pie chart.  Sometimes the Y axis is truncated innocently to save column inches or intentionally to create a false impression, like this:

Truncating the axis exaggerates differences and makes large values seem closer to 100% than they really are.  Don’t do this.
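If you want to see the difference for yourself, here is a minimal matplotlib sketch.  The category labels and percentages are placeholders of my own, not the NYT figures; the point is only to draw the same values three ways: as a pie, as a bar chart with the full 0% to 100% axis, and as a bar chart with a truncated axis that exaggerates the differences.

```python
import matplotlib.pyplot as plt

# Hypothetical response percentages -- placeholders, not the actual NYT figures.
labels = ["Response A", "Response B", "Response C", "Response D"]
values = [34, 28, 22, 16]  # must sum to 100 for the pie

fig, (ax_pie, ax_full, ax_trunc) = plt.subplots(1, 3, figsize=(12, 4))

# The pie: readers must compare slices by angle and area, which is hard by eye.
ax_pie.pie(values, labels=labels)
ax_pie.set_title("Pie chart")

# Bar chart with the full 0-100% axis: a fair substitute for the pie.
ax_full.bar(labels, values)
ax_full.set_ylim(0, 100)
ax_full.set_ylabel("Percent of responses")
ax_full.set_title("Bar chart, 0-100% axis")

# The same bars with a truncated axis: differences look bigger than they are.
ax_trunc.bar(labels, values)
ax_trunc.set_ylim(15, 35)
ax_trunc.set_title("Truncated axis (don't do this)")

fig.tight_layout()
plt.show()
```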

(3) The visual theme is distracting: I suspect the NYT Magazine graph is intended to look like some sort of food.  Pieces of a pie? Cake? Cheese?  It doesn’t work.  This does.

Unless you are evaluating the Pillsbury Bake-Off, however, it is probably not an appropriate theme.

(4) Visual differentiators add noise: Graphs must often differentiate elements. A classic example is differentiating treatment and control group averages using bars of different colors.  In the NYT Magazine pie chart, the poor choice of busy patterns makes it very difficult to differentiate one piece of the pie from another.  The visual chaos is reminiscent of the results of a “poll” of Iraqi voters presented by the Daily Show in which a very large number of parties purportedly held almost equal levels of support.

(5) Data labels add more noise: Data labels can increase clarity.  In this case, however, the swarm of curved arrows connecting labels to pieces of the pie adds to the visual chaos.  Even this tangle of labels is better because readers instantly understand that Iraq received a disproportionate amount of the aid provided to many countries.

Do you think I made up these reasons?   Then read this report by RAND that investigated graph comprehension using experimental methods.  Here is a snippet from the abstract:

We investigated whether the type of data display (bar chart, pie chart, or table) or adding a gratuitous third dimension (shading to give the illusion of depth) affects the accuracy of answers of questions about the data. We conducted a randomized experiment with 897 members of the American Life Panel, a nationally representative US web survey panel. We found that displaying data in a table lead [sic] to more accurate answers than the choice of bar charts or pie charts. Adding a gratuitous third dimension had no effect on the accuracy of the answers for the bar chart and a small but significant negative effect for the pie chart.

There you have it—empirical evidence that it is time to retire the pie chart.

Alas, I doubt that the NYT Magazine, infographic designers, data viz junkies, or anyone with a reporting deadline will do that.  As every evaluator knows, it is far easier to present empirical evidence than respond to it.

5 Comments

Filed under Design, Evaluation, Visualcy

A National Holiday and US Postage Stamp for Evaluation

Evaluation is an invisible giant, a profession that impacts society on a grand scale yet remains unseen.  I want to change that.  But what can one person do to raise awareness about evaluation?

Go big.  And you can help.

A National Evaluation Holiday:  With the power vested in me by the greeting card industry, I am declaring…

February 15 is EVALentine’s Day!

This is a day when people around the world share their love of evaluation with each other.  Send a card, an email, or a copy of Guiding Principles for Evaluators to those near and dear, far and wide, internal and external.  Get the word out.  If the idea catches on, imagine how much exposure evaluation would receive.

A US Postage Stamp:  With the power vested in me by stamps.com, I have issued a US postage stamp for EVALentine’s Day.  Other holidays get stamps; why not ours?  The stamp I designed is based on the famous 1973 Valentine’s Day love stamp by Robert Indiana.  Now you can show your love of evaluation on the outside of an EVALentine card as well as the inside.

Here is the best part.

To kick off EVALentine’s Day, I will send an EVALentine’s card and a ready-to-use EVAL stamp to anyone, anywhere in the world.  For free.  Really.

Here is what you need to do.

(1) Visit the Gargani + Company website in the month of February.

(2) Click the Contact link in the upper right corner of the homepage.

(3) Send an email with EVALENTINE in the subject line and a SNAIL MAIL ADDRESS in the body.

(4) NOTE:  This offer is only valid for emails received during February 2012.

Don’t be left out on EVALentine’s Day.  Drop me an email and get the word out!

12 Comments

Filed under Evaluation, Program Evaluation

The Future of Evaluation: 10 Predictions

Before January comes to a close, I thought I would make a few predictions.  Ten to be exact.  That’s what blogs do in the new year, after all.

Rather than make predictions about what will happen this year—in which case I would surely be caught out—I make predictions about what will happen over the next ten years.  It’s safer that way, and more fun as I can set my imagination free.

My predictions are not based on my ideal future.  I believe that some of my predictions, if they came to pass, would present serious challenges to the field (and to me).  Rather, I take trends that I have noticed and push them out to their logical—perhaps extreme—conclusions.

In the next ten years…

(1) Most evaluations will be internal.

The growth of internal evaluation, especially in corporations adopting environmental and social missions, will continue.  Eventually, internal evaluation will overshadow external evaluation.  The job responsibilities of internal evaluators will expand and routinely include organizational development, strategic planning, and program design.  Advances in online data collection and real-time reporting will increase the transparency of internal evaluation, reducing the utility of external consultants.

(2) Evaluation reports will become obsolete.

After-the-fact reports will disappear entirely.  Results will be generated and shared automatically—in real time—with links to the raw data and documentation explaining methods, samples, and other technical matters.  A new class of predictive reports, preports, will emerge.  Preports will suggest specific adjustments to program operations that anticipate demographic shifts, economic shocks, and social trends.

(3) Evaluations will abandon data collection in favor of data mining.

Tremendous amounts of data are being collected in our day-to-day lives and stored digitally.  It will become routine for evaluators to access and integrate these data.  Standards will be established specifying the type, format, security, and quality of “core data” that are routinely collected from existing sources.  As in medicine, core data will represent most of the outcome and process measures that are used in evaluations.

(4) A national registry of evaluations will be created.

Evaluators will begin to record their studies in a central, open-access registry as a requirement of funding.  The registry will document research questions, methods, contextual factors, and intended purposes prior to the start of an evaluation.  Results will be entered or linked at the end of the evaluation.  The stated purpose of the database will be to improve evaluation synthesis, meta-analysis, meta-evaluation, policy planning, and local program design.  It will be the subject of prolonged debate.

(5) Evaluations will be conducted in more open ways.

Evaluations will no longer be conducted in silos.  Evaluations will be public activities that are discussed and debated before, during, and after they are conducted.  Social media, wikis, and websites will be re-imagined as virtual evaluation research centers in which like-minded stakeholders collaborate informally across organizations, geographies, and socioeconomic strata.

(6) The RFP will RIP.

The purpose of an RFP is to help someone choose the best service at the lowest price.  RFPs will no longer serve this purpose well because most evaluations will be internal (see 1 above), information about how evaluators conduct their work will be widely available (see 5 above), and relevant data will be immediately accessible (see 3 above).  Internal evaluators will simply drop their data—quantitative and qualitative—into competing analysis and reporting apps, and then choose the ones that best meet their needs.

(7) Evaluation theories (plural) will disappear.

Over the past 20 years, there has been a proliferation of theories intended to guide evaluation practice.  Over the next ten years, there will be a convergence of theories until one comprehensive, contingent, context-sensitive theory emerges.  All evaluators—quantitative and qualitative; process-oriented and outcome-oriented; empowerment and traditional—will be able to use the theory in ways that guide and improve their practice.

(8) The demand for evaluators will continue to grow.

The demand for evaluators has been growing steadily over the past 20 to 30 years.  Over the next ten years, the demand will keep growing rather than level off, fueled by the rise of internal evaluation (see 1 above) and the availability of data (see 3 above).

(9) The number of training programs in evaluation will increase.

There is a shortage of evaluation training programs in colleges and universities.  The shortage is driven largely by how colleges and universities are organized around disciplines.  Evaluation is typically found as a specialty within many disciplines in the same institution.  That disciplinary structure will soften and the number of evaluation-specific centers and training programs in academia will grow.

(10) The term evaluation will go out of favor.

The term evaluation sets the process of understanding a program apart from the process of managing a program.  Good evaluators have always worked to improve understanding and management.  When they do, they have sometimes been criticized for doing more than determining the merit of a program.  To more accurately describe what good evaluators do, evaluation will become known by a new name, such as social impact management.

…all we have to do now is wait ten years and see if I am right.

41 Comments

Filed under Design, Evaluation, Program Design, Program Evaluation

Evaluation Capacity Building at the African Evaluation Association Conference (#3)

From Tarek Azzam in Accra, Ghana: Yesterday was the first day of the AfrEA Conference and it was busy.  I, along with a group of colleagues, presented a workshop on developing evaluation capacity.  It was well attended—almost 60 people—and the discussion was truly inspiring.  Much of our conversation related to how development programs are typically evaluated by experts who are not only external to the organization, but external to the country.  Out-of-country evaluators typically know a great deal about evaluation, and often they do a fantastic job, but their cultural competencies vary tremendously, severely limiting the utility of their work.  When out-of-country evaluators complete their evaluations, they return home and their evaluation expertise leaves with them.  Our workshop participants said they wanted to build evaluation capacity in Africa for Africans because it was the best way to strengthen evaluations and programs.  So we facilitated a discussion of how to make that happen.

At first, the discussion was limited to what participants believed were the deficits of local African evaluators.  This continued until one attendee stood up and passionately described what local evaluators bring to an evaluation that is unique and advantageous.   Suddenly, the entire conversation turned around and participants began discussing how a deep understanding of local contexts, governmental systems, and history improves every step of the evaluation process, from the feasibility of designs to the use of results.  This placed the deficiencies of local evaluators listed previously—most of which were technical—in crisp perspective.  You can greatly advance your understanding of quantitative methods in a few months; you cannot expect to build a deep understanding of a place and its people in the same time.

The next step is to bring the conversation we had in the workshop to the wider AfrEA Conference.  I will begin that process in a panel discussion that takes place later today. My objective is to use the panel to develop a list of strategic principles that can guide future evaluation capacity building efforts. If the principles reflect the values, strengths, and knowledge of those who want to develop their capacity, then the principles can be used to design meaningful capacity building efforts.  It should be interesting—I will keep you posted.

Leave a comment

Filed under Conference Blog, Evaluation, Program Evaluation

The African Evaluation Association Conference Begins (#2)

From Tarek Azzam in Accra, Ghana: The last two days have been hectic on many fronts.  Matt and I spent approximately 4 hours on Monday trying to work out technical bugs.  Time well spent as it looks like we will be able to stream parts of the conference live.  You can find the schedule and links here.

I have had the chance to speak with many conference participants from across Africa at various social events.  In almost every conversation the same issue keeps emerging—the disconnect between what donors expect to see on the ground (and expect to be measured) and what grantees are actually seeing on the ground (and do not believe they can measure). Although this is a common issue in the US where I do much of my work, it appears to be more pronounced in the context of development programs.

This tension is a source of frustration for many of the people with whom I speak—they truly believe in the power of evaluation to improve programs, promote self-reflection, and achieve social change. However, demands from donors have pushed them to focus on evaluation questions and measures that are not necessarily useful to their programs or the people their programs benefit.  I am interested in speaking with some of the donors attending the conference to get their perspective on this issue. I believe that donors may be looking for impact measures that can be aggregated across multiple grantees, and this may lead to the selection of measures that are less relevant to any single grantee, hence the tension.

I plan on keeping you updated on further conversations and discussions as they occur.  Tomorrow I will help conduct a workshop on building evaluation capacity within Africa, engaging participants to help us develop a list of competencies and capacities that are uniquely relevant to the development/African context.  Based on the lively conversations I have had so far, I anticipate a rich and productive exchange of ideas tomorrow.  I will share them with you as soon as I can.

Leave a comment

Filed under Conference Blog, Evaluation, Program Evaluation

From the African Evaluation Association Conference (#1)

Hello, my name is Tarek Azzam, and I am an Assistant Professor at Claremont Graduate University.  Over the next few days I will blog about my experiences at the 6th Biennial AfrEA Conference in Accra, Ghana.  The theme of the conference is “Rights and Responsibility in Development Evaluation.”  As I write this, I await the start of the conference tomorrow, January 9.

The conference is hosted by the African Evaluation Association (AfrEA) and co-organized by the Ghana Monitoring & Evaluation Forum (GMEF).  For those who live or work outside of Africa, these may be unfamiliar organizations.  I encourage you to learn more about them and other evaluation associations around the world through the International Organisation for Cooperation in Evaluation (IOCE).

Ross Conner, Issaka Traore, Sulley Gariba, Marie Gervais, and I will present a half-day workshop on developing evaluation capacity within Africa, along with a panel discussion.

I am also working with Matt Galen to broadcast some of the conference’s keynote sessions over the internet and share them with others.  I will send links as they become available.

I am very excited about the start of the conference.  It is a new venue for me and I look forward to sharing my experiences with you.

Leave a comment

Filed under Conference Blog, Evaluation, Program Evaluation

Evaluation: An Invisible Giant

When I am asked what I do for a living, I expect that it might take a little explaining.  Most people are unaware of program evaluation, including many who work for organizations that implement programs.

My short answer is that I help clients—nonprofit organizations, foundations, corporations, museums, and schools—determine how effective they are and how they can be more effective.   Often this leads to more questions and longer conversations that I quite enjoy, yet I am left wondering why evaluation is so little known given the size of the field.

How big is the field of evaluation?  Ironically, that is not a statistic that anyone tracks.  To get a handle on it, consider the nonprofit sector, which is closely associated with programs intended to further a social mission.

According to the Urban Institute, there were roughly 1.5 million nonprofit organizations in the United States in 2011, up by 25% over the preceding 10 years.  In 2010, nonprofit organizations produced products and services worth roughly $779 billion, or 5.4% of GDP.  As a point of comparison, that is more than the US spends on its military, which accounts for only 4.7% of GDP.

Nonprofits, however, are not the only organizations that implement programs.  Universities, public school systems, government agencies, hospitals, and a growing number of for-profit companies do so as well.  Taking into account all organizations that implement programs—what Paul Light calls social benefit organizations—would easily double or triple our prior estimate, which was based on nonprofit organizations alone.  That means the goods and services produced by the social benefit sector could be on par with those of healthcare—a whopping 16% of GDP.
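For readers who want to check the arithmetic, here is the back-of-envelope calculation behind those figures.  It is a rough sketch: the doubling-or-tripling multiplier is the speculative part, and the healthcare comparison uses the 16% figure cited above.

```python
# Back-of-envelope arithmetic behind the estimate above.
nonprofit_output = 779e9   # dollars of goods and services, 2010 (Urban Institute)
nonprofit_share = 0.054    # 5.4% of GDP

gdp = nonprofit_output / nonprofit_share
print(f"Implied 2010 GDP: ${gdp / 1e12:.1f} trillion")  # roughly $14.4 trillion

# If counting all social benefit organizations doubles or triples the estimate:
for multiplier in (2, 3):
    share = nonprofit_share * multiplier
    print(f"x{multiplier}: about {share:.1%} of GDP")
# The tripled figure (~16%) is what puts the sector on par with healthcare.
```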

Who figures out whether that whopping slice of GDP is benefiting society?  Who helps design the programs represented by that slice?  Who works to build the capacity of social benefit organizations to achieve their missions?  Countless evaluators.  Yet, program evaluation remains hidden from public view, an invisible giant unnoticed by most.  Isn’t it time that changed?

6 Comments

Filed under Evaluation, Program Evaluation

Santa Cause

I’ve been reflecting on the past year.  What sticks in my mind is how fortunate I am to spend my days working with people who have a cause.  Some promote their causes narrowly, for example, by ensuring that education better serves a group of children or that healthcare is available to the poorest families in a region.  Others pursue causes more broadly, advocating for human rights and social justice.  In the past, both might have been labeled impractical dreamers, utopian malcontents, or, worse, risks to national security.  Yet today they are respected professionals, envied even by those who have achieved great success in more traditional, profit-motivated endeavors.  That’s truly progress.

I also spend a great deal of time buried in the technical details of evaluation—designing research, developing tests and surveys, collecting data, and performing statistical analysis—so I sometimes lose sight of the spirit that animates the causes I serve.  However, it isn’t long before I’m led back to the professionals who, even after almost 20 years, continue to inspire me.  I can’t wait to spend another year working with them.

The next year promises to be more inspiring than ever, and I look forward to sharing my work, my thoughts, and the occasional laugh with all of you in the new year.

Best wishes to all.

John

1 Comment

Filed under Commentary, Evaluation, Gargani News, Program Evaluation

The AEA Conference (So Far)

The AEA conference has been great. I have been very impressed with the presentations that I have attended so far, though I can’t claim to have seen the full breadth of what is on offer as there are roughly 700 presentations in total.  Here are a few that impressed me the most.  Continue reading

1 Comment

Filed under AEA Conference, Evaluation Quality, Program Evaluation

AEA 2010 Conference Kicks Off in San Antonio

In the opening plenary of the Evaluation 2010 conference, AEA President Leslie Cooksy invited three leaders in the field—Eleanor Chelimsky, Laura Leviton, and Michael Patton—to speak on The Tensions Among Evaluation Perspectives in the Age of Obama: Influences on Evaluation Quality, Thinking and Values.  They covered topics ranging from how government should use evaluation information to how Jon Stewart of the Daily Show outed himself as an evaluator during his Rally to Restore Sanity/Fear (“I think you know that the success or failure of a rally is judged by only two criteria: the intellectual coherence of the content, and its correlation to the engagement—I’m just kidding.  It’s color and size.  We all know it’s color and size.”)

One piece that resonated with me was Laura Leviton’s discussion of how the quality of an evaluation is related to our ability to apply its results to future programs—what is referred to as generalization.  She presented a graphic that described a possible process for generalization that seemed right to me; it’s what should happen.  But how it happens was not addressed, at least in the short time in which she spoke.  It is no small task to gather prior research and evaluation results, translate them into a small theory of improvement (a program theory), and then adapt that theory to fit specific contexts, values, and resources.  Who should be doing that work?  What are the features that might make it more effective?

Stewart Donaldson and I recently co-authored a paper on that topic that will appear in New Directions for Evaluation in 2011.  We argue that stakeholders are and should be doing this work, and we explore how the logic underlying traditional notions of external validity—considered by some to be outdated—can be built upon to create a relatively simple, collaborative process for predicting the future results of programs.  The paper is a small step toward raising the discussion of external validity (how we judge whether a program will work in the future) to the same level as the discussion of internal validity (how we judge whether a program worked in the past), while trying to avoid the rancor that has been associated with the latter.

More from the conference later.

1 Comment

Filed under AEA Conference, Evaluation Quality, Gargani News, Program Evaluation