Tag Archives: evaluations

The Future of Evaluation: 10 Predictions

Before January comes to a close, I thought I would make a few predictions.  Ten to be exact.  That’s what blogs do in the new year, after all.

Rather than predict what will happen this year—in which case I would surely be caught out—I predict what will happen over the next ten years.  It’s safer that way, and more fun because I can set my imagination free.

My predictions are not based on my ideal future.  I believe that some of my predictions, if they came to pass, would present serious challenges to the field (and to me).  Rather, I take trends that I have noticed and push them out to their logical—perhaps extreme—conclusions.

In the next ten years…

(1) Most evaluations will be internal.

The growth of internal evaluation, especially in corporations adopting environmental and social missions, will continue.  Eventually, internal evaluation will overshadow external evaluation.  The job responsibilities of internal evaluators will expand and routinely include organizational development, strategic planning, and program design.  Advances in online data collection and real-time reporting will increase the transparency of internal evaluation, reducing the utility of external consultants.

(2) Evaluation reports will become obsolete.

After-the-fact reports will disappear entirely.  Results will be generated and shared automatically—in real time—with links to the raw data and documentation explaining methods, samples, and other technical matters.  A new class of predictive reports, preports, will emerge.  Preports will suggest specific adjustments to program operations that anticipate demographic shifts, economic shocks, and social trends.

(3) Evaluations will abandon data collection in favor of data mining.

Tremendous amounts of data are being collected in our day-to-day lives and stored digitally.  It will become routine for evaluators to access and integrate these data.  Standards will be established specifying the type, format, security, and quality of “core data” that are routinely collected from existing sources.  As in medicine, core data will represent most of the outcome and process measures that are used in evaluations.

(4) A national registry of evaluations will be created.

Evaluators will begin to record their studies in a central, open-access registry as a requirement of funding.  The registry will document research questions, methods, contextual factors, and intended purposes prior to the start of an evaluation.  Results will be entered or linked at the end of the evaluation.  The stated purpose of the database will be to improve evaluation synthesis, meta-analysis, meta-evaluation, policy planning, and local program design.  It will be the subject of prolonged debate.

(5) Evaluations will be conducted in more open ways.

Evaluations will no longer be conducted in silos.  Evaluations will be public activities that are discussed and debated before, during, and after they are conducted.  Social media, wikis, and websites will be re-imagined as virtual evaluation research centers in which like-minded stakeholders collaborate informally across organizations, geographies, and socioeconomic strata.

(6) The RFP will RIP.

The purpose of an RFP is to help someone choose the best service at the lowest price.  RFPs will no longer serve this purpose well because most evaluations will be internal (see 1 above), information about how evaluators conduct their work will be widely available (see 5 above), and relevant data will be immediately accessible (see 3 above).  Internal evaluators will simply drop their data—quantitative and qualitative—into competing analysis and reporting apps, and then choose the ones that best meet their needs.

(7) Evaluation theories (plural) will disappear.

Over the past 20 years, there has been a proliferation of theories intended to guide evaluation practice.  Over the next ten years, there will be a convergence of theories until one comprehensive, contingent, context-sensitive theory emerges.  All evaluators—quantitative and qualitative; process-oriented and outcome-oriented; empowerment and traditional—will be able to use the theory in ways that guide and improve their practice.

(8) The demand for evaluators will continue to grow.

The demand for evaluators has been growing steadily over the past 20 to 30 years.  Over the next ten years, demand will continue to grow rather than level off, driven by the growth of internal evaluation (see 1 above) and the availability of data (see 3 above).

(9) The number of training programs in evaluation will increase.

There is a shortage of evaluation training programs in colleges and universities.  The shortage is driven largely by how colleges and universities are organized around disciplines.  Evaluation is typically found as a specialty within many disciplines in the same institution.  That disciplinary structure will soften and the number of evaluation-specific centers and training programs in academia will grow.

(10) The term evaluation will go out of favor.

The term evaluation sets the process of understanding a program apart from the process of managing a program.  Good evaluators have always worked to improve understanding and management.  When they do, they have sometimes been criticized for doing more than determining the merit of a program.  To more accurately describe what good evaluators do, evaluation will become known by a new name, such as social impact management.

…all we have to do now is wait ten years and see if I am right.

41 Comments

Filed under Design, Evaluation, Program Design, Program Evaluation

The African Evaluation Association Conference Begins (#2)

From Tarek Azzam in Accra, Ghana: The last two days have been hectic on many fronts.  Matt and I spent approximately 4 hours on Monday trying to work out technical bugs.  Time well spent as it looks like we will be able to stream parts of the conference live.  You can find the schedule and links here.

I have had the chance to speak with many conference participants from across Africa at various social events.  In almost every conversation the same issue keeps emerging—the disconnect between what donors expect to see on the ground (and expect to be measured) and what grantees are actually seeing on the ground (and do not believe they can measure). Although this is a common issue in the US where I do much of my work, it appears to be more pronounced in the context of development programs.

This tension is a source of frustration for many of the people with whom I speak—they truly believe in the power of evaluation to improve programs, promote self-reflection, and achieve social change. However, demands from donors have pushed them to focus on evaluation questions and measures that are not necessarily useful to their programs or the people their programs benefit.  I am interested in speaking with some of the donors attending the conference to get their perspective on this issue. I believe that donors may be looking for impact measures that can be aggregated across multiple grantees, and this may lead to the selection of measures that are less relevant to any single grantee, hence the tension.

I plan on keeping you updated on further conversations and discussions as they occur. Tomorrow I will help conduct a workshop on building evaluation capacity within Africa, engaging participants as they help us develop a list of competencies and capacities that are uniquely relevant to the development/African context. Based on the lively conversations I have had so far, I anticipate a rich and productive exchange of ideas tomorrow.  I will share them with you as soon as I can.

Leave a comment

Filed under Conference Blog, Evaluation, Program Evaluation

Santa Cause

I’ve been reflecting on the past year.  What sticks in my mind is how fortunate I am to spend my days working with people who have a cause.  Some promote their causes narrowly, for example, by ensuring that education better serves a group of children or that healthcare is available to the poorest families in a region.  Others pursue causes more broadly, advocating for human rights and social justice.  In the past, both might have been labeled impractical dreamers, utopian malcontents, or, worse, risks to national security.  Yet today they are respected professionals, envied even by those who have achieved great success in more traditional, profit-motivated endeavors.  That’s truly progress.

I also spend a great deal of time buried in the technical details of evaluation—designing research, developing tests and surveys, collecting data, and performing statistical analysis—so I sometimes lose sight of the spirit that animates the causes I serve.  However, it isn’t long before I’m led back to the professionals who, even after almost 20 years, continue to inspire me.  I can’t wait to spend another year working with them.

The next year promises to be more inspiring than ever, and I look forward to sharing my work, my thoughts, and the occasional laugh with all of you in the new year.

Best wishes to all.

John

1 Comment

Filed under Commentary, Evaluation, Gargani News, Program Evaluation

The AEA Conference (So Far)

The AEA conference has been great. I have been very impressed with the presentations that I have attended so far, though I can’t claim to have seen the full breadth of what is on offer as there are roughly 700 presentations in total.  Here are a few that impressed me the most.  Continue reading

1 Comment

Filed under AEA Conference, Evaluation Quality, Program Evaluation

Quality is a Joke

If you have been following my blog (Who hasn’t?), you know that I am writing on the topic of evaluation quality, the theme of the 2010 annual conference of the American Evaluation Association taking place November 10-13. It is a serious subject. Really.

But here is a joke, though perhaps only the evaluarati (you know who you are) will find it amusing.

    A quantitative evaluator, a qualitative evaluator, and a normal person are waiting for a bus. The normal person suddenly shouts, “Watch out, the bus is out of control and heading right for us! We will surely be killed!”

    Without looking up from his newspaper, the quantitative evaluator calmly responds, “That is an awfully strong causal claim you are making. There is anecdotal evidence to suggest that buses can kill people, but the research does not bear it out. People ride buses all the time and they are rarely killed by them. The correlation between riding buses and being killed by them is very nearly zero. I defy you to produce any credible evidence that buses pose a significant danger. It would really be an extraordinary thing if we were killed by a bus. I wouldn’t worry.”

    Dismayed, the normal person starts gesticulating and shouting, “But there is a bus! A particular bus! That bus! And it is heading directly toward some particular people! Us! And I am quite certain that it will hit us, and if it hits us it will undoubtedly kill us!”

    At this point the qualitative evaluator, who was observing this exchange from a safe distance, interjects, “What exactly do you mean by bus? After all, we all construct our own understanding of that very fluid concept. For some, the bus is a mere machine, for others it is what connects them to their work, their school, the ones they love. I mean, have you ever sat down and really considered the bus-ness of it all? It is quite immense, I assure you. I hope I am not being too forward, but may I be a critical friend for just a moment? I don’t think you’ve really thought this whole bus thing out. It would be a pity to go about pushing the sort of simple linear logic that connects something as conceptually complex as a bus to an outcome as one dimensional as death.”

    Very dismayed, the normal person runs away screaming; the bus collides with the quantitative and qualitative evaluators, killing both instantly.

    Very, very dismayed, the normal person begins pleading with a bystander, “I told them the bus would kill them. The bus did kill them. I feel awful.”

    To which the bystander replies, “Tut tut, my good man. I am a statistician, and I can tell you for a fact: with a sample size of 2 and no proper control group, how could we possibly conclude that it was the bus that did them in?”

To the extent that this is funny (I find it hilarious, but I am afraid that I may share Sir Isaac Newton’s sense of humor), it is because it plays on our stereotypes about the field. Quantitative evaluators are branded as aloof, overly logical, obsessed with causality, and too concerned with general rather than local knowledge. Qualitative evaluators, on the other hand, are suspect because they are supposedly motivated by social interaction, overly intuitive, obsessed with description, and too concerned with local knowledge. And statisticians are often looked upon as the referees in this cat-and-dog world, charged with setting up and arbitrating the rules by which evaluators in both camps must (or must not) play.

The problem with these stereotypes, like all stereotypes, is that they are inaccurate. Yet we cling to them and make judgments about evaluation quality based upon them. But what if we shift our perspective to that of the (tongue in cheek) normal person? This is not an easy thing to do if, like me, you spend most of your time inside the details of the work and the debates of the profession. Normal people want to do the right thing, feel the need to act quickly to make things right, and hope to be informed by evaluators and others who support their efforts. Sometimes normal people are responsible for programs that operate in particular local contexts, and at other times they are responsible for policies that affect virtually everyone. How do we help normal people get what they want and need?

I have been arguing that we should, and when we do we have met one of my three criteria for quality—satisfaction. The key is first to acknowledge that we serve others, and then to do our best to understand their perspective. If we are weighed down by the baggage of professional stereotypes, it can prevent us from choosing well from among all the ways we can meet the needs of others. I suppose that stereotypes can be useful when they help us laugh at ourselves, but if we come to believe them, our practice can become unaccommodatingly narrow and the people we serve—normal people—will soon begin to run away (screaming) from us and the field. That is nothing to laugh at.

8 Comments

Filed under Evaluation, Evaluation Quality, Program Evaluation

The Laws of Evaluation Quality

It has been a while since I blogged, but I was inspired to give it another go by Evaluation 2010, the upcoming annual conference of the American Evaluation Association (November 10-13 in San Antonio, Texas).  The conference theme is Evaluation Quality, something I think about constantly.  There is a great deal packed into those two words, and my blog will be dedicated to unpacking them as we lead up to the November AEA conference.  To kick off that effort, I present a few lighthearted “Laws of Evaluation Quality” that I have stumbled upon over the years.  They poke fun at many of the serious issues that make ensuring the quality of an evaluation a challenge, issues I will take up in the coming months.  Enjoy.

Stakeholder’s First Law of Evaluation Quality
The quality of an evaluation is directly proportional to the number of positive findings it contains.

Corollary to Stakeholder’s First Law
A program evaluation is an evaluation that supports my program.

The Converse to Stakeholder’s First Law
The number of flaws in an evaluation’s research design increases without limit with the number of null or negative findings it contains.

Corollary to the Converse of Stakeholder’s First Law
Everyone is a methodologist when their dreams are crushed.

Academic’s First Law of Evaluation Quality
Evaluations are done well if and only if they cite my work.

Corollary to Academic’s First Law
My evaluations are always done well.

Academic’s Lemma
The ideal ratio of publications to evaluations is undefined.

Student’s First Law of Evaluation Quality
The quality of any given evaluation is wholly dependent on who is teaching the class.

Student’s Razor
Evaluation theories should not be multiplied beyond necessity.

Student’s Reality
Evaluation theories will be multiplied far beyond necessity in every written paper, graduate seminar, evaluation practicum, and evening of drinking.

Evaluator’s Conjecture
The quality of any evaluation is perfectly predicted by the brevity of the client’s initial description of the program.

Evaluator’s Paradox
The longer it takes a grant writer to contact an evaluator, the more closely the proposed evaluation approaches a work of fiction and the more likely it will be funded.

Evaluator’s Order Statistic
Evaluation is always the last item on the meeting agenda unless you are being fired.

Funder’s Principle of Same Boated-ness
During the proposal process, the quality of a program is suspect.  Upon acceptance, it is evidence of the funder’s social impact.

Corollary to Funder’s Principle
Good evaluations don’t rock the boat.

Funder’s Paradox
When funders request an evaluation that is rigorous, sophisticated, or scientific, they are less likely to read it yet more likely to believe it—regardless of its actual quality.

7 Comments

Filed under Evaluation, Evaluation Quality

It’s a Gift to Be Simple


Theory-based evaluation acknowledges that, intentionally or not, all programs depend on the beliefs influential stakeholders have about the causes and consequences of effective social action. These beliefs are what we call theories, and they guide us when we design, implement, and evaluate programs.

Theories live (imperfectly) in our minds. When we want to clarify them for ourselves or communicate them to others, we represent them as some combination of words and pictures. A popular representation is the ubiquitous logic model, which typically takes the form of box-and-arrow diagrams or relational matrices.
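
To make that concrete, here is a toy sketch of my own (nothing standard, and the tutoring program it describes is entirely invented) showing how the boxes and arrows of a very simple logic model might be written down in a few lines of Python rather than drawn.

    # A toy logic model for an invented tutoring program, written as a small
    # data structure instead of a box-and-arrow diagram.
    logic_model = {
        "inputs":     ["funding", "volunteer tutors"],
        "activities": ["weekly tutoring sessions"],
        "outputs":    ["students tutored each semester"],
        "outcomes":   ["improved reading scores"],
    }

    # The "arrows": each column of boxes is assumed to lead to the next.
    arrows = [("inputs", "activities"), ("activities", "outputs"), ("outputs", "outcomes")]

    for source, target in arrows:
        print(" + ".join(logic_model[source]), "->", " + ".join(logic_model[target]))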

The common wisdom is that developing a logic model helps program staff and evaluators develop a better understanding of a program, which in turn leads to more effective action.

Not to put too fine a point on it, this last statement is a representation of a theory of logic models. I represented the theory with words, which have their limits; a different form of representation might reveal, hide, or distort other aspects of the theory. In this case, my theory is simple and my representation is simple, so you quickly get the gist of my meaning. Simplicity has its virtues.

It also has its perils. A chief criticism of logic models is that they fail to promote effective action because they are vastly too simple to represent the complexity inherent in a program, its participants, or its social value. This criticism has become more vigorous over time and deserves attention. In considering it, however, I find myself drawn to the other side of the argument, not because I am especially wedded to logic models, but rather to defend the virtues of simplicity. Continue reading

Leave a comment

Filed under Commentary, Evaluation, Program Evaluation

The Most Difficult Part of Science


I recently participated in a panel discussion at the annual meeting of the California Postsecondary Education Commission (CPEC) for recipients of Improving Teacher Quality Grants.  We were discussing the practical challenges of conducting what has been dubbed scientifically-based research (SBR).  While there is some debate over what types of research should fall under this heading, SBR almost always includes randomized trials (experiments) and quasi-experiments (close approximations to experiments) that are used to establish whether a program made a difference. 

SBR is a hot topic because it has found favor with a number of influential funding organizations.  Perhaps the most famous example is the US Department of Education, which vigorously advocates SBR and at times has made it a requirement for funding.  The push for SBR is part of a larger, longer-term trend in which funders have been seeking greater certainty about the social utility of programs they fund.

However, SBR is not the only way to evaluate whether a program made a difference, and not all evaluations set out to do so (as is the case with needs assessment and formative evaluation).  At the same time, not all evaluators want to or can conduct randomized trials.  Consequently, the push for SBR has sparked considerable debate in the evaluation community. Continue reading

1 Comment

Filed under Commentary, Evaluation, Program Evaluation

Theory Building and Theory-Based Evaluation


When we are convinced of something, we believe it. But when we believe something, we may not have been convinced. That is, we do not come by all our beliefs through conscious acts of deliberation. It’s a good thing, too, for if we examined the beliefs underlying our every action we wouldn’t get anything done.

When we design or evaluate programs, however, the beliefs underlying these actions do merit close examination. They are our rationale, our foothold in the invisible; they are what endow our struggle to change the world with possibility. Continue reading

2 Comments

Filed under Commentary, Design, Evaluation, Program Design, Program Evaluation, Research

Randomized Trials: Old School, New Trend


To my mind, surfing hit its peak in the 1950s when relatively light longboards first became available.

Enthusiastic longboarders still ride the waves, of course, but their numbers have dwindled as shorter, more maneuverable boards have become more fashionable. Happily, longboards are now making a comeback, mostly because they possess a property that shortboards do not: stability. With a stable board, novices can quickly experience the thrill of the sport and experts can show off skills like nose walks, drag turns, and tandem riding that are unthinkable using today’s light-as-air shortboards.

The new longboards are different — and, I think, better — because their designs take advantage of modern materials and are more affordable and easier to handle than their predecessors. It just goes to show that everything old becomes new again, and with renewed interest comes the opportunity for improvement.

The same can be said for randomized trials (RTs). They were introduced to the wider field of social sciences in the 1930s, about the time that surfing was being introduced outside of Hawaii. RTs became popular through the 1950s, at least in concept; in practice they can be challenging and expensive to implement. During the 60s, 70s, and 80s, RTs were supplanted by simpler and cheaper types of evaluation. But a small and dedicated cadre of evaluators stuck with RTs because of a property that no other form of evaluation has: strong internal validity. RTs make it possible to ascertain with a high degree of certainty — higher than any other type of evaluation — whether a program made a difference. Continue reading
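
For readers who like to see the idea in miniature, here is a toy sketch of my own (not drawn from the post) showing the bare logic of an RT in a few lines of Python: assign people to the program by chance, then compare the average outcomes of the two groups. The scores and the five-point program effect are invented for the illustration.

    # A toy randomized trial: all numbers are invented, and the "program"
    # is assumed to add about five points to a hypothetical outcome score.
    import random

    random.seed(1)

    def outcome_score(treated):
        baseline = random.gauss(50, 10)          # made-up baseline score
        return baseline + (5 if treated else 0)  # assumed program effect

    # Random assignment: chance, not choice, decides who gets the program.
    people = list(range(200))
    random.shuffle(people)
    treatment, control = people[:100], people[100:]

    treatment_scores = [outcome_score(True) for _ in treatment]
    control_scores = [outcome_score(False) for _ in control]

    # Because assignment was random, the difference in group means can be
    # read as an estimate of the program's effect.
    effect = (sum(treatment_scores) / len(treatment_scores)
              - sum(control_scores) / len(control_scores))
    print(f"Estimated program effect: {effect:.1f} points")

A real analysis would be more careful than this, of course, but the core idea is the same: randomization is what licenses reading that difference as the program’s effect.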

2 Comments

Filed under Commentary, Evaluation, Program Evaluation, Research