To my mind, surfing hit its peak in the 1950s when relatively light longboards first became available.
Enthusiastic longboarders still ride the waves, of course, but their numbers have dwindled as shorter, more maneuverable boards became fashionable. Happily, longboards are now making a comeback, mostly because they possess a property that shortboards do not: stability. With a stable board, novices can quickly experience the thrill of the sport, and experts can show off skills like nose walks, drag turns, and tandem riding that are unthinkable on today’s light-as-air shortboards.
The new longboards are different — and, I think, better — because their designs take advantage of modern materials and are more affordable and easier to handle than their predecessors. It just goes to show that everything old becomes new again, and with renewed interest comes the opportunity for improvement.
The same can be said for randomized trials (RTs). They were introduced to the wider field of social sciences in the 1930s, about the time that surfing was being introduced outside of Hawaii. RTs became popular through the 1950s, at least in concept, because they can be challenging and expensive to implement in practice. During the 1960s, ’70s, and ’80s, RTs were supplanted by simpler and cheaper types of evaluation. But a small and dedicated cadre of evaluators stuck with RTs because of a property that no other form of evaluation has: strong internal validity. RTs make it possible to ascertain with a high degree of certainty — higher than any other type of evaluation — whether a program made a difference.
An RT entails only three steps: selection, assignment, and measurement.
· Selection is nothing more than identifying a group of people who will participate in a study. The way in which program participants are selected has a direct bearing on answering the question, “Will it work again?”
· Assignment is how the researcher “assigns” experiences to participants. What defines an RT is the fact that participants are assigned to at least two groups, one that experiences the program (the treatment group) and another that does not (the control group), and that the assignments are made at random. In essence this means pulling the names of those assigned to the treatment group out of a hat, although in actuality the assignments are made using more modern methods.
· Measurement, at a minimum, takes place after the program has concluded, and as the name implies, it entails evaluators measuring something about program participants, organizations, society, or other entities that the program was intended to benefit.
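To make the assignment step concrete, here is a minimal sketch in Python of the “modern” equivalent of pulling names out of a hat. The function name and the even treatment/control split are illustrative assumptions, not part of any standard evaluation toolkit:

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participant list and split it into two groups.

    This is a simple stand-in for drawing names from a hat: every
    participant has the same chance of landing in either group.
    The optional seed makes the assignment reproducible for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant roster for illustration
names = ["Ana", "Ben", "Cho", "Dee", "Eli", "Fay"]
groups = randomly_assign(names, seed=42)
print(groups["treatment"])
print(groups["control"])
```

Because chance alone decides who lands in each group, any pre-existing differences between the groups are, on average, balanced out, which is what gives an RT its strong internal validity.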
RTs are not the only way to evaluate a program. And they are not the only good way to evaluate a program. Ideally, an evaluation is designed to answer specific questions that are of interest to clients, and to do so with the highest level of certainty given the program’s stage of development, the specific context of the program, and available resources.
When a program is relatively stable (not undergoing rapid development or experiencing other types of change) and the question of interest is “Did we make a difference?” RTs can provide policymakers and program staff with a powerful, credible answer. Like riders of modern longboards, you might get a few stares if you implement an RT, not because what you are doing seems odd but because with an RT you can do things — and learn things — that others cannot.