Given the good contributions of the DAC criteria noted in Part 1 of this series, let us now turn to why they need to be rethought. Based on my experience, I draw the following conclusions about requirements for evaluation criteria suitable for (sustainable) development contexts.
We need a set of criteria that ensures that we actually evaluate contributions to ‘development’. As a set, the DAC criteria are necessary but not sufficient to determine the worth, significance or merit of an intervention for development. Important aspects are absent, creating a false sense of confidence about the extent to which they can lead to a sound, defensible synthesis judgment about contributions to development.
This argument will be elaborated in a later post. Suffice it to say that at least coherence and synergy (or complementarity) have to be added as criteria, and impact and sustainability have to be connected and carry a high and equal weight in any judgment about success. This issue is of particular importance in the Global South – another discussion for a later post in the series.
We need a set of criteria that directs, yet is flexible and encourages creative thinking about what is essential to evaluate. In development evaluation the DAC criteria are used to direct the evaluation questions, rather than the other way around. A too-limited and pre-determined set of criteria, used without careful justification, can make us lazy and stifle original thinking about priorities for evaluation. This can easily lead to the omission of important evaluation questions, and prevent in-depth insights or innovations based on fresh thinking about evaluation criteria and questions.
We need evaluation criteria with nuanced definitions and descriptions. Later versions of the definitions or interpretations of the criteria tend to be more nuanced. Compare, for example, Efficiency, Impact and Sustainability in this version with this one, the more detailed descriptions by UNDP, or the criteria adjusted by ALNAP in 2006 for use in the humanitarian sector. However, improved practice does not necessarily follow across organisations, as commissioners do not necessarily track or engage with such improvements. It is equally unlikely that evaluators do such tracking, or that they negotiate with commissioners to ensure that such improvements are included in terms of reference.
We need an approach that facilitates, even compels synthesis judgments across criteria. We seldom discuss the conceptual and practical strengths and weaknesses of the DAC criteria for synthesis judgments about development. Making a synthesis judgment in development contexts is often technically and politically challenging, but this does not remove our responsibility to engage with the matter. Our failure to do so weakens efforts at a more holistic understanding of development. It inhibits debate and learning about priorities, trade-offs and influences.
In a very useful book, Tom Schwandt notes four ways in which this “Achilles heel” of evaluation practice can be addressed: through rule-governed, algorithmic and rubric-based approaches; intuition-based holistic approaches; ‘all-things-considered’ approaches; and deliberative approaches. Rubrics and quantitative approaches (weighing, scoring, aggregating) are used by some organisations, but in most evaluations I see little systematic work or convincing reasoning towards a synthesis judgment.
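To make the mechanics of the rubric-based, quantitative route concrete, here is a minimal sketch of a weighted synthesis across the five DAC criteria. The scores and weights are entirely hypothetical and chosen only for illustration; the weighting of Impact and Sustainability as equal and highest simply mirrors the argument made earlier in this post, not any organisation's actual practice.

```python
# Illustrative sketch only: a minimal weighted-rubric synthesis judgment.
# All scores and weights below are hypothetical, for illustration.

scores = {  # each criterion rated on a 1-5 rubric by the evaluation team
    "relevance": 4,
    "effectiveness": 3,
    "efficiency": 2,
    "impact": 3,
    "sustainability": 2,
}

weights = {  # sums to 1.0; impact and sustainability carry equal, high weight
    "relevance": 0.15,
    "effectiveness": 0.15,
    "efficiency": 0.10,
    "impact": 0.30,
    "sustainability": 0.30,
}

# Weighted aggregation into a single synthesis score out of 5
synthesis = sum(scores[c] * weights[c] for c in scores)
print(f"Weighted synthesis score: {synthesis:.2f} / 5")  # prints 2.75 / 5
```

The limits of such a calculation are obvious: a single number hides the trade-offs, value judgments and power dynamics behind each score and weight, which is precisely why the deliberative and ‘all-things-considered’ approaches Schwandt describes remain essential complements.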
We need criteria suitable for evaluation beyond projects and programmes. The DAC criteria exacerbate the notion that evaluation for development is all about projects and programmes (or sometimes about policies or events). Evaluation evolved out of ‘programme evaluation’ in the Global North, which for decades tended to involve tinkering within a particular sector. Similarly, the fragmented nature of the aid system has meant that the most common evaluations are aimed at single interventions, or at portfolios of interventions of one organisation or partnership.
But countries in the Global South face the difficult challenge of enabling and sustaining positive national development trajectories over the long term, across many sectors and institutions, usually from a low base. In LMICs, development is also more severely influenced by interconnected global and regional dynamics and power asymmetries. This requires thinking beyond individual interventions to cross-cutting issues, thematic areas, systems and so on as foci for evaluative attention.
We need criteria that can shift some attention away from results or ‘impact’ to more thoughtful engagement with design and implementation. If we treat development as a ‘complex adaptive system’ (CAS), we need much more thoughtful and systematic prospective and retrospective examination of intervention designs and the extent to which they are tailored to theory, to context and to development as a CAS. We also need to move implementation or so-called process evaluation away from a fixation on logframes and rigid RBM-determined implementation and compliance assessments. It is unfortunate that the adoption of adaptive management and the useful practice of developmental evaluation have been so limited to date.
Issues for consideration per criterion
In addition to my own observations, the following summary of concerns by criterion draws on the thoughtful contributions to the DAC criteria discussions by Thomas Chianca (2008), Caroline Heider (2017) and Marc Cohen and Olivier Shingiro (2017), who have all made highly relevant points about the suitability of the current set of five DAC criteria. The latter article is particularly pertinent in view of the increasing focus on SDG evaluation.
Relevance has been defined in a manner that results in nearly everything being relevant in some way or another to development policies, strategies and/or stakeholder needs. In the SDG era this will be amplified; it is very likely that every intervention will be relevant to something related to ‘sustainable development’. Relevance as currently defined will therefore definitely not help to counter the dilution and fragmentation of development funding. Most importantly, there is no sense of the significance and timeliness of what has been planned, done or effected. Perspectives on relevance also differ among stakeholders: whose voice will count (the most), and on what basis?
Efficiency does not capture non-monetary costs, waste or the hidden costs of (negative) social and environmental impacts. It focuses almost entirely on the use of the least costly resources, and even when input-based targets are achieved, it is not clear whether they have actually served their purpose efficiently. The definition of this criterion also initially did not emphasise the importance of considering alternatives; this aspect remains neglected in practice.
More generally, the DAC criteria fail to capture what might be important related considerations in some cases, such as cost-effectiveness, economic rates of return, social returns on investment or value for money.
Effectiveness as a DAC criterion reinforces my argument that so-called ‘development evaluation’ does not necessarily mean ‘evaluation for development’, as it does not sufficiently take into account that development is a complex adaptive system. Caroline Heider notes that Effectiveness embodies the accountability dimension of evaluation. The criterion has two interpretations: development effectiveness as a synthesis judgment, or the effectiveness of specific interventions. But in both cases analyses tend to focus on whether objectives were achieved, without encouraging questioning, in the first place, of their appropriateness or importance in that particular context. The objectives might have been set on the basis of development partners’ political priorities rather than stakeholders’ main needs, or at the wrong time and/or out of sequence with other initiatives that are essential for development. ‘Success’ is often too vaguely defined or poorly conceptualised, and evaluators tend not to engage with the values, understandings of ‘change’ or the development ideologies upon which the so-called effectiveness or success has been defined.
Evaluators also often struggle to distinguish between the criteria of Impact and Effectiveness. Implementers work towards predetermined outcomes without being encouraged to take risks, to improvise, or to adjust strategies and even outcomes or impacts where well justified. Instead, they often try to, or have to, game the system. Evaluating the effectiveness of an intervention in isolation from what goes on around it can be quite meaningless if issues such as coherence and synergy between interventions or policies are not taken into account. Similarly, any (negative) unintended and unanticipated consequences of an intervention need to be accounted for when assessing effectiveness, particularly when there is no integration of the Effectiveness, Impact and Sustainability findings into a synthesis judgment.
Impact as a criterion suffers from a variety of challenges when not conceptualised from a complexity perspective. It should be conceptually linked to Sustainability. Far more attention should be given to impacts on different groups (rather than averages), and to ensuring that negative social, environmental, technological or economic impacts are traced and included in assessments. Interventions ideally need to unleash changes that cascade or ripple out and cause further changes to take place, but how often are positive impacts claimed in the short term without any sense of whether they will enable positive changes to be sustained in the long run (for as long as is desirable)?
Caroline points out a swathe of challenges and weaknesses around the impact evaluations that have taken over evaluation practice in the last decade: the quality of many of these studies is not high, and the results are often inconclusive and limited to narrow phenomena, frequently concluding that more studies are needed. Intervention designs continue to be too linear and to conflate outputs, outcomes and impacts, while designs, implementation strategies and evaluations are carried out without cognisance of the implications of viewing development through a complexity lens.
Sustainability is in practice often considered only in terms of financial or programme sustainability. Analytical depth is sorely lacking. Evaluations tend to neglect environmental aspects and do not refer explicitly to the interconnections between the socio-cultural, economic and environmental components of development. Furthermore, as Valuing Voices’ work on Sustained and Emerging Impact Evaluation has recently articulated very clearly, determining the potential for, or the actual, sustainability of positive impacts has been greatly neglected.
I have been greatly concerned about this latter issue since my early years as an evaluator, when I saw how few interventions were planned and implemented in a manner that yielded results that endured. As early as 2002, during a presentation in South Africa, I asked Michael Patton about the need to evaluate programme designs and implementation strategies specifically for issues of impact sustainability. I have also referred to it in various presentations and articles (see for example here and here) and have always tried to address it in my own evaluations. It is also one of the main reasons for my focus on the role of culture in development and in evaluation. It is imperative that we integrate the findings derived from assessments using the Impact and Sustainability criteria.
Although the focus on the related concept of resilience is welcome, the Sustainability criterion continues to be critical, and has to be addressed with far greater vigour and nuance than we have done to date.
A too-narrow conceptualisation and too-rigid application of a set of criteria might create comfort zones that are too comfortable. They can also prevent us from evaluating for development through a complexity lens. As a result, critical areas that need attention in the evaluation of development are being neglected. This is even more so when considering the ambition and interconnectedness of the Global Goals of the 2030 Agenda, now widely acknowledged as the overarching global framework for the conceptualisation of development worldwide.
Given all these considerations, in the next two posts in this series I will make a proposal for a new approach to the identification of evaluation criteria for sustainable development.
Zenda Ofir is an independent South African evaluator at present based near Geneva. She works primarily in Africa and Asia, and advises organisations around the world. She is a former AfrEA President, IOCE and IDEAS Vice-President, AEA Board member, Honorary Professor at Stellenbosch University, Richard von Weizsäcker Fellow, and at present Interim Council Chair of the new International Evaluation Academy.