The data revolution we need

Delegates of the 193 United Nations member states have reached consensus on the new Sustainable Development Goals for the next 15 years. With 169 targets across 17 goals, the SDGs have size and ambition, anchored by one exceptional poverty target:

By 2030, eradicate extreme poverty for all people everywhere, currently measured as people living on less than $1.25 a day

Such an ambitious (some would say impossible) agenda demands that the global development apparatus function as a well-oiled machine. Somehow we must miraculously achieve, within a hybrid intergovernmental-public-private institutional framework, a scale and efficiency that have rarely been achieved by a single state, with the possible exception of China, a success story that relatively few stakeholders seek to emulate. Unless our intricate and fragile global governance architecture is suddenly supplanted by a benevolent despot from outer space, the success of the SDGs and any successor frameworks will depend on our ability to mobilize an unprecedented mix of human, financial and institutional resources and to deploy them in the most efficient way possible.

In today’s world, analytics – meaning the analysis of quantitative and qualitative data in pursuit of a logical goal – has nearly completed its long ascent to prominence in the logic of public and private sector action. We may disagree about the proper end of logical action – be it development, human rights, systems, security, profit or power – but we mostly agree that goals can be analyzed and pursued in a logical, directed, and evidence-based fashion. At whatever level of spatial and temporal specificity we are operating, our logical model is roughly the same:

Need → Intervention → Impact

It is important and underreported news that we possess an analytic framework for the scientific pursuit of goals for people and planet, yet the SDG ratification document pays meager lip service to evidence-based practice. It offers absolutely no concrete ambitions for raising our capacity to pursue this model.

The problem is that, in spite of our daily encounters with the growing volume and impact of data both big and small, the current global development data space remains characterized more by gaps, holes and noise than by anything like an emerging corpus of evidence that could support dramatically heightened efficiency or impact, much less long-term goals like self-sufficiency or broad participation spelled out by the UN’s Data Revolution Group. As much as we hail the promise and progress of big data, the reality is that our analytical frameworks and tools have vastly outstripped the production of data of the specificity and relevance needed to achieve our analytic goals. Big data from mobile phones and internet searches can play a critical role in measuring human need and programmatic impact, but they must be a complement to, not a substitute for, program- and policy-relevant data collected purposefully at the appropriate spatial and temporal scale for analysis. This is before we even begin to contemplate the potential risks of big data to human rights, especially those of vulnerable, excluded or exploited groups. To be very clear, progress over the past 15 years in improving the collection of national statistical data has been extraordinary, not just in filling data gaps and improving data systems and capacity, but also in raising the motivation and capacity to move towards something more comprehensive. But we aren’t even close to the promised land.

Efforts to systematize evidence from program evaluations scattered across numerous settings and scales worldwide have driven home the reality that evidence-based interventions must be routinely optimized to suit local conditions that change over time. We need room for trial-and-error, experimentation, consultation, and yes, mistakes, but we need to know where, when, for whom and how these errors are emerging, at a granular level and in something approximating real time. In general, more comprehensive and specific goals demand data and analysis that capture relatively specific variations over time and across space.

It is in the full context of progress, opportunities, risks, and the drastic expansion of targets from MDG to SDG that the near-total neglect of the data revolution in the actual SDG framework is so deeply troubling.

Running away from the solution

Here is target 17.18, buried deep within the means of implementation:

By 2020, enhance capacity-building support to developing countries, including for least developed countries and small island developing States, to increase significantly the availability of high-quality, timely and reliable data disaggregated by income, gender, age, race, ethnicity, migratory status, disability, geographic location and other characteristics relevant in national contexts

Target 17.18 has “fear of commitment” issues. It can’t even commit to building capacity, but merely to “enhance capacity-building support”. By how much will capacity-building support be enhanced? We have no idea. “High-quality, timely and reliable” have no meaning without specific definitions. In light of technological advances, the mention of various forms of disaggregated data might be a step forward, were it not for the definitive declaration that such data must be “relevant in national contexts”, rather than relevant to the needs of individual people, communities, or subpopulations in pursuit of the stated goal of leaving no one behind.

The original Open Working Group Proposal for the SDGs was anchored by a more powerful expression of the purpose of data, a point that was completely excised from the ratification draft:

  1. In order to monitor the implementation of the SDGs, it will be important to improve the availability of and access to data and statistics disaggregated by income, gender, age, race, ethnicity, migratory status, disability, geographic location and other characteristics relevant in national contexts to support the monitoring of the implementation of the SDGs. There is a need to take urgent steps to improve the quality, coverage and availability of disaggregated data to ensure that no one is left behind.

Here we see the three key things that must be said about the purpose of data within the post-2015 framework: 1) that we must improve quality and coverage of disaggregated data for the explicit purpose of leaving no one behind, 2) that the purpose of data is to support the implementation of the SDGs at all levels, and 3) that data must be made available for accountability and eventually participation at all levels.

Irrevocably out of touch

The failure to retain this vision or to put specific targets on improving the quality of local data betrays a fundamental misunderstanding of the role of data in the development process. It likely reflects the fact that most people charged with negotiating the SDGs, while well-intentioned and well-versed in the language of leaving no one behind, don’t actually have much contemporary experience of carrying out this hard work.

Throughout the top tiers of the development industry, there remains a conception that the main point of collecting data is to generate accountability, and specifically to enforce the kind of “report card” accountability that would keep global and national leaders on side. The MDGs came in for much criticism around this issue, for setting goals without participation, for setting targets that could barely be measured at baseline and certainly not for measuring change over time, and for including no sanctions for failure to achieve targets. In reality, this sort of high-level accountability is a mirage, and certainly not a goal that could begin to justify the kind of data expansion that the world needs. Everyone responsible for negotiating the SDGs will be gone in 15 years, or at least not remotely in the same position. The notion that heads should roll if targets aren’t met is one only put forward by cynics who largely reject the entire process of goal-setting. Of course we can use country-level data to refocus certain efforts after 5 and 10 years, and to take stock after 15 years. But by then it will be pretty late. We might also potentially use country-level data to recalibrate our expectations for what can be realistically achieved over a 15-year time horizon, though if data were truly leading to a sense of realism, would we currently have SDGs that are anchored by a poverty target that is almost certainly out of reach?

The data revolution we need

Ultimately, the real reason we need better data is for improved targeting and delivery of services, and the SDGs are effectively counterproductive in that respect. The SDGs barely contain the power to close the data and evidence gaps exposed by the MDGs. They certainly offer no plan to produce the data that would actually be needed to achieve the 169 targets that will be on the books as of September 27.

The data revolution we need would include a few critical elements.

  1. Completion of the extraordinary progress in basic national statistical data by funding and implementing the minimum package of SDG monitoring and statistical capabilities proposed by the UN Sustainable Development Solutions Network. This includes a decennial census, regular and harmonized sample surveys in areas like health and living standards, environmental monitoring, geospatial data on places and facilities, and improved national accounts.
  2. Registration of all births and deaths, with linkage to national identification systems. Registration is a human right and a vital marker of recognition and inclusion. It can also improve the speed, coverage, and efficiency of public transfer and benefits systems, in spite of lingering concerns about security and inclusion. Finally, registration would form the backbone for future individual-level service delivery and tracking platforms that remain years away from reality.
  3. Better data on the needs, capabilities and development outcomes of communities. This would include data at the district, sub-district and community level, and data at the facility level for institutionalized populations. The most pressing need is for small-area estimation of key outcomes like poverty, food security, education and health that are both important targets and markers of community vulnerability and resiliency. These data are essential to identify the locations with the biggest gains and unmet needs. They are also essential for conducting informed humanitarian response that actually goes beyond measuring damage to considering resiliency and vulnerability. Finally, such data provide baseline, background and control data for studies of community outcomes and resiliency in development and humanitarian research.
  4. More robust, comprehensive, harmonized and user-friendly data on systems, facilities planning and policies. As development action shifts from individual-focused knockout interventions like immunization to vital systems of life support, we need more systemic databases linking all needs and inputs. The origins of such a system can be seen in the move from implementer-specific to sector-wide monitoring of inputs and outputs in sectors like water/sanitation or in the World Health Organization’s Service Availability Mapping approach. A more comprehensive approach to monitoring policies would register, monitor and collect valuable data on quality, price and similar attributes for clearly defined sets of facilities, including those in the public, private and nonprofit sectors.
  5. Finally, we need data tracking aid and government expenditures that are universal, publicly available, and more specific. A number of organizations like Publish What You Fund and AidData, along with existing Development Assistance Databases, already work on this. They have made far more progress than innovators in other data measurement exercises, partly because these data mostly exist and merely need to be coded. But there are still gaps, most notably the fact that most aid data are not nearly specific enough about the timing or location of aid delivery to support meaningful analysis or to let communities see what aid they should have been receiving.

In short, the data revolution we need for today would provide the knowledge infrastructure necessary to have even a chance of achieving most of the current SDGs in most places. Such a revolution would also set the stage for a much more evidence-based, efficient and self-sufficient agenda for global development in 2030.

Would anyone care to put a price tag on this? It’s a lot cheaper than most would think.

