In his article, “Don’t forget people in the use of big data for development”, Joshua Blumenstock argues that a ‘humbler’ data science could transform international development while avoiding the silver-bullet solutions that have missed their mark in recent decades. The promise of data-driven development, he explains, rests on machine-learning algorithms that detect patterns in the digital footprints people leave behind. Data drawn from people’s phones can be used to develop solutions and distribute humanitarian aid in a more focused and timely manner. For instance, Blumenstock notes that, with some tweaks, the same algorithms companies such as Facebook and Google use to match advertisements to people online could be used to match resources to those living in poverty. Data-based solutions and tools range from digital-footprint analysis used to improve public health, respond to pandemics and epidemics, and assist national and international crisis responses, to satellite imagery and high-resolution maps of crop yields and childhood malnutrition.
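To make the mechanism Blumenstock describes more concrete, here is a minimal, hypothetical sketch of that kind of repurposed pattern-matching: a simple classifier trained on phone-usage features that scores households by estimated need so aid can be prioritized. The feature set, labels, and data are all invented for illustration; this is not Blumenstock’s actual model.

```python
# Hypothetical sketch: scoring households for aid targeting from phone-usage
# features, in the spirit of the algorithms Blumenstock describes.
# The feature names and data below are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Invented phone-metadata features: e.g. calls per day, average top-up amount,
# share of night-time activity, number of unique contacts.
n = 1000
X = rng.normal(size=(n, 4))

# Invented ground-truth labels, e.g. from a traditional household survey:
# 1 = household below the poverty line, 0 = otherwise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) < 0).astype(int)

# Train a simple classifier to detect the pattern linking phone usage to poverty.
model = LogisticRegression().fit(X, y)

# Score new (invented) households and rank them so that aid reaches the
# highest-estimated-need households first.
X_new = rng.normal(size=(10, 4))
need_scores = model.predict_proba(X_new)[:, 1]
priority_order = np.argsort(-need_scores)
print(priority_order)
```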
However, Blumenstock warns that data science is not a complete solution to international-development problems, pointing out four major issues with these tools: unanticipated effects, lack of validation, biased algorithms, and lack of regulation. Big-data solutions often empower those who are already empowered rather than those in need, largely because the ability to extract insight from the data lies in the hands of a few who already hold positions of power. He then explains that the new approaches to data collection have not been validated, unlike conventional methods that have been developed and tested over time. Blumenstock also raises the issue of biased algorithms, detailing how these tools are often trained on biased data that marginalizes the poorly represented: to provide data at all, an individual must own a mobile phone, which in turn requires connectivity, electricity, and literacy, excluding a vast number of people in developing countries. Next, he argues that data collection goes largely unregulated because it sits in the hands of private companies with little incentive to do anything other than maximize profits, and questions of data privacy, algorithmic transparency, and fairness are neglected by most companies operating in developing countries. Lastly, Blumenstock outlines ways forward: new sources of data should be validated and treated as a complement to existing data, existing algorithms should be customized to the local context, and collaboration should be deepened between data scientists, governments, civil society, development experts, and the private sector.
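As a rough illustration of the exclusion problem described above, the following sketch simulates a population in which poorer households are assumed to be less likely to own phones, then compares how often the poorest quintile appears in the overall population versus in the phone-owning sample a model would actually be trained on. All numbers, including the ownership assumption, are invented for illustration.

```python
# Hypothetical sketch of the selection-bias problem Blumenstock raises:
# if training data can only come from phone owners, the poorest households
# (who are least likely to own phones) are under-represented or missing.
import numpy as np

rng = np.random.default_rng(1)

n = 100_000
# Invented wealth index for the whole population (higher = wealthier).
wealth = rng.normal(size=n)

# Invented assumption: phone ownership becomes more likely with wealth.
owns_phone = rng.random(n) < 1 / (1 + np.exp(-2 * wealth))

poorest_quintile = wealth < np.quantile(wealth, 0.2)

share_in_population = poorest_quintile.mean()
share_among_phone_owners = poorest_quintile[owns_phone].mean()

print(f"Poorest quintile in population:      {share_in_population:.1%}")
print(f"Poorest quintile among phone owners: {share_among_phone_owners:.1%}")
# The second number is much smaller: a model trained only on phone owners
# sees far fewer of the very people the aid is meant to reach.
```

Because ownership probability rises with wealth in this toy setup, the poorest households make up a much smaller share of the training sample than of the population, which is the kind of marginalization Blumenstock warns about.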
“Good intent is not enough in data science when dealing with the problems which determine people’s experiences.” -Anna Raymond
I have to agree. I believe data science was applied to human development with the intention of improving lives. However, as Blumenstock points out, the application of data science has flaws and fails to deliver on its initial goals. For instance, the loans administered in Kenya were given to those with decent credit scores, yet they often led to cycles of poverty and debt traps. Although the intent was good, it was not enough to help those in need and created more suffering.
“Transparency is the underlying issue to many of these problems, so an increase in this on both ends (data based issues & human based issues) could lead to better results.” -Nira Nair
I believe that increasing data transparency is essential on both ends. Greater transparency among data collectors, data providers, and those who analyze the data to find solutions would make data collection more efficient.
“In lieu of such drastic potential for promoting applications yet demoralizing hindrances, the balancing act can become difficult.” -Kayla Seggelke
In my opinion, collecting data with consent is both the most ethical approach and the best way to promote data applications, because it lets researchers use data freely to develop algorithms. Unfortunately, hindrances will never completely disappear: many of us fear data leaks, and that fear keeps us from sharing our data.