DATA150

Final Paper: Research Proposal

Rhea Malhotra

Professor Brewer

Human Development and Data Science

12/14/2021

Word count: 1807

Introduction:

Poverty is one of the most urgent concerns in human development. It forces millions of people to live with homelessness, food insecurity, inadequate nutrition and childcare, and a lack of access to healthcare or education. While many institutions aim to alleviate poverty through donations and the construction of infrastructure, they are often unsuccessful because they do not know exactly where affected people live, have limited access to developing countries, and must rely on unreliable data. Without clear data to identify impoverished regions, resources are distributed insufficiently (Kumar). Because this information is inaccessible, developing countries fall further behind as their people continue to live in poverty. To address this issue, researchers are mapping poverty using satellite imagery along with census data as an alternative approach. Poverty mapping highlights inequalities within developing nations and provides an essential tool for targeting poverty alleviation policies (Bill and Melinda Gates Foundation).

Poverty is conventionally measured by comparing an individual’s income to a set poverty threshold, with those who fall under the threshold considered poor. Nobel laureate Amartya Sen disagrees with this approach. According to Sen, poverty is a complex, multifaceted world: “You cannot draw a poverty line and then apply it across the board to everyone the same way, without taking into account personal characteristics and circumstances.” Geographical, biological, and social factors magnify or reduce the impact of income on individuals. Sen holds that being poor means having an income that does not allow an individual to achieve certain minimum capabilities, given the circumstances and social requirements of the environment (IADB). My research on poverty assessment in this class indicates that the majority of extreme poverty today exists in Africa and India. However, how can we locate and map regions where poverty is less extreme and less evident? How do we explain the pockets of poverty that exist in urban areas among middle-income people, and how can we measure them?

Objective:

In “Development as Freedom,” Amartya Sen states that human development is the process of expanding human freedoms so that people can lead the lives they have reason to value (Sen). Freedom is important because it leads to a higher quality of life. By targeting poverty-stricken areas and gathering data, we can use our freedoms to eradicate the unfreedoms of poverty in developing nations. With this data, researchers can better understand poverty and its determinants and design programs that reduce inequality and improve public service delivery for those who are extremely poor (Castelán). In this research plan, I assess the data science methods being used to identify poverty-stricken regions and how the resulting data can be used to eradicate poverty in Africa and India.

I chose to assess poverty in Africa and India because both experience sizable issues with poverty. Almost every second person in Sub-Saharan Africa lives under the poverty line, and two thirds of people in India live in poverty. In Sub-Saharan Africa, poverty rates have risen due to the long-term effects of war, genocide, famine, and limited land availability. Rapid population growth in both regions is a long-term driver of this increase in poverty. It stretches national and family budgets thin, with an increasing number of children to be fed and educated and an increasing number of workers to be provided with jobs. Lastly, I chose to research Africa and India because reliable and accurate information on the location of impoverished regions is particularly lacking in these two regions. I was interested to learn how researchers could fill this gap and use data science techniques to locate poverty-stricken areas.

Data Science Methods:

I found several data science methods to be significant during my assessment of poverty. Underdeveloped countries are often highly data-deprived, so methods such as the analysis of satellite imagery allow researchers to collect accurate and reliable data. Traditionally, methods used to target and analyze poverty rely on census data, which are often unavailable or out of date in most low- and middle-income countries (Steele). Alternative measures are needed to update estimates between censuses. Now, researchers use powerful machine learning technology to extract information about poverty from satellite imagery (Horton). Satellite imagery and remote sensing (RS) data play large roles in creating poverty maps. Call detail records (CDRs), on the other hand, are the detailed records that a telephone exchange produces for every call it handles. Phone usage data can provide information on social networks, call behavior, and mobility patterns in a population, all of which correlate with measures of socioeconomic status.
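
To make the link between raw call records and the three kinds of signal named above more concrete, the following is a minimal sketch in Python. It is not drawn from any of the cited studies: the records, the field layout, and the function name `summarize` are all hypothetical. It counts calls and seconds per caller (call behavior), distinct contacts (social network), and distinct towers used (a rough stand-in for mobility).

```python
from collections import defaultdict

# Hypothetical call detail records: (caller, callee, tower, seconds).
# Real CDRs have more fields; these values are invented for illustration.
records = [
    ("u1", "u2", "tower_a", 60),
    ("u1", "u3", "tower_b", 120),
    ("u2", "u1", "tower_a", 30),
    ("u1", "u2", "tower_a", 45),
]

def summarize(records):
    """Aggregate simple per-caller features from raw call records."""
    calls = defaultdict(int)       # number of outgoing calls
    seconds = defaultdict(int)     # total outgoing call time
    contacts = defaultdict(set)    # distinct people called
    towers = defaultdict(set)      # distinct towers the caller used
    for caller, callee, tower, duration in records:
        calls[caller] += 1
        seconds[caller] += duration
        contacts[caller].add(callee)
        towers[caller].add(tower)
    return {
        caller: {
            "calls": calls[caller],
            "total_seconds": seconds[caller],
            "distinct_contacts": len(contacts[caller]),
            "distinct_towers": len(towers[caller]),
        }
        for caller in calls
    }

features = summarize(records)
print(features)
```

In practice such features would be anonymized and aggregated to an area, such as a tower's coverage region, before being compared with survey measures of wealth.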

Satellite features are highly predictive of economic well-being. Daytime and nighttime light imagery, more specifically, have emerged as practical sources of welfare information thanks to new computer vision algorithms. Satellite images can capture data ranging from nightlights to the condition of roads and homes and the level of activity. Developments in deep learning and Convolutional Neural Networks (CNNs) make it possible to classify objects such as cars, roof types, roads, and crops, all of which correlate with local income. The article “How to Understand Global Poverty from Outer Space” details a study by Asmi Kumar that used neural networks and satellite imagery to predict asset wealth and poverty in Rwanda, Africa. The study proceeded in five steps, beginning with downloading the Demographic and Health Surveys (DHS), nightlight satellite imagery, and daytime satellite imagery. The DHS are surveys that provide household data on health, nutrition, and assets. Daytime satellite images, obtained through the Google Maps Platform, contain valuable features of the landscape (conditions of roads and homes, and activity). Nightlights were obtained from satellite imagery based on the locations in the DHS (Kumar). After the data are collected, the DHS and nightlight data are merged to test whether nightlight data can be used to predict poverty. The author explains that nightlight luminosity is a strong indicator of wealth. The results of the study detailed three linear regression models illustrating the relationships between wealth and nightlight luminosity. The merged data showed an increase in the R-squared value, which measures how much of the variation in wealth the model explains and therefore reflects its predictive capability. The results showed that Convolutional Neural Networks can be used with daytime and nighttime satellite imagery, combined with survey data, to accurately and efficiently track impoverished areas in specific places.
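
The relationship between nightlight luminosity and survey-based wealth can be illustrated with a short sketch. The Python code below is a simplified example, not Kumar's actual pipeline: the cluster values are invented, and the function names `fit_line` and `r_squared` are hypothetical. It fits a one-variable least-squares line and computes the R-squared value, the share of the variation in wealth that luminosity explains.

```python
# Illustrative sketch only: the data are invented and do not come from
# the DHS or from any satellite product.

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical survey clusters: average nightlight luminosity and a
# survey-derived wealth index for each cluster.
nightlights = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
wealth_index = [-1.2, -0.9, -0.5, 0.1, 0.8, 2.1]

intercept, slope = fit_line(nightlights, wealth_index)
r2 = r_squared(nightlights, wealth_index, intercept, slope)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

An R-squared value closer to 1 means luminosity accounts for more of the differences in wealth between clusters. Adding further predictors, such as features extracted from daytime imagery, is one way a model of this kind could raise that value.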

Another significant data science method I found during my research was the Bayesian Geostatistical Model (BGM). A BGM is a statistical model that uses probability to represent uncertainty within the model. A study in Bangladesh by Jessica Steele and colleagues, “Mapping poverty using mobile phone and satellite data,” examines how cell phone data is collected and paired with satellite information. Steele describes using hierarchical BGMs to create high-resolution maps of poverty. She began by approximating mobile tower coverage areas with Voronoi polygons, which allowed spatial data to be mapped in both urban and rural areas. The study found that models combining remote sensing (RS) and CDR data had an advantage over models with just one data source. When RS and CDR data were combined, they produced a higher R-squared value in urban areas than RS-only or CDR-only data.
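
The Voronoi step can be sketched in a few lines. A location lies inside a tower's Voronoi polygon exactly when that tower is the nearest one, so assigning each location to its nearest tower reproduces polygon membership without drawing the polygons. The Python example below is a simplified sketch, not the implementation used by Steele and colleagues: the tower names, coordinates, and household wealth scores are hypothetical, and it uses straight-line distance on a flat grid rather than geographic distance.

```python
import math
from collections import defaultdict

# Hypothetical tower locations (x, y) in kilometres on a flat grid.
towers = {
    "tower_a": (0.0, 0.0),
    "tower_b": (10.0, 0.0),
    "tower_c": (5.0, 8.0),
}

def nearest_tower(point, towers):
    """Return the tower whose Voronoi cell contains the point."""
    return min(towers, key=lambda name: math.dist(point, towers[name]))

# Hypothetical surveyed households: (x, y, wealth score).
households = [
    (1.0, 1.0, -0.8),
    (2.0, -1.0, -0.4),
    (9.0, 1.0, 0.6),
    (5.0, 7.0, 1.1),
    (6.0, 9.0, 0.9),
]

# Group household scores by the Voronoi cell they fall in.
scores_by_cell = defaultdict(list)
for x, y, score in households:
    scores_by_cell[nearest_tower((x, y), towers)].append(score)

# Average wealth per cell, ready to be compared with per-tower phone data.
mean_wealth = {cell: sum(v) / len(v) for cell, v in scores_by_cell.items()}
print(mean_wealth)
```

Placing survey values and phone usage on the same set of cells is what allows the two data sources to be compared and modeled together.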

Proposal and Budget:

I propose using geospatial and data science methods, namely call detail record (CDR) data, remote sensing (RS) data, and satellite imagery, to target pockets of unknown poverty in urban settings. This plan will have a 1-3 year horizon to explore my refined research question. A team of researchers will collect satellite data from cities in developing countries at a fine local scale. The team will also obtain and analyze survey and DHS data to be used in conjunction with the satellite data. Lastly, they will use Bayesian Geostatistical Models to verify that the proper data is being collected and produced. I believe that funders should support this proposal because it will help regions like Africa and India reduce pockets of poverty in urban areas, improving economic and social conditions in both. Expanding cities in developing countries can promote industrialization and create new job opportunities for those in poverty. This research plan would require funding of around $100,000. A majority of the money would go towards the data collection itself, such as satellite imagery and survey data. The remainder would go towards the research team, programmers, and data scientists in charge of the data collection.

Possible Issues:

A few salient harms and obstacles stand in the way of addressing the poverty crisis through data science techniques. These include security, privacy, cost, and accessibility. Large datasets often include private information and can be prone to data theft, so many people would be hesitant to share their data. Another obstacle is the sizable cost of mining and storing data. Lastly, collecting accurate survey data could be an obstacle because of low participation and barriers in literacy and language comprehension. Many people in developing countries also live in scattered areas that can be difficult to access. All these factors combined can make census taking difficult, time-consuming, and expensive. By identifying and mapping poverty in cities, more government funding and aid can be efficiently directed towards those areas of need.

Conclusion:

Global poverty has been a devastating issue for decades. Now, we finally have the technology and resources to address it. With the information gathered through these data science methods, aid and funding can be administered to poverty-stricken regions efficiently and effectively. Data science gives insight into the actual needs of the poor, which makes it possible to formulate the proper poverty alleviation programs. With data and technology, researchers can understand a target group’s current and potential income sources, providing a platform to identify the most valued components of social security. This would in turn help us decide which economic opportunities or policies to promote in order to empower these groups, and what level of aid needs to be distributed. There are of course limitations, as machine learning technology is ever-changing and algorithms have elements of subjectivity. Nonetheless, data science methods allow us to dive deeper into human development issues and constantly monitor for improvement. With my research plan, the opportunities significantly outweigh the obstacles. Poverty assessment and analysis have grown considerably with the use of CDR data, RS data, and satellite imagery. Satellite imagery has been shown to produce high-resolution poverty maps when combined with CDR data, DHS data, and BGMs. The result is a machine learning product capable of predicting poverty in real time. Although there are still gaps in data, the results show real promise.

References:

“Amartya Sen and the Thousand Faces of Poverty.” IADB, www.iadb.org/en/news/webstories/2001-07-01/amartya-sen-and-the-thousand-faces-of-poverty%2C9286.html.

Bill and Melinda Gates Foundation, World Bank, Grameen Foundation. “High Resolution Progress out of Poverty Mapping.” WorldPop, https://www.worldpop.org/portfolio/project?id=22. Accessed 30 Sept. 2021.

Castelán, Carlos Rodríguez, et al. “Making a Better Poverty Map.” World Bank Blogs, blogs.worldbank.org/opendata/making-better-poverty-map.

Horton, Michelle. “Stanford Scientists Combine Satellite Data, Machine Learning to Map Poverty.” Stanford School of Earth, Energy & Environmental Sciences, pangea.stanford.edu/news/stanford-scientists-combine-satellite-data-machine-learning-map-poverty.

Kumar, Asmi. “How to Understand Global Poverty from Outer Space.” Medium, Towards Data Science, 6 July 2020, towardsdatascience.com/how-to-understand-global-poverty-from-outer-space-442e2a5c3666.

Sen, Amartya Kumar. Development as Freedom. Oxford University Press, 2001.

Steele, Jessica E., et al. “Mapping Poverty Using Mobile Phone and Satellite Data.” Journal of The Royal Society Interface, vol. 14, no. 127, 2017, p. 20160690, doi:10.1098/rsif.2016.0690.