A machine learning proposal to predict poverty
Main Article Content
Abstract
Traditional methods for identifying households in poverty and selecting the beneficiaries of social assistance programs, such as the Proxy Means Test, have high rates of inclusion and exclusion errors. This research therefore analyzed different approaches to predicting which households are in poverty, using machine learning models based on XGBoost. The proposed models were compared with baseline methods. The data were taken from the 2019 household survey of Costa Rica. The results showed that at least one of our XGBoost approaches gave the best balance between inclusion and exclusion errors. The best model for predicting poverty and extreme poverty was built using XGBoost with a classification approach.
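As an illustration of the two error types the abstract balances, the following minimal sketch computes them from household-level labels. It is not the authors' evaluation code. The function name `targeting_errors`, the toy data, and the exact definitions are assumptions: inclusion error is taken here as the share of selected households that are not actually poor, and exclusion error as the share of actually poor households that are not selected. Other definitions appear in the targeting literature.

```python
from typing import Sequence, Tuple


def targeting_errors(
    actual_poor: Sequence[bool], predicted_poor: Sequence[bool]
) -> Tuple[float, float]:
    """Return (inclusion_error, exclusion_error) for a poverty classifier.

    Assumed definitions (hypothetical, for illustration only):
      inclusion error = non-poor households selected / all households selected
      exclusion error = poor households not selected / all poor households
    """
    if len(actual_poor) != len(predicted_poor):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(actual_poor, predicted_poor))
    false_positives = sum(1 for a, p in pairs if p and not a)
    false_negatives = sum(1 for a, p in pairs if a and not p)
    n_selected = sum(1 for _, p in pairs if p)
    n_poor = sum(1 for a, _ in pairs if a)

    # Guard against empty denominators (no one selected, or no one poor).
    inclusion = false_positives / n_selected if n_selected else 0.0
    exclusion = false_negatives / n_poor if n_poor else 0.0
    return inclusion, exclusion


# Toy example: 8 households, 3 of them actually poor (made-up data).
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

inclusion_error, exclusion_error = targeting_errors(actual, predicted)
print(f"inclusion error: {inclusion_error:.3f}")
print(f"exclusion error: {exclusion_error:.3f}")
```

A classifier that selects more households will usually lower the exclusion error at the cost of a higher inclusion error, which is why the two are reported together.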
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The authors retain copyright and grant the journal the right of first publication, along with the right to edit, reproduce, distribute, exhibit, and communicate the work domestically and abroad in print and electronic media. They also assume responsibility for any litigation or claim related to intellectual property rights, releasing Editorial Tecnológica de Costa Rica from liability. In addition, authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the version of the article published in this journal (e.g., placing it in an institutional repository or publishing it in a book), provided they clearly indicate that the work was first published in this journal.