Hybrid storage engine for geospatial data using NoSQL and SQL paradigms

Main Article Content

José A. Herrera-Ramírez
Marlen Treviño-Villalobos
Leonardo Víquez-Acuña

Abstract

Designing and implementing services that handle geospatial data requires attention to storage engine performance and to optimization for the intended use. NoSQL and relational databases each bring their own advantages, so one of them is usually chosen according to the requirements of the solution. These requirements can change, however, and some operations may run more efficiently on a different database engine, so relying on a single engine means being tied to its features and working model. This paper presents a hybrid (NoSQL-SQL) approach in which geospatial data are stored in MongoDB and then replicated and mapped to a PostgreSQL database using an open source tool called ToroDB Stampede; solutions can then take advantage of either NoSQL or SQL features to satisfy most of the requirements associated with storage engine performance. A descriptive analysis explains the workflow of replication and synchronization between the two engines. It is followed by a quantitative analysis, which determined that a query against a conventional PostgreSQL database has a shorter response time than the same query run in PostgreSQL against the hybrid database. In addition, the update response time of a materialized view increases depending on the type of geometry.

Article Details

How to Cite
Herrera-Ramírez, J. A., Treviño-Villalobos, M., & Víquez-Acuña, L. (2021). Hybrid storage engine for geospatial data using NoSQL and SQL paradigms. Tecnología En Marcha Journal, 34(1), pp. 40–54. https://doi.org/10.18845/tm.v34i1.4822
Section
Scientific article
