Does NoSQL have a place in GIS? - An open-source spatial database performance comparison with proven RDBMS

Christopher J McCarthy
Abstract

The relational database model is more than 40 years old, and with the use of ‘big data’ continuing to grow, NoSQL systems are marketed as a more efficient means of handling large quantities of usually unstructured data. NoSQL systems may offer advantages over relational databases, but they generally give up relational robustness to gain them.

This project aims to contribute to the GIS field by comparing open-source RDBMS and NoSQL systems for storing and querying spatial data, with the overall goal of determining whether NoSQL systems (specifically MongoDB) have a place within the GI world. Working with the open-source spatial dataset OpenStreetMap, a scaled approach is taken, working from global down to local data. This approach aims to provide insight into how either system may present performance advantages related to data size.

The research highlights how the performance of each system is limited by its functionality. MongoDB’s spatial capabilities are lacking in comparison with PostGIS, the spatial extension for PostgreSQL. The outcome is that MongoDB cannot currently support the spatial needs of a specialist GIS operative; however, if basic spatial functionality is all that is needed, MongoDB offers high performance on large datasets. PostGIS has a complex, highly specialist range of spatial functionality, making it the best-performing spatial system, although it does slow down as dataset size increases.

The choice of system depends on the application, but at present this NoSQL system is spatially outclassed and so not yet suited to the specialist GIS industry.

Spatial Benchmark Queries
Table 1 Benchmark Queries
Results

Import times differed noticeably between the two systems. The PostGIS import tool created spatial indexes as part of the import, whereas MongoDB required its spatial indexes to be created manually.
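To illustrate what creating a spatial index involves on each side, the sketch below builds a GeoJSON point document, a MongoDB 2dsphere index specification and an equivalent PostGIS GiST index statement. It is a minimal sketch under stated assumptions: the table name `osm_points`, the field name `geom` and the helper names are hypothetical and do not come from the project, and nothing here connects to a database.

```python
def geojson_point(lon, lat):
    """Return a GeoJSON Point; GeoJSON orders coordinates as [longitude, latitude]."""
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("coordinates outside the WGS84 range")
    return {"type": "Point", "coordinates": [lon, lat]}


def mongo_index_spec(field="geom"):
    """Key specification for a 2dsphere index, as a list of (field, type) pairs.

    The field name is a hypothetical placeholder. A list like this can be
    passed to a driver call such as pymongo's Collection.create_index.
    """
    return [(field, "2dsphere")]


def postgis_index_sql(table="osm_points", column="geom"):
    """SQL for a GiST spatial index; table and column names are hypothetical."""
    return f"CREATE INDEX {table}_{column}_gist ON {table} USING GIST ({column});"


# Hypothetical example document for one point feature.
document = {"name": "example feature", "geom": geojson_point(-1.61, 54.97)}
index_spec = mongo_index_spec()
index_sql = postgis_index_sql()
```

In MongoDB the index specification has to be issued as a separate step after the documents are loaded, which is the manual step the import comparison refers to.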
Figure 1 Import Times

MongoDB prevailed as the quicker system for CRUD (Create, Read, Update and Delete) operations.

Table 2 System Operations
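The summary does not describe how operation times were measured. One common approach, shown here purely as an assumption, is to repeat each operation several times and report the median wall-clock time. The function `time_operation` and the stand-in workload are hypothetical; in a real benchmark the callable would wrap a database call.

```python
import statistics
import time


def time_operation(operation, repeats=5):
    """Run `operation` `repeats` times and return the median wall-clock seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def stand_in_insert():
    """Stand-in workload: fills an in-memory dict instead of a real database."""
    store = {}
    for i in range(1000):
        store[i] = {"type": "Point", "coordinates": [0.0, 0.0]}
    return store


median_seconds = time_operation(stand_in_insert)
```

Using the median rather than the mean reduces the influence of a single slow run, for example one affected by a cold cache.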
MongoDB continued to outperform the more complex spatial system in the Find Nearest Point and Distance Buffer queries that followed, generally running quicker at every dataset scale, with the exception of the smaller Buffer. This began to highlight the scalability benefits of MongoDB: less performance degradation was observed as dataset scale increased.

Table 3 Spatial Queries
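The exact benchmark queries are not reproduced in this summary. As a hedged illustration of what a nearest-point query and a distance-buffer query can look like in each system, the sketch below builds MongoDB filter documents and parameterised PostGIS SQL strings. The table name `osm_points`, the field name `geom` and the helper names are hypothetical, and the queries are only constructed, not executed.

```python
# Equatorial Earth radius in metres, used to convert a distance to radians
# for MongoDB's $centerSphere operator.
EARTH_RADIUS_M = 6378100.0


def mongo_nearest(lon, lat, field="geom"):
    """Filter for $near: results come back ordered nearest first.

    Requires a geospatial index on `field`; limit the cursor to 1 for the
    single nearest point.
    """
    point = {"type": "Point", "coordinates": [lon, lat]}
    return {field: {"$near": {"$geometry": point}}}


def mongo_within_distance(lon, lat, metres, field="geom"):
    """Filter for features within `metres` of a point, using $geoWithin."""
    radians = metres / EARTH_RADIUS_M
    return {field: {"$geoWithin": {"$centerSphere": [[lon, lat], radians]}}}


def postgis_nearest(table="osm_points", column="geom"):
    """Nearest-neighbour SQL using the <-> distance operator.

    The %s placeholders are for longitude and latitude.
    """
    return (
        f"SELECT * FROM {table} "
        f"ORDER BY {column} <-> ST_SetSRID(ST_MakePoint(%s, %s), 4326) LIMIT 1;"
    )


def postgis_within_distance(table="osm_points", column="geom"):
    """Distance-buffer SQL; placeholders are longitude, latitude, metres."""
    return (
        f"SELECT * FROM {table} WHERE ST_DWithin({column}::geography, "
        "ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, %s);"
    )


nearest_filter = mongo_nearest(-1.61, 54.97)
buffer_filter = mongo_within_distance(-1.61, 54.97, 500.0)
```

Casting to `geography` in the PostGIS buffer query lets the distance be given in metres, which keeps it comparable with the spherical calculation on the MongoDB side.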
MongoDB outperformed PostGIS in many performance tests, but these covered rather basic spatial functionality. It became clear that PostGIS is a far more specialist spatial system with far more complex spatial functionality, and many of its operations could not be matched by MongoDB. In many of the other, more complex tests, PostGIS outperformed MongoDB, showing that the more advanced spatial system had advantages. Increasing dataset size still affected PostGIS, however, raising its operation times dramatically in comparison with the impact on MongoDB.

Table 4 Further Spatial Queries
Conclusions
Key References

de Hass, W., Quak, W. & Vermaji, M., 2008. A spatial DBMS buyers guide, s.l.: Delft University of Technology Section GIS Technology.

Goodchild, M. F., 1992. Geographical information science. Geographical Information Systems, 6(1), pp. 31-45.

McCarthy, C., 2014. Does NoSQL have a place in GIS? - An open-source spatial database performance comparison with proven RDBMS.

MongoDB, 2014. MongoDB Manual. [Online]

PostGIS, 2014. Chapter 4. Using PostGIS. [Online]

Ray, S., Simion, B. & Brown, A. D., 2011. Jackpine: A Benchmark to Evaluate Spatial Database Performance. Data Engineering (ICDE), Volume 27, pp. 1139-1150.

Simion, B., Ilha, D. N., Brown, A. D. & Johnson, R., 2013. The Price of Generality in Spatial Indexing, Toronto: Department of Computer Science, University of Toronto.

Stonebraker, M. & Cetintemel, U., 2005. One Size Fits All: An Idea whose Time has Come and Gone. ICDE '05: Proceedings of the 21st International Conference on Data Engineering, pp. 2-11.

Stonebraker, M., Frew, J., Gardels, K. & Meredith, J., 1993. The SEQUOIA 2000 storage benchmark. SIGMOD '93: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp. 2-12.

Vyas, R. K., Paliwal, M. & Pal, B. L., 2011. Conceptual Review on Relational and Spatial Database Query Processing and Benchmarking. International Journal of Advanced Research in Computer Science, 2(5), pp. 578-580.