Vegetation change detection based on time series analysis by Apache Spark and RasterFrame
Hanoi University of Mining and Geology, Vietnam
- Received: 18th-Sept-2020
- Revised: 9th-Jan-2021
- Accepted: 2nd-Feb-2021
- Online: 28th-Feb-2021
- Section: Geomatics and Land Administration
Spatial big data is large in scale and complex, so traditional data analysis software cannot collect, manage, and analyze it within a reasonable time. Such platforms are also, in many situations, restricted to vector data. However, the raster data generated by sensors on the enormous number of satellites now in orbit needs to be processed in parallel in a cluster environment. This article introduces a method for analyzing satellite image data using the RasterFrames library on the Apache Spark platform. RasterFrames supports raster data analysis in Python, Scala, and SQL, bringing the power of Spark DataFrames to Earth observation data access, cloud computing, and data science. In the experimental part, the Normalized Difference Vegetation Index (NDVI) and the change in the mean NDVI over a time series are calculated to show changes in vegetation cover in Phu Tho province. These results serve as a reference data source for assessing weather, climate, and environmental changes in the study area over that period.
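To illustrate the quantity tracked in the experimental part, the following is a minimal plain-Python sketch of per-pixel NDVI, the mean NDVI of a scene, and the change in that mean between consecutive dates. It is not the RasterFrames API and not the article's implementation, which runs on Spark DataFrames. The function names, the handling of pixels with no signal, and the 2x2 band values are all assumptions made up for illustration.

```python
from statistics import mean


def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red) for one pixel; None when undefined."""
    total = nir + red
    if total == 0:
        return None  # assumption: zero signal in both bands is treated as no-data
    return (nir - red) / total


def mean_ndvi(red_band, nir_band):
    """Mean NDVI over a scene given as two equally sized 2-D lists."""
    values = [
        ndvi(r, n)
        for red_row, nir_row in zip(red_band, nir_band)
        for r, n in zip(red_row, nir_row)
    ]
    valid = [v for v in values if v is not None]
    return mean(valid) if valid else None


# Hypothetical 2x2 red and near-infrared tiles for two dates (made-up values).
scenes = {
    "2019-01": {"red": [[1, 2], [1, 0]], "nir": [[3, 2], [3, 0]]},
    "2020-01": {"red": [[1, 1], [1, 0]], "nir": [[1, 1], [3, 0]]},
}

# Mean NDVI per date, in chronological order.
series = {date: mean_ndvi(s["red"], s["nir"]) for date, s in sorted(scenes.items())}

# Change in mean NDVI between each date and the one before it.
dates = list(series)
change = {
    later: series[later] - series[earlier]
    for earlier, later in zip(dates, dates[1:])
}
```

A negative entry in `change` indicates that the mean NDVI dropped between the two dates, which in a study like this one would be read as a possible loss of vegetation cover.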
Aji, A., Sun, X., Vo, H., Liu, Q., Lee, R., Zhang, X., Saltz, J. and Wang, F., (2013). Demonstration of Hadoop-GIS: a spatial data warehousing system over MapReduce. In Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (pp. 528-531). ACM.
Xiao, F., (2017). A Big Spatial Data Processing Framework Applying to National Geographic Conditions Monitoring. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-3, 2018 ISPRS TC III Mid-term Symposium “Developments, Technologies and Applications in Remote Sensing”, 7-10 May, Beijing, China.
Hughes, J. N., Annex, A., Eichelberger, C. N., Fox, A., Hulbert, A. and Ronquest, M., (2015). Geomesa: a distributed architecture for spatio-temporal fusion. In SPIE Defense+ Security (pp. 94730F-94730F). International Society for Optics and Photonics.
Lu, J. and Guting, R. H., (2012). Parallel secondo: boosting database engines with hadoop. In Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on (pp. 738-743). IEEE.
Nishimura, S., Das, S., Agrawal, D. and El Abbadi, A., (2011), June. Md-hbase: A scalable multi-dimensional data infrastructure for location aware services. In Mobile Data Management (MDM), 2011 12th IEEE International Conference on (Vol. 1, pp. 7-16). IEEE.
RasterFrames. http://rasterframes.io/.
Hagedorn, S., Götze, P. and Sattler, K.-U., (2017). Big Spatial Data Processing Frameworks: Feature and Performance Evaluation. In 20th International Conference on Extending Database Technology (EDBT).
Yu, J., Wu, J. and Sarwat, M., (2015). Geospark: A cluster computing framework for processing large-scale spatial data. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems (p.70). ACM.