Geospatial Big Data Processing with Python: Detecting Green Roofs in Toronto
Introduction
As geospatial data becomes ubiquitous, processing it at scale has become an essential part of big data analytics. The volume of data keeps growing rapidly, and processing geospatial big data (2D, 3D, point cloud) has long been a challenge, not only in the information technology (IT) sector but also in the geospatial domain. Handling this data efficiently is essential for extracting meaningful information from it. Big data processing techniques analyze datasets at terabyte or even petabyte scale, and in many cases a combination of tools and approaches is needed. Several Python libraries and platforms can process large amounts of geospatial data, including GeoRasters, GDAL, Dask, GeoPandas, rasterstats, Databricks, Apache Spark…
In this project, we will learn how to process geospatial big data (1.3 billion points) with Python to detect green roofs. The project aims to identify existing green roofs and to explore potential green roofs in Toronto. Several approaches are used to identify them: raster zonal statistics, raster point data processing, and machine learning (deep learning).
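To make the first of these approaches concrete, raster zonal statistics summarizes the raster cells that fall inside each zone, for example each building footprint. The following is a minimal pure-Python sketch of that idea, not the project's actual pipeline. The `zonal_mean` helper, the toy grids, the use of a vegetation-index-like value, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Return the mean raster value per zone id.

    `values` and `zones` are same-shaped 2D lists. A zone id of 0 is
    treated as "outside every zone" (an assumption of this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone == 0 or value == nodata:
                continue  # skip background cells and missing data
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 3x4 raster of a vegetation-index-like value per cell.
vegetation_index = [
    [0.8, 0.7, 0.1, 0.2],
    [0.6, 0.9, 0.1, 0.0],
    [0.3, 0.3, 0.2, 0.1],
]

# Hypothetical rasterized building footprints: 1 and 2 are building ids.
building_ids = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 0, 0],
]

means = zonal_mean(vegetation_index, building_ids)

# Assumed cutoff: roofs whose mean value exceeds it become candidates.
GREEN_THRESHOLD = 0.5
candidates = sorted(zone for zone, mean in means.items() if mean > GREEN_THRESHOLD)
print(means, candidates)
```

On real data the same per-zone summary would be computed by a library over actual rasters and building polygons rather than nested lists, but the logic of grouping cells by zone and reducing them to one statistic is the same.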
Detecting green roofs is a complicated process due to the impact of tree-shade…