Another piece of good news for those dealing with geospatial data is that Azure already offers a Geo Artificial Intelligence Data Science Virtual Machine (Geo-DSVM), equipped with ESRI’s ArcGIS Pro Geographic Information System. In OpenStreetMap there are currently 30,567,953 building footprints in the US (at last count), both from editor contributions and from various city-wide or county-wide imports. This image features buildings with roofs of different colors, roads, pavements, trees and yards. This opens the Geoprocessing pane; from there, go to Toolboxes > Image Analyst Tools > Deep Learning > Detect Objects Using Deep Learning. We use a Fully Convolutional Neural Network to extract bounding polygons for building footprints. For those eager to get started, you can head over to our repo on GitHub to read about the dataset, storage options and instructions on running the code or modifying it for your own dataset.

In June 2018, our colleagues at Bing announced the release of 124 million building footprints in the United States in support of the OpenStreetMap project, an open data initiative that powers many location-based services and applications. The Bing team was able to create so many building footprints from satellite images by training and applying a deep neural network model that classifies each pixel as building or non-building. The semantic segmentation model we are training (a U-Net implemented in PyTorch, different from what the Bing team used) can be used for other tasks in analyzing satellite, aerial or drone imagery: you can use the same method to extract roads from satellite imagery, infer land use and monitor sustainable farming practices. It also applies to a wide range of other domains, such as locating lungs in CT scans for lung disease prediction and evaluating a street scene. Blobs of connected building pixels are then described in polygon format, subject to a minimum polygon area threshold, a parameter you can tune to reduce false positive proposals.
The geospatial data and machine learning communities have joined efforts on this front, publishing several datasets such as Functional Map of the World (fMoW) and the xView Dataset for people to create computer vision solutions on overhead imagery. Such tools will finally enable us to accurately monitor and measure the impact of our solutions to problems such as deforestation and human-wildlife conflict, helping us to invest in the most effective conservation efforts. Depending on the type of data employed for building extraction, the existing methods can be divided into two main groups: those using aerial or high-resolution satellite imagery and those using three-dimensional (3D) information.

The topic of this blog is a ready-to-use deep learning model to extract building footprints. Keywords: Building footprint, Segmentation, Aerial images, Vectorization, Deep Learning, GIS. This workflow has three steps.

The data from SpaceNet consists of 3-channel high-resolution (31 cm) satellite images over four cities where buildings are abundant: Paris, Shanghai, Khartoum and Vegas. The labels are released as polygon shapes defined using well-known text (WKT), a markup language for representing vector geometry objects on maps. These are transformed to 2D labels of the same dimension as the input images, where each pixel is labeled as one of background, boundary of building or interior of building. We chose a learning rate of 0.0005 for the Adam optimizer (default settings for other parameters) and a batch size of 10 chips, which worked reasonably well. Increasing the minimum polygon area threshold from 0 to 300 squared pixels causes the false positive count to decrease rapidly as noisy false segments are excluded.
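The conversion from WKT polygons to per-pixel labels can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the sample project: the class ids, the function names, the pixel-centre inclusion rule and the 4-neighbour boundary rule are all hypothetical choices, and only the exterior ring of a single simple polygon is handled.

```python
import re

# Hypothetical class ids; the source does not specify the numeric encoding.
BACKGROUND, INTERIOR, BOUNDARY = 0, 1, 2


def parse_wkt_polygon(wkt):
    """Parse the exterior ring of a simple 'POLYGON ((x y, ...))' string."""
    ring = re.search(r"\(\(([^()]*)\)", wkt).group(1)
    # Keep only x and y so that 'x y z' coordinates are also accepted.
    return [tuple(float(v) for v in pt.split()[:2]) for pt in ring.split(",")]


def point_in_ring(x, y, ring):
    """Even-odd ray casting test of a point against a closed ring."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(wkt, height, width):
    """Return a height x width grid of class ids for one building polygon."""
    ring = parse_wkt_polygon(wkt)
    # Assumption: a pixel is part of the building if its centre is inside.
    filled = [[point_in_ring(c + 0.5, r + 0.5, ring) for c in range(width)]
              for r in range(height)]
    mask = [[BACKGROUND] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not filled[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # Building pixels next to a non-building pixel (or the image
            # edge) are labeled as boundary; the rest are interior.
            on_edge = any(
                not (0 <= nr < height and 0 <= nc < width) or not filled[nr][nc]
                for nr, nc in neighbours
            )
            mask[r][c] = BOUNDARY if on_edge else INTERIOR
    return mask


if __name__ == "__main__":
    for row in rasterize("POLYGON ((1 1, 5 1, 5 5, 1 5, 1 1))", 6, 6):
        print(row)
```

In practice a geometry library would normally handle parsing and rasterization; the pure-Python version is used here only to keep the sketch self-contained.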
In this post, we highlight a sample project of using Azure infrastructure for training a deep learning model to gain insight from geospatial data. Geospatial data and computer vision, an active field in AI, are natural partners: there are tasks involving visual data that cannot be automated by traditional algorithms, an abundance of labeled data, and even more unlabeled data waiting to be understood in a timely manner. An example of infusing geospatial data and AI into applications that we use every day is using satellite images to add street map annotations of buildings. Now you can do exactly that on your own! With the sample project that accompanies this blog post, we walk you through how to train such a model on an Azure Deep Learning Virtual Machine (DLVM).

This is an object detection task on a spatial dataset (satellite imagery). Naturally, the model works best for building footprint detection and extraction in the U.S.; however, it is reported to work reasonably well for other locations too. After epoch 7, the network has learnt that building pixels are enclosed by border pixels, separating them from road pixels. The optimum minimum polygon area threshold is about 200 squared pixels.

Each plot in the figure is a histogram of building polygons in the validation set by area, from 300 square pixels to 6000. The count of true positive detections, in orange, is based on the area of the ground truth polygon to which the proposed polygon was matched. The top histogram is for weights in ratio 1:1:1 in the loss function for background : building interior : building boundary; the bottom histogram is for weights in ratio 1:8:1. We can see that towards the left of the histogram, where small buildings are represented, the bars for true positive proposals in orange are much taller in the bottom plot.
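The effect of the class weights in the loss function (1:1:1 versus 1:8:1 for background : building interior : building boundary) can be illustrated with a small weighted cross-entropy sketch. The function name and the toy probabilities are hypothetical, and dividing by the total weight is one common normalization convention assumed here; the sample project may compute its loss differently.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of per-pixel negative log-likelihoods.

    probs: per-pixel lists of predicted class probabilities
    labels: true class index for each pixel
    class_weights: one weight per class, in the same class order
    """
    total, weight_sum = 0.0, 0.0
    for pixel_probs, label in zip(probs, labels):
        w = class_weights[label]
        total += -w * math.log(pixel_probs[label])
        weight_sum += w
    # Assumption: normalize by the summed weights of the observed labels.
    return total / weight_sum


if __name__ == "__main__":
    # Class order assumed here: background, building interior, building boundary.
    probs = [[0.9, 0.05, 0.05],   # confident, correct background pixel
             [0.2, 0.6, 0.2]]     # less confident building interior pixel
    labels = [0, 1]
    print(weighted_cross_entropy(probs, labels, [1, 1, 1]))
    print(weighted_cross_entropy(probs, labels, [1, 8, 1]))
```

With the 1:8:1 weights, errors on building interior pixels dominate the loss, which is consistent with the observation that small buildings are detected more often under that setting.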
“We wanted to use machine learning to extract street data and building footprints from the satellite imagery while using the minimum amount of human input.”

Deep Learning to the Rescue

Deep learning, a powerful form of AI, involves teaching a computer to detect patterns in large amounts of data, and to recognize and extract just the information you want.

After epoch 10, smaller, noisy clusters of building pixels begin to disappear as the shape of buildings becomes more defined. Some chips are partially or completely empty, like the examples below. This is an artifact of the original satellite images, and the model should be robust enough not to propose building footprints on empty regions.

How to extract building footprints from satellite images using deep learning
By Kristen Waston

I work with our partners and other researchers inside Microsoft to develop new ways to use machine learning and other AI approaches to solve global environmental challenges. Finally, if your organization is working on solutions to address environmental challenges using data and machine learning, we encourage you to apply for an AI for Earth grant so that you can be better supported in leveraging Azure resources and become a part of this purposeful community.

I would like to thank Victor Liang, Software Engineer at Microsoft, who worked on the original version of this project with me as part of the coursework for Stanford’s CS231n in Spring 2018, and Wee Hyong Tok, Principal Data Scientist Manager at Microsoft, for his help in drafting this blog post.
When we looked at the most widely-used tools and datasets in the environmental space, remote sensing data in the form of satellite images jumped out. We also created a tutorial on how to use the Geo-DSVM for training deep learning models and integrating them with ArcGIS Pro to help you get started.

The workflow consists of three major steps: (1) extract training data, (2) train a deep learning feature classifier model, (3) make inference using the model.

I have WorldView-2 and WorldView-3 imagery (including SWIR bands) of dense urban areas. The output should be a shapefile. When I tried the same architecture on other kinds of datasets (MNIST, CIFAR-10), it worked perfectly.

A final step is to produce the polygons by assigning all pixels predicted to be building boundary as background to isolate blobs of building pixels. Another parameter, unrelated to the CNN part of the procedure, is the minimum polygon area threshold below which blobs of building pixels are discarded.
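The blob isolation and area filtering described above can be sketched as follows. This is an illustrative outline, not the project's implementation: the class ids, the function name and the use of 4-connectivity are assumptions, and tracing each blob's outline into an actual polygon is left out.

```python
from collections import deque

# Hypothetical class ids; the source does not specify the numeric encoding.
BACKGROUND, INTERIOR, BOUNDARY = 0, 1, 2


def extract_blobs(pred, min_area):
    """Return one list of (row, col) pixels per building blob.

    pred: 2D list of predicted class ids
    min_area: blobs with fewer pixels than this are discarded
    """
    height, width = len(pred), len(pred[0])
    # Treat boundary predictions as background so that adjacent
    # buildings fall apart into separate blobs.
    building = [[cell == INTERIOR for cell in row] for row in pred]
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for r in range(height):
        for c in range(width):
            if not building[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected building pixels.
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                blob.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and building[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(blob) >= min_area:
                blobs.append(blob)
    return blobs


if __name__ == "__main__":
    pred = [[1, 1, 2, 1],
            [1, 1, 2, 0],
            [0, 0, 0, 0]]
    print([len(b) for b in extract_blobs(pred, min_area=1)])
    print([len(b) for b in extract_blobs(pred, min_area=2)])
```

Raising `min_area` removes the small, noisy blobs first, which is the mechanism behind tuning the threshold to reduce false positive proposals.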
Original images are cropped into nine smaller chips with some overlap using utility functions provided by SpaceNet (details in our repo). We make use of the labeled data made available by the SpaceNet initiative to demonstrate how you can extract information from visual environmental data using deep learning. We use the Vegas subset, consisting of 3854 images of size 650 x 650 squared pixels. About 17.37 percent of the training images contain no buildings; we did not exclude or resample images. The sample project contains a walkthrough of carrying out the training and evaluation pipeline on a DLVM. Segmentation results are produced by the model at various epochs during training for the input image and label. The minimum polygon area threshold is a parameter of the polygonization step that you can tune.

This sample shows how ArcGIS API for Python can be used to train a deep learning model to extract building footprints using satellite images. The ready-to-use building footprint extraction model was trained on large quantities of U.S. imagery datasets (30-60 cm resolution). The trained model can be deployed on ArcGIS Pro or ArcGIS Enterprise to extract building footprints from high-resolution satellite imagery; a classified raster can be generated using the Classify Pixels Using Deep Learning tool. Deep learning models are now available in ArcGIS Online (watch for more models in the future!).

Illustration from slides by Tingwu Wang, University of Toronto.
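The cropping of each 650 x 650 image into nine overlapping chips can be sketched as follows. The chip size of 256 pixels, the evenly spaced 3 x 3 layout and both function names are assumptions made for illustration; the actual SpaceNet utility functions may place the chips differently.

```python
def chip_origins(image_size, chip_size, chips_per_side=3):
    """Evenly spaced top-left offsets so the chips span the full image side."""
    if chips_per_side < 2:
        return [0]
    stride = (image_size - chip_size) / (chips_per_side - 1)
    return [round(i * stride) for i in range(chips_per_side)]


def chip_boxes(image_size, chip_size, chips_per_side=3):
    """(row_start, col_start, row_end, col_end) boxes for a square image."""
    origins = chip_origins(image_size, chip_size, chips_per_side)
    return [(r, c, r + chip_size, c + chip_size) for r in origins for c in origins]


if __name__ == "__main__":
    # Assumed chip size of 256; the source only states that there are
    # nine chips with some overlap.
    for box in chip_boxes(650, 256):
        print(box)
```

Neighbouring chips overlap whenever the stride is smaller than the chip size, so buildings cut by one chip border are likely to appear whole in an adjacent chip.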