Data from Google Earth and other Earth observation campaigns is allowing Statistics Canada (StatsCan) to create databases showing how Canadians live and work.
The government department showcased some of its ongoing work at the online GeoIgnite conference on Friday (July 24), with projects ranging from building catalogues to walkability.
“This new data environment [allows us to] bring together and harmonize open microdata from provincial and municipal services,” said Alessandro Alasia, who is assistant director at the Centre for Special Business Projects at Statistics Canada, where he leads the Data Exploration and Integration Lab (DEIL).
DEIL has been working with many open datasets in recent years, and one of its culminating projects is the Linkable Open Data Environment (LODE). Among LODE’s achievements is making available, with Microsoft, databases of almost all building footprints across Canada. LODE also has preliminary versions of open databases covering businesses, educational facilities, municipal addresses and, relevantly for COVID-19, hospitals.
As an example, the Open Database of Healthcare Facilities was released rapidly in April in response to the novel coronavirus pandemic, said DEIL member Joseph Kuchar. An updated version with more data should be available in August. The database represents a compilation of data from the provincial, municipal, federal and (in the case of the Canadian Institutes for Health Research) not-for-profit sectors.
The DEIL databases depend in part on data obtained from satellites. That data supports tasks such as mapping facilities relative to each other, obtaining latitude and longitude and, in some cases, deriving information such as building height that can be estimated from satellite images.
The building height example comes from DEIL’s Open Database of Buildings, a compilation of 65 datasets originating from municipal and provincial sources, said DEIL geospatial analyst Marina Smailes. The database records attributes such as latitude, longitude, footprint area, footprint perimeter and location based on census subdivisions.
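As a rough illustration of how attributes like footprint area and perimeter can be derived from a building outline, here is a minimal Python sketch using the shoelace formula. It assumes the footprint is a simple polygon whose vertices are already in a projected coordinate system measured in metres; the function name `footprint_metrics` is hypothetical and this is not necessarily how StatsCan computes its published values.

```python
from math import hypot


def footprint_metrics(vertices):
    """Return (area, perimeter, centroid) for a simple polygon.

    vertices: list of (x, y) pairs in metres, in order around the outline.
    Assumption: coordinates are projected (planar), not raw latitude/longitude.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a footprint needs at least three vertices")

    twice_area = 0.0  # signed, doubled area accumulated by the shoelace formula
    perimeter = 0.0
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += hypot(x1 - x0, y1 - y0)

    if twice_area == 0:
        raise ValueError("degenerate footprint with zero area")

    # Centroid of a polygon: sums divided by 6 * signed area = 3 * twice_area.
    centroid = (cx / (3 * twice_area), cy / (3 * twice_area))
    return abs(twice_area) / 2, perimeter, centroid


# A 10 m by 20 m rectangular footprint.
area, perimeter, centroid = footprint_metrics([(0, 0), (10, 0), (10, 20), (0, 20)])
print(area, perimeter, centroid)
```

In practice the centroid would be converted back to latitude and longitude before being stored alongside the other attributes.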
Partnerships have expanded the database’s reach, Smailes added. Microsoft helped StatsCan fill in more remote buildings, while using the database as a training dataset for machine learning. Postsecondary capstone projects (such as with Fleming College and the University of British Columbia) have added information such as density of buildings within a given area, which helps with metrics such as a population’s access to neighbourhood facilities. A prototype GitHub viewer explores building footprint visualizations based on this database.
The building height estimates for the database were derived from Google Street View images, said DEIL researcher Ala’a Al-Habashna. The goal is to create an automatic and open source system for building height estimation, Al-Habashna said. Multiple data sources are used to refine the work, such as Nominatim, a geocoding tool that searches OpenStreetMap data by name and address. Advanced image processing can partition each image into meaningful parts and associate individual pixels with a class label, which is useful for machine learning techniques as well, Al-Habashna added.
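The idea of associating each pixel with a class label (often called semantic segmentation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class list, the toy score grid and both function names are hypothetical, and a real system would get its per-pixel scores from a trained model. Turning the resulting pixel extent into a height in metres would also need camera geometry, which is not covered here.

```python
CLASSES = ("sky", "building", "road")  # hypothetical label set


def label_pixels(scores):
    """Assign each pixel the class with the highest score.

    scores[row][col] is a sequence with one score per entry in CLASSES.
    """
    return [
        [CLASSES[max(range(len(CLASSES)), key=lambda k: px[k])] for px in row]
        for row in scores
    ]


def building_pixel_rows(labels):
    """Return (top_row, bottom_row) of rows containing building pixels, or None."""
    rows = [r for r, row in enumerate(labels) if "building" in row]
    return (rows[0], rows[-1]) if rows else None


# A toy 3 x 2 "image" of per-class scores (sky, building, road).
scores = [
    [(0.9, 0.05, 0.05), (0.8, 0.1, 0.1)],
    [(0.2, 0.7, 0.1), (0.6, 0.3, 0.1)],
    [(0.1, 0.6, 0.3), (0.1, 0.2, 0.7)],
]
labels = label_pixels(scores)
print(labels)
print(building_pixel_rows(labels))
```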
More generally, these databases help StatsCan get a handle on “proximity measures,” metrics that describe how location affects accessibility and local inclusion, said DEIL economist Nick Newstead. One common metric neighbourhood planners use is walkability to common services such as grocery stores, particularly for people living in dense neighbourhoods who may not have cars or other personal vehicles such as bicycles. You can explore the dataset yourself using StatsCan’s Proximity Measures Data Viewer, which was developed with the Canada Mortgage and Housing Corp.
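To give a feel for what a walkability-style proximity measure involves, here is a simplified Python sketch that lists the services within a chosen distance of a home, using straight-line (haversine) distance. The function names, coordinates, amenity list and the 1,000 m threshold are all illustrative assumptions; the sketch does not reproduce StatsCan’s methodology, and real walkability measures usually rely on distance along the street network instead.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def amenities_within(home, amenities, max_distance_m=1000):
    """Return names of amenities within max_distance_m of home.

    home: (lat, lon); amenities: iterable of (name, lat, lon).
    The 1,000 m default is an illustrative walking threshold, not an official one.
    """
    return [
        name
        for name, lat, lon in amenities
        if haversine_m(home[0], home[1], lat, lon) <= max_distance_m
    ]


# Made-up coordinates for illustration only.
home = (45.4215, -75.6972)
amenities = [
    ("grocery store A", 45.4265, -75.6972),  # about 0.005 degrees north of home
    ("grocery store B", 45.4715, -75.6972),  # about 0.05 degrees north of home
]
print(amenities_within(home, amenities))
```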
In the question and answer period, Alasia pointed to a need to keep renewing the databases as more information comes in; for example, the buildings database is always changing as new buildings are constructed and older ones demolished. There is the potential to add some interior databases as those become more widely available, Alasia added. Newstead added that proximity to transit can also be considered in the proximity measures database.
Open Projects at DEIL presentation at Carleton University, March 13, 2019: https://sqreports.s3.ca-central-1.amazonaws.com/2020/Open_Data_Carleton_20190312-FINAL.pdf