Thoughts on new work, small discoveries, and the publications that mark the journey.

Evaluating the Feasibility of ChatGPT for Mapping Building Attributes

Over the past few years, we have been exploring how multimodal large language models can help urban researchers extract information from images. This book chapter, “Evaluating the Feasibility of ChatGPT for Mapping Building Attributes,” focuses on a practical question: can ChatGPT extract building type (function), height, and age from Mapillary street view images in New York City?

The motivation is simple: many urban climate and energy models need detailed information about buildings, but fine-grained building attributes are often incomplete, expensive to collect, or unavailable at scale. Multimodal large language models lower part of the technical barrier by allowing researchers to turn visual cues into structured information without building a full computer vision pipeline from scratch.

In this chapter, we compare ChatGPT-derived building attributes with authoritative and hand-labeled reference data. The results suggest that building function is easier to extract than age or height. They also show why image quality, prompt design, and validation data matter when multimodal large language models (MLLMs) are used for urban analytics.
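As a rough illustration of what such a comparison can look like, here is a minimal Python sketch. It is not the code or the evaluation protocol used in the chapter: the record layout, the field names (`pred_function`, `ref_height_m`, and so on), and every value are hypothetical and made up purely for illustration. It scores a categorical attribute by the share of exact matches and a numeric attribute by mean absolute error.

```python
from statistics import mean

# Hypothetical records pairing a model answer with a reference value.
# These numbers are invented for illustration; they are NOT results
# from the chapter.
records = [
    {"pred_function": "residential", "ref_function": "residential",
     "pred_height_m": 12.0, "ref_height_m": 10.0},
    {"pred_function": "commercial", "ref_function": "commercial",
     "pred_height_m": 30.0, "ref_height_m": 36.0},
    {"pred_function": "residential", "ref_function": "mixed",
     "pred_height_m": 9.0, "ref_height_m": 9.0},
    {"pred_function": "industrial", "ref_function": "industrial",
     "pred_height_m": 8.0, "ref_height_m": 12.0},
]


def categorical_accuracy(rows, pred_key, ref_key):
    """Share of rows where the predicted label equals the reference label."""
    matches = [row[pred_key] == row[ref_key] for row in rows]
    return sum(matches) / len(matches)


def mean_absolute_error(rows, pred_key, ref_key):
    """Average absolute difference between predicted and reference values."""
    return mean(abs(row[pred_key] - row[ref_key]) for row in rows)


function_accuracy = categorical_accuracy(records, "pred_function", "ref_function")
height_mae = mean_absolute_error(records, "pred_height_m", "ref_height_m")

print(f"Function accuracy: {function_accuracy:.2f}")
print(f"Height MAE (m): {height_mae:.1f}")
```

Exact-match accuracy suits categorical attributes such as function, while an error measure in metres is more informative for height. Building age could be handled either way, depending on whether it is recorded as a construction year or as an era category.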

Abstract:

With increasing rates of urbanization, many challenges are emerging regarding urban sustainability such as the energy usage of buildings. Coinciding with this is the growing attention of urban climate models for energy demand estimation and climate adaptation strategies. However, the applicability of these models is constrained by the lack of detailed urban surface information. Therefore, creating comprehensive datasets that capture urban surface information at a granular scale is crucial for responding to our rapidly urbanizing world.

Recent advancements in multimodal large language models have opened new opportunities in urban studies, offering accessible methods for information extraction. In this chapter we explore the feasibility of ChatGPT to extract building attributes from images. Taking New York City as a case study, we collect building images from Mapillary and process them through ChatGPT by posing specific questions to extract building attributes, including height, function, and age. These attributes are then compared with authoritative data.

Figure: An overview of the research workflow.

Figure: A comparison of building heights from ChatGPT and NYC Open Data.

Full reference:

Chen, Q., See, L., and Crooks, A. T. (2026). Evaluating the Feasibility of ChatGPT for Mapping Building Attributes. In Janowicz, K., Zhu, R., Mai, G., Gao, S., Hu, Y., Wang, Z., Cai, L., and Bennett, L. (eds.), Geography According to Foundation Models. IOS Press. https://doi.org/10.3233/FAIA260474

See the related project page, Building Attributes Extraction, and the publication page.