Home | Qingqing Chen

Monday, June 01, 2026

Evaluating the Feasibility of ChatGPT for Mapping Building Attributes

Urban climate models now often depend on detailed urban surface inputs, yet these data are not always complete, consistent, or easy to build at scale. That practical constraint motivated this new book chapter: “Evaluating the Feasibility of ChatGPT for Mapping Building Attributes,” written with Linda See and Andrew Crooks, and published in the open access book “Geography According to Foundation Models.”

Using New York City as a case study, we test whether a multimodal model can recover three attributes that are highly useful for urban analysis: building function, height, and age. The goal is not to claim ChatGPT as a replacement for established methods, but to examine whether it can provide useful, scalable signals where data pipelines are still fragmented.

The process is intentionally simple: Mapillary images are collected, the same prompts are applied to ChatGPT, and the outputs are validated against authoritative references. The patterns are clear and useful: building function is comparatively reliable, while height and age remain more variable. Most failures come from familiar issues in visual inference, such as partial views, ambiguous facades, and prompt sensitivity.
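As a rough illustration of the validation step, the comparison between model outputs and reference data could be sketched as below. This is a minimal sketch under stated assumptions, not the chapter's actual code: the field names, the two metrics (categorical accuracy for function, mean absolute error for height), and the toy records are all hypothetical and chosen only to show the shape of the check.

```python
from statistics import mean

# Toy records invented purely for illustration; these are NOT results from
# the chapter. Field names are hypothetical. A prediction of None stands for
# an image where the model gave no usable answer.
records = [
    {"id": "b1", "pred_function": "residential", "ref_function": "residential",
     "pred_height_m": 12.0, "ref_height_m": 10.0},
    {"id": "b2", "pred_function": "commercial", "ref_function": "commercial",
     "pred_height_m": 30.0, "ref_height_m": 36.0},
    {"id": "b3", "pred_function": "residential", "ref_function": "mixed",
     "pred_height_m": None, "ref_height_m": 20.0},
]


def function_accuracy(rows):
    """Share of answered buildings whose predicted function matches the reference."""
    scored = [r for r in rows if r["pred_function"] is not None]
    if not scored:
        raise ValueError("no function predictions to score")
    matches = sum(r["pred_function"] == r["ref_function"] for r in scored)
    return matches / len(scored)


def height_mae(rows):
    """Mean absolute height error in metres, skipping buildings with no prediction."""
    errors = [
        abs(r["pred_height_m"] - r["ref_height_m"])
        for r in rows
        if r["pred_height_m"] is not None
    ]
    if not errors:
        raise ValueError("no height predictions to score")
    return mean(errors)


if __name__ == "__main__":
    print(f"function accuracy: {function_accuracy(records):.2f}")
    print(f"height MAE (m): {height_mae(records):.1f}")
```

Skipping unanswered images rather than counting them as errors is a design choice in this sketch; in practice the share of unanswered images is probably worth reporting alongside either metric.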

In short, multimodal large language models (MLLMs) can widen access to image-based urban sensing. They do so best when paired with strong validation, and they make quality control even more important.

The volume was edited by Krzysztof Janowicz, Rui Zhu, Gengchen Mai, Song Gao, Yingjie Hu, Zhangyu Wang, Ling Cai, and Lauren Bennett.

Abstract:

With increasing rates of urbanization, many challenges are emerging regarding urban sustainability, such as the energy usage of buildings. Coinciding with this is the growing attention to urban climate models for energy demand estimation and climate adaptation strategies. However, the applicability of these models is constrained by the lack of detailed urban surface information. Therefore, creating comprehensive datasets that capture urban surface information at a granular scale is crucial for responding to our rapidly urbanizing world. Recent advancements in Multimodal Large Language Models (MLLMs) have opened new opportunities in urban studies, offering accessible methods for information extraction. In this chapter we explore the feasibility of ChatGPT to extract building attributes from images. Taking New York City as a case study, we collect building images from Street View Imagery and process them through ChatGPT by posing specific questions to extract building attributes (e.g., height, functions, age). These attributes are then compared with authoritative data. The proposed method helps address the current dearth of fine-grained surface data on urban issues, thereby enhancing the accuracy and utility of urban climate models. Overall, this study demonstrates the practical applications of ChatGPT in geographic knowledge extraction, advancing the understanding of MLLMs in geographic contexts and contributing more broadly to the discourse on Artificial Intelligence (AI) in urban modeling and climate science.

An overview of the research workflow.

A comparison of building heights from ChatGPT and NYC Open Data.

Full reference:

Chen, Q., See, L., and Crooks, A. T. (2026). Evaluating the Feasibility of ChatGPT for Mapping Building Attributes. In Janowicz, K., Zhu, R., Mai, G., Gao, S., Hu, Y., Wang, Z., Cai, L., and Bennett, L. (eds.), Geography According to Foundation Models. IOS Press. https://doi.org/10.3233/FAIA260474