This project investigates the potential of Large Language Models (LLMs) to enhance, integrate with, or replace traditional Geographic Question Answering (GeoQA) systems. GeoQA systems aim to answer natural language questions that require geographic knowledge and spatial reasoning, such as "What restaurants are near Cardiff Castle?" or "Which cities are north of London?". Traditional GeoQA approaches rely on complex pipelines involving query parsing, named entity recognition, spatial relation extraction, and database querying. With the emergence of LLMs, the project explores whether these models can handle the core challenges of GeoQA: understanding user queries with spatial intent, performing spatial reasoning, interpreting vague and qualitative spatial language, and synthesising coherent, accurate answers.
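To make the traditional pipeline concrete, the following minimal Python sketch hand-rolls its main stages for a single question type ("near X"): a regex stands in for query parsing and named entity recognition, a hard-coded gazetteer stands in for a geographic database, and a fixed 1 km radius stands in for the vague relation "near". The place names, coordinates, threshold, and function names here are illustrative assumptions, not components of the project.

```python
import math
import re

# Toy gazetteer with approximate (lat, lon) coordinates; a real GeoQA system
# would query a geographic database or knowledge graph instead.
GAZETTEER = {
    "Cardiff Castle": (51.4818, -3.1810),
    "Cardiff Market": (51.4807, -3.1790),
    "London": (51.5074, -0.1278),
    "Edinburgh": (55.9533, -3.1883),
}

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def answer_near_question(question, radius_km=1.0):
    """Tiny 'near X' pipeline: extract the anchor place with a regex, then
    filter the gazetteer by distance. The radius is an arbitrary proxy for
    the vague relation 'near'."""
    match = re.search(r"near (.+?)\?", question)
    if not match or match.group(1) not in GAZETTEER:
        return []
    anchor = match.group(1)
    origin = GAZETTEER[anchor]
    return [name for name, coords in GAZETTEER.items()
            if name != anchor and haversine_km(origin, coords) <= radius_km]

print(answer_near_question("What places are near Cardiff Castle?"))
# -> ['Cardiff Market']
```

Even in this toy form, the pipeline's brittleness is visible: the parser only recognises one phrasing, and "near" has to be pinned to an arbitrary distance, which is exactly the kind of rigidity the project asks whether LLMs can relax.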
Research Questions
- How effectively can LLMs understand and interpret geographic questions involving spatial relations, place references, and qualitative descriptions?
- To what extent can LLMs perform spatial reasoning tasks traditionally handled by specialised GeoQA components?
- How do LLMs handle vague and vernacular spatial language (e.g., "near", "around", "in the city centre")?
- What are the limitations of LLMs in GeoQA, particularly regarding factual accuracy, spatial consistency, and hallucination?
- How can LLMs be integrated with structured geographic knowledge bases to improve answer quality and reliability?
- What hybrid architectures combining LLMs with traditional GeoQA components yield the best results? (One possible arrangement is sketched after this list.)
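As a concrete illustration of the last two questions, the sketch below shows one possible hybrid arrangement: the LLM is used only to translate the question into a structured query, which is then executed against a geographic knowledge base so that the spatial reasoning is grounded in stored coordinates rather than the model's own recall. Everything here is an assumption made for illustration: llm_parse_question is a stub standing in for a real model call, and CITY_KB is a four-entry stand-in for a knowledge graph.

```python
# Hypothetical LLM wrapper; in practice this would call whichever model the
# project adopts. Here it returns a canned structured query so the sketch runs.
def llm_parse_question(question: str) -> dict:
    return {"relation": "north_of", "anchor": "London", "target_type": "city"}

# Tiny stand-in for a geographic knowledge base keyed by city name.
CITY_KB = {
    "London": {"lat": 51.5074, "lon": -0.1278},
    "Edinburgh": {"lat": 55.9533, "lon": -3.1883},
    "Manchester": {"lat": 53.4808, "lon": -2.2426},
    "Brighton": {"lat": 50.8225, "lon": -0.1372},
}

def execute_structured_query(query: dict) -> list[str]:
    """Resolve the spatial relation against the knowledge base rather than
    trusting the LLM's geographic recall."""
    anchor = CITY_KB[query["anchor"]]
    if query["relation"] == "north_of":
        return [name for name, c in CITY_KB.items()
                if name != query["anchor"] and c["lat"] > anchor["lat"]]
    raise ValueError(f"Unsupported relation: {query['relation']}")

def answer(question: str) -> str:
    query = llm_parse_question(question)        # LLM: language understanding
    results = execute_structured_query(query)   # KB: grounded spatial reasoning
    # A second LLM call would normally phrase `results` as a fluent answer;
    # here the results are formatted directly.
    return f"Cities north of {query['anchor']}: " + ", ".join(sorted(results))

print(answer("Which cities are north of London?"))
# -> Cities north of London: Edinburgh, Manchester
```

This division of labour, language understanding delegated to the LLM and factual spatial resolution delegated to the knowledge base, is only one of the architectures the project would compare; the LLM could equally be used end to end, or only for answer synthesis.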
Potential Contributions
- Systematic evaluation of LLM capabilities across GeoQA subtasks
- Analysis of LLM performance on vague and qualitative spatial language
- Framework for integrating LLMs with geographic knowledge graphs
- Benchmark datasets and evaluation metrics for LLM-based GeoQA