We introduce a framework that integrates traditional topic modeling methods, Latent Dirichlet Allocation (LDA) and BERTopic, with Large Language Models (LLMs) to automatically identify topics featured in project proposals for the cultural heritage (CH) domain. Applied to a dataset of 1,757 English project proposals aimed at protecting and promoting CH in Africa, our approach begins by extracting initial topics using LDA and BERTopic. These topics are then refined by LLaMA3, which generates precise and semantically meaningful categories that incorporate domain expert-curated labels to ensure contextual relevance. The consistency of the assigned labels is evaluated using automatic classification. Additionally, we explore the role of linguistic features, such as sentence complexity, sentiment, and gendered language, as predictors of proposal success. Results highlight the potential of combining traditional topic modeling with LLMs to uncover hidden insights into funding allocation patterns, with the aim of enhancing the equitable distribution of resources in CH projects.
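To make the linguistic-feature step concrete, the following is a minimal sketch of how surface features such as sentence length and gendered-term counts might be computed for a single proposal. It is not the method used in this work: the function name `linguistic_features`, the two word lists, and the regex-based sentence splitting are all assumptions made for illustration, and sentiment is omitted because it would require an external model.

```python
import re

# Hypothetical word lists, chosen for illustration only (not taken from the paper).
MASCULINE = {"he", "him", "his", "man", "men"}
FEMININE = {"she", "her", "hers", "woman", "women"}


def linguistic_features(text: str) -> dict:
    """Compute simple surface features for one proposal text."""
    # Split on runs of sentence-ending punctuation; a crude stand-in
    # for a proper sentence tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept inside words).
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "n_sentences": len(sentences),
        # Mean tokens per sentence, used here as a rough proxy for complexity.
        "avg_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        "masculine_terms": sum(t in MASCULINE for t in tokens),
        "feminine_terms": sum(t in FEMININE for t in tokens),
    }


example = "She leads the museum project. The women train local guides!"
features = linguistic_features(example)
print(features)
```

Features of this kind could then be passed, one row per proposal, to any standard classifier to test whether they predict funding outcomes; which features and classifier are appropriate is left open here.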
inproceedings
BibTeXKey: TTK+25