Brussels / 3 & 4 February 2024


Prompt Compass: A Methodological Approach to Evaluating the Use of Large Language Models in SSH research

As researchers continue to explore the utility of platform-based large language models (LLMs) for tasks like data extraction, annotation, and classification, concerns about access, cost, scalability, volatility, and privacy arise. Even when using local or open-source LLMs in the social sciences and humanities, it is critical to address the inherent inconsistencies of LLM outputs, and to assess their suitability for specific tasks. 

How should LLMs be approached and evaluated for digital research projects? I propose a methodology for systematically exploring and evaluating the issues above using Prompt Compass. The tool encapsulates research affordances for working with LLMs, tailored to the social sciences and humanities. It provides easy access to various local and platformed LLMs, defines default (but modifiable) parameters that produce the most consistent results, offers a library of ready-made and customizable research prompts, supports looping over the rows of a CSV file, and allows for testing and evaluating various LLM-prompt combinations.
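The workflow of looping over CSV rows and testing LLM-prompt combinations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Prompt Compass's actual code: the function `run_grid`, the column name `text`, the prompt templates, and the two stub "models" are all hypothetical, and the stubs are plain functions standing in for real local or platformed LLMs.

```python
import csv
import io
from itertools import product
from typing import Callable, Dict, List

# Hypothetical stand-ins for real LLMs: each takes a prompt string and
# returns a string. A real setup would call a local or platformed model here.
def stub_keyword_model(prompt: str) -> str:
    return "positive" if "good" in prompt.lower() else "negative"

def stub_last_word_model(prompt: str) -> str:
    return prompt.split()[-1]

def run_grid(
    rows: List[Dict[str, str]],
    prompts: Dict[str, str],
    models: Dict[str, Callable[[str], str]],
    text_column: str = "text",
) -> List[Dict[str, object]]:
    """Apply every prompt/model combination to every row."""
    results = []
    for (prompt_name, template), (model_name, model) in product(
        prompts.items(), models.items()
    ):
        for index, row in enumerate(rows):
            # Fill the row's text into the prompt template.
            prompt = template.format(text=row[text_column])
            results.append(
                {
                    "row": index,
                    "prompt": prompt_name,
                    "model": model_name,
                    "output": model(prompt),
                }
            )
    return results

# Illustrative in-memory CSV; in practice this would be a file on disk.
SAMPLE_CSV = "id,text\n1,The service was good\n2,The queue was terrible\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

prompts = {
    "sentiment": "Classify the sentiment of this text as positive or negative: {text}",
    "topic": "Name the main topic of this text in one word: {text}",
}
models = {"keyword": stub_keyword_model, "last_word": stub_last_word_model}

results = run_grid(rows, prompts, models)
for result in results:
    print(result)
```

Keeping prompts and models in named dictionaries means each output stays traceable to the exact combination that produced it, which is what makes side-by-side evaluation possible.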

As technological advances reshape social science and humanities research, tools like Prompt Compass are critical in bridging the gap between LLM technologies and methodological robustness in both qualitative and quantitative research. Such tools enable a hands-on assessment of the stability of prompts, their compatibility with specific LLMs, and the replicability of research.
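One simple way to make "stability of a prompt" concrete is to send the same prompt to the same model several times and measure how often the outputs agree with the most frequent answer. The sketch below is an assumption about how such a check could look, not a description of the metric Prompt Compass uses; `modal_agreement`, `repeat_call`, and the seeded `make_noisy_model` stub (which simulates a model with non-deterministic output) are hypothetical names introduced for illustration.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

def modal_agreement(outputs: List[str]) -> Tuple[str, float]:
    """Return the most frequent output and the share of runs that produced it."""
    if not outputs:
        raise ValueError("need at least one output")
    label, frequency = Counter(outputs).most_common(1)[0]
    return label, frequency / len(outputs)

def repeat_call(model: Callable[[str], str], prompt: str, n_runs: int) -> List[str]:
    """Send the same prompt to the same model n_runs times."""
    return [model(prompt) for _ in range(n_runs)]

def make_noisy_model(flip_probability: float, seed: int) -> Callable[[str], str]:
    """Hypothetical stub simulating an LLM whose answer varies between runs."""
    rng = random.Random(seed)

    def model(prompt: str) -> str:
        return "negative" if rng.random() < flip_probability else "positive"

    return model

prompt = "Classify the sentiment of this text as positive or negative: The service was good"

# A fully deterministic stub agrees with itself on every run.
stable_label, stable_share = modal_agreement(
    repeat_call(make_noisy_model(0.0, seed=1), prompt, n_runs=20)
)

# A noisy stub yields a lower agreement share; the exact value depends on the seed.
noisy_label, noisy_share = modal_agreement(
    repeat_call(make_noisy_model(0.3, seed=1), prompt, n_runs=20)
)

print(stable_label, stable_share)
print(noisy_label, noisy_share)
```

A share of 1.0 means every run gave the same answer. Lower values flag LLM-prompt combinations whose outputs would be hard to replicate.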

Prompt Compass is available under an Apache 2.0 license at


Erik Borra