"all models of language are wrong but some are useful"
"Once the research question is defined and the researcher is performing measurement, we can assess how accurate a method is at recovering the specific concept of interest to the researcher. Because there were many different things about the document that the researcher could have chosen to measure, we do not assume there is a single model that captures a true data generating process, but hopefully there are options that help us capture what we need for our research question. In other words, we are agnostic about which particular model is used for measurement, as long as the model can accurately and reliably measure the concept of interest."
(Grimmer et al. 2022, 19)
→ How do you make sure you have a useful model?
Know your data
- the better you know your data, the easier it is to judge whether a model works
Know your model
see what is behind your model predictions, e.g.:
which features does a dictionary actually pick up?
which features are most predictive in a machine learning model?
how does your complex model classify example sentences?
use quantitative and qualitative validation techniques and present the evidence
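The first question above can be checked directly: apply the dictionary to your corpus and look at which entries actually fire. A minimal sketch in pure Python, with a made-up mini lexicon and example documents (not a published dictionary):

```python
import re
from collections import Counter

# Hypothetical mini sentiment lexicon, for illustration only
POSITIVE = {"good", "great", "useful", "accurate"}
NEGATIVE = {"bad", "wrong", "biased", "poor"}

docs = [
    "All models are wrong but some are useful",
    "A good model gives accurate and reliable measurement",
    "Complex models can hide biased or wrong predictions",
]

def matched_terms(doc, lexicon):
    """Return the lexicon entries that actually match in a document."""
    tokens = re.findall(r"[a-z]+", doc.lower())
    return Counter(t for t in tokens if t in lexicon)

for doc in docs:
    print(doc)
    print("  positive hits:", dict(matched_terms(doc, POSITIVE)))
    print("  negative hits:", dict(matched_terms(doc, NEGATIVE)))
```

Inspecting the hit lists per document (rather than only aggregate scores) is what reveals surprises, e.g. a "positive" term firing in an ironic or negated context.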
Explain your model
make sure you understand what is happening
use only as much complexity as needed for addressing a task
→ no inherent benefits of more complex models, though performance statistics for some tasks have become very impressive!
Upsides of complex models
- significantly improved performance for many tasks
Downsides
understandability - Do you understand BERT? Does your reviewer?
complexity - hidden biases
environmental impact and computational power
- change in scale between word embeddings and transformer models
- "Training a single BERT base model (without hyperparameter tuning) on GPUs was estimated to require as much energy as a trans-American flight." (Bender et al. 2021)