8 Week 5: Scaling techniques

Here we begin thinking about more automated techniques for analyzing texts. These techniques raise a number of additional considerations, which have sparked significant debates, and the matter is by no means settled.

So what is at stake here? In weeks to come, we will be studying various techniques to ‘classify,’ ‘position’ or ‘score’ texts based on their features. The success of these techniques depends not only on their suitability to the question at hand but also on higher-level questions about meaning. In short, we have to ask ourselves: is there a way we can access the underlying processes governing the generation of text? Is meaning governed by a set of structural processes? And can we derive ‘objective’ measures of the contents of any given text?

The readings by Justin Grimmer, Roberts, and Stewart (2021), Denny and Spirling (2018), and Goldenstein and Poschmann (2019b), as well as the response by Nelson (2019) and the reply by Goldenstein and Poschmann (2019a), will be required reading for Flexible Learning Week.

Justin Grimmer, Roberts, and Stewart (2021)
Justin Grimmer and Stewart (2013a)
Denny and Spirling (2018)
Goldenstein and Poschmann (2019b)
- Nelson (2019)
- Goldenstein and Poschmann (2019a)

The substantive focus of this week is a set of readings that all employ different types of “scaling” or “low-dimensional document embedding” techniques. The article by Lowe (2008) provides a technical overview of the “wordfish” algorithm and its uses in political science contexts. The article by Klüver (2009) also uses “wordfish,” but in a different way: to measure the “influence” of interest groups. The response to this article by Bunea and Ibenskas (2015) and the subsequent reply by Klüver (2015) help illuminate some of the debates around these questions. The work by Kim, Lelkes, and McCrain (2022) gives an insight into the ability of text-scaling techniques to capture key dimensions of political communication such as bias.
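For orientation before the readings, the “wordfish” model is usually written as a Poisson model of word counts, following Slapin and Proksch (2008). The notation below is an assumption of this sketch and may differ from the symbols used in the individual readings.

```latex
% y_{ij}: count of word j in document i (notation assumed for this sketch)
y_{ij} \sim \mathrm{Poisson}(\lambda_{ij}), \qquad
\log \lambda_{ij} = \alpha_i + \psi_j + \beta_j \theta_i
```

Here \alpha_i is a document effect (roughly, document length), \psi_j is a word effect (roughly, overall word frequency), \beta_j is the weight of word j in discriminating between positions, and \theta_i is the estimated position of document i. Written this way, two assumptions become visible: word counts are treated as independent given the document's position, and a single latent dimension is assumed to organise word use.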

Questions:

What assumptions underlie scaling models of text?
What is latent in a text and who decides?
What might scaling be useful for outside of estimating ideological position/bias from text?

Required reading:

Lowe (2008)
Kim, Lelkes, and McCrain (2022)
Klüver (2009)
- Bunea and Ibenskas (2015)
- Klüver (2015)

Further reading:

Benoit et al. (2016)
Laver, Benoit, and Garry (2003)
Slapin and Proksch (2008)
Schwemmer and Wieczorek (2020)

Slides:

Week 5 Slides
