4 Week 3

4.1 Dictionary-based techniques

So-called “dictionary-based” techniques are an extension of the word frequency analyses we covered last week. In their most basic form, these analyses use an index of target terms and classify the documents in the corpus of interest according to whether those terms are present or absent. The technical dimensions of this type of analysis are covered in the chapter section by Krippendorff (2004), and some of the issues attending them in the article by Loughran and McDonald (2011).
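To make the basic form concrete, here is a minimal sketch in Python. The dictionary, the category names, and the three example sentences are all hypothetical and invented for illustration; they are not taken from any of the readings, and published dictionaries are far larger and validated against human coding.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: category -> set of target terms.
TONE_DICTIONARY = {
    "positive": {"good", "gain", "improve", "success"},
    "negative": {"bad", "loss", "decline", "failure"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_document(text, dictionary):
    """Count how many tokens in the text fall under each dictionary category."""
    counts = Counter(tokenize(text))
    return {
        category: sum(counts[term] for term in terms)
        for category, terms in dictionary.items()
    }


def classify(text, dictionary):
    """Label a document by its highest-scoring category.

    Returns "none" if no dictionary term is present and "mixed" on a tie.
    """
    scores = score_document(text, dictionary)
    if not any(scores.values()):
        return "none"
    best = max(scores.values())
    winners = [category for category, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else "mixed"


# Hypothetical three-document corpus.
corpus = [
    "Profits improve after a good year of gain.",
    "A bad quarter: loss and decline across the board.",
    "The committee met on Tuesday.",
]
labels = [classify(doc, TONE_DICTIONARY) for doc in corpus]
print(labels)
```

The sketch matches exact word forms only. It ignores negation ("not good"), inflected forms, and terms whose meaning shifts between domains, which are among the limitations to keep in mind when assessing the readings.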

We will also read two applications of these techniques: Martins and Baumard (2020) and Young and Soroka (2012). We will discuss how successfully the authors measure the phenomenon of interest (“prosociality” and “tone”, respectively). Questions about sampling and representativeness will again be relevant and will inform our assessment of this work.

Questions:

  1. Are general dictionaries possible, or do they have to be domain-specific?
  2. How do we know if our dictionary is accurate?
  3. How could we enhance/supplement dictionary-based techniques?

Required reading:

  • Martins and Baumard (2020)
  • Young and Soroka (2012)
  • Loughran and McDonald (2011)
  • Krippendorff (2004) (pp. 283-289)

Further reading:

  • Tausczik and Pennebaker (2010)
  • Brier and Hopp (2011)
  • Barberá et al. (2021)

Slides: