U Mannheim Computational Analysis of Political Communication

The explosion of digital communication and increasing efforts to digitize existing material have produced a deluge of data such as digitized historical news archives, policy and legal documents, political debates, and millions of social media messages by politicians, journalists, and citizens. This has the potential to put theoretical predictions about the societal roles played by information, and about the development and effects of communication, to rigorous quantitative tests that were impossible before. Besides providing an opportunity, the analysis of such “big data” sources also poses methodological challenges. Traditional manual content analysis does not scale to very large data sets due to high cost and complexity. For this reason, many researchers turn to automatic text analysis using techniques such as dictionary analysis, automatic clustering and scaling of latent traits, and machine learning.

Using such techniques properly, however, requires a very specific skill set. This course aims to give students a basic introduction to text analysis and computational thinking. R will be used as the platform and language of instruction, but the basic principles and methods generalize easily to other languages and tools such as Python.

Course Content

Day 1: Introduction to Computational Thinking and R
(11.11.2019 10:15 – 18:45, 406 Seminarraum; B 6, 30-32 Bauteil E-F)

Preparation:

  • Read Van Atteveldt & Peng (2018) and R for Data Science chapters 1 and 2
    (freely available at https://r4ds.had.co.nz)
  • Install R, RStudio and the packages tidyverse and quanteda on your laptop.

Morning session:

Afternoon session: Data analysis with Tidyverse

 

Day 2: Automatic Quantitative Text Analysis
(18.11.2019 10:15 – 18:45, 406 Seminarraum; B 6, 30-32 Bauteil E-F)

Preparation:

 

Morning session:

  • Recap and Q&A for R and tidyverse basics
  • Lecture: Principles of automatic text analysis
  • Practical:  Reading in data: structured data, text files, scraping

Afternoon session:

  • Lecture: First steps in Text Analysis: cleaning, preprocessing, tokenizing
  • Lecture: Dictionary analysis, sentiment, and validity
  • Practical: Automatic Text Analysis with Quanteda, Validity of Sentiment Analysis
  • Practical: Work on your own project

Day 3: Language Processing and Validity
(22.11.2019 10:15 – 18:45, 406 Seminarraum; B 6, 30-32 Bauteil E-F)

Preparation:

  • Read Grimmer & Stewart (2013)
  • Finish practicals if needed
  • Submit research proposal and exploration for research project

Morning session:

  • Lecture: Topic Modeling
  • Practical: Topic modeling
  • Practical: Work on assignment / presentation

Afternoon session:

  • Lecture: Validity and reliability of automatic text analysis
  • Lecture: Machine Learning for automatic text analysis
  • Practical: Machine Learning
  • Practical: Work on research project
  • Oral exams (depending on demand)