In this vignette we demonstrate the basic workflow using the most recent Emnid survey.

Overview

This package calculates coalition probabilities in multi-party systems. To this end we provide some convenience functions, the most important of which are listed below:

  1. scrape_wahlrecht: A wrapper function that download the most current survey results from https://www.wahlrecht.de/

  2. collapse_parties: Transforms information on percentages received by individual parties in long format and stores them inside a nested tibble (see tidyr::nest)

  3. draw_from_posterior: Draws nsim samples from the posterior distribution (i.e. nsim simulated election results based on provided survey)

  4. get_seats: Obtain seat distributions for each simulation (see also ?sls)

  5. have_majority: Given a list of coalitions of interest, calculates if the respective coalition would have enough seats for a majority

  6. calculate_probs: Given majority tables obtained from have_majority, calculates the probabilities for the respective coalitions to have enough seats for a majority

Read in data

# one line per survey (party information in wide format)
emnid <- scrape_wahlrecht() %>% slice(1:6)
emnid %>% select(-start, -end)
## # A tibble: 6 x 9
##   date         cdu   spd greens   fdp  left   afd others respondents
##   <date>     <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl>       <dbl>
## 1 2013-09-29    43    26      7     3     9     6      6        1382
## 2 2013-10-06    42    25      8     4     9     6      6        1849
## 3 2013-10-13    42    25      9     3    10     6      5        1833
## 4 2013-10-20    42    25      9     3    10     5      6        2334
## 5 2013-10-27    41    26     10     3     9     5      6        3219
## 6 2013-11-03    42    25      9     3     9     5      7        2768

Collapse survey data

After applying collapse_parties we still have one row per survey, but information on parties and percentage of votes they received is stored in a long format in a separate column (see ?tidyr::nest):

elong <- collapse_parties(emnid)
head(elong)
## # A tibble: 6 x 5
##   date       start      end        respondents survey              
##   <date>     <date>     <date>           <dbl> <list>              
## 1 2013-11-03 2013-10-24 2013-10-30        2768 <tibble[,3] [7 × 3]>
## 2 2013-10-27 2013-10-17 2013-10-23        3219 <tibble[,3] [7 × 3]>
## 3 2013-10-20 2013-10-10 2013-10-16        2334 <tibble[,3] [7 × 3]>
## 4 2013-10-13 2013-10-04 2013-10-09        1833 <tibble[,3] [7 × 3]>
## 5 2013-10-06 2013-09-26 2013-10-01        1849 <tibble[,3] [7 × 3]>
## 6 2013-09-29 2013-09-24 2013-09-26        1382 <tibble[,3] [7 × 3]>
elong %>% slice(1) %>% select(survey) %>% unnest("survey")
## # A tibble: 7 x 3
##   party  percent  votes
##   <chr>    <dbl>  <dbl>
## 1 cdu         42 1163. 
## 2 spd         25  692  
## 3 greens       9  249. 
## 4 fdp          3   83.0
## 5 left         9  249. 
## 6 afd          5  138. 
## 7 others       7  194.

Simulate elections

Based on each survey we can now simulate nsim elections by drawing from the Dirichlet distribution

set.seed(1)     # for reproducibility

elong <- elong %>%
  mutate(draws = map(survey, draw_from_posterior, nsim=10, correction=0.005))
elong %>% select(date, survey, draws)
## # A tibble: 6 x 3
##   date       survey               draws                
##   <date>     <list>               <list>               
## 1 2013-11-03 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]>
## 2 2013-10-27 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]>
## 3 2013-10-20 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]>
## 4 2013-10-13 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]>
## 5 2013-10-06 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]>
## 6 2013-09-29 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]>
# each row is one election simulation
elong %>% slice(1) %>% select(draws) %>% unnest("draws")
## # A tibble: 10 x 7
##      cdu   spd greens    fdp   left    afd others
##    <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
##  1 0.422 0.256 0.0799 0.0292 0.0920 0.0518 0.0696
##  2 0.404 0.249 0.0940 0.0247 0.0991 0.0593 0.0693
##  3 0.432 0.245 0.0887 0.0378 0.0888 0.0382 0.0697
##  4 0.418 0.259 0.0877 0.0293 0.0862 0.0527 0.0673
##  5 0.412 0.251 0.0873 0.0282 0.0786 0.0595 0.0832
##  6 0.430 0.250 0.0810 0.0343 0.0862 0.0484 0.0702
##  7 0.436 0.245 0.0842 0.0351 0.0800 0.0521 0.0668
##  8 0.423 0.249 0.0935 0.0269 0.0918 0.0490 0.0670
##  9 0.423 0.244 0.0932 0.0368 0.0931 0.0471 0.0623
## 10 0.430 0.243 0.0949 0.0231 0.0905 0.0505 0.0675

Calculate seat distribution

Given the simulated elections, we can calculate the number of seats each party obtained. To do so we need to have a function that knows how to redistribute seats for the particular election. In Germany for example, seats are distributed according to the system of Sainte-Lague-Scheppers, which is implemented in ?sls.

This makes this package easily extensible to other multi-party systems, as you only need to provide a function that redistributes seats based on percentages obtained by the various parties and provide that function to the distrib.fun argument of the get_seats function:

elong <- elong %>%
  mutate(seats = map2(draws, survey, get_seats, distrib.fun=sls))
elong %>% select(date, survey, draws, seats)
## # A tibble: 6 x 4
##   date       survey               draws                 seats                
##   <date>     <list>               <list>                <list>               
## 1 2013-11-03 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]> <tibble[,3] [46 × 3]>
## 2 2013-10-27 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]> <tibble[,3] [46 × 3]>
## 3 2013-10-20 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]> <tibble[,3] [46 × 3]>
## 4 2013-10-13 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]> <tibble[,3] [49 × 3]>
## 5 2013-10-06 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]> <tibble[,3] [49 × 3]>
## 6 2013-09-29 <tibble[,3] [7 × 3]> <tibble[,7] [10 × 7]> <tibble[,3] [49 × 3]>
## sim column indicates simulated elections (rows in draws column)
elong %>% slice(1) %>% select(seats) %>% unnest("seats")
## # A tibble: 46 x 3
##      sim party  seats
##    <int> <chr>  <int>
##  1     1 cdu      280
##  2     1 spd      170
##  3     1 greens    53
##  4     1 left      61
##  5     1 afd       34
##  6     2 cdu      267
##  7     2 spd      165
##  8     2 greens    62
##  9     2 left      65
## 10     2 afd       39
## # … with 36 more rows

In the above example, given the latest survey, CDU/CSU would have 280 seats in the first simulation, 267 seats in the second simulation, etc.

Calculate majorities

The next step is to define coalitions of interest, then calculate in what percentage of the simulations the coalition would obtain enough seats for a majority.

Below, each list element defines one coalition of interest (one party could potentially obtain absolute majority on their own):

coalitions <- list(
    c("cdu"),
    c("cdu", "fdp"),
    c("cdu", "greens"),
    c("cdu", "fdp", "greens"),
    c("spd"),
    c("spd", "left"),
    c("spd", "greens"),
    c("spd", "left", "greens"),
    c("cdu", "spd"))


elong <- elong %>%
    mutate(majorities = map(seats, have_majority, coalitions = coalitions))
elong %>% select(date, draws, seats, majorities)
## # A tibble: 6 x 4
##   date       draws                 seats                 majorities           
##   <date>     <list>                <list>                <list>               
## 1 2013-11-03 <tibble[,7] [10 × 7]> <tibble[,3] [46 × 3]> <tibble[,9] [10 × 9]>
## 2 2013-10-27 <tibble[,7] [10 × 7]> <tibble[,3] [46 × 3]> <tibble[,9] [10 × 9]>
## 3 2013-10-20 <tibble[,7] [10 × 7]> <tibble[,3] [46 × 3]> <tibble[,9] [10 × 9]>
## 4 2013-10-13 <tibble[,7] [10 × 7]> <tibble[,3] [49 × 3]> <tibble[,9] [10 × 9]>
## 5 2013-10-06 <tibble[,7] [10 × 7]> <tibble[,3] [49 × 3]> <tibble[,9] [10 × 9]>
## 6 2013-09-29 <tibble[,7] [10 × 7]> <tibble[,3] [49 × 3]> <tibble[,9] [10 × 9]>
# The majorities table for each date will have 1 row per simulation
# and one column per coalition
elong %>% slice(1) %>% select(majorities) %>% unnest("majorities")
## # A tibble: 10 x 9
##    cdu   cdu_fdp cdu_greens cdu_fdp_greens spd   left_spd greens_spd
##    <lgl> <lgl>   <lgl>      <lgl>          <lgl> <lgl>    <lgl>     
##  1 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
##  2 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
##  3 TRUE  TRUE    TRUE       TRUE           FALSE FALSE    FALSE     
##  4 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
##  5 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
##  6 TRUE  TRUE    TRUE       TRUE           FALSE FALSE    FALSE     
##  7 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
##  8 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
##  9 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
## 10 FALSE FALSE   TRUE       TRUE           FALSE FALSE    FALSE     
## # … with 2 more variables: greens_left_spd <lgl>, cdu_spd <lgl>

Calculate coalition probabilities

The last step is to calculate the coalition probabilities (note that by default we exclude “superior” coalitions, i.e. if “cdu/csu” have a majority on their own, this simulation will not be counted to the simulation with majority for “cdu/csu” and “fdp”, see example in ?calculate_probs):

elong <- elong %>%
    mutate(
        probabilities = map(majorities, calculate_probs, coalitions=coalitions))
elong %>% select(date, majorities, probabilities)
## # A tibble: 6 x 3
##   date       majorities            probabilities       
##   <date>     <list>                <list>              
## 1 2013-11-03 <tibble[,9] [10 × 9]> <tibble[,2] [9 × 2]>
## 2 2013-10-27 <tibble[,9] [10 × 9]> <tibble[,2] [9 × 2]>
## 3 2013-10-20 <tibble[,9] [10 × 9]> <tibble[,2] [9 × 2]>
## 4 2013-10-13 <tibble[,9] [10 × 9]> <tibble[,2] [9 × 2]>
## 5 2013-10-06 <tibble[,9] [10 × 9]> <tibble[,2] [9 × 2]>
## 6 2013-09-29 <tibble[,9] [10 × 9]> <tibble[,2] [9 × 2]>
# one row per coalition
elong %>% slice(1) %>% select(probabilities) %>% unnest("probabilities")
## # A tibble: 9 x 2
##   coalition       probability
##   <chr>                 <dbl>
## 1 cdu                      20
## 2 cdu_fdp                   0
## 3 cdu_greens               80
## 4 cdu_fdp_greens            0
## 5 spd                       0
## 6 left_spd                  0
## 7 greens_spd                0
## 8 greens_left_spd          20
## 9 cdu_spd                  80

Wrapper

There is a wrapper function that directly returns probabilities:

elong <- collapse_parties(emnid)
elong %>% get_probabilities(., nsim=10) %>% unnest("probabilities")
## # A tibble: 36 x 3
##    date       coalition       probability
##    <date>     <chr>                 <dbl>
##  1 2013-11-03 cdu                      20
##  2 2013-11-03 cdu_fdp                   0
##  3 2013-11-03 cdu_fdp_greens           80
##  4 2013-11-03 spd                       0
##  5 2013-11-03 left_spd                  0
##  6 2013-11-03 greens_left_spd          40
##  7 2013-10-27 cdu                       0
##  8 2013-10-27 cdu_fdp                   0
##  9 2013-10-27 cdu_fdp_greens          100
## 10 2013-10-27 spd                       0
## # … with 26 more rows