Does learning during election campaigns really matter? To answer this question, we replicate the results of the paper “Election Campaigns as Information Campaigns: Who Learns What and Does it Matter?” (Nadeau, Nevitte, Gidengil, & Blais 2008; hereafter NNGB). The study uses data from the 1997 Canadian Election Study to make a causal argument that learning during campaigns affects voters’ vote choices and that this relationship is moderated by voters’ initial information levels. NNGB examine the impact of information gains during the 1997 Canadian election on people with varying levels of pre-existing knowledge, measured by a composite variable, General Stock of Information (GSI), which counts the number of correct answers respondents give to questions of general political knowledge. They find that information gains have the greatest impact on voters with medium-low and medium-high GSI, while voters with very low and very high levels of general knowledge are less affected by information gains during the campaign. The authors argue that this is because highly informed voters absorb new information but are unlikely to shift a vote choice that is already the product of high information levels, while low-information voters are likely to learn only information that reinforces their existing views.

We began by trying to replicate their results as closely as possible. For the most part, we found effect sizes and directions similar to those reported in Table 4 of NNGB’s original paper (p. 241); see Table 1. The coefficients are not identical because the original authors do not provide enough detail about variable coding for us to recreate their models exactly. Overall, though, the pattern fits their theory: voters with medium levels of information drive the relationship between information gains and vote-choice changes. Notably, whereas the original authors found the strongest effect of information among medium-high knowledge voters, we find the strongest effect among medium-low knowledge voters (a coefficient of 1.586 on campaign information gains, significant at the 0.01 level).

Table 1: Replication of NNGB’s Table 4
Dependent variable: Vote Change

                        (1)          (2)          (3)           (4)            (5)
                        All          Low          Medium Low    Medium High    High
Time                    0.016***     0.019        0.016         0.006          0.021**
                       (0.005)      (0.017)      (0.010)       (0.011)        (0.010)
Party ID Strength      -0.901***    -1.375***    -1.089***     -0.996***      -0.866***
                       (0.136)      (0.450)      (0.273)       (0.260)        (0.267)
Loser                  -1.498***    -3.095***    -1.209***     -1.479***      -1.776***
                       (0.155)      (0.558)      (0.316)       (0.299)        (0.373)
Schooling              -0.476*      -2.504**     -0.061        -0.429          0.138
                       (0.289)      (0.998)      (0.590)       (0.540)        (0.577)
Interest                0.164        3.395***    -1.107*        0.147          0.442
                       (0.280)      (0.947)      (0.573)       (0.524)        (0.612)
TV                     -0.022       -2.193***    -0.049         0.461          0.419
                       (0.242)      (0.793)      (0.510)       (0.459)        (0.495)
News                   -0.504**     -0.809       -0.583        -0.405         -0.694
                       (0.224)      (0.739)      (0.464)       (0.403)        (0.466)
GSI                    -0.137**
                       (0.056)
CSI Change              0.585**      0.908        1.586***      0.187         -0.210
                       (0.285)      (1.315)      (0.586)       (0.493)        (0.630)
Constant                0.851***     0.939        0.982*        0.458         -0.673
                       (0.270)      (0.770)      (0.531)       (0.486)        (0.572)
Observations            1,467        209          355           424            414
Akaike Inf. Crit.       1,730.265    208.637      428.812       515.466        436.120

Note: *p<0.1; **p<0.05; ***p<0.01
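
As a minimal sketch of the estimation step, the five columns can be reproduced as logit models (the binary outcome and reported AIC suggest logistic regression), assuming a data frame `ces97` that already contains the recoded variables described below; all column names here are our own placeholders, not the 1997 CES item names.

```python
import statsmodels.formula.api as smf

# Assumed variable names (our own labels, not the 1997 CES codebook names):
# vote_change (0/1), time, pid_strength, loser, schooling, interest, tv,
# news, gsi (count of correct knowledge items), gsi_group, csi_change.
RHS = "time + pid_strength + loser + schooling + interest + tv + news + csi_change"

# Column (1): pooled model including the General Stock of Information (GSI).
pooled = smf.logit(f"vote_change ~ {RHS} + gsi", data=ces97).fit()

# Columns (2)-(5): separate models within each GSI group, as in NNGB's Table 4.
subgroup_fits = {}
for level in ["Low", "Medium Low", "Medium High", "High"]:
    sub = ces97[ces97["gsi_group"] == level]
    subgroup_fits[level] = smf.logit(f"vote_change ~ {RHS}", data=sub).fit()
    print(level, subgroup_fits[level].params["csi_change"])
```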


To achieve this, we needed to make a number of coding decisions. To measure party ID strength (hereafter PID), we code ‘Don’t Know’ and ‘Refusal to Answer’ responses as low PID strength, on the grounds that respondents who feel strongly should be willing to give an answer. We also code ‘Don’t Know’ and ‘Refusal to Answer’ respondents as intending to vote for a losing party (in this case, any non-Liberal party), since excluding them reduces our sample to far fewer observations than the original authors report. Further, we combine participants who score 0 or 1 on the CSI (Campaign-Specific Information) Change variable into a single low-information group in order to create groups of similar size to those used in the original piece. Finally, we code voters who move from ‘Don’t Know’ on pre-election vote intention to reporting a vote choice post-election as having changed their vote; here too, any other choice reduces our sample to far fewer observations than the original authors used. The rest of our coding follows NNGB’s original coding.
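
The sketch below illustrates the clearest of these recoding rules, assuming raw variables `pid_strength_raw`, `vote_intent_pre`, and `vote_post` in the same hypothetical `ces97` data frame; the string labels stand in for the actual 1997 CES response codes.

```python
import numpy as np

DK_REFUSED = {"Don't Know", "Refused"}   # placeholder labels for the raw CES codes

# Party ID strength: 'Don't Know' and refusals coded as the lowest strength level.
ces97["pid_strength"] = np.where(
    ces97["pid_strength_raw"].isin(DK_REFUSED), 0, ces97["pid_strength_raw"]
)

# 'Loser': pre-election intention for any non-Liberal party; 'Don't Know' and
# refusals are not "Liberal", so they are coded as intending to vote for a
# losing party, as described above.
ces97["loser"] = (ces97["vote_intent_pre"] != "Liberal").astype(int)

# Vote change: any difference between pre-election intention and reported vote,
# so a move from 'Don't Know' to a reported vote counts as a changed vote.
ces97["vote_change"] = (ces97["vote_post"] != ces97["vote_intent_pre"]).astype(int)
```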

However, we disagree with this final coding decision, despite its apparent importance in recreating NNGB’s findings. Including initially undecided respondents introduces the potential for substantial omitted variable bias: respondents who do not know whom they will vote for likely differ from decided respondents in important ways beyond the controls included in the model, yet by construction they are coded as having changed their vote whenever they report a vote post-election. Unless these differences are orthogonal to campaign information gains, the results will be biased, because these voters’ outcome was effectively pre-determined regardless of the campaign information treatment. Figure 1 shows that once these voters are excluded, the coefficients on information gains are not significant, suggesting that the effect found by the original authors was driven by this bias.

We also try a number of different specifications to test whether the lack of significance is robust. The coefficient plot examines four specifications: three that exclude undecided respondents and one that excludes respondents in Quebec. Again, information gains do not appear to be related to vote-choice changes when undecided voters are excluded. Interestingly, the model that excludes Quebec but retains undecideds also fails to find significant results. Excluding Quebec is an important robustness check because Quebec has a markedly different party system from the rest of Canada, so its voters are likely to behave differently; with fewer viable parties (at the time, only two were competitive in Quebec), we would expect different results. The lack of significance on the coefficient for information gains again suggests that the original model is not robust to alternative specifications.
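
These checks can be expressed as a loop over sample restrictions. The sketch below shows the general pattern under the same assumptions as before (an `undecided_pre` flag for an initial ‘Don’t Know’ vote intention and a `province` column are our own placeholders); the three undecided-excluding specifications differ only in details not reproduced here.

```python
import statsmodels.formula.api as smf

RHS = ("time + pid_strength + loser + schooling + interest + tv + news"
       " + gsi + csi_change")

# Sample restrictions behind the coefficient plot.
specs = {
    "Full sample (NNGB coding)":  ces97,
    "Exclude initial undecideds": ces97[~ces97["undecided_pre"]],
    "Exclude Quebec respondents": ces97[ces97["province"] != "Quebec"],
}

csi_estimates = {}
for label, frame in specs.items():
    fit = smf.logit(f"vote_change ~ {RHS}", data=frame).fit()
    csi_estimates[label] = (fit.params["csi_change"], fit.bse["csi_change"])
# csi_estimates holds (coefficient, standard error) pairs that can be drawn
# as a coefficient plot, e.g. with matplotlib's errorbar.
```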

Figure 2 allows us to see whether, when ‘Don’t Knows’ are excluded, the relationship holds up within any subset of the sample. We examine the four subsets the original authors use: low, medium-low, medium-high, and high GSI levels. In line with the rest of our findings, there does not appear to be an effect in any of these groups once the problematic responses are removed. The overall picture is that the original findings do not stand up to alternative specifications.

The next step in our replication was to assess the original causal claims, setting aside this issue with the undecideds. We find NNGB’s justification of a causal relationship unsatisfying because campaign learning is not randomly assigned among voters. The determinants of political information are likely also related to vote-choice changes, so the authors’ model is open to third-variable problems. Unobserved traits such as aptitude, curiosity, and receptiveness to risk might all drive both a desire to become better informed and people’s vote choices, by making different policies seem more or less attractive.

We use matching to create comparable, “as-if-random” groups, accounting for the covariates that predict receiving the treatment, which in this case is gaining campaign information. Matching should reduce bias from confounders that influence both the independent and dependent variables, and it avoids imposing a specific functional relationship between the causal variable of interest and the outcome. We consider participants who learn anything during the campaign to be treated, and those who do not learn (a CSI Change score of 0) to be untreated. We match on all of the variables the authors include in their original analysis; the covariate balance after matching is reported in the balance table below.

Balance Table

                        With Undecideds                     Without Undecideds
                 Treated  Control  T-Stat  P-Value   Treated  Control  T-Stat  P-Value
Time              14.46    15.84   -13.47    0.00     13.96    15.48   -15.45    0.00
PID Strength       0.77     0.77     0.47    0.16      0.81     0.81     0.34    0.32
Loser              0.29     0.29    -0.65    0.08      0.37     0.37    -0.54    0.16
Schooling          0.55     0.55     2.76    0.10      0.55     0.55     0.72    0.73
Interest           0.64     0.64    -0.89    0.58      0.65     0.65    -2.81    0.17
TV                 0.51     0.50     3.18    0.05      0.52     0.52     2.11    0.31
News               0.42     0.41     1.11    0.46      0.43     0.44    -0.36    0.85
GSI                2.73     2.62    11.08    0.00      2.82     2.68    13.38    0.00
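
The matching step could be carried out with any standard matching routine; the sketch below is a minimal one-to-one nearest-neighbour implementation on standardized covariates (our own implementation choice, not necessarily the exact procedure behind the balance table or Figure 3), using the same hypothetical data frame and defining treatment as any campaign-specific information gain.

```python
import numpy as np
from scipy import stats
from sklearn.neighbors import NearestNeighbors

COVS = ["time", "pid_strength", "loser", "schooling", "interest", "tv", "news", "gsi"]

def matched_effects(df):
    """One-to-one nearest-neighbour matching on standardized covariates."""
    X = (df[COVS] - df[COVS].mean()) / df[COVS].std()
    treated = df["csi_change"] > 0                      # any campaign learning
    Xt, Xc = X[treated].to_numpy(), X[~treated].to_numpy()
    yt = df.loc[treated, "vote_change"].to_numpy()
    yc = df.loc[~treated, "vote_change"].to_numpy()

    # ATT: match each treated unit to its nearest control.
    ctrl_idx = NearestNeighbors(n_neighbors=1).fit(Xc).kneighbors(
        Xt, return_distance=False).ravel()
    att = (yt - yc[ctrl_idx]).mean()

    # ATC: match each control unit to its nearest treated unit.
    trt_idx = NearestNeighbors(n_neighbors=1).fit(Xt).kneighbors(
        Xc, return_distance=False).ravel()
    atc = (yt[trt_idx] - yc).mean()

    # Covariate balance between treated units and their matched controls.
    balance = {
        v: stats.ttest_ind(df.loc[treated, v], df.loc[~treated, v].iloc[ctrl_idx])
        for v in COVS
    }
    return att, atc, balance

att, atc, balance = matched_effects(ces97)                           # with undecideds
att_nd, atc_nd, _ = matched_effects(ces97[~ces97["undecided_pre"]])  # without undecideds
```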

Our findings support the interpretation of our earlier analysis: some omitted variable is driving the relationship. Figure 3 shows a substantial difference between the ATT and the ATC; the ATC is primarily responsible for the near-significance of the ATE, while the ATT is not significant. The effect runs in the theorized direction for the ATC but not for the ATT. One potential explanation is that participants who do not learn differ from those who do in some important way. This fits with the earlier result that the inclusion of ‘Don’t Knows’ may be responsible for the effect found by the original authors. Figure 3 provides further support: when undecided respondents are removed from the analysis, both the ATC and the ATT become indistinguishable from zero.

Overall, our findings contradict those of the original authors. Our replication suggests that the finding that campaign information gains matter is highly sensitive to model specification and disappears across a number of alternative specifications. We also identify one possible mechanism behind the original authors’ apparent finding: their treatment of undecided voters. It appears to us that there is something about undecided voters that causes them to learn differently at the end of a campaign than other voters, and that this drives the relationship originally found.