‘Controlling’ for confounding variables graphically

As we’ve learned (ad nauseam), basing causal claims on a simple bivariate relationship is fraught with potential roadblocks. Even though there may be a strong, statistically significant relationship between an independent and a dependent variable, if we haven’t controlled for potentially confounding variables we cannot state with any measure of confidence that the putative relationship between the IV and DV is causal. We should always statistically control for any (and all) potentially confounding variables.

Additionally, it is often desirable to dig deeper into the data and find out whether the units of analysis are fundamentally different on the basis of some other variable. Below are two plots, each of which shows the relationship between margin of victory and electoral turnout (by electoral district) for the 2017 British Columbia provincial election. The first plot shows the simple bivariate relationship, while the second breaks that relationship down by political party (that is, by which party won the electoral district). It could conceivably be the case that the relationship between turnout and margin of victory varies across the values of political party. That is, the relationship may hold in those electoral districts where party A won, but not in those where party B won.

We can see here that there is little evidence to suggest a difference in the relationship based on which party won the electoral district. Can you think of another ‘third’ variable that may cause the relationship between turnout and margin of victory to be systematically different across different values of that variable? What about rural-versus-urban electoral districts?

Here are the plots: