Data Visualization #19—Using panelView in R to produce TSCS (time-series cross-section) plots

As part of a project to assess the influence of Canadian provincial governments’ ruling ideologies on provincial economic performance, I have created a time-series cross-section summary of my party ideology variable across provinces over time. A time-series cross-section research design is one in which there is variation across space (cross-section) and also over time. The time units can be anything, although in comparative politics they are often years. The cross-section units can be countries, cities, individuals, states, or (in my case) provinces. Here is a snippet of the data structure in my dataset (a data frame in R):

canparty.df[1:20,c(2:4,23)]

Year                    Region                      Pol.Party party.ideology
1981                   Alberta Progressive Conservative Party              1
1981          British Columbia            Social Credit Party              1
1981                  Manitoba Progressive Conservative Party              1
1981             New Brunswick Progressive Conservative Party              1
1981 Newfoundland and Labrador Progressive Conservative Party              1
1981               Nova Scotia Progressive Conservative Party              1
1981                   Ontario Progressive Conservative Party              1
1981      Prince Edward Island Progressive Conservative Party              1
1981                    Quebec                Parti Quebecois              0
1981              Saskatchewan           New Democratic Party             -1
1982                   Alberta Progressive Conservative Party              1
1982          British Columbia            Social Credit Party              1
1982                  Manitoba           New Democratic Party             -1
1982             New Brunswick Progressive Conservative Party              1
1982 Newfoundland and Labrador Progressive Conservative Party              1
1982               Nova Scotia Progressive Conservative Party              1
1982                   Ontario Progressive Conservative Party              1
1982      Prince Edward Island Progressive Conservative Party              1
1982                    Quebec                Parti Quebecois              0
1982              Saskatchewan Progressive Conservative Party              1
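A quick way to confirm that a data frame really has this province-by-year structure is to count the observations in each unit-period cell. The sketch below is only illustrative: `toy` is a stand-in built from a few values in the excerpt above, not the full canparty.df.

```r
# Toy panel; column names mirror canparty.df, values taken from the excerpt
toy <- data.frame(
  Year   = rep(c(1981, 1982), each = 3),
  Region = rep(c("Alberta", "Manitoba", "Quebec"), times = 2),
  party.ideology = c(1, 1, 0, 1, -1, 0)
)

# Count observations per province-year cell
cell_counts <- table(toy$Region, toy$Year)

# Balanced panel: every province appears exactly once in every year
is_balanced <- all(cell_counts == 1)
is_balanced
```

If any cell count is 0 or greater than 1, a province-year is missing or duplicated, which is worth resolving before plotting.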

To get the plot picture below, we use the R code at the bottom of this post. But a couple of notes: first, the year data are not in true date format. Rather, they are in periods, which I have conveniently labelled years. In other words, what is important for the analysis that I will do (generalized synthetic control method) is to periodize the data. Second, because elections occur at any point during the year, I have had to make a decision as to which party is coded as having been in government that year.

Since my main goal is to assess economic performance, and because economic policies take time to be passed, and to implement, I made the decision to use June 30th as a cutoff point. If a party was elected prior to that date, it is coded as having governed the province in that whole year. If the election was held on July 1st (or after), then the incumbent party is coded as having governed the province the year of the election and the new government is coded as having started its mandate the following year.
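The coding rule above can be sketched as a small helper function. This is a minimal illustration, not the code actually used to build the dataset: the function name and the example dates are hypothetical, and it assumes that an election held exactly on June 30th falls on the "before" side of the cutoff.

```r
# Hypothetical helper: the first year in which a newly elected party is
# coded as governing, given the election date.
first_year_in_government <- function(election_date, cutoff = "06-30") {
  election_date <- as.Date(election_date)
  year <- as.integer(format(election_date, "%Y"))
  cutoff_date <- as.Date(paste0(year, "-", cutoff))
  # Assumption: an election on the cutoff date itself counts as "before"
  ifelse(election_date <= cutoff_date, year, year + 1L)
}

# Made-up election dates: one before the cutoff, one after
first_year_in_government(c("1990-03-15", "1990-09-20"))
```

For the second date the incumbent would keep the election year, and the new government would start in the following year.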

Here’s the plot, and the R code below:

library(gsynth)
library(panelView)
library(ggplot2)

ggpanel1 <- panelView(Prop.seats.gov ~ party.ideology + Prov.GDP.Cap, data = canparty.df, 
          index = c("Region", "Year"), main = "Provincial Ruling Party Ideology", 
          legend.labs = c("Left", "Centre", "Right"), col=c("orange", "red", "blue"), 
          axis.lab.gap = c(2,0), xlab="", ylab="")
## I've used Prop.seats.gov and Prov.GDP.Cap b/c they are two 
## of my IVs, but any other IVs could have been used to 
## create the plot. The important part is the party.ideology 
## variable and the two index variables--Region (province)
## and Year.

## Save the plot as a .png file

ggsave(filename="ProvRulingParty.png", plot=ggpanel1, height=8,width=7)

Data Visualization #16—Canadian Federal Equalization Payments per capita

My most recent post in this series analyzed data related to the federal equalization program in Canada using a lollipop plot made with ggplot2 in R. The data that I chose to visualize (annual nominal dollar receipts by province) give the reader the impression that over the last five-plus decades the province of Quebec (QC) has been the main recipient (by far) of these federal transfer funds. While this may be true, the plot also misrepresents the nature of these financial flows from the federal government to the provinces. The data do not take into account the wide variation in populations amongst the 10 provinces. For example, Prince Edward Island (PEI) as of 2019 has a population of about 156,000 residents, while Quebec has a population of approximately 8.5 million, or about 55 times that of PEI. A better way of representing the provincial receipt of equalization funds, then, is to calculate the annual per capita (i.e., for every resident) value, rather than a provincial total.

For the lollipop chart below, I’ve not only calculated an annual per-capita measure of the amount of money received by province, I’ve also controlled for inflation, understanding that a dollar in 1960 was worth a lot more (and could buy many more resources) than a dollar today. Using Canadian GDP deflator data compiled by the St. Louis Federal Reserve, I’ve created a plotting variable: annual real per-capita federal equalization receipts by province, with a base year of 2014. Here, we see that the message of the plot is no longer Quebec’s dominance but a story in which Canadians (regardless of where they live) are treated relatively equally. Of course, every year, Canadians in some provinces receive no equalization receipts.
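The two adjustments described above (per capita, then inflation) amount to a short calculation. The sketch below uses made-up numbers for a single province-year and assumes the deflator has already been indexed so that the base year (2014) equals 100. The variable names are illustrative and are not the ones in my data frame.

```r
# Hypothetical inputs for one province-year (not actual figures)
nominal_millions <- 1200     # equalization receipts, millions of nominal $
population       <- 800000   # provincial population in that year
deflator         <- 80       # GDP deflator for that year (base year = 100)

# Step 1: convert the provincial total to dollars per resident
nominal_per_cap <- nominal_millions * 1e6 / population

# Step 2: express the per-capita amount in base-year dollars
real_per_cap <- nominal_per_cap * (100 / deflator)

nominal_per_cap
real_per_cap
```

If the deflator series uses a different base year, dividing every value by the 2014 value and multiplying by 100 rebases it first.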

Here’s the plot, and the R code below it:

Source: https://open.canada.ca/data/en/dataset/4eee1558-45b7-4484-9336-e692897d393f
require(ggplot2)
require(gganimate) 

gg.anim.lol3 <- ggplot(eq.pop.df[eq.pop.df$Year!="2019-20",], aes(x=province, y=real.value.per.cap, label=real.value.per.cap.amt)) + 
  geom_point(stat='identity', size=14, aes(col=as.factor(zero.dummy))) +  #, fill="white"
  scale_color_manual(name="zero.dummy", 
                     #      labels = c("Above", "Below"), 
                     values = c("0"="#000000", "1"="red")) + 
  labs(title="Per Capita Federal Equalization Entitlements (by Province): {closest_state}",
       subtitle="(Real $ CAD—2014 Base Year)",
       x=" ", y="$ CAD (Real—2014 Base Year)") +
  geom_segment(aes(y = 0, 
                   x = province, 
                   yend = real.value.per.cap, 
                   xend = province), 
               color = "red",
               size=1.5) +
  scale_y_continuous(breaks=seq(0,3000,500)) +
  theme(legend.position="none",
        plot.title =element_text(hjust = 0.5, size=23),
        plot.subtitle =element_text(hjust = 0.5, size=19),
        axis.title.x = element_text(size = 16),
        axis.title.y = element_text(size = 16),
        axis.text.y =element_text(size = 14),
        axis.text.x=element_text(vjust=0.5,size=16, colour="black")) +
  geom_text(color="white", size=4) +
  transition_states(
    Year,
    transition_length = 1,
    state_length = 9
  ) +
  enter_fade() +
  exit_fade()

animate(gg.anim.lol3, nframes = 610, fps = 10, width=800, height=680, renderer=gifski_renderer("equal_real_per_cap_lollipop.gif"))  

Data Visualization #15—Canadian Federal Equalization Payments over time using an animated lollipop graph

There is likely no federal-provincial political issue that stokes more anger amongst Albertans (and is more misunderstood) than equalization payments (entitlements) from the Canadian federal government to the country’s 10 provinces. Although some form of equalization has always been a part of the federal government’s policy arsenal, the current equalization program was initiated in the late 1950s, with the goal of providing, or at least helping to achieve, an equal playing field across the country in terms of basic levels of public services. As Professor Trevor Tombe notes:

Regardless of where you live, we are committed (indeed, constitutionally committed) to ensure everyone has access to “reasonably comparable levels of public services at reasonably comparable levels of taxation.”

Finances of the Nation, Trevor Tombe

For more information about the equalization program read Tombe’s article and links he provides to other information. The most basic misunderstanding of the program is that while some provinces receive payments from the federal government (the ‘have-nots’) the other, more prosperous, provinces (the ‘haves’) are the source of these payments. You often hear the phrase “Alberta sends X $billion to Quebec every year!” That’s not the case. The funds are generated and distributed from federal revenues (mostly income tax) and disbursed from this same fund of resources. The ‘have’ provinces don’t “send money” to other provinces. The federal government collects tax revenue from all individuals and if a province has a higher proportion of high-earning workers than another, it will generally receive less back in money from the federal government than its workers send to Ottawa. (To reiterate, read Tombe for more about the particulars.)

Using data provided by the Government of Canada, I have decided to show the federal equalization outlays over time using what is called a lollipop chart. I could have used a bar chart, but I like the way the lollipop chart looks. Here’s the chart and the R code below:

Created by Josip Dasović

The data source is here: https://open.canada.ca/data/en/dataset/4eee1558-45b7-4484-9336-e692897d393f, and I am using the table called Equalization Entitlements.

N.B.: The original data, for some reason, had Alberta abbreviated as AL, so I had to edit my final data frame and the gif.

## You'll need these two libraries
require(ggplot2)
require(gganimate)

gg.anim.lol1 <- ggplot(melt.eq.df, aes(x=variable, y=value, label=amount)) + 
  geom_point(stat='identity', size=14, aes(col=as.factor(zero.dummy))) +  #, fill="white"
  scale_color_manual(name="zero.dummy", 
                     values = c("0"="#000000", "1"="red")) + 
  labs(title="Canada—Federal Equalization Entitlements (by Province): {closest_state}",
       x=" ", y="Millions of nominal $ (CAD)") +
  geom_segment(aes(y = 0, 
                   x = variable, 
                   yend = value, 
                   xend = variable), 
               color = "red",
               size=1.5) +
  scale_y_continuous(breaks=seq(0,15000,2500)) +
  theme(legend.position="none",
        plot.title =element_text(hjust = 0.5, size=23),
        axis.title.x = element_text(size = 16),
        axis.title.y = element_text(size = 16),
        axis.text.y =element_text(size = 14),
        axis.text.x=element_text(vjust=0.5,size=16, colour="black")) +
  geom_text(color="white", size=4) +
  transition_states(
    Year,
    transition_length = 2,
    state_length = 8
  ) +
  enter_fade() +
  exit_fade()

animate(gg.anim.lol1, nframes = 630, fps = 10, width=800, height=680, renderer=gifski_renderer("equal.lollipop.gif"))  

Data Visualization #5–Canadian Residential Schools–plotting change in number and federal government

At the end of Data Visualization #4 I promised to look at a couple of alternative solutions to the problem of outliers in our data. I’ll have to do so in my next data visualization (#6), because I’d like to take some time to chart some data that I have been interested in for a while. The topic was made more timely by comments, unearthed a few days ago, that the leader of Canada’s federal Conservative Party, Erin O’Toole, made about the history of residential schools in Canada. These schools were created for the various peoples of Canada’s First Nations and have a long and sordid history. If you are interested in learning more, here is the final report of Canada’s Truth and Reconciliation Commission.

I wanted to use a chart that is in the PDF version of that report as the basis for plotting the chart described above. Here is the original.

I was unable to find the raw data, so I had to do some work in R to extract the data from the line in the image. There are some great R packages (magick, and tidyverse) that can be used to help you with this task should the need arise. See here for an example.

Using the following code, I was able to reproduce fairly accurately the line in the graph above.

library(tidyverse)
library(magick)

im <- image_read("residential_schools_new.jpg")

## This extracts the saturation channel of the pic to highlight the darkest lines
im_proc <- im %>% image_channel("saturation")


## This gets rid of things that are far enough away from black--play around with the %

im_proc2 <- im_proc %>% image_threshold("white", "80%")

## Finally, invert (negate) the image so that what we want to keep is white.

im_proc3 <- im_proc2 %>% image_negate()

## Now to extract the data.

dat <- image_data(im_proc3)[1,,] %>%
  as.data.frame() %>%
  mutate(Row = 1:nrow(.)) %>%
  select(Row, everything()) %>%
  mutate_all(as.character) %>%
  gather(key = Column, value = value, 2:ncol(.)) %>%
  mutate(Column = as.numeric(gsub("V", "", Column)),
         Row = as.numeric(Row),
         value = ifelse(value == "00", NA, 1)) %>%
  filter(!is.na(value))

# Eliminate duplicate rows.

dat <- subset(dat, !duplicated(Row))  # Get rid of duplicate rows

Here’s the initial result, using the ggplot2 package.

It’s a fairly accurate re-creation of the chart above, don’t you think? After some cleaning up of the data and adding data on Prime Ministerial terms during Canada’s history since 1867, we get the completed result (with R code below).

We can see that there was an initial period of Canada’s history during which the number of schools operating increased. This period stopped with the First World War. Then there was a period of relative stabilization (some increase, then a decrease through the 1940s and early 1950s), followed by an increase of about 10 years that began with Liberal Prime Minister Louis St. Laurent and continued under Conservative Prime Minister John Diefenbaker and Liberal Prime Minister Lester B. Pearson, during whose time in power the number of residential schools topped out. Upon the ascension to power of Liberal Prime Minister Pierre Elliott Trudeau, the number of residential schools began a drastic decline, which continued under subsequent Prime Ministers.

EDIT: After reading the initial report more closely, it looks like the end point of the original chart is meant to be 1998, not 1999, so I’ve recreated the chart with that updated piece of information. Nothing else changed, although it now seems that the peak in the number of schools operating at any point in time was in about 1964, not a couple of years later as it had seemed. Here’s an excerpt from the report, from a section entitled Expansion and Decline:

From the 1880s onwards, residential school enrolment climbed annually. According to federal government annual reports, the peak enrolment of 11,539 was reached in the 1956–57 school year. (For trends, see Graph 1.) Most of the residential schools were located in the northern and western regions of the country. With the exception of Mount Elgin and the Mohawk Institute, the Ontario schools were all in northern or northwestern Ontario. The only school in the Maritimes did not open until 1930. Roman Catholic and Anglican missionaries opened the first two schools in Québec in the early 1930s. It was not until later in that decade that the federal government began funding these schools.

The number of schools began to decline in the 1940s. Between 1940 and 1950, for example, ten school buildings were destroyed by fire. As Graph 2 illustrates, this decrease was reversed in the mid-1950s, when the federal department of Northern Affairs and National Resources dramatically expanded the school system in the Northwest Territories and northern Québec. Prior to that time, residential schooling in the North was largely restricted to the Yukon and the Mackenzie Valley in the Northwest Territories. Large residences were built in communities such as Inuvik, Yellowknife, Whitehorse, Churchill, and eventually Iqaluit (formerly Frobisher Bay). This expansion was undertaken despite reports that recommended against the establishment of residential schools, since they would not provide children with the skills necessary to live in the North, skills they otherwise would have acquired in their home communities. The creation of the large hostels was accompanied by the opening of what were termed “small hostels” in the smaller and more remote communities of the eastern Arctic and the western Northwest Territories.

Honouring the Truth, Reconciling for the Future:
Summary of the Final Report of the Truth and Reconciliation Commission of Canada https://web-trc.ca/

A couple of final notes. First, one can easily see (visualize) from this chart the domination of Liberal Party rule during the 20th century. Second, how many of you knew that there had been a couple of coalition governments in the early 20th century?

Here is the R code for the final chart:

gg.res.schools <- ggplot(data=dat) + 
  labs(title = "Canadian Residential Schools \u2013 1867-1999",
       subtitle="(Number of Schools in Operation & Federal Party in Power)", 
       y = ("Number of Schools"), x = " ") +
  geom_line(aes(x=Row.Rescale, y=Column.Rescale), color='black', lwd=0.75)  +
  scale_y_continuous(expand = c(0,0), limits=c(0,100)) +
  scale_x_continuous(limits=c(1866,2000)) + 
  geom_rect(data=pm.df,
            mapping=aes(xmin=Date_Begin.1, xmax=Date_End.1, 
                        ymin=rep(0,25), ymax=rep(100,25), fill=Government)) +
              scale_fill_manual(values = alpha(c("blue", "red", "green", "yellow"), .6)) +
  theme_bw() +
  theme(legend.title=element_blank(),
        plot.title = element_text(hjust = 0.5, size=16),
        plot.subtitle = element_text(hjust= 0.5, size=13),
        axis.text.y = element_text(size = 8))

## Re-draw the line on top of the shaded rectangles so that it stays clearly visible
gg.res.final.plot <- gg.res.schools + geom_line(aes(x=Row.Rescale, y=Column.Rescale), color='black', lwd=0.75, data=dat)
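The plotting code above uses Row.Rescale and Column.Rescale, which are not created in the extraction snippet shown earlier. One way such variables could be produced is a linear mapping from pixel positions to axis values. The sketch below is an assumption-laden illustration rather than the cleaning code actually used: the helper name, the pixel positions of the axis end points, and the 0 to 100 range of the y-axis are all hypothetical.

```r
# Linearly map values from one interval onto another
rescale_linear <- function(x, from, to) {
  to[1] + (x - from[1]) / (from[2] - from[1]) * (to[2] - to[1])
}

# Hypothetical pixel positions of the axis end points in the image
x_pixels <- c(100, 900)  # left and right ends of the x-axis
y_pixels <- c(600, 50)   # bottom and top of the y-axis (pixel rows count downward)

# Toy stand-in for the extracted data frame
dat_toy <- data.frame(Row = c(100, 500, 900), Column = c(600, 325, 50))

# Map pixels to years (1867 to 1998) and to a 0-100 count scale
dat_toy$Row.Rescale    <- rescale_linear(dat_toy$Row, x_pixels, c(1867, 1998))
dat_toy$Column.Rescale <- rescale_linear(dat_toy$Column, y_pixels, c(0, 100))
dat_toy
```

Listing the y-axis pixels as bottom first, then top, takes care of the fact that image rows are counted from the top of the picture downward.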

Data Visualization #3–Cartograms as an alternative to standard area-based electoral maps

In my first post of this series I explained at length why basic geographically-based electoral maps are not very good at conveying the phenomena of interest (see that post for more detail), and alluded to the increased use by political geographers and political scientists of alternative methods of “mapping” the required information that convey the message(s) contained in the data more clearly.

Let’s examine this further using the map above. This map shows the results of the Canadian federal (national) election of October 2019. The respective proportions of total area “won” by each political party as depicted in the map above are not easily translated into either the relative vote share of the parties, or the relative number of seats won. Someone ignorant about Canadian federal politics would see a relatively similar total amount of red, blue, and orange, and assume that these parties had relatively equal support across the country. The sizes (land mass) and populations of federal electoral districts in Canada vary drastically and, as a result, these maps are not a good gauge of voter support for political parties.

Since this problem is widespread, political scientists and political geographers have attempted to find solutions to it. One increasingly common approach has been to use what are called cartograms. Cartograms are maps in which the elements (in this case, electoral districts) are usually transformed in such a way as to maintain their connections to neighbours (contiguous cartograms), but to either increase or decrease the area of the specific electoral district in order to match it to a common variable. A variable often used in the transformation of electoral maps is population size. Thus, in a completed cartogram, the size of the electoral districts is not the actual land mass of the electoral district, but is proportional to the population of the electoral district (sometimes the number of voters, or the size of the electorate, is used instead of population). It’s no surprise, then, that cartograms are also called “value-by-area” maps.

Cartograms are used by geographers and social scientists to depict a wide variety of phenomena. Here are some examples. The first one is a global cartogram for which the size of the area in each country is equivalent to total public health spending by that country. We can easily see that most of the world’s spending on public health occurs in the rich countries of the global north.

Here’s one more, depicting the global share of organic agriculture, by country.

Below, I have created a cartogram that has transformed the standard electoral map of the 2019 Canadian federal election into one in which the size of the electoral districts is mostly proportional to their populations. By “mostly” I mean that they’re not perfectly proportional: the difference in sizes between the largest and smallest districts is so large that the algorithm eventually stabilizes without creating completely equal-sized electoral districts.

This map more accurately conveys the nature of political partisan support (at least as it relates to the winning of electoral districts) across the country during the 2019 election, and provides visual evidence for the reality of an election in which the Liberal Party (red) won a plurality of the seats in the federal parliament (House of Commons). Because urban districts are much smaller than rural districts, the strength of Liberal Party support in Canada’s two largest cities–Toronto and Montreal–is obfuscated by the traditional area-based electoral map, but becomes evident in this cartogram.

The next map in this series will analyze another approach to geographically-based electoral maps–the hexagon map.

Here’s the R code for the cartogram above. Here, the original R-spatial data object–can_sf–is the base for the calculation of the cartogram data.

## Here is the code to generate the cartogram object:

library(cartogram)
can_carto_sf = cartogram_cont(can_sf, "Population_2016", itermax=50)

## Now, the map, using ggplot2
library(ggplot2)

gg.can.can.carto <- ggplot(data = can_carto_sf) +
  geom_sf(aes(fill = partywinner_2019), col="black", lwd=0.075) + 
  scale_fill_manual(values=c("#33B2CC","#1A4782","#3D9B35","#D71920","#F37021","#2B2D2F"),name ="Party (2019)") +
  labs(title = "Cartogram of Canadian Federal Election Results \u2013 October 2019",
       subtitle = "(by Political Party and Electoral District)") +
  theme_void() + 
  theme(legend.title=element_blank(),
        legend.text = element_text(size = 16),
        plot.title = element_text(hjust = 0.5, size=20, vjust=2, face="bold"),
        plot.subtitle = element_text(hjust=0.5, size=18, vjust=2, face="bold"),
        legend.position = "bottom",
        plot.margin = margin(0.5, 0.5, 0.5, 0.5, "cm"),
        legend.box.margin = margin(0,0,30,0),
        legend.key.size = unit(0.75, "cm"),
        panel.border = element_rect(colour = "black", fill=NA, size=1.5))
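To get a sense of what “mostly proportional” means in practice, one could compare each district’s share of total cartogram area with its share of total population. The sketch below uses three made-up districts with hypothetical populations and areas. It does not touch can_carto_sf, although the same comparison could presumably be run on areas obtained with sf::st_area().

```r
# Hypothetical district populations and cartogram areas (arbitrary units)
population <- c(A = 120000, B = 100000, C = 30000)
carto_area <- c(A = 46, B = 40, C = 14)

# Share of total population vs. share of total cartogram area
pop_share  <- population / sum(population)
area_share <- carto_area / sum(carto_area)

# A ratio of 1 means the district's area matches its population share;
# above 1 it is drawn too large, below 1 too small
size_ratio <- area_share / pop_share
round(size_ratio, 2)
```

Ratios far from 1 would point to districts where the algorithm stopped before reaching proportionality.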

Stephen Harper says voting is divisive!

The clever folks at The Syrup Trap (think The Onion, but sweeter…and more Canuckistani!) have a new article out on Canadian Prime Minister Stephen Harper’s attitude towards voting. With the big election coming up in just under three weeks (19 October), here’s Harper’s message regarding Canadians’ voting proclivities:

The [new election] campaign, titled “Voting: Not super important,” will encourage Canadians not to get too serious or divisive about politics, and promote a series of alternative, “more fun” activities to do on election day.

“Canadians are absolutely tired of partisan activities, like voting in an election, which is just about the most divisive thing you can do,” Harper explained during a press conference.

“I guess what we’re trying to tell Canadians is to just chill out for a second. Because, voting? It’s not that big of a deal.”

The campaign will also illustrate, using data and infographics, exactly how little influence each individual Canadian has, all things considered.

Snark
Stephen Harper’s new election campaign aims to make Canadian society less politically divisive by discouraging citizens from voting.

P.S. Before you freak out, look at the Tags to the post below.

Russell Brand defends his decision not to vote

As we learned in class today, voting is the most conventional form of political activity. Although an ever-increasing number of citizens in advanced industrial economies refuses to vote, still a majority of citizens gets out and votes during national elections. But, for a majority of these voters, voting is the extent of their political activity.

What can we say about most non-voters and the reasons that they don’t vote? Well, fortunately, pollsters and academics have tried to answer this question. Let’s take a look at the Canadian federal election from 2011. In that election, only 61.1% of eligible voters bothered to vote. To determine why Canadians were not voting, Elections Canada, in conjunction with the monthly Labour Force Survey, asked those who didn’t vote their main reason for not doing so. Here are the results:

Canadian Federal Election 2011

What do you think about these results? Below is an excerpt from an interview of Russell Brand on BBC, in which the actor/comedian explains why it is that he refuses to vote in elections in Great Britain. [By the way, he has since changed his views on voting.]

Generation Y Political Apathy

In advance of the Canadian federal election (for up-to-date poll-based projections, go here), Kensington TV has produced a compelling documentary, which aims to understand the current political apathy within the “millennial” generation, both in Canada and the United States. Hosted by Dylan Playfair, the documentary critically examines stereotypical assumptions that have been made about the reasons for youth political apathy.

The documentary will be showing at select universities around Canada in the run-up to the federal election, which is being held on October 19th. For more about the documentary and where it may be viewed, click here. The documentary will be available for television and Internet viewing in early October. Please watch the trailer below.