Research Results, R coding, and mistakes you can blame on your research assistant

I have just graded and returned the second lab assignment for my introductory research methods class in International Studies (IS240). The lab required the students to answer questions with the help of the R statistical program (which, you may not know, is every pirate’s favourite statistical program).

The final homework problem asked students to find a question in the World Values Survey (WVS) that tapped into homophobic sentiment and determine which of four countries under study–Canada, Egypt, Italy, Thailand–could be considered to be the most homophobic, based only on that single question.

More than a handful of you used the code below to try and determine how the respondents in each country answered question v38. First, here is a screenshot from the WVS codebook:

wvs_v38

Students (rightfully, I think) argued that those who mentioned “Homosexuals” amongst the groups of people they would not want as neighbours can be considered to be more homophobic than those who didn’t mention homosexuals in their responses. (Of course, this may not be the case if there are different levels of social desirability bias across countries.) Moreover, students hypothesized that the higher the proportion of mentions of homosexuals, the more homophobic is that country.

But, when it came time to find these proportions some students made a mistake. Let’s assume that the student wanted to know the proportion of Canadian respondents who mentioned (and didn’t mention) homosexuals as persons they wouldn’t want to have as neighbours.

Here is the code they used (four.df is the data frame name, v38 is the variable in question, and country is the country variable):


prop.table(table(four.df$v38=="mentioned" | four.df$country=="canada"))

FALSE     TRUE
0.372808 0.627192

Thus, these students concluded that almost 63% of Canadian respondents mentioned homosexuals as persons they did not want to have as neighbours. That’s downright un-neighbourly of us allegedly tolerant Canadians, don’tcha think? Indeed, when compared with the other two countries (Egyptians weren’t asked this question), Canadians come off as more homophobic than either the Italians or the Thais.


prop.table(table(four.df$v38=="mentioned" | four.df$country=="italy"))

FALSE      TRUE
0.6106025 0.3893975

prop.table(table(four.df$v38=="mentioned" | four.df$country=="thailand"))

FALSE      TRUE
0.5556995 0.4443005

So, is it true that Canadians are really more homophobic than either Italians or Thais? This may be a simple homework assignment but these kinds of mistakes do happen in the real academic world, and fame (and sometimes even fortune–yes, even in academia a precious few can make a relative fortune) is often the result as these seemingly unconventional findings often cause others to notice. There is an inherent publication bias towards results that seem to run contrary to conventional wisdom (or bias). The finding that Canadians (widely seen as amongst the most tolerant of God’s children) are really quite homophobic (I mean, close to 2/3 of us allegedly don’t want homosexuals, or any LGBT persons, as neighbours) is radical and a researcher touting these findings would be able to locate a willing publisher in no time!

But, what is really going on here? Well, the problem is a single incorrect symbol that changes the findings dramatically. Let’s go back to the code:


prop.table(table(four.df$v38=="mentioned" | four.df$country=="canada"))

The culprit is the | (“or”) character. What these students are asking R to do is to search their data and find the proportion of all responses for which the respondent either mentioned that they wouldn’t want homosexuals as neighbours OR the respondent is from Canada. Oh, oh! They should have used the & symbol instead of the | symbol to pick out the Canadians who mentioned homosexuals in v38.

To understand visually what’s happening let’s take a look at the following Venn diagram (see the attached video above for Ali G’s clever use of what he calls “zenn” diagrams to find the perfect target market for his “ice cream glove” idea; the code for how to create this diagram in R is at the end of this post). What we want is the intersection of the blue and red areas (the purple area). What the students’ coding has given us is the union of (all of!) the blue and (all of!) the red areas.
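To make the difference between the two symbols concrete, here is a minimal sketch on a made-up six-row data frame. The name toy.df and every value in it are invented for illustration; this is not WVS data. It counts the rows selected by | and by &, and checks that the | count is both circles taken together with the overlap counted only once.

```r
# Toy data, invented for illustration only (not WVS data)
toy.df <- data.frame(
  v2  = c("canada", "canada", "canada", "italy", "italy", "thailand"),
  v38 = c("mentioned", "not mentioned", "not mentioned",
          "mentioned", "not mentioned", "mentioned"),
  stringsAsFactors = FALSE
)

mentioned <- toy.df$v38 == "mentioned"   # the blue circle
canadian  <- toy.df$v2  == "canada"      # the red circle

n.or  <- sum(mentioned | canadian)   # rows in either circle (the students' version)
n.and <- sum(mentioned & canadian)   # rows in the overlap only (the purple area)

# The "or" count is both circles added together, minus the overlap
print(n.or == sum(mentioned) + sum(canadian) - n.and)
```

In this toy example only one row is both Canadian and "mentioned", yet five of the six rows satisfy the "or" condition, which is how the inflated proportion arises.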

To get the raw number of Canadians who answered “mentioned” to v38 we need the following code:


table(four.df$v38=="mentioned" & four.df$v2=="canada")

FALSE  TRUE
7457   304

Rplot_venn_canada_v38

But what if you then created a proportional table out of this? You still wouldn’t get the correct answer, which should be the proportion that the purple area on the Venn diagram comprises of the total red area.


prop.table(table(four.df$v38=="mentioned" & four.df$v2=="canada"))

FALSE       TRUE
0.96082979 0.03917021

Just eyeballing the Venn diagram we can be sure that the proportion of homophobic Canadians is larger than 3.9%. What we need is the proportion of Canadian respondents only(!) who mentioned homosexuals in v38. The code for that is:


prop.table(table(four.df$v38[four.df$v2=="canada"]))

mentioned not mentioned
0.1404806     0.8595194

So, only about 14% of Canadians can be considered to have given a homophobic response, not the 63% our students had calculated. What are the comparative results for Italy and Thailand, respectively?


prop.table(table(four.df$v38[four.df$v2=="italy"]))

mentioned not mentioned
0.235546      0.764454

prop.table(table(four.df$v38[four.df$v2=="thailand"]))

mentioned not mentioned
0.3372781     0.6627219

The moral of the story: if you mistakenly find something in your data that runs against conventional wisdom and it gets published, but someone comes along after publication and demonstrates that you’ve made a mistake, just blame it on a poorly-paid research assistant’s coding mistake.

Here’s a way to do the above using what is called a for loop:


# The four countries under study
four <- c("canada", "egypt", "italy", "thailand")
for (country in four) {
  # Proportion of each v38 response among this country's respondents
  print(prop.table(table(four.df$v38[four.df$v2 == country])))
  print(country)
}

mentioned not mentioned
0.1404806     0.8595194
[1] "canada"

mentioned not mentioned

[1] "egypt"

mentioned not mentioned
0.235546      0.764454
[1] "italy"

mentioned not mentioned
0.3372781     0.6627219
[1] "thailand"
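The loop can also be avoided altogether. As a sketch (again on invented toy data rather than the WVS file, with toy.df a made-up name), a two-way table of country by response passed to prop.table() with margin = 1 gives the within-country proportions for every country in a single call.

```r
# Toy data, invented for illustration only (not WVS data)
toy.df <- data.frame(
  v2  = c("canada", "canada", "canada", "canada",
          "italy", "italy", "thailand", "thailand"),
  v38 = c("mentioned", "not mentioned", "not mentioned", "not mentioned",
          "mentioned", "not mentioned", "mentioned", "mentioned"),
  stringsAsFactors = FALSE
)

# Rows are countries, columns are responses
counts <- table(toy.df$v2, toy.df$v38)

# margin = 1 divides each row by its own total, so every row sums to 1
by.country <- prop.table(counts, margin = 1)
print(round(by.country, 3))
```

Each row is now a country and the "mentioned" column can be compared directly across rows. In the real data, a country with no valid answers to v38 (Egypt, here) would likely either drop out of the table or appear as a row of NaN, depending on how v2 is stored.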

Here’s the R code to draw the venn diagram above:

install.packages("venneuler")

library(venneuler)

v1 <- venneuler(c(
  "Mentioned"        = sum(four.df$v38 == "mentioned", na.rm = TRUE),
  "Canada"           = sum(four.df$v2 == "canada", na.rm = TRUE),
  "Mentioned&Canada" = sum(four.df$v2 == "canada" & four.df$v38 == "mentioned", na.rm = TRUE)
))

plot(v1,main="Venn Diagram of Canada and v38 from WVS", sub="v38='I wouldn't want to live next to a homosexual'", col=c("blue","red"))

More on Qualitative Research Analysis

50-foot Jesus statue in front of an abortion clinic

In IS240 on Monday we looked at some of the characteristics of qualitative research methods, such as: i) it is inductive, ii) it is normally interpretivist, and iii) its practitioners tend to hold constructionist ontological viewpoints. A final characteristic of most qualitative research is that its approach is naturalistic. As the textbook notes:

…qualitative researchers try to minimize the disturbance they cause to the social worlds they study.

We can see the nature of this quality implicitly teased out by Lori Freedman, who discusses some of the research she did for her Master’s degree. Freedman writes about the relationship between abortion and religion (it’s not what you’re expecting) and the time she spent observing in a hospital that performed abortions. Here’s the part related to her research methods:

Claudia [a deeply religious Catholic woman who was having an abortion–JD] told me this story 13 years ago, while I was conducting ethnographic research as a participant-observer in a hospital-based abortion service. I spent considerable time there helping, observing, and intermittently conducting as many interviews as I could with counselors, doctors, and nurses, in order to gain a rich view of abortion clinic life. This study became my master’s thesis, but nothing else. I feared publication might amount to a gratuitous exposé of people I respected dearly. I couldn’t think of any policy or academic imperative that necessitated revealing the intimate dynamics of this particular social world—certainly nothing that could make the potential feelings of betrayal worthwhile. Ultimately, I just tucked it away.

 

‘Thick Description’ and Qualitative Research Analysis

In Chapter 8 of Bryman, Bell, and Teevan, the authors discuss qualitative research methods and how to do qualitative research. In a subsection entitled Alternative Criteria for Evaluating Qualitative Research, the authors reference Lincoln and Guba’s thoughts on how to assess the reliability, validity, and objectivity of qualitative research. Lincoln and Guba argue that these well-known criteria (which developed from the need to evaluate quantitative research) do not transfer well to qualitative research. Instead, they argue for evaluative criteria such as credibility, transferability, and confirmability.

Saharan Caravan Routes

Saharan Caravan Routes–The dotted red lines in the above map are caravan routes connecting the various countries of North Africa including Egypt, Libya, Algeria, Morocco, Mali, Niger and Chad. Many of the main desert pistes and tracks of today were originally camel caravan routes. (What do the green, yellow, and brown represent?)

Transferability is the extent to which qualitative research ‘holds in some other context’ (the quants reading this will immediately realize that this is analogous to the concept of the ‘generalizability of results’ in the quantitative realm). The authors argue that whether qualitative research fulfills this criterion is not a theoretical, but an empirical issue. Moreover, they argue that rather than worrying about transferability, qualitative researchers should produce ‘thick descriptions’ of phenomena. The term thick description is most closely associated with the anthropologist Clifford Geertz (and his work in Bali). Thick description can be defined as:

the detailed accounts of a social setting or people’s experiences that can form the basis for general statements about a culture and its significance (meaning) in people’s lives.

Compare this account (thick description) by Geertz of the caravan trades in Morocco at the turn of the 20th century to how a quantitative researcher may explain the same institution:

In the narrow sense, a zettata (from the Berber TAZETTAT, ‘a small piece of cloth’) is a passage toll, a sum paid to a local power…for protection when crossing localities where he is such a power. But in fact it is, or more properly was, rather more than a mere payment. It was part of a whole complex of moral rituals, customs with the force of law and the weight of sanctity—centering around the guest-host, client-patron, petitioner-petitioned, exile-protector, suppliant-divinity relations—all of which are somehow of a package in rural Morocco. Entering the tribal world physically, the outreaching trader (or at least his agents) had also to enter it culturally.

Despite the vast variety of particular forms through which they manifest themselves, the characteristics of protection in the Berber societies of the High and Middle Atlas are clear and constant. Protection is personal, unqualified, explicit, and conceived of as the dressing of one man in the reputation of another. The reputation may be political, moral, spiritual, or even idiosyncratic, or, often enough, all four at once. But the essential transaction is that a man who counts ‘stands up and says’ (quam wa qal, as the classical tag has it) to those to whom he counts: ‘this man is mine; harm him and you insult me; insult me and you will answer for it.’ Benediction (the famous baraka), hospitality, sanctuary, and safe passage are alike in this: they rest on the perhaps somewhat paradoxical notion that though personal identity is radically individual in both its roots and its expressions, it is not incapable of being stamped onto the self of someone else. (Quoted in North (1991) Journal of Economic Perspectives, 5:1, p. 104.)

A Virtual Trip to Myanmar for my Research Methods Class

For IS240 (Intro to Research Methods in International Studies) next week, we will be discussing qualitative research methods. We’ll address components of qualitative research and review issues related to reliability and validity and use these as the basis for an in-class activity.

The activity will require students to have viewed the following short video clips, all of which introduce the viewer to contemporary Myanmar. Some of you may know already that Myanmar (Burma) has been transitioning from rule by military dictatorship to democracy. Here are three aspects of Myanmar society and politics. Please watch them before class, as we won’t have time in class to watch all three clips. The clips themselves are not long (just over 3, 5, and 8 minutes long, respectively).

The first clip shows the impact of heroin on the Kachin people of northern Myanmar:

The next clip is a short interview with a Buddhist monk on social relations in contemporary Myanmar:

The final video clip is of the potential impact (good and bad) of increased international tourism to Myanmar’s most sacred sites, one of which is Bagan.

Television makes us do crazy things…or does it?

During our second lecture in Research Methods, when asked to provide an example of a relational statement, one student offered the following:

Playing violent video games leads to more violent inter-personal behaviour by these game-playing individuals.

That’s a great example, and we used this in class for a discussion of how we could go about testing whether this statement is true. We then surmised that watching violence on television may have similar effects, though watching is more passive than “playing”, so there may not be as great an effect.

If television viewing can cause changes in our behaviour that are not socially productive, can it also lead viewers to change their behaviour in a positive manner? There’s evidence to suggest that this may be true: a recent study found that watching MTV’s 16 and Pregnant show is associated with lower rates of teen pregnancy. What do you think about the research study?

Proportional Representation versus First-Past-the-Post

As we learned in POLI 1100 today, Canada is one of a small number of countries that continues to have a first-past-the-post system for national elections. What this means is that we divide the country up into 308 single-member districts (divided principally on the basis of the “representation by population” principle), from each of which exactly one individual is elected to represent that district in the House of Commons in Ottawa. In our case, a candidate only has to have a plurality of the vote in that district to be elected. This tends to give larger parties overrepresentation in parliament relative to their actual electoral strength. It also gives regionally-concentrated parties (like the Bloc Quebecois) overrepresentation in parliament vis-a-vis parties whose electoral support is more diffuse geographically.

As we can see from the 2008 federal election results, the Green Party received almost 7% of the total national vote, yet because the vote was dispersed across the whole of the country, did not receive a single mandate in the House of Commons. The Bloc Quebecois, meanwhile, gained 50 seats in parliament with a slightly larger percentage of the vote than the Greens! Why? Because the BQ’s votes were geographically concentrated within a minority of ridings in the province of Quebec.

Let’s turn now to the 2011 federal election, in which Stephen Harper’s Conservative Party won a majority in the House of Commons with 166 seats (and 39.6% of the vote). See the results below.

What if, on the other hand, Canada had a proportional representation system in which each province was its own electoral district and seats for the House of Commons were apportioned on the basis of the relative proportion of votes won by each party in each province? What would the results look like? With the help of my students, we were able to calculate the hypothesized makeup of the House of Commons were Canada to have such an electoral system.
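The post does not say which apportionment formula was used, so the following is only a sketch under an assumed one: the largest-remainder method with a Hare quota, applied to a single province. The vote totals, party labels, and seat count are invented for illustration, and the function name apportion is likewise hypothetical.

```r
# Largest-remainder (Hare quota) apportionment for one province.
# Assumed method and invented numbers; these are not the 2011 results.
apportion <- function(votes, seats) {
  quota     <- sum(votes) / seats      # votes "needed" per seat
  exact     <- votes / quota           # each party's fractional entitlement
  allocated <- floor(exact)            # whole seats are handed out first
  leftover  <- seats - sum(allocated)  # seats still unassigned
  if (leftover > 0) {
    # Remaining seats go to the parties with the largest fractional remainders
    extra <- order(exact - allocated, decreasing = TRUE)[seq_len(leftover)]
    allocated[extra] <- allocated[extra] + 1
  }
  allocated
}

votes  <- c(A = 45000, B = 30000, C = 18000, D = 7000)  # hypothetical province
result <- apportion(votes, seats = 10)
print(result)
```

Repeating this for each province and summing the seats by party would give a hypothetical national chamber. Other formulas (D'Hondt or Sainte-Laguë, for example) can allocate seats somewhat differently, so the seat totals depend on the method chosen.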

Notice that the total number of MPs for the Conservative Party has dropped considerably such that the party no longer has a majority in the House of Commons. In fact, no single party has a majority! In order to form a relatively stable government, the Conservatives would have to find willing coalition partners. Unfortunately for them, however, other than the BQ, there is no immediately suitable coalition partner, given the respective ideological stances of the parties in parliament. Even with the BQ, the Conservatives could not get a governing majority, coming up 15 seats short. An NDP/Liberal/Green coalition, on the other hand, would work both ideologically and in terms of numbers (166 seats, exactly the same number as the Conservatives have today).

Note also how much a proportional representation system would help the Green Party–from only 1 seat in the House to 11 seats!

Which system would you prefer? Do you think that we should maintain the status quo? Should we change to PR? What are some of the advantages and disadvantages of each?

Writing to your MP in Support of a Bill in the House of Commons

In POLI 1100, we have been discussing the concept and structure of legislatures. Near the end of Chapter 8 we looked at the path a bill has to traverse in Parliament before it becomes law (see Figure 8.2 of the Dyck textbook, p. 235). We viewed a video clip of MP Ruby Dhalla introducing a bill to amend the residency provisions of the OAS act. (If you don’t know what OAS stands for, watch the short video.)

We have learned in the past couple of weeks that most of the contact that you, as a Canadian citizen, have with the government is via the political executive, whether at the provincial or federal level. Apart from voting for your MP (MLA), there is very little contact between you and the legislative branch of our government. This week’s blog assignment can help change that. As I’ve noted on Blackboard, for this week’s blog assignment you can choose to write on anything to do with “legislatures”. You may, however, choose to write a letter to your MP (or any MP) in support of (or in opposition to) any bill that is currently in the middle of the legislative process in Parliament. Here are the steps:

1. Go to http://www.parl.gc.ca (and select your language of choice):

2. Click on “Bills before Parliament” on the left (see the screenshot below). (“Projets de loi à l’étude au Parlement”, en français)

3. On the next page, you will see, amongst other things, a list of “All Bills for the Current Session” (41st Parliament, 1st Session). The Bills can be sorted by number (as seen below), or by “Latest Activity Date”.

4. Find a Bill that interests you, and write a letter to the MP who is sponsoring the bill. Here’s an example of a letter I wrote:

Mr. Jean Rousseau, M.P.
House of Commons
Ottawa, Ontario
K1A 0A6

Cher Monsieur Rousseau:

I am writing to you in support of Bill C-312, The Democratic Representation Act, which is currently at the Second Reading stage of the legislative process in the House of Commons. As I understand it, the bill is meant to assuage the concerns of the Quebecois regarding the province of Quebec’s decreasing population, as a share of the Canadian population as a whole. Bill C-312, should it be adopted into law, would maintain proportional representation of Quebec’s delegation in the House of Commons at 2006 levels, regardless of the relative proportion of Quebec’s population in the future.

While some might see this as anti-democratic in that this law would mandate a divergence from the idea that every citizen’s vote should be counted equally, I believe that the violation of this core principle is justified in this case. (Indeed, in many areas of politics and public policy, debates centre around clashes of competing (and contradictory), fundamentally legitimate–morally and politically–principles.) In this case, the competing principle is the protection of a strong Quebec, and Quebecois society, which I believe is of inestimable value to Canadian society as a whole.

In the view of this Canadian citizen, who since immigrating to this wonderful country as an infant, has lived in the western province of British Columbia (when not living outside the country), Canada’s French heritage is an indispensable part of our country’s unique heritage and is part of the basis for the creation of what is today (though we know it hasn’t always been) a tolerant multicultural society, which is the envy of many around the world.

Sincerely,
Josip (Joseph) Dasovic
Dept of History, Latin, and Political Science
Langara College
Vancouver, BC

Do you agree with my position? Should we violate the principle of “one-person, one-vote” in the way intended by Bill C-312?