The Economist Intelligence Unit’s Global “Livability” Survey Omits Cost-of-Living

Before we can say anything definitive about the concepts and ideas that we’re studying, it is imperative that we have some understanding about whether the data that we observe and collect are actually “tapping into” the concept of interest.

For example, if my desire were to collect data that are meant to represent how democratic a country is, it would probably not be beneficial to that enterprise to collect measures of annual rainfall. [Though, in some predominantly agricultural countries, that might be an instrument for economic growth.] Presumably, I would want to collect data like whether elections were regularly held, free, and fair, whether the judiciary was independent of elected leaders, etc. That seems quite obvious to most.

The Economist Intelligence Unit puts out an annual “Global Livability Report”, which claims to comparatively assess “livability” in about 140 cities worldwide. The EIU uses many different indicators (across five broad categories) to arrive at a single index value that allegedly reflects the level of livability of each city in the survey. Have a look at the indicators below. Do you notice that cost-of-living is not included? Why might that be?

[Images: the top and bottom ten cities in the livability ranking, and the EIU’s lists of indicators]

How to Lie with Statistics

In class last week, we were introduced to recent research on the effect of same-sex parenting on children’s welfare, specifically on high school graduation rates. We discussed how easy it can be to manipulate data in order to present a distorted view of reality.

I’ll use a fictitious example to make the point. Let’s assume you had two schools, Sir Charles Tupper and William Gladstone. Assume further that the graduation rates of the two schools are 98% and 94% for Tupper and Gladstone, respectively. Is one school substantially better at graduating its students than the other? Not really. In fact, the graduation rate at Tupper is only about 4.3% higher than at Gladstone. So, Tupper is marginally better at graduating students than is Gladstone.

But, what if we compared non-graduation rates instead? Well, the non-graduation rate at Tupper is 2%, while the non-graduation rate at Gladstone is 6%. Thus, the following accurate statistical claims can legitimately be made: “Gladstone’s drop-out [non-graduation] rate is three times (that is, 200% greater than) Tupper’s!” Or, “Tupper’s non-graduation rate is only 33% of Gladstone’s!” Would parents’ reactions be the same if the data were presented in this manner?
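The arithmetic behind the two framings can be verified in a few lines of R. This is a minimal sketch using the fictitious rates from the example above:

```r
# The fictitious graduation rates from the example
grad <- c(Tupper = 0.98, Gladstone = 0.94)
nongrad <- 1 - grad

# Framing 1: compare graduation rates
grad_ratio <- grad["Tupper"] / grad["Gladstone"]          # ~1.043, i.e., about 4.3% higher

# Framing 2: compare non-graduation (drop-out) rates
nongrad_ratio <- nongrad["Gladstone"] / nongrad["Tupper"] # ~3, i.e., 200% greater
```

Same underlying data, but the second framing sounds far more alarming.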

Graphs, by the way, provide yet another way to lie with statistics.

‘Thick Description’ and Qualitative Research Analysis

In Chapter 8 of Bryman, Bell, and Teevan, the authors discuss qualitative research methods and how to do qualitative research. In a subsection entitled Alternative Criteria for Evaluating Qualitative Research, the authors reference Lincoln and Guba’s thoughts on how to assess the reliability, validity, and objectivity of qualitative research. Lincoln and Guba argue that these well-known criteria (which developed from the need to evaluate quantitative research) do not transfer well to qualitative research. Instead, they argue for evaluative criteria such as credibility, transferability, dependability, and confirmability.

Saharan Caravan Routes
Saharan Caravan Routes–The dotted red lines in the above map are caravan routes connecting the various countries of North Africa, including Egypt, Libya, Algeria, Morocco, Mali, Niger, and Chad. Many of the main desert pistes and tracks of today were originally camel caravan routes. (What do the green, yellow, and brown represent?)

Transferability is the extent to which qualitative research ‘holds in some other context’ (the quants reading this will immediately realize that this is analogous to the concept of the ‘generalizability of results’ in the quantitative realm). The authors argue that whether qualitative research fulfills this criterion is not a theoretical, but an empirical issue. Moreover, they argue that rather than worrying about transferability, qualitative researchers should produce ‘thick descriptions’ of phenomena. The term thick description is most closely associated with the anthropologist Clifford Geertz (and his work in Bali). Thick description can be defined as:

the detailed accounts of a social setting or people’s experiences that can form the basis for general statements about a culture and its significance (meaning) in people’s lives.

Compare this account (thick description) by Geertz of the caravan trades in Morocco at the turn of the 20th century to how a quantitative researcher may explain the same institution:

In the narrow sense, a zettata (from the Berber TAZETTAT, ‘a small piece of cloth’) is a passage toll, a sum paid to a local power…for protection when crossing localities where he is such a power. But in fact it is, or more properly was, rather more than a mere payment. It was part of a whole complex of moral rituals, customs with the force of law and the weight of sanctity—centering around the guest-host, client-patron, petitioner-petitioned, exile-protector, suppliant-divinity relations—all of which are somehow of a package in rural Morocco. Entering the tribal world physically, the outreaching trader (or at least his agents) had also to enter it culturally.

Despite the vast variety of particular forms through which they manifest themselves, the characteristics of protection in the Berber societies of the High and Middle Atlas are clear and constant. Protection is personal, unqualified, explicit, and conceived of as the dressing of one man in the reputation of another. The reputation may be political, moral, spiritual, or even idiosyncratic, or, often enough, all four at once. But the essential transaction is that a man who counts ‘stands up and says’ (quam wa qal, as the classical tag has it) to those to whom he counts: ‘this man is mine; harm him and you insult me; insult me and you will answer for it.’ Benediction (the famous baraka), hospitality, sanctuary, and safe passage are alike in this: they rest on the perhaps somewhat paradoxical notion that though personal identity is radically individual in both its roots and its expressions, it is not incapable of being stamped onto the self of someone else. (Quoted in North (1991), Journal of Economic Perspectives, 5(1), p. 104.)

What causes civil conflict?

In a series of recent articles, civil conflict researchers Esteban, Mayoral, and Ray (see this paper for an example) have tried to answer that question. Is it economic inequality, or cultural differences? Or maybe there is a political cause at its root. I encourage you to read the paper and to have a look at the video below. Here are a couple of images from the linked paper which, you’ll see, remind you of concepts that we’ve covered in IS210 this semester. The first image is only part of the “Model of Civil Conflict.” Take a look at the paper if you want to see the “punchline.”

[Image: part of the authors’ “Model of Civil Conflict”]

Here is the relationship between fractionalization and polarization. What does each of these measures of diversity measure?
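As a hint, the two measures can be computed directly from group population shares. Here is a minimal sketch in R (my own illustration, not the authors’ code), using the standard fractionalization index and the Reynal-Querol polarization index, a widely used special case of the Esteban–Ray family of measures:

```r
# p is a vector of group population shares summing to one
fractionalization <- function(p) 1 - sum(p^2)       # prob. two random individuals are from different groups
polarization     <- function(p) 4 * sum(p^2 * (1 - p))  # Reynal-Querol index, maximized at two equal groups

p_bipolar <- c(0.5, 0.5)   # two equally sized groups
p_many    <- rep(0.1, 10)  # ten equally sized small groups

fractionalization(p_bipolar)  # 0.5
polarization(p_bipolar)       # 1    -- polarization is at its maximum
fractionalization(p_many)     # 0.9  -- fractionalization keeps rising with more groups
polarization(p_many)          # 0.36 -- polarization falls as society fragments
```

This is why the two measures diverge: adding ever more small groups raises fractionalization but lowers polarization.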

[Image: the relationship between fractionalization and polarization]

And here’s a nice YouTube video in which the authors explain their theory.

Indicators and The Failed States Index

The Failed States Index is created and updated by the Fund for Peace. For the most recent year (2013), the Index finds the same cast of “failed” characters as in previous years. While there is some movement, the “top” 10 has not changed much over the last few years.

The Top 10 of the Failed States Index for 2013

Notice the columns in the image above. Each of these columns is a different indicator of “state-failedness”. If you go to the link above, you can hover over each of the thumbnails to find out what each indicator measures. For example, the column with what looks like a 3-member family is the score for “Mounting Demographic Pressures”, etc. What is most interesting about the individual indicator scores is how similar they are for each state. In other words, if you knew Country X’s score on Mounting Demographic Pressures, you would be able to predict the scores of the other 11 indicators with high accuracy. How high? We’ll find out when we run a simple regression analysis in IS240 later this semester.

For now, though, I was curious as to how closely each indicator was correlated with the total score. Rather than run regression analyses, I chose (for now) to simply plot the associations. [To be fair, one would want to plot each indicator not against the total but against the total less that indicator, since each indicator comprises a portion (1/12, I suppose) of the total score. In the end, the general results are similar, if not exactly the same.]

So, what does this look like? See the image below (the R code is provided below, for those of you in IS240 who would like to replicate this.)

Plotting each of the Failed State Index (FSI) Indicators against the Total FSI Score

Here are two questions that you should ponder:

  1. If you didn’t have the resources and had to choose only one indicator as a measure of “failed-stateness”, which indicator would you choose? Which would you definitely not choose?
  2. Would you go to the trouble and expense of collecting all of these indicators? Why or why not?

R-code:


install.packages("gdata") #this package must be installed to import the .xls file

library(gdata) #if you get the error "required package missing", install the dependent package as well, using the same procedure

fsi.df <- read.xls("http://ffp.statesindex.org/library/cfsis1301-fsi-spreadsheet178-public-06a.xls") #import the data into R, creating a data frame named fsi.df

pstack.1 <- stack(fsi.df[4:15]) #stack the 12 indicator variables into a single column

pstack.df <- data.frame(fsi.df[3], pstack.1) #attach each country's total score to the stacked rows

names(pstack.df) <- c("Total", "Score", "Indicator") #rename the variables for presentation

install.packages("lattice") #to be able to create lattice plots

library(lattice) #load the lattice package

xyplot(Total ~ Score | Indicator, data = pstack.df, layout = c(4, 3),
       xlab = "FSI Individual Indicator Score", ylab = "FSI Index Total") #one panel per indicator
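The adjusted comparison mentioned above (each indicator against the total less that indicator) can also be computed directly. Here is a self-contained sketch that uses simulated data in place of the FSI spreadsheet, so it runs without downloading anything; the data frame `fsi.sim` and its correlation values are illustrative, not the real FSI numbers:

```r
set.seed(42)
latent <- rnorm(178)  # a common underlying "failedness" factor for 178 countries

# 12 simulated indicators, each equal to the common factor plus noise
fsi.sim <- as.data.frame(replicate(12, latent + rnorm(178, sd = 0.5)))
names(fsi.sim) <- paste0("ind", 1:12)
total <- rowSums(fsi.sim)  # the simulated total score

# Correlate each indicator with the total LESS that indicator,
# so no indicator is correlated with a sum containing itself
adj.cors <- sapply(seq_along(fsi.sim),
                   function(i) cor(fsi.sim[[i]], total - fsi.sim[[i]]))
round(adj.cors, 2)
```

Because every simulated indicator shares the same underlying factor, the adjusted correlations all come out high, which mimics the pattern in the real FSI data.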

Deal or no deal and rational choice theory

As my students are aware, I have been under the weather since the beginning of January and am finally feeling somewhat like a human being again. During my down time, I took some rest and had time to do some non-school-related activities, one of which was trying out the Deal or No Deal app on my smartphone. You do remember the TV show hosted by Howie Mandel, right?

Deal or No Deal and Rational Choice Theory

Anyway, the basic idea of the show is this:

  • There are 26 suitcases on stage, each with a card containing a dollar amount between $1 and $1 million.
  • The game begins when the contestant chooses one of the 26 suitcases as “their” suitcase. If the contestant keeps the suitcase until the end of play, they win the dollar amount written on the card inside the suitcase.
  • The contestant must open a certain number of suitcases during each round of play–5 in the first round, 4 in the next, etc.
  • After each round, the game pauses and the contestant receives an offer from the mysterious banker via telephone, with Howie as the intermediary.
  • The contestant is then asked whether there is a “deal, or no deal.” The contestant may accept the banker’s offer or continue playing. [This is where the drama gets ramped up to 11!]
  • If you have watched the show, you’ll notice that the banker’s offer depends upon which dollar amounts have been revealed. If the contestant reveals many high-value suitcases, it becomes less likely that the suitcase s/he chose at the beginning is a high-value suitcase.

The smartphone version is slightly different from the TV show in that the suitcases do not have dollar amounts attached but point multiples (that is, you win 1X, 2X, 3X, … up to 1000X the pot).

Take a look at the images above, screenshotted (is that the past participle?) from my smartphone. What do you notice about the banker’s offer? What’s of importance here is the red boxes in each picture. These are two separate games, btw.

In the top game, there are only two suitcases left: one of them is the 20X and the other is the 200X. Therefore, I have either the 20X or the 200X. That’s quite a big difference in winnings: ten times. So, what would you do? What would a rational choice theorist say you should do? Are the banker’s offers rational in each case? Why or why not?
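As a hint for thinking it through, here is a back-of-the-envelope sketch (my own, using the 20X/200X multiples from the top game described above) of the benchmark a rational choice theorist would compare the banker’s offer against:

```r
# Two suitcases left, each equally likely to be mine
remaining <- c(20, 200)

# A risk-neutral player compares the offer to the expected value
ev <- mean(remaining)  # 110X

# A risk-averse player may rationally accept less than the EV;
# e.g., with square-root utility, the certainty equivalent is lower
certainty_equivalent <- mean(sqrt(remaining))^2  # ~86.6X
```

So an offer below 110X can still be a “rational” deal for a sufficiently risk-averse contestant, which is one way to make sense of the banker’s seemingly stingy offers.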

How much does political culture explain?

For decades now, comparativists have debated the usefulness of cultural explanations of political phenomena. In their path-breaking book, The Civic Culture, Almond and Verba argued that there was a relationship between what they called a country’s political culture and the nature and quality of democracy. (In fact, the relationship is a bit more complex, in that they believed that a country’s political culture mediated the link between individual attitudes and the political system.) Moreover, the political culture was itself a product of underlying and enduring societal-cultural factors, such as an emphasis on the family, a bias towards individualism, etc. Although Almond and Verba studied only five countries–the United States, West Germany, Mexico, Italy, and the United Kingdom–they suggested that the results could be generalized to (all) other countries.

How much, however, does culture explain? Can it explain why some countries have strong economies? Or why some countries have strong democracies? We know that cultural traits and values are relatively enduring, so how can we account for change? We know that a constant cannot explain a variable.

The 1963 Cover of Almond and Verba's classic work.

In a recent op-ed piece in the New York Times, Professor Stephen L. Sass asks whether China can innovate its way to technological and economic dominance over the United States. There is much consternation in the United States over recent standardized test scores showing US students doing poorly, relative to their global peers, on science exams. (How have Canadian students been faring?)

Professor Sass answers his own question in the negative. Why, in his estimation, will China not innovate to the top? In a word (well, actually two words)–political culture:

Free societies encourage people to be skeptical and ask critical questions. When I was teaching at a university in Beijing in 2009, my students acknowledged that I frequently asked if they had any questions — and that they rarely did. After my last lecture, at their insistence, we discussed the reasons for their reticence.

Several students pointed out that, from childhood, they were not encouraged to ask questions. I knew that the Cultural Revolution had upturned higher education — and intellectual inquiry generally — during their parents’ lifetimes, but as a guest I didn’t want to get into a political discussion. Instead, I gently pointed out to my students that they were planning to be scientists, and that skepticism and critical questioning were essential for separating the wheat from the chaff in all scholarly endeavors.

Although Sass admits that there are institutional and other reasons that will also serve to limit China’s future technological innovation, he ends up affirming the primacy of political culture:

Perhaps I’m wrong that political freedom is critical for scientific innovation. As a scientist, I have to be skeptical of my own conclusions. But sometime in this still-new century, we will see the results of this unfolding experiment. At the moment, I’d still bet on America.

Do you agree? What other important political phenomena can be explained by political culture?

Nomothetic Explanations and Fear of Unfamiliar Things

Bringing two concepts together, in Research Methods today we discussed the MTV show 16 and Pregnant as part of our effort to look at cause-and-effect relationships in the social sciences. The authors of a new study on the aforementioned television program demonstrate a strong link between viewership and pregnancy awareness (including declining pregnancy rates) amongst teenagers.

We paired this information with a hypothesized link between playing violent video games and violent behaviour. I then asked students to think about another putatively causal relationship that was similar to these two, from which we could derive a more general, law-like hypothesis or theory.

The computer lab presented us with another opportunity to think about moving from more specific and contextual causal claims to more general ones. Upon completion of the lab, one of the students remarked that learning how to use the R statistical program wasn’t too painful and that he had feared having to learn it. “I guess I’m afraid of technology,” he remarked. Then he corrected himself to say that this wasn’t true, since he didn’t fear the iPhone, or his Mac laptop, etc. So, we agreed that he only feared technology with which he was unfamiliar. I then prodded him and others to use this observation to make a broader claim about social life. And the claim was “we fear that with which we are unfamiliar.” That claim generalizes beyond the data we had just used, extrapolating to other areas of social life.

Our finishing hypothesis, then, was extended to include not only technology, but also people, countries, foods, etc.

P.S. Apropos of the attached TED talk, do we fear cannibals because we are unfamiliar with them?

Television makes us do crazy things…or does it?

During our second lecture in Research Methods, when asked to provide an example of a relational statement, one student offered the following:

Playing violent video games leads to more violent inter-personal behaviour by these game-playing individuals.

That’s a great example, and we used this in class for a discussion of how we could go about testing whether this statement is true. We then surmised that watching violence on television may have similar effects, though watching is more passive than “playing”, so there may not be as great an effect.

If television viewing can cause changes in our behaviour that are not socially productive, can it also lead viewers to change their behaviour in a positive manner? There’s evidence to suggest that this may be true: a recent study finds that watching MTV’s 16 and Pregnant is associated with lower rates of teen pregnancy. What do you think about the research study?

More on Milgram’s Methods of Research

In a previous post, I introduced Stanley Milgram’s experiments on obedience and authority. We watched a short video clip in class and students responded to questions about Milgram’s research methods. Upon realizing that the unwitting test subjects were all males, one student wondered whether that would have biased the results in a particular direction. The students hypothesized that women may have been much less likely to defer to authority and continue to inflict increasing doses of pain on the test-takers. While there are good reasons to believe that women would be either more or less deferential than men, what I wanted to emphasize is the broader point about evidence and theory as it relates to research method and research ethics.

The 'sophisticated' machinery of the Milgram Obedience Experiment

In the video clip, Milgram states candidly that his inspiration for his famous experiments was the Nazi regime’s treatment of Europe’s Jews, both before and during World War II. He wanted to understand (explain) why seemingly decent people in their everyday lives could have committed and/or allowed such atrocities to occur. Are we all capable of being perpetrators of, or passive accomplices to, severe brutality towards our fellow human beings?

Milgram’s answer to this question is obviously “yes!” But Milgram’s methods of research, his way of collecting the evidence to test his hypothesis, were biased in favour of confirming his predetermined position on the matter. His choice of lab participants is but one example. This is not good social science. The philosopher of science Carl Hempel long ago (1966) laid out the correct approach to producing good (social) science:

  1. Have a clear model (of the phenomenon under study), or process, that one hypothesizes to be at work.
  2. Work out the deductive implications of that model, looking particularly at the implications that seem least plausible.
  3. Test these least plausible implications against empirical reality.

If even these least plausible implications turn out to be confirmed by the data, then you have strong evidence to suggest that you’ve got a good model of the phenomenon/phenomena of interest. As the physicist Richard Feynman (1965) once wrote,

…[through our experiments] we are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.

Did the manner in which Milgram set up his experiment give him the best chance to “prove himself wrong as quickly as possible” or did he stack the deck in favour of finding evidence that would confirm his hypothesis?