The FIRSTS: the Dunning–Kruger effect (1999) or the unskilled-and-unaware phenomenon

Much talked about these days in the media, the unskilled-and-unaware phenomenon has been mused upon since, as they say, time immemorial, but was not seriously investigated until the ’80s. The phenomenon refers to the observation that the incompetent overestimate their competence, whereas the competent tend to underestimate their skill (see Bertrand Russell’s brilliant summary of it).


Although the phenomenon has gained popularity under the name of the “Dunning–Kruger effect”, it is my understanding that whereas the phenomenon refers to the above-mentioned observation, the effect refers to the cause of the phenomenon, namely that the very skills required to make one proficient in a domain are the skills that allow one to judge proficiency. In the words of Kruger & Dunning (1999),

“those with limited knowledge in a domain suffer a dual burden: Not only do they reach mistaken conclusions and make regrettable errors, but their incompetence robs them of the ability to realize it” (p. 1132).

Today’s paper on the Dunning–Kruger effect is the third in the cognitive biases series (the first was on depressive realism and the second on the superiority illusion).

Kruger & Dunning (1999) took a look at incompetence with the eyes of well-trained psychologists. As usual, let’s start by defining the terms so we are on the same page. The authors tell us, albeit in a footnote on p. 1122, that:

1) incompetence is a “matter of degree and not one of absolutes. There is no categorical bright line that separates ‘competent’ individuals from ‘incompetent’ ones. Thus, when we speak of ‘incompetent’ individuals we mean people who are less competent than their peers”.

and 2) The study is on domain-specific incompetents. “We make no claim that they would be incompetent in any other domains, although many a colleague has pulled us aside to tell us a tale of a person they know who is ‘domain-general’ incompetent. Those people may exist, but they are not the focus of this research”.

That being clarified, the authors chose 3 domains where they believed “knowledge, wisdom, or savvy was crucial: humor, logical reasoning, and English grammar” (p. 1122). I know that you, just like me, can hardly wait to see how they assessed humor. Hold your horses, we’ll get there.

The subjects were psychology students, the ubiquitous guinea pigs of most psychology studies since the discipline started to be taught in universities. Some people in the field even declaim, with more or less pathos, that most psychological findings do not necessarily apply to the general population; instead, they are restricted to the self-selected group of undergrad psych majors. Just as biologists know far more about the mouse genome and its maladies than about humans’, so do psychologists know more about the inner workings of the psychology undergrad’s mind than about, say, the average stay-at-home mom’s. But I digress, as usual.

Humor was assessed thusly: students were asked to rate the funniness of 30 jokes on a scale from 1 to 11. Said jokes had previously been rated by 8 professional comedians, and that provided the reference scale. “Afterward, participants compared their ‘ability to recognize what’s funny’ with that of the average Cornell student by providing a percentile ranking. In this and in all subsequent studies, we explained that percentile rankings could range from 0 (I’m at the very bottom) to 50 (I’m exactly average) to 99 (I’m at the very top)” (p. 1123). Since the social ability to identify humor may be less rigorously amenable to quantification (despite the comedians’ input, which did not achieve high interrater reliability anyway), the authors also chose tasks that require more intellectual muscle. Logical reasoning was tested with 20 logical problems taken from a Law School Admission Test, after which the students estimated both their general logical ability compared to their classmates and their test performance. Finally, another batch of students answered 20 grammar questions taken from the National Teacher Examination preparation guide.

In all three tasks,

  • Everybody thought they were above average, showing the superiority illusion.
  • But the people in the bottom quartile (the lowest 25%), dubbed incompetents (or unskilled), overestimated their abilities the most, by approx. 50 percentile points. They were also unaware that they had, in fact, scored the lowest.
  • In contrast, people in the top quartile underestimated their competence, though not to the same degree as the bottom quartile overestimated theirs: by about 10-15 percentile points (see Fig. 1).


I wish the paper had shown scatter plots with a fitted regression line instead of the quartile graphs without error bars, so I could judge the data for myself. I mean, everybody thought they were above average? Not a single one out of more than three hundred students thought s/he was kinda… meh? The authors did not find any gender differences in any of the experiments.
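
For illustration only, here is a minimal sketch in Python of the kind of figure I mean, using simulated (i.e., made-up) data tuned so that the quartile pattern comes out roughly like the paper’s: bottom quartile overestimating by ~50 percentile points, top quartile underestimating by ~10-15. None of the numbers below are Kruger & Dunning’s.

```python
# A sketch of the figure I wish the paper had: actual test percentile vs.
# self-estimated percentile, one dot per student, with a fitted regression line.
# All data are SIMULATED for illustration; nothing here comes from the paper.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 300                                    # roughly the sample size across the studies

actual = rng.uniform(0, 99, n)             # actual percentile (uniform by construction)
# Self-estimates compressed toward ~65th percentile plus noise: this alone
# reproduces bottom-quartile overestimation and top-quartile underestimation.
perceived = np.clip(65 + 0.2 * (actual - 50) + rng.normal(0, 12, n), 0, 99)

slope, intercept = np.polyfit(actual, perceived, 1)   # simple linear fit

plt.scatter(actual, perceived, s=10, alpha=0.5, label="simulated students")
plt.plot([0, 99], [intercept, intercept + slope * 99], "r", label=f"fit (slope = {slope:.2f})")
plt.plot([0, 99], [0, 99], "k--", label="perfect calibration")
plt.xlabel("actual percentile")
plt.ylabel("self-estimated percentile")
plt.legend()
plt.show()

# Quartile means, for comparison with the paper's quartile graphs:
quartile = np.digitize(actual, np.percentile(actual, [25, 50, 75]))
for q in range(4):
    m = quartile == q
    print(f"Quartile {q + 1}: actual = {actual[m].mean():.0f}, perceived = {perceived[m].mean():.0f}")
```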

Next, the authors tested the hypothesis about the unskilled that “the same incompetence that leads them to make wrong choices also deprives them of the savvy necessary to recognize competence, be it their own or anyone else’s” (p. 1126). They did that by having both the competents and the incompetents see the answers that their peers gave on the tests. Indeed, the incompetents not only failed to recognize competence, but they continued to believe they had performed very well in the face of contrary evidence. In contrast, the competents adjusted their ratings after seeing their peers’ performance, so they did not underestimate themselves anymore. In other words, the competents learned from seeing others’ mistakes, but the incompetents did not.

Based on these data, Kruger & Dunning (1999) argue that the incompetents are so because they lack the skills to recognize competence and error in themselves or in others (jargon: lack of metacognitive skills). The competents, on the other hand, underestimate themselves because they assume everybody else did as well as they did; but when shown the evidence that other people performed poorly, they become accurate in their self-evaluations (jargon: the false-consensus effect, a.k.a. the social-projection error).

So, the obvious implication is: if the incompetents learn to recognize competence, does that also translate into them becoming more competent? The last experiment in the paper attempted to answer just that. The authors got 70 students to complete a short (10 min) logical reasoning training session, while another 70 students did something unrelated for 10 min. The data showed that the trained students not only improved their self-assessments (though they still showed the superiority illusion), but they also improved their performance. Yays all around, all is not lost, there is hope left in the world!

This is an extremely easy read. I totally recommend it to non-specialists. Compare Kruger & Dunning (1999) with Pennycook et al. (2017): both papers talk about the same subject and both sets of authors are redoubtable personages in their fields. But while the former is a pleasant, leisurely read, the latter lacks mundane operationalizations and requires serious familiarity with the literature and its jargon.

Since Kruger & Dunning (1999) is under the paywall of the infamous APA website (infamous because they don’t even let you see the abstract, and even with institutional access it is difficult to extract the papers out of them, as if they owned the darn things!), write to me specifying that you need it for educational purposes and promising not to distribute it for financial gain, and thou shalt have its .pdf. As always. Do not, under any circumstances, use a sci-hub server to obtain this paper illegally! Actually, follow me on Twitter @Neuronicus to find out exactly which servers to avoid.

REFERENCE: Kruger J, & Dunning D. (Dec. 1999). Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6):1121-1134. PMID: 10626367. ARTICLE

P.S. I personally liked this example from the paper for illustrating what lack of metacognitive skills means:

“The skills that enable one to construct a grammatical sentence are the same skills necessary to recognize a grammatical sentence, and thus are the same skills necessary to determine if a grammatical mistake has been made. In short, the same knowledge that underlies the ability to produce correct judgment is also the knowledge that underlies the ability to recognize correct judgment. To lack the former is to be deficient in the latter” (p. 1121-1122).

By Neuronicus, 10 January 2018

The FIRSTS: The roots of depressive realism (1979)

There is a rumor that depressed people see the world more realistically and the rest of us are – to put it bluntly – deluded optimists. A friend of mine asked me if this is true. It took me a while to find the origins of this claim, but after I found them and figured out that the literature has a term for the phenomenon (‘depressive realism’), I realized that there is a whole plethora of studies on the subject. So the next few posts will be centered, more or less, on the idea of self-deception.

It was 1979 when Alloy & Abramson published a paper whose title contained the phrase ‘Sadder but Wiser’, albeit followed by a question mark. The experiments they conducted are simple, but the theoretical implications are large.

The authors divided several dozen male and female undergraduate students into a depressed group and a non-depressed group based on their Beck Depression Inventory scores (a widely used and validated questionnaire for self-assessing depression). Each subject “made one of two possible responses (pressing a button or not pressing a button) and received one of two possible outcomes (a green light or no green light)” (p. 447). Various conditions presented the subjects with various degrees of control over what the button does, from 0 to 100%. After the experiments, the subjects were asked to estimate their control over the green light, how many times the light came on regardless of their behavior, what percentage of trials the green light came on when they pressed or didn’t press the button, respectively, and how they felt. In some experiments, the subjects were winning or losing money when the green light came on.

Verbatim, the findings were that:

“Depressed students’ judgments of contingency were surprisingly accurate in all four experiments. Nondepressed students, on the other hand, overestimated the degree of contingency between their responses and outcomes when noncontingent outcomes were frequent and/or desired and underestimated the degree of contingency when contingent outcomes were undesired” (p. 441).

In plain English, it means that if you are not depressed, when you have some control and bad things are happening, you believe you have no control. And when you have no control but good things are happening, you believe you have control. If you are depressed, none of that matters: you judge your level of control accurately, regardless of the valence of the outcome.
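
For the quantitatively inclined: the objective degree of control in this kind of task is, as I understand it, the difference between the probability of the light coming on when you press the button and the probability of it coming on when you don’t (ΔP). Here is a minimal sketch in Python with made-up trial counts, just to make the measure concrete; the judged-control numbers are hypothetical, not the paper’s.

```python
# Sketch of the contingency (objective control) measure: the difference between
# P(green light | button pressed) and P(green light | button not pressed).
# All counts and judgments below are MADE UP for illustration.

def delta_p(press_light, press_dark, nopress_light, nopress_dark):
    """Objective control, in percent: P(light | press) - P(light | no press)."""
    p_light_press = press_light / (press_light + press_dark)
    p_light_nopress = nopress_light / (nopress_light + nopress_dark)
    return 100 * (p_light_press - p_light_nopress)

# Hypothetical "no control, frequent outcome" condition: the light comes on
# 75% of the time whether or not the button is pressed.
objective = delta_p(press_light=30, press_dark=10, nopress_light=30, nopress_dark=10)

judged_nondepressed = 45   # hypothetical judgment: illusion of control
judged_depressed = 5       # hypothetical judgment: roughly accurate

print(f"objective control: {objective:.0f}%")                                # 0%
print(f"nondepressed error: {judged_nondepressed - objective:+.0f} points")  # +45
print(f"depressed error:    {judged_depressed - objective:+.0f} points")     # +5
```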

Such an illusion of control is a defensive mechanism that surely must have adaptive value by, for example, allowing the non-depressed to bypass a sense of guilt when things don’t work out and to boost self-esteem when they do. This is fascinating, particularly since it is corroborated by findings that people receiving gambling wins or life successes like landing a good job – rewards that, at least in one case, are demonstrably attributable to chance – believe, nonetheless, that they are due to some personal attributes that make them special, that make them deserving of such rewards. (I don’t remember the reference for this one, so don’t quote me on it; if I find it, I’ll post it. It’s something about self-entitlement, I think.) That is not to say that life successes are not largely attributable to the individual; they are. But, statistically speaking, some must be due to chance alone, and yet most people feel like they are the direct agents of changes in luck.

Another interesting point is that Alloy & Abramson also tried to figure out how exactly their subjects reasoned when they asserted their level of control, through some clever post-experiment questionnaires. Long story short (the paper is 45 pages long), the illusion of control shown by nondepressed subjects in the no-control condition was the result of incorrect logic, that is, faulty reasoning.

In summary, the distilled-down version of depressive realism – that non-depressed people see the world through rose-colored glasses – is slightly incorrect. The illusion of control applies only in particular conditions: overestimation of control when good things are happening and underestimation of control when bad things are happening. But, by and large, it does seem that depression clears the fog a bit.

Of course, it has been nearly 40 years since the publication of this paper and of course it has its flaws. Many replications and replications-with-caveats and meta-analyses and reviews and opinions and alternative hypotheses have been confirmed and disconfirmed and then confirmed again with alterations, so there is still a debate out there about the causes, functions, ubiquity, and circumstantiality of the depressive realism effect. One thing seems to be constant though: the effect exists.

I will leave you with the ponderings of Alloy & Abramson (1979):

“A crucial question is whether depression itself leads people to be “realistic” or whether realistic people are more vulnerable to depression than other people” (p. 480).


REFERENCE: Alloy LB, & Abramson LY (Dec. 1979). Judgment of contingency in depressed and nondepressed students: sadder but wiser? Journal of Experimental Psychology: General, 108(4): 441-485. PMID: 528910. ARTICLE | FULLTEXT PDF via ResearchGate

By Neuronicus, 30 November 2017

64% of psychology studies from 2008 could not be replicated


It’s not every day that you are told – nay, proven! – that you cannot trust more than half of the published peer-reviewed work in your field. For nitpickers, I am using the word “proven” in its scientific sense, and not the philosophical “well, nothing can be technically really proven, etc…”

In an astonishing feat of collaboration, 270 psychologists from all over the world attempted to replicate 100 of the most prominent studies in their field, published in 2008 in 3 leading journals: Psychological Science (leading journal in all psychology), Journal of Personality and Social Psychology (leading journal in social psychology), and Journal of Experimental Psychology: Learning, Memory, and Cognition (leading journal in cognitive psychology). All this without any formal funding! That’s right, no pay, no money, no grant (there was some philanthropy involved; after all, things cost money). Moreover, they invited the original authors to take part in the replication process. Replication is possibly the most important step in any scientific endeavor; without it, you may have an interesting observation, but not a scientific fact. (Yes, I know, the investigation of some weird things that happen only once is still science. But a psychology study does not a Comet Shoemaker–Levy 9 make.)

Results: 64% of the studies failed the replication test. Namely, 74% of the social psychology studies and 50% of the cognitive psychology studies failed to reproduce the significant results originally published.
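
Back-of-the-envelope aside: since the overall failure rate is just a weighted average of the two subfield rates, the three rounded percentages above let you guess the approximate social/cognitive split of the 100 studies. A quick sketch in Python (this is arithmetic on rounded numbers, not the paper’s actual per-journal counts):

```python
# The overall failure rate is a weighted average of the subfield failure rates,
# so the rounded percentages imply an approximate study split. Not the paper's
# actual counts -- just arithmetic on the numbers quoted above.
overall_fail, social_fail, cognitive_fail = 0.64, 0.74, 0.50

# overall = w * social + (1 - w) * cognitive  =>  solve for w (fraction social)
w_social = (overall_fail - cognitive_fail) / (social_fail - cognitive_fail)
print(f"implied fraction of social psychology studies: {w_social:.2f}")   # ~0.58
print(f"implied split of 100 studies: ~{round(100 * w_social)} social, "
      f"~{round(100 * (1 - w_social))} cognitive")
```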

What does it mean? That the researchers intentionally faked their results? Not at all. Most likely the effects were very subtle and were inflated by reporting biases, fueled by academic pressure and the journals’ policy of publishing only positive results. Is this a plague that affects only psychology? Again, not at all; be on the lookout for a similar endeavor in cancer research – rumor has it that the preliminary results are equally scary.

There would be more to say, but I will leave you in the eloquent words of the authors themselves (p. aac4716-7):

“Humans desire certainty, and science infrequently provides it. […]. Accumulating evidence is the scientific community’s method of self-correction and is the best available option for achieving that ultimate goal: truth.”

Reference: Open Science Collaboration (28 August 2015). PSYCHOLOGY. Estimating the reproducibility of psychological science. Science, 349(6251):aac4716. doi: 10.1126/science.aac4716. Article | PDF | Science Cover | The Guardian cover | IFLS cover | Decision Science cover

By Neuronicus, 13 October 2015

Choose: God or reason

Photo Credit: Anton Darcy

There are two different approaches to problem-solving and decision-making: the intuitive style (fast, requires less cognitive resources and effort, relies heavily on implicit assumptions) and the analytic style (involves effortful reasoning, is more time-consuming, and tends to assess more aspects of a problem).

Pennycook et al. (2012) wanted to find out if the propensity for a particular type of reasoning can be used to predict one’s religiosity. They tested 223 subjects on their cognitive style and religiosity (religious engagement, religious belief, and theistic belief). The tests were in the form of questionnaires.

They found that the more people were willing to engage in analytic reasoning, the less likely they were to believe in God and other supernatural phenomena (witchcraft, ghosts, etc.). And that is because, the authors argue, people who engage in analytic reasoning do not accept ideas as easily without putting effort into scrutinizing them; if the notions submitted to analysis are found to violate natural laws, they are rejected. On the other hand, intuitive reasoning is based, partly, on stereotypical assumptions that hinder the application of logical thinking, and therefore the intuitive mind is more likely to accept supernatural explanations of the natural world. For example, here is one of the problems used to assess analytical thinking versus stereotypical thinking:

In a study 1000 people were tested. Among the participants there were 995 nurses and 5 doctors.
Jake is a randomly chosen participant of this study. Jake is 34 years old. He lives in a beautiful home in a posh suburb. He is well spoken and very interested in politics. He invests a lot of time in his career. What is most likely?
(a) Jake is a nurse.
(b) Jake is a doctor.

Fig. 1 from Pennycook et al. (2012) depicting the relationship between the analytical thinking score (horizontal) and percentage of people that express a type of theistic belief (vertical). E.g. 55% of people that believe in a personal God scored 0 out of 3 at the analytical thinking test (first bar), whereas atheists were significantly more likely to answer all 3 questions correctly (last bar).

The first thing that comes to mind, based on stereotypical beliefs about these professions, is that Jake is a doctor, but a simple calculation tells you that there is a 99.5% chance that Jake is a nurse. Answer (a) denotes analytical thinking; answer (b) denotes stereotypical thinking.
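
If you want the arithmetic spelled out, here it is as a trivial sketch in Python:

```python
# The base-rate arithmetic behind the "Jake" problem: ignore the stereotype and
# just use the composition of the sample.
nurses, doctors = 995, 5
p_nurse = nurses / (nurses + doctors)
print(f"P(Jake is a nurse) = {p_nurse:.1%}")   # 99.5%
```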

And yet that is not the most striking thing about the results; that would be the fact that the perception of God changes with the score on analytical thinking (see Fig. 1): the better you scored at analytical thinking, the less conformist and more abstract a view you’d have of God. The authors replicated their results on 267 additional people. The findings were still robust and independent of demographic data.

Reference: Pennycook, G., Cheyne, J. A., Seli, P., Koehler, D. J., & Fugelsang, J. A. (June 2012, Epub 4 Apr 2012.). Analytic cognitive style predicts religious and paranormal belief. Cognition, 123(3): 335-46. doi: 10.1016/j.cognition.2012.03.003.  Article | PPT | full text PDF via Research Gate

by Neuronicus, 1 October 2015