Tuesday, April 7, 2015

How credibility spoiled this mini-lecture on statistics

In a recent reply to one of our commentaries (Smits et al., 2015), Domenic Cicchetti gave an interesting mini-lecture on effect sizes, power, and sample sizes. You'll find it here. Though interesting in itself, it is a pity that the full scope of his contribution will largely go unnoticed, given a series of credibility issues with the text itself and with his position as the "statistics editor" of the Journal of Nervous and Mental Disease. Because some of my recent research deals with the credibility of online word-of-mouth communication, and because I feel somewhat misunderstood as the author of two earlier commentaries to this journal, I will elaborate on these credibility issues.

This story began with an article by Douglas Turkington and colleagues. Together with a number of scientists on Twitter, we found the statistics and methods reporting in this article to be unscientific, and the Twitter discussion soon resulted in a letter to the editor of the journal. Daniel Lakens (@lakens) described this process in a blogpost. Joining Daniel and me in that letter were Stuart Ritchie (@StuartJRitchie) and Keith Laws (@Keith_Laws). To be honest, we did not cover everything in that letter to the editor but stuck to the points we thought were most important. I described other issues with Turkington's original paper in another blogpost.

Following this first published commentary, Turkington wrote a rejoinder (ironically, referencing our letter incorrectly) asserting that all statistics had been double-checked, that everything was OK, and that we could have the data to check for ourselves. This seemed like a nice invitation, so we accepted and verified the analyses ourselves. We received the data, together with an invitation to get slaughtered in a seminar. We accepted the former and declined the latter. So we went ahead and reanalyzed, confirming our suspicions. We then kindly wrote our second letter with the correct analyses of the effect sizes, noting that another type of effect size would even be preferable. And here Dr. Cicchetti joins our discussion. The journal describes him as its statistical editor. He lectures about effect sizes, about how we should be careful interpreting them, about how we should have referred to the work of Kraemer & Thiemann (1987), and so on.
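For readers unfamiliar with why "another type of effect size" matters: in a pre/post (within-subjects) design, Cohen's d(z) standardizes the mean change by the standard deviation of the difference scores, while alternatives such as d(av) standardize by the raw score standard deviations, and the two can differ dramatically for the same data. A minimal sketch with made-up numbers (purely illustrative, not the Turkington et al. data):

```python
from math import sqrt
from statistics import mean, stdev

# Made-up pre/post scores for eight patients (purely illustrative,
# NOT the Turkington et al. data).
pre  = [30, 28, 35, 32, 40, 27, 33, 31]
post = [26, 25, 33, 27, 36, 25, 30, 28]

diffs = [a - b for a, b in zip(pre, post)]

# d(z): mean change standardized by the SD of the difference scores.
d_z = mean(diffs) / stdev(diffs)

# d(av): mean change standardized by the average of the raw SDs
# (a common "between-subjects style" alternative).
d_av = mean(diffs) / sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)

print(round(d_z, 2), round(d_av, 2))  # d(z) is far larger here
```

With correlated pre/post scores like these, d(z) comes out several times larger than d(av), which is exactly why being explicit about which effect size one reports is not a pedantic detail.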

So, what are the credibility issues?

What Cicchetti writes about Frontiers in Psychology

  • If you lecture about statistics as the statistics editor of the journal, it would be wise to also have a look at the initial claims. Cicchetti states that "The reader should also be advised that [his] comment rests upon the assumption that the revised data analyses are indeed accurate because I was not privy to the original data. This seemed inappropriate, given the circumstances." How it was inappropriate is beyond my understanding, but the fact is that for a large number of the issues we had with the original paper, one does not need the data to know there are mistakes. A quick overview: the bounds of some confidence intervals are equal to the corresponding parameter means; very large effect sizes supposedly had confidence intervals suggesting there is no effect at all; graphs appear without axis labels; and the paper lacks a clear naming of concepts and variables. For a full overview, see my earlier blogpost, which was also written without access to the data.
  • Cicchetti's claim that he was "not privy to the original data" is quite ironic given two observations. First, he apparently was privy to our data, because when he reproduces our table he uses the correct d(z) and not the mistake the editorial office made when typesetting our letter. Second, when Douglas Turkington emailed the data to us, as requested, he also included the editor, thus distributing the data to the editorial offices as well.
E-mail with editorial office included
  • Cicchetti then lectures us about interpreting effect sizes: "Before interpreting the meaning of the revised data in Table 1, it is important to discuss how ESs have been interpreted by Cohen (1988)". This is interesting given that we, too, were criticizing such interpretations by Turkington and colleagues. While Cicchetti urges caution in interpreting effect sizes as small, medium, or large, Turkington's abstract reads: "Cohen's d effect sizes were medium to large for overall symptoms [...], and negative symptoms [...] There was a weak effect on dimensions of hallucinations but not delusions".
  • Cicchetti also produced his own table of effect sizes based on our analyses. A simple comparison with the originally reported effect sizes would, I would think, prompt any statistics editor to check which of the two versions is right. The differences are just too big.
Turkington's original effect sizes and those mentioned by Cicchetti

  • Lastly, Cicchetti lectures about planned sample sizes for future studies based on this exploratory study by Turkington. Again, the lecture lacks credibility because it focuses on a secondary point of ours (the difference between types of effect sizes). In our commentary, we compared Turkington's reported effect sizes with the actual effect sizes and concluded that for 80% power one would not need merely the 22 participants Turkington's data suggest, but 45 participants. Of course, the statistically non-naive will understand our claim just by looking at the table above.
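To see why an inflated effect size matters so much for planning, here is a minimal sketch of the usual sample-size arithmetic for a paired t-test, using the normal approximation. The effect sizes 0.8 and 0.5 are hypothetical round numbers, not the actual values at issue:

```python
from math import ceil
from statistics import NormalDist

def approx_n_paired(d_z, alpha=0.05, power=0.80):
    """Approximate n for a paired t-test via the normal approximation.

    Slightly underestimates the exact (noncentral-t) answer for
    small samples, but shows how n scales with the effect size.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / d_z) ** 2)

# Hypothetical round numbers, not the effect sizes from the paper:
n_large  = approx_n_paired(0.8)  # an (over)estimated "large" effect
n_medium = approx_n_paired(0.5)  # a smaller, corrected effect
print(n_large, n_medium)
```

Exact planning with the noncentral t distribution (e.g., G*Power or statsmodels' TTestPower) gives slightly larger numbers, but the point stands: since n scales with 1/d², reducing the effect size from 0.8 to 0.5 roughly two-and-a-half-folds the required sample.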
Up to this point, the Journal of Nervous and Mental Disease has not taken any action to retract or correct the original article or its results.


  1. It would be ironic if the Cicchetti commentary ends up being corrected to withdraw the attack on Frontiers while the original paper does not get corrected (as sadly seems likely)!

  2. For a humorous look at academic commenting and self-correction, this is the must-read
    (HT @sTeamTraen)

  3. Although I explicitly asked Domenic Cicchetti for a response to this blogpost, I have not yet received an answer. I guess he just wants to comment in the secluded medium called the 'Journal of Nervous and Mental Disease'.
    That is how this journal deals with commentaries. First it publishes awful research. Then, pretending to be a truly academic journal, it agrees to publish our first commentary, which it immediately dismisses with a nonsensical response. Again pretending to be a truly academic journal, it then agrees to publish our correct analyses, only to dismiss them again with this stupid response. By keeping the discussion on its own playground rather than in the open, the journal prevents actual academic self-correction. Indeed, although the correct results are now available in the journal, the original article and its erroneous statistics are still available without any mention of our commentaries.

  4. Further reading: http://blogs.plos.org/mindthebrain/2015/04/22/busting-foes-of-post-publication-peer-review-of-a-psychotherapy-study/