Should science journalists stay away from preprints?


If science journalists can report results discussed at conferences, why should preprints be off-limits, especially if reported responsibly? While peer-reviewed papers published in reputable journals give journalists a degree of confidence in reporting, there have been instances when such papers have caused much harm. Peer-reviewing is not a magic bullet, as the number of papers with duplicated images shows. Responsible journalism is what matters more.

Extraordinary claims require extraordinary evidence. And to broaden the aphorism popularised by Carl Sagan, I would say extraordinary claims (made by scientists) invite extraordinary scrutiny by peers and coverage by science journalists. To think otherwise is naivety.

The reason for writing this post is to clear up some misconceptions that lay people in general and scientists in particular have about science journalists writing articles about preprints deposited in repositories. The context for this is the heated discussion on social media following media coverage of a preprint deposited in arXiv by a two-member team of Dev Kumar Thapa and Prof. Anshu Pandey from the Indian Institute of Science (IISc), Bengaluru. The team had apparently observed nanosized films and pellets made of silver nanoparticles embedded in a gold matrix exhibiting superconductivity at ambient temperature (-37 degrees C) and pressure.

The post I wrote (for want of space and other constraints a shorter version was published in The Hindu) based on the preprint did not elicit any comments from any scientist on why I chose to write on a preprint and not wait for the paper to be peer-reviewed and published in Nature, where the authors had submitted their manuscript. Since the authors were prohibited from speaking to the media, I spoke to a few physicists working in the field of superconductivity to deconstruct the paper and make the findings of the paper accessible to lay readers. I cross-checked certain details with Prof. Arindam Ghosh at IISc before publishing the piece. There was not a tinge of hype or hyperbole in the article.

The next article that I wrote on the subject was after Brian Skinner, a physicist at the Massachusetts Institute of Technology, posted his comments on the arXiv repository. He raised a red flag after noticing nearly identical noise in two presumably independent measurements of the magnetic susceptibility in Prof. Pandey’s preprint. Noise, by its very nature, is random, so finding nearly identical noise in measurements made under different conditions is highly improbable.
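To make the point about noise concrete, here is a minimal sketch in Python. It is not Skinner's analysis and does not use the preprint's data; the series `noise_a`, `noise_b` and `copied` are made-up illustrations, and the Gaussian noise model is an assumption. It compares the correlation between two independently generated noise series with the correlation between a series and a shifted copy of itself.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: a fixed seed keeps the illustration repeatable.
rng = random.Random(0)
n = 200

# Two independent "measurements" of pure Gaussian noise.
noise_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise_b = [rng.gauss(0.0, 1.0) for _ in range(n)]

# The same noise pattern shifted by a constant offset.
copied = [x + 0.5 for x in noise_a]

r_independent = pearson(noise_a, noise_b)  # expected to be close to 0
r_copied = pearson(noise_a, copied)        # expected to be close to 1

print(f"independent noise: r = {r_independent:.3f}")
print(f"offset copy:       r = {r_copied:.3f}")
```

For independent noise the correlation hovers near zero and shrinks further as the number of points grows, while an offset copy correlates almost perfectly. That gap is why a repeated noise pattern in two supposedly separate measurements stands out.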

From arXiv to social media

Incidentally, Skinner did not stop with posting his comment on the arXiv repository. The scientific discussion, nay monologue, spilled over to social media when, starting August 9, Skinner explained the potential problem with the IISc study in a series of tweets. By the time I wrote the second article on August 12, Skinner had got a reply from the IISc researchers. “Thanks for pointing this out! We hadn’t noticed this peculiar noise correlation. We don’t know its origin yet,” they had said in an email to him. Skinner tweeted this as well and went one step further by saying: “I’ve had another email exchange with the authors, and I will just say: They are REALLY not backing down from their claims. They emphasize that they are focused on providing validation of their data, and will only post new data or a response to my note once they have done so.”

As if it was not enough that Skinner took to social media to dissect what he felt was a problem with the preprint, a discussion that should ideally have been restricted to the comments section, the IISc researchers did not feel compelled to restrict themselves to the arXiv repository either. They instead emailed him. If the purpose was to keep the message private, then Skinner defeated it by tweeting the crux of the message.

And Prof. Pratap Raychaudhuri from the Superconductivity Lab at Tata Institute of Fundamental Research (TIFR) Mumbai too took to Facebook (on August 12 and hours before I posted my article) to explain what he felt was an alternative possibility for the origin of identical noise.

The entire discussion was in the public domain but not a single soul found it odd. Yet some scientists suddenly and strongly felt I should have waited for the paper to be published before writing the article. One scientist (I don’t want to divulge the name) even went to the extent of saying I should have waited for the authors to speak to me after the embargo ended before posting Skinner’s views, which had anyway been tweeted days earlier.


To the scientists who voiced their opinion, and to others who felt the same but chose to remain silent, let me explain how science journalism works. While published papers, especially those published in high-impact journals (I am aware of the problems with impact factor), are the main source on which science journalists across the world base their articles, they are not the only source.

Scientific conferences are another major source of information that many science journalists routinely bank on. Many of the presentations at these conferences have not even reached the stage of submission to a peer-reviewed journal but are worthy enough to be written about. If there is no problem with articles based on information gathered during conferences (and obviously cross-checked with a few experts), why is there such commotion when it comes to reporting on preprints? Especially when the reporting is done responsibly and is not on health- and medicine-related issues, where findings that are downright wrong might cause immediate danger or unintended harm?

Journalists do have a greater degree of confidence while reporting on a study published in a peer-reviewed journal, especially one with a high impact factor. But there are umpteen cases where science journalists have ended up communicating a very wrong message by relying on peer-reviewed papers.

Tom Sheldon’s article in Nature News says, in so many words, that science journalists working from preprints can end up writing articles that are plain wrong or “misleading”. One of the hypothetical cases he cites is how early findings showing that a common vaccine is unsafe could cause much harm. I wonder how he chose to cite this example and how it passed scrutiny.

The MMR-autism association


Has Sheldon ever heard of the now infamous study published in 1998 by Dr. Andrew Wakefield and his colleagues in The Lancet, one of the most respected, high-impact, peer-reviewed journals? In short, the study was about the association of the MMR vaccine with autism in eight of the 12 children reported. “Many parents seeking a cause for their children’s illness seized upon the apparent link between the routine vaccination and autism,” says a 2010 paper in The Canadian Medical Association Journal. This study resulted in a precipitous drop in the number of children being vaccinated, leading to “dramatic health consequences”. There were large measles outbreaks in 2008 and 2009 in the UK following a drop in vaccination. The paper was completely retracted only after 12 long years and after BMJ wrote a series of articles exposing the fraud.

GM corn and cancer

Sheldon then refers to Gilles-Éric Séralini’s paper on how rats fed GM corn developed cancer to drive home the point of how “public understanding has been distorted by media coverage of ambiguous or just downright bad science”. Though journalists were not allowed to cross-check the findings with scientists not connected with the paper, Sheldon seems to have forgotten that the paper was indeed published in a peer-reviewed journal.

Scientists of every hue and colour know that peer-reviewing is far from perfect. The Retraction Watch blog and websites such as Pubpeer, which allows researchers to discuss and review scientific papers that have been published, would not exist if peer-reviewing were good enough.


Irreproducible research

Publication of a paper in a journal cannot be considered final. For all the peer-reviewing that has gone into a manuscript before acceptance and publication, there are umpteen papers published in top-notch peer-reviewed journals that do not stand up to further scrutiny. The first test is whether the experiment can be reproduced by the same group, and then by other independent research groups across the world. There is now “growing alarm about results that cannot be reproduced”, says Nature.

A survey carried out by Nature found that there is a reproducibility crisis in science. As many as two-thirds of the scientists who responded to the Nature survey admitted that reproducibility is a major problem. “Pressure to publish, selective reporting, poor use of statistics and finicky protocols can all contribute to wobbly work,” says the Nature editorial.

Peer-reviewed papers with duplicated images

A day before Sheldon published his News piece, a paper by Dr. Elisabeth M. Bik and others published in Molecular and Cellular Biology found that as many as 35,000 papers published in peer-reviewed journals between 2009 and 2015 are candidates for retraction due to image duplication. Sheldon would have had enough time to write a better-informed piece had he only referred to her preprint posted in the bioRxiv repository on June 24 (one month before he published his News item). And a 2016 study by Dr. Bik and her colleagues, which looked at the country of origin of the authors of the 348 papers published in PLOS ONE that had duplicated images, found that China and India had a higher proportion of papers containing problematic images. In absolute numbers, though, more papers from China and the U.S. contain duplicated images.


In an interview, Dr. Bik told me: “Most [of the duplicated images] are easy to spot for me, but apparently not for others. All published papers have gone through peer review and editorial handling, and papers I am scanning have been published months or years ago, so there were several opportunities for others to see them.”

In all fairness, the near-identical noise in the IISc team’s result could have been spotted by Nature’s peer-reviewers and the authors alerted. But looking at the numerous instances where even simple image duplication within the same panel in a paper has been missed by peer-reviewers and other scientists reading the paper, I doubt if peer-reviewers would have noticed it. Skinner had to magnify or zoom (as he refers to it) the bottom part of the figure to see the near-identical noise in two independent measurements of the magnetic susceptibility. Would peer-reviewers have done that?

Evidence nearer home

Proof that peer-reviewers have limited ability to spot even simple image duplication can be seen in two recent cases in India, where researchers from IIT Dhanbad and Bose Institute, Kolkata have been found to be serial offenders. In the case of the IIT Dhanbad researchers, fourteen papers have been retracted and two corrected for image duplication, and another two dozen problematic papers are still listed on Pubpeer. The Bose Institute case is quite similar: two papers have been retracted and two corrected, and about a dozen more problematic papers are still listed on Pubpeer. There are scientists from other Indian labs who have fewer problematic papers listed on the Pubpeer website.

Of the papers from Indian labs (including CSIR labs) and nodal institutions in the country that were posted on the Pubpeer website in the last month, almost 90% have issues of image duplication. These papers have been published in peer-reviewed journals (some of them really reputable ones).

Since the papers have been listed only in the last month, none of them has been retracted or corrected. The main reason I spent a few days scanning the Pubpeer website was to know which scientists’ work I should avoid reporting, even if they are from reputed institutions and published in respected journals, lest I end up promoting fake science.

When peer-reviewing could have helped

But yes, there have been cases where the main result reported in a preprint has proved to be completely wrong. In September 2011, the OPERA team posted a preprint in arXiv claiming that neutrinos had arrived about 60 nanoseconds sooner than light would have. It later became apparent that the result was wrong, and the mistake was due to faulty wiring.

And in March 2014, scientists at the Harvard-Smithsonian Centre for Astrophysics in the U.S. announced the indirect detection of gravitational waves in the afterglow of the Big Bang, only to be proved wrong.

These two announcements, made prior to peer-reviewing, caused a lot of embarrassment, and two members of the OPERA team had to step down owing to severe criticism. It is premature to say anything about the IISc superconductivity study.

“They [IISc team] are REALLY not backing down from their claims. They emphasize that they are focused on providing validation of their data, and will only post new data or a response to my note once they have done so,” Skinner tweeted. “People at IISc are being tight-lipped, and the official statement from the authors is that they’re waiting to “have their data validated by another group” before they reply.”

Unlike religion which is dogmatic, this episode gives a non-scientist outsider a peep at how science self-corrects. And that perhaps is the correct way to look at it without getting emotional.

4 Thoughts

  1. A bit surprised about people objecting to using preprints as sources. I think any presentation, oral or written, to others means one implicitly agrees to stand behind what is said or written. You should just ignore people who say preprints or conference proceedings should not be used and move on.

  2. The text below is largely reproduced from a Facebook post that clarifies my own views, as a practicing scientist, on how preprints should be reported by science journalists. The discussion below arose during a meeting that Prasad attended, which was organized by Rahul Siddharthan and myself, from the Institute of Mathematical Sciences, Chennai. Prasad asked if I would agree to post it on his blog as well and I’m very happy to:

    A second set of musings about the science communication meeting. At the meeting, one issue that exercised some journalists, in particular Prasad Ravindranath of the Hindu, was the question of whether archived pre-prints were fair game for science journalists to write on.

    I understand that he received some flak for his choice to do so in one case, although it wasn’t clear from whom.

    My own stand on this is clear. By posting my submitted/to-be-submitted paper on the archive – I prefer the biology archive nowadays, given what I work on – I submit my work to the scrutiny of a far larger number of scientists in my area than will actually encounter the same paper by turning the issues of a journal. My name on the paper in preprint form counts, for me, as much or even more as seeing its final version in print. I am putting my reputation on the line when I post work in these forums. It’s my understanding, as well as that of those who read it, that this represents work I am happy to be associated with and which I am prepared to defend professionally.

    My own memory of the importance of archived preprints is an early one, of TIFR graduate students in string theory, anxiously checking to see if Ed Witten might not have scooped their own idea in his most recent preprint. And a later one, of similar graduate students in a western country, dashing to their computer terminals in time for the morning posting, to check the latest advances in their field. Different countries, the same worries.

    The paper might be revised before publication, one or more times, in accordance with the views of the one or two reviewers who might have serious, independent, and often valuable comments on it. This is hardly a point against the preprint system. As an author, I am expected to update the submitted preprint to the newer version, while keeping the older version intact and accessible, so that any reader can choose to compare older and more recent versions. (This wording is actually slightly misleading: the archives in physics and biology do not let you revise previous versions. They remain as they are for all to see. An updated version can be uploaded but does not overwrite previous versions.)

    Also, referees are not infallible and sometimes referee comments are thinly disguised versions of “Refer to my own papers, list provided below, or else …”. Often they improve the quality of the paper but equally often the changes they require are largely cosmetic.

    So my answer would be: Preprints are absolutely fair game for science journalists and don’t let anyone tell you they are not. (They could always ask me, or indeed any professional scientist I know, for clarification ;-))

    [A further set of clarifications, following a number of useful comments on my original Facebook post. The culture of preprints is very strong in mathematics, in theoretical physics and in computer science. It is gaining strength in biology. The chemists have been slow to catch up, overall. In mathematics, Grigori Perelman’s proof of the Poincaré conjecture, one strong enough to be awarded a Clay Prize of a million dollars, which Perelman declined(!), remains a preprint! A grey area is whether one should consider preprints towards promotions and related scientific advancement. The interests of large commercial publishers are not always in tune with a culture of preprints. When authors are allowed to share their work and invite commentary, such publishers feel that their primacy is being usurped. My own feeling is that it would be in their own interest to get on board with this as soon as possible, since they risk being left behind. Finally, and I cannot emphasize this enough, preprints level the playing field for scientists from the developing world. They may simply be the most innovative method we know of for enabling such access to the best of science from all round the world.]

  3. * On the IISc superconductivity issue: The Skinner report is a huge red flag and the entire story is very much newsworthy independent of the preprints debate (actually there is no longer a preprints debate among any practicing scientist that I know of).

    * On preprints: they were pioneered by high-energy physicists in the early 1990s and adopted wholesale by other physicists, mathematicians and computer scientists in the 1990s. The original archive (arXiv, pronounced archive) hosts all these topics, as well as quantitative biology and much more. Experimental physicists were slower to accept it but today I don’t think a single reputable physicist would disapprove of posting on arXiv before publishing. In biology, bioRxiv, an independent effort from arXiv but inspired by it, has exploded in the past 3 years after strong endorsements and appeals to the community by prominent biologists including several Nobel laureates. Today, again, I don’t think a single reputable biologist would frown on posting pre-publication material on bioRxiv. Though biology was late to this game, it was early to the open-access movement (but author-pays OA has other problems). Some other fields of science are lagging in accepting preprints, true. But when Nature posts an anti-preprint article you can basically assume it’s a vested interest.

    * On research integrity: both arXiv and bioRxiv keep all posted preprints up permanently. You can upload a revision but the original will still be there. You can upload a retraction notice but the original will still be there. This (a) serves as a marker of historical record much more strongly than any journal system that I know of; (b) serves as a strong disincentive to putting out half-baked results, since, unlike Facebook posts, you know the ill-thought-out preprint will be there for posterity.

    Really, the debate should have been over 15 years ago, and it is definitely over now. Preprints are the primary way science is communicated.

    UPDATE – August 22, 2018, 8.30 pm:

    1) Perelman’s proof of the Poincaré conjecture was and remains a preprint. Peer review by a large community of a preprint is far more effective than peer review by three anonymous reviewers of a journal article.

    2) Should add within the present context that the Thapa-Pandey preprint is not comparable to Perelman’s. Perelman’s was 100% verifiable. And that’s what preprints need to be. Which, on the experimental side, means complete documentation of experiments, fabrication of samples, etc to the extent that any competent researcher in the field can try to replicate the results; and willingness to share samples with disinterested researchers. (last two comments basically cross-posted from my comments on Facebook)

  4. I find that you are presenting a romantic review of your articles. You have the right to write about and doubt anything that you wish, but let us note that you chose to reproduce the phrase “…By definition noise patterns are random and cannot reproduce… The Skinner report thus immediately raised a red flag of possible academic misconduct: carelessness or outright fraud. But could there be a different explanation?” from Prof. Raychaudhuri, thus alarming the public before any proof of misconduct had appeared. You have a duty towards the public who read you. You have chosen your public to be scientists, therefore abide by their rules.
