Improving research reliability

Malte Elson of the University of Bern, Switzerland, talks about an initiative that pays reviewers to find errors in influential articles

"It is strange that science has no formal mechanism for detecting and correcting errors after publication," says Malte Elson, leader of the ERROR project and a professor at the University of Bern in Switzerland | Image: RUB/Marquard

Estimating the Reliability & Robustness of Research (ERROR), a pilot project based at the University of Bern, Switzerland, was launched this year with the aim of providing a mechanism to systematically detect and correct errors in scientific articles after publication. The objective is to pay experts to find errors in influential papers.

Inspired by the technology industry, which recruits programmers to test its products for vulnerabilities, ERROR seeks to check academic articles that have been cited at least 30 times per year since they were published.

The project offers reviewers 1,000 Swiss francs (about US$1,100) per article, plus a bonus for every error they find. A review can only go ahead with the authors' agreement; authors are also remunerated for participating and receive a bonus if no errors are identified.

In an interview with Science Arena, Malte Elson, a psychologist from the University of Bern and head of the project, talks about ERROR’s first steps, its reliability, and how authors and reviewers are compensated.

Science Arena – What motivated you to create the ERROR project and what are its objectives?

Malte Elson – One of the purposes of science is to discover truths—about the universe, human beings, chemical elements, or whatever you want. Given this, it is strange that science has no formal mechanism for detecting and correcting errors after publication. Traditional peer review is not designed explicitly to detect errors; reviewers often do not have access to more detail or research material (such as raw data, code, etc.) than any other reader. And even if they did, it’s quite conceivable that they often do not have time to delve deeper.

Other professions, particularly in the technology industry, have used ‘bug bounties’ for decades to reward freelancers who find and report critical flaws in products such as code or hardware. It is cheaper to pay for error detection than not to do it.

ERROR was launched in February 2024. How has it been received?

We have about 15 articles under review, of which one has completed the review cycle. The reviewer found a few minor errors. These errors did not materially affect the conclusions of the article, but the author, Jan Wessel, has contacted the journal about a possible correction, which is very commendable.

Overall, the reception has been very positive, but of course we still do not know for sure if the community will accept a program like ERROR.

One of the goals of ERROR is to understand more about how well a systematic error detection mechanism, whether based on financial rewards or otherwise, would be accepted.

How many reviewers are initially part of the project and what kind of profile do they have?

We have funding to do at least 100 reviews over four years. We will start in the fields of psychology, social science, and broader behavioral sciences. Then we will move on to medicine and possibly other areas.

We will need reviewers with very diverse experiences and skills, but who share some characteristics. The people who are most successful at detecting errors are probably naturally curious. They have to approach the material they are analyzing—such as procedures, computer code, or data—in a unique way, thinking about what could likely have gone wrong.

Many will also have the technical skills to read someone else’s code or understand statistical analyses that are not always meticulously documented.

What are the criteria for selecting articles?

We will only review articles that have reached a certain level of "importance." The logic is to make the most efficient use of our resources. Of course, no scientific article should contain errors, but our reasoning is that errors in important, highly cited papers can have consequences for much of the literature, so discovering them is a higher priority.

Defining or measuring importance, however, is not a simple task. In this initial phase of the ERROR project, we are using a very simple rule: the articles must have been cited at least 30 times per year since they were published.
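As a rough illustration of that screening rule (not the project's actual tooling, which is not described in the interview), the check amounts to dividing a paper's total citations by the years since it was published and comparing the average against 30. The function name, the citation source, and the default threshold parameter in the sketch below are assumptions for illustration only.

```python
from datetime import date

def meets_citation_threshold(total_citations: int, year_published: int,
                             min_rate: float = 30.0,
                             today: date | None = None) -> bool:
    """Return True if a paper averages at least `min_rate` citations per
    year since publication, mirroring the selection rule described above.

    Hypothetical helper: where the citation counts come from (e.g., a
    bibliographic database export) is an assumption, not part of ERROR.
    """
    today = today or date.today()
    years = max(today.year - year_published, 1)  # avoid division by zero
    return total_citations / years >= min_rate

# Example: a 2018 paper with 240 citations averages ~40 per year by mid-2024
print(meets_citation_threshold(total_citations=240, year_published=2018,
                               today=date(2024, 6, 1)))  # True
```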

Could financial incentives for authors who agree to have their articles reviewed constitute a scientific bias?

Any new incentive creates opportunities for bias. We are well aware of this, but we believe that the benefits of encouraging error detection outweigh the risks. With ERROR, we are trying to counter potential biases by implementing a transparent system of checks and balances.

Most importantly, it is not the reviewer who decides whether something constitutes an error or not. Instead, the reviewers prepare a report in which they describe their concerns. The authors are given the opportunity to respond, and then a recommender (similar to the role of an editor in a standard peer review by a journal) evaluates the arguments of both parties to decide what should happen.

The reviews and author responses are published together on our website, including any data or code generated as part of the review.

Anyone can take a look if they suspect that a reviewer made a mistake or missed an error in the article they were assigned to review.

Then, if necessary, we can make adjustments. This is a dry run, not a global implementation of a proven system.

How do you expect authors to react to having their articles examined by the project’s reviewers?

I think it’s safe to assume that the types of authors who agree to participate in ERROR are different from the types of authors who do not. They may have certain personality traits or perhaps they already have greater job stability. Maybe they work in a lab where openness and constructive criticism are already commonplace and encouraged.

Furthermore, authors who suspect that their work would not stand up to scrutiny by an expert reviewer would likely be less inclined to participate. The ERROR project is not intended to provide an estimate of the “true” error rate in scientific articles.

Rather, it was designed to test whether an error detection model would work in principle and whether it is worth the cost.

* This article may be republished online under the CC-BY-NC-ND Creative Commons license.
The text must not be edited and the author(s) and source (Science Arena) must be credited.
