Researchers Discover a New Form of Scientific Fraud: ‘Sneaky References’



The image of a researcher working alone, isolated from the rest of the world and the scientific community, is a classic one, but it is mistaken. In reality, research is built on constant exchange within the scientific community: you first build on the work of others, then you share your own discoveries.

Reading and writing articles published in peer-reviewed journals and presented at conferences is an essential part of a researcher’s work. When researchers write a scientific article, they must cite the work of their peers to provide context, detail sources of inspiration, and explain differences in approaches and results. Being cited positively by other researchers is a key measure of the visibility of a researcher’s own work.

But what happens when this citation system is manipulated? A recent study published in the Journal of the Association for Information Science and Technology (JASIST) by our team of academic detectives, which includes information scientists, a computer scientist, and a mathematician, has revealed an insidious method for artificially inflating citation counts through metadata manipulation: sneaky references.

Hidden manipulation

People are increasingly aware of how scientific publishing works, including its potential flaws. Last year, more than 10,000 scientific papers were retracted. The problems with the citation game, and the damage it does to the scientific community, including by undermining its credibility, are well documented.

Citations of scientific works follow a standardized referencing system: each reference explicitly mentions at least the title, the authors’ names, the year of publication, the name of the journal or conference and the page numbers of the cited publication. This information is stored as metadata, not directly visible in the text of the article, but assigned to a digital object identifier, or DOI, a unique identifier for each scientific publication.
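
To make this concrete, the metadata attached to a DOI can be inspected through Crossref’s public REST API. The snippet below is a minimal sketch in Python (assuming the widely used requests library; the DOI shown is a placeholder, not one from our study): it fetches a work’s record and prints its title, how many references the publisher deposited as metadata, and how many citations Crossref has recorded for it.

```python
import requests

def crossref_work(doi: str) -> dict:
    """Fetch the metadata record that Crossref stores for a given DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    return resp.json()["message"]

# Placeholder DOI, for illustration only.
work = crossref_work("10.1000/example-doi")
print("Title:", (work.get("title") or ["<none>"])[0])
# 'reference' is the list of references deposited by the publisher (may be absent).
print("References deposited:", len(work.get("reference", [])))
# 'is-referenced-by-count' is the number of citations Crossref has recorded for this work.
print("Times cited:", work.get("is-referenced-by-count", 0))
```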

References in a scientific publication allow authors to justify methodological choices or to present the results of past studies, thus highlighting the iterative and collaborative nature of science.

However, we discovered by chance that some unscrupulous actors have added extra references, invisible in the text but present in the articles’ metadata, when submitting them to scientific databases. The result? The citation counts of certain researchers or journals exploded, even though these references were never cited by the authors in their articles.

Lucky find

The investigation began when Guillaume Cabanac, a professor at the University of Toulouse, wrote a post on PubPeer, a site dedicated to post-publication peer review, where scientists discuss and analyze publications. In the post, he explained that he had noticed an inconsistency: an article in a journal published by Hindawi, which he suspected was fraudulent because it contained awkward sentences, had many more citations than downloads, which is highly unusual.

The article caught the attention of several sleuths who are now the authors of the JASIST paper. We searched scientific search engines and databases for articles citing the original article. Google Scholar found none, but Crossref and Dimensions found references. The difference? Google Scholar likely relies primarily on the main text of the article to extract the references that appear in the bibliography section, while Crossref and Dimensions use metadata provided by the publishers.

A new type of fraud

To understand the extent of the manipulation, we examined three scientific journals published by the Technoscience Academy, the publisher responsible for the articles containing questionable citations.

Our investigation took place in three stages:

  1. We listed the references explicitly present in the HTML or PDF versions of an article.
  2. We compared these lists with the metadata recorded by Crossref, discovering additional references added in the metadata but not appearing in the articles (a sketch of this comparison follows the list).
  3. We checked Dimensions, a bibliometric platform that uses Crossref as a metadata source, and found further inconsistencies.
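
As a minimal sketch of steps 1 and 2, assuming Python with the requests library and that the references visible in an article’s bibliography have already been extracted as a list of DOIs (the extraction step and the DOIs below are hypothetical), one can diff the visible reference list against the references deposited in Crossref’s metadata:

```python
import requests

def crossref_reference_dois(doi: str) -> set[str]:
    """DOIs of the references deposited in Crossref metadata for a given article."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    refs = resp.json()["message"].get("reference", [])
    return {r["DOI"].lower() for r in refs if "DOI" in r}

def compare_references(article_doi: str, visible_dois: list[str]) -> None:
    """Report references found only in the metadata (hidden) or only in the text (lost)."""
    metadata = crossref_reference_dois(article_doi)
    visible = {d.lower() for d in visible_dois}
    print("Hidden references (metadata only):", sorted(metadata - visible))
    print("Lost references (text only):", sorted(visible - metadata))

# Hypothetical DOIs, for illustration only.
compare_references("10.1000/article-under-scrutiny", ["10.1000/reference-cited-in-text"])
```

Any DOI that appears only on the metadata side is a candidate “hidden reference” of the kind described below.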

In the journals published by Technoscience Academy, at least 9% of the recorded references were “hidden references.” These additional references were present only in the metadata, which skewed citation counts and gave some authors an unfair advantage. Some legitimate references were also lost, i.e., they appeared in the articles but were missing from the metadata.

Furthermore, when analyzing sneaky citations, we found that they greatly benefited some researchers. For example, a single researcher associated with Technoscience Academy benefited from over 3,000 additional illegitimate citations. Some journals from the same publisher benefited from a few hundred additional sneaky citations.

We wanted our results to be validated externally, so we published our study as a preprint, informed Crossref and Dimensions of our findings, and provided them with a link to the preprint. Dimensions acknowledged the illegitimate citations and confirmed that its database reflects Crossref’s data. Crossref also confirmed the additional references to Retraction Watch and noted that this was the first time it had been informed of such an issue in its database. The publisher, based on Crossref’s investigation, took steps to address the issue.

Consequences and potential solutions

Why is this finding important? Citation counts greatly influence research funding, academic promotions, and institutional rankings. Citation manipulation can lead to unfair decisions based on false data. More worrisome, this finding raises questions about the integrity of systems for measuring scientific impact, a concern that researchers have highlighted for years. These systems can be manipulated to foster unhealthy competition among researchers, incentivizing them to cut corners to publish faster or get more citations.

To combat this practice, we propose several measures:

  • Rigorous verification of metadata by publishers and agencies like Crossref.
  • Independent audits to ensure data reliability.
  • Increased transparency in reference and citation management.

This study is the first, to our knowledge, to report metadata manipulation. It also addresses the impact this may have on the evaluation of researchers. The study highlights, once again, that the overreliance on metrics to evaluate researchers, their work, and their impact can be inherently flawed and misguided.

Such overreliance is likely to foster questionable research practices, including formulating hypotheses after the results are known (HARKing), salami-slicing, data manipulation, and plagiarism. It also hampers the transparency that is essential for more robust and effective research. While the problematic citation metadata and sneaky references have apparently been corrected, the corrections may have come too late, as is often the case with scientific corrections.

Provided by The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Citation: Researchers uncover new form of scientific fraud: discovery of ‘clandestine references’ (2024, July 10), retrieved July 10, 2024 from https://phys.org/news/2024-07-scientific-fraud-uncovering.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for informational purposes only.




