AI spies questionable science journals, with some human help
- Reference: 1756635009
- News link: https://www.theregister.co.uk/2025/08/31/ai_spies_questionable_science_journals/
- Source link:
A trio of computer scientists from the University of Colorado Boulder, Syracuse University, and China's Eastern Institute of Technology (EIT) arrived at this figure after building a machine learning classifier to help identify "questionable" journals and then conducting a human review of the results – because AI falls short on its own.
A questionable journal is one that violates [1]best practices and has low editorial standards, existing mainly to coax academics into paying high fees to have their work appear in a publication that fails to provide expected editorial review.
As detailed in [3]a research paper published in Science Advances, "Estimating the predictability of questionable open-access journals," scientific journals prior to the 1990s tended to be closed, available only through subscriptions paid for by institutions.
The [6]open access movement changed that dynamic. It dates back to the 1990s, as the free software movement was gaining momentum, when researchers sought to expand the availability of academic research. One consequence of that transition, however, was that costs associated with peer-review and publication were shifted from subscribing organizations to authors.
"The open access movement was set out to fix this lack of accessibility by changing the payment model," the paper explains. "Open-access venues ask authors to pay directly rather than ask universities or libraries to subscribe, allowing scientists to retain their copyrights."
Open access scientific publishing is now widely accepted. For example, a 2022 [8]memorandum from the White House Office of Science and Technology Policy directed US agencies to come up with a plan by the end of 2025 to make taxpayer-supported research publicly available.
But the shift toward open access has led to the proliferation of dubious scientific publications. For more than a decade, researchers have been raising concerns about [9]predatory and [10]hijacked [PDF] journals.
The authors credit Jeffrey Beall, a librarian at the University of Colorado, with applying the term "predatory publishing" in 2009 to suspect journals that try to extract fees from authors without editorial review services. An archived version of [15]Beall's List of Potentially Predatory Journals and Publishers can still be found. The problem with a list-based approach is that scam journals can change their names and websites with ease.
In light of these issues, Daniel Acuña (UC Boulder), Han Zhuang (EIT), and Lizheng Liang (Syracuse) set out to see whether an AI model might be able to help separate legitimate publications from questionable ones using detectable characteristics (e.g., authors who frequently cite their own work).
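The paper's pipeline isn't reproduced here, but the general idea – turning journal-level signals such as self-citation rates into features for a supervised classifier trained on human-labeled examples – can be sketched roughly as follows. The feature set, the random forest choice, and the toy values are illustrative assumptions, not the authors' actual method.

# Illustrative sketch only: journal-level signals feeding a supervised
# classifier. Features, model choice, and values are assumptions, not the
# authors' published pipeline.
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-journal features: self-citation rate, share of authors
# with no prior publication record, average "review" turnaround in days.
X_train = [
    [0.62, 0.55, 7],    # high self-citation and a suspiciously fast review
    [0.08, 0.10, 94],   # pattern typical of a legitimate journal
]                        # in practice, thousands of human-labeled journals
y_train = [1, 0]         # 1 = questionable, 0 = legitimate

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Score an unseen journal; a probability above some threshold flags it for
# human review rather than automatic blacklisting.
candidate = [[0.45, 0.40, 12]]
print(clf.predict_proba(candidate)[:, 1])

Whatever the exact features, the key design choice is that the model's output is a ranking for scarce human reviewers to work through, not a verdict.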
"Science progresses through relying on the work of others," Acuña told The Register in an email. "Bad science is polluting the scientific landscape with unusable findings. Questionable journals publish almost anything and therefore the science they have is unreliable.
"What I hope to accomplish is to help get rid of this bad science by proactively helping flagging suspected journals so that professionals (who are scarce) can focus their efforts on what's most important."
Acuña is also the founder of [17]ReviewerZero AI, a service that employs AI to detect research integrity problems.
Winnowing down a data set of nearly 200,000 open access journals, the three computer scientists settled on a set of 15,191 of them.
They trained a classifier to identify dubious journals, and when they ran it on that set of 15,191, the model flagged 1,437 titles. But the model missed the mark about a quarter of the time, based on subsequent human review.
"About 1,092 are expected to be genuinely questionable, ~345 are false positives (24 percent of the flagged set), and ~1,782 problematic journals would remain undetected (false negatives)," the paper says.
"At a broader level, our technique can be adapted," said Acuña. "If we care a lot about false positives, we can flag more stringently." He pointed to a passage in the paper that says under a more stringent setting, only five false alarms out of 240 would be expected.
Acuña added that while many AI applications today aim for full automation, "for such delicate matters as the one we are examining here, the AI is not there yet, but it helps a lot."
The authors are not yet ready to name and shame the dubious journals – doing so could invite a legal challenge.
"We hope to collaborate with indexing services and assist reputable publishers who may be concerned about the degradation of their journals," said Acuña. "We could make it available in the near future to scientists before they submit to a journal." ®
[1] https://doaj.org/apply/transparency/
[3] https://www.science.org/doi/10.1126/sciadv.adt2792#abstract
[6] https://open-access.network/en/information/open-access-primers/history-of-the-open-access-movement
[8] https://web.archive.org/web/20250118023441/https://www.whitehouse.gov/ostp/news-updates/2022/08/25/ostp-issues-guidance-to-make-federally-funded-research-freely-available-without-delay/
[9] https://www.predatoryjournals.org/
[10] https://iris.uniroma1.it/bitstream/11573/964806/2/Dadkhah_Hijacked_2015.pdf
[15] https://beallslist.net/
[17] https://www.reviewerzero.ai/
Re: Sadly
Some years ago, although not an academic, I had a role in an education-adjacent field with a very modest public profile. I was regularly receiving e-mails inviting me not only to contribute to some very questionable publications, but also to become an "editor" - a role which appeared to be indistinguishable from salesman. If a publication track record is one of the conditions of your research funding (and you can't get cited until you've been published), then ways to get published will be found or offered for a price.
Re: Sadly
I was an academic (and later promoted to department manager), but I still publish as a co-author with students, primarily at national conferences.
The number of "offers" I receive by e-mail is mind-boggling, mostly using this template:
We were deeply impressed by your paper [title of a paper in a specific field, such as remote sensing] and our editors want to publish it in our journal [title of a journal in a totally unrelated field, such as Latin American Sociology]. The publishing fee is around 90 USD dollars.
Usually, they scan the conference pages and obtain the names of the papers and the authors' contact information, and then it is Happy SPAM Time.
Sometimes I get bored and scan the journal's editors' lists and e-mail them asking if they are comfortable knowing that they are the editors of a predatory journal without any criteria for the invitations. Only once did I get a reply from an irate editor telling me that "it is the industry's practice" and the journal is "top quality".
Checking the CVs of the editors and associate editors of those journals is also revealing. I found one "researcher" who is editor of three journals of similar quality, and has dozens of articles published in the same journals.
Re: Sadly
The institutions where the research is carried out could take matters into their own hands. Instead of sending a new paper to a journal, the author(s) would submit it to their library. The library then decides on its merit, probably by consulting the department head, or maybe the institution has a publishing committee. If it's OKed, it goes on the library's website. The journals are cut out completely.
It might cost the libraries something to do that, but it would be offset by the savings on journal subscriptions.
Re: Sadly
Unfortunately, in several cases, it is not about the publication of results, but the prestige of appearing in journals...
And with the lack of control of [1]some publishers , as soon as a truly open paper is published by an institution's library, someone is going to change the authors' names and republish it somewhere else.
As soon as institutions stop counting the number of papers as a metric of quality, "Publish or Perish" will perish (except for vanity publishing, that is an unkillable monster).
[1] https://lib.uliege.be/en/news/10000-fraudulent-articles-withdrawn-scientific-journals-2023#:~:text=Last%20December%2C%20an%20article%20in,ethics%2C%20if%20not%20outright%20fraudulent.
"A questionable journal is one that violates best practices"
With "best practice" being a link to a site that's supposed to tell me about best practices. Blazoned across the bottom of the site's front page is "This website uses cookies to ensure you get the best experience. Learn more about DOAJ’s privacy policy." with a button to click. It's not even an opt out option. Somehow I lack confidence in their notion of what might constitute best practices.
The great trick of the original Cookie Law wasn't its failure to stop tracking, but its wild success in training an entire generation of users to click 'Agree' on anything just to make the banner go away. This conditioned reflex was the crucial groundwork for GDPR, where this casual click is now laundered into "freely given consent." It gave companies the explicit legal basis they desperately needed to legitimise their data trade, moving it from a legal grey area into a defensible business practice. 'Opt out' would really be nothing but cope and the whole thing is just hokey cokey.
At the same time, AI crawlers are sucking up this slop to train their AI so they can be used to generate more slop... ad infinitum...
Sadly
I still think we're not in a much better place today when it comes to journals. Open Access was a noble goal but it seems like a pay to play rat race. Even "prestigious" journals are full of shoddy slop these days.