OpenAI Starts Offering a Biology-Tuned LLM (arstechnica.com)

(Friday April 17, 2026 @05:00PM (BeauHD) from the AI-all-the-things dept.)


An anonymous reader quotes a report from Ars Technica:

> On Thursday, OpenAI [1]announced it had developed a large language model [2]specifically trained on common biology workflows. Called GPT-Rosalind after Rosalind Franklin, the model appears to differ from most science-focused models from major tech companies, which have generally taken a more generic approach that works for various fields. In a press briefing, Yunyun Wang, OpenAI's Life Sciences Product Lead, said the system was designed to tackle two major roadblocks faced by current biology researchers. One is the massive datasets created by decades of genome sequencing and protein biochemistry, which can be too much for any one researcher to take in. The second is that biology has many highly specialized subfields, each with its own techniques and jargon. So, for example, a geneticist who finds themselves working on a gene that's active in brain cells might struggle to understand the immense neurobiological literature.

> Wang said the company had taken an LLM and trained it on 50 of the most common biological workflows, as well as on how to access the major public databases of biological information. Further training has resulted in a system that can suggest likely biological pathways and prioritize potential drug targets. "We're connecting genotype to phenotype through known pathways and regulatory mechanisms, infer likely structural or functional properties of proteins, and really leveraging this mechanistic understanding," Wang said. To address LLMs' tendencies toward sycophancy and overenthusiasm, OpenAI says it has tuned the model to be more skeptical, so it's more likely to tell you when something is a bad drug target. There was a lot of talk about GPT-Rosalind's "reasoning" and "expert-level" abilities. We were told that the former was defined as being able to work through complex, multi-step processes, while the latter was derived from the model's performance on a handful of benchmarks.

Access to GPT-Rosalind is currently limited "due to concerns about the model's potential for harmful outputs if asked to do something like optimize a virus's infectivity," notes Ars. Only U.S.-based organizations can [3]request access at the moment.



[1] https://openai.com/index/introducing-gpt-rosalind/

[2] https://arstechnica.com/science/2026/04/openai-starts-offering-a-biology-tuned-llm/

[3] https://openai.com/form/life-sciences-access/



By 2030 this could be very bad and very good (Score:2)

by davidwr ( 791652 )

On the very good side, this will lower the cost and lead times for new drugs.

On the bad side, nation-states, terrorists, and even just Evil Agents Of Chaos[TM] who have access to tools like this and the knowledge to (ab)use them will be able to unleash biological chaos on the world.

Imagine if someone created a virus that infected everyone, spread rapidly, but was asymptomatic or had only common-cold-like-symptoms on everyone but their intended target, but it killed their target. The target could be an indi

Re: (Score:2)

by HiThere ( 15173 )

We can't do that yet, and may never be able to be that specific. Trying to do it, however, could be exceedingly dangerous.

N.B.: All bacteria and viruses have a very high mutation rate.

I have an idea (Score:2)

by CEC-P ( 10248912 )

"Hey Bio-GPT, how can we stop psychopathic, society-ruining lunatics like Sam Altman from being born? Is there some sort of test or genetic correction?"

Optimal Virus? (Score:2)

by backslashdot ( 95548 )

This is the same as curing cancer and virtually every disease, because if there were a way to get a "large" payload (say 10 kb of RNA or, ideally, 30 kb like the coronavirus) into every cell efficiently, you could cure cancer. With 10 kb it could be done with genius-level bioengineering skill. With 30 kb it's trivial.

Re: (Score:2)

by backslashdot ( 95548 )

I should point out that humanity's "best" tool today for doing this is the adenovirus capsid, but it has 3 major shortcomings: it only holds about 4 kb of code, it can't get into all cells efficiently (it does better than the alternatives, but not well enough), AND it can (practically) only be dosed once (if you dose it again after a week or two, the immune system destroys it).
