OpenAI releases GPT-Rosalind, biology-focused LLM trained on 50 common research workflows
OpenAI has released GPT-Rosalind, a large language model trained specifically on 50 common biology workflows and major biological databases. Unlike broader science-focused models from competitors, GPT-Rosalind targets specialized biology tasks including pathway analysis, drug target prioritization, and cross-disciplinary research navigation.
OpenAI has released GPT-Rosalind, a large language model trained specifically on biology research workflows, marking a departure from the generalized science models offered by other major AI companies.
Named after Rosalind Franklin, the model has been trained on 50 of the most common biological workflows and programmed to access major public biological databases, according to Yunyun Wang, OpenAI's Life Sciences Product Lead.
Core capabilities
The model is designed to address two specific research bottlenecks: the massive scale of genomic and protein biochemistry datasets accumulated over decades, and the highly specialized jargon across biology subfields that creates barriers for cross-disciplinary work.
According to OpenAI, GPT-Rosalind can connect genotype to phenotype through known pathways, infer structural or functional properties of proteins, suggest biological pathways, and prioritize potential drug targets. The company claims the model has been tuned to be more skeptical than typical LLMs, making it more likely to flag poor drug targets than to default to agreement.
OpenAI states the model demonstrates "reasoning" capabilities, defined as working through complex multi-step processes, and "expert-level" performance on unspecified biology benchmarks. Specific benchmark scores were not disclosed.
Access restrictions
Access is currently restricted to US-based entities through OpenAI's trusted access deployment program due to concerns about potential misuse, such as optimizing virus infectivity. A more limited Life Sciences Research Plugin will be made generally available, though timing was not specified.
Pricing details were not disclosed.
Unknowns
OpenAI did not address whether the model has solved the hallucination problem that affects other LLMs, particularly when explaining reasoning steps. The company also did not specify which biology benchmarks were used to evaluate "expert-level" performance, the model's parameter count, context window size, or training data cutoff date.
What this means
GPT-Rosalind represents a bet on vertical specialization over horizontal generalization in scientific AI models. While competitors like Google and Anthropic have released broad science-focused models, OpenAI's biology-specific approach could prove more effective for specialized research tasks, or it could sacrifice flexibility for marginal accuracy gains. The proof will come from real-world usage reports, which should emerge once researchers gain access through the restricted deployment program. The access limitations suggest OpenAI is treating biosecurity concerns cautiously, though the specific threat model remains unclear given the computational biology tools already available to researchers.