AI在线

Startup backed by former Google CEO releases 24-billion-parameter chemical reasoning model with accuracy surpassing multiple leading models

Research on large models continues to advance, particularly in improving reasoning capabilities. FutureHouse, a startup funded by former Google CEO Eric Schmidt, recently open-sourced ether0, a 24-billion-parameter reasoning model for chemistry tasks. The model shows strong chemistry capabilities without additional domain-specific pre-training. It gets its results through post-training techniques and needs significantly less data than traditional domain-specific models.

Reasoning models can be applied to more than simple multiple-choice tests, and the FutureHouse team aims to show this with ether0 while promoting in-depth research in scientific reasoning. To build the model, the research team extracted chemical experiment data from numerous academic papers, tracked molecular characteristics such as solubility and odor, and converted this data into verifiable scientific questions.
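
To make "verifiable scientific questions" concrete, here is a minimal Python sketch of one way a measured property could be turned into a question with an automatic checker. It is an illustration only, not FutureHouse's pipeline: the names `VerifiableQuestion` and `solubility_question`, the tolerance, and the compound and value used below are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VerifiableQuestion:
    """A prompt paired with a function that checks a free-text answer."""
    prompt: str
    check: Callable[[str], bool]


def solubility_question(name: str, measured_log_s: float,
                        tolerance: float = 0.5) -> VerifiableQuestion:
    """Build a question whose answer is checked against a measured value.

    The tolerance is an assumption chosen for illustration.
    """
    prompt = f"Estimate the aqueous solubility (log S, mol/L) of {name}."

    def check(answer: str) -> bool:
        # Answers that are not numbers are simply marked wrong.
        try:
            value = float(answer.strip())
        except ValueError:
            return False
        return abs(value - measured_log_s) <= tolerance

    return VerifiableQuestion(prompt, check)


# Placeholder record: "compound X" and -1.0 are made up, not real data.
question = solubility_question("compound X", measured_log_s=-1.0)
```

Because the checker is a plain function, the same question can be used to score any model's open-ended answer without a human grader.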

ether0 is based on Mistral-Small-24B and was trained with reinforcement learning on 640,730 chemistry problems grounded in experimental data. The problems cover 18 tasks, including synthetic feasibility, blood-brain barrier permeability, and odor analysis. To improve the model's performance, the research team introduced techniques such as reasoning behavior distillation and dynamic curriculum learning.
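
The article does not describe how dynamic curriculum learning works in ether0. As a rough illustration of the general idea only, the Python sketch below samples training problems in proportion to how informative they currently are, assuming that problems the model always or never solves carry little learning signal. The function `pick_batch`, the weighting rule, and the success rates are hypothetical.

```python
import random


def pick_batch(success_rates: dict[str, float], batch_size: int,
               rng: random.Random) -> list[str]:
    """Sample problem ids, favoring those the model solves only sometimes.

    Weight p * (1 - p) peaks at a 50% success rate and is zero for
    problems that are always solved (p = 1) or never solved (p = 0).
    """
    ids = list(success_rates)
    weights = [success_rates[i] * (1.0 - success_rates[i]) for i in ids]
    if sum(weights) == 0:
        # No problem is informative yet: fall back to uniform sampling.
        return rng.sample(ids, k=min(batch_size, len(ids)))
    return rng.choices(ids, weights=weights, k=batch_size)


# Hypothetical running success rates for three problems.
rates = {"easy": 1.0, "hard": 0.0, "medium": 0.5}
batch = pick_batch(rates, batch_size=5, rng=random.Random(0))
```

The curriculum is "dynamic" in this sketch because the success rates would be updated as training proceeds, so the set of favored problems shifts over time.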

For evaluation, ether0 was compared with general-purpose large language models (such as Claude and o1) and specialized chemistry models (such as ChemDFM and TxGemma). ether0 achieved the highest accuracy in the open-answer (OA) category and was also strongly competitive on multiple-choice questions (MCQ). On some tasks, its accuracy was more than double that of competitors.

ether0 also has a significant advantage in training cost. Traditional non-reasoning models require over 50 times more data to achieve similar reaction-prediction accuracy. Although ether0's results cannot yet be cross-checked against other models or human performance on independent benchmarks, it can reason effectively about molecular structures it was not trained on.

In summary, ether0 can understand natural language questions, reason through natural language, and generate molecular structures, especially excelling in drug-like molecule design. Despite still being in the prototype stage, it has laid a solid foundation for building general scientific reasoning models in the future.

Key Takeaways:  

🌟 Ether0 is a 24-billion-parameter open-source chemical reasoning model from FutureHouse.  

📈 The model outperforms leading models like GPT-4.1 and DeepSeek-R1 in accuracy across multiple tasks.  

💰 Training ether0 requires significantly less data compared to traditional non-reasoning models.
