AI在线

Kimi K2 High-Speed Version Released, Output Speed Increased to 40 Tokens per Second

The Kimi Open Platform has released a high-speed version of Kimi K2. The new model, named kimi-k2-turbo-preview, keeps the same parameter scale as the existing kimi-k2 but raises output speed from 10 tokens per second to 40, a significant gain in usage efficiency. The upgrade is aimed at improving user experience and serving application scenarios with stricter real-time requirements.
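
To put the quoted throughput in context, here is a minimal sketch of how one might measure streaming output speed against kimi-k2-turbo-preview. It assumes the platform exposes an OpenAI-compatible Chat Completions endpoint; the base URL, the KIMI_API_KEY environment variable name, and the chunks-as-tokens approximation are assumptions, not details from the announcement.

```python
# Minimal sketch: measure output speed for kimi-k2-turbo-preview over an
# OpenAI-compatible streaming API. Base URL and env var name are assumed,
# not confirmed by the announcement; check the platform docs before use.
import os
import time

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["KIMI_API_KEY"],     # assumed environment variable
    base_url="https://api.moonshot.cn/v1",  # assumed OpenAI-compatible endpoint
)

start = time.monotonic()
chunks = 0

# Stream the response and count content-bearing chunks as they arrive;
# each chunk carries roughly one token of generated text.
stream = client.chat.completions.create(
    model="kimi-k2-turbo-preview",          # model name from the announcement
    messages=[{"role": "user", "content": "Explain context caching in one sentence."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        chunks += 1

elapsed = time.monotonic() - start
print(f"~{chunks / elapsed:.1f} chunks/s (rough proxy for tokens per second)")
```

Running the same prompt against the regular kimi-k2 and the turbo model would make the 10 vs. 40 tokens-per-second difference directly visible.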

Related News

Kimi K2 High-Speed Version Released, Output Speed Increased to 40 Tokens per Second

The Kimi Open Platform has launched the Kimi K2 high-speed version. The new model, named kimi-k2-turbo-preview, has the same parameter scale as the existing kimi-k2, but output speed has increased from 10 tokens per second to 40, significantly improving usage efficiency. This upgrade aims to optimize user experience and serve application scenarios requiring higher real-time performance.
8/2/2025 4:35:53 PM
AI在线

Moonshot AI's Kimi Open Platform to Launch Context Caching Beta: Preset-Content QA Bots and Fixed Document Collection Queries

Moonshot AI has officially announced that the Kimi Open Platform's Context Caching feature will enter closed beta, at which point it will support long-context large models and enable context caching. [Image source: Kimi Open Platform official WeChat account, same below.] According to the introduction, Context Caching is an advanced feature provided by the Kimi Open Platform that reduces the cost of repeated requests for the same content by caching repeated token content (a conceptual sketch of this idea follows the listing). The company states that Context Caching can improve API response speed (or time to first token). In large-scale, highly repetitive prompt scenarios, the Context Caching feature brings…
6/19/2024 10:43:26 PM
归泷 (Intern)
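
The cost-saving mechanism described in the listing above is easiest to see in miniature. The sketch below is a conceptual illustration only, not the Kimi Open Platform's actual Context Caching API: it keys cached work on a hash of the shared prompt prefix (preset content or a fixed document set) so that repeated requests skip reprocessing it; every name in it is hypothetical.

```python
# Conceptual illustration of context caching: key cached work on a hash of
# the repeated prompt prefix so later requests reuse it. This is NOT the
# Kimi Open Platform's real API; every name here is hypothetical.
import hashlib

_prefix_cache: dict[str, str] = {}

def _key(prefix: str) -> str:
    """Hash the shared prefix (system prompt + fixed document set)."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

def answer(prefix: str, question: str) -> str:
    key = _key(prefix)
    if key not in _prefix_cache:
        # The first request pays the full cost of processing the long prefix;
        # a real serving stack would persist the model's KV state here.
        _prefix_cache[key] = f"<processed {len(prefix)} chars>"
    # Later requests reuse the cached prefix and pay only for the new
    # question tokens, which is what lowers cost and first-token latency.
    return f"answer to {question!r} using {_prefix_cache[key]}"

docs = "fixed document set " * 500   # stand-in for preset content
print(answer(docs, "What is context caching?"))        # cache miss: full cost
print(answer(docs, "Why is the first token faster?"))  # cache hit: prefix reused
```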