DataLearner

@DataLearnerAI

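Following data science, the tech industry, artificial intelligence, and every new technology that makes human life better. List of mainstream industry LLMs: https://datalearner.com/ai-models/pretrained-models Current state of China's open-source LLM ecosystem: https://datalearner.com/china-opensource-llm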

Hefei, China

datalearner.com/blog_list

Joined March 2023

796 posts · 282 followers · 644 following

Pinned

DataLearner

@DataLearnerAI

Jul 23

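Alibaba has open-sourced the coding LLM Qwen3-Coder-480B-A35B. This parameter scale has not appeared before in the Qwen3 series, and the model is not a hybrid thinking/non-thinking architecture: it supports non-thinking mode only. It looks newly trained, which makes it a very odd model. Across the Qwen series, whether MoE or dense, roughly 30B parameters used at inference seems to be what the Qwen models currently treat as the upper limit~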

DataLearner

@DataLearnerAI

18h

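I wrote a blog post introducing Anthropic's newly released Claude Skills. AI agents clearly need more than a strong model; they also need more engineering design. As a complement to MCP, Skills put more emphasis on local execution and on the importance of external tools, letting the AI agent focus on planning, tool use, task understanding, and so on. This pattern probably deserves more attention than MCP. datalearner.com/blog/105176067…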

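DataLearnerAI's tweet card. Anthropic has officially launched Claude Skills, a new feature intended to give general-purpose AI agents domain-specific expertise. Users create a skill folder containing a SKILL.md file to give Claude executable scripts, templates, and resources, automating specific tasks such as Excel processing and PPT generation. Unlike traditional prompts, Skills use structured loading and local sandboxed execution, balancing security and efficiency.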

Source: datalearner.com

We're launching Claude Agent Skills, a filesystem-based approach to extending Claude's capabilities. Progressive disclosure means agents load only relevant context. Bundle instructions, scripts, and resources in a folder. Claude discovers and executes what it needs.
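The "progressive disclosure" idea in the post above (keep only a short description of each skill in context, and read the full instructions when a skill is needed) can be illustrated with a minimal Python sketch. This is not Anthropic's implementation. The function names `discover_skills` and `load_skill`, the demo folder names, and the convention that the first non-empty line of SKILL.md acts as the summary are all assumptions made for illustration; the feed does not describe the real SKILL.md layout.

```python
from pathlib import Path
import tempfile


def discover_skills(root: Path) -> dict[str, str]:
    """Map each skill folder name to a one-line summary.

    Assumption: the first non-empty line of SKILL.md is the summary.
    Only that line is kept, so the index stays small.
    """
    summaries: dict[str, str] = {}
    for skill_file in sorted(root.glob("*/SKILL.md")):
        with skill_file.open(encoding="utf-8") as fh:
            summary = next((line.strip() for line in fh if line.strip()), "")
        summaries[skill_file.parent.name] = summary
    return summaries


def load_skill(root: Path, name: str) -> str:
    """Read the full instructions only once a skill has been selected."""
    return (root / name / "SKILL.md").read_text(encoding="utf-8")


def demo() -> tuple[dict[str, str], str]:
    """Build two hypothetical skill folders in a temp dir and query them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        bodies = {
            "excel": "Edit spreadsheets\nStep 1: open the workbook.",
            "slides": "Build decks\nStep 1: pick a template.",
        }
        for name, body in bodies.items():
            (root / name).mkdir()
            (root / name / "SKILL.md").write_text(body, encoding="utf-8")
        return discover_skills(root), load_skill(root, "excel")


summaries, full_text = demo()
print(summaries)
```

The split into two functions mirrors the two stages described in the post: a cheap discovery pass over every folder, followed by a full read of only the one skill that is relevant.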

DataLearner

@DataLearnerAI

Oct 15

Is OpenAI about to release another product at 0:00 on Thursday, Beijing time? 🤔

Karina Nguyen

@karinanguyen_

Oct 15

Tomorrow, 9am PST 👁️

DataLearner

@DataLearnerAI

Sep 23

Wow, Alibaba is releasing six things tonight: one product, two open-source models, and three APIs.

Junyang Lin

@JustinLin610

Sep 23

1 product, 2 oss, 3 apis. every one is not small.

DataLearner

@DataLearnerAI

Sep 22

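DeepSeek-V3.1 received a minor update that mitigates the mixing of Chinese and English and improves agent capabilities. Judging from the benchmark results, some scores improved noticeably.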

DataLearner

@DataLearnerAI

Sep 21

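Wow, the Qwen3 Omni omni-modal model is coming soon, with text, image, audio, and video input and output. There is a reasoning version, and it looks like it may be an MoE architecture.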

Junyang Lin

@JustinLin610

Sep 21

github.com/huggingface/tr…

JustinLin610's tweet card. Qwen3-Omni here! This PR introduces support for the upcoming Qwen3-Omni models, including Instruct and Thinking versions. As the next generation of the Qwen-Omni family, Qwen3-Omni brings new archi...

Adding support for Qwen3Omni by BakerBunker · Pull Request #41025 · huggingface/transformers

Source: github.com

DataLearner

@DataLearnerAI

Sep 9

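The upcoming Qwen3-Next is most likely Qwen3-Next-80B-A3B, an extremely sparse MoE model with 80 billion total parameters and only 3 billion activated per inference. Its training cost is less than 1/10 that of Qwen3-32B, and on long contexts beyond 32K its inference throughput is more than 10 times that of Qwen3-32B. It also outperforms Qwen3-32B on downstream tasks. If that holds, 32B models may be dropped in the future too~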

Zephyr

@zephyr_z9

Sep 9

The Qwen3-Next series represents our next-generation foundation models, optimized for extreme context length and large-scale parameter efficiency. The series introduces a suite of architectural innovations designed to maximize performance while minimizing computational cost:…
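To make "extremely sparse" concrete, the two parameter counts quoted in the DataLearner post (80 billion total, 3 billion activated per inference) can be turned into an activation ratio. The short Python sketch below uses only those two figures; the function name `active_fraction` is a hypothetical helper introduced for illustration.

```python
# Figures quoted in the post; everything else here is illustrative.
TOTAL_PARAMS_B = 80   # total parameters, in billions
ACTIVE_PARAMS_B = 3   # parameters activated per inference, in billions


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of a model's parameters that is activated per inference."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


fraction = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
print(f"{fraction:.2%} of parameters active per inference")
```

In other words, under these figures fewer than 4 in every 100 parameters take part in a given inference, which is what the post means by a very sparse MoE.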

DataLearner

@DataLearnerAI

Sep 9

It looks like Alibaba will open-source a new Qwen3 model soon. What is this qwen3-next? 🤔

Junyang Lin

@JustinLin610

Sep 9

github.com/huggingface/tr…

DataLearner

@DataLearnerAI

Sep 8

What big move is Qwen making this time? 🤔 What is the API that nobody expected?

Junyang Lin

@JustinLin610

Sep 8

tonight an api not in your expectation

DataLearner

@DataLearnerAI

Sep 6

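OpenRouter has added two very strong new models that are great at things like generating games. It is not yet clear which company they come from. The context window goes up to 2 million tokens, and in our tests they were fast. Some people think it could be Gemini 3, but we asked it to pick the model it most resembles based on its own characteristics, and it answered that it is most like Grok. Could it be Grok 4.2? 🤔🤔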

OpenRouter

@OpenRouterAI

Sep 6

Introducing Sonoma Alpha, two new stealth models 🥷 Context: 2 million tokens Price: Free

DataLearner

@DataLearnerAI

Sep 4

Is the Qwen3 family about to get its biggest and strongest model? What will it be~

Qwen

@Alibaba_Qwen

Sep 4

Ready to meet the biggest, brainiest guy in the Qwen3 family?

DataLearner

@DataLearnerAI

Aug 29

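Anthropic is testing free trials of extended thinking, Opus, integrations, projects, and other features in the Claude web app. This presumably lets non-subscribers try new features in limited amounts. Right now free Claude users cannot use anything other than Sonnet 4. This may only be a test, though, and may never actually ship. Or could it be that if you allow your data to go to Anthropic, it lets you use these features for a while? 🧐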

Tibor Blaho

@btibor91

Aug 29

The "Trials Freemium" experiment in the Claude web app seems to be rolled out to some users now, giving them free access to extended thinking, Opus, integrations, and projects (however: "This is a temporary trial project that may be removed in the future") h/t @nadzi_mouad

DataLearner

@DataLearnerAI

Aug 29

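Alibaba has open-sourced a brand-new agent, WebWatcher. It is a Deep Research-style agent whose main feature is using a multimodal LLM to extract data from images and strengthen its analysis. In several benchmarks, the 32B multimodal model combined with RAG Flow beats text-only models such as GPT-4o and Gemini. It is unclear how fast it is, though; image recognition may slow it down.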

Tongyi Lab

@Ali_TongyiLab

Aug 29

Thrilled to open-source WebWatcher: our vision-language deep research agent from @Alibaba_NLP! Available in 7B & 32B parameter scales for the community. Achieving SOTA on the toughest VQA benchmarks: • HLE-VL: 13.6% (vs GPT-4o's 9.8%) • BrowseComp-VL: 27.0% (2x GPT-4o!) •…