
はち

@CurveWeb

Works at an IT company. Likes dogs and coffee. HuggingFace → https://huggingface.co/HachiML Note → https://note.com/hatti8 Posts about LLMs, synthetic data (合成データ), and agent systems.

Pinned

Aiming to reproduce OpenAI o1, I built a library that strengthens the reasoning ability of LLMs. It makes it easy to attach the MCTS algorithm to an LLM (fine-tuned on CoT data) for inference, and the usage stays as close to Transformers as possible, so it should be fairly easy to try. github.com/Hajime-Y/reaso…
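
A rough sketch of what MCTS-guided LLM reasoning like this looks like in practice is below. It is an illustration only: the repo link above is truncated, and `propose_steps` / `score_state` are hypothetical stand-ins for LLM calls, not the library's actual API.

```python
import math
import random

# Minimal MCTS-over-reasoning-steps sketch. `propose_steps` and `score_state`
# stand in for LLM calls (e.g. a CoT-tuned model generating candidate next
# steps and a value estimate); they are illustrative stubs, not a real API.

def propose_steps(state, k=3):
    # In practice: sample k candidate next reasoning steps from the LLM.
    return [f"{state} -> step{i}" for i in range(k)]

def score_state(state):
    # In practice: an LLM / value-model estimate of how promising the partial
    # reasoning chain is; here, a random placeholder in [0, 1].
    return random.random()

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def ucb(self, c=1.4):
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )

def mcts(root_state, iterations=50):
    root = Node(root_state)
    for _ in range(iterations):
        # Selection: walk down by UCB until an unexpanded leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: add LLM-proposed next steps as children.
        node.children = [Node(s, node) for s in propose_steps(node.state)]
        leaf = random.choice(node.children)
        # Evaluation + backpropagation.
        reward = score_state(leaf.state)
        while leaf:
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    # Return the most-visited first step as the preferred reasoning path.
    return max(root.children, key=lambda n: n.visits).state

print(mcts("question"))
```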


はち reposted

omarsar0: Unlocking the Power of Multi-Agent LLM for Reasoning

Designing and optimizing multi-agent systems is important.

This paper analyzes multi-agent systems where one meta-thinking agent plans and another reasoning agent executes, and identifies a lazy agent failure mode.

They find…
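
As a rough sketch of the meta-thinking / reasoning split described in the tweet: one agent drafts a plan, the other executes the steps, with a naive check for the "lazy agent" pattern where the executor just parrots the plan. The roles, prompts, and the stubbed `call_llm` below are illustrative assumptions, not the paper's setup.

```python
# Two-role sketch: a meta-thinking agent plans, a reasoning agent executes.
# `call_llm` is a placeholder for a real model call (any chat API would do).

def call_llm(role: str, prompt: str) -> str:
    # Stub: return canned text so the sketch runs without a model.
    return f"[{role} output for: {prompt[:40]}...]"

def solve(question: str) -> str:
    # 1) Meta-thinking agent: produce a short plan.
    plan = call_llm("meta-thinker", f"Break this problem into steps:\n{question}")
    steps = [s for s in plan.splitlines() if s.strip()] or [plan]

    # 2) Reasoning agent: execute each step, carrying intermediate results.
    scratchpad = []
    for step in steps:
        result = call_llm(
            "reasoner",
            f"Question: {question}\nSo far: {scratchpad}\nDo this step: {step}",
        )
        # Naive "lazy agent" check: the executor should add new content,
        # not just restate the plan step it was given.
        if result.strip() == step.strip():
            result = call_llm("reasoner", f"Actually work out: {step}")
        scratchpad.append(result)

    # 3) Final answer from the accumulated reasoning.
    return call_llm("reasoner", f"Given {scratchpad}, answer: {question}")

print(solve("What is 17 * 24?"))
```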

はち reposted

cwolferesearch: The memory folding mechanism proposed in this paper is great. It makes sense that agents should spend time explicitly compressing their memory into a semantic / organized format to avoid context explosion.

Worth mentioning though that memory compression / retention in agents…
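
The memory-folding idea (keep recent turns verbatim, compress older ones into an organized summary) can be sketched as follows; the threshold, the `summarize` stub, and the `FoldingMemory` structure are assumptions for illustration, not the paper's mechanism.

```python
# Sketch of periodic memory "folding": keep recent turns verbatim, compress
# older ones into a running summary so the prompt does not grow unbounded.

def summarize(text: str) -> str:
    # Stub for an LLM summarization call; here we just truncate.
    return text[:200] + ("…" if len(text) > 200 else "")

class FoldingMemory:
    def __init__(self, max_recent: int = 8):
        self.summary = ""          # folded, semantic memory
        self.recent = []           # verbatim recent turns
        self.max_recent = max_recent

    def add(self, turn: str) -> None:
        self.recent.append(turn)
        if len(self.recent) > self.max_recent:
            # Fold the oldest half of the recent turns into the summary.
            half = self.max_recent // 2
            old, self.recent = self.recent[:half], self.recent[half:]
            self.summary = summarize(self.summary + "\n" + "\n".join(old))

    def context(self) -> str:
        # What would be placed in the agent's prompt.
        return (
            f"Summary of earlier interaction:\n{self.summary}\n\n"
            "Recent turns:\n" + "\n".join(self.recent)
        )

memory = FoldingMemory()
for i in range(20):
    memory.add(f"turn {i}: observation and action …")
print(memory.context())
```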

はち reposted

Sumanth_077: Open-source, private, local alternative to Manus AI!

AgenticSeek is an autonomous agent that browses the web, writes code, and plans tasks, all on your device.

It runs entirely on your hardware, ensuring complete privacy and zero cloud dependency.

Key Features:

🔒 Local &…

はち reposted

askalphaxiv: cool idea from Meta

What if we augment CoT + RL’s token space thinking into a “latent space”?

This research proposes “The Free Transformer”, with a way to let LLMs make global decisions within a latent space (via VAE encoder) that could later simplify autoregressive sampling

はち reposted

Now in research preview: gpt-oss-safeguard

Two open-weight reasoning models built for safety classification. openai.com/index/introduc…


はち reposted

Introducing Chronos-2: a foundation model that enables forecasting with an arbitrary number of dimensions in a zero-shot manner, outperforming existing time series foundation models by a substantial margin: amzn.to/4nkHQqp


はち reposted

omarsar0: People are sleeping on Deep Agents.

Start using them now.

This is a fun paper showcasing how to put together advanced deep agents for enterprise use cases.

Uses the best techniques: task decomposition, planning, specialized subagents, MCP for NL2SQL, file analysis, and more.
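
A bare-bones sketch of the task-decomposition-plus-specialized-subagents pattern mentioned above; the subagent names, the keyword router, and the stubbed calls are illustrative assumptions, not the paper's implementation.

```python
# Decompose a task into subtasks and route each to a specialized subagent.
# All agent calls are stubs standing in for real LLM / tool invocations.

SUBAGENTS = {
    "sql": lambda task: f"[NL2SQL agent handled: {task}]",
    "files": lambda task: f"[file-analysis agent handled: {task}]",
    "general": lambda task: f"[general agent handled: {task}]",
}

def decompose(task: str) -> list[str]:
    # Stub planner: in practice an LLM would produce the subtask list.
    return [
        "query the sales database for last quarter",
        "analyze the uploaded CSV files",
        "summarize findings for the report",
    ]

def route(subtask: str) -> str:
    # Trivial keyword router; a real system might use an LLM classifier.
    if "database" in subtask or "query" in subtask:
        return "sql"
    if "file" in subtask or "CSV" in subtask:
        return "files"
    return "general"

def run(task: str) -> list[str]:
    results = []
    for subtask in decompose(task):
        agent = SUBAGENTS[route(subtask)]
        results.append(agent(subtask))
    return results

for line in run("Prepare the quarterly business review"):
    print(line)
```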

はち reposted

"「Attention Is All You Need」の共著者(Sakana AI の CTO)、「トランスフォーマーにはうんざり」と語る" という記事から: ・AI研究が「トランスフォーマー」という一つの技術に偏りすぎている ・AI研究の現状は危険なほど視野が狭くなっている…


はち reposted

Dr_Singularity: Just 2 days after Google, we have another big quantum computing breakthrough

IBM says one of its key quantum error correction algorithms now can run in real time on AMD FPGA (field programmable gate array) chips, no exotic hardware is needed.

This breakthrough could make…

はち reposted

rohanpaul_ai: Knowledge Flow shows LLMs can push past context limits by carrying a tiny editable knowledge list across attempts.

Hits 100% on AIME25 using text only, so test-time memory can unlock big gains.

This approach achieved 100% accuracy on AIME 2025 using only open-source models,…

yufan_zhuang: Can LLMs reason beyond context limits? 🤔

Introducing Knowledge Flow, a training-free method that helped gpt-oss-120b & Qwen3-235B achieve 100% on the AIME-25, no tools.

How? Like human deliberation, for LLMs.

📝 Blog: yufanzhuang.notion.site/knowledge-flow
💻 Code: github.com/EvanZhuang/kno…
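
The core loop (carry a small, editable knowledge list across attempts instead of one ever-growing context) can be sketched like this; the `attempt` / `update_knowledge` stubs, prompts, and stopping rule are assumptions for illustration, and the blog and code linked above describe the actual method.

```python
# Sketch of a test-time loop that carries a small, editable knowledge list
# across attempts instead of growing one long context. All LLM calls are stubs.

def attempt(question: str, knowledge: list[str]) -> tuple[str, bool]:
    # Stub for: ask the model to solve `question` given the knowledge list,
    # returning (answer, solved?). A real version would parse model output.
    answer = f"answer using {len(knowledge)} notes"
    return answer, len(knowledge) >= 3

def update_knowledge(question: str, answer: str, knowledge: list[str]) -> list[str]:
    # Stub for: ask the model to revise the list, adding useful insights from
    # this attempt and dropping entries that proved wrong or redundant.
    knowledge = knowledge + [f"insight from attempt {len(knowledge) + 1}"]
    return knowledge[-10:]  # keep the list tiny

def solve(question: str, max_attempts: int = 5) -> str:
    knowledge: list[str] = []
    answer = ""
    for _ in range(max_attempts):
        answer, solved = attempt(question, knowledge)
        if solved:
            break
        knowledge = update_knowledge(question, answer, knowledge)
    return answer

print(solve("AIME-style problem …"))
```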


はち reposted

Google's quantum advantage result is here! thequantuminsider.com/2025/10/22/goo… They measured a second-order OTOC in dynamics with scrambling and observed "constructive interference." Put simply, they observed a physical phenomenon on a quantum computer. Of course, objections have already been raised: nature.com/articles/d4158… The original paper: nature.com/articles/s4158…


はち reposted

bkhmsi: 🚀 Excited to share a major update to our “Mixture of Cognitive Reasoners” (MiCRo) paper!

We ask: What benefits can we unlock by designing language models whose inner structure mirrors the brain’s functional specialization?

More below 🧠👇
cognitive-reasoners.epfl.ch

はち reposted

rohanpaul_ai: ByteDance introduced a major advancement in long-context modeling with linearly scaling compute. 👏

Addresses a core challenge in AI—balancing efficiency and fidelity when processing extended sequences—by drawing inspiration from biological memory systems.

On 128k tests, FLOPs…

はち reposted

quantscience_: 🚨BREAKING: A new open-source multi-agent LLM trading framework in Python

It's called TradingAgents.

Here's what it does (and how to get it for FREE): 🧵

はち reposted

A compact but well-organized introductory blog post on RAGAS with Langfuse. gao-ai.com/post/ragas-rag…
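
For reference, RAGAS evaluation boils down to scoring question / answer / context records against metrics such as faithfulness. The snippet below is a minimal sketch assuming a ragas 0.1.x-style API with an LLM backend configured via environment variables; it does not show the Langfuse integration the blog covers.

```python
# Minimal RAGAS sketch (assumes a ragas 0.1.x-style API and that an LLM
# backend, e.g. an OpenAI key, is already configured in the environment).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

data = {
    "question": ["What does the warranty cover?"],
    "answer": ["It covers manufacturing defects for two years."],
    "contexts": [["The warranty covers manufacturing defects for 24 months."]],
}

# Score each record for faithfulness and answer relevancy.
result = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(result)  # per-metric scores for the evaluated records
```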


はち reposted

yoonholeee: The standard way to improve reasoning in LLMs is to train on long chains of thought.

But these traces are often brute-force and shallow.

Introducing RLAD, where models instead learn _reasoning abstractions_: concise textual strategies that guide structured exploration.
1/N🧵

はち reposted

Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. openai.com/index/gdpval-v0


はち reposted

ShinkaEvolve is out! It's a framework for automatic code improvement using LLMs. "A Sakana AI take on AlphaEvolve" is the short version, but it has many clever touches in both performance and usability, and it is very well made. I've already tried a few interesting uses of it myself, and I'm a big fan of this software. More on that another time…!

We’re excited to introduce ShinkaEvolve: An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency. Blog: sakana.ai/shinka-evolve/ Code: github.com/SakanaAI/Shink… Like AlphaEvolve and its variants, our framework leverages LLMs to…
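
The general AlphaEvolve / ShinkaEvolve-style loop (an LLM proposes program mutations, a fitness function scores them, the best candidate survives) can be sketched as below; `mutate_with_llm`, the fitness stub, and the loop parameters are illustrative assumptions, not the ShinkaEvolve API.

```python
# Generic LLM-driven program-evolution loop: propose mutated candidate
# programs, evaluate a fitness score, keep the best. Everything is stubbed.
import random

def mutate_with_llm(program: str) -> str:
    # Stub for: "here is the current program and its score, propose an
    # improved variant" sent to an LLM. Here we just append a random tweak.
    return program + f"  # tweak {random.randint(0, 999)}"

def fitness(program: str) -> float:
    # Stub for: run the candidate on a benchmark and return its score.
    return random.random()

def evolve(seed_program: str, generations: int = 10, population: int = 4) -> str:
    best, best_score = seed_program, fitness(seed_program)
    for _ in range(generations):
        candidates = [mutate_with_llm(best) for _ in range(population)]
        for cand in candidates:
            score = fitness(cand)
            if score > best_score:
                best, best_score = cand, score
    return best

print(evolve("def solve(x): return x"))
```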



はち reposted

New from Meta FAIR: Code World Model (CWM), a 32B-parameter research model designed to explore how world models can transform code generation and reasoning about code. We believe in advancing research in world modeling and are sharing CWM under a research license to help empower…


はち reposted

New on the Anthropic Engineering blog: writing effective tools for LLM agents. AI agents are only as powerful as the tools we give them. So how do we make those tools more effective? We share our best tips for developers: anthropic.com/engineering/wr…
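
On that theme, a tool definition in the Anthropic Messages API format is just a name, a description the model reads, and a JSON Schema for the inputs. The `get_order_status` tool below is a made-up example for illustration, not something taken from the blog post.

```python
# Example tool definition in the Anthropic Messages API format: a name,
# a description the model uses to decide when/how to call it, and a JSON
# Schema for its inputs. The tool itself (get_order_status) is hypothetical.
get_order_status_tool = {
    "name": "get_order_status",
    "description": (
        "Look up the current status of a customer order. "
        "Use this when the user asks where their order is or when it will arrive. "
        "Returns the status, carrier, and estimated delivery date."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "The order identifier, e.g. 'ORD-12345'.",
            },
        },
        "required": ["order_id"],
    },
}
```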

