
Jiashuo Liu
@liujiashuo77
Research Scientist at ByteDance Seed | Advanced & Interesting LLM/Agent Evaluation. Opinions are my own.
We built FutureX, the world’s first live benchmark for real future prediction — politics, economy, culture, sports, etc. Among 23 AI agents, #Grok4 ranked #1 🏆 Elon didn’t lie. @elonmusk your model sees further 🚀🍀 LeaderBoard: futurex-ai.github.io

Agreed...
Anthropic CEO Dario Amodei on Open-Source AI Models. "I don't think open source works the same way in AI that it has worked in other areas. Primarily because with open source you can see the source code of the model. Here we can't see inside the model, it's often called open…
Grok has been underestimated for a long time, but clearly it's among the top-tier models now. We've investigated and analyzed Grok 4's search and reasoning patterns, which are really impressive! Looking forward to Grok 5 now :)
Introducing Grok 4 Fast, a multimodal reasoning model with a 2M context window that sets a new standard for cost-efficient intelligence. Available for free on grok.com, grok.x.com, iOS and Android apps, and OpenRouter. x.ai/news/grok-4-fa…
Week 2 Update of FutureX: - In our latest weekly leaderboard, surprisingly, MiroMind's open-source deep research agent plus GPT-5 got 1st place! Congrats @miromind_ai - ChatGPT-Agent slightly outperformed Grok4. Competition is tough! - Claude Opus 4.1 (submitted by independent…

Check this ⬇️ super cool pipeline!
FutureX Results I am now officially an AI researcher. Something interesting in the results: I beat on level 1 and 2 but lost on level 3 and 4. In this first week, I fed everything into a single context which likely reduced the amount of searches per query required to best…

Wow, thanks Elon! Yes, we think it's a measure of an agent's AGI! Once again, introducing our FutureX live benchmark: futurex-ai.github.io