Apache TVM

@ApacheTVM

Open deep learning compiler stack for CPUs, GPUs and specialized accelerators. Join us for the TVM and Deep Learning Compilation Conference http://tvmcon.org

Pinned

ICYMI, all of the sessions from #tvmcon are available for streaming! Catch up on the latest advances, case studies, and tutorials in #ML acceleration from the @ApacheTVM community. tvmcon.org


Apache TVM reposted

#MLSys2026 is inviting self-nominations for the External Review Committee (ERC)! If you want to contribute to the review process for the MLSys conference, nominate yourself and help shape this year's program. We especially welcome PhD students and early-career researchers!…


Apache TVM reposted

🧵Reflecting a bit after the @PyTorch conference. ML compilers are becoming "toolkits" rather than a monolithic piece. Their targets are also sub-modules that must interoperate with other pieces. This is THE biggest mindset difference from traditional compilers.


Apache TVM reposted

We are excited about an open ABI and FFI for ML Systems from @tqchenml. In our experience with vLLM, such an interop layer is definitely needed!

📢Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interoperate with each other. Ship one library across PyTorch, JAX, CuPy, etc., runnable across Python, C++, and Rust. tvm.apache.org/2025/10/21/tvm…
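
The interop claim above rests on zero-copy tensor exchange between frameworks. The sketch below is not the TVM FFI API itself; it uses the standard DLPack protocol (which PyTorch, JAX, and CuPy already implement) to illustrate the kind of framework-agnostic entry point such an ABI standardizes. The function scale_inplace is a made-up example.

import torch

def scale_inplace(dlpack_capsule, factor):
    # Hypothetical framework-agnostic kernel: it only sees a DLPack capsule,
    # not a torch/jax/cupy array type, so one entry point serves every framework.
    x = torch.utils.dlpack.from_dlpack(dlpack_capsule)  # zero-copy view
    x.mul_(factor)

t = torch.arange(4, dtype=torch.float32)
scale_inplace(torch.utils.dlpack.to_dlpack(t), 2.0)
print(t)  # tensor([0., 2., 4., 6.]) -- the caller's tensor was updated in place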


Apache TVM reposted

This is a solution that comes out of many of our early lessons building XGBoost; an open ABI foundation would definitely help advance the ecosystem together.

📢Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interoperate with each other. Ship one library across PyTorch, JAX, CuPy, etc., runnable across Python, C++, and Rust. tvm.apache.org/2025/10/21/tvm…


Apache TVM reposted

TVM FFI captures the core and foundational insights we’ve gained from years of ML systems research. Can't wait to see such an open ABI enable new possibilities across systems and platforms 🎉

📢Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interoperate with each other. Ship one library across PyTorch, JAX, CuPy, etc., runnable across Python, C++, and Rust. tvm.apache.org/2025/10/21/tvm…


Apache TVM reposted

Great work! This kind of interoperability will help unlock new cross-compiler optimizations to push kernel performance to the extreme.

📢Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interoperate with each other. Ship one library across PyTorch, JAX, CuPy, etc., runnable across Python, C++, and Rust. tvm.apache.org/2025/10/21/tvm…


Apache TVM reposted

📢Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interoperate with each other. Ship one library across PyTorch, JAX, CuPy, etc., runnable across Python, C++, and Rust. tvm.apache.org/2025/10/21/tvm…


Apache TVM reposted

🚀Excited to launch FlashInfer Bench. We believe AI has the potential to help build LLM systems. To accelerate that path, we need an open schema for critical workloads and an AI-driven virtuous circle. First-class integration with FlashInfer, SGLang and vLLM support👉

🤔 Can AI optimize the systems it runs on?

🚀 Introducing FlashInfer-Bench, a workflow that makes AI systems self-improving with agents:
- Standardized signature for LLM serving kernels
- Implement kernels with your preferred language
- Benchmark them against real-world serving…


Apache TVM reposted

🤔 Can AI optimize the systems it runs on?

🚀 Introducing FlashInfer-Bench, a workflow that makes AI systems self-improving with agents:
- Standardized signature for LLM serving kernels
- Implement kernels with your preferred language
- Benchmark them against real-world serving…
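
As a rough illustration of what a "standardized signature for LLM serving kernels" could enable, here is a hypothetical sketch (the KernelSignature schema, field names, and harness below are invented for illustration, not FlashInfer-Bench's actual format): once a kernel's interface is declared, any implementation that matches it can be benchmarked against the same workload.

from dataclasses import dataclass
import time
import torch

@dataclass
class KernelSignature:
    # Hypothetical descriptor; FlashInfer-Bench's real schema may differ.
    name: str
    input_shapes: dict
    dtype: torch.dtype

def benchmark(kernel_fn, sig, iters=50):
    # Build random inputs from the declared shapes and time the kernel.
    inputs = {k: torch.randn(*shape, dtype=sig.dtype) for k, shape in sig.input_shapes.items()}
    start = time.perf_counter()
    for _ in range(iters):
        kernel_fn(**inputs)
    return (time.perf_counter() - start) / iters

sig = KernelSignature("rmsnorm", {"x": (8, 4096), "weight": (4096,)}, torch.float32)
rmsnorm = lambda x, weight: x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + 1e-6) * weight
print(f"{sig.name}: {benchmark(rmsnorm, sig) * 1e6:.1f} us/iter")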


Apache TVM reposted

Check out how speculative decoding and #XGrammar can work together to get efficient and accurate structured outputs. github.com/NVIDIA/TensorR…
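
For intuition only (this is a toy sketch, not the TensorRT-LLM or XGrammar code linked above): the grammar supplies an allowed-next-token mask, and drafted tokens are checked against that mask during verification, so speculative decoding never commits a token that breaks the required structure. The grammar, draft, and verification functions below are stand-ins.

import random

def grammar_allowed(prefix):
    # Toy "grammar": comma-separated hex digits (stand-in for a compiled grammar mask).
    allowed = set("0123456789abcdef")
    if prefix and prefix[-1] != ",":
        allowed.add(",")
    return allowed

def draft(prefix, n):
    # Stand-in for a cheap draft model: proposes tokens that may violate the grammar.
    return [random.choice("0123456789abcdef,") for _ in range(n)]

def target_accepts(prefix, tok):
    # Stand-in for target-model verification of a drafted token.
    return True

prefix = ""
for _ in range(6):
    for tok in draft(prefix, 4):
        if tok not in grammar_allowed(prefix) or not target_accepts(prefix, tok):
            break  # reject this token and the rest of the draft
        prefix += tok
print(prefix)  # always grammar-valid, e.g. "3f,a1,7"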


Apache TVM reposted

The new semester is here at CMU. Excited to co-teach with @Tim_Dettmers and offer our fun course again: "Build Your Mini-PyTorch (needle) from scratch, then build neural networks on top" (Deep Learning Systems). Check out dlsyscourse.org to learn more.


Apache TVM reposted

We’re excited to announce that XGrammar has partnered with Outlines! 🎉
XGrammar is now the grammar backend powering Outlines, enabling structured LLM generation with higher speed.

Check out Outlines — an amazing library for LLM structured text generation! 🚀
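
For context, a minimal structured-generation sketch with Outlines looks roughly like the snippet below (using the classic outlines.generate API; exact usage varies across Outlines versions, the model name is only an example, and the XGrammar backend is selected internally rather than configured here).

from pydantic import BaseModel
import outlines

class Person(BaseModel):
    name: str
    age: int

# Example model choice; any transformers-compatible model should work.
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, Person)  # grammar-constrained decoding
person = generator("Extract the person: Ada Lovelace, 36 years old.")
print(person)  # Person(name='Ada Lovelace', age=36)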


Apache TVM reposted

🚀Excited to share the #MLSys Call for Papers! For the first time, we’re also welcoming submissions to the Industrial Track.
Research and industrial track deadline: Oct 30, 2025
Reviews available: Jan 12, 2026
Author responses: Jan 16, 2026
Notifications: Jan 25, 2026…

Calling industry researchers: MLSys 2026 launches its first Industrial Track! 🚀
We're excited to announce the inaugural Call for Industrial Track Papers at MLSys 2026! 🎉
👉 mlsys.org/Conferences/20…
This is a unique opportunity for industry researchers and practitioners to…



Apache TVM reposted

MLSys infrastructure (compilers, inference engines, runtimes, GPU acceleration, and more) is at the heart of the AI revolution today, and AI has the potential to empower the systems revolution itself. #MLSys2026 launches its inaugural industry track; consider submitting your paper!

Calling industry researchers: MLSys 2026 launches its first Industrial Track! 🚀
We're excited to announce the inaugural Call for Industrial Track Papers at MLSys 2026! 🎉
👉 mlsys.org/Conferences/20…
This is a unique opportunity for industry researchers and practitioners to…



Apache TVM reposted

Calling industry researchers: MLSys 2026 launches its first Industrial Track! 🚀
We're excited to announce the inaugural Call for Industrial Track Papers at MLSys 2026! 🎉
👉 mlsys.org/Conferences/20…
This is a unique opportunity for industry researchers and practitioners to…


Apache TVM reposted

#MLSys2026 will be led by the general chair @luisceze and PC chairs @JiaZhihao and @achowdhery. The conference will be held in Bellevue on Seattle's east side. Consider submitting and bringing your latest works in AI and systems—more details at mlsys.org.

📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at mlsys.org. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to…


Apache TVM reposted

📢Exciting updates from #MLSys2025! All session recordings are now available and free to watch at mlsys.org. We’re also thrilled to announce that #MLSys2026 will be held in Seattle next May—submissions open next month with a deadline of Oct 30. We look forward to…


Apache TVM reposted

🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware efficiency. What excites me most is realizing that, beyond optimizing existing models, we can discover better model architectures by embracing system-level…

🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation.
🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%.
🌐 Website: multiverse4fm.github.io
🧵 1/n



Apache TVM reposted

Check out the technical deep dive on FlashInfer.

🔍 Our Deep Dive Blog Covering our Winning MLSys Paper on FlashInfer Is now live ➡️ nvda.ws/3ZA1Hca
Accelerate LLM inference with FlashInfer—NVIDIA’s high-performance, JIT-compiled library built for ultra-efficient transformer inference on GPUs.
Go under the hood with…
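
A minimal sketch of calling one of FlashInfer's attention kernels is shown below, assuming a CUDA GPU and the flashinfer Python package; it uses the single-request decode entry point, but check the FlashInfer docs for the current signatures and tensor layouts.

import torch
import flashinfer

num_qo_heads, num_kv_heads, head_dim, kv_len = 32, 8, 128, 4096
q = torch.randn(num_qo_heads, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device="cuda")

# Grouped-query decode attention for a single request (JIT-compiled on first call).
out = flashinfer.single_decode_with_kv_cache(q, k, v)
print(out.shape)  # (num_qo_heads, head_dim)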


Apache TVM reposted

Say hello to Multiverse — the Everything Everywhere All At Once of generative modeling.
💥 Lossless, adaptive, and gloriously parallel
🌀 Now open-sourced: multiverse4fm.github.io
I was amazed how easily we could extract the intrinsic parallelism of even SOTA autoregressive…

🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation.
🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%.
🌐 Website: multiverse4fm.github.io
🧵 1/n


