
Zack Li-Nexa AI

@zacklearner

Co-founder and CTO at Nexa AI, industry veteran from Google & Amazon, and Stanford alumnus. Committed to lifelong learning and advancing AI technology.

Thanks @simonw for mentioning our work! We continue to compress and prune gpt-oss so that it can fit on the latest iPhone. More exciting updates to come soon!

TIL you can run GPT-OSS 20B on a phone! This is on Snapdragon phones with 16GB or more of GPU-accessible memory - I didn't realize they had the same unified CPU-GPU memory trick that Apple Silicon has (The largest iPhone 17 still maxes out at 12GB, so not enough RAM to run…



Zack Li-Nexa AI reposted

Thrilled to speak and demo at @IBM #TechXchange in Orlando this week! @alanzhuly shared how we’re advancing the frontier of on-device AI — showcasing:
⚡ IBM Granite 4.0 running lightning-fast on @Qualcomm NPU — the first Day-0 model support in NPU history.
💻 Hyperlink, the…


Zack Li-Nexa AI reposted

Sam Altman recently said: “GPT-OSS has strong real-world performance comparable to o4-mini—and you can run it locally on your phone.” Many believed running a 20B-parameter model on mobile devices was still years away. At Nexa AI, we’ve built our foundation on deep on-device AI…


Thrilled to share that we just launched Day 0 support for another model: Qwen3-VL-30B-A3B-Instruct is now live on NexaSDK with full Apple Silicon GPU support. You can try it right now with:
nexa infer NexaAI/qwen3vl-30B-A3B-mlx
Thanks @JustinLin610 for partnership and @awnihannun for…

🚀 Day 0 Support — Qwen3-VL-30B-A3B-Instruct on NexaSDK
We’re excited to announce Day 0 support for Qwen3-VL-30B-A3B-Instruct, a breakthrough in multimodal intelligence, now running natively on NexaSDK. We’ve added full support for the MLX Engine on @Apple Silicon GPUs,…
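The command given in the post is all that's needed to try the model. A minimal sketch, assuming NexaSDK is already installed on an Apple Silicon Mac (the model identifier is taken verbatim from the post; that the CLI fetches weights on first run is our assumption):

```sh
# Run Qwen3-VL-30B-A3B-Instruct via the MLX backend on Apple Silicon GPUs.
# Model identifier comes from the announcement above.
nexa infer NexaAI/qwen3vl-30B-A3B-mlx
```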



🚀 Granite-4.0 day-0 support with NexaML
According to our partners, this is the first-ever NPU day-0 support for a new LLM — Granite 4.0 is live on Qualcomm NPUs with NexaSDK. With NexaSDK, you can run Granite 4.0 seamlessly across NPU, GPU, and CPU — and switch between them…


Zack Li-Nexa AI reposted

Running AI models locally on the @Snapdragon X Elite just went to a WHOLE NEW LEVEL thanks to @nexa_ai! I can't wait to be able to test this on the X2 Elite Extreme, which will be up to 78% faster. Check this out on the @Surface Laptop


Zack Li-Nexa AI reposted

Pyannote's brand-new model, speaker-diarization-community-1, now runs on Qualcomm NPU with NexaSDK (Day-0 Support). It identifies who speaks when — the core building block for transcription (meetings, healthcare, calls, intelligence) and media processing like dubbing. We are…


Zack Li-Nexa AI reposted

Screenshots and photos aren’t just memories — they’re the notes we try to keep: slides from a talk, posters from an event, a page from a book, a receipt, or a chat we want to revisit. The problem: they pile up. The insight gets lost. Hyperlink fixes this:
- Ask your photos in…


Zack Li-Nexa AI reposted

Nexa SDK now supports Intel NPU server inference! With nexa serve, you can now run real-time, private, local AI directly on Intel AI Boost NPU — all through an OpenAI-compatible API. This builds on our unified architecture for CPU, GPU, and NPU, ensuring seamless developer…
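Since the server speaks an OpenAI-compatible API, any OpenAI-style client can talk to it. A minimal sketch: the port, route, and model identifier below are assumptions based on the usual OpenAI conventions, not details confirmed in the post.

```sh
# Hypothetical request against a locally running `nexa serve` instance.
# Port 8080, the /v1/chat/completions route, and the model name are all assumed.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "NexaAI/Llama3.2-3B",
        "messages": [{"role": "user", "content": "Hello from the Intel NPU!"}]
      }'
```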


Zack Li-Nexa AI reposted

Nexa AI is proud to be featured at Qualcomm Snapdragon Summit 2025 🚀 In partnership with @Qualcomm, Nexa AI is delivering day-zero NPU support on @Snapdragon X2 Elite platforms. With one unified SDK, developers and OEMs no longer need to juggle fragmented tools or complex…


NexaSDK brings Intel NPU, GPU, and CPU into one stack for real-time, private, local AI.
Star us: github.com/NexaAI/nexa-sdk
Try it: sdk.nexa.ai/model/Llama3.2…

NexaSDK now runs on Intel NPU, GPU, and CPU with one unified stack. On Intel Lunar Lake (Arc 140V GPU + AI Boost NPU), Llama3.2-3B reaches:
- NPU: 31.5 tok/s
- GPU: 21.4 tok/s
- CPU: 12.4 tok/s
Real-time, private, local AI on Intel PCs — one SDK, one API, one installer.



Zack Li-Nexa AI reposted

We’re excited to announce that EmbeddingGemma-300M now runs on the Nexa SDK with the Qualcomm Hexagon NPU — powered by the NexaML Engine.
🔎 Model Overview
EmbeddingGemma is a 300M-parameter open embedding model from Google DeepMind, built on the Gemma/Gemini foundation. 🌍…


Now with CPU, GPU, and Snapdragon NPU support in one unified architecture—packed into a lightweight 60MB installer. No more juggling installers, APIs, or backend-specific builds.
⭐ If Nexa SDK helps you, give us a star:
GitHub: github.com/NexaAI/nexa-sdk
Blog:…

We’re excited to announce a major upgrade to the Nexa SDK—now delivering CPU, GPU, and NPU support in one unified architecture, with a lightweight installer of just 60MB on Snapdragon laptops. Until now, AI developers had to manage multiple installers, mismatched APIs, and…



Try the Nexa SDK server on Mac CPU & GPU; NPU support is coming soon!

We're thrilled to share that Nexa SDK supports starting a server for multimodal inference, with full support for both MLX and GGUF models using nexa serve. The demo below shows how the server works on a MacBook:
🔹 MLX model inference
We run NexaAI/gemma-3n-E4B-it-4bit-MLX locally on a…
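For reference, starting the server uses the command named in the post; everything beyond that (which port it binds, how a request selects the MLX vs. GGUF model) is not shown, so treat the comments below as assumptions:

```sh
# Start the local multimodal inference server (command from the post).
# It serves both MLX and GGUF models; request routing is assumed to follow
# the OpenAI-style API shown in the earlier Intel NPU sketch.
nexa serve
```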



Zack Li-Nexa AI reposted

On-device #AI is accelerating fast. With @nexa_ai, we’re tapping OmniNeural 4B and NexaML Engine directly into our Qualcomm Hexagon NPU, bringing scalable, multimodal intelligence to mobile, IoT & beyond. Learn more: bit.ly/3VLfSsq


Zack Li-Nexa AI reposted

This week in #AI
🔵 Qualcomm and @nexa_ai bring multimodal on-device AI to phones, cars, PCs, and more powered by Qualcomm Hexagon NPU: bit.ly/3VLfSsq
🔵 @TheRegister spoke with Qualcomm VP Upendra Kulkarni about how #SnapdragonXSeries is driving a shift in personal…


Nexa AI's Hyperlink product turns local AI models into real productivity tools—pick from Hugging Face, point them at your folders, and get insights in each model’s unique voice. Check the video below: Qwen3-1.7B for speed + clarity, and GPT-OSS for deep, rigorous reasoning.

Hyperlink is the easiest way to make local AI models actually useful on your computer.
→ Pick a model easily from @huggingface
→ Let it access your local folders
→ Get insights flavored by each model’s unique “personality”
We’ve been loving: @Qwen3-1.7B — speed + clarity,…



🚀 Nexa SDK now lets you host a local multimodal AI inference server — right on your device.
🔹 Ecosystem support
• GGUF — compact, quantized for efficient local inference
• MLX — lightweight, optimized for Apple Silicon
🔹 Platform support
• CPU & GPU — run GGUF + MLX models…

🚀 We’re excited to announce that Nexa SDK now enables you to host a local server for multimodal AI inference — directly on-device, with full support for CPU, GPU, and @Qualcomm NPU. We support two of the most important open-source model ecosystems:
- GGUF models — compact,…
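To make the two ecosystems concrete, here is a hedged sketch using the nexa infer command seen earlier in the feed. The MLX identifier is taken from the Qwen3-VL post above; the GGUF identifier is a made-up placeholder for illustration only:

```sh
# MLX build: optimized for Apple Silicon (identifier from the Day-0 post above).
nexa infer NexaAI/qwen3vl-30B-A3B-mlx

# GGUF build: compact, quantized, runs across CPU/GPU/NPU.
# "NexaAI/example-model-GGUF" is a hypothetical name, not a real repo.
nexa infer NexaAI/example-model-GGUF
```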



🚀 Excited to share that Nexa AI’s OmniNeural model and NexaML Engine have been officially featured by Qualcomm on their blog and social channels!
1. OmniNeural-4B — the world’s first truly NPU-native multimodal large model, enabling AI agents to run directly on-device without…

On-device #AI is accelerating fast. With @nexa_ai, we’re tapping OmniNeural 4B and NexaML Engine directly into our Qualcomm Hexagon NPU, bringing scalable, multimodal intelligence to mobile, IoT & beyond. Learn more: bit.ly/3VLfSsq


