
Zack Li-Nexa AI

@zacklearner

Co-founder and CTO at Nexa AI, industry veteran from Google & Amazon, and Stanford alumnus. Committed to lifelong learning and advancing AI technology.

Thanks @simonw for mentioning our work! We continue to compress and prune gpt-oss so that it can fit on the latest iPhone. More exciting updates to come soon!

TIL you can run GPT-OSS 20B on a phone! This is on Snapdragon phones with 16GB or more of GPU-accessible memory - I didn't realize they had the same unified CPU-GPU memory trick that Apple Silicon has (The largest iPhone 17 still maxes out at 12GB, so not enough RAM to run…



Zack Li-Nexa AI reposted

Thrilled to speak and demo at @IBM #TechXchange in Orlando this week! @alanzhuly shared how we’re advancing the frontier of on-device AI — showcasing:
⚡ IBM Granite 4.0 running lightning-fast on @Qualcomm NPU — the first Day-0 model support in NPU history.
💻 Hyperlink, the…


Zack Li-Nexa AI reposted

Sam Altman recently said: “GPT-OSS has strong real-world performance comparable to o4-mini—and you can run it locally on your phone.” Many believed running a 20B-parameter model on mobile devices was still years away. At Nexa AI, we’ve built our foundation on deep on-device AI…


Thrilled to share that we just launched Day 0 support for another model: Qwen3-VL-30B-A3B-Instruct is now live on NexaSDK with full Apple Silicon GPU support. You can try it right now with:
nexa infer NexaAI/qwen3vl-30B-A3B-mlx
Thanks @JustinLin610 for partnership and @awnihannun for…

🚀 Day 0 Support — Qwen3-VL-30B-A3B-Instruct on NexaSDK
We’re excited to announce Day 0 support for Qwen3-VL-30B-A3B-Instruct, a breakthrough in multimodal intelligence, now running natively on NexaSDK. We’ve added full support for the MLX Engine on @Apple Silicon GPUs,…
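The command given in the post is all that's needed to try the model. A minimal sketch, assuming NexaSDK is already installed on an Apple Silicon Mac (the model identifier is taken verbatim from the post; that the CLI fetches weights on first run is our assumption):

```sh
# Run Qwen3-VL-30B-A3B-Instruct via the MLX backend on Apple Silicon GPUs.
# Model identifier comes from the announcement above.
nexa infer NexaAI/qwen3vl-30B-A3B-mlx
```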



🚀 Granite-4.0 day-0 support with NexaML
According to our partners, this is the first-ever NPU day-0 support for a new LLM — Granite 4.0 is live on Qualcomm NPUs with NexaSDK. With NexaSDK, you can run Granite 4.0 seamlessly across NPU, GPU, and CPU — and switch between them…


Zack Li-Nexa AI reposted

Running AI models locally on the @Snapdragon X Elite just went to a WHOLE NEW LEVEL thanks to @nexa_ai! I can't wait to be able to test this on the X2 Elite Extreme, which will be up to 78% faster. Check this out on the @Surface Laptop


Zack Li-Nexa AI reposted

Pyannote's brand-new model, speaker-diarization-community-1, now runs on Qualcomm NPU with NexaSDK (Day-0 Support). It identifies who speaks when — the core building block for transcription (meetings, healthcare, calls, intelligence) and media processing like dubbing. We are…


Zack Li-Nexa AI reposted

Screenshots and photos aren’t just memories — they’re the notes we try to keep: slides from a talk, posters from an event, a page from a book, a receipt, or a chat we want to revisit. The problem: they pile up. The insight gets lost. Hyperlink fixes this:
- Ask your photos in…


Zack Li-Nexa AI reposted

Nexa SDK now supports Intel NPU server inference! With nexa serve, you can now run real-time, private, local AI directly on Intel AI Boost NPU — all through an OpenAI-compatible API. This builds on our unified architecture for CPU, GPU, and NPU, ensuring seamless developer…
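Since the server speaks an OpenAI-compatible API, any OpenAI-style client can talk to it. A minimal sketch: the port, route, and model identifier below are assumptions based on the usual OpenAI conventions, not details confirmed in the post.

```sh
# Hypothetical request against a locally running `nexa serve` instance.
# Port 8080, the /v1/chat/completions route, and the model name are all assumed.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "NexaAI/Llama3.2-3B",
        "messages": [{"role": "user", "content": "Hello from the Intel NPU!"}]
      }'
```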


Zack Li-Nexa AI reposted

Nexa AI is proud to be featured at Qualcomm Snapdragon Summit 2025 🚀 In partnership with @Qualcomm, Nexa AI is delivering day-zero NPU support on @Snapdragon X2 Elite platforms. With one unified SDK, developers and OEMs no longer need to juggle fragmented tools or complex…


NexaSDK brings Intel NPU, GPU, and CPU into one stack for real-time, private, local AI.
Star us: github.com/NexaAI/nexa-sdk
Try it: sdk.nexa.ai/model/Llama3.2…

NexaSDK now runs on Intel NPU, GPU, and CPU with one unified stack. On Intel Lunar Lake (Arc 140V GPU + AI Boost NPU), Llama3.2-3B reaches:
- NPU: 31.5 tok/s
- GPU: 21.4 tok/s
- CPU: 12.4 tok/s
Real-time, private, local AI on Intel PCs — one SDK, one API, one installer.



Zack Li-Nexa AI reposted

We’re excited to announce that EmbeddingGemma-300M now runs on the Nexa SDK with the Qualcomm Hexagon NPU — powered by the NexaML Engine.
🔎 Model Overview
EmbeddingGemma is a 300M-parameter open embedding model from Google DeepMind, built on the Gemma/Gemini foundation. 🌍…


Now with CPU, GPU, and Snapdragon NPU support in one unified architecture—packed into a lightweight 60MB installer. No more juggling installers, APIs, or backend-specific builds.
⭐ If Nexa SDK helps you, give us a star:
GitHub: github.com/NexaAI/nexa-sdk
Blog:…

We’re excited to announce a major upgrade to the Nexa SDK—now delivering CPU, GPU, and NPU support in one unified architecture, with a lightweight installer of just 60MB on Snapdragon laptops. Until now, AI developers had to manage multiple installers, mismatched APIs, and…



Try the Nexa SDK server on Mac CPU & GPU; NPU support is coming soon!

We're thrilled to share that Nexa SDK supports starting a server for multimodal inference, with full support for both MLX and GGUF models using nexa serve. The demo below shows how the server works on a MacBook:
🔹 MLX model inference
We run NexaAI/gemma-3n-E4B-it-4bit-MLX locally on a…
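For reference, starting the server uses the command named in the post; everything beyond that (which port it binds, how a request selects the MLX vs. GGUF model) is not shown, so treat the comments below as assumptions:

```sh
# Start the local multimodal inference server (command from the post).
# It serves both MLX and GGUF models; request routing is assumed to follow
# the OpenAI-style API shown in the earlier Intel NPU sketch.
nexa serve
```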



Zack Li-Nexa AI reposted

On-device #AI is accelerating fast. With @nexa_ai, we’re tapping OmniNeural 4B and NexaML Engine directly into our Qualcomm Hexagon NPU, bringing scalable, multimodal intelligence to mobile, IoT & beyond. Learn more: bit.ly/3VLfSsq


Zack Li-Nexa AI reposted

This week in #AI
🔵 Qualcomm and @nexa_ai bring multimodal on-device AI to phones, cars, PCs, and more powered by Qualcomm Hexagon NPU: bit.ly/3VLfSsq
🔵 @TheRegister spoke with Qualcomm VP Upendra Kulkarni about how #SnapdragonXSeries is driving a shift in personal…


Nexa AI's Hyperlink product turns local AI models into real productivity tools—pick from Hugging Face, point them at your folders, and get insights in each model’s unique voice. Check the video below: Qwen3-1.7B for speed + clarity, and GPT-OSS for deep, rigorous reasoning.

Hyperlink is the easiest way to make local AI models actually useful on your computer.
→ Pick a model easily from @huggingface
→ Let it access your local folders
→ Get insights flavored by each model’s unique “personality”
We’ve been loving: @Qwen3-1.7B — speed + clarity,…



🚀 Nexa SDK now lets you host a local multimodal AI inference server — right on your device.
🔹 Ecosystem support
• GGUF — compact, quantized for efficient local inference
• MLX — lightweight, optimized for Apple Silicon
🔹 Platform support
• CPU & GPU — run GGUF + MLX models…

🚀 We’re excited to announce that Nexa SDK now enables you to host a local server for multimodal AI inference — directly on-device, with full support for CPU, GPU, and @Qualcomm NPU. We support two of the most important open-source model ecosystems:
- GGUF models — compact,…
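To make the two ecosystems concrete, here is a hedged sketch using the nexa infer command seen earlier in the feed. The MLX identifier is taken from the Qwen3-VL post above; the GGUF identifier is a made-up placeholder for illustration only:

```sh
# MLX build: optimized for Apple Silicon (identifier from the Day-0 post above).
nexa infer NexaAI/qwen3vl-30B-A3B-mlx

# GGUF build: compact, quantized, runs across CPU/GPU/NPU.
# "NexaAI/example-model-GGUF" is a hypothetical name, not a real repo.
nexa infer NexaAI/example-model-GGUF
```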



🚀 Excited to share that Nexa AI’s OmniNeural model and NexaML Engine have been officially featured by Qualcomm on their blog and social channels!
1. OmniNeural-4B — the world’s first truly NPU-native multimodal large model, enabling AI agents to run directly on-device without…

On-device #AI is accelerating fast. With @nexa_ai, we’re tapping OmniNeural 4B and NexaML Engine directly into our Qualcomm Hexagon NPU, bringing scalable, multimodal intelligence to mobile, IoT & beyond. Learn more: bit.ly/3VLfSsq


