#llama_cpp search results
Slow progress is better than no progress. Studying llama.cpp's legacy quantization functions today.
RCE exploit in llama.cpp’s RPC server via heap-overflow retr0.blog/blog/llama-rpc… #cybersecurity #llama
Took only a couple of minutes to get llama.cpp up and running on the DGX Spark. gpt-oss-120b-mxfp4 at 41 t/s, using 73 GB of RAM plus maybe 50-ish GB of cache. Not entirely sure how RAM works on this thing yet. I combined and simplified the commands from a couple of tutorials that…
【Breaking 🎉】The Olmo3 model now runs on your own PC! It has been merged into #llama_cpp, another step forward for local AI! 🚀✨ Great news for anyone who wants high-performance AI that's easy and safe to use! ✨ The new Olmo3 model has been merged into the open-source #llama_cpp! 🎉…
Yesterday my changes to the LLaMA C++ file format were approved. Here's how folks in the community have been reacting to my work. github.com/ggerganov/llam…
@ollama finally supports Vulkan. llama.cpp has supported Vulkan for ages, and LM Studio has been usable with it for a while now. You guys are really slow.
【Blazing fast even on 8GB of VRAM!】MoE models running on your own PC?! Astonishing #llama_cpp performance! 😳 "You can't do generative AI without an expensive GPU…" That conventional wisdom may already be a thing of the past! Benchmark results show large MoE models running at surprising speeds even on a GPU with just 8GB of VRAM! ✨…
ollama alternatives: lmstudio > llama.cpp > exllamav2/v3 > vllm > sglang, among many others. literally anything is better than ollama lmao
Experimental Kimi K2 support landed in LM Studio's latest beta llama.cpp engine
The tool-calling issue also exists in GLM-4.6 GGUFs and isn't unique to our quants. We've already notified the GLM team, and there's an old PR in llama.cpp that addresses it: github.com/ggml-org/llama… We also fixed chat templates in GLM-4.6 (e.g. the 2nd prompt would error). Some…
Looks like someone added support for the Qwen Next architecture in llama.cpp. I tested it with Q8 and F16, and both went well. github.com/cturan/llama.c… Btw, official support is still in progress here: github.com/ggml-org/llama…
I made #RAGnrock, a #flutter app for macOS that uses #llama_cpp with #gemma to search the internet and generate reports.
If you really want to understand how LLMs work, try coding your own version of one from scratch. And that's exactly what you'll do in this course: build a Llama 4-like LLM from the bottom up. You'll build a tokenizer, learn about the attention mechanism, dive into Rotary…
Blog post updated【Local LLM setup】Running GPT-OSS at a practical level from the Mac terminal! With `llama.cpp` + GGUF quantized models + GPU (Metal), I got past the 48GB memory wall. From here I'll explore the path to practical research use! note.com/gz_note/n/n83b… #LocalLLM #llama_cpp #AIDevelopment
Remote code execution from a heap overflow in llama.cpp retr0.blog/blog/llama-rpc… #cybersecurity #llama
Local LLMs come down to memory design + optimization. With int4 quantization, an 8B model is about 4GB; FlashAttention 3 speeds up the attention mechanism by up to roughly 3x. Context length is also a cost factor (at 128k, an 8B fp16 model's context memory ≈ its weights). In practice: Llama.cpp/Ollama/Unsloth plus an API abstraction layer and a router. #Ollama #llama_cpp
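The arithmetic behind that post can be sketched in a few lines. The model shape below (32 layers, 8 KV heads, head dim 128) is an illustrative Llama-3-8B-like assumption, not something stated in the post:

```python
def weight_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# int4 quantization: an 8B model is roughly 4 GB of weights; fp16 is 16 GB
print(weight_gb(8, 4))   # → 4.0
print(weight_gb(8, 16))  # → 16.0

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx_len: int, bytes_per_elem: int = 2) -> float:
    """fp16 KV cache in GB: 2 (K and V) x layers x kv_heads x head_dim x ctx."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1e9

# Assumed 8B-class shape at a 128k context:
print(kv_cache_gb(32, 8, 128, 131072))  # → ~17.2, on the order of the 16 GB of fp16 weights
```

This is the "context memory ≈ weights" point: at 128k tokens the KV cache alone is comparable to the full fp16 weight file.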
Google is committed to ensuring broad compatibility for Gemma 3n by working with the developer community, supporting #HuggingFace, #llama_cpp, #Ollama, #MLX, and many other popular tools and platforms. Developers are invited to join the #Gemma3nImpactChallenge to build products that improve the world using its on-device, offline, multimodal capabilities, with $150K in prizes. (6/6)
Running local AI? Just launched: llama-optimus — automatic performance tuning for llama.cpp! Find your maximum tokens/s for prompt processing or generation in minutes. 🔗 GitHub: BrunoArsioli/llama-optimus 🔗 PyPI: llama-optimus Unleashing local AI #llama_cpp #Optuna #LocalAI
Want to run Llama 4 Scout cost-effectively? Our blog shows you how to leverage RTX 6000 Ada GPUs with llama.cpp as a more accessible alternative to the pricey H100. See how: blog.us.fixstars.com/?p=763 #llama_cpp #RTX6000Ada #TechTips
blog.us.fixstars.com: "Using llama.cpp to run Llama 4 Scout on RTX 6000 Ada" (Fixstars Corporation Tech Blog). "In a previous verification, we used a server equipped with an NVIDIA H100 GPU to run Llama 4 Scout. The H100 is expensive, and its implementation locations are limited due to power consumption and..."
🚀 Exciting news for AI developers! The merge of PR #11556 in llama.cpp unlocks tool calls for DeepSeek-R1, paving the way for robust local AI workflows like automated proofreading. Dive into the future of AI with OpenWebUI! #AI #DeepLearning #llama_cpp … ift.tt/rjPQs0R
#MistralSmall24B-Instruct is a really nice model to run locally for coding advice, summarizing, or creative writing. With a recent #llama_cpp on a #GeForce #RTX4090 at Q8, the 24GB of VRAM is nearly maxed out and I'm getting 7-9 tokens/s.
5/n: HuggingFace: huggingface.co/mistralai/Mist…
🚀 llama.cpp now supports Qwen2VL, a powerful multimodal model. This addition expands llama.cpp's capabilities in vision-language tasks, joining other supported models like LLaVA and BakLLaVA. #AI #MachineLearning #llama_cpp github.com/ggerganov/llam…
FYI GGUF is now following a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>-<ShardNum>-of-<ShardTotal>.gguf` github.com/ggerganov/ggml… #gguf #llm #llama_cpp #huggingface #llama #ai
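As a rough illustration, a filename following that convention can be parsed with a regex. This is a best-effort sketch of the pattern quoted in the post, treating the experts count and shard suffix as optional; the linked ggml doc is the authoritative grammar:

```python
import re

# Sketch of the convention <Model>-<Version>-<ExpertsCount>x<Parameters>-
# <EncodingScheme>-<ShardNum>-of-<ShardTotal>.gguf; which fields are optional
# is an assumption here, not taken from the spec.
GGUF_NAME = re.compile(
    r"^(?P<model>[A-Za-z0-9._]+)"
    r"-(?P<version>v\d+(?:\.\d+)*)"
    r"-(?:(?P<experts>\d+)x)?(?P<params>\d+(?:\.\d+)?[KMBT])"
    r"-(?P<encoding>[A-Za-z0-9_]+)"
    r"(?:-(?P<shard>\d{5})-of-(?P<shard_total>\d{5}))?"
    r"\.gguf$"
)

m = GGUF_NAME.match("Mixtral-v0.1-8x7B-Q4_0-00001-of-00003.gguf")
print(m.group("model"), m.group("experts"), m.group("params"),
      m.group("encoding"), m.group("shard"), m.group("shard_total"))
# → Mixtral 8 7B Q4_0 00001 00003
```

A non-MoE, non-sharded name like `Llama-v2-7B-Q8_0.gguf` also matches, with the `experts` and shard groups coming back as `None`.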
Thanks to Josh Ramer for contributing a debug helper script to #llama_cpp that helps with debugging a specific test in GDB. This will improve the maintainer experience and help keep the llama.cpp project stable! github.com/ggerganov/llam… github.com/josh-ramer #LLMs
I got tired of fighting with copy-and-pasting mangled webpages into #ChatGPT and #llama_cpp for discussion, so I put together a tiny website that converts HTML into Markdown. This has obvious uses for #Wikipedia, #GitHub and other services. htmltomarkdown.top
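The core of such a converter can be approximated with the standard library alone. This is a toy sketch covering only a handful of tags, not the code behind the linked site:

```python
from html.parser import HTMLParser

class MiniMarkdown(HTMLParser):
    """Toy HTML -> Markdown converter: headings, bold/italic, links, paragraphs."""
    def __init__(self):
        super().__init__()
        self.out = []
        self.href = ""

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.out.append("# ")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag == "a":
            self.out.append("[")
            self.href = dict(attrs).get("href", "")

    def handle_endtag(self, tag):
        if tag in ("h1", "p"):
            self.out.append("\n\n")       # block elements end a paragraph
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag == "a":
            self.out.append(f"]({self.href})")

    def handle_data(self, data):
        self.out.append(data)

def to_markdown(html: str) -> str:
    p = MiniMarkdown()
    p.feed(html)
    return "".join(p.out).strip()

print(to_markdown("<h1>Title</h1><p>See <a href='https://example.com'>this</a> <strong>now</strong>.</p>"))
# → # Title
#
#   See [this](https://example.com) **now**.
```

A real scraper-facing tool would also need to handle lists, tables, nested tags, and stripping of scripts and navigation chrome; this sketch only shows the shape of the approach.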
Just ran my own #ChatGPT instance on my laptop and it blew my mind! An open-source alternative to Stanford's #ALPACA Model, with 7B parameters running on my i5 processor without a GPU! Generated a romantic poem and a short story with all the feels. #LLAMA_cpp rocks! 🤯💻📚❤️
After a loooong battle, finally got my llama.cpp + CUDA setup fully working, including linking llama-cpp-python! 🚀 Debugging CMake, FindCUDAToolkit, and nested lib paths was a wild ride. But the GPU inference speed? Totally worth it! 💪 #CUDA #llama_cpp #GPU #AI #LLM #BuildFixes
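For anyone fighting the same battle, the usual route looks roughly like this. It is a hedged outline, not the author's exact commands: GGML_CUDA is the flag name in current llama.cpp docs (older checkouts used LLAMA_CUBLAS), and llama-cpp-python compiles its own vendored copy of llama.cpp, so the CUDA flag has to reach its CMake via CMAKE_ARGS:

```shell
# 1. Build llama.cpp itself with CUDA kernels
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# 2. Rebuild llama-cpp-python from source with the same flag;
#    --no-cache-dir avoids pip reusing a CPU-only wheel
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python \
    --force-reinstall --no-cache-dir
```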