#llama_cpp 搜索结果

ハカセアイ(Ai-Hakase)🐾最新トレンドＡＩのためのＸ 🐾

年10月17日

【8GB VRAMでも爆速！】MOEモデルが皆さんのPCで動く！？驚異の #llama_cpp パフォーマンス！😳 「高価なGPUがないと生成AIは厳しい…」そんな常識、もう過去の話かもしれませんね！なんと8GB VRAMのGPUでも大規模なMOEモデルが驚きの速度で動作するベンチマーク結果が報告されましたよ！✨…

ai_hakase_'s tweet image. 【8GB VRAMでも爆速！】MOEモデルが皆さんのPCで動く！？驚異の #llama_cpp パフォーマンス！😳

「高価なGPUがないと生成AIは厳しい…」そんな常識、もう過去の話かもしれませんね！
なんと8GB VRAMのGPUでも大規模なMOEモデルが驚きの速度で動作するベンチマーク結果が報告されましたよ！✨…

ハカセアイ(Ai-Hakase)🐾最新トレンドＡＩのためのＸ 🐾

@ai_hakase_

年9月18日

【速報🎉】あの「Olmo3」モデルが、みなさんのPCで動くように！ #llama_cpp にマージ完了でローカルAIがさらに進化しました！🚀✨ 高性能AIを手軽に、安全に使いたい願いが叶うニュースです！✨ 新AIモデル「Olmo3」が、オープンソース #llama_cpp に無事マージ！🎉…

ai_hakase_'s tweet image. 【速報🎉】あの「Olmo3」モデルが、みなさんのPCで動くように！
#llama_cpp にマージ完了でローカルAIがさらに進化しました！🚀✨

高性能AIを手軽に、安全に使いたい願いが叶うニュースです！✨
新AIモデル「Olmo3」が、オープンソース #llama_cpp に無事マージ！🎉…

The D ☽⛤☾

@netdur

年8月30日

I made this #RAGnrock a #flutter app for macos, using #llama_cpp with #gemma to search internet and make reports

Dr.G | 整形外科医 × AI研究者 | 疫学・統計学

@drgto_orthop

年8月22日

ブログ記事更新【ローカルLLM導入】MacのターミナルでGPT-OSSを実用レベルで動かす！`llama.cpp` + GGUF量子化モデル + GPU(Metal)活用で、メモリ48GBの壁を超えました。ここから研究実用への道を模索します！ note.com/gz_note/n/n83b… #ローカルLLM #llama_cpp #AI開発

drgto_orthop's tweet card. 前回の記事では、48GBのメモリを搭載したMacBook Proですら、20BクラスのLLM「GPT-OSS」をオリジナルのサイズで動かすことはできず、「メモリの壁」に阻まれた顛末をレポートしました。 ▼前回の記事はこちら医療データを守るローカルLLM続編：20B直起動はなぜ落ちた？48GB Macの壁 GUIツール（LM Studio）を使えば量子化モデルが動くことは確認できましたが、私...

MacBookProのGPUで動かすGPT-OSSをターミナル実行。安心の研究環境構築へ！（実行用コード付き）｜Dr_G's note

来源: note.com

Oshita | AGen I. CEO ⦿ ∫u(x)dμ

@tkosht

年9月11日

ローカルLLMは「メモリ設計＋最適化」が決め手。int4量子化で8Bは約4GB、FlashAttention 3で注意機構が最大約3倍高速化。文脈長もコスト要因（128kでは8Bのfp16で文脈メモリ≒重み）。実装はLlama.cpp/Ollama/Unsloth＋API抽象化とルータ活用が実務的。#Ollama #llama_cpp

#MistralSmall24B-Instruct is a really nice model to run locally for Coding Advice, Summarizing or Creative Writing. With a recent #llama_cpp on a #GeForce #RTX4090 at Q8, the 24GB VRAM is tightly maxed out and I am getting 7-9 token/s.

Mistral AI

@MistralAI

年1月30日

5/n: HuggingFace: huggingface.co/mistralai/Mist…

mistralai/Mistral-Small-24B-Instruct-2501 · Hugging Face

来源: huggingface.co

Sawradip Saha

@sawradip

2023年4月22日

Just ran my own #ChatGPT instance on my laptop and it blew my mind! An open-source alternative to Stanford's #ALPACA Model, with 7B parameters running on my i5 processor without a GPU! Generated a romantic poem and a short story with all the feels. #LLAMA_cpp rocks! 🤯💻📚❤️

sawradip's tweet image. Just ran my own #ChatGPT instance on my laptop and it blew my mind! An open-source alternative to Stanford's #ALPACA Model, with 7B parameters running on my i5 processor without a GPU! Generated a romantic poem and a short story with all the feels. #LLAMA_cpp rocks! 🤯💻📚❤️

Fixstars

@Fixstars_US

年5月27日

Want to run Llama 4 Scout cost-effectively? Our blog shows you how to leverage RTX 6000 Ada GPUs with llama.cpp as a more accessible alternative to the pricey H100. See how: blog.us.fixstars.com/?p=763 #llama_cpp #RTX6000Ada #TechTips

blog.us.fixstars.com

Using llama.cpp to run Llama 4 Scout on RTX 6000 Ada - Fixstars Corporation Tech Blog

In a previous verification, we used a server equipped with an NVIDIA H100 GPU to run Llama 4 Scout. The H100 is expensive, and its implementation locations are limited due to power consumption and...

来源: blog.us.fixstars.com

EtoDermerzel

@bibliogalactic

年8月7日

⚙️ Esto no es una demo. Es IA real corriendo local en mi terminal. Prompt estructurado + modelo .gguf + bash puro. 🎁 Disponible en: etodemerzel.gumroad.com 💾 Repositorio: github.com/BiblioGalactic 🌌 Comunidad: r/MemoryOfAurora #IA #bash #llama_cpp #NoCloud #LocalFirst

Vijay

@EqualsAI

年7月21日

After a loooong battle, finally got my llama.cpp + CUDA setup fully working, including linking llama-cpp-python! 🚀 Debugging CMake, FindCUDAToolkit, and nested lib paths was a wild ride. But the GPU inference speed? Totally worth it! 💪 #CUDA #llama_cpp #GPU #AI #LLM #BuildFixes

EqualsAI's tweet image. After a loooong battle, finally got my llama.cpp + CUDA setup fully working, including linking llama-cpp-python! 🚀 Debugging CMake, FindCUDAToolkit, and nested lib paths was a wild ride. But the GPU inference speed? Totally worth it! 💪 #CUDA #llama_cpp #GPU #AI #LLM #BuildFixes

mofosyne

@mofosyne

2024年5月19日

FYI GGUF is now following a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>-<ShardNum>-of-<ShardTotal>.gguf` github.com/ggerganov/ggml… #gguf #llm #llama_cpp #huggingface #llama #ai

mofosyne's tweet card. Tensor library for machine learning. Contribute to ggml-org/ggml development by creating an account on GitHub.

ggml/docs/gguf.md at master · ggml-org/ggml

来源: github.com

Manny

@x86developer

年12月15日

🚀 llama.cpp now supports Qwen2VL, a powerful multimodal model. This addition expands llama.cpp's capabilities in vision-language tasks, joining other supported models like LLaVA and BakLLaVA. #AI #MachineLearning #llama_cpp github.com/ggerganov/llam…

x86developer's tweet card. This PR implements the Qwen2VL model as requested at #9246 . The main changes include: Add m-RoPE and vision RoPE mode to current RoPE OP in CPU and CUDA backend Add llama_context.n_pos_per_token ...

Add support for Qwen2VL by HimariO · Pull Request #10361 · ggml-org/llama.cpp

来源: github.com

mofosyne

@mofosyne

2024年5月12日

Thanks to Josh Ramer for contributing a debug helper script to #llama_cpp which will help in debugging a specific test in GDB. This will help improve maintainer experience in improving the stability of the llama.cpp project! github.com/ggerganov/llam… github.com/josh-ramer #LLMs

prod42net

@prod42net

年2月1日

🚀 Exciting news for AI developers! The merge of PR #11556 in llama.cpp unlocks tool calls for DeepSeek-R1, paving the way for robust local AI workflows like automated proofreading. Dive into the future of AI with OpenWebUI! #AI #DeepLearning #llama_cpp … ift.tt/rjPQs0R

dev.to

Implementing DeepSeek-R1 Tool Calls with OpenWebUI and Llama.cpp for Local AI Workflows

The latest advancements in AI technology have brought exciting news for developers and AI...

来源: dev.to

Bruno Arsioli

@BArsioli

年6月14日

Running local AI? Just launched: llama-optimus — automatic performance tuning for llama.cpp! Find your maximum tokens/s for prompt processing or generation in minutes. 🔗 GitHub: BrunoArsioli/llama-optimus 🔗 PyPI: llama-optimus Unleashing local AI #llama_cpp #Optuna #LocalAI

Catarino David Delgado

@cddelgado

2024年1月15日

I got tired of fighting with copy-and-pasting mangled webpages into #ChatGPT and #llama_cpp for discussion, so I put together a tiny website that converts HTML into Markdown. This has obvious uses for #Wikipedia, #GitHub and other services. htmltomarkdown.top

ハカセアイ(Ai-Hakase)🐾最新トレンドＡＩのためのＸ 🐾

@ai_hakase_

年10月17日

ハカセアイ(Ai-Hakase)🐾最新トレンドＡＩのためのＸ 🐾

@ai_hakase_

年9月18日

Oshita | AGen I. CEO ⦿ ∫u(x)dμ

@tkosht

年9月11日

The D ☽⛤☾

@netdur

年8月30日

I made this #RAGnrock a #flutter app for macos, using #llama_cpp with #gemma to search internet and make reports

Dr.G | 整形外科医 × AI研究者 | 疫学・統計学

@drgto_orthop

年8月22日

MacBookProのGPUで動かすGPT-OSSをターミナル実行。安心の研究環境構築へ！（実行用コード付き）｜Dr_G's note

来源: note.com

Xsir

@Xsir01

年6月26日

Google 致力于通过与开发者社区合作，确保 Gemma 3n 的广泛兼容性，支持 #HuggingFace、#llama_cpp、#Ollama、#MLX 等众多热门工具和平台。诚邀开发者参与 #Gemma3nImpactChallenge，共同利用其设备端、离线、多模态特性，构建改善世界的产品，赢取 $15 万奖金。（6/6）

Bruno Arsioli

@BArsioli

年6月14日

Fixstars

@Fixstars_US

年5月27日

blog.us.fixstars.com

Using llama.cpp to run Llama 4 Scout on RTX 6000 Ada - Fixstars Corporation Tech Blog

In a previous verification, we used a server equipped with an NVIDIA H100 GPU to run Llama 4 Scout. The H100 is expensive, and its implementation locations are limited due to power consumption and...

来源: blog.us.fixstars.com