We just added Tensor Parallelism to TabbyAPI! Huge thanks to @turboderp_ and the testers who made this possible. Now, enjoy the clip diving into the origins of ExLlama. Wanna see TabbyAPI built live? Follow me on Twitch: kingbri1st
1000 stars on TabbyAPI. Holy crap. Huge thanks to @turboderp and everyone who contributed!
TabbyAPI now supports ExLlamaV3 with automatic backend detection! 🎉 Please note that exl3 is being actively worked on, and mileage may vary compared to exl2. Thanks to @turboderp_ and all contributors for making this a reality.
I have decided to tweet today. So here is a visualization of how the paged cache works with continuous batching in ExLlamaV3. I think it's neat. #🐈
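For anyone who'd rather read code than watch the visualization, here's a toy sketch of the idea, assuming fixed-size pages drawn from a shared free pool. The names (`PagedCache`, `PAGE_SIZE`) are illustrative, not ExLlamaV3's actual API:

```python
PAGE_SIZE = 256  # tokens per page; the real page size is a backend detail

class PagedCache:
    """Toy paged KV cache: each sequence owns a list of fixed-size pages
    drawn from a shared free pool, so sequences in a batch can grow and
    finish independently (continuous batching)."""

    def __init__(self, num_pages):
        self.free_pages = list(range(num_pages))
        self.page_table = {}  # seq_id -> page indices (need not be contiguous)
        self.length = {}      # seq_id -> tokens stored

    def append(self, seq_id, n_tokens):
        # grow the sequence; allocate a new page only at a page boundary
        pages = self.page_table.setdefault(seq_id, [])
        self.length[seq_id] = self.length.get(seq_id, 0) + n_tokens
        while len(pages) * PAGE_SIZE < self.length[seq_id]:
            pages.append(self.free_pages.pop())

    def release(self, seq_id):
        # a finished sequence returns its pages to the pool at once,
        # letting a queued request join the batch without waiting
        self.free_pages.extend(self.page_table.pop(seq_id))
        del self.length[seq_id]
```

The point of the page table is that a sequence's cache doesn't need to be contiguous in memory, so there's no fragmentation problem when sequences of different lengths come and go mid-batch.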
Seems to still be true that larger models are less sensitive to quantization. Here is Mistral-Large 123B at 1.4 bits per weight, running on one 24 GB GPU. #AI or something
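The back-of-envelope arithmetic for why this fits, counting weights only and ignoring KV cache and activation overhead:

```python
params = 123e9  # Mistral-Large parameter count
bpw = 1.4       # bits per weight after quantization

# weights only: bits -> bytes -> GiB
weight_gib = params * bpw / 8 / 1024**3
# about 20 GiB, leaving some headroom on a 24 GB card for cache and activations
```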
Supply chain alert! Don't use the ComfyUI Impact Pack! Its dependency ultralytics has been compromised on PyPI. Thanks Shinon for letting me know in Discord. github.com/ultralytics/ul…
TabbyAPI now supports vision. Thanks to @turboderp_ for the ExLlamaV2 updates and DocShotgun for the initial work. Any exl2-supported vision model works, but this release focuses on Pixtral from @MistralAI
1 year ago, I made TabbyAPI with @turboderp_ as a side project. Now, it's my most popular side project. I wanted to break away from the bloated nature of AIO local model backends and just run #exllama. Thanks to all the contributors and testers. github.com/theroyallab/ta…
I performed a successful vocabulary transplant on Qwen2-0.5B and turned it into a useful draft model for Llama-3. What a time to be alive. #hashtag huggingface.co/turboderp/Qwam…
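A sketch of one way such a transplant can work, and not necessarily the method used here: tokens the target vocabulary shares with the donor copy their embedding row directly, and the rest are initialized from the mean of their donor tokenization. All names below are illustrative:

```python
import numpy as np

def transplant_embeddings(donor_emb, donor_vocab, target_vocab, donor_encode):
    """Build an embedding matrix over target_vocab from a donor model.

    donor_vocab maps token string -> donor row index; donor_encode
    tokenizes a string with the donor tokenizer. Exact matches copy
    their donor row; everything else averages the donor embeddings of
    its donor tokenization (a common initialization heuristic)."""
    out = np.zeros((len(target_vocab), donor_emb.shape[1]), dtype=donor_emb.dtype)
    for i, tok in enumerate(target_vocab):
        if tok in donor_vocab:
            out[i] = donor_emb[donor_vocab[tok]]
        else:
            ids = donor_encode(tok)
            if ids:
                out[i] = donor_emb[ids].mean(axis=0)
    return out
```

The same matrix swap has to happen at both ends of the model (input embeddings and LM head) so the draft model emits token IDs the target model's vocabulary understands.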
Llama-3-instruct becomes much more useful when you censor some of its catchphrases. #simplesolutions etc. 🤷
New project: goody2.ai GOODY-2 is an AI model that's so responsible it won't give a straight answer to anything.
🚀 TACO: a new benchmark for code generation from @BAAIBeijing with 26,443 problems. • 🤖 English questions & Python solutions • 🧠 Ideal for evaluating code generation from natural language • 📊 Train: 25,443 samples, Test: 1,000 samples • 📚 Diverse difficulty levels
I guess I should post something once in a while. So here's a whole chatbot in 26 lines of Python running Mixtral 8x7B real fast on one 3090. Idk, I think it's neat. 🐈
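Not the actual 26-line script, but the shape of it is roughly this: keep a history, flatten it into a Mistral-style instruct prompt (simplified here; the real template has BOS/EOS details, so check the model card), and call a generate function that stands in for the ExLlama generator:

```python
def build_prompt(history):
    # flatten chat turns into a simplified Mistral-style [INST] prompt
    # (simplified; see the model card for the exact template)
    parts = []
    for turn in history:
        if turn["role"] == "user":
            parts.append(f"[INST] {turn['content']} [/INST]")
        else:
            parts.append(turn["content"])
    return "".join(parts)

def chat_turn(history, user_msg, generate):
    # generate() stands in for the actual model call
    history.append({"role": "user", "content": user_msg})
    reply = generate(build_prompt(history))
    history.append({"role": "assistant", "content": reply})
    return reply
```

Wrap `chat_turn` in a `while True: chat_turn(history, input("> "), generate)` loop and you have the whole chatbot; the model call is the only part that needs real hardware.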