🚨 DeepSeek just did something wild. They built an OCR system that compresses long text into vision tokens, literally turning paragraphs into pixels. Their model, DeepSeek-OCR, achieves 97% decoding precision at 10× compression and still manages 60% accuracy even at 20×. That…

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping…

DeepSeek-OCR looks impressive, but its core idea is not new.

Input “Text” as “Image” was already explored by:
Language Modeling with Pixels (Rust et al., ICLR 2023)
CLIPPO: Image-and-Language Understanding from Pixels Only (Tschannen et al., CVPR 2023)
Pix2Struct: Screenshot…

The big blue whale is back with something wild this time! DeepSeek built an OCR model that can compress text by 10x using vision tokens. Let me explain: They had a core insight - A picture containing text requires far fewer tokens to represent than the raw text itself. Now,…


DeepSeek released an OCR model today. Their motivation is really interesting: they want to use visual modality as an efficient compression medium for textual information, and use this to solve long-context challenges in LLMs. Of course, they are using it to get more training…

After 5 years of building APIs in Django, I decided to learn FastAPI by building a video calling application.

Login and Signup APIs complete ✅
Next step: implementing web sockets

I also plan to build a React js frontend for it.

DeepSeek has developed a breakthrough OCR system that compresses image-based text documents so LLMs can handle much longer contexts with far less compute. Instead of using raw text, it processes documents as images, reducing token count by up to 10× while retaining 97% of the…

BOOOOOOOM! CHINA DEEPSEEK DOES IT AGAIN! An entire encyclopedia compressed into a single, high-resolution image! — A mind-blowing breakthrough. DeepSeek-OCR, unleashed an electrifying 3-billion-parameter vision-language model that obliterates the boundaries between text and…

🚨 DeepSeek just dropped one of the most important AI papers of 2025. They built an OCR system that compresses long text into vision tokens, literally turning paragraphs into pixels. Their model, DeepSeek-OCR, achieves 97% decoding precision at 10× compression and still manages…

what a bold direction by deepseek once again. they took "a picture is worth a thousand words" literally or the idea of "photographic memory" if i am to commit the crime of anthropomorphisation.

DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient per vision tokens/performance ratio
> covers 100 languages

(1/n) 🚀 With FastVideo, you can now generate a 5-second video in 5 seconds on a single H200 GPU! Introducing FastWan series, a family of fast video generation models trained via a new recipe we term as “sparse distillation”, to speed up video denoising time by 70X! 🖥️ Live…


DeepSeek just dropped a new OCR model! And this isn't about OCR. We've all heard "a picture is worth a thousand words." DeepSeek literally proved it. They've built a breakthrough in AI memory compression that could change how models handle long contexts. The core idea:…

deepseek released a new OCR model demonstrating a way to compress images into a smaller set of vision tokens

10× smaller while still achieving 97% accuracy; even at 20× compression it retains around 60% accuracy

can generate 200k+ pages/day on a single A100-40G !!
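
To make the quoted ratios concrete, here is a minimal sketch of the token arithmetic. It assumes "compression" means text tokens divided by vision tokens, which is how the posts in this feed describe it; the function names and the 1,000-token page are hypothetical, and the accuracy values are simply the figures quoted above, not measurements.

```python
import math

# (compression ratio, approximate decoding accuracy) as quoted in the tweet.
QUOTED_POINTS = [(10, 0.97), (20, 0.60)]


def vision_token_budget(text_tokens: int, compression_ratio: float) -> int:
    """Vision tokens needed to carry `text_tokens` at the given ratio.

    Assumes ratio = text tokens / vision tokens (an interpretation,
    not a definition taken from the model's documentation).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return math.ceil(text_tokens / compression_ratio)


def summarize(text_tokens: int) -> list:
    """Pair each quoted ratio with its vision-token budget and quoted accuracy."""
    return [
        {
            "ratio": ratio,
            "vision_tokens": vision_token_budget(text_tokens, ratio),
            "quoted_accuracy": accuracy,
        }
        for ratio, accuracy in QUOTED_POINTS
    ]


if __name__ == "__main__":
    # Hypothetical page of 1,000 text tokens.
    for row in summarize(1000):
        print(row)
```

Under that reading, a 1,000-token page would need about 100 vision tokens at 10× and about 50 at 20×, with the accuracy trade-off the tweet describes.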

This is the JPEG moment for AI.

Optical compression doesn't just make context cheaper. It makes AI memory architectures viable.

Training data bottlenecks? Solved.
- 200k pages/day on ONE GPU
- 33M pages/day on 20 nodes
- Every multimodal model is data-constrained. Not anymore.…
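
The two throughput figures can be sanity-checked with simple arithmetic. This is a rough sketch that assumes throughput scales linearly with GPU count and that every GPU matches the single-GPU figure; the tweet does not state how many GPUs are in each node, so the per-node number below is only what the quoted figures imply. Function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_second(pages_per_day: float) -> float:
    """Convert a daily page count into a per-second rate."""
    return pages_per_day / SECONDS_PER_DAY


def implied_gpu_count(cluster_pages_per_day: float,
                      single_gpu_pages_per_day: float) -> float:
    """GPU-equivalents implied by the cluster figure, assuming linear scaling."""
    return cluster_pages_per_day / single_gpu_pages_per_day


if __name__ == "__main__":
    single_gpu = 200_000    # pages/day on one GPU, as quoted
    cluster = 33_000_000    # pages/day on 20 nodes, as quoted
    nodes = 20

    gpus = implied_gpu_count(cluster, single_gpu)
    print(f"single GPU: {pages_per_second(single_gpu):.2f} pages/s")
    print(f"implied GPU-equivalents: {gpus:.0f}")
    print(f"implied GPUs per node: {gpus / nodes:.2f}")
```

Taken at face value, the figures imply a little over two pages per second per GPU and roughly eight GPUs per node.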

Deliver personalized experiences faster with Altudo’s #FastLane Accelerator. Built on #SitecoreXMCloud, this AI-enhanced framework unlocks full value with reusable components and Figma-to-Storybook workflows. Discover more: altudo.co/insights/blogs… #DigitalExperience #AltudoBlogs

DeepSeek-OCR: Contexts Optical Compression

DeepSeek-OCR just dropped. 🔥

Sets a new standard for open-source OCR

A 3B-parameter vision-language model designed for high-performance optical character recognition and structured document conversion.

- Can parse and re-render charts in HTML
- Optical Context Compression:…

"Decoding UTF8 with Parallel Extract - a nerd's dream. Branchless decoder, 29 instructions. It's compliant, sweet, and full source code available. #UTF8 #CodeNerd #FastDecoder" nrk.neocities.org/articles/utf8-…

