#dsa_python search results

How does @deepseek_ai Sparse Attention (DSA) work? It has 2 components: the Lightning Indexer and Sparse Multi-Latent Attention (MLA). The indexer keeps a small key cache of 128 per token (vs. 512 for MLA). It scores incoming queries, and the top-2048 tokens are passed to Sparse MLA.

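The indexer-then-select step above can be sketched in a few lines. This is a toy illustration under stated assumptions: dot-product scoring, a random toy cache of 10 tokens with dimension 16 (the tweet's real figures are a 128-dim key cache and top-2048 selection), and the function name `topk_token_indices` is mine, not DeepSeek's.

```python
import numpy as np

def topk_token_indices(query, index_keys, k):
    """Score a query against the indexer's small key cache and return
    the indices of the k highest-scoring past tokens (toy sketch; the
    real Lightning Indexer uses learned weights)."""
    scores = index_keys @ query               # one cheap score per cached token
    k = min(k, len(scores))                   # guard short contexts
    top = np.argpartition(scores, -k)[-k:]    # unordered top-k, O(n)
    return np.sort(top)                       # keep original token order

rng = np.random.default_rng(0)
index_keys = rng.normal(size=(10, 16))        # toy cache: 10 tokens, dim 16
query = rng.normal(size=16)
selected = topk_token_indices(query, index_keys, k=4)
print(selected)                               # 4 token indices handed to Sparse MLA
```

Using `argpartition` rather than a full sort reflects the design point: selection only needs the top-k set, not a total ordering of all past tokens.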

🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n



DSA is the most important skill for anyone learning programming, but not easy to master. I made a complete Data Structures and Algorithms handwritten note (worth $45). But for 24 hours, it's 100% FREE. To get it, just: → Like & Retweet → Reply "DSA" → Follow me (so I can DM)


DeepSeek V3.2 breakdown 1. Sparse attention via lightning indexer + top_k attention 2. Uses V3.1 Terminus + 1T continued pretraining tokens 3. 5 specialized models (coding, math etc) via RL then distillation for final ckpt 4. GRPO. Reward functions for length penalty, language…

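Point 4 of the breakdown mentions GRPO with reward functions such as a length penalty. A minimal sketch of GRPO's core idea — group-relative advantages that remove the need for a value network — with a purely hypothetical length-penalty shaping (the tweet names the penalty but not its formula; `shaped_reward` and its constants are my illustration):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """GRPO's core step: normalize each sampled completion's reward
    against its own group's mean and std deviation."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

def shaped_reward(task_reward, length, max_len=2048, penalty=0.001):
    """Hypothetical shaping: subtract a small per-token penalty
    beyond max_len (illustrative formula, not DeepSeek's)."""
    return task_reward - penalty * max(0, length - max_len)

# Four completions sampled for one prompt: (task reward, token length)
group = [shaped_reward(1.0, 1500), shaped_reward(1.0, 3000),
         shaped_reward(0.0, 800), shaped_reward(0.0, 4000)]
adv = grpo_advantages(group)
print(adv)  # correct-and-short outranks correct-but-long
```

The advantages always sum to (roughly) zero within a group, so each completion is scored relative to its siblings rather than an absolute baseline.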

TL;DR: The PSF has made the decision to put our community and our shared diversity, equity, and inclusion values ahead of seeking $1.5M in new revenue. Please read and share. pyfound.blogspot.com/2025/10/NSF-fu… 🧵


V3.2 with DeepSeek Sparse Attention gets much better efficiency⚡️.





Complete Advance DSA Resources in One Place👇 From beginner sheets to advanced problem-solving guides, logic-building notes, and real-world DSA applications , everything you need to master Data Structures & Algorithms is HERE! Perfect for coding prep & product-based interviews.…


⚡️ Efficiency Gains 🤖 DSA achieves fine-grained sparse attention with minimal impact on output quality — boosting long-context performance & reducing compute cost. 📊 Benchmarks show V3.2-Exp performs on par with V3.1-Terminus. 2/n


If you are in your third year, ignore the outside noise and focus on DSA. DSA will help you crack OAs and interviews. Companies still ask DSA and focus on it. It's better for your future.


Become a DSA master with this handwritten note! Valued at $45, but yours for 24 hours only, completely FREE! Steps: 1. Like & Retweet 2. Reply "DSA" 3. Follow @Ayzacoder so I can DM. Let's level up together!


DeepSeek 3.2? Wait, What? 👀


💻 API Update 🎉 Lower costs, same access! 💰 DeepSeek API prices drop 50%+, effective immediately. 🔹 For comparison testing, V3.1-Terminus remains available via a temporary API until Oct 15th, 2025, 15:59 (UTC Time). Details: api-docs.deepseek.com/guides/compari… 🔹 Feedback welcome:…


Company-wise DSA interview and exam questions link - drive.google.com/drive/mobile/f… Like & RT for visibility ❤️


The sparse attention in the new DeepSeek v3.2 is quite simple. Here's a little sketch. - You have a full attention layer (or MLA as in DSV3). - You also have a lite-attention layer which only computes query-key scores. - From the lite layer you get the top-k indices for each…

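The two-stage scheme sketched above — cheap lite scores pick top-k tokens, then full attention runs only over those — can be written out end to end. A toy sketch under stated assumptions: plain dot-product attention instead of MLA, random toy shapes, and all names (`sparse_attention`, `index_keys`) are illustrative, not DeepSeek's implementation.

```python
import numpy as np

def sparse_attention(q, K, V, index_keys, k=4):
    """Stage 1: lite query-key scores select the top-k past tokens.
    Stage 2: full softmax attention over only those k tokens."""
    lite_scores = index_keys @ q                 # cheap score pass
    k = min(k, len(lite_scores))
    idx = np.argpartition(lite_scores, -k)[-k:]  # top-k token indices
    Ks, Vs = K[idx], V[idx]                      # gather selected keys/values
    logits = Ks @ q / np.sqrt(q.shape[0])        # scaled dot-product, k tokens only
    w = np.exp(logits - logits.max())            # numerically stable softmax
    w /= w.sum()
    return w @ Vs                                # weighted value mix

rng = np.random.default_rng(1)
T, d = 12, 8                                     # toy: 12 past tokens, dim 8
q = rng.normal(size=d)
K, V = rng.normal(size=(T, d)), rng.normal(size=(T, d))
index_keys = rng.normal(size=(T, d))             # lite layer keeps its own keys
out = sparse_attention(q, K, V, index_keys, k=4)
print(out.shape)                                 # (8,) — same shape as dense output
```

The output shape matches dense attention, which is what lets the sparse layer drop in as a replacement: only the set of attended tokens changes.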

Gave a talk on @DSPyOSS at @pyconfinland @ploneconf. The reactions were pretty good, very few knew about it and gladly I could showcase the goldmine we're missing out on! Here's the slide deck - docs.google.com/presentation/d… Recordings coming soon.


💥 The DSA is the new ACTA – only far more dangerous. Back then we fought for freedom online; today we must fight for the right to the truth. Under the banner of "fighting disinformation," Brussels wants to take control of what you see, what you write, and what you think. This is not a fight against hate – it is a fight…


Official release of DeepSeek-V3.2-Exp with DeepSeek Sparse Attention + massive price cuts! DeepSeek Sparse Attention (DSA) makes inference cheaper (especially long-context) by learning which past tokens matter for each new token and running full attention only on those. DSA…

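The "cheaper, especially long-context" claim has a simple cost intuition: dense attention scores every past token for each new token, while sparse attention scores a fixed top-k (2048 per the earlier vllm tweet), so per-token attention cost stops growing with context length. A back-of-the-envelope count, ignoring the indexer's own (much smaller) cost:

```python
def per_token_score_count(context_len, top_k=None):
    """Number of key positions each new token attends to:
    all of them (dense) or a fixed top_k cap (sparse)."""
    return context_len if top_k is None else min(top_k, context_len)

for L in (4_096, 32_768, 131_072):
    dense = per_token_score_count(L)
    sparse = per_token_score_count(L, top_k=2_048)
    print(L, dense // sparse)  # dense does 2x / 16x / 64x more score computations
```

Below the top-k threshold the two are identical; the savings kick in only once the context outgrows the cap, matching the "especially long-context" framing.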




Best DSA resources I’ve come across so far 🧵 Here are the platforms, playlists, and guides that helped me truly understand Data Structures & Algorithms. Trust me, these will be more than enough for you to master DSA too.

