
Nibaldo

@CyberMath4

ML Engineer | Master's in Advanced & Applied AI | MSc Statistics | MSc Full-Stack Web Development | Math Teacher | Bachelor's in Engineering Sciences | From 🇪🇸🇨🇱

Nibaldo reposted

@deedydas: DeepSeek-OCR is the best OCR ever. It parses this extremely hard-to-read handwritten letter, written by the mathematician Ramanujan in 1913, with a frightening degree of accuracy.

Not perfect, but it beats the former best, dots.ocr. Bonus points if you can spot the errors.

Try it here:

Nibaldo reposted

@RayFernando1337: This is the JPEG moment for AI.

Optical compression doesn't just make context cheaper. It makes AI memory architectures viable.

Training data bottlenecks? Solved.
- 200k pages/day on ONE GPU
- 33M pages/day on 20 nodes
- Every multimodal model is data-constrained. Not anymore.…
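A quick back-of-the-envelope check of the throughput figures quoted above. This assumes, hypothetically, 8 GPUs per node and linear scaling; neither figure appears in the thread itself:

```python
# Back-of-the-envelope check of the quoted throughput figures.
pages_per_gpu_per_day = 200_000   # "200k pages/day on ONE GPU" (from the thread)
gpus_per_node = 8                 # assumption: a typical 8-GPU node (not stated)
nodes = 20                        # "20 nodes" (from the thread)

cluster_pages_per_day = pages_per_gpu_per_day * gpus_per_node * nodes
print(cluster_pages_per_day)  # 32000000, close to the quoted 33M pages/day
```

Under those assumptions the numbers are roughly self-consistent (32M vs. the quoted 33M), which suggests near-linear scaling across nodes.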

Nibaldo reposted

@omarsar0: Agentic Context Engineering

Great paper on agentic context engineering.

The recipe: treat your system prompts and agent memory as a living playbook. Log trajectories, reflect to extract actionable bullets (strategies, tool schemas, failure modes), then merge as append-only…
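The recipe in that tweet can be sketched as a minimal append-only playbook. All names and structure below are illustrative, not from the paper, and the "reflection" step is a toy stand-in for an LLM call:

```python
# Minimal sketch of an append-only agent "playbook":
# log trajectories, reflect to extract bullets, merge without overwriting.

playbook: list[str] = []   # append-only store of actionable bullets
seen: set[str] = set()     # dedupe guard

def reflect(trajectory: list[str]) -> list[str]:
    """Toy 'reflection': turn failed steps into failure-mode bullets.
    A real system would use an LLM to extract strategies and schemas too."""
    return [f"failure mode: {step}" for step in trajectory if "error" in step]

def merge(bullets: list[str]) -> None:
    """Append new bullets only; never rewrite or delete existing entries."""
    for b in bullets:
        if b not in seen:
            seen.add(b)
            playbook.append(b)

merge(reflect(["call search tool", "error: bad tool schema"]))
merge(reflect(["error: bad tool schema"]))   # duplicate bullet is ignored
print(playbook)  # ['failure mode: error: bad tool schema']
```

The append-only merge is the key design choice: old lessons are never silently overwritten, so the playbook can only accumulate context.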

Nibaldo reposted

🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n


Nibaldo reposted

@Alibaba_Qwen: 🚀 Introducing Qwen3-LiveTranslate-Flash — Real‑Time Multimodal Interpretation — See It, Hear It, Speak It!

🌐 Wide language coverage — Understands 18 languages & 6 dialects, speaks 10 languages.
👁️ Vision‑Enhanced Comprehension — Reads lips, gestures, on‑screen text and…

Nibaldo reposted

@HuggingPapers: Microsoft introduces the Latent Zoning Network (LZN), a unified principle for generative modeling, representation learning, and classification. LZN uses a shared Gaussian latent space and modular encoders/decoders to tackle all three core ML problems at once!

Nibaldo reposted

@Alibaba_Qwen: 🚀 Introducing Qwen3-Next-80B-A3B — the FUTURE of efficient LLMs is here!

🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x faster inference than Qwen3-32B (esp. at 32K+ context!)
🔹 Hybrid Architecture: Gated DeltaNet + Gated Attention → best of speed &…

Nibaldo reposted

Updated & turned my Big LLM Architecture Comparison article into a narrated video lecture.

The 11 LLM architectures covered in this video:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi 2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5


Nibaldo reposted

@ruyimarone: 3T tokens, ~1800 languages, 2 models - we're releasing mmBERT, a modern multilingual encoder model!

Nibaldo reposted

@Hesamation: Vibe-coded AI startup 2025

Nibaldo reposted

@akshay_pachaar: This is the fastest serving engine for LLMs!

LMCache cuts time-to-first-token by 7x and slashes GPU costs dramatically.

Think of it as a smart caching layer that remembers everything your LLM has processed before.

100% open-source, makes vLLM go brr...
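The "smart caching layer" idea can be illustrated with a toy exact-match prefix cache in plain Python. This is not LMCache's actual API or mechanism (LMCache caches KV states across requests and engines); it only shows why remembering previously processed prefixes avoids recomputation:

```python
# Toy illustration of the idea behind a prefix cache:
# remember work already done for a prompt and reuse it on repeat requests.

cache: dict[str, int] = {}   # prompt prefix -> stand-in for cached KV state
compute_calls = 0            # counts how many "expensive" prefills actually ran

def encode(prefix: str) -> int:
    """Stand-in for an expensive prefill pass over the prompt."""
    global compute_calls
    if prefix in cache:
        return cache[prefix]          # cache hit: no recomputation
    compute_calls += 1                # cache miss: pay the full prefill cost
    cache[prefix] = hash(prefix)      # pretend this is the stored KV state
    return cache[prefix]

encode("You are a helpful assistant. Question 1")
encode("You are a helpful assistant. Question 1")  # second request hits the cache
print(compute_calls)  # 1
```

Real systems go further by reusing the longest shared token prefix across different prompts, which is where most of the time-to-first-token savings come from.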

Nibaldo reposted

@rasbt: So, I did some coding this week...
- Qwen3 Coder Flash (30B-A3B)
- Mixture-of-Experts setup with 128 experts, 8 active per token
- In pure PyTorch (optimized for human readability)
- In a standalone Jupyter notebook
- Runs on a single A100
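The "128 experts, 8 active per token" setup refers to top-k expert routing. A minimal pure-Python sketch of that routing step (the real model computes these scores with a learned gating network over hidden states; the random scores here are placeholders):

```python
# Minimal sketch of MoE top-k routing: 128 experts, 8 active per token.
import random

NUM_EXPERTS, TOP_K = 128, 8

def route(token_scores: list[float]) -> list[int]:
    """Pick the indices of the TOP_K highest-scoring experts for one token."""
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: token_scores[i], reverse=True)
    return ranked[:TOP_K]

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in gating scores
active = route(scores)
print(len(active), len(active) / NUM_EXPERTS)  # 8 0.0625
```

Only 8/128 = 6.25% of experts run per token, which is how a 30B-parameter model can have only ~3B active parameters per forward step.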

Nibaldo reposted

The wait is over: Deep Think is here. At I/O, we previewed the frontiers of Gemini’s thinking capabilities. Now, @Google AI Ultra subscribers can experience it in the Gemini app. With Deep Think, Gemini 2.5 is able to intelligently extend its "thinking time" so it can generate…


Nibaldo reposted

@HazzimIO: DevOps life.

Nibaldo reposted

@MIT_CSAIL: A free MIT guide to key computer vision concepts: bit.ly/43Tn1vW

Nibaldo reposted

@HeyNina101: Don't let your AI project die in a notebook. You don't need more features. You need structure. This is the folder setup that actually ships from day one.

📁 The folder structure that works

Forget monolithic scripts. You need this:

/config
🔹 YAML…

Nibaldo reposted

@bindureddy: A Great Cover For A Vibe Coding Book 😆

Nibaldo reposted

5 techniques to fine-tune LLMs, explained visually! Fine-tuning large language models traditionally involved adjusting billions of parameters, demanding significant computational power and resources. However, the development of some innovative methods has transformed this…
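The tweet above doesn't name the five techniques, but LoRA (low-rank adaptation) is one widely used example of this kind of parameter-efficient method. A toy pure-Python sketch of its core idea: freeze the weight matrix W and learn only a low-rank update B @ A (all sizes here are illustrative, far smaller than a real model):

```python
# Sketch of the idea behind LoRA, one popular parameter-efficient
# fine-tuning technique: freeze W, train only a low-rank update B @ A.

def matmul(X, Y):
    """Plain-Python matrix multiply for the tiny toy matrices below."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def madd(X, Y):
    """Elementwise matrix addition."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

d = 4       # toy model dimension
r = 1       # LoRA rank, r << d
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen weight
B = [[0.1] for _ in range(d)]        # d x r matrix, trainable
A = [[0.2] * d]                      # r x d matrix, trainable

W_eff = madd(W, matmul(B, A))        # effective weight: W + B @ A
# Trainable params: d*r + r*d = 8, versus d*d = 16 for full fine-tuning.
print(d * r + r * d, d * d)  # 8 16
```

The savings grow with dimension: for a real 4096-wide layer at rank 8, the low-rank factors hold ~65K parameters versus ~16.8M for the full matrix.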


Nibaldo reposted

@UnslothAI: We made a Guide to teach you how to Fine-tune LLMs correctly!

Learn about:
• Choosing the right parameters & training method
• RL, GRPO, DPO & CPT
• Data prep, Overfitting & Evaluation
• Training with Unsloth & deploying on vLLM, Ollama, Open WebUI

🔗 docs.unsloth.ai/get-started/fi…
