#visionlanguagemodels search results
🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P… #OCR #AI #VisionLanguageModels

Hugging Face FineVision: The Ultimate Multimodal Dataset for Vision-Language Model Training #FineVision #VisionLanguageModels #HuggingFace #AIResearch #MultimodalDataset itinai.com/hugging-face-f… Understanding the Impact of FineVision on Vision-Language Models Hugging Face has …

🚀✨ Exciting Publication from @UrbanAI_Lab The paper “Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning” has been accepted to EMNLP 2025! Link: arxiv.org/pdf/2410.16162 #UrbanAI #VisionLanguageModels
Read more: hubs.li/Q03C0tJY0 #RadiologyAI #VisionLanguageModels #MedicalImaging #ClinicalAI #HealthcareAI #GenerativeAI #JohnSnowLabs

Read here: hubs.li/Q03Fs2V30 #MedicalAI #VisionLanguageModels #HealthcareAI #MedicalImaging #ClinicalDecisionSupport #GenerativeAI

1/ 🗑️ in, 🗑️ out With advances in #VisionLanguageModels, there is growing interest in automated #RadiologyReporting. It's great to see such high research interest, BUT... 🚧 Technique seems intriguing, but the figures raise serious doubts about this paper's merit. 🧵 👇

Did you know most vision-language models (like Claude, OpenAI, Gemini) totally suck at reading analog clocks ⏰? (Except Molmo—it’s actually trained for that) #AI #MachineLearning #VisionLanguageModels #vibecoding

Using #VisionLanguageModels to Process Millions of Documents | Towards Data Science towardsdatascience.com/using-vision-l…
Last week, I gave an invited talk at the 1st workshop on critical evaluation of generative models and their impact on society at #ECCV2024, focusing on unmasking and tackling bias in #VisionLanguageModels. Thanks to the organizers for the invitation!

Apple’s FastVLM: 85x Faster Hybrid Vision Encoder Revolutionizing AI Models #FastVLM #VisionLanguageModels #AIInnovation #MultimodalProcessing #AppleTech itinai.com/apples-fastvlm… Apple has made a significant leap in the field of Vision Language Models (VLMs) with the introducti…

Say goodbye to manual data labeling and hello to instant insights! Our new #VisionLanguageModels can extract features from aerial images using just simple prompts. Simply upload an image and ask a question such as "What do you see?" Learn more: ow.ly/FsbV50WKnC3
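A hedged sketch of what such a prompt-based query could look like, assuming an OpenAI-style multimodal chat API; the model name and payload shape here are illustrative, not taken from the tweet:

```python
import base64

def build_vlm_request(image_bytes: bytes, question: str,
                      model: str = "example-aerial-vlm") -> dict:
    """Build an OpenAI-style multimodal chat payload: one user turn
    carrying the question plus the image as a base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

# One aerial image, one natural-language question -- no label schema needed.
request = build_vlm_request(b"\x89PNG\r\n...", "What do you see?")
```

The same payload shape works for any question ("How many buildings are visible?", "Is there standing water?"), which is what lets a single prompt replace a per-task labeling pipeline.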

1/5 Can feedback improve semantic grounding in large vision-language models? A recent study delves into this question, exploring the potential of feedback in enhancing the alignment between visual and textual representations. #AI #VisionLanguageModels

Alhamdulillah! Thrilled to share that our work "O-TPT" has been accepted at #CVPR2025! Big thanks to my supervisor and co-authors for the support! thread(1/n) #MachineLearning #VisionLanguageModels #CVPR2025

Exploring the limitations of Vision-Language Models (VLMs) like GPT-4V in complex visual reasoning tasks. #AI #VisionLanguageModels #DeductiveReasoning

Thrilled to share that we have two papers accepted at #CVPR2025! 🚀 A big thank you to all the collaborators for their contributions. Stay tuned for more updates! Titles in the thread (1/n) #CVPR #VisionLanguageModels #ModelCalibration #EarthObservation

1/5 BRAVE: A groundbreaking approach to enhancing vision-language models (VLMs)! By combining features from multiple vision encoders, BRAVE creates a more versatile and robust visual representation. #AI #VisionLanguageModels
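The core idea can be sketched in a few lines: run several frozen vision encoders, concatenate their features, and project the result to the language model's embedding width. The encoders and dimensions below are illustrative stand-ins, not BRAVE's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for two frozen vision encoders with different
# output widths (e.g. a CLIP-like and a DINO-like backbone).
def encoder_a(image: np.ndarray) -> np.ndarray:
    return rng.standard_normal(768)

def encoder_b(image: np.ndarray) -> np.ndarray:
    return rng.standard_normal(1024)

def combine_features(image: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Concatenate per-encoder features, then linearly project to the
    language model's width -- the multi-encoder fusion idea in a nutshell."""
    fused = np.concatenate([encoder_a(image), encoder_b(image)])  # (1792,)
    return proj @ fused  # (d_model,)

d_model = 512
proj = rng.standard_normal((d_model, 768 + 1024)) / np.sqrt(768 + 1024)
image = np.zeros((224, 224, 3))
z = combine_features(image, proj)  # single vector the LLM can consume
```

Because each encoder sees the image differently, the fused vector tends to be more robust than any single encoder's output; only the small projection needs training.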

Exciting news from Liquid AI! 🚀 Introducing LFM2-VL: super-fast, open-weight vision-language models perfect for low-latency, on-device deployment. Revolutionizing AI for smartphones, laptops, wearables, and more! #AI #VisionLanguageModels marktechpost.com/2025/08/20/liq…
Exploring the capabilities of multimodal LLMs in visual network analysis. #LargeLanguageModels #VisualNetworkAnalysis #VisionLanguageModels

Investigating vision-language models on Raven's Progressive Matrices showcases gaps in visual deductive reasoning. #VisualReasoning #DeductiveReasoning #VisionLanguageModels

A key challenge for VLMs is "grounding" - correctly linking text to visual elements. The latest research uses techniques like bounding box annotations and negative captioning to teach models to see and understand with greater accuracy. #DeepLearning #AI #VisionLanguageModels
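Negative captioning, for instance, boils down to a hinge-style training signal: the image embedding should score higher against the correct caption than against a hard negative (same words, wrong relation). A toy sketch with hand-picked embeddings; real systems use learned encoders:

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def negative_caption_loss(img, pos_cap, neg_cap, margin=0.2):
    """Hinge loss: penalize the model unless the image is at least
    `margin` more similar to the correct caption than to the negative."""
    return max(0.0, margin - cosine(img, pos_cap) + cosine(img, neg_cap))

# Toy embeddings: positive caption aligned with the image,
# negative nearly orthogonal -- so the margin is satisfied.
img = [1.0, 0.0, 0.2]
pos = [0.9, 0.1, 0.1]
neg = [0.0, 1.0, 0.0]
loss = negative_caption_loss(img, pos, neg)  # 0.0: margin already met
```

During training the negatives are chosen to differ only in the detail being grounded (e.g. "dog left of cat" vs "dog right of cat"), which forces the model to actually look rather than pattern-match.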

EdiVal-Agent enables scalable, object-centric evaluation of multi-turn image editing, improving instruction following, content consistency, and visual quality assessment. #EdiValAgent #MultiTurnEditing #VisionLanguageModels #ObjectCentric #AIResearch @TianyuChen @MingyuanZhou

📢 Call for Papers — JBHI Special Issue: “Transparent Large #VisionLanguageModels in Healthcare” Seeking research on: ✔️ Explainable VLMs ✔️ Medical image-text alignment ✔️ Fair & interpretable AI 📅 Deadline: Sep 30, 2025 🔗 Info: tinyurl.com/4a7d69t2

Moondream 2 is a superstar in the world of vision-and-language models, but what makes it tick? This post unveils the magic behind it: Curious to learn more? ➡️ hubs.la/Q02sWg4R0 #Moondream2 #VisionLanguageModels #AIInnovation

Discover a novel "black-box forgetting" technique that redefines AI model optimization. #AI #VisionLanguageModels #MachineLearning azorobotics.com/news.aspx?News…

[1/6] 🚀 Exciting News! Our paper has been accepted at #CVPR2025! 🎉 We’re thrilled to introduce "ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models" 📄 ybendou.github.io/ProKeR/ #VisionLanguageModels #FewShotLearning #ComputerVision
Headed to #GEOINT2025? Don’t miss Dr. Brian Clipp’s session on #VisionLanguageModels: 🧠 Explainable segmentation, tracking & detection 🧰 Compositional programming for analyst queries 🗓️ May 19 | 7:30 AM ow.ly/mE4r50VR5lI #GeospatialIntelligence
