#visionlanguagemodels search results
Hugging Face FineVision: The Ultimate Multimodal Dataset for Vision-Language Model Training #FineVision #VisionLanguageModels #HuggingFace #AIResearch #MultimodalDataset itinai.com/hugging-face-f… Understanding the Impact of FineVision on Vision-Language Models Hugging Face has …
Read here: hubs.li/Q03Fs2V30 #MedicalAI #VisionLanguageModels #HealthcareAI #MedicalImaging #ClinicalDecisionSupport #GenerativeAI
1/ 🗑️ in, 🗑️ out With advances in #VisionLanguageModels, there is growing interest in automated #RadiologyReporting. It's great to see such high research interest, BUT... 🚧 Technique seems intriguing, but the figures raise serious doubts about this paper's merit. 🧵 👇
🚀✨ Exciting Publication from @UrbanAI_Lab The paper “Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning” has been accepted to EMNLP 2025! Link: arxiv.org/pdf/2410.16162 #UrbanAI #VisionLanguageModels
Say goodbye to manual data labeling and hello to instant insights! Our new #VisionLanguageModels can extract features from aerial images using simple prompts. Just upload an image and ask a question such as "What do you see?" Learn more: ow.ly/FsbV50WKnC3
Last week, I gave an invited talk at the 1st workshop on critical evaluation of generative models and their impact on society at #ECCV2024, focusing on unmasking and tackling bias in #VisionLanguageModels. Thanks to the organizers for the invitation!
1/5 Can feedback improve semantic grounding in large vision-language models? A recent study delves into this question, exploring the potential of feedback in enhancing the alignment between visual and textual representations. #AI #VisionLanguageModels
Using #VisionLanguageModels to Process Millions of Documents | Towards Data Science towardsdatascience.com/using-vision-l…
1/5 BRAVE: A groundbreaking approach to enhancing vision-language models (VLMs)! By combining features from multiple vision encoders, BRAVE creates a more versatile and robust visual representation. #AI #VisionLanguageModels
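The core idea of combining features from multiple vision encoders can be sketched as a toy concatenate-and-project step. This is an illustration only, not BRAVE's actual bridging mechanism: the projection here is random, whereas in practice it would be learned, and the input arrays are stand-ins for real encoder outputs.

```python
import numpy as np

def combine_encoder_features(features, d_out=16, seed=0):
    """Toy sketch: concatenate per-encoder features, then project.

    features: list of 1-D arrays, one per vision encoder (dims may differ).
    The projection matrix is random here; in a real system it is learned.
    """
    concat = np.concatenate(features)
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((concat.shape[0], d_out)) / np.sqrt(concat.shape[0])
    return concat @ proj

f1 = np.ones(8)   # stand-in for encoder A's features
f2 = np.ones(12)  # stand-in for encoder B's features
print(combine_encoder_features([f1, f2]).shape)  # (16,)
```

The point of the sketch: encoders with different feature dimensions can still feed one fixed-size representation, which is what lets a single language model consume several vision backbones.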
Did you know most vision-language models (like Claude, OpenAI's GPT models, Gemini) totally suck at reading analog clocks ⏰? (Except Molmo—it’s actually trained for that) #AI #MachineLearning #VisionLanguageModels #vibecoding
Exploring the limitations of Vision-Language Models (VLMs) like GPT-4V in complex visual reasoning tasks. #AI #VisionLanguageModels #DeductiveReasoning
Apple’s FastVLM: 85x Faster Hybrid Vision Encoder Revolutionizing AI Models #FastVLM #VisionLanguageModels #AIInnovation #MultimodalProcessing #AppleTech itinai.com/apples-fastvlm… Apple has made a significant leap in the field of Vision Language Models (VLMs) with the introducti…
Alhamdulillah! Thrilled to share that our work "O-TPT" has been accepted at #CVPR2025! Big thanks to my supervisor and co-authors for the support! thread(1/n) #MachineLearning #VisionLanguageModels #CVPR2025
Thrilled to share that we have two papers accepted at #CVPR2025! 🚀 A big thank you to all the collaborators for their contributions. Stay tuned for more updates! Titles in the thread (1/n) #CVPR #VisionLanguageModels #ModelCalibration #EarthObservation
Read more: hubs.li/Q03C0tJY0 #RadiologyAI #VisionLanguageModels #MedicalImaging #ClinicalAI #HealthcareAI #GenerativeAI #JohnSnowLabs
Exploring the capabilities of multimodal LLMs in visual network analysis. #LargeLanguageModels #VisualNetworkAnalysis #VisionLanguageModels
Investigating vision-language models on Raven's Progressive Matrices showcases gaps in visual deductive reasoning. #VisualReasoning #DeductiveReasoning #VisionLanguageModels
Introducing a comprehensive benchmark and large-scale dataset to evaluate and improve LVLMs' abilities in multi-turn and multi-image conversations. #DialogUnderstanding #VisionLanguageModels #MultiImageConversations
LuminX is disrupting warehouse operations with edge-based Vision Language Models, turning manual tasks into automated systems. $5.5M raised. businesspartnermagazine.com/luminx-secures… #AIForLogistics #VisionLanguageModels
📣 Exciting new research alert! Learn about Voila-A, a groundbreaking approach aligning vision-language models with user gaze attention. Enhance AI interpretability and effectiveness in real-world scenarios. Explore the paper at: bit.ly/3vJ2aN9 #AI #VisionLanguageModels
PerSense-D is a new benchmark dataset for personalized dense image segmentation, advancing AI accuracy in crowded visual environments. - hackernoon.com/new-dataset-pe… #visionlanguagemodels #denseimagesegmentation
PerSense's training-free one-shot segmentation framework uses adaptive prompts, density maps, and VLMs for dense image interpretation. - hackernoon.com/persense-deliv… #visionlanguagemodels #denseimagesegmentation
PerSense is a model-agnostic, training-free framework for one-shot personalized instance segmentation in dense images, driven by density and vision-language cues. - hackernoon.com/persense-a-one… #visionlanguagemodels #denseimagesegmentation
2/4 The score is computed in three stages: baseline accuracy, degradation under noise, degradation under crafted attacks, then blended with tunable weights w₁ + w₂ = 1 to reflect specific risk profiles. #VisionLanguageModels
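The three-stage score described above can be sketched as follows. The exact blending formula is an assumption (the tweet only specifies the three stages and the constraint w₁ + w₂ = 1); the function name and inputs are hypothetical.

```python
def robustness_score(acc_clean, acc_noise, acc_attack, w1=0.5, w2=0.5):
    """Blend degradation under noise and under crafted attacks.

    acc_*: accuracies in [0, 1]. w1 + w2 must equal 1 so the blend is a
    convex combination of the two degradation terms, letting the weights
    reflect a specific risk profile (e.g. attack-heavy deployments).
    """
    assert abs(w1 + w2 - 1.0) < 1e-9, "weights must sum to 1"
    deg_noise = acc_clean - acc_noise      # stage 2: degradation under noise
    deg_attack = acc_clean - acc_attack    # stage 3: degradation under attacks
    return acc_clean - (w1 * deg_noise + w2 * deg_attack)

# Attack-weighted profile: w2 = 0.7 penalizes adversarial degradation more.
print(robustness_score(0.90, 0.80, 0.60, w1=0.3, w2=0.7))  # ≈ 0.66
```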
(3/3) 🤝 Open to #Collaboration and #Internship Opportunities on: 🧠 Data-centric AI 🤖 Vision-language Model training and evaluation Shoutout to amazing co-authors @JoLiang17 @zhoutianyi ! #VisionLanguageModels #DCAI #DataCentric #ResponsibleAI #ICCV #AI #ML #ComputerVision
🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P… #OCR #AI #VisionLanguageModels
A key challenge for VLMs is "grounding" - correctly linking text to visual elements. The latest research uses techniques like bounding box annotations and negative captioning to teach models to see and understand with greater accuracy. #DeepLearning #AI #VisionLanguageModels
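Bounding-box grounding is typically scored with intersection-over-union against an annotated box; a minimal sketch with illustrative coordinates and the common 0.5 acceptance threshold:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A grounding prediction is usually counted correct when IoU >= 0.5.
pred, gold = (10, 10, 50, 50), (12, 8, 48, 52)
print(iou(pred, gold) >= 0.5)  # True
```

Negative captioning works on the text side instead: the model is trained to score a caption with a swapped object or attribute lower than the true one, so both techniques tighten the text-to-region link from opposite directions.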
💻 We have open-sourced the code at github.com/ServiceNow/Big… 🙌 This was a collaboration effort between @ServiceNowRSRCH , @Mila_Quebec , and @YorkUniversity. #COLM2025 #AI #VisionLanguageModels #Charts #BigCharts
New research reveals a paradigm-shifting approach to data curation in vision-language models, unlocking their intrinsic capabilities for more accurate and efficient AI understanding. A big step forward in bridging visual and textual data! 🤖📊 #AI #VisionLanguageModels
This paper summarizes a comprehensive framework for typographic attacks, proving their effectiveness and transferability against Vision-LLMs like LLaVA - hackernoon.com/future-of-ad-s… #visionlanguagemodels #visionllms
This article presents an empirical study on the effectiveness and transferability of typographic attacks against major Vision-LLMs using AD-specific datasets. - hackernoon.com/empirical-stud… #visionlanguagemodels #visionllms
🚀 Submissions open for VLM4RWD @ NeurIPS 2025! Let’s make VLMs efficient & ready for the real world 🌎💡 🗓️ Deadline: Oct 31 📍 Mexico City 🇲🇽 🔗 openreview.net/group?id=NeurI… #NeurIPS2025 #VLM4RWD #VisionLanguageModels
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale 👥 Haiwen Diao, Mingxuan Li, Silei Wu et al. #VisionLanguageModels #AIResearch #DeepLearning #OpenSource #ComputerVision 🔗 trendtoknow.ai
It may seem confusing now, but it will make sense to everyone in the future. @DeafUmbrella #AIResearch #VisionLanguageModels #MultimodalLLM #AIUnderstanding #VisualLanguageAI #AI
📢 Call for Papers — JBHI Special Issue: “Transparent Large #VisionLanguageModels in Healthcare” Seeking research on: ✔️ Explainable VLMs ✔️ Medical image-text alignment ✔️ Fair & interpretable AI 📅 Deadline: Sep 30, 2025 🔗 Info: tinyurl.com/4a7d69t2
Mini-Gemini, an innovative #framework from CUHK, optimizes #VisionLanguageModels with a dual-encoder, expanded data, and #LargeLanguageModels. #AI #ComputerVision arxiv.org/abs/2403.18814