#visionlanguagemodels search results

leonliuzx

. 10. 17.

🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P… #OCR #AI #VisionLanguageModels

Vlad Ruso PhD

@vlruso

. 9. 6.

Hugging Face FineVision: The Ultimate Multimodal Dataset for Vision-Language Model Training #FineVision #VisionLanguageModels #HuggingFace #AIResearch #MultimodalDataset itinai.com/hugging-face-f… Understanding the Impact of FineVision on Vision-Language Models Hugging Face has …

🚀✨ Exciting Publication from @UrbanAI_Lab The paper “Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning” has been accepted to EMNLP 2025! Link: arxiv.org/pdf/2410.16162 #UrbanAI #VisionLanguageModels

JohnSnowLabs

@JohnSnowLabs

. 8. 30.

Read here: hubs.li/Q03Fs2V30 #MedicalAI #VisionLanguageModels #HealthcareAI #MedicalImaging #ClinicalDecisionSupport #GenerativeAI

Woojin Kim

@woojinrad

2024. 7. 31.

1/ 🗑️ in, 🗑️ out With advances in #VisionLanguageModels, there is growing interest in automated #RadiologyReporting. It's great to see such high research interest, BUT... 🚧 Technique seems intriguing, but the figures raise serious doubts about this paper's merit. 🧵 👇

Eli Schwartz

@Eli_Schwartz

. 3. 25.

Did you know most vision-language models (like Claude, OpenAI, Gemini) totally suck at reading analog clocks ⏰? (Except Molmo—it’s actually trained for that) #AI #MachineLearning #VisionLanguageModels #vibecoding

Hans Willert

@HWillert

. 10. 2.

Using #VisionLanguageModels to Process Millions of Documents | Towards Data Science towardsdatascience.com/using-vision-l…

Learn how to effectively apply vision language models to problem solving

Source: towardsdatascience.com

Debora Nozza

@debora_nozza

2024. 10. 7.

Last week, I gave an invited talk at the 1st workshop on critical evaluation of generative models and their impact on society at #ECCV2024, focusing on unmasking and tackling bias in #VisionLanguageModels. Thanks to the organizers for the invitation!

Vlad Ruso PhD

@vlruso

. 9. 2.

Apple’s FastVLM: 85x Faster Hybrid Vision Encoder Revolutionizing AI Models #FastVLM #VisionLanguageModels #AIInnovation #MultimodalProcessing #AppleTech itinai.com/apples-fastvlm… Apple has made a significant leap in the field of Vision Language Models (VLMs) with the introducti…

Abhinav Girdhar

@AbhinavGirdhar

2024. 5. 6.

1/5 Can feedback improve semantic grounding in large vision-language models? A recent study delves into this question, exploring the potential of feedback in enhancing the alignment between visual and textual representations. #AI #VisionLanguageModels

Ashshak_off_

@AshshakO

. 3. 2.

Alhamdulillah! Thrilled to share that our work "O-TPT" has been accepted at #CVPR2025! Big thanks to my supervisor and co-authors for the support! thread(1/n) #MachineLearning #VisionLanguageModels #CVPR2025

GoatStack.AI

@GoatstackAI

2024. 3. 8.

Exploring the limitations of Vision-Language Models (VLMs) like GPT-4V in complex visual reasoning tasks. #AI #VisionLanguageModels #DeductiveReasoning

M. Akhtar Munir

@akhtarTalks

. 2. 27.

Thrilled to share that we have two papers accepted at #CVPR2025! 🚀 A big thank you to all the collaborators for their contributions. Stay tuned for more updates! Titles in the thread (1/n) #CVPR #VisionLanguageModels #ModelCalibration #EarthObservation

Abhinav Girdhar

@AbhinavGirdhar

2024. 5. 7.

1/5 BRAVE: A groundbreaking approach to enhancing vision-language models (VLMs)! By combining features from multiple vision encoders, BRAVE creates a more versatile and robust visual representation. #AI #VisionLanguageModels

JohnSnowLabs

@JohnSnowLabs

. 8. 9.

ArcGIS Pro

@ArcGISPro

. 8. 22.

Say goodbye to manual data labeling and hello to instant insights! Our new #VisionLanguageModels can extract features from aerial images using just simple prompts. Simply upload an image and ask a question such as "What do you see?" Learn more: ow.ly/FsbV50WKnC3

GenAINews.co

@genainewstop

. 8. 22.

Exciting news from Liquid AI! 🚀 Introducing LFM2-VL: super-fast, open-weight vision-language models perfect for low-latency, on-device deployment. Revolutionizing AI for smartphones, laptops, wearables, and more! #AI #VisionLanguageModels marktechpost.com/2025/08/20/liq…

Liquid AI Releases LFM2-VL: Super-Fast, Open-Weight Vision-Language Models Designed for Low-Latency...

Source: marktechpost.com

GoatStack.AI

@GoatstackAI

2024. 5. 13.

Exploring the capabilities of multimodal LLMs in visual network analysis. #LargeLanguageModels #VisualNetworkAnalysis #VisionLanguageModels

GoatStack.AI

@GoatstackAI

2024. 3. 9.

Investigating vision-language models on Raven's Progressive Matrices showcases gaps in visual deductive reasoning. #VisualReasoning #DeductiveReasoning #VisionLanguageModels

HackerNoon | Learn Any Technology

@hackernoon

. 10. 1.

This paper summarizes a comprehensive framework for typographic attacks, proving their effectiveness and transferability against Vision-LLMs like LLaVA - hackernoon.com/future-of-ad-s… #visionlanguagemodels #visionllms

Future of AD Security: Addressing Limitations and Ethical Concerns in Typographic Attack Research |...

Source: hackernoon.com

Himanshu

@WaghHimanshu

. 10. 8.

A key challenge for VLMs is "grounding" - correctly linking text to visual elements. The latest research uses techniques like bounding box annotations and negative captioning to teach models to see and understand with greater accuracy. #DeepLearning #AI #VisionLanguageModels

Ahmed Masry @ COLM 2025 🇨🇦

@Ahmed_Masry97

. 10. 7.

💻 We have open-sourced the code at github.com/ServiceNow/Big… 🙌 This was a collaboration effort between @ServiceNowRSRCH , @Mila_Quebec , and @YorkUniversity. #COLM2025 #AI #VisionLanguageModels #Charts #BigCharts

GitHub - ServiceNow/BigCharts-R1

Source: github.com

Shenhao Wang

@ShenhaoWang_AI

. 10. 3.

abhishekjariwala

@abhijariwalaa

. 10. 3.

New research reveals a paradigm-shifting approach to data curation in vision-language models, unlocking their intrinsic capabilities for more accurate and efficient AI understanding. A big step forward in bridging visual and textual data! 🤖📊 #AI #VisionLanguageModels

HackerNoon | Learn Any Technology

@hackernoon

. 10. 1.

This article presents an empirical study on the effectiveness and transferability of typographic attacks against major Vision-LLMs using AD-specific datasets. - hackernoon.com/empirical-stud… #visionlanguagemodels #visionllms

Empirical Study: Evaluating Typographic Attack Effectiveness Against Vision-LLMs in AD Systems |...

Source: hackernoon.com

HackerNoon | Learn Any Technology

@hackernoon

. 10. 1.

This article explores the physical realization of typographic attacks, categorizing their deployment into background and foreground elements - hackernoon.com/foreground-vs-… #visionlanguagemodels #visionllms

Foreground vs. Background: Analyzing Typographic Attack Placement in Autonomous Driving Systems |...

Source: hackernoon.com

AGI Talent

@mctalentowen

. 9. 18.

EdiVal-Agent enables scalable, object-centric evaluation of multi-turn image editing, improving instruction following, content consistency, and visual quality assessment. #EdiValAgent #MultiTurnEditing #VisionLanguageModels #ObjectCentric #AIResearch @TianyuChen @MingyuanZhou

Vimal Singh

@VimalAITech

. 9. 10.

Interesting to see fast development in #VisionLanguageModels

DaveAI

@Socio_Graph

. 9. 5.

Vision Language Models (VLMs) are transforming AI—merging computer vision and NLP to let machines see and understand. #AI #MultimodalAI #VisionLanguageModels @YourStoryCo yourstory.com/ai-story/how-v…

How vision language models are shaping multimodal AI

Recent years have witnessed AI evolve beyond single-mode systems to generate multiple streams of information for multiple modalities, including images, text, audio, video, and more, that too, within...

Source: yourstory.com

IEEE Engineering Medicine and Biology Society

@IEEEembs

. 7. 23.

📢 Call for Papers — JBHI Special Issue: “Transparent Large #VisionLanguageModels in Healthcare” Seeking research on: ✔️ Explainable VLMs ✔️ Medical image-text alignment ✔️ Fair & interpretable AI 📅 Deadline: Sep 30, 2025 🔗 Info: tinyurl.com/4a7d69t2

Data Science Dojo

@DataScienceDojo

2024. 4. 16.

Moondream 2 is a superstar in the world of vision-and-language models, but what makes it tick? This post unveils the magic behind it: Curious to learn more? ➡️ hubs.la/Q02sWg4R0 #Moondream2 #VisionLanguageModels #AIInnovation

AZoRobotics

@AZoRobotics

. 12. 10.

Discover a novel "black-box forgetting" technique that redefines AI model optimization. #AI #VisionLanguageModels #MachineLearning azorobotics.com/news.aspx?News…

Yassir Bendou

@YBendou

. 2. 27.

[1/6] 🚀 Exciting News! Our paper has been accepted at #CVPR2025! 🎉 We’re thrilled to introduce "ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models" 📄 ybendou.github.io/ProKeR/ #VisionLanguageModels #FewShotLearning #ComputerVision

Kitware

@Kitware

. 5. 12.

Headed to #GEOINT2025? Don’t miss Dr. Brian Clipp’s session on #VisionLanguageModels: 🧠 Explainable segmentation, tracking & detection 🧰 Compositional programming for analyst queries 🗓️ May 19 | 7:30 AM ow.ly/mE4r50VR5lI #GeospatialIntelligence

George Z Lin

@gzlin

2024. 4. 2.

Mini-Gemini, an innovative #framework from CUHK, optimizes #VisionLanguageModels with a dual-encoder, expanded data, and #LargeLanguageModels. #AI #ComputerVision arxiv.org/abs/2403.18814
