#visionlanguagemodels kết quả tìm kiếm

Shenhao Wang

3 thg 10

🚀✨ Exciting Publication from @UrbanAI_Lab The paper “Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning” has been accepted to EMNLP 2025! Link: arxiv.org/pdf/2410.16162 #UrbanAI #VisionLanguageModels

leonliuzx

@leonliuzx

17 thg 10

🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P… #OCR #AI #VisionLanguageModels

leonliuzx's tweet image. 🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P…

#OCR #AI #VisionLanguageModels

Vlad Ruso PhD

@vlruso

6 thg 9

Hugging Face FineVision: The Ultimate Multimodal Dataset for Vision-Language Model Training #FineVision #VisionLanguageModels #HuggingFace #AIResearch #MultimodalDataset itinai.com/hugging-face-f… Understanding the Impact of FineVision on Vision-Language Models Hugging Face has …

vlruso's tweet image. Hugging Face FineVision: The Ultimate Multimodal Dataset for Vision-Language Model Training #FineVision #VisionLanguageModels #HuggingFace #AIResearch #MultimodalDataset
itinai.com/hugging-face-f…

Understanding the Impact of FineVision on Vision-Language Models

Hugging Face has …

JohnSnowLabs

@JohnSnowLabs

30 thg 8

Read here: hubs.li/Q03Fs2V30 #MedicalAI #VisionLanguageModels #HealthcareAI #MedicalImaging #ClinicalDecisionSupport #GenerativeAI

JohnSnowLabs's tweet image. Read here: hubs.li/Q03Fs2V30

#MedicalAI #VisionLanguageModels #HealthcareAI #MedicalImaging #ClinicalDecisionSupport #GenerativeAI

Woojin Kim

@woojinrad

31 thg 7, 2024

1/ 🗑️ in, 🗑️ out With advances in #VisionLanguageModels, there is growing interest in automated #RadiologyReporting. It's great to see such high research interest, BUT... 🚧 Technique seems intriguing, but the figures raise serious doubts about this paper's merit. 🧵 👇

woojinrad's tweet image. 1/ 🗑️ in, 🗑️ out
With advances in #VisionLanguageModels, there is growing interest in automated #RadiologyReporting. It's great to see such high research interest, BUT...
🚧 Technique seems intriguing, but the figures raise serious doubts about this paper's merit. 🧵 👇

Eli Schwartz

@Eli_Schwartz

25 thg 3

Did you know most vision-language models (like Claude, OpenAI, Gemini) totally suck at reading analog clocks ⏰? (Except Molmo—it’s actually trained for that) #AI #MachineLearning #VisionLanguageModels #vibecoding

Eli_Schwartz's tweet image. Did you know most vision-language models (like Claude, OpenAI, Gemini) totally suck at reading analog clocks ⏰? (Except Molmo—it’s actually trained for that)
#AI #MachineLearning #VisionLanguageModels #vibecoding

Hans Willert

@HWillert

2 thg 10

Using #VisionLanguageModels to Process Millions of Documents | Towards Data Science towardsdatascience.com/using-vision-l…

HWillert's tweet card. Learn how to effectively apply vision language models to problem solving

Using Vision Language Models to Process Millions of Documents | Towards Data Science

Nguồn: towardsdatascience.com

Debora Nozza

@debora_nozza

7 thg 10, 2024

Last week, I gave an invited talk at the 1st workshop on critical evaluation of generative models and their impact on society at #ECCV2024, focusing on unmasking and tackling bias in #VisionLanguageModels. Thanks to the organizers for the invitation!

debora_nozza's tweet image. Last week, I gave an invited talk at the 1st workshop on critical evaluation of generative models and their impact on society at #ECCV2024, focusing on unmasking and tackling bias in #VisionLanguageModels.

Thanks to the organizers for the invitation!

ArcGIS Pro

@ArcGISPro

22 thg 8

Say goodbye to manual data labeling and hello to instant insights! Our new #VisionLanguageModels can extract features from aerial images using just simple prompts. Simply upload an image and ask a question such as "What do you see?" Learn more: ow.ly/FsbV50WKnC3

ArcGISPro's tweet image. Say goodbye to manual data labeling and hello to instant insights! Our new #VisionLanguageModels can extract features from aerial images using just simple prompts. Simply upload an image and ask a question such as "What do you see?"
Learn more: ow.ly/FsbV50WKnC3

Vlad Ruso PhD

@vlruso

2 thg 9

Apple’s FastVLM: 85x Faster Hybrid Vision Encoder Revolutionizing AI Models #FastVLM #VisionLanguageModels #AIInnovation #MultimodalProcessing #AppleTech itinai.com/apples-fastvlm… Apple has made a significant leap in the field of Vision Language Models (VLMs) with the introducti…

vlruso's tweet image. Apple’s FastVLM: 85x Faster Hybrid Vision Encoder Revolutionizing AI Models #FastVLM #VisionLanguageModels #AIInnovation #MultimodalProcessing #AppleTech
itinai.com/apples-fastvlm…

Apple has made a significant leap in the field of Vision Language Models (VLMs) with the introducti…

Abhinav Girdhar

@AbhinavGirdhar

6 thg 5, 2024

1/5 Can feedback improve semantic grounding in large vision-language models? A recent study delves into this question, exploring the potential of feedback in enhancing the alignment between visual and textual representations. #AI #VisionLanguageModels

AbhinavGirdhar's tweet image. 1/5
Can feedback improve semantic grounding in large vision-language models? A recent study delves into this question, exploring the potential of feedback in enhancing the alignment between visual and textual representations. #AI #VisionLanguageModels

Ashshak_off_

@AshshakO

2 thg 3

Alhamdulillah! Thrilled to share that our work "O-TPT" has been accepted at #CVPR2025! Big thanks to my supervisor and co-authors for the support! thread(1/n) #MachineLearning #VisionLanguageModels #CVPR2025

AshshakO's tweet image. Alhamdulillah! Thrilled to share that our work "O-TPT" has been accepted at #CVPR2025! Big thanks to my supervisor and co-authors for the support!
thread(1/n)
#MachineLearning #VisionLanguageModels #CVPR2025

GoatStack.AI

@GoatstackAI

8 thg 3, 2024

Exploring the limitations of Vision-Language Models (VLMs) like GPT-4V in complex visual reasoning tasks. #AI #VisionLanguageModels #DeductiveReasoning

GoatstackAI's tweet image. Exploring the limitations of Vision-Language Models (VLMs) like GPT-4V in complex visual reasoning tasks. #AI #VisionLanguageModels #DeductiveReasoning

M. Akhtar Munir

@akhtarTalks

27 thg 2

Thrilled to share that we have two papers accepted at #CVPR2025! 🚀 A big thank you to all the collaborators for their contributions. Stay tuned for more updates! Titles in the thread (1/n) #CVPR #VisionLanguageModels #ModelCalibration #EarthObservation

akhtarTalks's tweet image. Thrilled to share that we have two papers accepted at #CVPR2025! 🚀
A big thank you to all the collaborators for their contributions. Stay tuned for more updates!

Titles in the thread (1/n)

#CVPR #VisionLanguageModels #ModelCalibration #EarthObservation

Abhinav Girdhar

@AbhinavGirdhar

7 thg 5, 2024

1/5 BRAVE: A groundbreaking approach to enhancing vision-language models (VLMs)! By combining features from multiple vision encoders, BRAVE creates a more versatile and robust visual representation. #AI #VisionLanguageModels

AbhinavGirdhar's tweet image. 1/5
BRAVE: A groundbreaking approach to enhancing vision-language models (VLMs)! By combining features from multiple vision encoders, BRAVE creates a more versatile and robust visual representation. #AI #VisionLanguageModels

JohnSnowLabs

@JohnSnowLabs

9 thg 8

GenAINews.co

@genainewstop

22 thg 8

Exciting news from Liquid AI! 🚀 Introducing LFM2-VL: super-fast, open-weight vision-language models perfect for low-latency, on-device deployment. Revolutionizing AI for smartphones, laptops, wearables, and more! #AI #VisionLanguageModels marktechpost.com/2025/08/20/liq…

Liquid AI Releases LFM2-VL: Super-Fast, Open-Weight Vision-Language Models Designed for Low-Latency...

Nguồn: marktechpost.com

GoatStack.AI

@GoatstackAI

13 thg 5, 2024

Exploring the capabilities of multimodal LLMs in visual network analysis. #LargeLanguageModels #VisualNetworkAnalysis #VisionLanguageModels

GoatStack.AI

@GoatstackAI

9 thg 3, 2024

Investigating vision-language models on Raven's Progressive Matrices showcases gaps in visual deductive reasoning. #VisualReasoning #DeductiveReasoning #VisionLanguageModels

GoatstackAI's tweet image. Investigating vision-language models on Raven's Progressive Matrices showcases gaps in visual deductive reasoning. #VisualReasoning #DeductiveReasoning #VisionLanguageModels

HackerNoon | Learn Any Technology

@hackernoon

1 thg 10

This paper summarizes a comprehensive framework for typographic attacks, proving their effectiveness and transferability against Vision-LLMs like LLaVA - hackernoon.com/future-of-ad-s… #visionlanguagemodels #visionllms

hackernoon's tweet card. This paper summarizes a comprehensive framework for typographic attacks, proving their effectiveness and transferability against Vision-LLMs like LLaVA

Future of AD Security: Addressing Limitations and Ethical Concerns in Typographic Attack Research |...

Nguồn: hackernoon.com

leonliuzx

@leonliuzx

17 thg 10

🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P… #OCR #AI #VisionLanguageModels

Himanshu

@WaghHimanshu

8 thg 10

A key challenge for VLMs is "grounding" - correctly linking text to visual elements. The latest research uses techniques like bounding box annotations and negative captioning to teach models to see and understand with greater accuracy. #DeepLearning #AI #VisionLanguageModels

Ahmed Masry @ COLM 2025 🇨🇦

@Ahmed_Masry97

7 thg 10

💻 We have open-sourced the code at github.com/ServiceNow/Big… 🙌 This was a collaboration effort between @ServiceNowRSRCH , @Mila_Quebec , and @YorkUniversity. #COLM2025 #AI #VisionLanguageModels #Charts #BigCharts

Ahmed_Masry97's tweet card. Contribute to ServiceNow/BigCharts-R1 development by creating an account on GitHub.

GitHub - ServiceNow/BigCharts-R1

Nguồn: github.com

Shenhao Wang

@ShenhaoWang_AI

3 thg 10

abhishekjariwala

@abhijariwalaa

3 thg 10

New research reveals a paradigm-shifting approach to data curation in vision-language models, unlocking their intrinsic capabilities for more accurate and efficient AI understanding. A big step forward in bridging visual and textual data! 🤖📊 #AI #VisionLanguageModels

Hans Willert

@HWillert

2 thg 10

Using #VisionLanguageModels to Process Millions of Documents | Towards Data Science towardsdatascience.com/using-vision-l…

Using Vision Language Models to Process Millions of Documents | Towards Data Science

Nguồn: towardsdatascience.com

HackerNoon | Learn Any Technology

@hackernoon

1 thg 10

Future of AD Security: Addressing Limitations and Ethical Concerns in Typographic Attack Research |...

Nguồn: hackernoon.com

HackerNoon | Learn Any Technology

@hackernoon

1 thg 10

This article presents an empirical study on the effectiveness and transferability of typographic attacks against major Vision-LLMs using AD-specific datasets. - hackernoon.com/empirical-stud… #visionlanguagemodels #visionllms

hackernoon's tweet card. This article presents an empirical study on the effectiveness and transferability of typographic attacks against major Vision-LLMs using AD-specific datasets.

Empirical Study: Evaluating Typographic Attack Effectiveness Against Vision-LLMs in AD Systems |...

Nguồn: hackernoon.com

HackerNoon | Learn Any Technology

@hackernoon

1 thg 10

This article explores the physical realization of typographic attacks, categorizing their deployment into background and foreground elements - hackernoon.com/foreground-vs-… #visionlanguagemodels #visionllms

hackernoon's tweet card. This article explores the physical realization of typographic attacks, categorizing their deployment into background and foreground elements

Foreground vs. Background: Analyzing Typographic Attack Placement in Autonomous Driving Systems |...

Nguồn: hackernoon.com

AGI Talent

@mctalentowen

18 thg 9

EdiVal-Agent enables scalable, object-centric evaluation of multi-turn image editing, improving instruction following, content consistency, and visual quality assessment. #EdiValAgent #MultiTurnEditing #VisionLanguageModels #ObjectCentric #AIResearch @TianyuChen @MingyuanZhou

mctalentowen's tweet image. EdiVal-Agent enables scalable, object-centric evaluation of multi-turn image editing, improving instruction following, content consistency, and visual quality assessment. #EdiValAgent #MultiTurnEditing #VisionLanguageModels #ObjectCentric #AIResearch @TianyuChen @MingyuanZhou

Vimal Singh

@VimalAITech

10 thg 9

Interesting to see fast development in #VisionLanguageModels

Vlad Ruso PhD

@vlruso

6 thg 9

DaveAI

@Socio_Graph

5 thg 9

Vision Language Models (VLMs) are transforming AI—merging computer vision and NLP to let machines see and understand. #AI #MultimodalAI #VisionLanguageModels @YourStoryCo yourstory.com/ai-story/how-v…

yourstory.com

How vision language models are shaping multimodal AI

Recent years have witnessed AI evolve beyond single-mode systems to generate multiple streams of information for multiple modalities, including images, text, audio, video, and more, that too, within...

Nguồn: yourstory.com

Không có kết quả nào cho "#visionlanguagemodels"

Vlad Ruso PhD

@vlruso

6 thg 9

AGI Talent

@mctalentowen

18 thg 9

JohnSnowLabs

@JohnSnowLabs

30 thg 8

Read here: hubs.li/Q03Fs2V30 #MedicalAI #VisionLanguageModels #HealthcareAI #MedicalImaging #ClinicalDecisionSupport #GenerativeAI

ArcGIS Pro

@ArcGISPro

22 thg 8

Vlad Ruso PhD

@vlruso

2 thg 9

Woojin Kim

@woojinrad

31 thg 7, 2024

Abhinav Girdhar

@AbhinavGirdhar

6 thg 5, 2024

JohnSnowLabs

@JohnSnowLabs

9 thg 8

Abhinav Girdhar

@AbhinavGirdhar

7 thg 5, 2024

IEEE Engineering Medicine and Biology Society

@IEEEembs

23 thg 7

📢 Call for Papers — JBHI Special Issue: “Transparent Large #VisionLanguageModels in Healthcare” Seeking research on: ✔️ Explainable VLMs ✔️ Medical image-text alignment ✔️ Fair & interpretable AI 📅 Deadline: Sep 30, 2025 🔗 Info: tinyurl.com/4a7d69t2

IEEEembs's tweet image. 📢 Call for Papers — JBHI Special Issue: “Transparent Large #VisionLanguageModels in Healthcare”

Seeking research on:
✔️ Explainable VLMs
✔️ Medical image-text alignment
✔️ Fair &amp; interpretable AI

📅 Deadline: Sep 30, 2025
🔗 Info: tinyurl.com/4a7d69t2

GoatStack.AI

@GoatstackAI

8 thg 3, 2024

Exploring the limitations of Vision-Language Models (VLMs) like GPT-4V in complex visual reasoning tasks. #AI #VisionLanguageModels #DeductiveReasoning

leonliuzx

@leonliuzx

17 thg 10

🚀 Exciting news! PaddleOCR-VL has rocketed to #1 on @huggingface Trending in just 16 hours! Dive in: huggingface.co/PaddlePaddle/P… #OCR #AI #VisionLanguageModels

GoatStack.AI

@GoatstackAI

13 thg 5, 2024

Exploring the capabilities of multimodal LLMs in visual network analysis. #LargeLanguageModels #VisualNetworkAnalysis #VisionLanguageModels

Data Science Dojo

@DataScienceDojo

16 thg 4, 2024

Moondream 2 is a superstar in the world of vision-and-language models, but what makes it tick? This post unveils the magic behind it: Curious to learn more? ➡️ hubs.la/Q02sWg4R0 #Moondream2 #VisionLanguageModels #AIInnovation

DataScienceDojo's tweet image. Moondream 2 is a superstar in the world of vision-and-language models, but what makes it tick? This post unveils the magic behind it:

Curious to learn more? ➡️ hubs.la/Q02sWg4R0

#Moondream2 #VisionLanguageModels #AIInnovation

M. Akhtar Munir

@akhtarTalks

27 thg 2

AZoRobotics

@AZoRobotics

10 thg 12

Discover a novel "black-box forgetting" technique that redefines AI model optimization. #AI #VisionLanguageModels #MachineLearning azorobotics.com/news.aspx?News…

AZoRobotics's tweet image. Discover a novel "black-box forgetting" technique that redefines AI model optimization. #AI #VisionLanguageModels #MachineLearning
azorobotics.com/news.aspx?News…

Ashshak_off_

@AshshakO

2 thg 3

Yassir Bendou

@YBendou

27 thg 2

[1/6] 🚀 Exciting News! Our paper has been accepted at hashtag #CVPR2025 ! 🎉 We’re thrilled to introduce "ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models" 📄 ybendou.github.io/ProKeR/ #VisionLanguageModels #FewShotLearning #ComputerVision

YBendou's tweet image. [1/6] 🚀 Exciting News! Our paper has been accepted at hashtag #CVPR2025 ! 🎉

We’re thrilled to introduce "ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models"

📄 ybendou.github.io/ProKeR/

#VisionLanguageModels #FewShotLearning #ComputerVision

Kitware

@Kitware

12 thg 5

Headed to #GEOINT2025? Don’t miss Dr. Brian Clipp’s session on #VisionLanguageModels: 🧠 Explainable segmentation, tracking & detection 🧰 Compositional programming for analyst queries 🗓️ May 19 | 7:30 AM ow.ly/mE4r50VR5lI #GeospatialIntelligence

Kitware's tweet image. Headed to #GEOINT2025?

Don’t miss Dr. Brian Clipp’s session on #VisionLanguageModels:
🧠 Explainable segmentation, tracking &amp; detection
🧰 Compositional programming for analyst queries
🗓️ May 19 | 7:30 AM

ow.ly/mE4r50VR5lI #GeospatialIntelligence

GoatStack.AI

@GoatstackAI

9 thg 3, 2024

Investigating vision-language models on Raven's Progressive Matrices showcases gaps in visual deductive reasoning. #VisualReasoning #DeductiveReasoning #VisionLanguageModels

Something went wrong.

United States Trends

1. #AEWWrestleDream 38.3K posts
2. Kentucky 23.5K posts
3. Lincoln Riley 1,836 posts
4. Arch 24.7K posts
5. Stoops 4,006 posts
6. Sark 3,992 posts
7. #UFCVancouver 28.1K posts
8. Texas 163K posts
9. Bama 13.7K posts
10. Notre Dame 13.3K posts
11. No Kings 1.66M posts
12. Christian Gray N/A
13. #RollTide 5,547 posts
14. Tennessee 49K posts
15. #HookEm 2,343 posts
16. Heupel 1,971 posts
17. Sam Rivers 14.6K posts
18. Iowa 17.2K posts
19. Chito 6,958 posts
20. Tim Banks 1,042 posts