#benchmarks search results

Microsoft Adoption & Community

@MSFTAdoption

May 8

✨ Taking the stage, @Jared_Spataro here to share his thoughts and insights about “A New Frontier: Building the Future Firm with #AI” #BenchMarks 📈📉📊 #M365Con Day Three Keynote

SLLY nomy :3

@nomyfps

September 22

Cerulean Complete. #VISCOSE #BENCHMARKS #VISCOSEBENCHMARKS

the peak district viking

@thepdviking

October 9

One for you @martgathercole a few of the benchmarks in my area #benchmarking #ordnancesurvey #benchmarks

@iphonesickness

Pixel 10 Pro XL pulls a 95% stability on Wild Life Extreme Stress Test 🔥 Best loop: 3252 | Lowest loop: 3094 Google finally nailed thermal performance – no wild throttling here. 💪📱 #Pixel10Pro #Benchmarks
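For context, 3DMark stress tests report stability as the lowest loop score divided by the best loop score. A minimal sketch of that arithmetic, using only the two loop scores quoted in the post above (the function name is hypothetical and not part of any 3DMark tool):

```python
def frame_rate_stability(loop_scores):
    """Return the lowest loop score as a percentage of the best loop score."""
    best = max(loop_scores)
    worst = min(loop_scores)
    return 100.0 * worst / best


# Only the best and lowest loop scores are quoted in the post;
# the intermediate loops are unknown and not needed for the ratio.
stability = frame_rate_stability([3252, 3094])
print(f"{stability:.1f}%")
```

With these two scores the ratio comes out at roughly 95%, consistent with the figure in the post.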

Gerard Sans | Axiom 🇬🇧

@gerardsans

August 6, 2024

Achieving SOTA AI benchmarks in 2024 AI researchers: nobody is gonna know #ai #benchmarks

Prakash Sangam

@MyTechMusings

October 21, 2024

Some #benchmarks for #Oryon2ndgen #SnapdragonSummit @Qualcomm

Android Authority

@AndroidAuth

September 24

Snapdragon 8 Elite Gen 5 benchmarks are CRAZY! 📈 #qualcommsnapdragon #qualcomm #benchmarks

YellowBanana

@BananaYellow88

September 22

Open Deep Search (ODS) isn’t theory. It’s already outperforming closed labs: - FRAMES: 75.3% - SimpleQA: 88.3% That’s Sentient’s power: research that’s open, benchmarked, and winning. @SentientAGI @sentient_chat #SentientAGI #Benchmarks

AgenticLabs

@AgenticLabsLtd

October 2

Claude Sonnet 4.5 just topped SWE-bench Verified (n=500) with 82% accuracy — outperforming Opus 4.1, Sonnet 4, GPT-5 Codex, GPT-5, and Gemini 2.5 Pro. Software engineering benchmark results are clear: Sonnet 4.5 leads. #AI #SoftwareEngineering #Benchmarks #Craftvideo

AlpacaLips 🐢

@AlpacaLips_

January 31

I camped overnight outside of Microcenter and got my hands on an #RTX5080 #ffxiv #benchmarks #ff14

AlpacaLips 🐢

@AlpacaLips_

January 31

I camped overnight outside of Microcenter to get an RTX 5080! Here's how it runs on #FFXIV youtu.be/D_CgrIaw1nU?si…

Pine St Elementary

@pinestreetelem

August 28

The district ELA team is at Pine testing students ⁦@PalmyraSchools⁩ #benchmarks #meetingstudentswheretheyareat #ThisIsPalmyra #ThisIsPine

Predibase by Rubrik

@predibase

May 28

The fastest open-source LLM #inference stack just landed. Check out our latest #benchmarks that leave vLLM and Fireworks in the dust. 🏎️💨 Our blog has all the juicy details—but here's the 30-sec version: ⚡ Up to 4× lower P50/P95 latency on the same #H100 & L40S GPUs 📈…

NITheCS

@NITheCS

March 26

NITheCS & SU Seminar: 'Benchmarking Benchmarks: A PBFJ Replication Study' - Katherine van der Merwe (SU) - Fri, 28 Mar 2025 @ 13h10-14h10 SAST. Attend online or in person. buff.ly/dcu65QP #financialrisk #benchmarks #interestrate #pbfj #libor #arr #jibar #zaronia

Vals AI

@_valsai

May 9

We just released our evaluation of @MistralAI Medium 3 across all of our benchmarks! 🧵(1/6) #AI #LLM #Benchmarks

Hyke

@0xhyke

August 8

GPT-5 vs Grok 4 - SkateBench → GPT-5: 98.6% accuracy | $0.07 cost → Grok 4: 79% accuracy | $4.86 cost GPT-5 is: → 14× cheaper → More accurate → Much faster This is precision at scale. That is burn rate with lag. #GPT5 #LLM #Benchmarks

Benedikt Koehler

@furukama

September 20

You can now filter the LLM benchmark list by size. Here's the top XS models (< 2B parameters) furukama.com/llm-bob/?size=… #benchmarks #artificialintelligence

Brandon

@tweetbrandon

November 19

FYI, these are the most cited #accuracy #benchmarks for #ai. I asked the top 8 LLMs/chatbots (not sure what to call them) listed in my pinned post: Most Frequently Cited Benchmarks Universal consensus (7-8 mentions): TruthfulQA - Cited by 7 out of 8 LLMs as the gold standard…

New Submissions to TMLR

@TmlrSub

November 27

Nondeterministic Polynomial-time Problem Challenge: An Ever-Scaling Reasoning Benchmark for LLMs openreview.net/forum?id=Xb6d5… #npsolver #npeval #benchmarks

New Change FX

@NewchangeFX

November 27

Great to see our CEO, Paul Lambert, speaking at the ACI FMA webinar “Inside Swaps: Challenges and Possibilities” — alongside leading voices from across the FX industry. #FX #Benchmarks #MarketData #eFX #FXSwaps #Trading #Finance #NewChangeFX

Sakshi k kewat

@sakshi2008

November 27

#Benchmarks off record highs; #Sensex up 200 points, #Nifty50 near 26,250

Brandon

@tweetbrandon

November 26

Great piece on #ai #benchmarks evidentlyai.com/blog/ai-benchm… @explorersofai @morqon

25 AI benchmarks: examples of AI models evaluation

In this blog, we’ll explore AI benchmarks and why we need them. We’ll also provide 25 examples of widely used AI benchmarks for reasoning and language understanding, conversation abilities, coding,...

Source: evidentlyai.com

mark seery

@140ismymax

November 26

Real experience with #LLMs and #Agents, for example in complex tasks like coding, does not match expectations set by #benchmarks/evals. thanks to @ilyasut for saying out loud. 👏 they are still great, but the industry should have clarity about this.

Carlos E. Perez

@IntuitMachine

November 26

Sutskever SuperIntelligence Insights Core insight: Current AI memorizes answers without ever learning the subject. Why models can ace benchmarks while failing trivially—and why more data won't fix it. You've probably noticed this: ChatGPT solves a hard coding problem, then…

Mookie Spitz

@MookieWriter

November 26

#Google #gemini3 surpasses #benchmarks! But what will they do about #searchengine #advertising? open.substack.com/pub/mookiespit…

Demz One🎶🎨🎮🐶🤖demzone.eth|tez @omen_collective

@DemzOneMusic

November 25

⚙️ Category crowns: • Claude 4.5 → strongest coding performance • Gemini 3 Pro → best multimodal/image+text • GPT-5.1 → most balanced + steady outputs We’re in the era of skill-specialized giants. 🤖💻🎨 #AI #Benchmarks

Stephen Oliver

@StephenCOliver

November 25

Many think they're great teachers, but they don't track their dropout rate or have benchmarks. Assuming industry norms equal success is often settling for mediocre. #Education #Benchmarks

Banking and Financal news

@PremRaj81800354

November 24

Equity #benchmarks ended lower as #Nifty fell below 26,000 amid caution over delays in the US-India trade deal. Analysts expect consolidation with key support near 25,600–25,800 and advise selective buying while broader market sentiment stays bearish.

Sports Media Inc. NIL

@SportsMedianet_

November 23

The stats are in! 97% of sports fans take action after seeing OOH ads. When your brand shows up on vivid stadium signage, you drive real results. Ready to get in the game? Any Sport. Any Venue. Any Time. #Benchmarks #SportsMarketing #OOHSports #AnySportAnyVenueAnyTime

Surendra

@drsurendrajar

November 20

🚀 AI Benchmark Face‑off ( November 2025): Gemini 3 Pro → Math & multimodal champ 📊 ChatGPT GPT‑5.1 → Balanced productivity & coding stability 💻 Grok 4.1 → Emotional intelligence & creative flair 🎭 Each shines in its domain—logic, balance, or empathy. #AI #Benchmarks…

AIC ACADEMY

@AIC_Academy

November 20

MASTER CLASS - UNDERSTAND THE BATTERIES OF THE FUTURE Within our #MasterExpertoAutomoción we have a specialization track in electric vehicles with specific training on #electromovilidad, including a hands-on experience in #benchmarks with the #BYD or the #XIAOMI

Association of Security Consultants (ASC)

@assocsecurity

November 20

Day 2 kicks off with the @LondonBuildExpo buzzing louder than ever! Our stand has been buzzing with visitors eager to hear about #ASCmembership, #SABRE, and the upcoming seminar that is shaping new #benchmarks for the industry. Visit us at Stall Q10!

Syed Irfan 🤔

@GutsyHustler

November 20

Does the benchmarks even matter now ?? I guess not, me being an end user, I just need if it is useful for me. #benchmarks

J A Rodriguez CPA, LLC

@jar_cpa

November 19

Benchmarking = Growth mindset 📊 For professional service providers, benchmarking helps identify where your firm stands in profitability, efficiency, and client retention compared to industry standards. Small insights → Big improvements 💼 #Benchmarks #SmartMoneyMoves #JARCPA

ComputerBase 🕊️

@ComputerBase

June 8, 2023

#DiabloIV #Benchmarks - 38 GPUs tested✅(yesterday) - 14 CPUs tested✅(today) Enjoy! computerbase.de/2023-06/diablo…

TONKO

@yamomzouse

April 6

#Benchmarks

yanoyano@kai鯖運営

@yanoyano4649

February 5

A server admin's musings: I tried running the benchmark that seems to be trending right now #MonsterHunterWilds #Benchmarks

Nina McNeary

@Heritage_Nina

March 21, 2023

#benchmarks Places of worship across the island of Ireland bear a tangible link to the legacy of the Ordnance Survey which mapped Ireland nearly 200 years ago. The OS was the completion of the world’s first large scale mapping of an entire country.

Gerard Sans | Axiom 🇬🇧

@gerardsans

September 18, 2023

Do you know how Google PaLM2 model powering Bard compares to other LLMs? 🤔 Tomorrow at GitHub SF I will compare publicly available benchmarks for PaLM2, GPT-4, GPT-3.5 and Llama2 representing open source! RSVP now! Last seats 👉🏻 meetup.com/graphql-sf/eve… #ai #benchmarks ✨🚀

Neil Gunther

@DrQz

August 27, 2024

Gaphorism 1.14: Not even wrong !!! perfdynamics.com/Manifesto/gcap… #latency #performance #benchmarks

MajorGeeks

@majorgeeks

September 8, 2023

Updated - Futuremark SystemInfo is a #freeware utility used to identify the #hardware in your system and is used for many of Futuremark's #benchmarks. majorgeeks.com/files/details/…

PyLadies Paris

@PyLadiesParis

November 16, 2023

It's important to use proper #benchmarks and #evaluation methods to validate your #models, especially for time series

MajorGeeks

@majorgeeks

January 24, 2023

Updated - #Futuremark SystemInfo is a #freeware utility used to identify the hardware in your system and is used for many of Futuremark's #benchmarks. majorgeeks.com/files/details/…

SearchEngineJournal®

@sejournal

October 14, 2023

SEO strategy is important for many reasons ranging from maximizing return to having an organized plan to manage tactics and the work overall. bit.ly/45rJxJO #roi #benchmarks #SEOstrategies #analytics

EMARKETER

@eMarketer

April 23, 2024

📈 Connect with us today to explore how our trusted #forecasts, #research, and #benchmarks across industries can maximize your revenues, optimize your spend, and help you anticipate digital disruption: emarketer.com/learn-more-dem…