Pulse

@Pulse__AI

Production-grade unstructured document extraction

Pulse reposted

spotted

[Image attached to sid_mnk's post]

Pulse reposted

Exciting research preview to share on XLSX parsing at @Pulse__AI. Spreadsheets are deceptively hard - merged cells, multi-tab workbooks, and cross-sheet references break when you flatten them. Our team has developed and implemented a token-efficient encoder resulting in…

[Image attached to sid_mnk's post]

Pulse reposted

threw a screenshot of this post into @pulse__ai
~99% accurate

try it here: platform.runpulse.com

[Image attached to sid_mnk's post]

deepseek-ocr can't handle rotated pages, hallucinates badly

[Two images attached to HarveenChadha's post]

Pulse reposted

DeepSeek AI dropped a new open-source OCR model today 👀

At @pulse__ai, we tested it on financial docs, handwritten forms, and complex tables. The results showed the same issues plaguing LLM-driven OCR:

- Unstable outputs
- Hallucinated text
- Broken table structures

Reality…

[Image attached to ritvikpandey21's post]

Pulse reposted

.@Pulse__AI just launched Ultra Nano, their new enterprise-focused document extraction model with complete self-hosting, already running across Fortune 50s, insurers, investment firms, banks, and foundational model labs. runpulse.com/blog/self-host… Congrats on the launch, @sid_mnk

[Image attached to ycombinator's post]

Pulse reposted

🧢

[Image attached to sid_mnk's post]

Pulse is now officially part of @cloudera's Enterprise AI ecosystem. Excited to partner with Cloudera and continue delivering the most accurate document extraction models at enterprise scale.

[Image attached to Pulse__AI's post]

Pulse reposted

The @Pulse__AI team just published "The Precision Tax" - why "99% accuracy" fails in finance. One percent error in financial document processing means broken valuations, failed covenant tests, and regulatory exposure. The real benchmark isn't accuracy, it's determinism. Same…

[Image attached to ritvikpandey21's post]
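A minimal sketch of what a determinism check could look like, assuming determinism here means that the same input yields byte-identical output on every run (the post is truncated, so its exact definition is not shown). `extract` is a hypothetical stand-in for any extraction function, not Pulse's API, and both example extractors are invented for illustration.

```python
import hashlib
from typing import Callable


def is_deterministic(extract: Callable[[bytes], str], document: bytes, runs: int = 5) -> bool:
    """Return True if every run of `extract` on `document` yields identical text."""
    digests = {
        hashlib.sha256(extract(document).encode("utf-8")).hexdigest()
        for _ in range(runs)
    }
    return len(digests) == 1


# Hypothetical extractors, for illustration only.
def stable_extract(document: bytes) -> str:
    return document.decode("utf-8").strip()


_calls = 0


def flaky_extract(document: bytes) -> str:
    # Simulates run-to-run drift by appending a changing suffix.
    global _calls
    _calls += 1
    return document.decode("utf-8") + str(_calls)


print(is_deterministic(stable_extract, b"Total: 1,250.00"))  # True
print(is_deterministic(flaky_extract, b"Total: 1,250.00"))   # False
```

Comparing hashes rather than scoring against ground truth is the point of the sketch: an extractor can score well on average and still return different text for the same page on different runs.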

Pulse reposted

Culture building is everything when you're asking engineers to solve the hardest problems for enterprises. The entire @Pulse__AI team is usually in the office 12 hours a day - everyone needs to be in one place, building together. Having an immediate feedback loop is incredibly…

[Image attached to ritvikpandey21's post]

Pulse reposted

@pulse__ai just launched formula recognition.
trained on 10m+ formula/latex pairs from papers + handwritten notes.

traditional ocr breaks on math (α, β, fractions, matrices). our model treats formulas as structured objects → clean latex.

built on pulse’s production-grade…

[Image attached to sid_mnk's post]

Pulse reposted

Almost a year since this

[Image attached to sid_mnk's post]

Join us!

almost 700m pages processed so far.

we’re hiring engineers + ops in sf. $10k referral bonus.

join us - link in comment.

[Image attached to sid_mnk's post]


Pulse reposted

customer feedback last week

hours of work → minutes with @pulse__ai

[Image attached to sid_mnk's post]

Pulse reposted

Pulse (@Pulse__AI) just launched their state-of-the-art document extraction platform. It turns complex PDFs, scans, decks, and images into LLM-ready data. No training required. runpulse.com/blog/pulse-ope… Congrats on the launch, @sid_mnk and @ritvikpandey21!


Pulse reposted

@pulse__ai team just dropped why "98% accurate" document extraction still breaks in production with 4000 errors per 1000 pages. single accuracy scores miss broken reading order, shifted table columns, and lost cross page context that silently corrupt entire datasets. we’ve…

[Image attached to ritvikpandey21's post]
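The two figures in this post ("98% accurate" and 4000 errors per 1000 pages) can be reconciled with simple arithmetic. A minimal sketch, assuming accuracy is scored per unit (a word, cell, or field) and that each page holds about 200 such units; neither assumption is stated in the post, and the function names are illustrative.

```python
def expected_errors(accuracy: float, units_per_page: int, pages: int) -> int:
    """Expected number of wrong units across `pages` pages."""
    return round((1.0 - accuracy) * units_per_page * pages)


def implied_units_per_page(accuracy: float, errors: int, pages: int) -> int:
    """Units per page implied by an error count at a given accuracy."""
    return round(errors / pages / (1.0 - accuracy))


print(expected_errors(0.98, 200, 1000))          # 4000
print(implied_units_per_page(0.98, 4000, 1000))  # 200
```

Under these assumptions a 2% per-unit error rate works out to about 4 wrong values on every page, which is why a single headline accuracy score can hide a large absolute number of errors.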

Pulse reposted

the team at @Pulse__AI put bytedance's dolphin OCR to the test against complex documents that matter for real business use cases.

while it shows improvements in reading order detection, we found critical limitations across key areas:
- 7.7% structured data extraction from…

[Image attached to ritvikpandey21's post]

Pulse reposted

we're super excited to be launching Meridian publicly! no more analysts having to manually copy numbers from pdfs into spreadsheets at 2 am before a board deadline. if you're interested in trying it out, give me a DM! 🫡

Pulse (@Pulse__AI) has just launched Meridian, an AI-powered financial document processor that can automatically convert any PDF, Word doc, PowerPoint presentation, or image into a structured Excel export with charts and graphs. runpulse.com/blog/introduci… Congrats on the launch,…




Pulse reposted

After processing nearly 500M pages, we discovered the biggest challenge in document AI isn't OCR accuracy - it's semantic understanding across page breaks and column boundaries. 🧵 (1/8)

[Image attached to ritvikpandey21's post]

Pulse reposted

After processing 400M+ pages for the world's largest investment firms, AI startups, and Fortune 500s, @Pulse__AI is launching Ultra: their new hybrid reasoning model. It's the most accurate document extraction model in the industry. Live for all customers today.…

