#fineweb search results

I'm just laughing at these guys... "The Finest Collection of Data that The Web Has To Offer" 🍷 #FineWeb

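» Eric Lam - Thanks to the assistance of attorney 陳啟桐 of 律果科技 and director 黃兆徽 of the Central News Agency (中央社), I have successfully reached a settlement with the Central News Agency. Here is my statement. Greetings, friends from all walks of life:... | Facebook facebook.com/eric.lam.74467… #fineweb #dataset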

HuggingFace Releases 🍷 FineWeb: A New Large-Scale (15-Trillion Tokens, 44TB Disk Space) Dataset for LLM Pretraining itinai.com/huggingface-re… #HuggingFace #FineWeb #LLMPretraining #AI #PracticalSolutions #ai #news #llm #ml #research #ainews #innovation #artificialintelligence

Discover how #FineWeb from @huggingface is redefining the creation of AI datasets 🌐. Optimize training, improve accuracy, and explore its impact on personalized education with #FineWebEdu 🎓. More details here: t.ly/ithHN #IA

#FineWeb from @huggingface is a great filtered dataset to learn and try to pre-train foundation models from scratch


to provide the datasets I am going to use, which are #OpenWebText, #BookCorpus and Spanish Billion Words. If Zeus can be scaled up and trained at a larger scale, I am thinking of using #FineWeb. But that is maybe for the future :b. All of these are available on #Huggingface


Plenty of services will be available to help you create your site! #FineWeb


Dreaming of creating a large file-hosting platform? It's coming soon with the #EStock offer from #FineWeb!


🌟 BestOfWeb is a highly refined subset of the TxT360 CC dataset! 📊 It is filtered using the ProX document filtering model, which uses quality signals similar to the FineWeb-Edu classifier and adds format signals. #DataQuality #WebData #FineWeb


Fineweb.fr - #FineWeb | FineWeb.fr, your virtual server at a low… webwiki.fr/fineweb.fr


🤗Terrific work! @huggingface introduced #FineWeb, a comprehensive dataset designed to enhance the training of #LLMs. It demonstrates improved performance through meticulous data curation and innovative filtering techniques.

We are (finally) releasing the 🍷 FineWeb technical report! In it, we detail and explain every processing decision we took, and we also introduce our newest dataset: 📚 FineWeb-Edu, a (web only) subset of FW filtered for high educational content. Link: hf.co/spaces/Hugging…


As the dataset is always the crucial aspect of any #LLMModel, getting a quality dataset is a challenge. The internet is filled with garbage. So this particular #FineWeb pipeline is built on top of #CommonCrawl (an open-source web-crawled dataset) huggingface.co/spaces/Hugging…


+ data alignment! -> Hugging Face's #FineWeb is a good step in the right direction; however, we need many more data commons.


Exciting news from FineWeb - they're revolutionizing text data collection on a large scale, making it easier to access high-quality information from the web. #FineWeb #TextData #Innovation ift.tt/j546ZFq

