#trainingdatasets search results

@rohanpaul_ai: This GitHub repository has a very wide collection of high-quality datasets, tools, and concepts for LLM fine-tuning. All the datasets listed here should be under permissive licensing (Apache 2.0, MIT, CC-BY-4.0, etc.). They are categorized into segments like Math & Logic, Code, Conversation &…

Training isn’t failing because of effort; it’s failing because of visibility. Data-driven learning gives managers real-time insight into who’s progressing and who needs support. You can’t fix what you can’t see. #LearningAnalytics #LnD #SkillDevelopment #Tekstac


@Trisha_Techie: 86.65 GB All paid Courses Collection🔥 Full Drive link, Worth: $599. FREE for first 2000 people👇
💀 Data science
💀 Python
💀 AI
💀 Cloud
💀 BIG DATA
💀 Data Analytics
💀 BI
💀 Google Cloud Training
💀 Machine Learning
💀 Deep Learning
💀 Ethical Hacking
To get it: ✅ Follow…

@Faheem_uh: PhD Students - Do you need datasets for your research? Here are 30 datasets for research. These datasets are offered by @nexdata_ai. Link ---> nexdata.ai 1. Korean Exam Question Dataset for AI Training lnkd.in/d_paSwt7 2. Multilingual Grammar…

@a_yukh: Releasing the Jupyter Agent Dataset! 🚀 Training on this data dramatically improves the ability to execute code and analyze data. Built from 7 TB of real Kaggle datasets + 20k notebooks, creating real code exec traces using Qwen3-Coder and E2B. huggingface.co/datasets/data-…

@__ambrosio: Sometimes we lose hours looking for datasets to train our models. This repository is here to simplify and speed up that step. It organizes the datasets into categories such as:
• Math
• Logic
• Code
• Conversation
• Agent and Function Calling

@DataScienceDojo: 🛠️ How to Train Your Own LLM: The Real Process, Step by Step. Everyone talks about training models from scratch, but few explain what it actually takes. Here’s the real workflow and why most teams end up fine-tuning instead. 👇 A must-read for teams thinking about going deep…

@marktenenholtz: The best way to grow your machine learning skills is to build more models. But, to do that, you need some interesting datasets. Google Research has released 106 datasets spanning from text to images to time series data. Want to try a new method? Use one of these as a basis!

@JanosPerczel: Excited to share findings from TeachLM, our new research study on fine-tuning LLMs with a massive authentic learning dataset to improve their pedagogical capabilities. Read more in the thread 👇 (1/N)

@ayo_purity: Dear DATA ANALYSTS🚀 Last week, I shared the full resources on getting started with SQL to Completion as part of your DATA ANALYTICS training✅ TODAY I will be sharing 10 FREE DATASETS to start building your PORTFOLIO after your training💻 Thread🧵 Please RT for others 2c🙌

@ayo_purity: Yesterday, I shared 8 sources to obtain DATA ANALYTICS CERTIFICATIONS🚀 Today, I will be sharing 15 SOLID PLACES to EXTRACT DATASETS for you to practice with or explore after your training, if you don't know where to find them. Thread 🧵👇 Please RT for others 2 c🙌

@thenaijacarguy: This is a list of over 200k open and free downloadable datasets for you to practice your Data Analytics, Science, and Engineering skills with, using Excel, SQL, Power BI or Python. Kindly like and retweet.

@Gourav_y2: The Ultimate Cheat Sheet for ML Enthusiasts: 48 Open Datasets Every AI Learner Should Know!

@UnslothAI: We made a Guide on how to create Datasets for Fine-tuning! Learn to:
• Curate high-quality datasets (with best practices & examples)
• Format datasets correctly for conversation, SFT, GRPO, Vision etc.
• Generate synthetic data with Llama & ChatGPT
🔗docs.unsloth.ai/basics/dataset…

@ayo_purity: Dear DATA ANALYSTS🚀 Last week, I shared a very helpful RESOURCE on where you can get FREE DATASETS to BUILD your PORTFOLIO. Today, I will be sharing with you where you can get FREE VIRTUAL INTERNSHIPS after your training✅ Thread 🧵👇 Please RT for others 2 C🙌

@KwaiAICoder: 🚀 Introducing Trie-Packed Training for agentic RL scenarios! One of the key innovations behind KAT-Coder. Efficiently compute shared prefixes only once during both forward and backward passes, drastically boosting training throughput.💥
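
The shared-prefix idea in the tweet above can be illustrated with a small sketch. This is not KAT-Coder's implementation, which the tweet does not describe; it only shows, under assumptions, how merging token sequences into a trie reduces the number of token positions that would need processing. The names `TrieNode`, `build_trie`, and `packing_stats` are hypothetical, and a real training setup would also need attention masks and gradient handling that are omitted here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TrieNode:
    """One token position in the packed trie (hypothetical helper type)."""
    children: dict[int, TrieNode] = field(default_factory=dict)
    count: int = 0  # number of sequences that pass through this token


def build_trie(sequences: list[list[int]]) -> TrieNode:
    """Insert every token sequence so that shared prefixes share nodes."""
    root = TrieNode()
    for seq in sequences:
        node = root
        for tok in seq:
            node = node.children.setdefault(tok, TrieNode())
            node.count += 1
    return root


def count_nodes(root: TrieNode) -> int:
    """Count token nodes iteratively (the root itself holds no token)."""
    total = 0
    stack = list(root.children.values())
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(node.children.values())
    return total


def packing_stats(sequences: list[list[int]]) -> tuple[int, int]:
    """Return (token positions without sharing, token positions in the trie)."""
    naive = sum(len(seq) for seq in sequences)
    packed = count_nodes(build_trie(sequences))
    return naive, packed


if __name__ == "__main__":
    # Three toy trajectories that share the prefix [1, 2] (two also share 3).
    trajectories = [[1, 2, 3, 4], [1, 2, 3, 5, 6], [1, 2, 7]]
    naive, packed = packing_stats(trajectories)
    print(f"naive={naive} packed={packed}")
```

In this toy case the three trajectories contain 12 token positions in total, but only 7 distinct trie nodes, which is the kind of redundancy a prefix-sharing scheme aims to avoid recomputing.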

Where to get data for your next machine learning project? An overview of 8 amazing resources to accelerate your next project with data! - Google Datasets - Big Bad NLP Datasets - Hugging Face Datasets - Papers with Code Datasets - Open Data on AWS - Awesome Public Datasets


This blog will delve into popular data augmentation techniques in deep learning, demonstrating how these methods enhance model performance by simulating a more diverse dataset. akridata.ai/blog/data-augm… #dataaugmentation #deeplearning #trainingdatasets


@HoganLovellsIP: Catch our coverage of this eagerly awaited decision from the Hamburg Regional Court - the first in Germany to rule on AI training datasets under EU copyright law. ow.ly/hIsA50TFriv #AI #TrainingDatasets #EU #copyright #LAION #DSM

🛫 Patent US20220335336A1: How does AI help air traffic control? By using historical flight data to train on trajectory changes before they happen! ✈️🤖 #AIinAirTrafficControl #TrainingDataSets #TrajectoryChanges #patent #patents


@OurRadiantEarth: 💡Have you created a tutorial using open geospatial #trainingdatasets available in #RadiantMLHub? 🚨We would love to showcase them to the #EOChat community via our #TutorialTuesday threads!!🚨 🟢Share your tutorials / how-to guides with us below or DM us! #ML4EO #CloudComputing

@OurRadiantEarth: Are you a #datascientist working on rural and urban development issues or disaster response strategies? Discover open #trainingdatasets around the world on #RadiantMLHub for extracting building footprints with #ML algorithms. 👉 mlhub.earth/datasets?tags=…

@OurRadiantEarth: 💡Have you created a tutorial or application using open geospatial #trainingdatasets available in #RadiantMLHub? 🚨We would love to showcase them to the #EOChat community!!🚨 🟢 Post the link to your tutorial/application below or DM us!

@OurRadiantEarth: 💡Have you created a tutorial using open geospatial #trainingdatasets available in #RadiantMLHub? 🚨We would love to showcase them to the #EOChat community via our #TutorialTuesday threads!!🚨 🟢Share your tutorials / how-to guides with us below or DM us!

@OurRadiantEarth: Building footprints are a necessary input to train #ML models to automatically extract features in satellite imagery. Now available on #RadiantMLHub for download, explore these open #trainingdatasets with labels for primary health facilities worldwide: mlhub.earth/datasets?searc…

💡Have you created a tutorial using one of the open-access #trainingdatasets on #RadiantMLHub? 🚨Share your tutorials / how-to-guides with us below! We would love to see and showcase them to the larger #EOChat community via our #TutorialTuesday threads!!🚨


The same thing happened to Taybot but he went more racist iThink? /S 🤷‍♂️ #TrainingDataSets #MSTaybot #FacebookAI #BlenderbotAI #politics of #AI #Trump The #TermLimits module needs an update. 🤦‍♂️


Thanks @OurRadiantEarth Humbled to be on the list with other domain experts, indicating the need and importance of enhanced #stewardship and sharing of #qualityinformation of EO #trainingdatasets and #models for a wide range of #AI/#ML applications.


Perhaps now that the Machines Have Learned to Judge though.. perhaps this depends on the #TrainingDataSets but back in the day no one really looked at anything much different than a hammer could fix the thing iF the "paper jam" needed to be.. supported to.. percussively print. 🖨️


@machine_ml: RT @DesiCrewSolns: This is what I got, when I asked my 'Foodie' designer for a poster on Image Tagging. #MachineLearning #TrainingDataSets #autonomouscars #ImageTagging #ImageRecognition

@SamaAI: Interested in a virtual tour of our training data annotation platform? See SamaHub in action and chat with our team of training data experts from the convenience of your own home. bit.ly/2TUrl9Z #trainingdata #trainingdatasets #AI #ML #machinele

🌟We’re excited to release the new #RadiantMLHub website! This interface allows you to explore & access our open geospatial #trainingdatasets (& soon models too) all in one site! Also on hand are the API documentation, tutorials, FAQ & more. 👉🏽Get started: mlhub.earth


@fair_forward: 🚀📢We are happy to support the 2nd @LacunaFund projects in creating openly accessible #trainingdatasets in 29 languages across Africa. ⏩lacunafund.org/language-2020-…
