Thorben Jansen

@learnteachAIED

Thorben Jansen reposted

The most important skill for a researcher is not technical ability. It's taste. The ability to identify interesting and tractable problems, and recognize important ideas when they show up. This can't be taught directly. It's cultivated through curiosity and broad reading.


Thorben Jansen reposted

I just realized something most people are going to lose when (as they inevitably will) they start using AIs to write everything for them. They'll lose the knowledge of how writing is constructed.


Thorben Jansen reposted

Most people don't realize they can significantly influence what frontier LLMs improve at, it just requires some work. Publish a high-quality eval on a task where models currently struggle, and I guarantee future models will show substantial improvement on it.


Thorben Jansen reposted

There’s no demand for “average.”


Thorben Jansen reposted

I suspect that a lot of "AI training" in companies and schools has become obsolete in the last few months. As models get larger, the prompting tricks that used to be useful are no longer good; reasoners don't play well with Chain-of-Thought; hallucination rates have dropped, etc.


Thorben Jansen reposted

we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right. PROMPT: Please write a metafictional literary short story…


Thorben Jansen reposted

We have to take the LLMs to school. When you open any textbook, you'll see three major types of information: 1. Background information / exposition. The meat of the textbook that explains concepts. As you attend over it, your brain is training on that data. This is equivalent…

Thorben Jansen reposted

“Self-beliefs in childhood and adolescence can influence important life outcomes years later.” Building competencies, with adult support, can help children develop positive self-beliefs, say Jennifer Meyer & Thorben Jansen. @jennymeyer10 @learnteachAIED boldscience.org/how-do-childre…


Thorben Jansen reposted

We’re releasing Humanity’s Last Exam, a dataset with 3,000 questions developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning. State-of-the-art AIs get <10% accuracy and are highly overconfident. @ai_risk @scaleai

Thorben Jansen reposted

Our lack of good deep measures of human creativity, reasoning, empathy, etc. is really a problem in AI right now. A lot of tests that were "good enough" for human research (RAT for creativity, Seeing the Mind in The Eyes for empathy) are not robust enough for benchmarks for AI.


Thorben Jansen reposted

I read a lot of social science papers on AI and my conclusion is that there are far too few people rigorously studying the implications (good & bad) of LLMs. Computer science is producing a tide of good AI work. Economics, management, psych, & sociology etc. need to do the same.


Thorben Jansen reposted

Two simple rules: 1. You get better at what you practice. 2. Everything is practice. Look around and you may be surprised by what people are “practicing” each day. If you consider each moment a repetition, what are most people training for all day long? Many people are…


Thorben Jansen reposted

I cannot agree with this more. Please use basic research methods on AI benchmarking!

New Anthropic research: Adding Error Bars to Evals. AI model evaluations don’t usually include statistics or uncertainty. We think they should. Read the blog post here: anthropic.com/research/stati…



Thorben Jansen reposted

Hate it when you ask o1-preview a hard question and it thinks for less than a second. You really feel that you failed to interest the AI in your problem.


Thorben Jansen reposted

Have a question that is challenging for humans and AI? We (@ai_risks + @scale_AI) are launching Humanity's Last Exam, a massive collaboration to create the world's toughest AI benchmark. Submit a hard question and become a co-author. Best questions get part of $500,000 in…

Thorben Jansen reposted

New blog post: Can AI support teachers in assessing student performance? Dr. Thorben Jansen @learnteachAIED of the IPN summarizes the current state of research and derives implications for practice. fiete.ai/blog/kuenstlic…

fellofish.com

Artificial intelligence as an assessment aid: How accurately can…

Teachers constantly assess their students' performance in class. Assessments are necessary for planning and carrying out further teaching and learning steps. Without an assessment…


Thorben Jansen reposted

🚀 Kickoff for the GENIUS project at the IPN, funded by @telekomstiftung. Goal: use AI (#KI) to improve assessment and feedback processes in schools (#Schule) and set new standards 🌟📚🤖 More info: leibniz-ipn.de #DigitaleBildung Photo copyright: Timo Wilke

Thorben Jansen reposted

Paul Graham on why ambitious people need to be around other ambitious people:

Thorben Jansen reposted

What cultural values do GPT-4o, 4, 3.5, 3 express? Using World Values Survey questions, we find GPT consistently aligns with English-speaking countries/Protestant Europe. We show that Cultural Prompting improves alignment. arxiv.org/abs/2311.14096 @yan_ytyt @OlgaOvi @BakerEDMLab
