BigCode

@BigCodeProject

Open and responsible research and development of large language models for code. #BigCodeProject run by @huggingface + @ServiceNowRSRCH

bigcode-project.org

Joined August 2022

272 Posts, 9K Followers, 3 Following


Pinned

BigCode

@BigCodeProject

Feb 28, 2024

Introducing: StarCoder2 and The Stack v2 ⭐️ StarCoder2 is trained with a 16k token context and repo-level information for 4T+ tokens. All built on The Stack v2 - the largest code dataset with 900B+ tokens. All code, data and models are fully open! hf.co/bigcode/starco…

BigCode reposted

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Oct 8

It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.

BigCode

@BigCodeProject

Oct 8

Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.

BigCode reposted

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Oct 7, 2024

BigCodeBench @BigCodeProject evaluation framework has been fully upgraded! Just pip install -U bigcodebench With v0.2.0, it's now much easier to use compared to the previous v0.1.* versions. The new version adopts the @Gradio Client API interface from @huggingface Spaces by…

BigCode reposted

Josh

@JoshPurtell

Sep 5, 2024

Evaluating LM agents has come a long way since gpt-4 released in March of 2023. We now have SWE-Bench, (Visual) Web Arena, and other evaluations that tell us a lot about how the best models + architectures do on hard and important tasks. There's still lots to do, though 🧵

BigCode reposted

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Sep 3, 2024

People may think BigCodeBench @BigCodeProject is nothing more than a straightforward coding benchmark, but it is not. BigCodeBench is a rigorous testbed for LLM agents using code to solve complex and practical challenges. Each task demands significant reasoning capabilities for…

BigCode reposted

Qian Liu

@sivil_taram

Aug 23, 2024

By popular demand, I have released the StarCoder2 code documentation dataset, please check it out ⬇️ hf.co/datasets/Sivil…

SivilTaram/starcoder2-documentation · Datasets at Hugging Face

Source: huggingface.co

BigCode reposted

Arjun Guha

@ArjunGuha

Aug 21, 2024

This work will appear at OOPSLA 2024. New since last year: the StarCoder2 LLM from @BigCodeProject uses MultiPL-T as part of its pretraining corpus.

Arjun Guha

@ArjunGuha

Aug 18, 2023

LLMs are great at programming tasks... for Python and other very popular PLs. But they are often unimpressive at artisanal PLs, like OCaml or Racket. We've come up with a way to significantly boost LLM performance on low-resource languages. If you care about them, read on!

BigCode reposted

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Aug 19, 2024

Today, we are happy to announce the beta mode of real-time Code Execution for BigCodeBench @BigCodeProject, which has been integrated into our Hugging Face leaderboard. We understand that setting up a dependency-based execution environment can be cumbersome, even with the…

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Jun 18, 2024

In the past few months, we’ve seen SOTA LLMs saturating basic coding benchmarks with short and simplified coding tasks. It's time to enter the next stage of coding challenge under comprehensive and realistic scenarios! -- Here comes BigCodeBench, benchmarking LLMs on solving…

BigCode

@BigCodeProject

Jul 17, 2024

Releasing BigCodeBench-Hard: a subset of more challenging and user-facing tasks. BigCodeBench-Hard provides more accurate model performance evaluations and we also investigate some recent model updates. Read more: huggingface.co/blog/terryyz/b… Leaderboard: huggingface.co/spaces/bigcode…

BigCode reposted

Rajiv Shah

@rajistics

Jun 25, 2024

BigCodeBench dataset🌸 Use it as inspiration when building your Generative AI evaluations. BigCodeBench h/t: @BigCodeProject @terryyuezhuo @lvwerra @clefourrier @huggingface (to name just a few of the people involved)

BigCode reposted

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Jun 19, 2024

People are curious about the performance of DeepSeek-Coder-V2-Lite on BigCodeBench. We've added its results, along with a few other models, to the leaderboard! huggingface.co/spaces/bigcode… DeepSeek-Coder-V2-Lite-Instruct is a beast indeed, similar to Magicoder-S-DS-6.7B, but with only…

BigCode

@BigCodeProject

Jun 18, 2024

Introducing 🌸BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks! BigCodeBench goes beyond simple evals like HumanEval and MBPP and tests LLMs on more realistic and challenging coding tasks.

BigCode reposted

Philipp Schmid

@_philschmid

Jun 19, 2024

It is time to deprecate HumanEval! 🧑🏻‍💻 @BigCodeProject just released BigCodeBench, a new benchmark to evaluate LLMs on challenging and complex coding tasks focused on realistic, function-level tasks that require the use of diverse libraries and complex reasoning! 👀 🧩 Contains…

BigCode reposted

Terry Yue Zhuo @ SF 🏖️

@terryyuezhuo

Jun 18, 2024

BigCode

@BigCodeProject

Jun 18, 2024
