
supercoderhawk
@supercoderhawk
NLP engineer at patsnap. NLP, deep learning researcher.
🕊️The Paloma paper is truly impressive - a must-read for anyone who cares about language model evaluation. It addresses two crucial questions that had previously left me puzzled: ❓Can the validation loss on one corpus (e.g., C4) represent all domains? The answer is no🚫.…

RAG And Context Understanding. A great diagram that showcases the challenges with RAG benchmarking and LLM context understanding. RAG systems are complex because of the following 4 issues. Stuffing the context of the LLM rarely helps and typically confuses the LLM. We need a…

Microsoft presents UFO: A UI-Focused Agent for Windows OS Interaction. Paper page: huggingface.co/papers/2402.07… The paper introduces UFO, an innovative UI-focused agent that fulfills user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision. UFO employs a…

New paper: How can you tell when a model is hallucinating? Let it cheat! An expert doesn't need to cheat, so if your model learns to cheat, there must be something it doesn't know. Our general new approach for measuring uncertainty: arxiv.org/abs/2402.08733

An incredible skill that I have witnessed, especially at OpenAI, is the ability to make “yolo runs” work. The traditional advice in academic research is, “change one thing at a time.” This approach forces you to understand the effect of each component in your model, and…

So I guess this is a thing now: universities running ads to resell students' data for training LLMs 💰💰💰

It’s year 2024, and n-gram LMs are making a comeback!! We develop infini-gram, an engine that efficiently processes n-gram queries with unbounded n and trillion-token corpora. It takes merely 20 milliseconds to count the frequency of an arbitrarily long n-gram in RedPajama (1.4T…
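
For intuition, here is a toy sketch of the underlying idea (a naive suffix-array counter, not the infini-gram codebase, and it ignores all the engineering needed for trillion-token corpora; `build_suffix_array` and `ngram_count` are names made up for this illustration):

```python
from bisect import bisect_left, bisect_right

def build_suffix_array(tokens):
    """Indices of all suffixes of `tokens`, sorted lexicographically (naive build, fine for a demo)."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def ngram_count(tokens, suffix_array, query):
    """Count occurrences of an arbitrarily long n-gram via binary search over the suffix array."""
    q = tuple(query)
    n = len(q)
    first_n = lambda i: tuple(tokens[i:i + n])       # compare only the first n tokens of each suffix
    lo = bisect_left(suffix_array, q, key=first_n)   # key= requires Python 3.10+
    hi = bisect_right(suffix_array, q, key=first_n)
    return hi - lo

corpus = "a b a b c a b".split()
sa = build_suffix_array(corpus)
print(ngram_count(corpus, sa, ("a", "b")))  # -> 3
```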

Large Language Model (LLM) agents promise to free us from mundane tasks, but how should they best interact with our world? Introducing CodeAct, an agent {framework, instruction-tuning dataset, model} that employs executable Python code to unify the actions of LLM agents. 🧵1/
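
For a rough feel of the "code as action" loop, here is a minimal generic sketch (not the CodeAct framework's actual implementation; `llm(messages)` and the `<execute>` tag convention are assumptions for illustration):

```python
import contextlib
import io
import traceback

def execute_action(code: str) -> str:
    """Run model-generated Python and return its stdout, or the traceback on failure."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # a real agent would sandbox untrusted code
        return buffer.getvalue() or "(no output)"
    except Exception:
        return traceback.format_exc()

def agent_loop(llm, task: str, max_turns: int = 5) -> str:
    """`llm(messages)` is a stand-in for any chat-completion call returning a string."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = llm(messages)                  # the model answers with code or plain text
        if "<execute>" not in reply:
            return reply                       # no action requested: treat as the final answer
        code = reply.split("<execute>")[1].split("</execute>")[0]
        observation = execute_action(code)     # environment feedback becomes the next observation
        messages += [{"role": "assistant", "content": reply},
                     {"role": "user", "content": f"Execution result:\n{observation}"}]
    return "Stopped after reaching max_turns."
```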

Continual Learning for LLMs. One of the biggest challenges of working with LLMs is keeping them updated. Continual learning aims to enhance the overall linguistic and reasoning capabilities of LLMs. This survey paper provides an overview of developments in continual learning.…

A Novel RAG Approach That Understands The Whole Document Context. RAG has rapidly evolved to be the standard way to apply LLMs in production. However, most existing methods are still limited because they retrieve only short contiguous chunks from a retrieval corpus,…

Lots of compelling AI research ideas this week ranging from self-correcting RAG to sparsified LVLMs. A few papers I’ve been reading this week:
- OLMo
- SliceGPT
- MoE-LLaVa
- Corrective RAG
- Rephrasing the Web
- Redefining Retrieval in RAG
- LLMs for Mathematical Reasoning…

We just open-sourced SQLCoder-70B! It outperforms all publicly accessible LLMs for Postgres text-to-SQL generation by a very wide margin. SQLCoder is finetuned on @AIatMeta's CodeLlama-70B model, which was released yesterday, on fewer than 20,000 hand-curated prompt completion…

(1/5)🚀 Our OpenMoE Paper is out! 📄 Including:
🔍 ALL Checkpoints
📊 In-depth MoE routing analysis
🤯 Learning from mistakes & solutions
Three important findings: (1) Context-Independent Specialization; (2) Early Routing Learning; (3) Drop-towards-the-End. Paper Link:…

I'm currently looking into different metrics and frameworks around Retrieval-Augmented Generation (RAG) evaluation. This is a first brain dump. But the landscape is already quite broad. What RAG evaluation metrics and frameworks have you already tested? And which ones did you…

MuGI: Enhancing Information Retrieval through Multi-Text Generation Integration with Large Language Models. Proposes a framework that leverages LLM text generation to expand queries, substantially improving IR performance. 📝arxiv.org/abs/2401.06311 👨🏽💻github.com/lezhang7/Retri…
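
A hedged sketch of the general recipe (my reading of the abstract, not the authors' code): sample a few pseudo-reference passages from an LLM, mix them with the query, and score documents with BM25 over the expanded text. The `llm()` call, the `repeat_query` balancing trick, and the use of the rank_bm25 package are all assumptions for illustration:

```python
from rank_bm25 import BM25Okapi  # pip install rank-bm25

def expanded_query(llm, query: str, n_refs: int = 3, repeat_query: int = 3) -> list[str]:
    """Generate pseudo-reference passages and mix them with the (repeated) original query."""
    prompt = f"Write a short passage that answers the query: {query}"
    refs = [llm(prompt) for _ in range(n_refs)]        # hypothetical LLM call
    text = " ".join([query] * repeat_query + refs)     # repeating the query keeps it dominant
    return text.lower().split()

def search(llm, query: str, docs: list[str], k: int = 5) -> list[str]:
    """Rank `docs` with BM25 against the LLM-expanded query."""
    bm25 = BM25Okapi([d.lower().split() for d in docs])
    return bm25.get_top_n(expanded_query(llm, query), docs, n=k)
```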

Improving Information Retrieval in LLMs. One effective way to use open-source LLMs is for search tasks, which could power many other applications. This work explores the use of instruction tuning to improve a language model's proficiency in information retrieval (IR) tasks.…

Here’s a neat paper by Barnett et al. (@DeakinA2I2) that outlines 7 failure points in building a RAG pipeline over your data.
🚫 Missing content (did not index it)
🚫 Missing in top-k retrieved set
🚫 Missing in reranked set
🚫 Not extracted (in context but LLM couldn’t use)
🚫…

There was a lot of cool RAG research in the past year or two, and luckily for you, all of these efforts are tracked in one place! “Retrieval-Augmented Generation for Large Language Models: A Survey” by Gao et al. does an admirable job categorizing all RAG research into three…

One thing we loved about 2023 was the volume of new research around RAG from the entire community ❤️. This survey by Gao et al. is the most comprehensive survey of this research we’ve seen yet - it covers 100+ papers, blog posts, and projects across every step of the RAG…

Although there is abundant work studying long-context LLMs, most of it talks about architecture / positional encoding; almost none of the existing papers talk about data. In this work, we take a close look at the influence of data on context scaling yaofu.notion.site/Understanding-…
New RAG technique alert 🚨 We’ve come up with an advanced RAG technique in @llama_index that lets you ask structured questions over many documents ✨:
1. Model each document as a metadata dictionary - store more attributes beyond a simple text summary (e.g. a row in SQL…
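
As a toy illustration of point 1 (plain Python, not the actual @llama_index API; the field names are made up), a metadata dictionary per document lets a structured question filter candidates before any semantic retrieval:

```python
docs = [
    {"text": "…filing text…", "metadata": {"company": "ACME", "year": 2023, "doc_type": "10-K"}},
    {"text": "…filing text…", "metadata": {"company": "ACME", "year": 2022, "doc_type": "10-K"}},
]

def filter_docs(docs, **constraints):
    """Keep only documents whose metadata matches every structured constraint."""
    return [d for d in docs
            if all(d["metadata"].get(k) == v for k, v in constraints.items())]

candidates = filter_docs(docs, company="ACME", year=2023)  # structured filter first
# ...then run ordinary semantic retrieval over `candidates` only.
```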

Structured Hierarchical RAG 💫 Doing RAG well over many docs is hard. A popular existing approach is hierarchical retrieval: select the relevant doc summaries before retrieving the content inside. But selecting docs purely based on summaries is tough - a doc can have a bunch of…
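
For reference, a bare-bones version of the plain hierarchical retrieval described above (a generic sketch, not the @llama_index implementation; `embed()` stands in for any sentence-embedding model):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hierarchical_retrieve(query, docs, embed, top_docs=3, top_chunks=5):
    """docs: list of {"summary": str, "chunks": list[str]}."""
    q = embed(query)
    # Stage 1: pick documents by how well their summary matches the query.
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d["summary"])), reverse=True)
    # Stage 2: rank chunks drawn only from the selected documents.
    candidates = [c for d in ranked[:top_docs] for c in d["chunks"]]
    return sorted(candidates, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_chunks]
```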
