LLM (3)

[Paper Review] RAG-DDR: OPTIMIZING RETRIEVAL-AUGMENTED GENERATION USING DIFFERENTIABLE DATA REWARDS

https://openreview.net/forum?id=uaYKxmn3Ab
GitHub: https://github.com/OpenMatch/RAG-DDR
Overview: To mitigate the hallucination problem in LLMs, RAG (R..

Papers 2025.07.14

[Paper Review] Learning to Think: Information-Theoretic Reinforcement Fine-Tuning for LLMs

https://arxiv.org/abs/2505.10425
Recent LLM research has pushed toward using more tokens during reasoning..

Papers 2025.06.02

[Paper Review] DAPO: An Open-Source LLM Reinforcement Learning System at Scale

https://arxiv.org/abs/2503.14476
Introduction: Test-time scaling uses longer Cha..

Papers 2025.04.28