TRLX

Affiliation
CarperAI
Commercial
Fine-tuning Method
RLHF
Note
GitHub code
• NeMo ILQL (link)
• T5 ILQL & PPO
• LLaMA & Alpaca PPO/SFT support
• HF Transformers integration for ILQL
• Data integration
  ◦ AnthropicAI HH
Data
Model Size
Newly Provided Resource
Training/Inference Pipeline
Release Date
2023-02-23