StableVicuna

Affiliation

Stability AI

Commercial

Fine-tuning Method

RLHF

Note

[Stability AI releases StableVicuna, the AI World’s First Open Source RLHF LLM Chatbot] code
• StableVicuna-13B (2023-04-28)
  ◦ LLaMA-13B → SFT Vicuna-13B v0 → RLHF
  ◦ Training data
    ▪ oasst1
    ▪ GPT4All (400k): a dataset of 437,605 prompt-response pairs generated with GPT-3.5 Turbo
    ▪ Alpaca (52k)
  ◦ Data for RM
    ▪ oasst1
    ▪ Anthropic HH: ~160k human-rated examples (for each response pair, the response preferred on harmfulness and helpfulness criteria)
    ▪ SHP: ~385K Stanford Human Preferences dataset (348,718 examples; questions/instructions across 18 different domains, from cooking to philosophy)
  ◦ CC BY-NC-SA-4.0: Non-commercial

Data

[Training Data]
• oasst1
• GPT4All (400k): a dataset of 437,605 prompt-response pairs generated with GPT-3.5 Turbo
• Alpaca
[Data for RM]
• oasst1
• Anthropic HH: ~160k human-rated examples (for each response pair, the response preferred on harmfulness and helpfulness criteria)
• SHP: ~385K Stanford Human Preferences dataset (348,718 examples; questions/instructions across 18 different domains, from cooking to philosophy)

Model Size

13B

Newly Released Resource

Model

Release Date

2023-04-28