OpenAssistant Project
• December 2022: project launch
• April 6, 2023: models, training data, and code released
• Training data
  ◦ New data: a human-generated, human-annotated assistant-style conversation corpus of 161,443 messages
    ▪ 66,497 conversation trees
    ▪ 35 different languages
    ▪ 461,292 quality ratings
    ▪ 13,500 volunteers
    ▪ Ready for export: spam and deleted data are excluded, along with trees consisting only of an initial prompt (prompt_lottery_waiting), low-quality trees (aborted_low_grade), and halted trees (halted_by_moderator); see the loading sketch after this section
      • 10,364 conversation trees
      • 88,838 messages
  ◦ oasst-mix: additional data for SFT
    ▪ vicuna (non-commercial)
    ▪ code_alpaca (non-commercial)
    ▪ dolly15k (commercial)
    ▪ grade_school_math_instructions
  ◦ RM data: additional data for reward-model (RM) training
    ▪ SHP: ~385K Stanford Human Preferences dataset
    ▪ hellaswag
    ▪ hf_summary_pairs
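
A minimal sketch of working with the released conversation data. It assumes the HuggingFace "OpenAssistant/oasst1" dataset (which corresponds to the ready-for-export subset), its documented field names (message_tree_id, parent_id, message_id, role, text), and that the root prompt's parent_id is None; the thread-walking helper is illustrative and not part of the project's code.

```python
from collections import defaultdict
from datasets import load_dataset

# Ready-for-export messages; the "OpenAssistant/oasst1" dataset on HuggingFace
# corresponds to this subset (~88.8k messages).
ds = load_dataset("OpenAssistant/oasst1", split="train")

# Messages are stored flat, so regroup them into conversation trees.
trees = defaultdict(list)
for msg in ds:
    trees[msg["message_tree_id"]].append(msg)

print(f"{len(ds)} messages in {len(trees)} conversation trees")

def one_thread(messages):
    """Walk a single branch of one tree, from the root prompt downwards."""
    by_parent = defaultdict(list)
    for m in messages:
        by_parent[m["parent_id"]].append(m)
    thread, node = [], by_parent[None][0]  # root prompt: assumed to have parent_id == None
    while node is not None:
        thread.append((node["role"], node["text"]))
        children = by_parent.get(node["message_id"], [])
        node = children[0] if children else None  # follow the first reply at each level
    return thread

# Print one prompter/assistant thread from the first tree.
for role, text in one_thread(next(iter(trees.values()))):
    print(f"{role}: {text[:80]}")
```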
• Models
  ◦ Fine-tunes of LLaMA, Pythia, and StableLM
    ▪ Largest model: a 30B model fine-tuned from LLaMA
    ▪ LLaMA is non-commercial
    ▪ Pythia is commercially usable
  ◦ Models released as of April 2023
    ▪ sft-7-llama-30b-xor & sft-6-llama-30b-xor: the LLaMA license does not allow redistributing the weights directly, so they are released as XOR weights (see the sketch after this list)
      • Data: oasst-mix
    ▪ stablelm-7b-sft-v7-epoch-3: 7th English SFT model, trained for 3 epochs from stablelm-base-alpha-7b
      • CC-BY-SA-4.0
      • Data: oasst-mix
    ▪ sft-1-pythia-12b: 1st English SFT model, trained from pythia-12b-deduped
      • Apache 2.0
      • Data: oasst only
    ▪ sft-4-pythia-12b-epoch-3.5: 4th English SFT model, trained for 3.5 epochs from pythia-12b-deduped
      • Apache 2.0
      • Data: oasst_export
    ▪ rm-2.1-pythia-1.4b-epoch-2.5: RM trained for 10k steps from pythia-1.4b-gpt4all-pretrain
      • Data: RM data
    ▪ rm-2-pythia-6.9b-epoch-1: RM trained for 3.5k steps from pythia-6.9b-gpt4all-pretrain
      • Data: RM data
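
The XOR release works roughly as sketched below. This is only an illustration of the idea under stated assumptions (the project ships its own conversion tooling in the Open-Assistant repository, and the function names here are hypothetical): XOR-ing the fine-tuned weights with the original LLaMA weights yields a delta that reveals nothing without the originals, and anyone who already holds the LLaMA weights can invert it exactly.

```python
import numpy as np

def xor_encode(finetuned: np.ndarray, base: np.ndarray) -> np.ndarray:
    """Distributable delta: byte-wise XOR of the fine-tuned and base tensors."""
    return np.bitwise_xor(finetuned.view(np.uint8), base.view(np.uint8))

def xor_decode(delta: np.ndarray, base: np.ndarray) -> np.ndarray:
    """Recover the fine-tuned weights: (finetuned XOR base) XOR base == finetuned."""
    return np.bitwise_xor(delta, base.view(np.uint8)).view(base.dtype)

# Toy tensors standing in for one LLaMA weight and its fine-tuned counterpart.
base = np.random.randn(4, 4).astype(np.float16)
finetuned = base + 0.01 * np.random.randn(4, 4).astype(np.float16)

delta = xor_encode(finetuned, base)          # this is what gets published
recovered = xor_decode(delta, base)          # requires the original LLaMA tensor
assert np.array_equal(recovered, finetuned)  # reconstruction is bit-exact
```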
• Limitations
  ◦ Biases exist because the data was annotated mostly by men with an average age of 26
  ◦ From the paper: "We strongly encourage researchers to thoroughly investigate the safety and bias of the models before employing them in downstream tasks. It is important to recognize that the released models may exhibit unsafe behavior and are likely susceptible to prompt injection attacks."