OpenAssistant Project
• December 2022: project launch
• April 6, 2023: models, training data, and code released
• Training data
  ◦ New data: a human-generated, human-annotated assistant-style conversation corpus of 161,443 messages
    ▪ 66,497 conversation trees
    ▪ 35 different languages
    ▪ 461,292 quality ratings
    ▪ 13,500 volunteers
    ▪ Ready for export: spam and deleted data are excluded, along with trees consisting only of an initial prompt (prompt_lottery_waiting), low-quality trees (aborted_low_grade), and halted trees (halted_by_moderator); see the loading sketch after this section
      • 10,364 conversation trees
      • 88,838 messages
  ◦ oasst-mix: additional data for SFT
    ▪ vicuna (non-commercial)
    ▪ code_alpaca (non-commercial)
    ▪ dolly15k (commercial)
    ▪ grade_school_math_instructions
  ◦ RM data: additional data for reward-model (RM) training
    ▪ SHP: ~385K Stanford Human Preferences dataset
    ▪ hellaswag
    ▪ hf_summary_pairs
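
A minimal sketch of working with the released conversation data. It assumes the HuggingFace "OpenAssistant/oasst1" dataset (which corresponds to the ready-for-export subset), its documented field names (message_tree_id, parent_id, message_id, role, text), and that the root prompt's parent_id is None; the thread-walking helper is illustrative and not part of the project's code.

```python
from collections import defaultdict
from datasets import load_dataset

# Ready-for-export messages; the "OpenAssistant/oasst1" dataset on HuggingFace
# corresponds to this subset (~88.8k messages).
ds = load_dataset("OpenAssistant/oasst1", split="train")

# Messages are stored flat, so regroup them into conversation trees.
trees = defaultdict(list)
for msg in ds:
    trees[msg["message_tree_id"]].append(msg)

print(f"{len(ds)} messages in {len(trees)} conversation trees")

def one_thread(messages):
    """Walk a single branch of one tree, from the root prompt downwards."""
    by_parent = defaultdict(list)
    for m in messages:
        by_parent[m["parent_id"]].append(m)
    thread, node = [], by_parent[None][0]  # root prompt: assumed to have parent_id == None
    while node is not None:
        thread.append((node["role"], node["text"]))
        children = by_parent.get(node["message_id"], [])
        node = children[0] if children else None  # follow the first reply at each level
    return thread

# Print one prompter/assistant thread from the first tree.
for role, text in one_thread(next(iter(trees.values()))):
    print(f"{role}: {text[:80]}")
```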
• Models
  ◦ Fine-tunes of LLaMA, Pythia, and StableLM
    ▪ Largest model: a 30B model fine-tuned from LLaMA
    ▪ LLaMA is non-commercial
    ▪ Pythia is commercially usable
  ◦ Models released as of April 2023
    ▪ sft-7-llama-30b-xor & sft-6-llama-30b-xor: the LLaMA license does not allow redistributing the weights directly, so they are released as XOR weights (see the sketch after this list)
      • Data: oasst-mix
    ▪ stablelm-7b-sft-v7-epoch-3: 7th English SFT model, trained for 3 epochs from stablelm-base-alpha-7b
      • CC-BY-SA-4.0
      • Data: oasst-mix
    ▪ sft-1-pythia-12b: 1st English SFT model, trained from pythia-12b-deduped
      • Apache 2.0
      • Data: oasst only
    ▪ sft-4-pythia-12b-epoch-3.5: 4th English SFT model, trained for 3.5 epochs from pythia-12b-deduped
      • Apache 2.0
      • Data: oasst_export
    ▪ rm-2.1-pythia-1.4b-epoch-2.5: RM trained for 10k steps from pythia-1.4b-gpt4all-pretrain
      • Data: RM data
    ▪ rm-2-pythia-6.9b-epoch-1: RM trained for 3.5k steps from pythia-6.9b-gpt4all-pretrain
      • Data: RM data
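
The XOR release works roughly as sketched below. This is only an illustration of the idea under stated assumptions (the project ships its own conversion tooling in the Open-Assistant repository, and the function names here are hypothetical): XOR-ing the fine-tuned weights with the original LLaMA weights yields a delta that reveals nothing without the originals, and anyone who already holds the LLaMA weights can invert it exactly.

```python
import numpy as np

def xor_encode(finetuned: np.ndarray, base: np.ndarray) -> np.ndarray:
    """Distributable delta: byte-wise XOR of the fine-tuned and base tensors."""
    return np.bitwise_xor(finetuned.view(np.uint8), base.view(np.uint8))

def xor_decode(delta: np.ndarray, base: np.ndarray) -> np.ndarray:
    """Recover the fine-tuned weights: (finetuned XOR base) XOR base == finetuned."""
    return np.bitwise_xor(delta, base.view(np.uint8)).view(base.dtype)

# Toy tensors standing in for one LLaMA weight and its fine-tuned counterpart.
base = np.random.randn(4, 4).astype(np.float16)
finetuned = base + 0.01 * np.random.randn(4, 4).astype(np.float16)

delta = xor_encode(finetuned, base)          # this is what gets published
recovered = xor_decode(delta, base)          # requires the original LLaMA tensor
assert np.array_equal(recovered, finetuned)  # reconstruction is bit-exact
```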
• Limitations
  ◦ Biases exist because the data was annotated mostly by men with an average age of 26
  ◦ From the paper: "We strongly encourage researchers to thoroughly investigate the safety and bias of the models before employing them in downstream tasks. It is important to recognize that the released models may exhibit unsafe behavior and are likely susceptible to prompt injection attacks."