Vicuna

Affiliation
UC Berkeley
CMU
Stanford
UC San Diego
Commercial
Fine-tuning Method
SFT
Note
- blog, github, demo, huggingface (GPTQ 4-bit), Data & Cleansing & Finetuning
  . data links: ShareGPT (ChatGPT conversations shared via a browser extension; website, data), alpaca instruct data, ShareGPT filtered data
- Model (training code adapted from alpaca)
  . input size 2048
  . requires 60 GB+ CPU RAM and 28 GB+ GPU VRAM (confirmed to run on a 3090, though)
  . checkpoints are distributed as delta weights to be added onto the LLaMA 13B checkpoint (see the sketch below)
  . checkpoints run with huggingface transformers
  . fine-tuning ran on 8x A100 with PyTorch FSDP for one day; using SkyPilot (link), the training cost was optimized from $1000 → $300
  . gradient checkpointing & flash attention
- Tools
  . Training: SkyPilot (managed spot jobs)
  . Serving: FastChat (gradio)
  . Evaluation: GPT-4
- comparison: https://vicuna.lmsys.org/eval/
  . In the GPT-4-based evaluation, ChatGPT ~ Bard > Vicuna > Alpaca > LLaMA
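Since the release ships only delta weights, the target checkpoint has to be reconstructed as base + delta (FastChat ships an official apply_delta script for this). Below is a minimal hand-rolled sketch of the same idea, assuming the delta checkpoint shares parameter names and shapes with the base model; all paths are illustrative placeholders, not official ones.

```python
# Minimal sketch: reconstruct Vicuna weights as base + delta.
# Assumes the delta checkpoint shares parameter names/shapes with the
# base model; paths are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "path/to/llama-13b", torch_dtype=torch.float16, low_cpu_mem_usage=True
)
delta = AutoModelForCausalLM.from_pretrained(
    "path/to/vicuna-13b-delta", torch_dtype=torch.float16, low_cpu_mem_usage=True
)

delta_state = delta.state_dict()
with torch.no_grad():
    for name, param in base.named_parameters():
        # target weight = base weight + released delta weight
        param.add_(delta_state[name])

base.save_pretrained("path/to/vicuna-13b")
AutoTokenizer.from_pretrained("path/to/vicuna-13b-delta").save_pretrained(
    "path/to/vicuna-13b"
)
```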
Data
- ShareGPT: 53k cleaned English samples obtained from ChatGPT (out of ~100k original samples) → 70k samples were reportedly used for the actual training
  . multi-turn
  . preprocessing: HTML → Markdown / filter out low-quality samples / divide lengthy conversations into smaller segments that fit the model's maximum context length (see the sketch below)
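A minimal sketch of the "split long conversations" preprocessing step, assuming conversations are lists of {"role", "content"} turns; the 2048 limit mirrors the note above, and the helper name and tokenizer path are our own illustrative choices.

```python
# Minimal sketch: greedily pack whole turns into segments that fit the
# model's maximum context length. Over-long single turns are kept as
# their own segment and not split further.
from transformers import AutoTokenizer

MAX_LEN = 2048
tokenizer = AutoTokenizer.from_pretrained("path/to/llama-13b")  # illustrative path

def split_conversation(turns, max_len=MAX_LEN):
    """Split one multi-turn conversation into token-budgeted segments."""
    segments, current, current_len = [], [], 0
    for turn in turns:
        n_tokens = len(tokenizer(turn["content"]).input_ids)
        if current and current_len + n_tokens > max_len:
            segments.append(current)
            current, current_len = [], 0
        current.append(turn)
        current_len += n_tokens
    if current:
        segments.append(current)
    return segments
```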
Model size
13B (LLaMA fine-tuning)
Newly provided resources
Model
InstructData
Release date
2023-03

Quality Assessed by GPT-4

Comparison between several notable models

Instruct Data Example

instruction - input - output : "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. ### Instruction: Identify the odd one out. ### Input: Twitter, Instagram, Telegram ### Response: Telegram"
instruction - output : "Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: What are the three primary colors? ### Response: The three primary colors are red, blue, and yellow."
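The two examples above follow the standard Alpaca prompt templates (with and without an input field). A minimal sketch of rendering them; the function name and dict fields are our own illustrative choices, the template text is taken from the examples.

```python
# Minimal sketch: render an instruct-data example into an Alpaca-style prompt.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n"
)

def render_prompt(example: dict) -> str:
    """Pick the template based on whether the example has a non-empty input."""
    if example.get("input"):
        return PROMPT_WITH_INPUT.format(**example)
    return PROMPT_NO_INPUT.format(instruction=example["instruction"])

print(render_prompt({"instruction": "Identify the odd one out.",
                     "input": "Twitter, Instagram, Telegram"}))
```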