
ColossalChat (Coati)

Affiliation
HPC-AI tech
Commercial
Fine-tuning Method
RLHF
SFT
Note
- Homepage, blog, ColossalAI GitHub, ColossalChat GitHub
- Provides a pipeline for training giant GPT-style models
- 10x speedup, 47x cost savings, models beyond 175B parameters
- The first to open-source a complete RLHF pipeline, from training to deployment
- Provides training and inference pipelines: 1.4x faster inference and 7.7x faster training than plain PyTorch, and can train models 10.3x larger (3x faster training than the Alpaca codebase)
- Automatic parallelism, memory management, dynamic scheduling, and "data / pipeline / sequence / tensor parallelism"
- Chip- and cloud-agnostic: GPUs, TPUs, FPGAs, CPUs
- ColossalAI Talking Intelligence (Coati models): provides the RLHF pipeline
• Supports Colossal-AI's comprehensive large-model training accelerations without requiring knowledge of complex distributed training algorithms
• Supervised dataset collection
• Supervised instruction fine-tuning
• Reward model training
• Reinforcement learning with human feedback
• Quantized inference
• Fast model deployment
• Integrates cleanly with the Hugging Face ecosystem and allows a high degree of model customization
- Limitations of the LLaMA-fine-tuned models and the dataset
. Limitations of the LLaMA fine-tuned models: knowledge missing from LLaMA / weak counting ability / weak logic (reasoning and calculation) / tendency to repeat the last sentence / poor multilingual results
. Limitations of the InstructWild dataset: weak summarization ability / no multi-turn chat or role-playing / no self-recognition / safety not addressed
- Quantization and deployment support (see the RTN sketch after this block)
. 8-bit quantization (RTN), 4-bit quantization (GPTQ), and FP16 inference
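The 8-bit RTN path mentioned above is plain round-to-nearest against a scale factor. A minimal PyTorch sketch of the idea, as an illustration only and not ColossalChat's actual kernels:

```python
import torch

def quantize_rtn_8bit(w: torch.Tensor):
    """Symmetric per-tensor round-to-nearest (RTN) 8-bit quantization."""
    scale = w.abs().max() / 127.0                      # map max |w| to 127
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover approximate float weights for inference."""
    return q.to(torch.float32) * scale

w = torch.randn(4096, 4096)
q, scale = quantize_rtn_8bit(w)
w_hat = dequantize(q, scale)
print((w - w_hat).abs().max())  # error is bounded by ~scale / 2
```

Real deployments usually quantize per channel or per group to reduce this error; GPTQ's 4-bit path goes further by adjusting the remaining weights to compensate for the rounding error of each quantized weight.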
Data
- InstructWild (link): 52K instructions for English (24M tokens) & 52K instructions for Chinese (30M tokens)
. Collection process: gathered 700 noisy instructions from Twitter and filtered out the noisy ones → 429 clean seed instructions (unlike Alpaca, instructions were collected without topic constraints) → ChatGPT, shown 5 of these as in-context examples, generates new instructions → ChatGPT then produces a response for each generated instruction (see the sketch after this list)
. English and Chinese instructions were collected separately
. Total data-collection cost: $880
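The two ChatGPT stages above form a self-instruct-style loop. A hypothetical sketch, assuming the openai Python client (openai>=1.0); the model name, prompt wording, and seed strings are illustrative placeholders, not the authors' exact setup:

```python
import random
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Stand-ins for the 429 cleaned Twitter seed instructions.
seed_instructions = [
    "Explain why the sky is blue to a five-year-old.",
    "Write a short poem about autumn.",
    "Summarize the plot of Hamlet in two sentences.",
]

def generate_instructions(n_examples: int = 5) -> str:
    """Stage 1: show a few seed instructions, ask ChatGPT for new ones."""
    k = min(n_examples, len(seed_instructions))
    examples = "\n".join(random.sample(seed_instructions, k))
    prompt = (
        "Here are some example instructions:\n"
        f"{examples}\n"
        "Write 10 new, diverse instructions in the same style."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def collect_response(instruction: str) -> str:
    """Stage 2: collect a response for each generated instruction."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": instruction}],
    )
    return resp.choices[0].message.content
```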
Model size
7B (LLaMA, RLHF)
Newly released resources
Training/Inference Pipeline
InstructWild dataset
Release date
2023-03-29

References

RLHF

ZeRO + Gemini to Reduce Memory Redundancy

Colossal-AI supports ZeRO (Zero Redundancy Optimizer) to improve memory usage efficiency, enabling larger models to be accommodated at a lower cost, without affecting computing granularity and communication efficiency.
The automatic chunk mechanism can further improve ZeRO’s performance by increasing memory usage efficiency, reducing communication frequency, and avoiding memory fragmentation.
The heterogeneous memory space manager, Gemini, supports unloading optimizer states from GPU memory to CPU memory or hard disk space to overcome the limitations of GPU memory capacity, expand the scale of trainable models, and reduce the cost of large AI model applications.
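For intuition, the toy PyTorch optimizer below keeps its Adam state tensors in CPU memory and touches the GPU only to apply the update. This is a conceptual sketch of the offloading idea, not Gemini's implementation, which manages placement automatically and at chunk granularity:

```python
import torch

class CPUOffloadAdam:
    """Adam whose moment buffers live on the CPU (offloading sketch).

    Adam normally keeps two extra float tensors per parameter on the GPU;
    hosting them on the CPU frees that memory for larger models, at the
    cost of per-step host-device transfers.
    """

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
        self.params = [p for p in params]
        self.lr, self.betas, self.eps = lr, betas, eps
        self.t = 0
        # Optimizer states are allocated on the CPU, not the GPU.
        self.m = [torch.zeros_like(p, device="cpu") for p in self.params]
        self.v = [torch.zeros_like(p, device="cpu") for p in self.params]

    @torch.no_grad()
    def step(self):
        self.t += 1
        b1, b2 = self.betas
        for p, m, v in zip(self.params, self.m, self.v):
            if p.grad is None:
                continue
            g = p.grad.to("cpu")                       # gradient to CPU
            m.mul_(b1).add_(g, alpha=1 - b1)           # first moment
            v.mul_(b2).addcmul_(g, g, value=1 - b2)    # second moment
            m_hat = m / (1 - b1 ** self.t)             # bias correction
            v_hat = v / (1 - b2 ** self.t)
            update = self.lr * m_hat / (v_hat.sqrt() + self.eps)
            p.add_(update.to(p.device), alpha=-1)      # apply on device
```

In Gemini this offloading is decided dynamically at runtime and, combined with the chunk mechanism above, reduces transfer frequency and avoids memory fragmentation.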