Fine-tuning Method
only focusing on coding capability
Model size
Newly provided resources
SLM (small language model) research has recently been very active at Microsoft, and most of the approaches improve performance through data organization & synthetic data generation.


2.7B, 2023/12: phi-1 -> phi-1.5 -> phi-2. A small model is trained on less refined data -> a larger model is scaled up from the small model & trained on more refined data
2.7 billion parameters, trained for 14 days on 96 A100 GPUs
1.4T tokens from multiple passes over a mixture of synthetic and web datasets for NLP and coding
Phi-2 is a base model that has not undergone alignment through reinforcement learning from human feedback (RLHF), nor has it been instruct fine-tuned. ⇒ In other words, even without advanced training techniques such as RLHF & instruct fine-tuning, raising data quality and using well-organized synthetic data can be enough to reach higher performance.
The importance of data quality & data organization, powered by GPT-4: phi-1 => phi-1.5 (from phi-1) with higher-quality data => phi-2 (from phi-1.5) with higher-quality data
data organization: the data is split into fine-grained domains, and within each domain the model is trained on data of progressively higher quality (science, daily activities, theory of mind, among others)
mixed with synthetic data & web data filtered on educational value and content quality: GPT-4 is used with per-domain prompting for data filtering, data crafting, and enrichment with NL explanations
On coding, it outperforms Llama-2-70B, a model 25x larger, & scores far higher than the larger Llama-2-13B, Mistral-7B, and Gemini Nano 2 (3.2B) (strong results on math & coding)
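The data organization and filtering steps noted above (split by domain, drop low-value samples, then present higher-quality data later) can be sketched roughly as follows. This is only an illustration under assumptions, not the actual phi pipeline: the `Sample` type, the `quality` field, the 0.5 threshold, and the example texts are all hypothetical, and in practice the score would come from a GPT-4 prompt rather than a hard-coded number.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    text: str
    domain: str     # e.g. "science", "daily activities"
    quality: float  # hypothetical 0..1 educational-value score (assumed to come from an LLM grader)


def filter_by_quality(samples, threshold):
    """Keep only samples whose score is at or above the threshold."""
    return [s for s in samples if s.quality >= threshold]


def organize_by_domain(samples):
    """Group samples by domain; within each domain, order from lower to higher quality."""
    buckets = defaultdict(list)
    for s in samples:
        buckets[s.domain].append(s)
    return {d: sorted(b, key=lambda s: s.quality) for d, b in buckets.items()}


# Hypothetical toy data, for illustration only.
samples = [
    Sample("Photosynthesis converts light into chemical energy.", "science", 0.9),
    Sample("click here to win!!!", "science", 0.1),
    Sample("Water boils at 100 C at sea level.", "science", 0.7),
    Sample("Boil pasta for about ten minutes.", "daily activities", 0.6),
]

kept = filter_by_quality(samples, threshold=0.5)
curriculum = organize_by_domain(kept)

for domain, items in curriculum.items():
    print(domain, [s.quality for s in items])
```

The filter runs before the grouping so that low-value web text never enters any domain bucket; the per-domain sort is one simple way to express "progressively higher quality".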


phi-1: 1.3B parameters, trained for 4 days on 8 A100s
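To put the two training budgets in these notes side by side, a quick back-of-the-envelope calculation in GPU-hours may help. It uses only the figures quoted above (4 days on 8 A100s for phi-1, 14 days on 96 A100s for phi-2) and assumes the GPUs were busy for the full wall-clock time, which is a simplification; the helper name `gpu_hours` is just for illustration.

```python
def gpu_hours(days: float, num_gpus: int) -> float:
    """Total GPU-hours = wall-clock days * 24 hours/day * number of GPUs."""
    return days * 24 * num_gpus


phi1_hours = gpu_hours(days=4, num_gpus=8)    # phi-1: 1.3B parameters
phi2_hours = gpu_hours(days=14, num_gpus=96)  # phi-2: 2.7B parameters

compute_ratio = phi2_hours / phi1_hours
param_ratio = 2.7 / 1.3

print(f"phi-1: {phi1_hours:,.0f} GPU-hours")
print(f"phi-2: {phi2_hours:,.0f} GPU-hours")
print(f"compute ratio: {compute_ratio:.0f}x, parameter ratio: {param_ratio:.1f}x")
```

Under these assumptions phi-2 used about 42x the GPU time of phi-1 for roughly 2x the parameters, which is consistent with the much larger token budget (1.4T tokens) noted for phi-2.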


These license terms are an agreement between you and Microsoft Corporation (or one of its affiliates). They apply to the source code, object code, machine learning models, or data (collectively “Materials”) that accompany this license. IF YOU COMPLY WITH THESE LICENSE TERMS, YOU HAVE THE RIGHTS BELOW. BY USING THE MATERIALS, YOU ACCEPT THESE TERMS. 1) INSTALLATION AND USE RIGHTS TO THE MATERIALS. Subject to the terms of this agreement, you have the below rights, if applicable, to use the Materials solely for non-commercial, non-revenue generating, research purposes: