OPT

Affiliation
Meta AI
Commercial
Fine-tuning Method
Note
Data
- RoBERTa pre-training data (BookCorpus, Stories, CCNews v2)
- The Pile (CommonCrawl, DM Mathematics, Project Gutenberg, HackerNews, OpenSubtitles, OpenWebText2, USPTO, Wikipedia)
- PushShift.io Reddit (only the longest comment chain per thread is kept; 66%)
→ Extraction focused on English text (CommonCrawl contains a small amount of non-English data)
→ Near-duplicate documents filtered out with MinHashLSH (Jaccard similarity ≥ .95; the Pile contains many duplicate documents)
→ Tokenized with the GPT-2 BPE tokenizer
→ Final data: 180B tokens (dedup/tokenization sketched below)
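A minimal sketch of the deduplication and tokenization steps listed above, assuming the `datasketch` and `transformers` libraries. The document contents, shingling scheme, and variable names are illustrative placeholders, not the exact OPT pipeline.

```python
from datasketch import MinHash, MinHashLSH
from transformers import GPT2TokenizerFast

# Hypothetical corpus; real inputs would be documents from the sources listed above.
docs = {
    "doc-a": "OPT is a suite of decoder-only transformers released by Meta AI.",
    "doc-b": "OPT is a suite of decoder-only transformers released by Meta AI.",  # exact duplicate (e.g., scraped twice)
    "doc-c": "The Pile aggregates many smaller datasets such as OpenWebText2.",
}

def minhash(text: str, num_perm: int = 128) -> MinHash:
    """Build a MinHash signature from whitespace-token shingles (illustrative choice)."""
    m = MinHash(num_perm=num_perm)
    for token in text.lower().split():
        m.update(token.encode("utf-8"))
    return m

# LSH index with the Jaccard-similarity threshold noted above (0.95).
lsh = MinHashLSH(threshold=0.95, num_perm=128)
kept = []
for doc_id, text in docs.items():
    sig = minhash(text)
    if lsh.query(sig):  # a near-duplicate is already indexed -> drop this document
        continue
    lsh.insert(doc_id, sig)
    kept.append(doc_id)

# Tokenize the surviving documents with the GPT-2 BPE tokenizer.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
total_tokens = sum(len(tokenizer.encode(docs[d])) for d in kept)
print(f"kept {len(kept)} docs, {total_tokens} tokens")
```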
Model size
175B
Newly released resource
Model
Release date
2022-05