- RoBERTa Data (BookCorpus, Stories, CCNews v2)
- the Pile (CommonCrawl, DM Mathematics, Project Gutenberg, HackerNews, OpenSubtitles, OpenWebText2, USPTO, Wikipedia)
- PushShift.io Reddit (keep only the longest comment chain in each thread; this discards roughly 66% of the corpus)
→ mostly English text extracted (CommonCrawl contains a small amount of non-English data)
→ deduplicated with MinhashLSH (documents with Jaccard similarity ≥ .95 filtered out; the Pile contains many duplicate documents)
→ GPT-2 BPE tokenizer
→ final dataset: 180B tokens
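The MinhashLSH deduplication step above can be sketched in plain Python. This is an illustrative toy, not the actual pipeline: the character 5-gram shingles, 128 permutations, and 32×4 banding scheme are assumed parameters (production setups typically use a library such as datasketch).

```python
import hashlib
import random
from collections import defaultdict

def shingles(text, n=5):
    """Character n-gram shingle set of a document (n=5 is an assumption)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def _h(salt, shingle):
    """Deterministic 64-bit hash of a shingle under a given salt."""
    data = f"{salt}:{shingle}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash(shingle_set, num_perm=128, seed=0):
    """MinHash signature: for each salted hash function, keep the minimum."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(32) for _ in range(num_perm)]
    return [min(_h(s, sh) for sh in shingle_set) for s in salts]

def est_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def lsh_candidates(sigs, bands=32, rows=4):
    """Band each signature; docs sharing any band bucket become candidates."""
    buckets = defaultdict(list)
    for doc_id, sig in enumerate(sigs):
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].append(doc_id)
    pairs = set()
    for ids in buckets.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                pairs.add((ids[i], ids[j]))
    return pairs

def dedup(docs, threshold=0.95):
    """Drop the later document of any candidate pair at or above threshold."""
    sigs = [minhash(shingles(d)) for d in docs]
    dropped = set()
    for i, j in sorted(lsh_candidates(sigs)):
        if j not in dropped and est_jaccard(sigs[i], sigs[j]) >= threshold:
            dropped.add(j)
    return [d for k, d in enumerate(docs) if k not in dropped]
```

The LSH banding avoids comparing all document pairs: only documents that collide in at least one band bucket are checked against the 0.95 threshold.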
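The GPT-2 tokenizer mentioned above is a byte-level BPE model with a fixed, pre-trained merge table. The BPE merge-learning rule itself (repeatedly merge the most frequent adjacent symbol pair) can be illustrated with a toy trainer; the corpus and merge count below are made up for illustration and have nothing to do with GPT-2's actual 50k-merge vocabulary.

```python
from collections import Counter

def bpe_train(corpus, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent adjacent pair."""
    words = [tuple(w) for w in corpus]  # each word as a tuple of symbols
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            pairs.update(zip(w, w[1:]))  # count adjacent symbol pairs
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_words = []
        for w in words:
            out, i = [], 0
            while i < len(w):
                # replace every occurrence of the winning pair with one symbol
                if i < len(w) - 1 and (w[i], w[i + 1]) == best:
                    out.append(w[i] + w[i + 1])
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            new_words.append(tuple(out))
        words = new_words
    return merges, words
```

At tokenization time, GPT-2 applies its learned merges in order to raw bytes rather than characters, so any input is representable without unknown tokens.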