Category | Component | Owner | Closed source or proprietary | OSS license | Commercial use | Model size (B) | Release date | Code/paper | Stars | Description |
Multi-Modal | ImageBind | Meta | License | No | Github | 5.9k | ImageBind: One Embedding Space To Bind Them All | |||
Image | DeepFloyd IF | stability.ai | License, Model license | Github | 6.4k | Text-to-image model with a high degree of photorealism and language understanding | |||
Image | Stable Diffusion Version 2 | stability.ai | MIT, unknown | Github | 23.5k | High-Resolution Image Synthesis with Latent Diffusion Models | ||||
Image | DALL-E | OpenAI | Modified MIT | Yes | Github | 10.3k | PyTorch package for the discrete VAE used for DALL·E. | |||
Image | DALL·E 2 | OpenAI | Yes | product | ||||||
Image | DALLE2-pytorch | lucidrains | MIT | Yes | Github | 9.7k | Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch | |||
Speech | Whisper | OpenAI | MIT | Yes | Github | 37.7k | Robust Speech Recognition via Large-Scale Weak Supervision | |||
Speech | MMS | Meta | Yes | paper | ||||||
Code model | Codex | OpenAI | Yes | 12 | July 2021 | blog | Paper | |||
Code model | AlphaCode | 41 | Feb 2022 | Competition-Level Code Generation with AlphaCode | ||||||
Code model | StarCoder | BigCode | No | Apache | 15 | May 2023 | Github | 4.8k | Language model (LM) trained on source code and natural language text |
Code model | CodeGen | Salesforce | No | ? | Github | 3.6k | model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. | |||
Code model | Replit Code | replit | 3 | May 2023 | replit-code-v1-3b model is a 2.7B LLM trained on 20 languages from the Stack Dedup v1.2 dataset. | |||||
Code model | CodeGen2 | Salesforce | BSD | Yes | 1, 3, 7, 16 | May 2023 | Github | Code models for program synthesis. | ||
Code model | CodeT5 and CodeT5+ | Salesforce | BSD | Yes | 16 | May 2023 | CodeT5 | CodeT5 and CodeT5+ models for Code Understanding and Generation from Salesforce Research. | ||
language model | GPT | June 2018 | GPT | Improving Language Understanding by Generative Pre-Training | ||||||
language model | BERT | Oct 2018 | BERT | Bidirectional Encoder Representations from Transformers | ||||||
language model | RoBERTa | 0.125 - 0.355 | July 2019 | RoBERTa | A Robustly Optimized BERT Pretraining Approach | |||||
language model | GPT-2 | 1.5 | Nov 2019 | GPT-2 | Language Models are Unsupervised Multitask Learners | |||||
language model | T5 | 0.06 - 11 | Oct 2019 | Flan-T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | |||||
language model | XLNet | Jun 2019 | XLNet | Generalized Autoregressive Pretraining for Language Understanding | |||||
language model | ALBERT | 0.235 | Sep 2019 | ALBERT | A Lite BERT for Self-supervised Learning of Language Representations | |||||
language model | CTRL | 1.63 | Sep 2019 | CTRL | CTRL: A Conditional Transformer Language Model for Controllable Generation | |||||
language model | GPT-3 | Azure | Yes | 175 | May 2020 | Paper | Language Models are Few-Shot Learners | |||
language model | GShard | 600 | Jun 2020 | Paper | GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding | |||||
language model | BART | Jul 2020 | BART | Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | ||||||
language model | mT5 | 13 | Oct 2020 | mT5 | mT5: A massively multilingual pre-trained text-to-text transformer | |||||
language model | PanGu-α | 13 | April 2021 | PanGu-α | PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation | |||||
language model | CPM-2 | 198 | Jun 2021 | CPM | CPM-2: Large-scale Cost-effective Pre-trained Language Models | |||||
language model | GPT-J 6B | EleutherAI | No | Yes | 6 | June 2021 | GPT-J-6B | A 6 billion parameter, autoregressive text generation model trained on The Pile. | ||
language model | ERNIE 3.0 | Baidu | Yes | 10 | July 2021 | ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation | ||||
language model | Jurassic-1 | 178 | Aug 2021 | Jurassic-1: Technical Details and Evaluation | ||||||
language model | ERNIE 3.0 Titan | 260 | Dec 2021 | ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation | |||||
language model | HyperCLOVA | 82 | Sep 2021 | What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers | ||||||
language model | FLAN | 137 | Oct 2021 | Paper | Finetuned Language Models Are Zero-Shot Learners | |||||
language model | GPT-3.5 | Azure | Yes | |||||||
language model | GPT-4 | Azure | Yes | March 2023 | ||||||
language model | T0 | 11 | Oct 2021 | T0 | Multitask Prompted Training Enables Zero-Shot Task Generalization | |||||
language model | Yuan 1.0 | 245 | Oct 2021 | Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning | ||||||
language model | WebGPT | 175 | Dec 2021 | WebGPT: Browser-assisted question-answering with human feedback | ||||||
language model | Gopher | 280 | Dec 2021 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | ||||||
language model | GLaM | 1200 | Dec 2021 | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | ||||||
language model | LaMDA | Bard | Yes | 137 | Jan 2022 | Paper | LaMDA: Language Models for Dialog Applications | |||
language model | MT-NLG | 530 | Jan 2022 | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | ||||||
language model | InstructGPT | 175 | Mar 2022 | Training language models to follow instructions with human feedback | ||||||
language model | Chinchilla | 70 | Mar 2022 | Shows that for a compute budget, the best performances are not achieved by the largest models but by smaller models trained on more data. | ||||||
language model | GPT-NeoX-20B | 20 | April 2022 | GPT-NeoX-20B | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | |||||
language model | Tk-Instruct | 11 | April 2022 | Tk-Instruct-11B | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks | |||||
language model | PaLM | Yes | 540 | April 2022 | PaLM: Scaling Language Modeling with Pathways | |||||
language model | OPT | Meta | No | Yes | 175 | May 2022 | OPT-13B, OPT-66B, Paper | OPT: Open Pre-trained Transformer Language Models | ||
language model | OPT-IML | 30, 175 | Dec 2022 | OPT-IML | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization | |||||
language model | GLM-130B | 130 | Oct 2022 | GLM-130B | GLM-130B: An Open Bilingual Pre-trained Model | |||||
language model | AlexaTM | 20 | Aug 2022 | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | ||||||
language model | Flan-T5 | 11 | Oct 2022 | Flan-T5-xxl | Scaling Instruction-Finetuned Language Models | |||||
language model | Sparrow | 70 | Sep 2022 | Improving alignment of dialogue agents via targeted human judgements | ||||||
language model | UL2 | 20 | Oct 2022 | UL2, Flan-UL2 | UL2: Unifying Language Learning Paradigms | |||||
language model | U-PaLM | 540 | Oct 2022 | Transcending Scaling Laws with 0.1% Extra Compute | ||||||
language model | BLOOM | BigScience | No | Yes | 176 | Nov 2022 | BLOOM, Paper | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | ||
language model | mT0 | 13 | Nov 2022 | mT0-xxl | Crosslingual Generalization through Multitask Finetuning | |||||
language model | Galactica | 0.125 - 120 | Nov 2022 | Galactica | Galactica: A Large Language Model for Science | |||||
language model | ChatGPT | Nov 2022 | A model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. | |||||||
language model | LLaMA | Meta | No | No | 7, 13, 33, 65 | Feb 2023 | Paper, LLaMA | LLaMA: Open and Efficient Foundation Language Models | ||
language model | PanGu-Σ | Yes | 1085 | March 2023 | PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing | |||||
language model | BloombergGPT | 50 | March 2023 | BloombergGPT: A Large Language Model for Finance | ||||||
language model | Cerebras-GPT | Cerebras | No | Yes | 0.111 - 13 | March 2023 | HF | Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster | ||
language model | oasst-sft-1-pythia-12b | LAION-AI | No | Yes | 12 | March 2023 | HF | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. | ||
language model | Pythia | EleutherAI | No | Yes | 0.070 - 12 | March 2023 | Pythia, Paper | A suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. | ||
language model | StableLM | Stability AI | No | No | 3, 7 | April 2023 | Github | Stability AI's StableLM series of language models | |||
language model | Dolly 2.0 | Databricks | No | Yes | 3, 7, 12 | April 2023 | Dolly | An instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use. | ||
language model | DLite | 0.124 - 1.5 | May 2023 | HF | Lightweight instruction-following models which exhibit ChatGPT-like interactivity. | |||||
language model | MPT-7B | MosaicML | No | Apache 2 | Yes | 7 | May 5, 2023 | blog | A GPT-style model, and the first in the MosaicML Foundation Series of models. | |
language model | h2oGPT | 12 | May 2023 | HF | h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. | ||||||
language model | LIMA | 65 | May 2023 | A 65B-parameter LLaMA language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. | ||||||
language model | RedPajama-INCITE | 3, 7 | May 2023 | HF | A family of models including base, instruction-tuned & chat models. | |||||
language model | Gorilla | 7 | May 2023 | Gorilla | Gorilla: Large Language Model Connected with Massive APIs | |||||
language model | Med-PaLM 2 | May 2023 | Towards Expert-Level Medical Question Answering with Large Language Models | |||||||
language model | PaLM 2 | May 2023 | A language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. | ||||||||
language model | Falcon LLM | 7, 40 | May 2023 | 7B, 40B | Foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens. | |||||
language model | Claude | Anthropic | Yes | |||||||
language model | GPT-Neo | EleutherAI | No | Yes | ||||||
language model | GPT-NeoX | EleutherAI | No | Yes | 20 | Feb 2022 | Paper | |||
language model | FastChat-T5-3B | LMSYS | No | Apache | Yes | April 2023 | ||||
language model | OpenLLaMA | openlm-research | No | Yes | ||||||
language model | OpenChatKit | Together | No | Yes | ||||||
language model | YaLM | Yandex | No | Yes | 100 | June 2022 | Github | |||
language model | ChatGLM-6B | Tsinghua | No | ChatGLM-6B | No | 6 | March 2023 | Github | |||
language model | Alpaca | Stanford | No | No | ||||||
language model | Vicuna | No | No | 13 | March 2023 | Blog | ||||
language model | StableVicuna | No | No | |||||||
language model | RWKV-4-Raven-7B | BlinkDL | No | No | ||||||
language model | Alpaca-LoRA | tloen | No | No | ||||||
language model | Koala | BAIR | No | No | 13 | April 2023 | Blog |
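
Most of the open-licensed checkpoints listed above (Flan-T5, GPT-J, Pythia, MPT-7B, Falcon, Dolly 2.0, etc.) are distributed through the Hugging Face Hub and can be tried locally with the `transformers` library. A minimal sketch, assuming `transformers` and `torch` are installed; the `google/flan-t5-small` checkpoint is used purely as an illustrative pick:

```python
# Minimal sketch: load one of the open checkpoints from the table above
# via Hugging Face transformers. "google/flan-t5-small" is only an
# illustrative choice; other open seq2seq checkpoints load the same way.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Instruction-style prompt, since Flan-T5 is instruction-finetuned.
inputs = tokenizer(
    "Translate English to German: The table lists many language models.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Decoder-only entries (GPT-J, Pythia, MPT, Falcon, etc.) load analogously through `AutoModelForCausalLM` instead of `AutoModelForSeq2SeqLM`.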