bytedance/sdxl-lightning-4step |
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps |
323955 |
meta/meta-llama-3-70b-instruct |
A 70 billion parameter language model from Meta, fine tuned for chat completions |
58277 |
salesforce/blip |
Bootstrapping Language-Image Pre-training |
37319 |
nightmareai/real-esrgan |
Real-ESRGAN with optional face correction and adjustable upscale |
37015 |
tencentarc/gfpgan |
Practical face restoration algorithm for old photos or AI-generated faces |
21712 |
stability-ai/sdxl |
A text-to-image generative AI model that creates beautiful images |
21205 |
openai/whisper |
Convert speech in audio to text |
15102 |
yorickvp/llava-13b |
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities |
14386 |
meta/meta-llama-3-8b |
Base version of Llama 3, an 8 billion parameter language model from Meta. |
14337 |
lucataco/qwen-vl-chat |
A multimodal LLM-based AI assistant, which is trained with alignment techniques. Qwen-VL-Chat supports more flexible interaction, such as multi-round question answering, and creative capabilities. |
12088 |
meta/llama-2-7b-chat |
A 7 billion parameter language model from Meta, fine tuned for chat completions |
11292 |
xinntao/gfpgan |
Practical face restoration algorithm for old photos or AI-generated faces |
8686 |
philz1337x/clarity-upscaler |
High resolution image Upscaler and Enhancer. Use at ClarityAI.cc. A free Magnific alternative. Twitter/X: @philz1337x |
8341 |
andreasjansson/clip-features |
Return CLIP features for the clip-vit-large-patch14 model |
8273 |
mistralai/mixtral-8x7b-instruct-v0.1 |
The Mixtral-8x7B-instruct-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts tuned to be a helpful assistant. |
7625 |
lucataco/proteus-v0.2 |
Proteus v0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities. |
5300 |
m1guelpf/nsfw-filter |
Run any image through the Stable Diffusion content filter |
5197 |
andreasjansson/blip-2 |
Answers questions about images |
4635 |
mistralai/mistral-7b-v0.1 |
A 7 billion parameter language model from Mistral. |
3814 |
daanelson/real-esrgan-a100 |
Real-ESRGAN for image upscaling on an A100 |
3715 |
tencentarc/photomaker |
Create photos, paintings and avatars for anyone in any style within seconds. |
3588 |
omniedgeio/face-swap |
Face Swap |
3467 |
tomasmcm/llamaguard-7b |
Source: llamas-community/LlamaGuard-7b ✦ Quant: TheBloke/LlamaGuard-7B-AWQ ✦ Llama-Guard is a 7B parameter Llama 2-based input-output safeguard model |
2985 |
sczhou/codeformer |
Robust face restoration algorithm for old photos / AI-generated faces |
2918 |
meta/meta-llama-3-8b-instruct |
An 8 billion parameter language model from Meta, fine tuned for chat completions |
2914 |
cjwbw/clip-vit-large-patch14 |
openai/clip-vit-large-patch14 with Transformers |
2795 |
ai-forever/kandinsky-2.2 |
multilingual text2image latent diffusion model |
2665 |
fofr/sdxl-emoji |
An SDXL fine-tune based on Apple Emojis |
2480 |
stability-ai/stable-diffusion-inpainting |
Fill in masked parts of images with Stable Diffusion |
2465 |
playgroundai/playground-v2.5-1024px-aesthetic |
Playground v2.5 is the state-of-the-art open-source model in aesthetic quality |
2030 |
snowflake/snowflake-arctic-instruct |
An efficient, intelligent, and truly open-source language model |
1972 |
fofr/face-to-many |
Turn a face into 3D, emoji, pixel art, video game, claymation or toy |
1971 |
lucataco/juggernaut-xl-v9 |
Juggernaut XL v9 |
1853 |
pharmapsychotic/clip-interrogator |
The CLIP Interrogator is a prompt engineering tool that combines OpenAI's CLIP and Salesforce's BLIP to optimize text prompts to match a given image. Use the resulting prompts with text-to-image models like Stable Diffusion to create cool art! |
1801 |
cjwbw/rembg |
Remove image backgrounds |
1773 |
tencentarc/photomaker-style |
Create photos, paintings and avatars for anyone in any style within seconds. (Stylization version) |
1623 |
allenhooo/lama |
🦙 LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions |
1462 |
yorickvp/llava-v1.6-34b |
LLaVA v1.6: Large Language and Vision Assistant (Nous-Hermes-2-34B) |
1457 |
lucataco/remove-bg |
Remove background from an image |
1439 |
lucataco/sdxl-controlnet |
SDXL ControlNet - Canny |
1391 |
replicate/all-mpnet-base-v2 |
This is a language model that can be used to obtain document embeddings suitable for downstream tasks like semantic search and clustering. |
1308 |
batouresearch/magic-image-refiner |
A better alternative to SDXL refiners, providing a lot of quality and detail. Can also be used for inpainting or upscaling. |
1281 |
xinntao/realesrgan |
Practical Image Restoration Algorithms for General/Anime Images |
1228 |
stability-ai/stable-diffusion |
A latent text-to-image diffusion model capable of generating photo-realistic images given any text input |
1114 |
usamaehsan/controlnet-1.1-x-realistic-vision-v2.0 |
controlnet 1.1 lineart x realistic-vision-v2.0 (updated to v5) |
1096 |
fofr/face-to-sticker |
Turn a face into a sticker |
1005 |
mejiabrayan/logoai |
null |
989 |
konieshadow/fooocus-api |
Third party Fooocus replicate model |
956 |
shefa/turbo-enigma |
SDXL based text-to-image model applying Distribution Matching Distillation, supporting zero-shot identity generation in 2-5s. https://ai-visionboard.com |
938 |
tomasmcm/llama2-13b-tiefighter |
Source: KoboldAI/LLaMA2-13B-Tiefighter ✦ Quant: TheBloke/LLaMA2-13B-Tiefighter-AWQ ✦ A merged model achieved through merging two different LoRAs on top of a well-established existing merge |
868 |
meta/musicgen |
Generate music from a prompt or melody |
827 |
vaibhavs10/incredibly-fast-whisper |
whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗 |
766 |
cjwbw/anything-v3-better-vae |
high-quality, highly detailed anime style stable-diffusion with better VAE |
693 |
fofr/audio-to-waveform |
Create a waveform video from audio |
685 |
fofr/any-comfyui-workflow |
Run any ComfyUI workflow. Guide: https://github.com/fofr/cog-comfyui |
681 |
hnesk/whisper-wordtimestamps |
openai/whisper with exposed settings for word_timestamps |
669 |
lucataco/nsfw_image_detection |
Falcons.ai Fine-Tuned Vision Transformer (ViT) for NSFW Image Classification |
665 |
bfirsh/segformer-b0-finetuned-ade-512-512 |
null |
609 |
usamaehsan/controlnet-x-ip-adapter-realistic-vision-v5 |
Inpainting |
|
jingyunliang/swinir |
Image Restoration Using Swin Transformer |
593 |
mistralai/mistral-7b-instruct-v0.2 |
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1. |
587 |
cjwbw/anything-v4.0 |
high-quality, highly detailed anime-style Stable Diffusion models |
581 |
yuval-alaluf/sam |
Only a Matter of Style: Age Transformation Using a Style-Based Regression Model |
560 |
meta/llama-2-70b-chat |
A 70 billion parameter language model from Meta, fine tuned for chat completions |
556 |
spuuntries/flatdolphinmaid-8x7b-gguf |
Undi95's FlatDolphinMaid 8x7B Mixtral Merge, GGUF Q5_K_M quantized by TheBloke. |
534 |
aussielabs/musicgen |
Deployment of Meta's MusicGen |
533 |
lucataco/sdxl-inpainting |
SDXL Inpainting developed by the HF Diffusers team |
523 |
heedster/realistic-vision-v5 |
Deployment of Realistic vision v5.0 with xformers for fast inference |
516 |
meta/llama-2-13b-chat |
A 13 billion parameter language model from Meta, fine tuned for chat completions |
512 |
fofr/realvisxl-v3-multi-controlnet-lora |
RealVisXl V3 with multi-controlnet, lora loading, img2img, inpainting |
510 |
nateraw/goliath-120b |
An auto-regressive causal LM created by combining 2x finetuned Llama-2 70B into one. |
506 |
mcai/babes-v2.0-img2img |
Generate a new image from an input image with Babes 2.0 |
498 |
mark3labs/embeddings-gte-base |
General Text Embeddings (GTE) model. |
485 |
jagilley/controlnet-scribble |
Generate detailed images from scribbled drawings |
478 |
fofr/realvisxl-v3 |
Amazing photorealism with RealVisXL_V3.0, based on SDXL, trainable |
476 |
antoinelyset/openhermes-2-mistral-7b-awq |
null |
416 |
cjwbw/zoedepth |
ZoeDepth: Combining relative and metric depth |
412 |
zsxkib/clip-age-predictor |
Age prediction using CLIP - Patched version of https://replicate.com/andreasjansson/clip-age-predictor that works with the new version of cog! |
405 |
fofr/become-image |
Adapt any picture of a face into another image |
394 |
fofr/sticker-maker |
Make stickers with AI. Generates graphics with transparent backgrounds. |
380 |
135arvin/my_comfyui |
Run ComfyUI with an API |
375 |
lucataco/ms-img2vid |
Turn any image into a video |
351 |
cuuupid/idm-vton |
Best-in-class clothing virtual try on in the wild (non-commercial use only) |
343 |
cjwbw/real-esrgan |
Real-ESRGAN: Real-World Blind Super-Resolution |
342 |
andreasjansson/sheep-duck-llama-2-70b-v1-1-gguf |
null |
339 |
lucataco/codeformer |
Robust face restoration algorithm for old photos/AI-generated faces - (A40 GPU) |
338 |
asiryan/reliberate-v3 |
Reliberate v3 Model (Text2Img, Img2Img and Inpainting) |
337 |
zsxkib/realistic-voice-cloning |
Create song covers with any RVC v2 trained AI voice from audio files. |
337 |
batouresearch/high-resolution-controlnet-tile |
Fermat.app open-source implementation of an efficient ControlNet 1.1 tile for high-quality upscales. Increase the creativity to encourage hallucination. |
334 |
fofr/prompt-classifier |
Determines the toxicity of text-to-image prompts, llama-13b fine-tune. [SAFETY_RANKING] between 0 (safe) and 10 (toxic) |
334 |
mistralai/mistral-7b-instruct-v0.1 |
An instruction-tuned 7 billion parameter language model from Mistral |
327 |
orpatashnik/styleclip |
Text-Driven Manipulation of StyleGAN Imagery |
311 |
pvitoria/chromagan |
An Adversarial Approach for Picture Colorization |
307 |
catacolabs/sdxl-ad-inpaint |
Product advertising image generator using SDXL |
306 |
zsxkib/instant-id |
Make realistic images of real people instantly |
293 |
mv-lab/swin2sr |
3 Million Runs! AI Photorealistic Image Super-Resolution and Restoration |
286 |
lucataco/xtts-v2 |
Coqui XTTS-v2: Multilingual Text To Speech Voice Cloning |
281 |
zsxkib/pulid |
📖 PuLID: Pure and Lightning ID Customization via Contrastive Alignment |
266 |
catacolabs/cartoonify |
Turn your image into a cartoon |
261 |
lambdal/text-to-pokemon |
Generate Pokémon from a text description |
253 |
asiryan/blue-pencil-xl-v2 |
Blue Pencil XL v2 Model (Text2Img, Img2Img and Inpainting) |
247 |
lucataco/realvisxl2-lcm |
RealvisXL-v2.0 with LCM LoRA - requires fewer steps (4 to 8 instead of the original 40 to 50) |
246 |
mercurio005/whisperx-spanish |
WhisperX model for the Spanish language. |
245 |
mcai/absolutebeauty-v1.0 |
Generate a new image given any input text with AbsoluteReality v1.0 |
245 |
jagilley/controlnet-hough |
Modify images using M-LSD line detection |
240 |
lucataco/ssd-1b |
Segmind Stable Diffusion Model (SSD-1B) is a distilled 50% smaller version of SDXL, offering a 60% speedup while maintaining high-quality text-to-image generation capabilities |
234 |
mcai/absolutebeauty-v1.0-img2img |
Generate a new image from an input image with AbsoluteReality v1.0 |
234 |
philz1337x/controlnet-deliberate |
Modify images with canny edge detection and Deliberate model twitter: @philz1337x |
233 |
mcai/realistic-vision-v2.0 |
Generate a new image given any input text with Realistic Vision V2.0 |
230 |
prompthero/openjourney |
Stable Diffusion fine tuned on Midjourney v4 images. |
228 |
01-ai/yi-34b-chat |
The Yi series models are large language models trained from scratch by developers at 01.AI. |
225 |
asiryan/absolutereality-v1.8.1 |
AbsoluteReality V1.8.1 Model (Text2Img, Img2Img and Inpainting) |
224 |
lucataco/sdxl |
SDXL v1.0 - A text-to-image generative AI model that creates beautiful images |
216 |
nateraw/openchat_3.5-awq |
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data |
214 |
zylim0702/qr_code_controlnet |
ControlNet QR Code Generator: Simplify QR code creation for various needs using ControlNet's user-friendly neural interface, making integration a breeze. Just key in the URL! |
213 |
lucataco/dreamshaper-xl-lightning |
dreamshaper-xl-lightning is a Stable Diffusion model that has been fine-tuned on SDXL |
210 |
methexis-inc/img2prompt |
Get an approximate text prompt, with style, matching an image. (Optimized for stable-diffusion (clip ViT-L/14)) |
205 |
megvii-research/nafnet |
Nonlinear Activation Free Network for Image Restoration |
196 |
pengdaqian2020/image-tagger |
image tagger |
189 |
batouresearch/sdxl-controlnet-lora |
Last update: Now supports img2img. SDXL Canny controlnet with LoRA support. |
179 |
lucataco/moondream2 |
moondream2 is a small vision language model designed to run efficiently on edge devices |
172 |
cswry/seesr |
SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution |
171 |
shanginn/supir |
null |
167 |
ai-forever/kandinsky-2 |
text2img model trained on LAION HighRes and fine-tuned on internal datasets |
161 |
piddnad/ddcolor |
Towards Photo-Realistic Image Colorization via Dual Decoders |
160 |
wolverinn/webui-api |
sd-webui API full support with extensions |
145 |
sunfjun/stable-video-diffusion |
null |
144 |
fofr/latent-consistency-model |
Super-fast, 0.6s per image. LCM with img2img, large batching and canny controlnet |
144 |
lucataco/hotshot-xl |
😊 Hotshot-XL is an AI text-to-GIF model trained to work alongside Stable Diffusion XL |
141 |
tgohblio/instant-id-multicontrolnet |
InstantID. ControlNets. More base SDXL models. And ByteDance's latest ⚡️SDXL-Lightning!⚡️ |
140 |
daanelson/imagebind |
A model for text, audio, and image embeddings in one space |
139 |
fofr/style-transfer |
Transfer the style of one image to another |
138 |
meta/meta-llama-3-70b |
Base version of Llama 3, a 70 billion parameter language model from Meta. |
134 |
swartype/sdxl-pixar |
Create Pixar poster easily with SDXL Pixar. |
134 |
cjwbw/dreamshaper |
Dream Shaper stable diffusion |
134 |
cjwbw/animagine-xl-3.1 |
Anime-themed text-to-image stable diffusion model |
130 |
cjwbw/bigcolor |
Colorization using a Generative Color Prior for Natural Images |
130 |
alaradirik/t2i-adapter-sdxl-openpose |
Modify images using human pose |
124 |
lucataco/animate-diff |
Animate Your Personalized Text-to-Image Diffusion Models |
124 |
mixinmax1990/realisitic-vision-v3-inpainting |
Realistic Vision V3.0 Inpainting |
124 |
rossjillian/controlnet |
Control diffusion models |
124 |
usamaehsan/instant-id-x-juggernaut |
null |
120 |
playgroundai/playground-v2-1024px-aesthetic |
Playground v2 is a diffusion-based text-to-image generative model trained from scratch by the research team at Playground |
120 |
lucataco/realvisxl-v2.0 |
Implementation of SDXL RealVisXL_V2.0 |
120 |
tstramer/midjourney-diffusion |
null |
120 |
simbrams/segformer-b5-finetuned-ade-640-640 |
Semantic Segmentation |
119 |
jagilley/controlnet-depth2img |
Modify images using depth maps |
118 |
logerzhu/ad-inpaint |
Product advertising image generator |
115 |
jyoung105/playground-v2.5 |
State-of-the-art text to image "with turbo speed" |
113 |
lucataco/proteus-v0.4 |
ProteusV0.4: The Style Update |
111 |
alaradirik/t2i-adapter-sdxl-depth-midas |
Modify images using depth maps |
109 |
konieshadow/fooocus-api-anime |
Third party Fooocus replicate model with preset 'anime' |
104 |
nandycc/sdxl-app-icons |
Fine tuned to generate awesome app icons, by aistartupkit.com |
104 |
hvision-nku/storydiffusion |
Consistent Self-Attention for Long-Range Image and Video Generation |
101 |
google-research/maxim |
Multi-Axis MLP for Image Processing |
101 |