`examples/community/README.md`

| Example | Description | Code Example | Colab | Author |
|:--------|:------------|:-------------|:------|:-------|
| Stable Diffusion BoxDiff Pipeline | Training-free controlled generation with bounding boxes using [BoxDiff](https://github.com/showlab/BoxDiff)|[Stable Diffusion BoxDiff Pipeline](#stable-diffusion-boxdiff)| - |[Jingyang Zhang](https://github.com/zjysteven/)|
| FRESCO V2V Pipeline | Implementation of [[CVPR 2024] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation](https://arxiv.org/abs/2403.12962)|[FRESCO V2V Pipeline](#fresco)| - |[Yifan Zhou](https://github.com/SingleZombie)|
| AnimateDiff IPEX Pipeline | Accelerate AnimateDiff inference pipeline with BF16/FP32 precision on Intel Xeon CPUs with [IPEX](https://github.com/intel/intel-extension-for-pytorch)|[AnimateDiff on IPEX](#animatediff-on-ipex)| - |[Dan Li](https://github.com/ustcuna/)|
| HunyuanDiT Differential Diffusion Pipeline | Applies [Differential Diffusion](https://github.com/exx8/differential-diffusion) to [HunyuanDiT](https://github.com/huggingface/diffusers/pull/8240). |[HunyuanDiT with Differential Diffusion](#hunyuandit-with-differential-diffusion)|[Colab](https://colab.research.google.com/drive/1v44a5fpzyr4Ffr4v2XBQ7BajzG874N4P?usp=sharing)|[Monjoy Choudhury](https://github.com/MnCSSJ4x)|
|[🪆Matryoshka Diffusion Models](https://huggingface.co/papers/2310.15111)| A diffusion process that denoises inputs at multiple resolutions jointly and uses a NestedUNet architecture where features and parameters for small-scale inputs are nested within those of the large scales. See [original codebase](https://github.com/apple/ml-mdm). |[🪆Matryoshka Diffusion Models](#matryoshka-diffusion-models)|[Space](https://huggingface.co/spaces/pcuenq/mdm) [Colab](https://colab.research.google.com/gist/tolgacangoz/1f54875fc7aeaabcf284ebde64820966/matryoshka_hf.ipynb)|[M. Tolga Cangöz](https://github.com/tolgacangoz)|
To load a custom pipeline, just pass the `custom_pipeline` argument to `DiffusionPipeline`, set to the name of one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines; we will merge them quickly.
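For example, here is a minimal sketch that loads the long-prompt-weighting community file against a Stable Diffusion v1.5 checkpoint (both names are illustrative; any compatible checkpoint and community file pair works the same way):

```py
import torch
from diffusers import DiffusionPipeline

# Swap in a community pipeline implementation by passing its file name
# (without the .py extension) from diffusers/examples/community.
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    custom_pipeline="lpw_stable_diffusion",
    torch_dtype=torch.float16,
).to("cuda")
```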
### Flux with CFG

Learn more about Flux [here](https://blackforestlabs.ai/announcing-black-forest-labs/). Since Flux doesn't use CFG, this implementation provides one, inspired by the [PuLID Flux adaptation](https://github.com/ToTheBeginning/PuLID/blob/main/docs/pulid_for_flux.md).
Example usage:
```py
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    custom_pipeline="pipeline_flux_with_cfg",
)
pipeline.enable_model_cpu_offload()  # offload idle submodules to CPU to reduce VRAM usage

prompt = "a watercolor painting of a unicorn"
negative_prompt = "pink"

img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    true_cfg=1.5,                    # scale for the added classifier-free guidance pass
    guidance_scale=3.5,              # Flux's built-in distilled guidance scale
    num_images_per_prompt=1,
    generator=torch.manual_seed(0),  # fixed seed for reproducibility
).images[0]
```
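As in the PuLID Flux adaptation it draws on, `guidance_scale` here drives Flux's built-in distilled guidance, while `true_cfg` sets the strength of the extra classifier-free guidance pass this community pipeline adds; the negative prompt only takes effect through that second pass.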
### HunyuanDiT with Differential Diffusion

A Colab notebook demonstrating all results can be found [here](https://colab.research.google.com/drive/1v44a5fpzyr4Ffr4v2XBQ7BajzG874N4P?usp=sharing). Depth maps have also been added in the same notebook.
### 🪆Matryoshka Diffusion Models

> Diffusion models are the _de-facto_ approach for generating high-quality images and videos, but learning high-dimensional models remains a formidable task due to computational and optimization challenges. Existing methods often resort to training cascaded models in pixel space, or using a downsampled latent space of a separately trained auto-encoder. In this paper, we introduce Matryoshka Diffusion (MDM), **a novel framework for high-resolution image and video synthesis**. We propose a diffusion process that denoises inputs at multiple resolutions jointly and uses a **NestedUNet** architecture where features and parameters for small-scale inputs are nested within those of the large scales. In addition, MDM enables a progressive training schedule from lower to higher resolutions, which leads to significant improvements in optimization for high-resolution generation. We demonstrate the effectiveness of our approach on various benchmarks, including class-conditioned image generation, high-resolution text-to-image, and text-to-video applications. Remarkably, we can train a **_single pixel-space model_ at resolutions of up to 1024 × 1024 pixels**, demonstrating strong zero-shot generalization using the **CC12M dataset, which contains only 12 million images**. Code and pre-trained checkpoints are released at https://github.com/apple/ml-mdm.
- `64×64, nesting_level=0`: 1.719 GiB. With `50` DDIM inference steps:
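Below is a minimal hypothetical sketch of such a run; the checkpoint id, the `matryoshka` custom-pipeline file name, and the `nesting_level` load-time argument are assumptions inferred from the table entry and the memory figure above, not verified API:

```py
import torch
from diffusers import DiffusionPipeline

# Assumed checkpoint id and custom-pipeline file name; nesting_level=0 is
# assumed to select the innermost 64x64 resolution, matching the figure above.
pipe = DiffusionPipeline.from_pretrained(
    "tolgacangoz/matryoshka-diffusion-models",
    custom_pipeline="matryoshka",
    nesting_level=0,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a photo of the dolomites",
    num_inference_steps=50,  # 50 DDIM steps, as in the figure above
).images[0]
```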