Add SkyReels V2: Infinite-Length Film Generative Model #11518
…usion forcing - Introduced the drafts of `SkyReelsV2TextToVideoPipeline`, `SkyReelsV2ImageToVideoPipeline`, `SkyReelsV2DiffusionForcingPipeline`, and `FlowUniPCMultistepScheduler`.
It's about time. Thanks.
Replaces custom attention implementations with `SkyReelsV2AttnProcessor2_0` and the standard `Attention` module. Updates `WanAttentionBlock` to use `FP32LayerNorm` and `FeedForward`. Removes the `model_type` parameter, simplifying model architecture and attention block initialization.
Introduces new classes `SkyReelsV2ImageEmbedding` and `SkyReelsV2TimeTextImageEmbedding` for enhanced image and time-text processing. Refactors the `SkyReelsV2Transformer3DModel` to integrate these embeddings, updating the constructor parameters for better clarity and functionality. Removes unused classes and methods to streamline the codebase.
…ds and begin reorganizing the forward pass.
…hod, integrating rotary embeddings and improving attention handling. Removes the deprecated `rope_apply` function and streamlines the attention mechanism for better integration and clarity.
…ethod by updating parameter names for clarity, integrating attention masks, and improving the handling of encoder hidden states.
…ethod by enhancing the handling of time embeddings and encoder hidden states. Updates parameter names for clarity and integrates rotary embeddings, ensuring better compatibility with the model's architecture.
…sV2TimeTextImageEmbedding`.
…itialization to directly assign the list of SkyReelsV2 components.
…ys convert query, key, and value to `torch.bfloat16`, simplifying the code and improving clarity.
…by adding VAE initialization and detailed prompt for video generation, improving clarity and usability of the documentation.
…and improve formatting in `pipeline_skyreels_v2_diffusion_forcing.py` to enhance code readability and maintainability.
…ine` from 5.0 to 6.0 to enhance video generation quality.
…definition of `SkyReelsV2DiffusionForcingPipeline` to ensure consistency and improve video generation quality.
…peline` to default to `None`.
…odel` to *ensure* correct tensor operations.
…peat_interleave` for improved efficiency in `SkyReelsV2Transformer3DModel`.
… with guidance scale and shift parameters for T2V and I2V. Remove unused `retrieve_latents` function to streamline the code.
…line` to use `deepcopy` for improved state management during inference steps.
…ngPipeline` for `overlap_history` and `addnoise_condition` parameters to improve long video generation guidance.
…nForcingPipeline` to clarify asynchronous inference settings and improve progress tracking during denoising steps.
# 6. Denoising loop
num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order
self._num_timesteps = len(step_matrix)
progress_bar_step = len(timesteps) / len(step_matrix)

with self.progress_bar(total=num_inference_steps) as progress_bar:
    for i, t in enumerate(step_matrix):
When we set `inference_steps=30` in asynchronous mode (e.g., by setting `ar_step=5`), the original repository displays `50` steps via its `tqdm`, because `step_matrix` includes `50` elements; i.e., it requires more steps. I integrated this with `progress_bar_step = len(timesteps) / len(step_matrix)`. The user sees `30` steps, with the bar advancing by `0.6` at each step, while there are actually `50`. WDYT?
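One way to picture why an asynchronous run can need more loop iterations than `inference_steps` is a staggered schedule in which each block of frames starts denoising `ar_step` iterations after the previous one. The sketch below is a simplified illustration under that assumption, not the original repository's schedule code; `staggered_step_matrix`, `num_blocks`, and the choice of 5 blocks are hypothetical and chosen only so that the numbers match the 30 and 50 mentioned above.

```python
from typing import List


def staggered_step_matrix(num_steps: int, num_blocks: int, ar_step: int) -> List[List[int]]:
    """Build a toy step matrix (hypothetical helper, for illustration only).

    Row i holds, for every block, how many denoising steps that block has
    completed after loop iteration i. Block b starts b * ar_step iterations
    after block 0, so the loop runs until the last block has finished.
    """
    total_rows = num_steps + (num_blocks - 1) * ar_step
    matrix = []
    for i in range(1, total_rows + 1):
        # Clamp each block's progress to the range [0, num_steps].
        row = [min(max(i - b * ar_step, 0), num_steps) for b in range(num_blocks)]
        matrix.append(row)
    return matrix


# Assumed example values: 30 inference steps, 5 blocks, ar_step=5.
asynchronous = staggered_step_matrix(num_steps=30, num_blocks=5, ar_step=5)
synchronous = staggered_step_matrix(num_steps=30, num_blocks=5, ar_step=0)

print(len(asynchronous))  # 50
print(len(synchronous))   # 30
```

With `ar_step=0` every block advances together and the loop length equals the number of inference steps; with a positive `ar_step` the stagger adds `(num_blocks - 1) * ar_step` extra iterations in this toy model.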
On second thought, this might be confusing. Should I use `with self.progress_bar(total=len(step_matrix)) as progress_bar:` so that the user sees the real number of steps directly? Decimal stepping in a progress bar probably isn't common anyway. The difference between synchronous and asynchronous inference can be explained in the documentation.
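The two options discussed in this thread can be compared with plain arithmetic. The sketch below does not touch the pipeline or any progress-bar library; `accumulate` is a hypothetical helper that only sums the updates a bar would receive, and the 47-row case is an invented example to show what happens when the ratio is not a clean decimal.

```python
def accumulate(increment: float, iterations: int) -> float:
    """Sum of the updates a progress bar would receive over the whole loop."""
    done = 0.0
    for _ in range(iterations):
        done += increment
    return done


num_inference_steps = 30
num_rows = 50  # len(step_matrix) in the example from the discussion

# Option A: the bar total stays at 30; each iteration advances by a fraction
# rounded to one decimal place.
fractional = round(num_inference_steps / num_rows, 1)  # 0.6
option_a_end = accumulate(fractional, num_rows)        # approximately 30

# Option B: the bar total is the real iteration count; each iteration advances by 1.
option_b_end = accumulate(1, num_rows)                 # 50

# Hypothetical case where the ratio is not a clean decimal: 30 steps over 47 rows.
rounded = round(30 / 47, 1)                            # 0.6 after rounding
short_end = accumulate(rounded, 47)                    # approximately 28.2, short of 30

print(option_a_end, option_b_end, short_end)
```

Under these assumptions, fractional stepping with rounding can leave the bar below its total when the ratio does not round cleanly, whereas stepping by one over `len(step_matrix)` always ends exactly at the total.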
…e` by rounding the step size to one decimal place for improved readability during denoising steps.
…mentation for improved clarity and organization.
Thanks for the opportunity to fix #11374!
Original repo: https://github.com/SkyworkAI/SkyReels-V2
Paper: https://huggingface.co/papers/2504.13074
TODOs:
⏳ `FlowMatchUniPCMultistepScheduler`: just copy-pasted from the original repo
✅ `SkyReelsV2Transformer3DModel`: 90% `WanTransformer3DModel`
✅ `SkyReelsV2DiffusionForcingPipeline`
    - `tolgacangoz/SkyReels-V2-DF-1.3B-540P-Diffusers` is ready to be forked.
    - `tolgacangoz/SkyReels-V2-DF-14B-720P-Diffusers` is ready to be forked.
    - `tolgacangoz/SkyReels-V2-DF-14B-540P-Diffusers` is ready to be forked.
⏳ `SkyReelsV2DiffusionForcingImageToVideoPipeline`: Includes Start/End Frame Control.
⏳ `SkyReelsV2DiffusionForcingVideoToVideoPipeline`: Extends a given video.
⬜ `SkyReelsV2Pipeline`
⬜ `SkyReelsV2ImageToVideoPipeline`
⏳ `scripts/convert_skyreelsv2_to_diffusers.py`
⬜ Did you make sure to update the documentation with your changes?
⬜ Did you write any new necessary tests?
`diffusers` integration, comparison videos:
original_0_short.mp4 | diffusers_0_short.mp4
original_37_short.mp4 | diffusers_37_short.mp4
original_0_long.mp4 | diffusers_0_long.mp4
original_37_long.mp4 | diffusers_37_long.mp4
Firstly, I want to congratulate you on this great work, and thank you for open-sourcing it, SkyReels Team! This PR attempts to integrate your model.
This PR is now ready for review for `SkyReelsV2Transformer3DModel` and `SkyReelsV2DiffusionForcingPipeline`. Other pipelines will follow right after the first feedback...
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@yiyixuxu @a-r-r-o-w @linoytsaban @yjp999 @Howe2018 @RoseRollZhu @pftq @Langdx @guibinchen @qiudi0127 @nitinmukesh @tin2tin @ukaprch @okaris