Add SkyReels V2: Infinite-Length Film Generative Model #11518
Draft: tolgacangoz wants to merge 187 commits into huggingface:main from tolgacangoz:skyreels-v2
+6,404 −0
…usion forcing - Introduced the drafts of `SkyReelsV2TextToVideoPipeline`, `SkyReelsV2ImageToVideoPipeline`, `SkyReelsV2DiffusionForcingPipeline`, and `FlowUniPCMultistepScheduler`.
It's about time. Thanks.
Replaces custom attention implementations with `SkyReelsV2AttnProcessor2_0` and the standard `Attention` module. Updates `WanAttentionBlock` to use `FP32LayerNorm` and `FeedForward`. Removes the `model_type` parameter, simplifying model architecture and attention block initialization.
Introduces new classes `SkyReelsV2ImageEmbedding` and `SkyReelsV2TimeTextImageEmbedding` for enhanced image and time-text processing. Refactors the `SkyReelsV2Transformer3DModel` to integrate these embeddings, updating the constructor parameters for better clarity and functionality. Removes unused classes and methods to streamline the codebase.
…ds and begin reorganizing the forward pass.
…hod, integrating rotary embeddings and improving attention handling. Removes the deprecated `rope_apply` function and streamlines the attention mechanism for better integration and clarity.
…ethod by updating parameter names for clarity, integrating attention masks, and improving the handling of encoder hidden states.
…ethod by enhancing the handling of time embeddings and encoder hidden states. Updates parameter names for clarity and integrates rotary embeddings, ensuring better compatibility with the model's architecture.
…ngImageToVideoPipeline` documentation.
…ForcingVideoToVideoPipeline`, enhancing support for Video-to-Video (v2v) generation. Introduce video input handling, update latent preparation logic, and improve error handling for input parameters.
… the `image_encoder` and `image_processor` dependencies. Update the CPU offload sequence accordingly.
…latent preparation logic and condition handling. Update image input type to `Optional`, streamline video condition processing, and improve handling of `last_image` during latent generation.
…ration for long video generation. Introduce new parameters for video handling, overlap history, and causal block size. Update logic to accommodate both short and long video scenarios, ensuring compatibility and improved processing.
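To make the short-versus-long distinction in the commit above concrete, here is a minimal sketch of splitting a long video into overlapping windows. It is only an illustration under stated assumptions: `plan_windows` and its parameters `window` and `overlap` are hypothetical names, and the pipeline's actual handling of overlap history and causal block size is not shown on this page.

```python
def plan_windows(total_frames: int, window: int, overlap: int) -> list[tuple[int, int]]:
    """Return (start, end) frame ranges covering total_frames (end is exclusive).

    Hypothetical helper: every window after the first starts `overlap` frames
    before the previous window ended, so those frames can serve as history.
    """
    if total_frames <= 0 or window <= 0:
        raise ValueError("total_frames and window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    # Short video: everything fits in a single pass.
    if total_frames <= window:
        return [(0, total_frames)]

    # Long video: slide the window forward by (window - overlap) frames.
    step = window - overlap
    windows = []
    start = 0
    while True:
        end = min(start + window, total_frames)
        windows.append((start, end))
        if end == total_frames:
            break
        start += step
    return windows


print(plan_windows(10, 4, 1))  # [(0, 4), (3, 7), (6, 10)]
```

The last window may be shorter than `window` when the frame count does not divide evenly; a real pipeline might instead pad or adjust the frame count.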
…ideoToVideoPipeline` to ensure proper noise scaling during latent generation.
…pport for `last_image` parameter and refining latent frame calculations. Update preprocessing logic.
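As a rough illustration of what a latent frame calculation can look like, the sketch below maps a pixel-space frame count to a latent frame count for a video VAE that keeps the first frame and compresses the rest in time. The temporal factor of 4 and both function names are assumptions for illustration; the values this PR actually uses are not shown here.

```python
def num_latent_frames(num_frames: int, temporal_factor: int = 4) -> int:
    """Latent frame count when the first frame is kept and the remaining
    frames are compressed by `temporal_factor` (assumed value, not from the PR)."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return (num_frames - 1) // temporal_factor + 1


def round_to_valid_frame_count(num_frames: int, temporal_factor: int = 4) -> int:
    """Largest frame count <= num_frames of the form k * temporal_factor + 1."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return (num_frames - 1) // temporal_factor * temporal_factor + 1


print(num_latent_frames(97))            # 25
print(round_to_valid_frame_count(100))  # 97
```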
…eoPipeline` by correcting variable names and reintroducing latent mean and standard deviation calculations. Update logic for frame preparation and sampling to ensure accurate video generation.
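For readers unfamiliar with the latent mean and standard deviation step mentioned above, here is a minimal pure-Python sketch of per-channel standardization and its inverse. The function names and the nested-list layout (one list of values per channel) are hypothetical; the pipeline itself operates on tensors and its exact convention is not shown on this page.

```python
def normalize_latents(latents, mean, std):
    """Standardize each channel: (value - mean) / std.

    `latents` is a list of channels, each a list of floats (illustrative layout).
    """
    return [
        [(value - m) / s for value in channel]
        for channel, m, s in zip(latents, mean, std)
    ]


def denormalize_latents(latents, mean, std):
    """Inverse of normalize_latents: value * std + mean."""
    return [
        [value * s + m for value in channel]
        for channel, m, s in zip(latents, mean, std)
    ]


raw = [[1.0, 3.0], [10.0, 20.0]]
mean = [2.0, 10.0]
std = [0.5, 5.0]

scaled = normalize_latents(raw, mean, std)
print(scaled)                                  # [[-2.0, 2.0], [0.0, 2.0]]
print(denormalize_latents(scaled, mean, std))  # [[1.0, 3.0], [10.0, 20.0]]
```

Keeping the two directions as a matched pair makes it easier to check that encoding conditions and decoding outputs use the same statistics.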
…latent handling by enforcing tensor input for video, updating frame preparation logic, and adjusting default frame count. Enhance preprocessing and postprocessing steps for better integration.
…ForcingImageToVideoPipeline` to ensure correct dimensionality for video conditions and latent conditions.
…VideoPipeline` to handle tensor dimensions more robustly, ensuring compatibility with both 3D and 4D video inputs.
…teration print statements for better debugging. Clean up unused code related to prefix video latents length calculation in `SkyReelsV2DiffusionForcingImageToVideoPipeline`.
Thanks for the opportunity to fix #11374!
Original Work
Original repo: https://github.com/SkyworkAI/SkyReels-V2
Paper: https://huggingface.co/papers/2504.13074
TODOs:
⏳ FlowMatchUniPCMultistepScheduler: just copy-pasted from the original repo
✅ SkyReelsV2Transformer3DModel: 90% WanTransformer3DModel
✅ SkyReelsV2DiffusionForcingPipeline
    tolgacangoz/SkyReels-V2-DF-1.3B-540P-Diffusers is ready to be forked.
    tolgacangoz/SkyReels-V2-DF-14B-720P-Diffusers is ready to be forked.
    tolgacangoz/SkyReels-V2-DF-14B-540P-Diffusers is ready to be forked.
✅ SkyReelsV2DiffusionForcingImageToVideoPipeline: Includes FLF2V.
✅ SkyReelsV2DiffusionForcingVideoToVideoPipeline: Extends a given video.
⬜ SkyReelsV2Pipeline
⬜ SkyReelsV2ImageToVideoPipeline
⏳ scripts/convert_skyreelsv2_to_diffusers.py
⬜ Did you make sure to update the documentation with your changes?
⬜ Did you write any new necessary tests?
T2V with Diffusion Forcing
original | diffusers integration
original_0_short.mp4 | diffusers_0_short.mp4
original_37_short.mp4 | diffusers_37_short.mp4
original_0_long.mp4 | diffusers_0_long.mp4
original_37_long.mp4 | diffusers_37_long.mp4