stable-diffusion: TAESD implementation - faster autoencoder #88
is this added to the gguf? if it's replacing the normal vae, then this should be inferred from the gguf file.
@Green-Sky it's a separate model, I included …
I don't understand …
ok, running without taesd AND with the lcm lora AND with the eulerA sampler (so not the lcm sampler) produces invalid files. running more tests...
works: …
fails: …
@Green-Sky I have conducted some tests, and I am not getting any errors related to the regular VAE or the sampling. Does …
using a different, non-lcm-lora model, it does not produce an invalid file.
yes, it appears this also happens on master.
With the last commit before the latest commit of master …
not sure that can be indicative, since that commit is pre-gguf -> maybe a different model file. on master with seed 1337 (instead of 42) it actually crashes, so i ran it in debug.
I switched back to this branch and now it works in release mode. BUT in debug, it triggers the assert(). (@leejet there is a bug in master) sorry for spamming unrelated comments in your taesd pr 👀 edit: ah yes, now it generates invalid files with the lcm sampler and taesd 😵💫
When I enable …:
`Assertion failed: node->src[j]->backend == GGML_BACKEND_GPU, file C:\proyectos\stable-diffusion.cpp\ggml\src\ggml-cuda.cu, line 8738`
Only when I use …
Wow, very fast. Maybe it would be nice if it were possible to set the …
I also think it would be better to specify the path of the TAE model through a parameter rather than hardcoding it.
Now the TAESD model should be passed like this: `build\bin\Release\sd -m AnythingV5_v5PrtRE-f16.gguf --taesd taesd-model.gguf -p "<lora:Kana_Arima-10:0.9><lora:lcm-lora:1>beautiful anime girl, short hair, red hair, red eyes, realistic, masterpiece, azur lane, 4k, high quality" --sampling-method lcm --cfg-scale 1 --steps 5 -t 1 -s 424354`
@leejet The TinyAutoEncoder's encoder is quite bad; I'll have to run some tests with the original Python implementation. I tried passing the `init_latent` generated by the encoder directly to the decoder, and I get dark results, whereas with AutoencoderKL I get the original image back.
Are you using the original Python implementation?
I suggest that you compare against the original Python version to see whether the problem is with the model itself or with your implementation.
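Such a comparison against the reference weights might look like the following sketch. It assumes the `diffusers` `AutoencoderTiny` wrapper around the same TAESD weights; the input file name and the [0, 1] scaling are illustrative assumptions, not something established in this thread.

```python
# Round-trip an image through the reference TAESD weights to see whether
# the dark results come from the model itself or from the C++ port.
# NOTE: the wrapper's expected input range may be [0, 1] or [-1, 1]
# depending on the diffusers version; adjust the scaling if the round
# trip looks washed out or dark.
import numpy as np
import torch
from PIL import Image
from diffusers import AutoencoderTiny

taesd = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float32)

img = Image.open("input.png").convert("RGB").resize((512, 512))  # hypothetical test image
x = torch.from_numpy(np.array(img)).permute(2, 0, 1).float()[None] / 255.0  # [0, 1]

with torch.no_grad():
    latent = taesd.encode(x).latents     # encoder under test
    recon = taesd.decode(latent).sample  # decoder

out = (recon.clamp(0, 1)[0].permute(1, 2, 0).numpy() * 255).astype(np.uint8)
Image.fromarray(out).save("roundtrip.png")  # should look close to input.png
```

If the reference round trip reproduces the input faithfully, the dark results point at the port rather than at the model.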
having this working with lcm-lora will be a huge speed boost, thanks!
@leejet So should I specify that in README.md and delete the existing gguf file here?
I think it would be cumbersome to support the gguf format and handle the discrepancies between the taesd safetensors files. It's better to maintain compatibility with the safetensors file that already contains all the tensors from the encoder and decoder. This should be specified in the README.md file. @Green-Sky GGUF-formatted LoRAs will no longer be accepted due to naming differences, and correcting that seems cumbersome to me; only SafeTensors and CKPT formats will be accepted from now on. In short, you can delete them, as they are now obsolete.
I'm using this one, https://huggingface.co/madebyollin/taesd/blob/main/diffusion_pytorch_model.safetensors, because this weight file contains both the encoder and decoder, and it doesn't take up much storage space.
@leejet Leaving 10 MB of padding in …
The names of the tensors in https://huggingface.co/madebyollin/taesd/blob/main/diffusion_pytorch_model.safetensors and the names of the tensors in https://huggingface.co/madebyollin/taesd/blob/main/taesd_decoder.safetensors use different indices, so we have to choose one to stay compatible with, as shown below.
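To see the mismatch concretely, one can dump the tensor names of both files; a small sketch using the `safetensors` Python package (file names as in the links above):

```python
# Print the first few tensor names from each checkpoint; the two files
# index the same layers differently, which is the incompatibility in question.
from safetensors import safe_open

for path in ("diffusion_pytorch_model.safetensors", "taesd_decoder.safetensors"):
    with safe_open(path, framework="pt") as f:
        keys = sorted(f.keys())
        print(f"{path}: {len(keys)} tensors")
        for k in keys[:5]:
            print("  ", k)
```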
This is the test code that was left behind initially. I'll remove it.
It's better to keep compatibility only with the safetensors file that comes with everything, the first one. I'm making some changes, wait a little bit please.
OK, I'll submit it when you're done.
@leejet Try creating a function to accurately calculate the memory usage of the LoRA parameters. For TAESD, we will only support the safetensors file that already includes all tensors from both the encoder and decoder, for simplicity.
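As a rough illustration of what such a function computes, here is a Python sketch that sums the raw byte size of every tensor in a LoRA safetensors file; the real function would live in the C++ code of stable-diffusion.cpp and would also have to account for ggml's per-tensor overhead, and the file name here is hypothetical.

```python
# Sum the raw parameter payload of a LoRA checkpoint. A real in-engine
# estimate would add ggml's per-tensor bookkeeping and compute-buffer needs.
from safetensors import safe_open

def lora_params_nbytes(path: str) -> int:
    total = 0
    with safe_open(path, framework="pt") as f:
        for name in f.keys():
            t = f.get_tensor(name)
            total += t.numel() * t.element_size()
    return total

print(f"{lora_params_nbytes('lcm-lora.safetensors') / (1024 * 1024):.1f} MiB")
```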
@FSSRepo It would be great if you could take the time to upgrade ggml, so that I can remove my pile of CMake that supports ROCm and can also test the ROCm support. Out of ignorance I bought a 7900 XTX, so I have since passed the point of no return on supporting ROCm and HIP.
Although ROCm is rubbish on Windows, it is better than nothing.
@Cyberhan123 I'm waiting for my pull request that adds the functions stable diffusion needs to be merged into the original ggml repository.
@leejet Ready, you can add the final changes and test stable diffusion 2.1 with CUDA again.
It's been fixed now.
This PR will be merged soon, after I finish testing the necessary features, probably today or tomorrow.
Sorry, this may be caused by my misunderstanding between Chinese, English, and Google Translate. What I mean is merging the main branch of ggml; I need this change: ggml-org/ggml#626
I tried the newest version and now I have the issue that … Edit: …
Because `ggml_type` is currently exposed in the interface, and I was thinking about how to provide a pure C API that hides the details of ggml.
@leejet Later on, we will need to refactor the code to eliminate a large portion of the duplicated code and make the use of the stable diffusion API more flexible. For this reason, I proposed doing something similar to what we did with llama.cpp and whisper.cpp: natively supporting a C API.
That's what I want to do.
@FSSRepo This PR has been merged, thanks for your contribution! You've done an amazing job.
Also a big thanks from me. An additional note: the TAESD model needs to be in the same folder as the stable diffusion model, which was not the case before.
I didn't run into this limitation; can you post your command parameters?
Sorry, I made a mistake. It works. Maybe it would be nice to still have the possibility to set the log level in …
That function is currently in `util.h`, but a unified header will be provided later to expose the associated API.
Fixes #36
This is a quick implementation, so I may have overlooked something. Initially, I thought about implementing it in a separate header, but there were some issues due to dependency collisions with ggml, and I honestly didn't want to bother solving that. Therefore, I directly added it to the `stable-diffusion.cpp` file.

Results
How to use it:
Just add `--taesd TAESD_MODEL_PATH` to the command line; for now it only works for txt2img.

Tasks:
- Implement `ggml_tanh` in CUDA, for full offloading support.