[ICCV2025] PyTorch implementation of "Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models"

Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models (ICCV 2025)

Hongyang Wei (1,3,*), Shuaizheng Liu (2,3,*), Chun Yuan (1,†), Lei Zhang (2,3,†)
(1) Tsinghua Shenzhen International Graduate School, Tsinghua University
(2) The Hong Kong Polytechnic University, (3) OPPO Research Institute

PURE HuggingFace Model


⭐ If PURE is helpful to your images or projects, please help star this repo. Thanks! 🤗

🚩 Accepted by ICCV 2025

📢 News

  • 2025.07.07 🎉🎉🎉 Training code is released! 🎉🎉🎉
  • 2025.04.11 🎉🎉🎉 Inference code and checkpoints are released! 🎉🎉🎉
  • 2025.03.13 🎉🎉🎉 PURE is released! 🎉🎉🎉

🎬 Overview

overview

📷 Results

Quantitative Comparisons

Visual Comparisons

⚙️ Installation

git clone https://github.com/nonwhy/PURE.git && cd PURE
conda create -n pure python=3.10 -y
conda activate pure
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
pip install -e .

Download the VQ-VAE model from LlamaGen and save it as pure/tokenizer/vq_ds8_c2i.pt.
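Since a missing tokenizer checkpoint is an easy mistake to make, a small pre-flight check can help. The sketch below is not part of the repository: the helper name `find_tokenizer_checkpoint` is hypothetical, and it assumes the script is run with the repository root as `repo_root`. Only the relative path comes from the instruction above.

```python
from pathlib import Path

# Relative path taken from the installation note above.
TOKENIZER_RELATIVE_PATH = Path("pure/tokenizer/vq_ds8_c2i.pt")


def find_tokenizer_checkpoint(repo_root="."):
    """Return the checkpoint path if the file exists, else raise FileNotFoundError.

    Hypothetical helper for illustration; `repo_root` is assumed to be the
    directory created by `git clone`.
    """
    path = Path(repo_root) / TOKENIZER_RELATIVE_PATH
    if not path.is_file():
        raise FileNotFoundError(
            f"VQ-VAE checkpoint not found at {path}; download it from LlamaGen first."
        )
    return path


if __name__ == "__main__":
    try:
        print(f"Found tokenizer checkpoint: {find_tokenizer_checkpoint()}")
    except FileNotFoundError as err:
        print(err)
```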

💡 Inference

🔑 Simple Inference

A minimal example of PURE inference:

from inference_solver import FlexARInferenceSolver
from PIL import Image
from utils.wavelet_color_fix import wavelet_color_fix

inference_solver = FlexARInferenceSolver(
    model_path="nonwhy/PURE",
    precision="bf16",
    target_size=512,
)

# Load the low-quality input image.
image_path = "/path/to/example_input.png"
image = Image.open(image_path)

# The solver is configured for 512x512, so resize other inputs with bicubic interpolation.
if image.size != (512, 512):
    image = image.resize((512, 512), Image.BICUBIC)

q1 = "Perceive the degradation level, understand the image content, and restore the high-quality image. <|image|>"
images = [image]
qas = [[q1, None]]

generated = inference_solver.generate(
    images=images,
    qas=qas,
    max_gen_len=8192,
    temperature=0.9,
    logits_processor=inference_solver.create_logits_processor(
        cfg=0.8,
        text_top_k=1,
    ),
)

# generated[0] holds the generated text; generated[1] holds the generated images.
text = generated[0]
new_image = generated[1][0]

# Apply the wavelet color fix, using the input image as the color reference.
new_image = wavelet_color_fix(new_image, image)
new_image.save("./example_output.png", "PNG")
print(text)

The output resolution is fixed at 512x512, so the input image should be 128x128 (4x), 256x256 (2x), or 512x512 (1x). Adjust the temperature and CFG (classifier-free guidance) parameters to obtain different restoration results.
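The relationship between input size and upscaling factor can be checked before running the model. The following is a minimal sketch, not code from the repository: the helper `upscale_factor` and the constants are hypothetical names, and the supported sizes are taken only from the sentence above.

```python
OUTPUT_SIZE = 512            # fixed output resolution stated above
SUPPORTED_SCALES = (1, 2, 4)  # 512x512, 256x256 and 128x128 inputs


def upscale_factor(width, height, output_size=OUTPUT_SIZE):
    """Return the upscaling factor for a square input, or raise ValueError.

    Hypothetical helper: it only encodes the size rule from the README and
    does not touch the model.
    """
    if width != height:
        raise ValueError(f"expected a square input, got {width}x{height}")
    for scale in SUPPORTED_SCALES:
        if width * scale == output_size:
            return scale
    raise ValueError(
        f"unsupported input size {width}x{height}; "
        f"expected one of 128x128, 256x256, 512x512"
    )


if __name__ == "__main__":
    for size in (128, 256, 512):
        print(f"{size}x{size} input: {upscale_factor(size, size)}x upscaling")
```

With Pillow, `upscale_factor(*image.size)` could be called right after `Image.open` to reject unsupported inputs early.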

🚀 Accelerate Inference

We can seamlessly accelerate inference through Speculative Jacobi Decoding:

python test_pure_jacobi.py
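For intuition about why Jacobi-style decoding can save model calls, here is a toy sketch of plain greedy Jacobi decoding. It is a simplification and not the algorithm in test_pure_jacobi.py: Speculative Jacobi Decoding extends this idea to sampling-based decoding with a probabilistic acceptance step, which is omitted here. `toy_next_token` is a hypothetical deterministic stand-in for a model. A real model would compute all positions of one Jacobi iteration in a single parallel forward pass.

```python
def toy_next_token(prefix):
    """Hypothetical deterministic 'model': maps a token prefix to the next token."""
    return (31 * sum(prefix) + 7 * len(prefix)) % 97


def greedy_decode(prompt, n, next_token):
    """Ordinary autoregressive decoding: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(prompt):]


def jacobi_decode(prompt, n, next_token):
    """Greedy Jacobi decoding: refine a guess for all n tokens until it stops changing.

    After k iterations the first k tokens are guaranteed correct, so the loop
    needs at most n + 1 iterations (the last one only confirms the fixed point).
    """
    guess = [0] * n  # arbitrary initial guess
    for iterations in range(1, n + 2):
        # Each position is predicted from the prompt plus the *previous* guess,
        # so all n predictions are independent and could run in parallel.
        new = [next_token(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:
            return guess, iterations
        guess = new
    raise RuntimeError("Jacobi iteration did not converge")


if __name__ == "__main__":
    prompt = [3, 1, 4]
    tokens, iterations = jacobi_decode(prompt, 8, toy_next_token)
    print("tokens:", tokens, "iterations:", iterations)
    print("matches greedy:", tokens == greedy_decode(prompt, 8, toy_next_token))
```

The fixed point of the iteration is exactly the sequence that sequential greedy decoding produces. Any speed-up comes from cases where several tokens become correct in the same iteration.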

🌈 Train

Please refer to TRAIN.md.

📚 Training Datasets

Following SeeSR, we train PURE on LSDIR+FFHQ10k. To generate realistic LQ-HQ image pairs for training, we apply the degradation pipeline from Real-ESRGAN.

🤗 Checkpoints

| Model | Size | Resolution | Huggingface |
|-------|------|------------|-------------|
| PURE  | 7B   | 512        | nonwhy/PURE |

💕 Acknowledgements

Thanks to the following excellent open-source projects:

🎫 License

This project is released under the MIT License.

📧 Contact

If you have any questions, please feel free to contact: [email protected]

🎓 Citations

If PURE helps your research or work, please consider citing our paper:

@misc{wei2025perceiveunderstandrestorerealworld,
      title={Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models}, 
      author={Hongyang Wei and Shuaizheng Liu and Chun Yuan and Lei Zhang},
      year={2025},
      eprint={2503.11073},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.11073}, 
}
