Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization

Tao Yang¹, Rongyuan Wu², Peiran Ren³, Xuansong Xie³, Lei Zhang²
¹ByteDance Inc.
²Department of Computing, The Hong Kong Polytechnic University
³DAMO Academy, Alibaba Group

Our model supports a variety of tasks. We hope you enjoy it.

Realistic Image SR

Old photo restoration

Personalized Stylization

Colorization

News

(2024-3-18) Please try our colorization model via python test_pasd.py --pasd_model_path runs/pasd_color/checkpoint-180000 --control_type grayscale --high_level_info caption --use_pasd_light. You should use the noise scheduler provided in runs/pasd_color/scheduler, which has been updated to ensure zero-terminal SNR so that no residual signal from the RGB image leaks through during training. Please read the updated paper for more details.

(2024-3-18) We have updated the paper. The weights and datasets are now available on Huggingface.

(2024-1-16) You may also want to check out our newer projects, SeeSR and Phantom.

(2023-10-20) Add an additional noise level via --added_noise_level, and the SR result achieves a great balance between "extremely detailed" and "over-smoothed". Very interesting! You can control the detail level of the SR result freely.

(2023-10-18) Completely solved the issues by initializing latents with input LR images. Interestingly, the SR results also become much more stable.

(2023-10-11) Colab demo is now available. Credits to Masahide Okada.

(2023-10-09) Add training dataset.

(2023-09-28) Add tiled latent to allow upscaling ultra high-resolution images. Please carefully set --latent_tiled_size as well as --decoder_tiled_size when upscaling large images.

(2023-09-12) Add Gradio demo.

(2023-09-11) Upload pre-trained models.

(2023-09-07) Upload source codes.
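The 2024-3-18 entry above mentions a noise scheduler updated to ensure zero-terminal SNR. As a rough illustration of what that property means, the sketch below applies a commonly used rescaling to a beta schedule so that the final cumulative signal level becomes exactly zero. The function names and the SD1.5-style schedule values (scaled-linear, 0.00085 to 0.012 over 1000 steps) are assumptions made for illustration; this is not necessarily how the scheduler shipped in runs/pasd_color/scheduler was produced.

```python
import math


def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Illustrative schedule that is linear in sqrt(beta) (assumed values)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def rescale_zero_terminal_snr(betas):
    """Rescale betas so the last timestep carries no signal (SNR = 0)."""
    # sqrt of the cumulative product of alphas = signal scale at each step
    sqrt_alphas_bar = []
    prod = 1.0
    for beta in betas:
        prod *= 1.0 - beta
        sqrt_alphas_bar.append(math.sqrt(prod))

    first, last = sqrt_alphas_bar[0], sqrt_alphas_bar[-1]
    # Shift so the last value is zero, then scale so the first is unchanged.
    shifted = [(a - last) * first / (first - last) for a in sqrt_alphas_bar]

    # Convert the cumulative values back to per-step betas.
    alphas_bar = [a * a for a in shifted]
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - a for a in alphas]


if __name__ == "__main__":
    new_betas = rescale_zero_terminal_snr(scaled_linear_betas())
    print("last beta after rescaling:", new_betas[-1])
```

With a terminal beta of 1, the last training timestep is pure noise, so the model cannot pick up leftover signal from the clean image at that step.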

Usage

Clone this repository:

git clone https://github.com/yangxy/PASD.git
cd PASD

Download SD1.5 models from huggingface and put them into checkpoints/stable-diffusion-v1-5.
Prepare training datasets. Please check dataloader/localdataset.py and dataloader/webdataset.py carefully and set the paths correctly. We highly recommend using dataloader/webdataset.py.
Download our training dataset. DIV2K_train_HR | DIV8K-0 | DIV8K-1 | DIV8K-2 | DIV8K-3 | DIV8K-4 | DIV8K-5 | FFHQ_5K | Flickr2K_HR-0 | Flickr2K_HR-1 | Flickr2K_HR-2 | OST_animal | OST_building | OST_grass | OST_mountain | OST_plant | OST_sky | OST_water | Unsplash2K
Train a PASD.

bash ./train_pasd.sh

If you want to train pasd_light, use --use_pasd_light.

Test PASD.

Download our pre-trained models pasd | pasd_rrdb | pasd_light | pasd_light_rrdb, and put them into runs/.

python test_pasd.py # --use_pasd_light --use_personalized_model

Please read the arguments in test_pasd.py carefully. We adopt the tiled vae method proposed by multidiffusion-upscaler-for-automatic1111 to save GPU memory.
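To give a feel for how tiling bounds memory use, here is a minimal sketch that splits a large latent or image into overlapping fixed-size tiles. The helper names (tile_starts, tile_boxes) and the overlap parameter are hypothetical; the actual tiled VAE code adopted in this repository may choose tile positions and blend overlapping regions differently.

```python
def tile_starts(size, tile, overlap):
    """Start offsets of tiles of length `tile` covering the range [0, size)."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if size <= tile:
        return [0]  # a single (possibly smaller) tile covers everything
    stride = tile - overlap
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # final tile is aligned to the far edge
    return starts


def tile_boxes(height, width, tile, overlap):
    """Return (top, left, bottom, right) boxes that cover the whole grid."""
    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in tile_starts(height, tile, overlap)
        for left in tile_starts(width, tile, overlap)
    ]


if __name__ == "__main__":
    for box in tile_boxes(height=100, width=256, tile=128, overlap=32):
        print(box)
```

Each tile can then be processed on its own, so peak memory depends on the tile size rather than on the full image size. Smaller tiles save more memory at the cost of more passes and more seams to blend.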

Please try --use_personalized_model for personalized stylization, old photo restoration and real-world SR. Set --conditioning_scale to adjust the stylization strength.

We use personalized models including majicMIX realistic (for SR and restoration), ToonYou (for stylization) and modern disney style (unet only, for stylization). You can download more from the community and put them into checkpoints/personalized_models.
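Since testing depends on several directories being populated by hand, a small preflight check can catch a missing folder before a long run starts. The sketch below only checks the directories named in this README (checkpoints/stable-diffusion-v1-5, runs and checkpoints/personalized_models); the helper name missing_dirs is hypothetical, and the script does not verify the files inside those folders.

```python
from pathlib import Path

# Directories this README asks you to populate before testing.
REQUIRED_DIRS = ("checkpoints/stable-diffusion-v1-5", "runs")
# Only needed when running with --use_personalized_model.
PERSONALIZED_DIR = "checkpoints/personalized_models"


def missing_dirs(repo_root=".", use_personalized_model=False):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    wanted = list(REQUIRED_DIRS)
    if use_personalized_model:
        wanted.append(PERSONALIZED_DIR)
    return [d for d in wanted if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dirs(use_personalized_model=True):
        print(f"missing directory: {d}")
```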

If the default setting does not yield good results, try different --pasd_model_path, --seed, --prompt, --upscale, or --high_level_info to get better performance.
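When trying several settings, it can be convenient to generate the command lines programmatically. The sketch below builds one test_pasd.py invocation per seed using only flags named in this README; the helper name build_commands is hypothetical, and the assumption that --seed and --upscale take integer values should be checked against the argument parser in test_pasd.py.

```python
import shlex


def build_commands(seeds, upscale=None, extra_flags=()):
    """Build one argv list per seed for test_pasd.py (nothing is executed)."""
    commands = []
    for seed in seeds:
        argv = ["python", "test_pasd.py", "--seed", str(seed)]
        if upscale is not None:
            argv += ["--upscale", str(upscale)]
        argv += list(extra_flags)  # e.g. ["--use_pasd_light"]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    for argv in build_commands([0, 1, 2], extra_flags=["--use_pasd_light"]):
        # Print shell-quoted commands; run them manually or via subprocess.run(argv).
        print(shlex.join(argv))
```

Keeping the commands as argv lists avoids shell-quoting problems if a --prompt value containing spaces is added to extra_flags.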

Gradio Demo

python gradio_pasd.py

Citation

If our work is useful for your research, please consider citing:

@inproceedings{yang2023pasd,
    title={Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization},
    author={Tao Yang and Rongyuan Wu and Peiran Ren and Xuansong Xie and Lei Zhang},
    booktitle={arXiv:2308.14469v3},
    year={2023}
}

Acknowledgments

Our project is based on diffusers.

Contact

If you have any questions or suggestions about this paper, feel free to reach me at yangtao9009@gmail.com.
