## Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

> [[Paper](https://arxiv.org/abs/2401.13627)] [[Project Page](http://supir.xpixel.group/)] [Online Demo (Coming soon)] <br>
> Fanghua Yu, [Jinjin Gu](https://www.jasongt.com/), Zheyuan Li, Jinfan Hu, Xiangtao Kong, [Xintao Wang](https://xinntao.github.io/), [Jingwen He](https://scholar.google.com.hk/citations?user=GUxrycUAAAAJ), [Yu Qiao](https://scholar.google.com.hk/citations?user=gFtI-8QAAAAJ), [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ) <br>
> Shenzhen Institute of Advanced Technology; Shanghai AI Laboratory; University of Sydney; The Hong Kong Polytechnic University; ARC Lab, Tencent PCG; The Chinese University of Hong Kong <br>

<p align="center">
  <img src="assets/teaser.png">
</p>

---

## 🔧 Dependencies and Installation

1. Clone repo
```bash
git clone https://github.com/Fanghua-Yu/SUPIR.git
cd SUPIR
```

2. Install dependent packages
```bash
conda create -n SUPIR python=3.8 -y
conda activate SUPIR
pip install --upgrade pip
pip install -r requirements.txt
```


3. Download Checkpoints

#### Dependent Models
* [SDXL CLIP Encoder-1](https://huggingface.co/openai/clip-vit-large-patch14)
* [SDXL CLIP Encoder-2](https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k)
* [SDXL base 1.0_0.9vae](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors)
* [LLaVA CLIP](https://huggingface.co/openai/clip-vit-large-patch14-336)
* [LLaVA v1.5 13B](https://huggingface.co/liuhaotian/llava-v1.5-13b)
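
The dependent models above can be downloaded by hand from the linked pages. As a convenience, here is a minimal sketch that fetches them with the `huggingface_hub` package. This is an assumption and not part of the SUPIR repo: the package must be installed separately (`pip install huggingface_hub`), and the `checkpoints` folder layout and the helper names `target_dirs` and `download_all` are hypothetical. The repo IDs and the SDXL file name are taken from the links above. The download is large (LLaVA v1.5 13B alone is a multi-gigabyte model), so the call is left commented out.

```python
from pathlib import Path

# Repo IDs taken from the links in the list above (downloaded as whole folders).
SNAPSHOT_REPOS = {
    "sdxl_clip1": "openai/clip-vit-large-patch14",
    "sdxl_clip2": "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k",
    "llava_clip": "openai/clip-vit-large-patch14-336",
    "llava_model": "liuhaotian/llava-v1.5-13b",
}
# The SDXL checkpoint is a single file: (repo ID, file name).
SDXL_FILE = (
    "stabilityai/stable-diffusion-xl-base-1.0",
    "sd_xl_base_1.0_0.9vae.safetensors",
)


def target_dirs(root):
    """Map each model key to a local folder under `root` (hypothetical layout)."""
    root = Path(root)
    return {key: root / repo_id.split("/")[-1]
            for key, repo_id in SNAPSHOT_REPOS.items()}


def download_all(root="checkpoints"):
    """Download every dependent model; needs network access and ample disk space."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download, snapshot_download

    for key, folder in target_dirs(root).items():
        snapshot_download(repo_id=SNAPSHOT_REPOS[key], local_dir=str(folder))
    repo_id, filename = SDXL_FILE
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir=str(root))


# download_all("checkpoints")  # uncomment to start the download
```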

#### Models we provided:
* `SUPIR-v0Q`: (Coming Soon) Google Drive, Baidu Netdisk

    Trained with the default settings from the paper. High generalization and high image quality in most cases.

* `SUPIR-v0F`: (Coming Soon) Google Drive, Baidu Netdisk

    Trained with light degradation settings. The Stage1 encoder of `SUPIR-v0F` retains more detail when facing light degradations.


4. Edit Custom Path for Checkpoints
```
* [CKPT_PTH.py] --> LLAVA_CLIP_PATH, LLAVA_MODEL_PATH, SDXL_CLIP1_PATH, SDXL_CLIP2_CACHE_DIR
* [options/SUPIR_v0.yaml] --> SDXL_CKPT, SUPIR_CKPT_Q, SUPIR_CKPT_F
```
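Before running inference, it can help to confirm that every edited path actually exists. A minimal sketch, assuming `CKPT_PTH.py` defines the four names listed above as plain string paths; the placeholder values and the `missing_paths` helper are hypothetical and not part of the repo:

```python
import os

# Placeholder values: replace them with your own locations. The keys mirror
# the names listed for CKPT_PTH.py; the paths themselves are made up.
CHECKPOINT_PATHS = {
    "LLAVA_CLIP_PATH": "/path/to/clip-vit-large-patch14-336",
    "LLAVA_MODEL_PATH": "/path/to/llava-v1.5-13b",
    "SDXL_CLIP1_PATH": "/path/to/clip-vit-large-patch14",
    "SDXL_CLIP2_CACHE_DIR": "/path/to/CLIP-ViT-bigG-14-laion2B-39B-b160k",
}


def missing_paths(paths):
    """Return the names whose configured path does not exist on disk."""
    return [name for name, path in paths.items() if not os.path.exists(path)]


# Report every entry that still points at a non-existent location.
for name in missing_paths(CHECKPOINT_PATHS):
    print(f"{name} points to a missing path: {CHECKPOINT_PATHS[name]}")
```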
---

## ⚡ Quick Inference

### Usage of SUPIR

```console
Usage:
    -- python test.py [options]
    -- python gradio_demo.py [interactive options]

--img_dir                Input folder.
--save_dir               Output folder.
--upscale                Upsampling ratio of given inputs. Default: 1
--SUPIR_sign             Model selection. Default: 'Q'; Options: ['F', 'Q']
--seed                   Random seed. Default: 1234
--min_size               Minimum resolution of output images. Default: 1024
--edm_steps              Number of steps for EDM Sampling Scheduler. Default: 50
--s_stage1               Control Strength of Stage1. Default: -1 (negative means invalid)
--s_churn                Original hyperparameter of EDM. Default: 5
--s_noise                Original hyperparameter of EDM. Default: 1.003
--s_cfg                  Classifier-free guidance scale for prompts. Default: 7.5
--s_stage2               Control Strength of Stage2. Default: 1.0
--num_samples            Number of samples for each input. Default: 1
--a_prompt               Additive positive prompt for all inputs.
    Default: 'Cinematic, High Contrast, highly detailed, taken using a Canon EOS R camera,
    hyper detailed photo - realistic maximum detail, 32k, Color Grading, ultra HD, extreme
    meticulous detailing, skin pore detailing, hyper sharpness, perfect without deformations.'
--n_prompt               Fixed negative prompt for all inputs.
    Default: 'painting, oil painting, illustration, drawing, art, sketch, oil painting,
    cartoon, CG Style, 3D render, unreal engine, blurring, dirty, messy, worst quality,
    low quality, frames, watermark, signature, jpeg artifacts, deformed, lowres, over-smooth'
--color_fix_type         Color Fixing Type. Default: 'Wavelet'; Options: ['None', 'AdaIn', 'Wavelet']
--linear_CFG             Linearly (with sigma) increase CFG from 'spt_linear_CFG' to s_cfg. Default: False
--linear_s_stage2        Linearly (with sigma) increase s_stage2 from 'spt_linear_s_stage2' to s_stage2. Default: False
--spt_linear_CFG         Start point of linearly increasing CFG. Default: 1.0
--spt_linear_s_stage2    Start point of linearly increasing s_stage2. Default: 0.0
--ae_dtype               Inference data type of AutoEncoder. Default: 'bf16'; Options: ['fp32', 'bf16']
--diff_dtype             Inference data type of Diffusion. Default: 'fp16'; Options: ['fp32', 'fp16', 'bf16']
```
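`--linear_CFG` and `--linear_s_stage2` both describe a linear ramp from a start point to a target value. A minimal sketch of that interpolation follows. It assumes the ramp is driven by a normalized progress value `t` in [0, 1]; how SUPIR derives that value from sigma is not stated in this README, so `t` and the function name `linear_ramp` are illustrative assumptions rather than the repo's implementation.

```python
def linear_ramp(start, end, t):
    """Linearly interpolate from `start` (t=0) to `end` (t=1).

    `t` is a hypothetical normalized progress value; it is clamped to [0, 1].
    """
    t = min(max(t, 0.0), 1.0)
    return start + (end - start) * t


# With the documented defaults, CFG ramps from spt_linear_CFG=1.0 to s_cfg=7.5
# and the Stage2 strength from spt_linear_s_stage2=0.0 to s_stage2=1.0.
for t in (0.0, 0.5, 1.0):
    cfg = linear_ramp(1.0, 7.5, t)
    s_stage2 = linear_ramp(0.0, 1.0, t)
    print(f"t={t:.1f}  cfg={cfg:.2f}  s_stage2={s_stage2:.2f}")
```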

### Python Script
```Shell
# Best quality in most cases
CUDA_VISIBLE_DEVICES=0,1 python test.py --img_dir '/opt/data/private/LV_Dataset/DiffGLV-Test-All/RealPhoto60/LQ' --save_dir ./results-Q --SUPIR_sign Q --upscale 2
# For light degradation and high fidelity
CUDA_VISIBLE_DEVICES=0,1 python test.py --img_dir '/opt/data/private/LV_Dataset/DiffGLV-Test-All/RealPhoto60/LQ' --save_dir ./results-F --SUPIR_sign F --upscale 2 --s_cfg 4.0 --linear_CFG
```
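To run both variants over the same folder, the two commands above can be generated from one place. A minimal sketch that uses only flags from the usage text; `build_command` and the `./inputs` folder are hypothetical names, and the commands are printed rather than executed:

```python
import shlex


def build_command(img_dir, sign, upscale=2, s_cfg=None, linear_cfg=False):
    """Assemble the argument list for test.py from flags in the usage text."""
    cmd = [
        "python", "test.py",
        "--img_dir", img_dir,
        "--save_dir", f"./results-{sign}",
        "--SUPIR_sign", sign,
        "--upscale", str(upscale),
    ]
    if s_cfg is not None:
        cmd += ["--s_cfg", str(s_cfg)]
    if linear_cfg:
        cmd.append("--linear_CFG")  # boolean switch, takes no value
    return cmd


# Mirror the two example invocations: quality-oriented Q, fidelity-oriented F.
commands = [
    build_command("./inputs", "Q"),
    build_command("./inputs", "F", s_cfg=4.0, linear_cfg=True),
]
for cmd in commands:
    # To execute instead of print, pass `cmd` to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```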

### Gradio Demo
```Shell
CUDA_VISIBLE_DEVICES=0,1 python gradio_demo.py --ip 0.0.0.0 --port 6688
```

### Online Demo (Coming Soon)

---

## BibTeX

    @misc{yu2024scaling,
      title={Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild},
      author={Fanghua Yu and Jinjin Gu and Zheyuan Li and Jinfan Hu and Xiangtao Kong and Xintao Wang and Jingwen He and Yu Qiao and Chao Dong},
      year={2024},
      eprint={2401.13627},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
    }

## 📧 Contact
If you have any questions, please email `fanghuayu96@gmail.com`.