Ezsynth

Reworked version of Trentonom0r3/Ezsynth, with masking support and some visual bug fixes. Aims to be easy to use and maintain.

Perform style transfer, color transfer, inpainting, superimposition, video stylization, and more. This implementation uses advanced physics-based edge detection and RAFT optical flow, which leads to more accurate results during synthesis.

⚠️ This is not intended to be used as an installable module.

Currently tested on:

Windows 10 - Python 3.11 - RTX3060
Ubuntu 24 - Python 3.12 - RTX4070(Laptop)

Get started

Windows

rem Clone this repo
git clone https://github.com/FuouM/Ezsynth.git 
cd Ezsynth

rem (Optional) create and activate venv
python -m venv venv
venv\Scripts\activate.bat

rem Install requirements
pip install -r requirements.txt

rem A precompiled ebsynth.dll is included.
rem If you don't want to rebuild it, you are ready to go and can skip the following steps.

rem Clone ebsynth
git clone https://github.com/Trentonom0r3/ebsynth.git

rem build ebsynth as lib
copy .\build_ebs-win64-cpu+cuda.bat .\ebsynth
cd ebsynth && .\build_ebs-win64-cpu+cuda.bat

rem copy lib
copy .\bin\ebsynth.dll ..\ezsynth\utils\ebsynth.dll

rem cleanup
cd .. && rmdir /s /q .\ebsynth

Linux

# clone this repo
git clone https://github.com/FuouM/Ezsynth.git 
cd Ezsynth

# (optional) create and activate venv
python -m venv venv
source ./venv/bin/activate

# install requirements
pip install -r requirements.txt

# clone ebsynth
git clone https://github.com/Trentonom0r3/ebsynth.git

# build ebsynth as lib
cp ./build_ebs-linux-cpu+cuda.sh ./ebsynth
cd ebsynth && ./build_ebs-linux-cpu+cuda.sh

# copy lib
cp ./bin/ebsynth.so ../ezsynth/utils/ebsynth.so

# cleanup
cd .. && rm -rf ./ebsynth

All

You may also install CuPy (which provides the cupyx module) to run some other operations on the GPU.
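Because CuPy is optional, a script may want to check for it before enabling any GPU-backed option. The following is a minimal sketch, assuming that the GPU flags (use_gpu and use_poisson_cupy, described under RunConfig) should only be turned on when the package is importable; the helper name is hypothetical and not part of this repo.

```python
import importlib.util


def cupy_available() -> bool:
    """Return True if the optional CuPy package can be found (hypothetical helper)."""
    return importlib.util.find_spec("cupy") is not None


# Assumption: GPU-backed options are only worth enabling when CuPy is installed.
use_gpu = cupy_available()
print(f"CuPy available: {use_gpu}")
```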

Examples

To get started, see test_redux.py for an example of generating a full video.
For image style transfer, see test_imgsynth.py, which covers all examples from the original Ebsynth.

Example outputs

Images: Face style, Stylit, Retarget

Video: Ebsynth.Demo.Cat.mp4

Video: Edge.Methods.mp4 (Comparison of Edge methods)

Notable things

Updates:

Ef-RAFT is added

To use, download the models from the original repo and place them in /ezsynth/utils/flow_utils/ef_raft_models:
```
.gitkeep
25000_ours-sintel.pth
ours-things.pth
ours_sintel.pth
```
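A quick way to confirm the checkpoints are in place before a run is sketched below. The file names and directory come from the listing above; the helper name is hypothetical, and the path is assumed to be relative to the repo root.

```python
from pathlib import Path

# Directory and file names taken from the listing above (relative to the repo root).
EF_RAFT_DIR = Path("ezsynth/utils/flow_utils/ef_raft_models")
EF_RAFT_FILES = ("25000_ours-sintel.pth", "ours-things.pth", "ours_sintel.pth")


def missing_ef_raft_models(model_dir: Path = EF_RAFT_DIR) -> list[str]:
    """Return the expected checkpoint names that are not present in model_dir."""
    return [name for name in EF_RAFT_FILES if not (model_dir / name).is_file()]


missing = missing_ef_raft_models()
if missing:
    print("Missing Ef-RAFT checkpoints:", ", ".join(missing))
else:
    print("All Ef-RAFT checkpoints found.")
```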
FlowDiffuser is added.

To use, download the model from the original repo and place it in /ezsynth/utils/flow_utils/flow_diffusion_models/FlowDiffuser-things.pth.

You will also need to install PyTorch Image Models to run it: pip install timm. On first run, it downloads two models totaling ~470 MB: twins_svt_large (378 MB) and twins_svt_small (92 MB).

This increases VRAM usage significantly when run along with EbSynth Run (~15 GB), though it may not run out of memory (tested on 12 GB VRAM).

In that case, it throws a CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR error, which should not be fatal; the run instead takes ~3x as long.

Video: Rafted-1.mp4 (Comparison of Optical Flow models)

Optical Flow directly affects Flow position warping and Style image warping, controlled by pos_wgt and wrp_wgt respectively.

Changes:

Flow is calculated on a frame by frame basis, with correct time orientation, instead of pre-computing only a forward-flow.
Padding is applied to Edge detection and Warping to remove border visual distortion.
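The padding idea can be illustrated with NumPy alone: pad the frame before a border-sensitive operation, then crop the result back. This is only a sketch; the reflect mode, the pad width, and both helper names are assumptions, since the padding actually used in this repo is not described here.

```python
import numpy as np


def pad_frame(frame: np.ndarray, pad: int) -> np.ndarray:
    """Reflect-pad the two spatial axes of an H x W x C frame."""
    return np.pad(frame, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")


def unpad_frame(frame: np.ndarray, pad: int) -> np.ndarray:
    """Crop the border that pad_frame added."""
    return frame[pad:-pad, pad:-pad] if pad > 0 else frame


frame = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
padded = pad_frame(frame, pad=2)       # run edge detection / warping on this
restored = unpad_frame(padded, pad=2)  # crop back to the original size
print(padded.shape, restored.shape)
```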

Observations:

Edge detection models return NaN if the input tensor has too many zeros (unconfirmed).
Pre-masked inputs take twice as long to run through Ebsynth.
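Given the NaN observation, a caller may want to guard an edge map before passing it on as a guide. The sketch below uses a hypothetical helper; replacing NaN with 0 is an assumption for illustration, not something this repo is documented to do.

```python
import numpy as np


def sanitize_edge_map(edge: np.ndarray) -> tuple[np.ndarray, bool]:
    """Replace NaN entries with 0 and report whether any were found."""
    nan_mask = np.isnan(edge)
    cleaned = np.where(nan_mask, 0.0, edge).astype(edge.dtype)
    return cleaned, bool(nan_mask.any())


edge = np.array([[0.25, np.nan], [0.0, 1.0]], dtype=np.float32)
clean, had_nan = sanitize_edge_map(edge)
if had_nan:
    print("Edge map contained NaN values; they were replaced with 0.")
```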

API Overview

ImageSynth

For image-to-image style transfer via file paths (see test_imgsynth.py):

ezsynner = ImageSynth(
    style_path="source_style.png",
    src_path="source_fullgi.png",
    tgt_path="target_fullgi.png",
    cfg=RunConfig(img_wgt=0.66),
)

result = ezsynner.run(
    guides=[
        load_guide(
            "source_dirdif.png",
            "target_dirdif.png",
            0.66,
        ),
        load_guide(
            "source_indirb.png",
            "target_indirb.png",
            0.66,
        ),
    ]
)

save_to_folder(output_folder, "stylit_out.png", result[0]) # Styled image
save_to_folder(output_folder, "stylit_err.png", result[1]) # Error image

Ezsynth

edge_method

Edge detection method. Choose from PST, Classic, or PAGE.

PST (Phase Stretch Transform): Good overall structure, but not very detailed.
Classic: A good balance between structure and detail.
PAGE (Phase and Gradient Estimation): Great detail, great structure, but slow.

video stylization

Via file paths (see test_redux.py):

style_paths = [
    "style000.png",
    "style006.png"
]

ezrunner = Ezsynth(
    style_paths=style_paths,
    image_folder=image_folder,
    cfg=RunConfig(pre_mask=False, feather=5, return_masked_only=False),
    edge_method="PAGE",
    raft_flow_model_name="sintel",
    mask_folder=mask_folder,
    do_mask=True
)

only_mode = None
stylized_frames, err_frames = ezrunner.run_sequences(only_mode)

save_seq(stylized_frames, "output")

Via Numpy ndarrays:

class EzsynthBase:
    def __init__(
        self,
        style_frs: list[np.ndarray],
        style_idxes: list[int],
        img_frs_seq: list[np.ndarray],
        cfg: RunConfig = RunConfig(),
        edge_method="Classic",
        raft_flow_model_name="sintel",
        do_mask=False,
        msk_frs_seq: list[np.ndarray] | None = None,
    ):
        pass
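When building the arrays by hand, it can help to check them before constructing EzsynthBase. The following is a sketch under stated assumptions: that style_idxes are zero-based positions into img_frs_seq, that there is one index per style frame, and that all frames share one shape. None of this is confirmed by the signature above, and check_sequence_inputs is a hypothetical helper.

```python
import numpy as np


def check_sequence_inputs(style_frs, style_idxes, img_frs_seq) -> None:
    """Raise ValueError if the inputs look inconsistent (hypothetical helper)."""
    if not img_frs_seq:
        raise ValueError("img_frs_seq is empty")
    if len(style_frs) != len(style_idxes):
        raise ValueError("style_frs and style_idxes must have the same length")
    for idx in style_idxes:
        # Assumption: indices are zero-based positions in img_frs_seq.
        if not 0 <= idx < len(img_frs_seq):
            raise ValueError(f"style index {idx} is outside the frame sequence")
    shape = img_frs_seq[0].shape
    for fr in list(img_frs_seq) + list(style_frs):
        # Assumption: every frame and style frame has the same H x W x C shape.
        if fr.shape != shape:
            raise ValueError("all frames and style frames must share one shape")


# Ten synthetic 64x64 RGB frames with style frames at indices 0 and 6,
# mirroring style000.png and style006.png from the file-path example.
img_frs_seq = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(10)]
style_frs = [np.full((64, 64, 3), 255, dtype=np.uint8) for _ in range(2)]
style_idxes = [0, 6]
check_sequence_inputs(style_frs, style_idxes, img_frs_seq)
```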

RunConfig

Ebsynth gen params

uniformity (float): Uniformity weight for the style transfer. Reasonable values are between 500 and 15000. Defaults to 3500.0.
patchsize (int): Size of the patches [NxN]. Must be an odd number >= 3. Defaults to 7.
pyramidlevels (int): Number of pyramid levels. Larger values are useful for things like color transfer. Defaults to 6.
searchvoteiters (int): Number of search/vote iterations. Defaults to 12.
patchmatchiters (int): Number of Patch-Match iterations. The larger, the longer it takes. Defaults to 6.
extrapass3x3 (bool): Perform additional polishing pass with 3x3 patches at the finest level. Defaults to True.
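The defaults and the patch-size constraint listed above can be captured in a small stand-in class. This is not the library's RunConfig: GenParams is a hypothetical dataclass that only mirrors the documented field names and defaults, and the positivity check on the iteration counts is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class GenParams:
    """Stand-in for the generation fields of RunConfig (not the library class)."""

    uniformity: float = 3500.0
    patchsize: int = 7
    pyramidlevels: int = 6
    searchvoteiters: int = 12
    patchmatchiters: int = 6
    extrapass3x3: bool = True

    def __post_init__(self) -> None:
        # Documented constraint: patch size must be an odd number >= 3.
        if self.patchsize < 3 or self.patchsize % 2 == 0:
            raise ValueError(f"patchsize must be odd and >= 3, got {self.patchsize}")
        # Assumption (not documented): level and iteration counts must be positive.
        for name in ("pyramidlevels", "searchvoteiters", "patchmatchiters"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")


params = GenParams(patchsize=5, pyramidlevels=8)
print(params)
```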

Ebsynth guide weights params

edg_wgt (float): Edge detect weights. Defaults to 1.0.
img_wgt (float): Original image weights. Defaults to 6.0.
pos_wgt (float): Flow position warping weights. Defaults to 2.0.
wrp_wgt (float): Warped style image weight. Defaults to 0.5.

Blending params

use_gpu (bool): Use GPU for Histogram Blending (only affects Blend mode). Faster than CPU. Defaults to False.
use_lsqr (bool): Use LSQR (least-squares solver) instead of LSMR (iterative solver for least-squares) for the Poisson blending step. LSQR often yields better results; switching to LSMR may improve speed, depending on the input. Defaults to True.
use_poisson_cupy (bool): Use Cupy GPU acceleration for the Poisson blending step. Uses LSMR (overrides use_lsqr). May not yield better speed. Defaults to False.
poisson_maxiter (int | None): Maximum number of iterations for the Poisson least-squares solve (only affects LSMR mode). Expects a positive integer. Defaults to None.
only_mode (str): Skip blending and run only one pass per sequence. Valid values:
- MODE_FWD = "forward" (runs only the forward pass if sequence.mode is blend)
- MODE_REV = "reverse" (runs only the reverse pass if sequence.mode is blend)
- Defaults to MODE_NON = "none".
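One way to read the only_mode rule is sketched below. The three constant values come from the list above; the "blend" mode string, the treatment of None as "none", the pass-through for non-blend sequences, and the passes_to_run helper are all assumptions made for illustration.

```python
MODE_FWD = "forward"
MODE_REV = "reverse"
MODE_NON = "none"
MODE_BLN = "blend"  # assumed name for the blend sequence mode


def passes_to_run(sequence_mode: str, only_mode=MODE_NON) -> list[str]:
    """Return the passes a sequence would run under only_mode (hypothetical helper)."""
    if sequence_mode != MODE_BLN:
        # Assumption: only_mode has no effect on sequences that do not blend.
        return [sequence_mode]
    if only_mode in (MODE_FWD, MODE_REV):
        # Blending is skipped; a single pass is run.
        return [only_mode]
    # "none" (or None): run both passes, which are then blended.
    return [MODE_FWD, MODE_REV]


for only in (MODE_NON, MODE_FWD, MODE_REV):
    print(only, "->", passes_to_run(MODE_BLN, only))
```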

Masking params

do_mask (bool): Whether to apply mask. Defaults to False.
pre_mask (bool): Whether to mask the inputs and styles before the run (True) or after it (False). Pre-masking takes ~2x as long per frame, which could be due to the ebsynth.dll implementation. Defaults to False.
feather (int): Gaussian feather radius applied to the masked results. Only takes effect if return_masked_only == False. Expects an integer. Defaults to 0.
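Feathering can be pictured as blurring the binary mask into a soft alpha matte and compositing the stylized frame over the original with it. The NumPy sketch below shows only that general idea; the sigma = radius / 2 mapping, the edge padding, and the helper names are assumptions, and the repo's own implementation may differ.

```python
import numpy as np


def gaussian_kernel(radius: int) -> np.ndarray:
    """1-D Gaussian kernel with 2*radius+1 taps (sigma = radius / 2 is an assumption)."""
    sigma = max(radius / 2.0, 1e-6)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def feather_mask(mask: np.ndarray, radius: int) -> np.ndarray:
    """Blur a binary H x W mask into a soft alpha matte in [0, 1]."""
    alpha = mask.astype(np.float64)
    if radius <= 0:
        return alpha
    k = gaussian_kernel(radius)
    padded = np.pad(alpha, radius, mode="edge")
    # Separable blur: filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    cols = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    return np.clip(cols, 0.0, 1.0)


def composite(stylized: np.ndarray, original: np.ndarray, mask: np.ndarray, radius: int) -> np.ndarray:
    """Blend stylized over original using the feathered mask as alpha."""
    alpha = feather_mask(mask, radius)[..., None]
    out = alpha * stylized + (1.0 - alpha) * original
    return np.rint(out).astype(original.dtype)


mask = np.zeros((16, 16), dtype=np.uint8)
mask[4:12, 4:12] = 1
original = np.zeros((16, 16, 3), dtype=np.uint8)
stylized = np.full((16, 16, 3), 200, dtype=np.uint8)
result = composite(stylized, original, mask, radius=2)
print(result[8, 8], result[4, 8], result[0, 0])
```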

Credits

jamriska - https://github.com/jamriska/ebsynth

@misc{Jamriska2018,
  author = {Jamriska, Ondrej},
  title = {Ebsynth: Fast Example-based Image Synthesis and Style Transfer},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/jamriska/ebsynth}},
}

Trentonom0r3 - https://github.com/Trentonom0r3/Ezsynth

https://github.com/princeton-vl/RAFT

RAFT: Recurrent All Pairs Field Transforms for Optical Flow
ECCV 2020
Zachary Teed and Jia Deng

https://github.com/n3slami/Ef-RAFT

@inproceedings{eslami2024rethinking,
  title={Rethinking RAFT for efficient optical flow},
  author={Eslami, Navid and Arefi, Farnoosh and Mansourian, Amir M and Kasaei, Shohreh},
  booktitle={2024 13th Iranian/3rd International Machine Vision and Image Processing Conference (MVIP)},
  pages={1--7},
  year={2024},
  organization={IEEE}
}

https://github.com/LA30/FlowDiffuser

@inproceedings{luo2024flowdiffuser,
  title={FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models},
  author={Luo, Ao and Li, Xin and Yang, Fan and Liu, Jiangyu and Fan, Haoqiang and Liu, Shuaicheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={19167--19176},
  year={2024}
}
