+---
+license: cc-by-nc-4.0
+base_model:
+- Wan-AI/Wan2.1-I2V-14B-480P-Diffusers
+pipeline_tag: image-to-video
+tags:
+- Painting
+---
+# Loomis Painter: Reconstructing the Painting Process
+<p align="center">
+  <a href='https://github.com/Markus-Pobitzer/wlp'>
+    <img src='https://img.shields.io/badge/github-repo-blue?logo=github'></a>
+  <a href='https://arxiv.org/abs/2511.17344'>
+    <img src='https://img.shields.io/badge/Arxiv-Pdf-A42C25?style=flat&logo=arXiv&logoColor=white'></a>
+  <a href='https://markus-pobitzer.github.io/lplp'>
+    <img src='https://img.shields.io/badge/Project-Page-green?style=flat&logo=Google%20chrome&logoColor=white'></a>
+</p>
+<table>
+  <tr>
+    <td align="center">
+      <img src="assets/base.gif" width="380" alt="Generated Video" />
+      <br />
+      <sub>Generated Video</sub>
+    </td>
+    <td align="center">
+      <img src="assets/reference_image.png" width="380" alt="Input" title="Haystacks by Claude Monet. Source: Wikiart." />
+      <br />
+      <sub>Input</sub>
+    </td>
+  </tr>
+</table>
+## Base Model Inference
+Before running the code, make sure that torch, diffusers, transformers, huggingface_hub, and pillow are installed. You can also install the dependencies from the official [Loomis Painter repo](https://github.com/Markus-Pobitzer/wlp).
+```python
+import torch
+from diffusers import AutoencoderKLWan, WanImageToVideoPipeline
+from diffusers.utils import export_to_video, load_image
+from transformers import CLIPVisionModel
+from huggingface_hub import hf_hub_download
+from typing import Tuple, Union
+from PIL import Image, ImageOps
+def pil_resize(
+    image: Image.Image,
+    target_size: Tuple[int, int],
+    pad_input: bool = False,
+    padding_color: Union[str, int, Tuple[int, ...]] = "white",
+) -> Image.Image:
+    """Resize the image to the target size.
+    Args:
+        image: Input image to be processed.
+        target_size: Target size (width, height).
+        pad_input: If True, resize the image while keeping the aspect ratio and pad the remaining area.
+        padding_color: The color for the padded pixels.
+    Returns:
+        The resized image
+    """
+    if pad_input:
+        # Resize image, keep aspect ratio
+        image = ImageOps.contain(image, size=target_size)
+        # Pad while keeping image in center
+        image = ImageOps.pad(image, size=target_size, color=padding_color)
+    else:
+        image = image.resize(target_size)
+    return image
+def undo_pil_resize(
+    image: Image.Image,
+    target_size: Tuple[int, int],
+) -> Image.Image:
+    """Undo the resizing and padding of pil_resize and return a new image of size target_size.
+    Args:
+        image: Input image to be processed.
+        target_size: Target size (width, height).
+    Returns:
+        The resized image
+    """
+    tmp_img = Image.new(mode="RGB", size=target_size)
+    # Get the resized image size
+    tmp_img = ImageOps.contain(tmp_img, size=image.size)
+    # Undo padding by center cropping
+    width, height = image.size
+    tmp_width, tmp_height = tmp_img.size
+    left = int(round((width - tmp_width) / 2.0))
+    top = int(round((height - tmp_height) / 2.0))
+    right = left + tmp_width
+    bottom = top + tmp_height
+    cropped = image.crop((left, top, right, bottom))
+    # Undo resizing
+    ret = cropped.resize(target_size)
+    return ret
+# Set to True if your GPU has less than 80 GB of VRAM. Note: inference becomes very slow.
+enable_sequential_cpu_offload = True
+# Download the LoRA file
+lora_path = hf_hub_download(repo_id="Markus-Pobitzer/wlp-lora", filename="base.safetensors")
+print(f"LoRA path: {lora_path}")
+# Load the pipeline
+model_id = "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers"
+vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
+image_encoder = CLIPVisionModel.from_pretrained(
+    model_id, subfolder="image_encoder", torch_dtype=torch.float32
+)
+# Takes more than 100 GB of disk space
+pipe = WanImageToVideoPipeline.from_pretrained(
+    model_id, vae=vae, image_encoder=image_encoder, torch_dtype=torch.bfloat16
+)
+# Load LoRA
+pipe.load_lora_weights(lora_path)
+pipe.fuse_lora()
+# Either offload or directly to GPU
+if enable_sequential_cpu_offload:
+    pipe.enable_sequential_cpu_offload()
+else:
+    pipe.to("cuda")
+### INFERENCE ###
+image = load_image(
+    "https://uploads3.wikiart.org/images/claude-monet/haystacks-at-giverny.jpg"
+)
+og_size = image.size
+height = 480
+width = 832
+# Resize and pad
+ref_image = pil_resize(image, target_size=(width, height), pad_input=True)
+prompt = "Painting process step by step."
+output = pipe(
+    image=ref_image,
+    prompt=prompt,
+    height=height,
+    width=width,
+    num_frames=81,
+    output_type="pil",
+    guidance_scale=1.0,
+).frames[0]
+# Restore the original image size and reverse the frame order
+output = [undo_pil_resize(img, og_size) for img in output][::-1]
+# Save video
+export_to_video(output, "output.mp4", fps=3)
+```
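For readers who want to see what the resize-and-pad step does to the image geometry, here is a minimal sketch in plain Python. It computes the box that the original picture occupies inside the padded 832x480 frame, which is the region `undo_pil_resize` crops before scaling back. The helper name `letterbox_box` is hypothetical and not part of the repo, and the rounding is an approximation that may differ from Pillow's result by a pixel.

```python
from typing import Tuple


def letterbox_box(
    original_size: Tuple[int, int],
    target_size: Tuple[int, int],
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the picture inside the padded frame.

    Hypothetical helper for illustration only; sizes are (width, height).
    """
    orig_w, orig_h = original_size
    target_w, target_h = target_size
    # Fit the picture inside the target while keeping its aspect ratio.
    if orig_w / orig_h > target_w / target_h:
        # Picture is wider than the target: width is the limiting side.
        fit_w, fit_h = target_w, round(orig_h / orig_w * target_w)
    else:
        # Picture is taller than (or matches) the target: height limits.
        fit_w, fit_h = round(orig_w / orig_h * target_h), target_h
    # Center the fitted picture; the rest of the frame is padding.
    left = int(round((target_w - fit_w) / 2.0))
    top = int(round((target_h - fit_h) / 2.0))
    return left, top, left + fit_w, top + fit_h


print(letterbox_box((1000, 500), (832, 480)))  # (0, 32, 832, 448)
```

For a 1000x500 input, the picture fills the full width and receives 32 pixels of padding above and below.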
+### Art Media Transfer
+To transfer from one art medium to another, use the following LoRA:
+```python
+lora_path = hf_hub_download(repo_id="Markus-Pobitzer/wlp-lora", filename="art_media_transfer.safetensors")
+```
+Make sure that you also change the prompt accordingly. The supported art media are:
+- acrylic
+- colored pencils
+- loomis
+- pencil
+- oil
+The prompt has the following format:
+```python
+art_media = "..."
+painting_desc = "..."
+prompt = f"<{art_media}> Painting process step by step. {painting_desc}"
+```
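The prompt format above can be wrapped in a small helper that rejects unsupported art media before a long inference run starts. This is only a sketch: `build_prompt` and `SUPPORTED_ART_MEDIA` are hypothetical names, the tags are assumed to match the list entries literally, and dropping the trailing space for an empty description is an assumption rather than documented behavior.

```python
# Art media listed as supported in this model card (assumed to be the literal tags).
SUPPORTED_ART_MEDIA = ("acrylic", "colored pencils", "loomis", "pencil", "oil")


def build_prompt(art_media: str, painting_desc: str = "") -> str:
    """Build a prompt of the form '<medium> Painting process step by step. <desc>'.

    Hypothetical helper for illustration; not part of the repo.
    """
    if art_media not in SUPPORTED_ART_MEDIA:
        raise ValueError(
            f"Unsupported art media {art_media!r}; expected one of {SUPPORTED_ART_MEDIA}"
        )
    prompt = f"<{art_media}> Painting process step by step."
    desc = painting_desc.strip()
    if desc:
        # Append the optional description after a single space.
        prompt = f"{prompt} {desc}"
    return prompt


print(build_prompt("oil", "The image depicts a red barn in a green field."))
```

The resulting string can be passed as `prompt` to the pipeline call from the base model example.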
+For acrylic, colored pencils, and oil, the prompt can contain color descriptions, e.g.
+```python
+prompt = "<acrylic> Painting process step by step. The image depicts a serene landscape with a small brown and green island in the center of a body of water, surrounded by green trees and a few boats. The sky is blue with scattered clouds, and there are birds flying in the background."
+```
+For the loomis and pencil art media, we left the color information out during fine-tuning, e.g.
+```python
+prompt = "<pencil> Painting process step by step. The image depicts a serene landscape with a small island in the center of a body of water, surrounded by trees and a few boats. There are scattered clouds, and birds flying in the background."
+```
+Note that the loomis method only works on portrait photos and paintings; for other inputs it seems to fall back to another art medium.
+## Citation
+If you use this work, please cite:
+```bibtex
+@misc{pobitzer2025loomispainter,
+      title={Loomis Painter: Reconstructing the Painting Process},
+      author={Markus Pobitzer and Chang Liu and Chenyi Zhuang and Teng Long and Bin Ren and Nicu Sebe},
+      year={2025},
+      eprint={2511.17344},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2511.17344},
+}
+```