<p align="center">
  <h1 align="center">LAMP: Language-Assisted Motion Planning</h1>
  <p align="center">
    <strong>M. Burak Kizil</strong>
    <strong>Enes Sanli</strong>
    <strong>Niloy J. Mitra</strong>
    <strong>Erkut Erdem</strong>
    <strong>Aykut Erdem</strong>
    <strong>Duygu Ceylan</strong>
    <br>
    <br>
    <a href="https://arxiv.org/abs/2512.03619">arXiv</a>&nbsp;&nbsp;&nbsp;
    <a href="https://cyberiada.github.io/LAMP/">Webpage</a>&nbsp;&nbsp;&nbsp;
    <a href="https://github.com/mbkizil/LAMP/">GitHub</a>
    <br>
  </p>
</p>

## Introduction
<strong>LAMP</strong> defines a motion domain-specific language (DSL) inspired by cinematography conventions. By harnessing the program-synthesis capabilities of LLMs, LAMP generates structured motion programs from natural language, which are then deterministically mapped to 3D trajectories.

<img src='./assets/teaser.jpg'>
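
To make the idea concrete, here is a minimal, self-contained sketch of the concept (not the actual LAMP DSL; the operator names, step format, and numbers are all hypothetical): a motion program as a list of timed cinematography operators that deterministically expands into a per-frame 3D camera trajectory.

```python
# Purely illustrative: the actual LAMP DSL grammar is defined in the paper,
# and none of these operator names are taken from the codebase.
import numpy as np

def dolly(t, start, end):
    # Linear push from `start` toward `end`; t runs from 0 to 1.
    return (1 - t) * np.asarray(start, float) + t * np.asarray(end, float)

def orbit(t, center, radius, height):
    # Circle around `center` at fixed `height`, one full turn over t in [0, 1].
    angle = 2 * np.pi * t
    return np.asarray(center, float) + [radius * np.cos(angle), height, radius * np.sin(angle)]

# A program an LLM might emit for "push in slowly, then circle the subject":
# each step is (number of frames, operator). Expansion is deterministic,
# so the same program always yields the same trajectory.
program = [
    (48, lambda t: dolly(t, start=[0, 1, 5], end=[0, 1, 2])),
    (72, lambda t: orbit(t, center=[0, 1, 0], radius=2.0, height=1.0)),
]

trajectory = np.concatenate(
    [np.stack([op(f / (n - 1)) for f in range(n)]) for n, op in program]
)
print(trajectory.shape)  # (120, 3): one 3D camera position per frame
```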


## 🎉 News
- [ ] Client inference is coming soon.
- [x] Dec 7, 2025: Gradio demo is ready to use.
- [x] Dec 7, 2025: We propose [LAMP](https://cyberiada.github.io/LAMP/).


## ⚙️ Installation
The codebase was tested with Python 3.11.13, CUDA 12.8, and PyTorch >= 2.8.0.

### Setup for Model Inference
You can set up LAMP model inference by running:
```bash
git clone https://github.com/mbkizil/LAMP.git && cd LAMP
pip install torch==2.8.0 torchvision==0.23.0 --index-url https://download.pytorch.org/whl/cu128  # if PyTorch is not installed
pip install -r requirements.txt
pip install wan@git+https://github.com/Wan-Video/Wan2.1
```
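
Before downloading weights, you can sanity-check that the expected PyTorch build sees your GPU (a generic check, not a LAMP-specific script):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# Expect something like "2.8.0+cu128 True" on a CUDA 12.8 machine.
```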

## Download Models

Download the [VACE](https://huggingface.co/Wan-AI/Wan2.1-VACE-1.3B) and finetuned [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) model weights using [download.sh](download.sh):

```bash
chmod +x download.sh
./download.sh
```
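
The Gradio demo below expects the finetuned planner under `./qwen_checkpoints/LAMP-Qwen-2.5-VL` (the path passed to `--model-path`); assuming `download.sh` places it there, a quick way to confirm the files landed:

```bash
ls ./qwen_checkpoints/LAMP-Qwen-2.5-VL
```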

## 🚀 Usage
In LAMP, users act as a director, providing natural language descriptions of both object and camera behaviors. The system translates these prompts into precise 3D motion programs and conditions the video generation process on them to produce cinematic shots.

### Interactive Demo (Gradio)
To explore the full pipeline, from text-to-motion planning to final video synthesis, we provide an interactive Gradio interface. This single entry point handles loading both the Motion Planner (Qwen2.5-VL) and the Video Generator (VACE).
```bash
python -m src.serve.app --model-path ./qwen_checkpoints/LAMP-Qwen-2.5-VL
```
This script will:

- Load the LLM Motion Planner (Qwen2.5-based) into memory.
- Initialize the embedded VACE pipeline for trajectory-conditioned generation.
- Launch a local web server (default: http://127.0.0.1:8890).
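
Once the server is up, you can also drive it programmatically. A minimal client-side sketch, assuming the app exposes the standard Gradio API (the endpoint name and argument list here are hypothetical; check the "Use via API" panel of the launched UI for the real signature):

```python
# Hypothetical client call; the endpoint name and inputs are assumptions.
from gradio_client import Client  # pip install gradio_client

client = Client("http://127.0.0.1:8890")
result = client.predict(
    "A slow dolly-in on the subject, then orbit left.",  # motion prompt
    api_name="/generate",  # hypothetical endpoint; verify in the UI
)
print(result)  # typically a file path to the generated video
```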

> 💡 **Notes from VACE**:
> (1) Please refer to [vace/vace_wan_inference.py](./src/vace_lib/vace/vace_wan_inference.py) for the inference args.
> (2) If you use Wan2.1 with English prompts, you need prompt extension to unlock the full model performance. Please follow the [instructions for Wan2.1](https://github.com/Wan-Video/Wan2.1?tab=readme-ov-file#2-using-prompt-extension) and set `--use_prompt_extend` when running inference.
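
If you call the VACE inference script directly rather than going through the Gradio app, treat the snippet below as a skeleton only; `--use_prompt_extend` is the one flag documented here, and the script will require additional model and prompt arguments defined in its own argparse setup:

```bash
# Skeleton invocation; see vace_wan_inference.py for the authoritative argument list.
python ./src/vace_lib/vace/vace_wan_inference.py --use_prompt_extend
```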


## Acknowledgement

We are grateful to the awesome projects that served as the foundation for LAMP: [VACE](https://github.com/ali-vilab/VACE) for the powerful all-in-one video generation backbone, and [Qwen](https://github.com/QwenLM/Qwen3-VL) for the robust language reasoning capabilities. We also thank [Qwen-VL-Series-Finetune](https://github.com/2U1/Qwen-VL-Series-Finetune), which provided an efficient framework for training our motion planner.

Additionally, we acknowledge the pioneering works in camera control and trajectory generation, specifically [GenDoP](https://github.com/3DTopia/GenDoP) and [Exceptional Trajectories](https://github.com/robincourant/DIRECTOR). Their contributions to motion datasets and evaluation methodologies inspired this project and established essential baselines for controllable video generation.

## BibTeX

```bibtex
@misc{kizil2025lamplanguageassistedmotionplanning,
  title={LAMP: Language-Assisted Motion Planning for Controllable Video Generation},
  author={Muhammed Burak Kizil and Enes Sanli and Niloy J. Mitra and Erkut Erdem and Aykut Erdem and Duygu Ceylan},
  year={2025},
  eprint={2512.03619},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.03619},
}
```