---
library_name: diffusers
license: apache-2.0
pipeline_tag: image-to-video
---

<h1 align="center">
RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer
</h1>

<div align="center" class="authors">
Liu Liu,
Xiaofeng Wang,
Guosheng Zhao,
Keyu Li,
Wenkang Qin,
Jiaxiong Qiu,
Zheng Zhu,
Guan Huang,
Zhizhong Su
</div>

<div align="center" style="line-height: 3;">
<a href="https://github.com/HorizonRobotics/RoboTransfer" target="_blank" style="margin: 2px;">
<img alt="Code" src="https://img.shields.io/badge/Code-Github-blue" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://horizonrobotics.github.io/robot_lab/robotransfer" target="_blank" style="margin: 2px;">
<img alt="Project Page" src="https://img.shields.io/badge/Project_Page-blue" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://arxiv.org/abs/2505.23171" target="_blank" style="margin: 2px;">
<img alt="arXiv" src="https://img.shields.io/badge/arXiv-b31b1b" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://youtu.be/dGXKtqDnm5Q" target="_blank" style="margin: 2px;">
<img alt="Video" src="https://img.shields.io/badge/Video-red" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://mp.weixin.qq.com/s/c9-1HPBMHIy4oEwyKnsT7Q" target="_blank" style="margin: 2px;">
<img alt="中文介绍" src="https://img.shields.io/badge/中文介绍-07C160?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>

<div align="center">
<img src="assets/pin.jpg" width="40%" alt="RoboTransfer"/></div>

---

## Abstract

<img src="assets/overview.jpg" alt="RoboTransfer overview"/>

**RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.

---

## Key Features

- **Geometry-Consistent Diffusion**: Injects global 3D cues (depth and normal maps) and cross-view interactions for multi-view realism.
- 🧩 **Scene Component Control**: Enables manipulation of object attributes (pose, identity) and background features.
- **Cross-View Conditioning**: Learns representations from multiple camera views with spatial correspondence.
- 🤖 **Robotic Policy Transfer**: Facilitates domain adaptation by generating synthetic training data in target domains.
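As a loose illustration of the geometry-aware conditioning described above, the sketch below stacks per-view depth and normal priors into a single multi-view conditioning tensor. The function name, channel layout, and tensor shapes are illustrative assumptions for exposition, not the released RoboTransfer interface.

```python
import torch

def build_geometry_conditioning(depth: torch.Tensor, normal: torch.Tensor) -> torch.Tensor:
    """Concatenate per-view depth (V, 1, H, W) and normal (V, 3, H, W) maps
    into one multi-view conditioning tensor of shape (1, V*4, H, W)."""
    assert depth.shape[0] == normal.shape[0], "need one depth and normal map per view"
    cond = torch.cat([depth, normal], dim=1)  # (V, 4, H, W)
    views, channels = cond.shape[0], cond.shape[1]
    # Fold the view axis into channels so a cross-view block can attend to all views jointly.
    return cond.reshape(1, views * channels, *cond.shape[-2:])

# Three camera views at 64x64 resolution.
depth = torch.rand(3, 1, 64, 64)
normal = torch.rand(3, 3, 64, 64)
cond = build_geometry_conditioning(depth, normal)
print(cond.shape)  # torch.Size([1, 12, 64, 64])
```

In the actual model, a tensor like this would be consumed by the diffusion backbone alongside the noisy video latents; here it only demonstrates how depth and normal priors can be fused per view before cross-view interaction.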

---

## BibTeX

```bibtex
@article{liu2025robotransfer,
  title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
  author={Liu, Liu and Wang, Xiaofeng and Zhao, Guosheng and Li, Keyu and Qin, Wenkang and Qiu, Jiaxiong and Zhu, Zheng and Huang, Guan and Su, Zhizhong},
  journal={arXiv preprint arXiv:2505.23171},
  year={2025}
}
```