Jinbo Xing

I am currently a third-year Ph.D. student in the CSE Department, CUHK, supervised by Prof. Tien-Tsin WONG.

Previously, I received my B.Sc. (1st class honors & ELITE Stream) and M.Sc. degrees in Computer Science from CUHK in 2020 and 2021, respectively.

I'm focusing on controllable text-to-video & image-to-video generation.

I am looking for a full-time position in AIGC video generation starting in Fall 2025, when I will graduate. Drop me an email if you are recruiting!

Email / Google Scholar / Github / ResearchGate / LinkedIn / CV

Research Works

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan
arXiv preprint, 2023
arXiv / Project / Code / Demo
An open-domain image animation method based on video diffusion priors.

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter
Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan
arXiv preprint, 2023
arXiv / Project / Code / Demo
Given a stylized image as reference, the model can generate images and videos in a consistent style from a text prompt.

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan
arXiv preprint, 2023
arXiv / Project / Code
An open-source foundational text-to-video and image-to-video diffusion model for high-quality video generation.

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024
arXiv / Project / Code
Given a text description and a video structure (depth), our approach can generate temporally coherent and high-fidelity videos. Its applications include dynamic 3D-scene-to-video creation, real-life scene to video, and video re-rendering.

RIVAL: Real-World Image Variation by Aligning Diffusion Inversion Chain
Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
Neural Information Processing Systems (NeurIPS, spotlight), 2023
arXiv / Project / Code
Given a real-world image, our proposed method can generate semantically aligned variants of it. The approach is compatible with customized image generation, inpainting, controllable image synthesis, style transfer, etc.

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen
arXiv preprint, 2023
arXiv / Project / Code
We utilize the abundance of existing video clips and synthesize a coherent storytelling video by customizing their appearances with given visual concepts.

Video Colorization with Pre-trained Text-to-Image Diffusion Models
Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
arXiv preprint, 2023
arXiv / Project / Code
We present an effective video colorization method that adapts a pre-trained text-to-image diffusion model.

Improved Diffusion-based Image Colorization via Piggybacked Models
Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong
arXiv preprint, 2023
arXiv / Project / Code
We present a novel image colorization method that leverages the rich diffusion model prior. It takes text-based image colorization to another semantic level.

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
Computer Vision and Pattern Recognition (CVPR), 2023
arXiv / Paper / Project / Code / Online demo
We reduce over-smoothed facial expressions using a learned discrete motion prior, achieving more dramatic and expressive facial motions with more accurate lip movements.

Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields
Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia
Computer Vision and Pattern Recognition (CVPR), 2023
arXiv / Paper / Project / Code / Video / Data
We present a controllable scene stylization method utilizing radiance fields to stylize a 3D scene, with a single stylized 2D view taken as reference.

Scale-arbitrary Invertible Image Downscaling
Jinbo Xing, Wenbo Hu, Menghan Xia, Tien-Tsin Wong
IEEE Transactions on Image Processing (TIP), 2023
Paper / Project / Code / Interactive inspection
We present a scale-arbitrary invertible image downscaling network (AIDN) to natively downscale high-resolution (HR) images with arbitrary scale factors. The HR images can be restored with AIDN whenever necessary.

Flow-aware Synthesis: A Generic Motion Model for Video Frame Interpolation
Jinbo Xing, Wenbo Hu, Yuechen Zhang, Tien-Tsin Wong
Computational Visual Media (CVM), 2021
Paper
We present a flow-aware multi-frame interpolation method to address the complicated non-linear motions in the real world.

Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation
Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng (all authors contributed equally)
AAAI Conference on Artificial Intelligence Workshop Program (AAAIW), DSTC9, 2021
arXiv / Paper / Code / Paper Reading / Poster
We propose a system capable of accessing unstructured knowledge for task-oriented dialog.

Work Experience

Tencent AI Lab, Tencent TEG	Research Intern (Mentor: Dr. Menghan Xia)	Jun. 2022 - Present
Stanley Ho Big Data Decision Analytics Research Centre, CUHK	Research Assistant (Supervisor: Prof. Helen Meng)	May 2020 - Aug. 2021
Electronic Engineering Dept., CUHK	Summer Research Intern (Supervisor: Prof. Ni Zhao)	May 2018 - Jul. 2018

Awards

Postgraduate Studentship
Distinguished Academic Performance Scholarship of MSc Programme
ELITE Stream Student Scholarship (2018, 2019)
Computer Science scholarship
Dean's List of Faculty of Engineering (2016, 2019, 2020)
Full Scholarship for Undergraduate Students (Mainland)

Teaching

2023-2024 Spring Computational Imaging and Vision (CSCI 3290)

2023-2024 Fall Advanced GPU Programming (CSCI 5390)

2022-2023 Spring Introduction to Multimedia Systems (CSCI 3280)

2021-2022 Spring Introduction to Multimedia Systems (CSCI 3280)

2021-2022 Fall Problem Solving by Programming (ENGG 1110)

2021-2022 Fall Computer Game Software Production (CMSC 5727)

Academic Services

Journal Review:
The Visual Computer Journal (TVCJ)
IEEE Transactions on Visualization and Computer Graphics (TVCG)
International Journal of Computer Vision (IJCV)
ACM Computing Surveys (CSUR)

Conference Review:
ECCV 2024

Misc

I am a keen amateur photographer.

I also enjoy playing League of Legends and Apex Legends (best rank: Diamond).

Last updated: April 2024
Web page design credit to Jon Barron