Jinbo Xing

I am currently a final-year Ph.D. student in the CSE Department, CUHK, supervised by Prof. Tien-Tsin WONG and Prof. Chi-Wing FU.

Previously, I received my B.Sc. (1st class honours & ELITE Stream) and M.Sc. degrees in Computer Science from CUHK in 2020 and 2021, respectively.

I'm focusing on controllable text-to-video & image-to-video generation.

I will graduate in Fall 2025 and am looking for a full-time position starting then. Drop me an email if you are recruiting!

Email  /  Google Scholar  /  Github  /  ResearchGate  /  LinkedIn  /  CV

Research Works
ToonCrafter: Generative Cartoon Interpolation
Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong
SIGGRAPH Asia, 2024, Journal Track (selected for the Technical Papers Trailer)

Paper  /  arXiv  /  Project  /  Demo  /  Code

Taming large-scale video diffusion models for generative cartoon interpolation.

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong
European Conference on Computer Vision (ECCV, Oral), 2024

Paper  /  arXiv  /  Project  /  Demo  /  Code

An open-domain high-quality image-to-video foundation model.

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
Wangbo Yu*, Jinbo Xing*, Li Yuan*, Wenbo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian
arXiv preprint, 2024

arXiv  /  Project  /  Demo  /  Code

Taming large-scale video diffusion models for high-fidelity novel view synthesis.

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter
Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan
SIGGRAPH Asia, 2024, Journal Track

Paper  /  arXiv  /  Project  /  Demo  /  Code

Given a stylized image as reference, the model can generate images and videos in a consistent style from a text prompt.

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan
arXiv preprint, 2023

arXiv  /  Project  /  Code

An open-source foundational text-to-video and image-to-video diffusion model for high-quality video generation.

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024

arXiv  /  Project  /  Code

Given a text description and video structure (depth), our approach can generate temporally coherent, high-fidelity videos. Its applications include dynamic 3D-scene-to-video creation, real-life-scene-to-video creation, and video re-rendering.

MagicStick: Controllable Video Editing via Control Handle Transformations
Yue Ma, Xiaodong Cun, Sen Liang, Jinbo Xing, Yingqing He, Chenyang Qi, Siran Chen, Qifeng Chen
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025

arXiv  /  Project  /  Code

MagicStick is a unified framework for modifying video properties (e.g., shape, size, location, motion) by applying keyframe transformations to the extracted internal control signals.

RIVAL: Real-World Image Variation by Aligning Diffusion Inversion Chain
Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
Neural Information Processing Systems (NeurIPS, Spotlight), 2023

arXiv  /  Project  /  Code

Given a real-world image, our proposed method can generate semantically aligned variants of it. The approach is compatible with customized image generation, inpainting, controllable image synthesis, style transfer, etc.

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen
AI4VA ECCV Workshop, 2024

arXiv  /  Project  /  Code

We utilize the abundance of existing video clips and synthesize a coherent storytelling video by customizing their appearances with given visual concepts.

Video Colorization with Pre-trained Text-to-Image Diffusion Models
Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
arXiv preprint, 2023

arXiv  /  Project  /  Code

We present an effective video colorization method that adapts a pre-trained text-to-image diffusion model to video colorization.

Improved Diffusion-based Image Colorization via Piggybacked Models
Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong
arXiv preprint, 2023

arXiv  /  Project  /  Code

We present a novel image colorization method that leverages the rich prior of a diffusion model. It takes text-based image colorization to another semantic level.

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
Computer Vision and Pattern Recognition (CVPR), 2023

arXiv  /  Paper  /  Project  /  Demo  /  Code

We reduce over-smoothed facial expressions using a learned discrete motion prior, which yields more vivid and expressive facial motions with more accurate lip movements.

Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields
Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia
Computer Vision and Pattern Recognition (CVPR), 2023

arXiv  /  Paper  /  Project  /  Video  /  Data  /  Code

We present a controllable scene stylization method utilizing radiance fields to stylize a 3D scene, with a single stylized 2D view taken as reference.

Scale-arbitrary Invertible Image Downscaling
Jinbo Xing*, Wenbo Hu*, Menghan Xia, Tien-Tsin Wong
IEEE Transactions on Image Processing (TIP), 2023

Paper  /  Project  /  Code  /  Interactive inspection

We present a scale-arbitrary invertible image downscaling network (AIDN) that natively downscales HR images with arbitrary scale factors. The HR images can then be restored with AIDN whenever necessary.

Flow-aware Synthesis: A Generic Motion Model for Video Frame Interpolation
Jinbo Xing*, Wenbo Hu*, Yuechen Zhang, Tien-Tsin Wong
Computational Visual Media (CVM), 2021

Paper

We present a flow-aware multi-frame interpolation method to address the complicated non-linear motions in the real world.

Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation
Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng (All authors contributed equally)
AAAI Conference on Artificial Intelligence Workshop Program (AAAIW), DSTC9, 2021

arXiv  /  Paper  /  Code  /  Paper Reading  /  Poster

We propose a system capable of accessing unstructured knowledge for task-oriented dialog.

Work Experiences
Adobe Research, Research Scientist Intern
Mentor: Dr. Long Mai
Jul. 2024 - Present
Tencent AI Lab, Research Intern
Mentor: Dr. Menghan Xia
Jun. 2022 - Jul. 2024
Stanley Ho Big Data Decision Analytics Research Centre, CUHK, Research Assistant
Supervisor: Prof. Helen Meng
May 2020 - Aug. 2021
Electronic Engineering Dept., CUHK, Summer Research Intern
Supervisor: Prof. Ni Zhao
May 2018 - Jul. 2018
Awards
  • Postgraduate Studentship

  • Distinguished Academic Performance Scholarship of MSc Programme

  • ELITE Stream Student Scholarship (2018, 2019)

  • Computer Science Scholarship

  • Dean's List of Faculty of Engineering (2016, 2019, 2020)

  • Full Scholarship for Undergraduate Students (Mainland)

Teaching
2023-2024   Spring   Computational Imaging and Vision (CSCI 3290)
2023-2024   Fall     Advanced GPU Programming (CSCI 5390)
2022-2023   Spring   Introduction to Multimedia Systems (CSCI 3280)
2021-2022   Spring   Introduction to Multimedia Systems (CSCI 3280)
2021-2022   Fall     Problem Solving by Programming (ENGG 1110)
2021-2022   Fall     Computer Game Software Production (CMSC 5727)
Academic Services
  • Journal Review:
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
    IEEE Transactions on Visualization and Computer Graphics (TVCG)
    International Journal of Computer Vision (IJCV)
    ACM Computing Surveys (CSUR)
    IEEE Transactions on Multimedia (TMM)
    The Visual Computer Journal (TVCJ)

  • Conference Review:
    ECCV 2024, MM 2024, SIGGRAPH Asia 2024
    ICLR 2025, CVPR 2025

Misc
    I am a keen amateur photographer.
    I also enjoy League of Legends and Apex Legends (best rank: Diamond).

Last updated: Nov. 2024
Web page design credit to Jon Barron