Jinbo Xing

I am currently a third-year Ph.D. student in the CSE Department, CUHK, supervised by Prof. Tien-Tsin WONG.

Previously, I received my B.Sc. (first-class honours, ELITE Stream) and M.Sc. degrees in Computer Science from CUHK in 2020 and 2021, respectively.

I'm focusing on controllable text-to-video & image-to-video generation.

I will graduate in Fall 2025 and am looking for a full-time position in AIGC video generation starting then. Drop me an email if you are recruiting!

Email  /  Google Scholar  /  Github  /  ResearchGate  /  LinkedIn  /  CV

Research Works
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan
arXiv preprint, 2023

arXiv  /  Project  /  Code  /  Demo

An open-domain image animation method based on video diffusion priors.

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter
Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan
arXiv preprint, 2023

arXiv  /  Project  /  Code  /  Demo

Given a stylized image as reference, the model can generate images and videos with consistent styles based on text prompt.

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan
arXiv preprint, 2023

arXiv  /  Project  /  Code

An open-source foundational text-to-video and image-to-video diffusion model for high-quality video generation.

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024

arXiv  /  Project  /  Code

Given a text description and video structure (depth), our approach can generate temporally coherent, high-fidelity videos. Its applications include dynamic 3D-scene-to-video creation, real-life-scene-to-video creation, and video re-rendering.

RIVAL: Real-World Image Variation by Aligning Diffusion Inversion Chain
Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
Neural Information Processing Systems (NeurIPS spotlight), 2023

arXiv  /  Project  /  Code

Given a real-world image, our proposed method can generate semantically aligned variants of it. The approach is compatible with customized image generation, inpainting, controllable image synthesis, style transfer, etc.

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen
arXiv preprint, 2023

arXiv  /  Project  /  Code

We utilize the abundance of existing video clips and synthesize a coherent storytelling video by customizing their appearances with given visual concepts.

Video Colorization with Pre-trained Text-to-Image Diffusion Models
Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
arXiv preprint, 2023

arXiv  /  Project  /  Code

We present an effective video colorization method that adapts a pre-trained text-to-image diffusion model.

Improved Diffusion-based Image Colorization via Piggybacked Models
Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong
arXiv preprint, 2023

arXiv  /  Project  /  Code

We present a novel image colorization method that leverages the rich prior of diffusion models. It takes text-based image colorization to another semantic level.

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
Computer Vision and Pattern Recognition (CVPR), 2023

arXiv  /  Paper  /  Project  /  Code  /  Online demo

We reduce over-smoothed facial expressions using a learned discrete motion prior, which enables more dramatic and expressive facial motions with more accurate lip movements.

Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields
Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia
Computer Vision and Pattern Recognition (CVPR), 2023

arXiv  /  Paper  /  Project  /  Code  /  Video  /  Data

We present a controllable scene stylization method utilizing radiance fields to stylize a 3D scene, with a single stylized 2D view taken as reference.

Scale-arbitrary Invertible Image Downscaling
Jinbo Xing*, Wenbo Hu*, Menghan Xia, Tien-Tsin Wong
IEEE Transactions on Image Processing (TIP), 2023

Paper  /  Project  /  Code  /  Interactive inspection

We present a scale-arbitrary invertible image downscaling network (AIDN) to natively downscale HR images with arbitrary scale factors. The HR images can then be restored with AIDN whenever necessary.

Flow-aware Synthesis: A Generic Motion Model for Video Frame Interpolation
Jinbo Xing*, Wenbo Hu*, Yuechen Zhang, Tien-Tsin Wong
Computational Visual Media (CVM), 2021

Paper

We present a flow-aware multi-frame interpolation method to address the complicated non-linear motions in the real world.

Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation
Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng (all authors contributed equally)
AAAI Conference on Artificial Intelligence Workshop Program (AAAIW), DSTC9, 2021

arXiv  /  Paper  /  Code  /  Paper Reading  /  Poster

We propose a system capable of accessing unstructured knowledge for task-oriented dialog.

Work Experiences
Tencent AI Lab, Tencent TEG, Research Intern
Mentor: Dr. Menghan Xia
Jun. 2022 - Present
Stanley Ho Big Data Decision Analytics Research Centre, CUHK, Research Assistant
Supervisor: Prof. Helen Meng
May 2020 - Aug. 2021
Electronic Engineering Dept., CUHK, Summer Research Intern
Supervisor: Prof. Ni Zhao
May 2018 - Jul. 2018
Awards
  • Postgraduate Studentship

  • Distinguished Academic Performance Scholarship of MSc Programme

  • ELITE Stream Student Scholarship (2018, 2019)

  • Computer Science Scholarship

  • Dean's List of Faculty of Engineering (2016, 2019, 2020)

  • Full Scholarship for Undergraduate Students (Mainland)

Teaching
2023-2024   Spring    Computational Imaging and Vision (CSCI 3290)
2023-2024   Fall      Advanced GPU Programming (CSCI 5390)
2022-2023   Spring    Introduction to Multimedia Systems (CSCI 3280)
2021-2022   Spring    Introduction to Multimedia Systems (CSCI 3280)
2021-2022   Fall      Problem Solving by Programming (ENGG 1110)
2021-2022   Fall      Computer Game Software Production (CMSC 5727)
Academic Services
  • Journal Review:
    The Visual Computer Journal (TVCJ)
    IEEE Transactions on Visualization and Computer Graphics (TVCG)
    International Journal of Computer Vision (IJCV)
    ACM Computing Surveys (CSUR)

  • Conference Review:
    ECCV 2024

Misc
    I am a keen amateur photographer.
    I also enjoy League of Legends and Apex Legends (best rank: Diamond).

Last updated: April 2024
Web page design credit to Jon Barron