Publications of C4G @ HKUST

(* indicates equal contribution, # indicates corresponding author)



MaterialMVP: Illumination-Invariant Material Generation via Multi-view PBR Diffusion,
Zebin He, Mingxin Yang, Shuhui Yang, Yixuan Tang, Tao Wang, Kaihao Zhang, Guanying Chen, Yuhong Liu, Jie Jiang, Chunchao Guo, Wenhan Luo#,
Proc. of International Conference on Computer Vision (ICCV), Hawaii, USA, 2025.
[arXiv] [Project Page] [Code]

MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration,
Tao Wang, Peiwen Xia, Bo Li, Peng-Tao Jiang, Zhe Kong, Kaihao Zhang, Tong Lu, Wenhan Luo#,
Proc. of International Conference on Computer Vision (ICCV), Hawaii, USA, 2025.
[PDF] [Code]

DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution,
Zhe Kong, Le Li, Yong Zhang#, Feng Gao, Shaoshu Yang, Tao Wang, Kaihao Zhang, Zhuoliang Kang, Xiaoming Wei, Guanying Chen, Wenhan Luo#,
ACM SIGGRAPH, 2025.
[PDF] [Project Page] [Code]

StyleMaster: Stylize Your Video with Artistic Generation and Translation,
Zixuan Ye, Huijuan Huang#, Xintao Wang, Pengfei Wan, Di Zhang, Wenhan Luo#,
Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2025.
[arXiv] [Github] [Project Page]

Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling,
Jingyun Xue, Hongfa Wang, Qi Tian, Yue Ma, Andong Wang, Zhiyuan Zhao, Shaobo Min, Wenzhe Zhao, Kaihao Zhang, Heung-Yeung Shum, Wei Liu, Mengyang Liu, Wenhan Luo#,
International Conference on Learning Representations (ICLR), 2025.
[PDF] [Project Page] [API in Tencent Cloud]

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts,
Yunxin Li, Shenyuan Jiang, Baotian Hu, Longyue Wang, Wanqi Zhong, Wenhan Luo, Lin Ma, Min Zhang,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), to appear.
[arXiv] [Code] [Project Page] [Model]

APPTracker+: Displacement Uncertainty for Occlusion Handling in Low-Frame-Rate Multiple Object Tracking,
Tao Zhou, Qi Ye, Wenhan Luo, Haizhou Ran, Zhiguo Shi, Jiming Chen,
International Journal of Computer Vision (IJCV), vol. 133, pp. 2044–2069, 2025.
[PDF]

Dual Teacher Knowledge Distillation with Domain Alignment for Face Anti-spoofing,
Zhe Kong, Wentian Zhang, Tao Wang, Kaihao Zhang, Yuexiang Li, Xiaoying Tang, Wenhan Luo#,
IEEE Trans. on Circuits and Systems for Video Technology (TCSVT), vol. 34, pp. 13177-13189, 2024.
[PDF]

Blind Face Video Restoration with Temporal Consistent Generative Prior and Degradation-Aware Prompt,
Jingfan Tan, Hyunhee Park, Ying Zhang, Tao Wang, Kaihao Zhang, Xiangyu Kong, Pengwen Dai, Zikun Liu, Wenhan Luo#,
The 32nd ACM International Conference on Multimedia (ACM MM), 2024.
[PDF]

OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models,
Zhe Kong, Yong Zhang#, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, Guanying Chen, Wei Liu, Wenhan Luo#,
European Conference on Computer Vision (ECCV), 2024.
[PDF] [Code] [Project Page] [Hugging Face (OMG+LoRAs)] [Hugging Face (OMG+InstantID)]

Prompting Future Driven Diffusion Model for Hand Motion Prediction,
Bowen Tang, Kaihao Zhang#, Wenhan Luo#, Wei Liu, Hongdong Li,
European Conference on Computer Vision (ECCV), 2024.
[PDF]

GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions,
Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo#, Bjorn Stenger, Tong Lu, Tae-Kyun Kim, Wei Liu, Hongdong Li,
International Journal of Computer Vision (IJCV), vol. 132, pp. 4541-4563, 2024.
[PDF] [Code]

Blind Face Restoration for Under-Display Camera via Dictionary Guided Transformer,
Jingfan Tan, Xiaoxu Chen, Tao Wang, Kaihao Zhang, Wenhan Luo#, Xiaochun Cao,
IEEE Trans. on Circuits and Systems for Video Technology (TCSVT), vol. 34, pp. 4914-4927, 2024.
[PDF]

Punctuation-level Attack: Single-shot and Single Punctuation Can Fool Text Models,
Wenqiang Wang, Chongyang Du, Tao Wang, Kaihao Zhang, Wenhan Luo#, Lin Ma, Wei Liu, Xiaochun Cao,
Neural Information Processing Systems (NeurIPS), 2023.
[PDF]

Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing,
Zhe Kong, Wentian Zhang, Feng Liu, Wenhan Luo, Haozhe Liu, Linlin Shen, Raghavendra Ramachandra,
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), vol. 35, pp. 10639-10650, 2024.
[PDF] [Code]

Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method,
Tao Wang, Kaihao Zhang, Tianrun Shen, Wenhan Luo#, Bjorn Stenger, Tong Lu#,
Proc. of the Association for the Advancement of Artificial Intelligence (AAAI), USA, 2023. (Oral)
[PDF] [Code]

Multiple Object Tracking: A Literature Review,
Wenhan Luo, Junliang Xing, Anton Milan, Xiaoqin Zhang, Wei Liu, Tae-Kyun Kim,
Artificial Intelligence, vol. 293, pp. 103448, 2021. (Highly Cited Paper)
[PDF]

Face Anti-Spoofing: Model Matters, So Does Data,
Xiao Yang*, Wenhan Luo*, Linchao Bao, Yuan Gao, Dihong Gong, Shibao Zheng, Zhifeng Li, Wei Liu,
Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), USA, 2019.
[PDF]

End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning,
Wenhan Luo*, Peng Sun*, Fangwei Zhong*, Wei Liu, Tong Zhang, Yizhou Wang,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 42, pp. 1317-1332, 2020.
[arXiv] [Project Page] [Code] [Environment]

Trajectories as Topics: Multi-Object Tracking by Topic Discovery,
Wenhan Luo, Bjorn Stenger, Xiaowei Zhao, Tae-Kyun Kim,
IEEE Trans. on Image Processing (TIP), vol. 28, no. 1, pp. 240-252, 2019.
[PDF]

End-to-end Active Object Tracking via Reinforcement Learning,
Wenhan Luo*, Peng Sun*, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang,
International Conference on Machine Learning (ICML), Sweden, 2018.
[PDF] [Project Page] [Code] [Demo]

Automatic Topic Discovery for Multi-object Tracking,
Wenhan Luo, Bjorn Stenger, Xiaowei Zhao, Tae-Kyun Kim,
Proc. of the Association for the Advancement of Artificial Intelligence (AAAI), Austin, Texas, USA, 2015. (Oral)
[PDF]

Bi-label Propagation for Generic Multiple Object Tracking,
Wenhan Luo, Tae-Kyun Kim, Bjorn Stenger, Xiaowei Zhao, Roberto Cipolla,
Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, USA, 2014.
[PDF]

Generic Object Crowd Tracking by Multi-Task Learning,
Wenhan Luo, Tae-Kyun Kim,
Proc. of British Machine Vision Conference (BMVC), Bristol, UK, 2013.
[PDF]

Robust Visual Tracking via Transfer Learning,
Wenhan Luo, Xi Li, Wei Li, Weiming Hu,
IEEE International Conference on Image Processing (ICIP), 2011.

Efficient Block-division Model for Robust Multiple Object Tracking,
Wenhan Luo, Xiaoqin Zhang, Yang Liu, Xi Li, Weiming Hu, Wei Li,
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011.


Tech Report

UNIC: Unified In-Context Video Editing,
Zixuan Ye, Xuanhua He, Quande Liu, Qiulin Wang, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Qifeng Chen, Wenhan Luo,
arXiv:2506.04216.
[arXiv] [Project Page]

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation,
Zhe Kong, Feng Gao, Yong Zhang, Zhuoliang Kang, Xiaoming Wei, Xunliang Cai, Guanying Chen, Wenhan Luo,
arXiv:2505.22647.
[arXiv] [Project Page] [Code] [Hugging Face Model] [Gradio]

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models,
Yunxin Li, Zhenyu Liu, Zitao Li, Xuanyu Zhang, Zhenran Xu, Xinyu Chen, Haoyuan Shi, Shenyuan Jiang, Xintong Wang, Jifang Wang, Shouzheng Huang, Xinping Zhao, Borui Jiang, Lanqing Hong, Longyue Wang, Zhuotao Tian, Baoxing Huai, Wenhan Luo, Weihua Luo, Zheng Zhang, Baotian Hu, Min Zhang,
arXiv:2505.04921, 2025.
[arXiv] [Github]

Foundation Cures Personalization: Recovering Facial Personalized Models' Prompt Consistency,
Yiyang Cai, Zhengkai Jiang, Yulong Liu, Chunyang Jiang, Wei Xue, Wenhan Luo, Yike Guo,
arXiv:2411.15277, 2024.
[arXiv]

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal,
Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang,
arXiv:2402.02374, 2024.
[arXiv] [Code]

A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal,
Tao Wang, Kaihao Zhang, Xuanxi Chen, Wenhan Luo, Jiankang Deng, Tong Lu, Xiaochun Cao, Wei Liu, Hongdong Li, Stefanos Zafeiriou,
arXiv:2211.02831, 2022.
[arXiv] [Project Page]