Wei Zhai (翟伟)
I am an Associate Researcher at the Department of Automation, University of Science and Technology of China (USTC).
I earned my Ph.D. from USTC in 2022, where I was advised by Prof. Zheng-Jun Zha and Prof. Yang Cao.
Prior to that, I received my B.S. degree from Southwest Jiaotong University in 2017. I was fortunate to receive the AAAI 2023 Distinguished Paper Award and the ACM MM 2025 MSMA Workshop Best Student Paper Award.
My research interests lie at the intersection of Embodied Perception and Interaction, Multimodal Reasoning and Generation, and Neuromorphic Vision.
News
03/2026
1 paper accepted by T-PAMI.
02/2026
7 papers accepted by CVPR 2026.
1 Highlight
01/2026
3 papers accepted by ICLR 2026.
11/2025
1 paper accepted by AAAI 2026.
10/2025
1 paper accepted by T-NNLS.
09/2025
3 papers accepted by NeurIPS 2025.
1 Spotlight
09/2025
1 paper accepted by ACM MM MSMA Workshop.
Best Student Paper
06/2025
4 papers accepted by ICCV 2025.
06/2025
Won 1st place in the Efficient Event-based Eye-Tracking Challenge.
06/2025
Won 1st place in the Body Contact Estimation Challenge (RHOBIN2025 CVPR).
05/2025
1 paper accepted by SCIENCE CHINA Information Sciences.
02/2025
5 papers accepted by CVPR 2025.
1 Highlight
Publications
Pre-Print
End-to-End Spatial-Temporal Transformer for Real-time 4D HOI Reconstruction
arXiv
An end-to-end Spatial-Temporal Transformer that achieves real-time (31.5 FPS), physically plausible 4D human-object reconstruction from a single video without test-time optimization.
Visual-Geometric Collaborative Guidance for Affordance Learning
arXiv
Journal version of "Leverage Interactive Affinity for Affordance Learning" (CVPR 2023)
2026
SkyFind: A Large-Scale Benchmark Unveiling Referring Expression Comprehension for UAV
In IEEE T-PAMI
Establishes the first formal definition of UAV-based REC and provides SkyFind—a large-scale dataset with one million high-quality pairs—to address the unique challenges of aerial target localization.
Event-based Visual Deformation Measurement
In CVPR 2026
(Highlight)
An event-frame fusion framework featuring Affine Invariant Simplicial modeling that achieves robust long-term dense deformation tracking while using only 18.9% of the resources required by high-speed video methods.
Gloria: Consistent Character Video Generation via Content Anchors
In CVPR 2026
Proposes a compact anchor-frame representation for generating high-quality character videos longer than 10 minutes, with superior identity preservation and multi-view appearance consistency.
TOUCH: Text-guided Controllable Generation of Free-Form Hand-Object Interactions
In ICLR 2026
Extends HOI generation from fixed grasping to diverse free-form interactions (e.g., pushing, poking) by introducing the large-scale WildO2 dataset and the TOUCH diffusion-based framework for fine-grained semantic control.
2025
EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting
In NeurIPS 2025
(Spotlight)
Learning Object Affordance Ranking with Task Context
In ACM MM 2025 MSMA Workshop
(Best Student Paper)
PEAR: Phrase-Based Hand-Object Interaction Anticipation
In SCIENCE CHINA Information Sciences (SCIS)
BRAT: Bidirectional Relative Positional Attention Transformer for Event-based Eye tracking
In CVPR 2025 Workshop
(1st Place Challenge)
Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning
In CVPR 2025
(Highlight)
Likelihood-Aware Semantic Alignment for Full-Spectrum Out-of-Distribution Detection
In Journal of Intelligent Computing and Networking
2024
MambaPupil: Bidirectional Selective Recurrent Model for Event-based Eye Tracking
In CVPR 2024 Workshop
(1st Place Challenge)
2023
Grounded Affordance from Exocentric View
In International Journal of Computer Vision (IJCV)
Journal version of "Learning Affordance Grounding from Exocentric Images" (CVPR 2022)
On Exploring Multiplicity of Primitives and Attributes for Texture Recognition in the Wild
In IEEE T-PAMI
Journal version of MPAP (ICCV 2019) and DSR-Net (CVPR 2020)
Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation
In International Journal of Computer Vision (IJCV)
Journal version of BAS (CVPR 2022)
Robustness Benchmark for Unsupervised Anomaly Detection Models
In Journal of University of Science and Technology of China (JUSTC)
Exploring Tuning Characteristics of Ventral Stream's Neurons for Few-Shot Image Classification
In AAAI 2023
(Oral, Distinguished Paper)
2022
One-Shot Affordance Detection in the Wild
In International Journal of Computer Vision (IJCV)
Journal version of "One-Shot Affordance Detection" (IJCAI 2021)
2021
A Tri-Attention Enhanced Graph Convolutional Network for Skeleton-Based Action Recognition
In IET Computer Vision (IET-CV 2021)
2020
One-Shot Texture Retrieval Using Global Grouping Metric
In IEEE T-MM 2020
Journal version of "One-Shot Texture Retrieval with Global Context Metric" (IJCAI 2019)
2019
PixTextGAN: Structure Aware Text Image Synthesis for License Plate Recognition
In IET Image Processing (IET-IP 2019)
Experience
Jul 2024 - Present
Associate Researcher
Jul 2022 - Jun 2024
Postdoctoral Researcher
Sep 2017 - Jun 2022
Ph.D. in Cyberspace Security
Dec 2020 - Sep 2021
Sep 2013 - Jun 2017
B.S. in Computer Science
Outstanding Graduate of Southwest Jiaotong University (2017)
Professional Activities
Conference Reviewer
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- IEEE International Conference on Computer Vision (ICCV)
- European Conference on Computer Vision (ECCV)
- Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- International Conference on Machine Learning (ICML)
- AAAI Conference on Artificial Intelligence (AAAI)
- ACM Multimedia (ACM MM)
- International Joint Conferences on AI (IJCAI)
Journal Reviewer
- IEEE Trans. on Pattern Analysis and Machine Intelligence (T-PAMI)
- International Journal of Computer Vision (IJCV)
- IEEE Transactions on Image Processing (T-IP)
- IEEE Transactions on Neural Networks and Learning Systems (T-NNLS)
- IEEE Transactions on Multimedia (T-MM)
- IEEE Trans. on Circuits and Systems for Video Technology (T-CSVT)
- Pattern Recognition (PR)
- ACM Trans. on Multimedia Computing, Comm., and Appl. (ToMM)
Awards and Honors
2025
ACM MM MSMA Workshop Best Student Paper
Best Paper
2025
1st Place, Body Contact Estimation Challenge (RHOBIN2025 CVPR Workshop)
Champion
2025
1st Place, Efficient Event-based Eye-Tracking Challenge (CVPR Workshop)
Champion
2024
1st Place, Event-based Eye Tracking Task (AIS2024 CVPR Workshop)
Champion
2024
2nd Place, 3D Contact Estimation Challenge (RHOBIN2024 CVPR Workshop)
Runner-up
2024
2nd Place, NTIRE 2024 Efficient Super-Resolution Challenge
Runner-up
2023
AAAI Distinguished Paper Award
Distinguished
2021
Outstanding Internship at JD Explore Academy
2019
National Scholarship (University of Science and Technology of China)
2017
Outstanding Graduate of Southwest Jiaotong University
2016
National Scholarship (Southwest Jiaotong University)
Teaching
Autumn 2024
Computer Vision, USTC
Autumn 2025
Computer Vision, USTC
Spring 2026
Deep Learning, USTC
Teaching Assistant
Autumn 2020
Computer Vision, USTC
Autumn 2019
Image Processing, USTC