Junyao Hu (胡钧耀)

Last updated on: 2024-05-13 10:00


My photo and WeChat

👋 Hi! My name is Junyao Hu (胡钧耀). I’m a first-year PhD student at Nankai University (南开大学), advised by Professor Jufeng Yang (杨巨峰) in the Computer Vision Lab (计算机视觉实验室).

🔍 My research interests include deep learning and computer vision, particularly focusing on:

  • 🤔 Image sentiment analysis: image emotion label classification, ranking, and distribution learning.
  • 🏃 Video understanding: video prediction and action recognition.
  • 🔮 Visual generative AI: applications of image/video diffusion models.
  • 💞 Interdisciplinary research with psychology: early screening of autism spectrum disorder.

🥰 You can contact me in the following ways: Email / GitHub / WeChat (ID: LittleDream_hjy; the QR code is in the picture above). Please feel free to make suggestions. Any questions or inquiries about my work and study life are welcome.

🎞️ I run a Chinese-language self-media channel to practice my presentation skills, share my research and life experience, and bring useful knowledge to everyone. Updates may not be frequent, but I will strive for quality. You can find me on Bilibili (@-胡椒椒椒) and on my WeChat Official Account (胡小乐杂货铺).

📃 More details are shown on my CV page.


2024-02-27 😋 Accepted A paper was accepted to CVPR 2024.

2023-09-01 ✒️ Study I started my Ph.D. studies at Nankai University under the supervision of Prof. Jufeng Yang.

2023-07-15 ✒️ Study I finished my undergraduate studies at China University of Mining and Technology. Thanks to all the teachers and friends around me, and especially to my parents!

2023-06-10 💼 Activity I attended the VALSE 2023 conference in Wuxi, China.

Selected Publications

If you want to view all my publications, click here.

Note: # = Equal Contribution, * = Corresponding Author.

CVPR24 ExtDM: Distribution extrapolation diffusion model for video prediction
Zhicheng Zhang#, Junyao Hu#, Wentao Cheng*, Danda Paudel, Jufeng Yang

TL;DR: We present ExtDM, a new diffusion model that extrapolates video content from current frames by accurately modeling distribution shifts towards future frames.

📃 Paper 📃 中译版 📦 Code ⚒️ Project 📊 Poster 📅 Slide 🎞️ Video (Bilibili)

Abstract: Video prediction is a challenging task due to its inherent uncertainty, especially when forecasting over a long period. To model the temporal dynamics, advanced methods benefit from the recent success of diffusion models, and repeatedly refine the predicted future frames with a 3D spatiotemporal U-Net. However, there exists a gap between the present and the future, and the repeated usage of the U-Net brings a heavy computation burden. To address this, we propose a diffusion-based video prediction method that predicts future frames by extrapolating the present distribution of features, namely ExtDM. Specifically, our method consists of three components: (i) a motion autoencoder conducts a bijective transformation between video frames and motion cues; (ii) a layered distribution adaptor module extrapolates the present features under the guidance of a Gaussian distribution; (iii) a 3D U-Net architecture specialized for jointly fusing guidance and features along the temporal dimension via spatiotemporal-window attention. Extensive experiments on five popular benchmarks covering short- and long-term video prediction verify the effectiveness of ExtDM.
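For readers unfamiliar with how diffusion models "repeatedly refine" predictions, here is a minimal toy sketch of the generic reverse-diffusion loop that such methods build on. This is *not* the ExtDM implementation: `toy_denoiser` is a hypothetical stand-in for the learned 3D U-Net, and the shapes and step counts are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(x, t):
    # Hypothetical stand-in for a learned denoising network (e.g. a 3D U-Net):
    # here it simply shrinks the sample toward zero, imitating noise removal.
    return 0.1 * x

def reverse_diffusion(x_T, steps=50):
    """Generic DDPM-style refinement loop: start from Gaussian noise and
    iteratively denoise. A real video predictor would additionally condition
    each step on features extracted from the observed frames."""
    x = x_T
    for t in reversed(range(steps)):
        eps_hat = toy_denoiser(x, t)   # predicted noise at step t
        x = x - eps_hat                # crude denoising update
        if t > 0:
            # small stochastic perturbation, as in ancestral sampling
            x = x + 0.01 * rng.standard_normal(x.shape)
    return x

# "Video" here is just a (frames, height, width) array of noise.
sample = reverse_diffusion(rng.standard_normal((4, 8, 8)))
print(sample.shape)  # (4, 8, 8)
```

The point of the sketch is the cost structure the abstract mentions: the denoiser is invoked once per step, so 50 steps means 50 network forward passes, which is why reducing this burden matters for video-scale models.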

title={ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction},
author={Zhang, Zhicheng and Hu, Junyao and Cheng, Wentao and Paudel, Danda and Yang, Jufeng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2024},

😅 I’m still working… I can still learn… Zzz… 😴


Academic Services


  • Conference: CVPR’24, ACMMM’23
  • Journal: TMM’23