👋 About me

Curriculum Vitae • Zhihu • Gmail • Email@edu

😄 I am a third-year graduate student in SVIP-Lab at ShanghaiTech University, supervised by Prof. Shenghua Gao. Before that, I received my Bachelor’s degree from Dalian University of Technology in 2020. My research interests lie in video understanding and weakly supervised learning, including human activity recognition and video representation learning. I am also working on multi-modal learning. For more details, please refer to my CV.

🎉 News:

2024-02-17: A paper on a multimodal large language model for tool agents. The code and dataset have been released.
2023-02-28: A paper on weakly supervised video representation learning was accepted to CVPR 2023.
2022-03-28: A paper on repetitive action counting was accepted to CVPR 2022 as an Oral.

📄 Selected Publications:

(CVPR 2023) Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos. [paper] [code]
(CVPR 2022 Oral) TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting. [paper] [code] [dataset] [Youtube] [Bilibili]

👯 I’m looking to inject some fun and creativity into my work week, and I’m guessing you are too. If you’re keen to join forces and make something awesome happen, shoot me a message and let’s make it a reality!

📧 Reach me: ironieser@gmail.com, dongsx@shanghaitech.edu.cn

Sixun Dong