Xuyang Liu xuyang-liu16

🌈 I am Xuyang Liu (刘旭洋), a third-year Master's student at Sichuan University and an incoming PhD student at PolyU, supervised by Prof. Lei Zhang (IEEE Fellow). I am currently a research intern at OPPO Research Institute. Previously, I interned at Ant Group, focusing on GUI agents, and at Taobao & Tmall Group, working on efficient VLMs. I also spent half a year visiting MiLAB at Westlake University, supervised by Prof. Donglin Wang. I am fortunate to work closely with Dr. Siteng Huang from DAMO Academy and Prof. Linfeng Zhang from SJTU.

📌 My research centers on efficient Large Vision-Language Models (LVLMs), including:

🖼️ Image Understanding: high-resolution understanding via context compression and fast decoding, including GlobalCom² [AAAI'26], V²Drop [CVPR'26], FiCoCo [AAAI'26], and MixKV [ICLR'26].
🎬 Video Understanding: long-video, audio-video, and streaming reasoning via efficient encoding and compression, including VidCom² [EMNLP'25], STC [CVPR'26], V-CAST, and OmniSIFT.
🎨 Content Generation: lightweight and efficient AIGC via feature caching, pruning, and fast decoding, including ToCa [ICLR'25], Flash-Unified [CVPR'26 Findings], and STDec.
⚙️ Efficiency Toolbox: efficient transfer/fine-tuning and benchmarking for downstream task adaptation, including M2IST [TCSVT'25], V-PETL [NeurIPS'24], and AutoGnothi [ICLR'25].

📢 If you find these directions interesting, feel free to reach out via email: liuxuyang@stu.scu.edu.cn.
