Each paper has a paper link, a code/project link, and one-click copy for its citation and BibTeX.
Published in CVPR 2026, 2025
We propose STC, the first plug-and-play hierarchical token compression framework for streaming VideoLLMs...
@inproceedings{wang2026stc,
title={Accelerating Streaming Video Large Language Models via Hierarchical Token Compression},
author={Wang, Yiyu and Liu, Xuyang and Gui, Xiyan and Lin, Xinying and Yang, Boxue and Liao, Chenfei and Chen, Tailai and Zhang, Linfeng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2026}
}
Published in CVPR 2026, 2025
We propose V2Drop, a variation-aware method that identifies and progressively drops lazy tokens...
@inproceedings{chen2026v2drop,
title={Variation-aware Vision Token Dropping for Faster Large Vision-Language Models},
author={Chen, Junjie and Liu, Xuyang and Wen, Zichen and Wang, Yiyu and Huang, Siteng and Chen, Honggang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2026}
}
Published in EMNLP 2025 Main Conference, 2025
We propose VidCom2, a plug-and-play inference acceleration framework for VideoLLMs...
@inproceedings{liu2025vidcom2,
title={Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models},
author={Liu, Xuyang and Wang, Yiyu and Ma, Junpeng and Zhang, Linfeng},
booktitle={Proceedings of the Conference on Empirical Methods in Natural Language Processing},
year={2025}
}
Published in ACL 2026 Main Track (Under Review), 2025