Posts by Collection

portfolio

publications

Position: Shifting AI Efficiency From Model-Centric to Data-Centric Compression

Submitted to ICML 2026 Position Track (under review), 2025

A position paper arguing that the focus of efficient AI research is shifting from model-centric compression to data-centric compression, with a systematic review of token compression methods.

Cite: Xuyang Liu, Zichen Wen, Shaobo Wang, Junjie Chen, Zhishan Tao, Yubo Wang, Xiangqi Jin, Chang Zou, Yiyu Wang, Chenfei Liao, Xu Zheng, Honggang Chen, Weijia Li, Xuming Hu, Conghui He, and Linfeng Zhang. (2026). "Shifting AI Efficiency From Model-Centric to Data-Centric Compression." arXiv preprint arXiv:2505.19147.
@article{liu2026shifting,
  title={Shifting AI Efficiency From Model-Centric to Data-Centric Compression},
  author={Liu, Xuyang and Wen, Zichen and Wang, Shaobo and Chen, Junjie and Tao, Zhishan and Wang, Yubo and Jin, Xiangqi and Zou, Chang and Wang, Yiyu and Liao, Chenfei and Zheng, Xu and Chen, Honggang and Li, Weijia and Hu, Xuming and He, Conghui and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2505.19147},
  year={2026}
}

Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models

Published in EMNLP 2025 Main Conference, 2025

We propose VidCom2, a plug-and-play inference acceleration framework for VideoLLMs that adaptively adjusts compression intensity across frames, effectively preserving essential information while reducing redundancy in video sequences.

Cite: Xuyang Liu, Yiyu Wang, Junpeng Ma, and Linfeng Zhang. (2025). "Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models." Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
@inproceedings{liu2025vidcom2,
  title={Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models},
  author={Liu, Xuyang and Wang, Yiyu and Ma, Junpeng and Zhang, Linfeng},
  booktitle={Proceedings of the Conference on Empirical Methods in Natural Language Processing},
  year={2025}
}

Variation-aware Vision Token Dropping for Faster Large Vision-Language Models

Published in CVPR 2026, 2025

We propose V2Drop, a variation-aware method that identifies and progressively drops lazy tokens based on their intrinsic behavioral patterns, eliminating positional bias while maintaining compatibility with efficient operators.

Cite: Junjie Chen, Xuyang Liu, Zichen Wen, Yiyu Wang, Siteng Huang, and Honggang Chen. (2026). "Variation-aware Vision Token Dropping for Faster Large Vision-Language Models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
@inproceedings{chen2026v2drop,
  title={Variation-aware Vision Token Dropping for Faster Large Vision-Language Models},
  author={Chen, Junjie and Liu, Xuyang and Wen, Zichen and Wang, Yiyu and Huang, Siteng and Chen, Honggang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}

VTC-Bench: Are We Using the Right Benchmark? An Evaluation Framework for Visual Token Compression Methods

Submitted to ACL 2026 (under review), 2025

We propose VTC-Bench, the first comprehensive evaluation framework for visual token compression methods across image and video understanding tasks, revealing critical insights about current benchmarks.

Cite: Chenfei Liao, Wensong Wang, Zichen Wen, Xu Zheng, Yiyu Wang, Haocong He, Yuanhuiyi Lyu, Lutao Jiang, Xin Zou, Yuqian Fu, Bin Ren, Linfeng Zhang, and Xuming Hu. (2026). "Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods." Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL).
@inproceedings{liao2026vtc,
  title={Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods},
  author={Liao, Chenfei and Wang, Wensong and Wen, Zichen and Zheng, Xu and Wang, Yiyu and He, Haocong and Lyu, Yuanhuiyi and Jiang, Lutao and Zou, Xin and Fu, Yuqian and Ren, Bin and Zhang, Linfeng and Hu, Xuming},
  booktitle={Proceedings of the Annual Meeting of the Association for Computational Linguistics},
  year={2026}
}

AI for Service: Proactive Assistance with AI Glasses

Published in arXiv Technical Report, 2025

We propose Alpha-Service, a proactive assistance system with AI glasses that enables real-time, context-aware service through multimodal perception and agentic decision making.

Cite: Zichen Wen, Yiyu Wang, Chenfei Liao, Boxue Yang, Junxian Li, Weifeng Liu, Haocong He, Bolong Feng, Xuyang Liu, Yuanhuiyi Lyu, et al. (2025). "AI for Service: Proactive Assistance with AI Glasses." arXiv preprint arXiv:2510.14359.
@article{wen2025aiforservice,
  title={AI for Service: Proactive Assistance with AI Glasses},
  author={Wen, Zichen and Wang, Yiyu and Liao, Chenfei and Yang, Boxue and Li, Junxian and Liu, Weifeng and He, Haocong and Feng, Bolong and Liu, Xuyang and Lyu, Yuanhuiyi and others},
  journal={arXiv preprint arXiv:2510.14359},
  year={2025}
}

Accelerating Streaming Video Large Language Models via Hierarchical Token Compression

Published in CVPR 2026, 2025

We propose STC, the first plug-and-play hierarchical token compression framework for streaming VideoLLMs, optimizing both ViT encoding and LLM pre-filling stages to accelerate real-time video understanding.

Cite: Yiyu Wang, Xuyang Liu, Xiyan Gui, Xinying Lin, Boxue Yang, Chenfei Liao, Tailai Chen, and Linfeng Zhang. (2026). "Accelerating Streaming Video Large Language Models via Hierarchical Token Compression." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
@inproceedings{wang2026stc,
  title={Accelerating Streaming Video Large Language Models via Hierarchical Token Compression},
  author={Wang, Yiyu and Liu, Xuyang and Gui, Xiyan and Lin, Xinying and Yang, Boxue and Liao, Chenfei and Chen, Tailai and Zhang, Linfeng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}

Bridging Visual Representation and Reinforcement Learning from Verifiable Rewards in Large Vision-Language Models

Submitted to ECCV 2026 (under review), 2026

We propose KAWHI, a plug-and-play reward reweighting mechanism that explicitly incorporates structured visual information into uniform reward policy optimization methods for LVLMs.

Cite: Yuhang Han, Yuyang Wu, Zhengbo Jiao, Yiyu Wang, Xuyang Liu, Shaobo Wang, Hanlin Xu, Xuming Hu, and Linfeng Zhang. (2026). "Bridging Visual Representation and Reinforcement Learning from Verifiable Rewards in Large Vision-Language Models." arXiv preprint arXiv:2603.27375.
@article{han2026kawhi,
  title={Bridging Visual Representation and Reinforcement Learning from Verifiable Rewards in Large Vision-Language Models},
  author={Han, Yuhang and Wu, Yuyang and Jiao, Zhengbo and Wang, Yiyu and Liu, Xuyang and Wang, Shaobo and Xu, Hanlin and Hu, Xuming and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2603.27375},
  year={2026}
}

V-CAST: Video Curvature-Aware Spatio-Temporal Pruning for Efficient Video Large Language Models

Submitted to ECCV 2026 (under review), 2026

We propose V-CAST, a training-free plug-and-play pruning policy for long-context video inference that casts token compression as a trajectory approximation problem with curvature-guided temporal allocation.

Cite: Xinying Lin, Xuyang Liu, Yiyu Wang, Teng Ma, and Wenqi Ren. (2026). "V-CAST: Video Curvature-Aware Spatio-Temporal Pruning for Efficient Video Large Language Models." arXiv preprint arXiv:2603.27650.
@article{lin2026vcast,
  title={V-CAST: Video Curvature-Aware Spatio-Temporal Pruning for Efficient Video Large Language Models},
  author={Lin, Xinying and Liu, Xuyang and Wang, Yiyu and Ma, Teng and Ren, Wenqi},
  journal={arXiv preprint arXiv:2603.27650},
  year={2026}
}

talks

teaching
