WindowQuant: Mixed-Precision KV Cache Quantization based on Window-Level Similarity for VLMs Inference Optimization
Wei Tao, Xiaoyang Qu, Peiqiang Wang, Guokuan Li, Jiguang Wan, Kai Lu, Jianzong Wang
September 2026
Abstract
TBD
Type
2
Publication
In Transactions on Architecture and Code Optimization
VLM
Jianzong Wang
Honorary Director