Ecco: Improving Memory Bandwidth and Capacity for LLMs via Entropy-aware Cache Compression
Published in International Symposium on Computer Architecture (ISCA), 2025
Recommended citation: Cheng, F., Guo, C., Wei, C., Zhang, J., Zhou, C., Hanson, E., ... & Chen, Y. (2025). Ecco: Improving Memory Bandwidth and Capacity for LLMs via Entropy-aware Cache Compression. arXiv preprint arXiv:2505.06901.