Announcement_11

Our paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention” has been accepted to USENIX ATC 2024. Congratulations to Bin Gao!