Publications

Authors marked with * are corresponding authors; those marked with † are co-first authors.

2025

  1. arXiv
    Prefill-Decode Aggregation or Disaggregation? Unifying Both for Goodput-Optimized LLM Serving
    Chao Wang, Pengfei Zuo*, Zhangyu Chen, Yunkai Liang, Zhou Yu, and Ming-Chang Yang
    arXiv preprint arXiv:2508.01989, 2025
  2. Technical Report
    Serving Large Language Models on Huawei CloudMatrix384
    Pengfei Zuo, Huimin Lin, Junbo Deng, Nan Zou, Xingkun Yang, Yingyu Diao, Weifeng Gao, Ke Xu, Zhangyu Chen, and 37 more authors
    arXiv preprint arXiv:2506.12708, 2025
  3. arXiv
    Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation
    Yunkai Liang, Zhangyu Chen, Pengfei Zuo*, Zhi Zhou*, Xu Chen, and Zhou Yu
    arXiv preprint arXiv:2503.20552, 2025
  4. arXiv
    Progressive Sparse Attention: Algorithm and System Co-design for Efficient Attention in LLM Serving
    Qihui Zhou, Peiqi Yin, Pengfei Zuo*, and James Cheng
    arXiv preprint arXiv:2503.00392, 2025
  5. arXiv
    Efficient Unified Caching for Accelerating Heterogeneous AI Workloads
    Tianze Wang, Yifei Liu, Chen Chen, Pengfei Zuo, Jiawei Zhang, Qizhen Weng, Yin Chen, Zhenhua Han, Jieru Zhao, and 2 more authors
    arXiv preprint arXiv:2506.12370, 2025
  6. AAAI
    AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference
    Zhuomin He, Yizhen Yao, Pengfei Zuo, Bin Gao, Qinya Li, Zhenzhe Zheng, and Fan Wu
    In Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025

2024

  1. USENIX ATC
    Cost‑Efficient Large Language Model Serving for Multi‑turn Conversations with CachedAttention
    Bin Gao, Zhuomin He, Puru Sharma, Qingxuan Kang, Djordje Jevdjic, Junbo Deng, Xingkun Yang, Zhou Yu, and Pengfei Zuo*
    In Proceedings of the 2024 USENIX Annual Technical Conference (USENIX ATC), 2024
  2. SOSP
    Aceso: Achieving Efficient Fault Tolerance in Memory‑Disaggregated Key‑Value Stores
    Zhisheng Hu, Pengfei Zuo*, Yizou Chen, Chao Wang, Junliang Hu, and Ming‑Chang Yang*
    In Proceedings of the 30th ACM Symposium on Operating Systems Principles (SOSP), 2024
  3. SOSP
    CHIME: A Cache‑Efficient and High‑Performance Hybrid Index on Disaggregated Memory
    Xuchuan Luo, Jiacheng Shen, Pengfei Zuo, Xin Wang, Michael R. Lyu, and Yangfan Zhou
    In Proceedings of the 30th ACM Symposium on Operating Systems Principles (SOSP), 2024
  4. TOS
    A Memory‑disaggregated Radix Tree
    Xuchuan Luo, Pengfei Zuo, Jiacheng Shen, Jiazhen Gu, Xin Wang, Michael R. Lyu, and Yangfan Zhou
    ACM Transactions on Storage (TOS), 2024

2023

  1. SOSP
    Ditto: An Elastic and Adaptive Memory‑Disaggregated Caching System
    Jiacheng Shen, Pengfei Zuo*, Xuchuan Luo, Yuxin Su, Jiazhen Gu, Hao Feng, Yangfan Zhou, and Michael R. Lyu
    In Proceedings of the 29th ACM Symposium on Operating Systems Principles (SOSP), 2023
  2. OSDI
    SMART: A High‑Performance Adaptive Radix Tree for Disaggregated Memory
    Xuchuan Luo, Pengfei Zuo*, Jiacheng Shen, Jiazhen Gu, Xin Wang, Michael R. Lyu, and Yangfan Zhou*
    In Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2023
  3. FAST
    ROLEX: A Scalable RDMA‑oriented Learned Key‑Value Store for Disaggregated Memory Systems
    Pengfei Li, Yu Hua, Pengfei Zuo, Zhangyu Chen, and Jiajie Sheng
    In Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023
  4. FAST
    FUSEE: A Fully Memory‑Disaggregated Key‑Value Store
    Jiacheng Shen, Pengfei Zuo*, Xuchuan Luo, Tianyi Yang, Yuxin Su, Yangfan Zhou, and Michael R. Lyu
    In Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023
  5. TPDS
    Enabling Efficient Erasure Coding in Disaggregated Memory Systems
    Qiliang Li, Liangliang Xu, Yongkun Li, Min Lyu, Wei Wang, Pengfei Zuo, and Yinlong Xu
    IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
  6. TKDE
    A Fast Learned Key‑Value Store for Concurrent and Distributed Systems
    Pengfei Li, Yu Hua, Jingnan Jia, and Pengfei Zuo
    IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
  7. JCST
    Reinvent Cloud Software Stacks for Resource Disaggregation
    Chenxi Wang, Yizhou Shan, Pengfei Zuo, and Huimin Cui
    Journal of Computer Science and Technology (JCST), 2023
  8. HotOS
    Skadi: Building a Distributed Runtime for Data Systems in Disaggregated Data Centers
    Cunchen Hu, Chenxi Wang, Sa Wang, Ninghui Sun, Yungang Bao, Jieru Zhao, Sanidhya Kashyap, Pengfei Zuo, Xusheng Chen, and 4 more authors
    In Proceedings of the Workshop on Hot Topics in Operating Systems (HotOS), 2023
  9. TOS
    A High‑Performance RDMA‑oriented Learned Key‑Value Store for Disaggregated Memory Systems
    Pengfei Li, Yu Hua, Pengfei Zuo, Zhangyu Chen, and Jiajie Sheng
    ACM Transactions on Storage (TOS), 2023
  10. TOS
    Localized Validation Accelerates Distributed Transactions on Disaggregated Persistent Memory
    Ming Zhang, Yu Hua, Pengfei Zuo, and Lurong Liu
    ACM Transactions on Storage (TOS), 2023

2022

  1. TACO
    Lock‑Free High‑Performance Hashing for Persistent Memory via PM‑Aware Holistic Optimization
    Zhangyu Chen, Yu Hua, Luochangqi Ding, Bo Ding, Pengfei Zuo, and Xue Liu
    ACM Transactions on Architecture and Code Optimization (TACO), 2022
  2. USENIX ATC
    uKharon: A Membership Service for Microsecond Applications
    Rachid Guerraoui, Antoine Murat, Javier Picorel, Athanasios Xygkis, Huabing Yan, and Pengfei Zuo
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2022
  3. TOS
    RACE: One‑Sided RDMA‑Conscious Extendible Hashing
    Pengfei Zuo, Qihui Zhou, Jiazhao Sun, Liu Yang, Shuangwu Zhang, Yu Hua, James Cheng, Rongfeng He, and Huabing Yan
    ACM Transactions on Storage (TOS), 2022
  4. FAST
    FORD: Fast One‑sided RDMA‑based Distributed Transactions for Disaggregated Persistent Memory
    Ming Zhang, Yu Hua, Pengfei Zuo, and Lurong Liu
    In Proceedings of the 20th USENIX Conference on File and Storage Technologies (FAST), 2022
  5. VLDB
    FINEdex: A Fine‑grained Learned Index Scheme for Scalable and Concurrent Memory Systems
    Pengfei Li, Yu Hua, Jingnan Jia, and Pengfei Zuo
    In Proceedings of the 48th International Conference on Very Large Data Bases (VLDB), 2022

2021

  1. DAC
    SEALing Neural Network Models in Encrypted Deep Learning Accelerators
    Pengfei Zuo, Yu Hua, Ling Liang, Xinfeng Xie, Xing Hu, and Yuan Xie
    In Proceedings of the 58th Design Automation Conference (DAC), 2021
  2. USENIX ATC
    One-sided RDMA-Conscious Extendible Hashing for Disaggregated Memory
    Pengfei Zuo, Jiazhao Sun, Liu Yang, Shuangwu Zhang, and Yu Hua
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2021
  3. TCAD
    Practical Deep Neural Network Attacks through Memory Trojaning
    Xing Hu, Yang Zhao, Lei Deng, Ling Liang, Pengfei Zuo, Jing Ye, Yingyan Lin, and Yuan Xie
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2021

2020

  1. USENIX ATC
    Lock-free Concurrent Level Hashing for Persistent Memory
    Zhangyu Chen, Yu Hua, Bo Ding, and Pengfei Zuo
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2020
  2. ASPLOS
    DeepSniffer: A DNN Model Extraction Framework Based on Learning Architectural Hints
    Xing Hu, Ling Liang, Shuangchen Li, Lei Deng, Pengfei Zuo, Yu Ji, Xinfeng Xie, Yufei Ding, Chang Liu, and 2 more authors
    In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2020
  3. DAC
    Reducing Bit Writes in Non-volatile Main Memory by Similarity-aware Compression
    Zhangyu Chen, Yu Hua, Pengfei Zuo, Yuanyuan Sun, and Yuncheng Guo
    In Proceedings of the 57th Design Automation Conference (DAC), 2020
  4. TCAD
    A Latency-optimized and Energy-efficient Write Scheme in NVM-based Main Memory
    Yuncheng Guo, Yu Hua, and Pengfei Zuo
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020
  5. ICPP
    An Efficient Wear-level Architecture using Self-adaptive Wear Leveling
    Jianming Huang, Yu Hua, Pengfei Zuo, Wen Zhou, and Fangting Huang
    In Proceedings of the 49th International Conference on Parallel Processing (ICPP), 2020

2019

  1. MICRO
    SuperMem: Enabling Application-transparent Secure Persistent Memory with Low Overheads
    Pengfei Zuo, Yu Hua, and Yuan Xie
    In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2019
  2. TOS
    Level Hashing: A High-performance and Flexible-resizing Persistent Hashing Index Structure
    Pengfei Zuo, Yu Hua, and Jie Wu
    ACM Transactions on Storage (TOS), 2019
  3. IEEE Micro
    Write Deduplication and Hash Mode Encryption for Secure Non-volatile Main Memory
    Pengfei Zuo, Yu Hua, Ming Zhao, Wen Zhou, and Yuncheng Guo
    IEEE Micro, 2019
  4. TPDS
    Bandwidth and Energy Efficient Image Sharing for Situation Awareness in Disasters
    Pengfei Zuo, Yu Hua, Yuanyuan Sun, Xue Liu, Jie Wu, Yuncheng Guo, Wen Xia, Shunde Cao, and Dan Feng
    IEEE Transactions on Parallel and Distributed Systems (TPDS), 2019
  5. TPDS
    Improving Restore Performance in Deduplication Systems via a Cost-Efficient Rewriting Scheme
    Jie Wu, Yu Hua, Pengfei Zuo, and Yuanyuan Sun
    IEEE Transactions on Parallel and Distributed Systems (TPDS), 2019

2018

  1. DATE
    DFPC: A Dynamic Frequent Pattern Compression Scheme in NVM-based Main Memory
    Yuncheng Guo, Yu Hua, and Pengfei Zuo
    In Proceedings of the 21st Design, Automation and Test in Europe Conference (DATE), 2018
  2. HotStorage
    SecPM: A Secure and Persistent Memory System for Non-volatile Memory
    Pengfei Zuo and Yu Hua
    In Proceedings of the 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage), 2018
  3. IPDPS
    Mitigating Traffic-based Side Channel Attacks in Bandwidth-efficient Cloud Storage
    Pengfei Zuo, Yu Hua, Cong Wang, Wen Xia, Shunde Cao, Yukun Zhou, and Yuanyuan Sun
    In Proceedings of the 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2018
  4. MICRO
    Improving the Performance and Endurance of Encrypted Non-volatile Main Memory through Deduplicating Writes
    Pengfei Zuo, Yu Hua, Ming Zhao, Wen Zhou, and Yuncheng Guo
    In Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2018
  5. OSDI
    Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory
    Pengfei Zuo, Yu Hua, and Jie Wu
    In Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2018
  6. TPDS
    A Write-friendly and Cache-optimized Hashing Scheme for Non-volatile Memory Systems
    Pengfei Zuo and Yu Hua
    IEEE Transactions on Parallel and Distributed Systems (TPDS), 2018

2017

  1. SoCC
    DLSH: A Distribution-aware LSH Scheme for Approximate Nearest Neighbor Query in Cloud Computing
    Yuanyuan Sun, Yu Hua, Shunde Cao, and Pengfei Zuo
    In Proceedings of the ACM Symposium on Cloud Computing (SoCC), 2017
  2. USENIX ATC
    SmartCuckoo: A Fast and Cost-Efficient Hashing Index Scheme for Cloud Storage Systems
    Yuanyuan Sun, Yu Hua, Song Jiang, Qiuyu Li, Shunde Cao, and Pengfei Zuo
    In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), 2017
  3. MSST
    A Write-friendly Hashing Scheme for Non-volatile Memory Systems
    Pengfei Zuo and Yu Hua
    In Proceedings of the 33rd International Conference on Massive Storage Systems and Technology (MSST), 2017
  4. MSST
    A Cost-efficient Rewriting Scheme to Improve Restore Performance in Deduplication Systems
    Jie Wu, Yu Hua, Pengfei Zuo, and Yuanyuan Sun
    In Proceedings of the 33rd International Conference on Massive Storage Systems and Technology (MSST), 2017
  5. ICDCS
    BEES: Bandwidth- and Energy-Efficient Image Sharing for Real-time Situation Awareness
    Pengfei Zuo, Yu Hua, Xue Liu, Dan Feng, Wen Xia, Shunde Cao, Jie Wu, Yuanyuan Sun, and Yuncheng Guo
    In Proceedings of the 37th International Conference on Distributed Computing Systems (ICDCS), 2017
  6. TPDS
    A Collision-Mitigation Cuckoo Hashing Scheme for Large-scale Storage Systems
    Yuanyuan Sun, Yu Hua, Dan Feng, Ling Yang, Pengfei Zuo, and Shunde Cao
    IEEE Transactions on Parallel and Distributed Systems (TPDS), 2017

2016

  1. ICPADS
    Increasing Lifetime and Security of Phase-Change Memory with Endurance Variation
    Wen Zhou, Dan Feng, Yu Hua, Jingning Liu, Fangting Huang, and Pengfei Zuo
    In Proceedings of the IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2016

2015

  1. MSST
    MinCounter: An Efficient Cuckoo Hashing Scheme for Cloud Storage Systems
    Yuanyuan Sun, Yu Hua, Dan Feng, Ling Yang, Pengfei Zuo, and Shunde Cao
    In Proceedings of the 31st International Conference on Massive Storage Systems and Technology (MSST), 2015