About Chiyue Wei

Chiyue Wei is a Ph.D. student in Electrical and Computer Engineering at Duke University, working under the supervision of Professor Yiran Chen. His research interests lie at the intersection of computer architecture and deep learning. Prior to Duke, he earned his Bachelor's degree in Electronic Engineering from Tsinghua University in 2023, where he conducted research with Professor Yuan Xie and Professor Yu Wang.

🔥 News

  • [2025/08] Check out our work DPad, a training-free acceleration method for Diffusion LLMs, now available on arXiv!
  • [2025/08] Wrapped up my internship at NVIDIA, where I worked on the FlashInfer project. I developed high-performance and customizable attention kernels with CuTe DSL, optimized for Blackwell GPUs.
  • [2025/06] Honored to be named a DAC 2025 Young Fellow.
  • [2025/06] Excited that our works Phi, Transitive Array, and Ecco were presented at ISCA 2025. Check out the slides for Phi.
  • [2025/05] 🎉 I'm excited to start my summer internship at NVIDIA, focusing on LLM inference framework optimization within the Deep Learning Frameworks team.

  • [2025/03] 🎉🎉🎉 Three papers accepted by ISCA 2025! Topics include acceleration for Spiking Neural Networks, General Matrix Multiplications, and Large Language Models.
  • [2025/03] 🔥 I presented Prosperity at HPCA 2025 in Las Vegas! Check out the presentation slides and video.
  • [2024/11] 🎉 Our paper "Prosperity: Accelerating Spiking Neural Networks via Product Sparsity" is accepted by HPCA 2025.

๐Ÿ“ Selected Publications