Jiaqi Leng (冷家祺)

I am a final-year undergraduate student in Computer Science and Technology at Fudan University.

My research focuses on efficient and generalizable language modeling, with an emphasis on machine learning systems and model architectures. I have been fortunate to work with Prof. Yucheng Lu. I also studied sparse attention mechanisms for large language models as a research intern at Ant Group. In Fall 2024, I was an exchange student at The University of Texas at Austin.

My research interests include:

  • Efficient deep learning and model architectures
  • Long-context modeling and length extrapolation
  • Sparse attention mechanisms

I welcome opportunities for academic collaboration and discussion.


News

Apr 2026 I attended ICLR 2026 in Brazil and presented our work on length-generalizable sparse attention. Poster PDF

Selected Publications

  1. ICLR 2026
    Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
    In Proceedings of the 14th International Conference on Learning Representations, 2026
  2. Preprint
    Distilling Token-Trained Models into Byte-Level Models
    Zishuo Bao, Jiaqi Leng, Junxiong Wang, Bowen Peng, and Yucheng Lu
    arXiv preprint, 2026
  3. NeurIPS 2025
    Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
    Xiang Hu, Jiaqi Leng, Jun Zhao, Kewei Tu, and Wei Wu
    In Proceedings of the 39th Conference on Neural Information Processing Systems, 2025