Jiaqi Leng (冷家祺)


I am currently completing my undergraduate studies in Computer Science and Technology at Fudan University.

Previously, I was fortunate to work with Prof. Yucheng Lu. I also worked as a research intern at Ant Group, where I studied sparse attention mechanisms for large language models.

My research interests include:

  • Efficient deep learning and model architectures
  • Long-context modeling and length extrapolation
  • Sparse attention mechanisms

In Fall 2024, I was an exchange student at The University of Texas at Austin.

I welcome opportunities for academic collaboration and discussion.


News

Apr 2026 I attended ICLR 2026 in Brazil and presented our work on length-generalizable sparse attention. Poster PDF

Selected Publications

  1. ICLR 2026
    Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
    In Proceedings of the 14th International Conference on Learning Representations, 2026
  2. Preprint
    Distilling Token-Trained Models into Byte-Level Models
    Zishuo Bao, Jiaqi Leng, Junxiong Wang, Bowen Peng, and Yucheng Lu
    arXiv preprint, 2026
  3. NeurIPS 2025
    Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
    Xiang Hu, Jiaqi Leng, Jun Zhao, Kewei Tu, and Wei Wu
    In Proceedings of the 39th Conference on Neural Information Processing Systems, 2025