Jiaqi Leng (冷家祺)


I am a final-year undergraduate student in Computer Science and Technology at Fudan University.

I am currently working with Prof. Yucheng Lu at NYU Shanghai on efficient language modeling, with a focus on byte-level architectures. Previously, I was a research intern at Ant Group, where I worked on sparse attention mechanisms for large language models.

My research interests mainly lie in:

  • Efficient deep learning and model architectures
  • Long-context modeling and length extrapolation
  • Sparse attention mechanisms

In Fall 2024, I was an exchange student at The University of Texas at Austin.

Please feel free to reach out! 👋

News

Mar 2026 I will be attending ICLR and presenting our work on length-generalizable sparse attention. See you in Brazil!
Mar 2026 I will join NYU in Fall 2026.

Selected Publications

  1. ICLR 2026
    Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
    In Proceedings of the 14th International Conference on Learning Representations, 2026
  2. Preprint
    Distilling Token-Trained Models into Byte-Level Models
    Zishuo Bao, Jiaqi Leng, Junxiong Wang, Bowen Peng, and Yucheng Lu
    arXiv preprint, 2026
  3. NeurIPS 2025
    Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
    Xiang Hu, Jiaqi Leng, Jun Zhao, Kewei Tu, and Wei Wu
    In Proceedings of the 39th Conference on Neural Information Processing Systems, 2025