Jiaqi Leng (冷家祺)


I am a final-year undergraduate student in Computer Science and Technology at Fudan University.

I am currently working with Prof. Yucheng Lu at NYU Shanghai on efficient language modeling, with a focus on byte-level architectures. Previously, I worked as a research intern at Ant Group on sparse attention mechanisms for large language models.

My research interests mainly lie in:

  • Efficient deep learning and model architectures
  • Long-context modeling and length extrapolation
  • Sparse attention mechanisms

In Fall 2024, I was an exchange student at The University of Texas at Austin.

Please feel free to reach out! 👋

selected publications

  1. ICLR
    Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
    In Proceedings of the 14th International Conference on Learning Representations, 2026
  2. Preprint
    Distilling Token-Trained Models into Byte-Level Models
    Zishuo Bao, Jiaqi Leng, Junxiong Wang, Bowen Peng, and Yucheng Lu
    arXiv preprint arXiv:2602.01007, 2026
  3. NeurIPS
    Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
    Xiang Hu, Jiaqi Leng, Jun Zhao, Kewei Tu, and Wei Wu
    In Proceedings of the 39th Conference on Neural Information Processing Systems, 2025