publications

2025

  1. Preprint
    Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
    arXiv preprint, 2025
  2. NeurIPS
    Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
    Xiang Hu, Jiaqi Leng, Jun Zhao, Kewei Tu, and Wei Wu
    2025