Attention Mechanisms

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

paper | code

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

paper

Training-free and Adaptive Sparse Attention for Efficient Long Video Generation

paper

DSV: Exploiting Dynamic Sparsity to Accelerate Large-Scale Video DiT Training

paper

MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention

paper

FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion

paper

VORTA: Efficient Video Diffusion via Routing Sparse Attention

paper

Training-Free Efficient Video Generation via Dynamic Token Carving

paper

RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy

paper

Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

paper

VMoBA: Mixture-of-Block Attention for Video Diffusion Models

paper

SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference

paper | code

Fast Video Generation with Sliding Tile Attention

paper | code

PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models

paper

Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light

paper

Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers

paper

∇NABLA: Neighborhood Adaptive Block-Level Attention

paper | code

Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation

paper

A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention

paper

Bidirectional Sparse Attention for Faster Video Diffusion Training

paper

Mixture of Contexts for Long Video Generation

paper

LoViC: Efficient Long Video Generation with Context Compression

paper

MagiAttention: A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Mask Training

paper | code

DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance

paper | code

XAttention: Block Sparse Attention with Antidiagonal Scoring

paper | code

VSA: Faster Video Diffusion with Trainable Sparse Attention

paper | code

QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification

paper

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

paper