Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published 21 days ago • 93
GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration Paper • 2410.18032 • Published Oct 23, 2024 • 1
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Paper • 2412.07720 • Published Dec 10, 2024 • 31