Recent theoretical results show transformers cannot express sequential reasoning problems over long inputs, intuitively because their computational *depth* is bounded. In "Exact Expressive Power of Transformers with Padding" (CoRR abs/2505.18948, 2025), the research investigates the expressive power of transformers with padding tokens, comparing it to chain-of-thought methods.
Our results in this work give a precise theoretical understanding of how padding and looping—two ways to dynamically expand the computational resources of a transformer at inference time—increase the expressive power of transformers.
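To make the two mechanisms concrete, the following is a minimal sketch (not the paper's construction) of what padding and looping look like operationally: padding appends extra blank tokens so a fixed-depth model has more positions to compute over, while looping reapplies the same layers to increase effective depth. All module names, dimensions, and the toy architecture are illustrative assumptions, not part of the original work.

```python
import torch
import torch.nn as nn

# Hypothetical toy model; sizes and the single shared layer are assumptions
# chosen only to illustrate padding vs. looping at inference time.
vocab_size, d_model, n_heads, pad_id = 100, 64, 4, 0

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
readout = nn.Linear(d_model, vocab_size)

def forward(tokens, num_pad=0, num_loops=1):
    """Run a fixed block of layers, optionally with padding tokens and looping."""
    if num_pad > 0:
        # Padding: append blank tokens, giving attention more positions to use
        # as extra "scratch space" without changing the model's depth.
        pad = torch.full((tokens.size(0), num_pad), pad_id, dtype=torch.long)
        tokens = torch.cat([tokens, pad], dim=1)
    x = embed(tokens)
    for _ in range(num_loops):
        # Looping: reapply the same layer, increasing effective computational depth
        # without adding new parameters.
        x = layer(x)
    return readout(x)

x = torch.randint(1, vocab_size, (2, 8))         # batch of 2 sequences, length 8
logits_plain  = forward(x)                       # standard single pass
logits_padded = forward(x, num_pad=8)            # padded: more positions, same depth
logits_looped = forward(x, num_loops=4)          # looped: more depth, same positions
print(logits_plain.shape, logits_padded.shape, logits_looped.shape)
```

Both knobs can be turned at inference time without retraining the weights, which is the sense in which they "dynamically expand" the model's computational resources.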