T5-ARC: Test-Time Training for Transductive Transformer Models in ARC-AGI Challenge

AML Course Project, 2025

The Abstraction and Reasoning Corpus (ARC-AGI) benchmark has emerged as a key challenge in evaluating machine intelligence by emphasizing skill generalization on novel tasks. Recent advancements primarily utilize Test-Time Training (TTT) to enhance reasoning ability of Large Language Models (LLMs). In this project, we focus on TTT for transductive models (end-to-end transformer models that directly generate the output grid) and develop our pipeline following the SOTA methods, which consists of three steps: Base Model Training, TTT and Active Inference. We have conducted detailed experiments to investigate on the effects of different base models, TTT strategies, data and model sizes and the use of negative training samples. Notably, a small transformer model trained from scratch equipped with TTT achieves performance comparable to SOTA LLMs, revealing the potentials of specialized models for certain reasoning tasks.