Publications

Towards understanding how transformers perform multi-step reasoning with matching operation

Zhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu, 2025.

Submitted to The Forty-second International Conference on Machine Learning (ICML 2025).

We propose a buffer mechanism and find evidence that language models employ such a mechanism during reasoning. We also propose a method to enhance the model’s reasoning capability, significantly improving data utilization efficiency on logical reasoning datasets.

Download [pdf].