Multi-View Ranking: Tasking Transformers to Generate and Validate Solutions to Math Word Problems

Author: Mzimba, Rifumo
Date available: 2024-10-26
Date issued: 2023-11
Citation: Mzimba, Rifumo. (2023). Multi-View Ranking: Tasking Transformers to Generate and Validate Solutions to Math Word Problems. [Master's dissertation, University of the Witwatersrand, Johannesburg]. https://hdl.handle.net/10539/41971
URI: https://hdl.handle.net/10539/41971
Description: A dissertation submitted in fulfillment of the requirements for the degree of Master of Science, to the Faculty of Science, School of Computer Science & Applied Mathematics, University of the Witwatersrand, Johannesburg, 2023.

Abstract: The recent development and success of the Transformer model have produced massive language models that significantly improve the comprehension of natural language. When fine-tuned for downstream natural language processing tasks with limited data, they achieve state-of-the-art performance. However, these robust models lack the ability to reason mathematically. It has been demonstrated that, when fine-tuned on small-scale Math Word Problem (MWP) benchmark datasets, these models fail to generalize. To overcome this limitation, this study proposes augmenting the generative objective used in the MWP task with complementary objectives that help the model reason more deeply about the task. Specifically, we propose a multi-view generation objective that allows the model to understand the generative task as an abstract syntax tree traversal, not only as a sequential generation task. In addition, we propose a complementary verification objective that enables the model to develop heuristics for distinguishing correct from incorrect solutions. These two objectives comprise our multi-view ranking (MVR) framework, in which the model is tasked with generating the prefix, infix, and postfix traversals for a given MWP and then using the verification task to rank the generated expressions.
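The prefix, infix, and postfix views mentioned above are three traversals of a single expression tree. The following is a minimal sketch of that idea, assuming a binary tree with operators at internal nodes and number placeholders (`n0`, `n1`, `n2`) at the leaves. The `Node` class, the function names, and the example expression are illustrative assumptions and are not taken from the dissertation or its source code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical expression-tree node: an operator or a number placeholder."""
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def prefix(node: Optional[Node]) -> List[str]:
    """Pre-order traversal: operator, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.token] + prefix(node.left) + prefix(node.right)


def infix(node: Optional[Node]) -> List[str]:
    """In-order traversal, fully parenthesised so the token sequence is unambiguous."""
    if node is None:
        return []
    if node.left is None and node.right is None:
        return [node.token]
    return ["("] + infix(node.left) + [node.token] + infix(node.right) + [")"]


def postfix(node: Optional[Node]) -> List[str]:
    """Post-order traversal: left subtree, then right subtree, then operator."""
    if node is None:
        return []
    return postfix(node.left) + postfix(node.right) + [node.token]


# Illustrative expression (n0 - n1) * n2, not drawn from any benchmark dataset.
tree = Node("*", Node("-", Node("n0"), Node("n1")), Node("n2"))

print(" ".join(prefix(tree)))   # * - n0 n1 n2
print(" ".join(infix(tree)))    # ( ( n0 - n1 ) * n2 )
print(" ".join(postfix(tree)))  # n0 n1 - n2 *
```

All three token sequences describe the same solution, so a model asked to produce each of them is given three target sequences for one underlying tree.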
Our experiments show that the verification objective is more effective at choosing the best expression than the widely used beam search. We further show that when our two objectives are used in conjunction, they can effectively guide our model to learn robust heuristics for the MWP task. In particular, we achieve absolute improvements of 9.7 and 5.3 percentage points over our baseline and the state-of-the-art models, respectively, on the SVAMP dataset. Our source code is available at https://github.com/ProxJ/msc-final.

Language: en
Rights: ©2023 University of the Witwatersrand, Johannesburg. All rights reserved. The copyright in this work vests in the University of the Witwatersrand, Johannesburg. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of University of the Witwatersrand, Johannesburg.
Keywords: Multiview; Math word problems; Transformers; Multi-tasking; Verifiers; UCTD
SDG: SDG-9: Industry, innovation and infrastructure
Type: Dissertation
Institution: University of the Witwatersrand, Johannesburg