What question did this study set out to answer?

This research aims to assess the ability of various large language models to interpret and generate symbolic mathematical notation.

March 6, 2026 · Open Access

Ability of selected Large Language Models to process symbolic Math notation

Key Points

The study assesses how well large language models interpret and generate symbolic mathematical notation.
It evaluated 43 large language models from different vendors and architectures.
A Python-based testbed standardized the assessment.
Each model received a prompt containing a formal type and inference system, contextual rules, and questions requiring symbolic reasoning.
The testbed measured the correctness of responses and their adherence to symbolic output requirements.
Performance varied significantly across the models.
Frontier models consistently outperformed the others in both correctness and symbolic output.
Smaller or open-source models showed mixed results.

Abstract

This paper presents an empirical investigation into the ability of selected Large Language Models (LLMs) to understand and apply symbolic notation from the mathematical sciences and from logical theories such as Natural Deduction. The study evaluates 43 LLMs from diverse vendors and architectures, testing their capacity to ingest symbolic notation and to output results in that notation rather than in prose. Each model is presented with a prompt containing a formal type and inference system, contextual rules, and questions requiring symbolic reasoning. The experiment uses a Python-based testbed to standardize evaluation, measuring both the correctness of responses and their adherence to symbolic output constraints. Results reveal significant variability in performance across models: frontier models consistently outperform the others in both correctness and symbolic output, while smaller or open-source models exhibit mixed results. This underscores the need for careful model selection in applications requiring strict formalism and minimal prose output.
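
The two measurements named in the abstract, correctness and adherence to symbolic output constraints, can be illustrated with a small Python sketch. This is not the study's testbed: the names `Score`, `score_response`, and `summarize`, the substring-based correctness check, and the rule that any run of three or more ASCII letters counts as prose are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumed heuristic: three or more consecutive ASCII letters count as a prose
# word. Single-letter variables such as A, B, P, Q pass; a longer identifier
# would be flagged, so a real testbed would need a finer rule.
PROSE_WORD = re.compile(r"[A-Za-z]{3,}")


@dataclass(frozen=True)
class Score:
    correct: bool        # expected symbolic answer appears in the response
    symbolic_only: bool  # response contains no prose words


def normalize(text: str) -> str:
    """Remove all whitespace so that 'A ∧ B' and 'A∧B' compare equal."""
    return "".join(text.split())


def score_response(response: str, expected: str) -> Score:
    """Score one response on the two axes, independently of each other."""
    target = normalize(expected)
    # Substring match is a deliberate simplification (hypothetical): it accepts
    # a correct formula wrapped in prose, but it can also accept a response
    # that merely contains the expected formula as part of a larger one.
    correct = bool(target) and target in normalize(response)
    symbolic_only = PROSE_WORD.search(response) is None
    return Score(correct=correct, symbolic_only=symbolic_only)


def summarize(scores: list[Score]) -> dict[str, float]:
    """Return the fraction of responses that pass each check."""
    total = len(scores)
    if total == 0:
        return {"correct": 0.0, "symbolic_only": 0.0}
    return {
        "correct": sum(s.correct for s in scores) / total,
        "symbolic_only": sum(s.symbolic_only for s in scores) / total,
    }


if __name__ == "__main__":
    scores = [
        score_response("A ∧ B ⊢ B", "A∧B ⊢ B"),
        score_response("The conclusion is B ∧ A", "B ∧ A"),
        score_response("¬A", "A → B"),
    ]
    print(summarize(scores))
```

Keeping the two checks separate means a response can be correct yet fail the symbolic-output constraint, as in the second example, where the right formula is wrapped in an English sentence.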

Authors

Andreas Schmidt
