SymMatika: Structure-Aware Symbolic Discovery

arXiv

Michael Scherk, Boyuan Chen

SymMatika is a high-performing symbolic regression framework. Given data 𝒟_E (explicit) or 𝒟_I (implicit), a parameterized generator θ produces m initial populations.

Summary

Symbolic regression (SR) seeks to recover closed-form mathematical expressions that describe observed data. While existing methods have advanced the discovery of either explicit mappings (i.e., y=f(x)) or implicit relations (i.e., F(x,y)=0), few modern and accessible frameworks support both. Moreover, most approaches treat each candidate expression in isolation, without reusing recurring structural patterns that could accelerate search. We introduce SymMatika, a hybrid SR algorithm that combines multi-island genetic programming (GP) with a reusable motif library inspired by biological sequence analysis. SymMatika identifies high-impact substructures in top-performing candidates and reintroduces them to guide future generations. Additionally, it incorporates a feedback-driven evolutionary engine and supports both explicit and implicit relation discovery using implicit-derivative metrics. Across benchmarks, SymMatika achieves state-of-the-art recovery rates: 5.1% higher performance than the previous best results on Nguyen, the first recovery of Nguyen-12, and competitive performance on the Feynman equations. It also recovers implicit physical laws from Eureqa datasets up to 100× faster. Our results demonstrate the power of structure-aware evolutionary search for scientific discovery. To support broader research in interpretable modeling and symbolic discovery, we have open-sourced the full SymMatika framework.
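
To make the implicit-derivative idea concrete, the sketch below scores a candidate relation F(x,y)=0 by comparing the slope it implies, dy/dx = -(∂F/∂x)/(∂F/∂y), with the slope estimated from time-ordered data. This is one common formulation from the implicit-equation literature, not SymMatika's actual metric or API: the helper name implicit_derivative_error, the log-scaled error, the finite-difference step sizes, and the synthetic circle data are all assumptions made for illustration.

```python
import numpy as np

def implicit_derivative_error(F, t, x, y, h=1e-5, eps=1e-8):
    """Hypothetical helper: mean log error between data slopes and the
    slopes implied by a candidate implicit relation F(x, y) = 0."""
    # Slope observed in the data: dy/dx = (dy/dt) / (dx/dt).
    dx_dt = np.gradient(x, t)
    dy_dt = np.gradient(y, t)

    # Partial derivatives of the candidate, by central differences.
    dF_dx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    dF_dy = (F(x, y + h) - F(x, y - h)) / (2 * h)

    # Keep only points where both ratios are well defined.
    ok = (np.abs(dx_dt) > eps) & (np.abs(dF_dy) > eps)
    data_slope = dy_dt[ok] / dx_dt[ok]
    model_slope = -dF_dx[ok] / dF_dy[ok]  # implicit function theorem

    return float(np.mean(np.log1p(np.abs(data_slope - model_slope))))

# Synthetic data (an assumption): an arc of the unit circle, sampled in time.
t = np.linspace(0.2, 1.3, 200)
x, y = np.cos(t), np.sin(t)

def circle(x, y):
    return x**2 + y**2 - 1.0   # correct relation for this data

def line(x, y):
    return x + y - 1.0         # deliberately wrong candidate

err_circle = implicit_derivative_error(circle, t, x, y)
err_line = implicit_derivative_error(line, t, x, y)
print(f"circle: {err_circle:.5f}  line: {err_line:.5f}")
```

The correct relation scores far lower than the wrong one. A search loop could plausibly use a score of this kind as the fitness for implicit targets, where the plain residual |F| is uninformative because any candidate that is identically zero minimizes it trivially.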

Citation

Scherk, Michael, and Boyuan Chen. “SymMatika: Structure-Aware Symbolic Discovery.” arXiv preprint arXiv:2507.03110 (2025).

BibTeX

@article{scherk2025symmatika,
  title={SymMatika: Structure-Aware Symbolic Discovery},
  author={Scherk, Michael and Chen, Boyuan},
  journal={arXiv preprint arXiv:2507.03110},
  year={2025}
}
