Active Learning for Computationally Efficient Distribution of Binary Evolution Simulations


Rocha, Kyle Akira; Andrews, Jeff J.; Berry, Christopher P. L.; Doctor, Zoheyr; Katsaggelos, Aggelos K.; Serra Pérez, Juan Gabriel; Marchant, Pablo; Kalogera, Vicky; Coughlin, Scott; Bavera, Simone S.; Dotter, Aaron; Fragos, Tassos; Kovlakas, Konstantinos; Misra, Devina; Xing, Zepei; Zapartas, Emmanouil (2022)


Binary stars undergo a variety of interactions and evolutionary phases that are critical for predicting and explaining observations. Binary population synthesis with full simulation of stellar structure and evolution is computationally expensive, requiring a large number of mass-transfer sequences. The recently developed binary population synthesis code POSYDON incorporates grids of MESA binary star simulations that are interpolated to model large-scale populations of massive binaries. The traditional method of computing a high-density rectilinear grid of simulations does not scale to higher-dimensional grids that account for a range of metallicities, rotation, and eccentricity. We present a new active learning algorithm, psy-cris, which uses machine learning in the data-gathering process to adaptively and iteratively target simulations to run, resulting in a custom, high-performance training set. We test psy-cris on a toy problem and find the resulting training sets require fewer simulations for accurate classification and regression than either regular or randomly sampled grids. We further apply psy-cris to the target problem of building a dynamic grid of MESA simulations, and we demonstrate that, even without fine-tuning, a simulation set of only ~1/4 the size of a rectilinear grid is sufficient to achieve the same classification accuracy. We anticipate further gains when algorithmic parameters are optimized for the targeted application. We find that optimizing for classification only may lead to performance losses in regression, and vice versa. Lowering the computational cost of producing grids will enable new population synthesis codes such as POSYDON to cover more input parameters while preserving interpolation accuracy.
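
The idea of adaptively and iteratively targeting simulations can be illustrated with a minimal sketch. This is not the psy-cris implementation: it is a generic uncertainty-sampling loop on a hypothetical two-dimensional toy problem, with a 1-nearest-neighbor classifier standing in for the machine-learning model. The function names, the circular class boundary, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def run_simulation(x, y):
    """Hypothetical stand-in for an expensive simulation; returns a class label."""
    return 1 if x * x + y * y < 0.5 else 0


def nearest_distance(point, samples, label):
    """Distance from point to the closest training sample carrying the given label."""
    return min(math.dist(point, p) for p, lab in samples if lab == label)


def uncertainty(point, samples):
    """Close to 1 where both classes are about equally near (a 1-NN class boundary)."""
    d0 = nearest_distance(point, samples, 0)
    d1 = nearest_distance(point, samples, 1)
    return min(d0, d1) / max(d0, d1)


def predict(point, samples):
    """1-nearest-neighbor classification from the current training set."""
    return min(samples, key=lambda s: math.dist(point, s[0]))[1]


def active_learning(n_iterations=40, n_candidates=200, seed=0):
    rng = random.Random(seed)
    # Start from a coarse 4x4 rectilinear grid on [-1, 1]^2 (contains both classes).
    axis = [-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0]
    samples = [((x, y), run_simulation(x, y)) for x in axis for y in axis]
    for _ in range(n_iterations):
        # Propose random candidates, then run only the most uncertain one.
        candidates = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                      for _ in range(n_candidates)]
        best = max(candidates, key=lambda c: uncertainty(c, samples))
        samples.append((best, run_simulation(*best)))
    return samples


def accuracy(samples, n_test=2000, seed=1):
    """Fraction of random test points whose predicted class matches the simulation."""
    rng = random.Random(seed)
    test = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_test)]
    hits = sum(predict(p, samples) == run_simulation(*p) for p in test)
    return hits / n_test


training_set = active_learning()
print(len(training_set), accuracy(training_set))
```

Each iteration spends one simulation where the current classifier is least certain, so new points tend to concentrate near the class boundary rather than in regions the coarse grid already resolves. A regression-aware score, which the abstract indicates also matters, is not modeled in this sketch.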