From YouTube: LSST DESC 2nd Data Challenge Image Simulation Campaign with Parallel Python Workflows, A Villareal

Description

The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will provide the astrophysics community with an unprecedented amount of survey data with which to constrain the evolution of the universe through time. To make full use of this dataset, we require extensive simulations to validate scientific pipelines before the survey sees first light. The LSST Dark Energy Science Collaboration (DESC) Second Data Challenge (DC2) represents the largest simulated sky survey of its complexity. Generating such a simulation required managing a complicated and rapidly changing workflow across multiple compute resources. We demonstrate how we used containerization and the Parsl parallel scripting library to create a portable and scalable workflow that meets the challenges of this computational task. With this workflow we were able to generate a simulated survey covering 300 square degrees and five years of image depth, using 100M hours of compute and up to 2000 Cori KNL nodes at a time. We discuss possible improvements to the workflow for future survey simulations, both from the standpoint of using the increasingly common workflow nodes at high performance computing (HPC) centers and from that of how the underlying image simulation code may be altered to benefit more from computing at these scales.