Multi-facility Workflow Case Study
For one experiment at the Linac Coherent Light Source (LCLS) at SLAC National Accelerator Laboratory, scientists needed to process raw data in order to analyze catalytic reactions. The experiment required more than 100 terabytes of data to be processed in semi-real-time so that instruments could be adjusted between 12-hour shifts. This depended on a reliable, high-speed network (ESnet) and a special 150 TB allocation at NERSC for processing the data.
The detector for this crystallography experiment took 120 images per second, each about 10 MB in size. This yielded 1.2 GB/s, or potentially 4.3 TB/hr at full capacity. As the data was acquired, it was sent via ESnet from SLAC to NERSC for processing. The computational engine behind the analysis was NERSC's Mendel cluster, which uses modern software adaptations of standard High Performance Computing (HPC) batch queuing to enable High Throughput Computing (HTC). Because the experiment data was processed in semi-real-time, the scientists could adjust the experimental equipment as needed and make the most of their time at the facility.
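The data rates quoted above follow directly from the detector figures. The short Python sketch below reproduces the arithmetic; it assumes decimal units (1 MB = 10^6 bytes) and a detector running continuously at full rate, neither of which the text states explicitly. The variable and function names are illustrative only.

```python
# Detector figures quoted in the text.
IMAGES_PER_SECOND = 120
IMAGE_SIZE_MB = 10  # assumed to be decimal megabytes (10**6 bytes)

# Decimal unit sizes in bytes (an assumption; binary units would differ slightly).
MB = 10**6
GB = 10**9
TB = 10**12


def detector_rate_bytes_per_second(images_per_second, image_size_mb):
    """Return the sustained detector output in bytes per second."""
    return images_per_second * image_size_mb * MB


rate_bytes_per_s = detector_rate_bytes_per_second(IMAGES_PER_SECOND, IMAGE_SIZE_MB)
rate_gb_per_s = rate_bytes_per_s / GB

# Volume produced in one hour if the detector never pauses.
volume_tb_per_hour = rate_bytes_per_s * 3600 / TB

# Upper bound for a single 12-hour shift at full rate (hypothetical worst case).
volume_tb_per_shift = volume_tb_per_hour * 12

print(f"Detector rate: {rate_gb_per_s:.1f} GB/s")
print(f"Hourly volume: {volume_tb_per_hour:.2f} TB/hr")
print(f"12-hour shift: {volume_tb_per_shift:.2f} TB (upper bound)")
```

Under these assumptions the hourly figure comes out at 4.32 TB, which the text rounds to 4.3 TB/hr.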
As part of a multi-facility collaboration among LCLS, ESnet, and NERSC, ESnet was responsible for providing reliable, high-speed connectivity between NERSC and LCLS. ESnet monitored and tracked the experiment-specific data while it was being transferred (see figure 1), with the network running at extremely high utilization: 96% of the total 10G circuit capacity. In all, the scientists sent 113.6 TB across ESnet over five days.
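To put the network figures in context, the sketch below converts the 96% utilization into a throughput and compares it with the average rate implied by the five-day total. It assumes that "10G" means 10 gigabits per second, that units are decimal, and that the five days were full 24-hour days; the text does not confirm the last point, so the average is only a rough estimate.

```python
CIRCUIT_GBIT_PER_S = 10   # "10G" read as 10 gigabits per second (assumption)
PEAK_UTILIZATION = 0.96   # 96% of circuit capacity, as quoted in the text
TOTAL_TB = 113.6          # total data moved across ESnet
DAYS = 5                  # assumed to be five full 24-hour days

SECONDS_PER_DAY = 86_400

# Throughput at the quoted 96% utilization.
peak_gbit_per_s = CIRCUIT_GBIT_PER_S * PEAK_UTILIZATION
peak_gbyte_per_s = peak_gbit_per_s / 8  # 8 bits per byte

# Average throughput implied by the five-day total.
avg_bytes_per_s = TOTAL_TB * 1e12 / (DAYS * SECONDS_PER_DAY)
avg_gbit_per_s = avg_bytes_per_s * 8 / 1e9
avg_utilization = avg_gbit_per_s / CIRCUIT_GBIT_PER_S

print(f"Throughput at 96%: {peak_gbit_per_s:.1f} Gbit/s ({peak_gbyte_per_s:.1f} GB/s)")
print(f"Five-day average:  {avg_gbit_per_s:.2f} Gbit/s ({avg_utilization:.0%} of circuit)")
```

At 96% utilization the circuit carries about 1.2 GB/s, which matches the detector's full output rate. The five-day average works out to roughly a fifth of the circuit, which suggests the 96% figure describes periods of active data taking rather than the whole five days.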
Because the scientists could perform semi-real-time analysis, they were able to make more effective use of valuable LCLS beam time, thereby enhancing the value of the facilities’ resources. During the experiment there were some points of congestion (see figure 2), which gave the facilities valuable operational information for optimizing the infrastructure to support future experiments. Overall, however, the multi-facility data relay was a success.