Optimization of Monte Carlo Neutron Transport Simulations with Emerging Architectures

Abstract : Monte Carlo (MC) neutron transport simulations are widely used in the nuclear community to perform reference calculations with minimal approximations. The conventional MC method converges slowly, as dictated by the law of large numbers, which makes simulations computationally expensive. Cross section computation has been identified as the major performance bottleneck of MC neutron codes. Typically, cross section data for each nuclide are precalculated and stored in memory before the simulation, so that during the simulation only table lookups are required to retrieve the data and the compute cost is trivial. We implemented and optimized a large collection of lookup algorithms in order to accelerate this data retrieval. Results show that a significant speedup over the conventional binary search can be achieved on both CPU and MIC in unit tests, rather than in real-case simulations. Vectorization proved effective on the many-core architecture thanks to its 512-bit vector units; on CPU the improvement is limited by the smaller register size. Further optimization such as memory reduction turns out to be very important, since it largely improves computing performance. All of the energy lookup proposals are entirely memory-bound: the compute units do little but wait for data. In other words, the computing capability of modern architectures is largely wasted. Another major issue of energy lookup is its huge memory requirement: cross section data at a single temperature for the up to 400 nuclides involved in a real-case simulation require nearly 1 GB of memory, which makes simulations with several thousand temperatures infeasible on current computer systems. In order to solve the problems related to energy lookup, we investigate another, on-the-fly cross section proposal called reconstruction.
The basic idea behind reconstruction is to perform the Doppler broadening of cross sections (a convolution integral) on the fly, each time a cross section is needed, with a formulation close to that of standard neutron cross section libraries and based on the same amount of data. Reconstruction converts the problem from memory-bound to compute-bound: only a few variables per resonance are required instead of the conventional pointwise table covering the entire resolved resonance region. Although the memory footprint is greatly reduced, this method is very time-consuming. After a series of optimizations, results show that the reconstruction kernel benefits well from vectorization and can achieve 1806 GFLOPS (single precision) on a Knights Landing 7250, which represents 67% of its effective peak performance. Even though the optimization efforts significantly improve the FLOP usage of reconstruction, this on-the-fly calculation is still slower than the conventional lookup method. Given this situation, we have begun porting the code to GPGPU in order to exploit potentially higher performance as well as higher FLOP usage. In addition, another evaluation is planned to compare lookup and reconstruction in terms of power consumption: with the help of hardware and software energy measurement support, we expect to find a compromise between performance and energy consumption in order to face the "power wall" challenge that comes with hardware evolution.
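
For illustration, the conventional lookup that the abstract takes as its baseline (a binary search on a tabulated energy grid) can be sketched as follows. This is a minimal sketch, not the code of the thesis: the function name `lookup_xs`, the list-based table layout, and the linear interpolation between grid points are assumptions made for the example.

```python
from bisect import bisect_right


def lookup_xs(energy_grid, xs_table, energy):
    """Cross section at `energy` from a pointwise table (hypothetical layout).

    energy_grid: sorted, strictly increasing energies (assumed layout)
    xs_table:    cross section values tabulated at those energies
    """
    if not energy_grid[0] <= energy <= energy_grid[-1]:
        raise ValueError("energy outside the tabulated range")
    # Binary search: index of the grid point at or just below `energy`.
    i = bisect_right(energy_grid, energy) - 1
    # Clamp so that the upper grid bound uses the last interval.
    i = min(i, len(energy_grid) - 2)
    e0, e1 = energy_grid[i], energy_grid[i + 1]
    s0, s1 = xs_table[i], xs_table[i + 1]
    # Linear interpolation between the two bracketing points (an assumption).
    return s0 + (s1 - s0) * (energy - e0) / (e1 - e0)
```

Each call performs a handful of arithmetic operations but several scattered memory reads during the search, which is consistent with the memory-bound behaviour described in the abstract.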

Cited literature [121 references]

https://pastel.archives-ouvertes.fr/tel-01687913
Contributor : Abes Star
Submitted on : Thursday, January 18, 2018 - 9:36:07 PM
Last modification on : Friday, June 21, 2019 - 2:17:06 PM
Long-term archiving on : Thursday, May 24, 2018 - 2:51:50 AM

File

65307_WANG_2017_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01687913, version 1

Citation

Yunsong Wang. Optimization of Monte Carlo Neutron Transport Simulations with Emerging Architectures. Distributed, Parallel, and Cluster Computing [cs.DC]. Université Paris-Saclay, 2017. English. ⟨NNT : 2017SACLX090⟩. ⟨tel-01687913⟩
