Berkeley Lab's HPC Services consultant Yong Qin won the FX10 Championship hosted by Kyushu University at SC12 last month.
The FX10 Championship is a competition for performance efficiency in which contestants run their own code on 12 compute nodes of the Fujitsu PRIMEHPC FX10, a commercial version of the K computer (#3 on the November 2012 TOP500 list) equipped with the SPARC64(TM) IXfx processor and Tofu interconnect. Contestants submitted their codes to the Kyushu University staff, who then compiled and profiled them to measure efficiency. The contestant with the highest efficiency wins.
According to Professor Keiichiro FUKAZAWA of Kyushu University, any code with an efficiency better than 10% is good. The application that Yong brought was a highly optimized code for undulator radiation spectrum calculation, on which we collaborate with the Advanced Light Source (US) and the Hiroshima Synchrotron Radiation Center (Japan). The code achieved an astonishing 53% efficiency. The second-place finisher reached only 20% efficiency.
Yong attributes his win to his efforts to greatly reduce the memory footprint and to optimize the code with advanced parallelization techniques. It also helped that he developed this code to run on the newly available 37TF, 108-node Lawrencium LR3 cluster, which is equipped with 16 Intel Sandy Bridge processor cores per node, the same number of cores as on the FX10 nodes.
At first glance, the contest organizers thought that Yong had written a benchmark-type code just to use up the processors, but once Yong explained his methods, they declared him the overall winner. Next year, Yong hopes to do even better after he has had a chance to further optimize his code.