The computing capacity of multi-core nodes has grown disproportionately relative to the injection and network bandwidth of high performance computing (HPC) interconnects, which can make the network a performance bottleneck. Purchasing additional bandwidth to balance the compute-to-bandwidth ratio can be very expensive. Analyzing communication on current and future HPC interconnects, and comparing network topologies with respect to metrics such as congestion, performance, and dollar cost, can help us understand networks better and make the right procurement decisions. In this talk, I will present two network simulators that we have developed for analyzing traffic and predicting communication performance on HPC interconnects. Damselfly is a simulator based on functional modeling, and TraceR is built on top of ROSS, a parallel discrete event simulation (PDES) engine. I will also present studies we have conducted using these simulators.
Dr. Abhinav Bhatele is a computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory. His interests lie in parallel algorithms, load balancing, HPC networks, communication optimization, visualization, and data analytics on high-end parallel systems. Dr. Bhatele received a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2010. He has received several awards for his dissertation work, including a Distinguished Paper Award at Euro-Par in 2009 and the David J. Kuck Outstanding PhD Thesis Award in 2011. In 2013, a paper that he co-authored with LLNL and external collaborators was selected for a best paper award at IPDPS. Dr. Bhatele was selected as a recipient of the IEEE TCSC Young Achievers in Scalable Computing award in 2014.