SC14 NVIDIA booth talk, November 19, 2014, New Orleans
GP GPU
Large-Scale Granular and Fluid (DEM/SPH) Simulations using Particles
Takayuki Aoki
Global Scientific Information and Computing Center, Tokyo Institute of Technology
Copyright © Global Scientific Information and Computing Center, Tokyo Institute of Technology
TSUBAME 2.5
System (58 racks): 1442 nodes, 2952 CPU sockets, 4264 GPUs
  Performance: 224.7 TFLOPS (CPU, ※ Turbo Boost), 5.562 PFLOPS (GPU)
  Total: 17.1 PFLOPS
Rack (30 nodes)
  Performance: 122 TFLOPS, Memory: 2.28 TB
Compute Node (3 Tesla K20X GPUs)
  Performance: 4.08 TFLOPS, Memory: 58.0 GB (CPU) + 18 GB (GPU)
TSUBAME Supercomputer
Tesla S1070 ×170 (680 GPUs), Tesla M2050, Tesla K20X
Gordon Bell Prize (2011)
Graph 500 No. 3 (2011)
Commendation by the Minister of Education, Culture, Sports, Science and Technology (2012)
CUDA COE
Weak Scalability: 2.0000 PFLOPS on 4,000 GPUs of TSUBAME 2.0 with 330 billion cells, 44.5% of the peak performance
Granular Material Simulations using Discrete Element Method
Golf Bunker Shots
Simulation for Granular Materials
DEM (Discrete Element Method)
Contact interaction
• Normal direction: spring, viscosity
• Tangential direction: spring, viscosity, friction
$F_{ij} = -k\,x_{ij} - \gamma\,\dot{x}_{ij}$
DEM (Discrete Element Method) in 2005
■ 76,000 particles: 48 hours
■ $k_n = 5 \times 10^{8}$ dyn/cm
■ Viscosity coefficient: $8 \times 10^{4}$ dyn·sec/cm
■ $\Delta t = 4 \times 10^{-7}$ sec
■ Time integration: 2-stage Runge-Kutta
Future work: [Figure labels: CPU 0, CPU 1, CPU 2]
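The spring and viscosity contact model with a 2-stage Runge-Kutta step can be sketched for two spheres as follows. This is a minimal illustration, not the talk's implementation: the midpoint variant of Runge-Kutta, the function names, and the restriction to the normal direction are all assumptions; only the three constants come from the slide.

```python
import numpy as np

# Parameters quoted on the slide (CGS units).
K_N = 5.0e8    # normal spring constant [dyn/cm]
ETA_N = 8.0e4  # normal viscosity (damping) coefficient [dyn*s/cm]
DT = 4.0e-7    # time step [s]

def normal_force(xi, xj, vi, vj, radius):
    """Spring-dashpot force on particle i from particle j (zero if not touching)."""
    rij = xi - xj
    dist = np.linalg.norm(rij)
    overlap = 2.0 * radius - dist
    if overlap <= 0.0:
        return np.zeros_like(xi)
    n = rij / dist            # unit normal pointing from j to i
    vn = np.dot(vi - vj, n)   # normal relative velocity (negative while approaching)
    return (K_N * overlap - ETA_N * vn) * n

def rk2_step(x, v, mass, radius):
    """One 2-stage Runge-Kutta (midpoint, assumed) step for a two-particle system."""
    def accel(xs, vs):
        f = normal_force(xs[0], xs[1], vs[0], vs[1], radius)
        return np.array([f, -f]) / mass   # equal and opposite forces
    a1 = accel(x, v)
    x_mid = x + 0.5 * DT * v
    v_mid = v + 0.5 * DT * a1
    a2 = accel(x_mid, v_mid)
    return x + DT * v_mid, v + DT * a2
```

Here `x` and `v` are arrays of shape (2, dim). Because the viscosity term dissipates energy, two particles that collide head-on separate more slowly than they approached.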
Dynamic Load Balance
2-dimensional slice-grid method
[Figure labels: many particles; no particle]
1. Move boundary
2. Move boundary
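The idea behind the 2-dimensional slice-grid method can be sketched as below. The sketch recomputes the cut positions from scratch so that every sub-domain holds nearly the same number of particles; the slides describe moving existing boundaries instead, so this is an assumed simplification, and the function names are hypothetical.

```python
import numpy as np

def slice_boundaries(coords, n_slices):
    """Interior cut positions that give each slice a near-equal particle count."""
    order = np.sort(coords)
    cuts = [len(order) * k // n_slices for k in range(1, n_slices)]
    # Place each cut halfway between the two particles it separates.
    return np.array([0.5 * (order[c - 1] + order[c]) for c in cuts])

def slice_grid_2d(pos, nx, ny):
    """Assign each particle a sub-domain id in an nx-by-ny slice grid."""
    x_cuts = slice_boundaries(pos[:, 0], nx)       # step 1: cut along x
    col = np.searchsorted(x_cuts, pos[:, 0])
    owner = np.empty(len(pos), dtype=int)
    for c in range(nx):                            # step 2: cut each column along y
        idx = np.nonzero(col == c)[0]
        y_cuts = slice_boundaries(pos[idx, 1], ny)
        row = np.searchsorted(y_cuts, pos[idx, 1])
        owner[idx] = c * ny + row
    return owner
```

The sketch assumes each column contains at least `ny` particles. Even for a strongly non-uniform particle distribution, the per-sub-domain counts then differ by at most one.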
Dynamic Domain Decomposition
The computational domain is dynamically decomposed into 64 sub-domains.
[Figure panels: Slice grid; KD-tree; Octree]
Collision Detection using Level Set Function
• Collision detection between particles and complex shapes described by CAD data is carried out efficiently by using a level set function.
[Figure labels: Particle; Polygon of CAD data; Positive area Φ > 0; Negative area Φ < 0]
Level Set Function describing CAD surface
• Generation from 3D CAD data on the uniform mesh
• Fast generation algorithm and inside/outside judgment
[Figure labels: surface patches of CAD data; level set function; negative-distance area far from the surface; positive-distance area far from the surface]
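A level set collision test on a uniform mesh might look like the following 2D sketch. Several things are assumptions made for illustration: an analytic circle stands in for the CAD surface, Φ is taken to be negative inside the object, bilinear interpolation is used between mesh nodes, and all function names are hypothetical.

```python
import numpy as np

def build_level_set(n, length, center, radius):
    """Signed distance to a circle sampled on an n-by-n uniform mesh (negative inside)."""
    xs = np.linspace(0.0, length, n)
    gx, gy = np.meshgrid(xs, xs, indexing="ij")
    return np.hypot(gx - center[0], gy - center[1]) - radius

def sample(phi, length, p):
    """Bilinear interpolation of phi at a point p inside the mesh."""
    n = phi.shape[0]
    h = length / (n - 1)
    i = min(int(p[0] / h), n - 2)
    j = min(int(p[1] / h), n - 2)
    tx, ty = p[0] / h - i, p[1] / h - j
    return ((1 - tx) * (1 - ty) * phi[i, j] + tx * (1 - ty) * phi[i + 1, j]
            + (1 - tx) * ty * phi[i, j + 1] + tx * ty * phi[i + 1, j + 1])

def collides(phi, length, p, particle_radius):
    """A particle touches the object when its centre is closer than one radius."""
    return sample(phi, length, p) < particle_radius
```

The cost of one test is a single mesh lookup, independent of how many surface polygons the original shape has.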
Neighbor Particle List
Linked-list method
[Figure: linked list of particle indices in local domain 0, terminated by NULL]
Memory usage is reduced by 87 percent compared to a regular neighbor list.
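A cell-based linked list of the kind named above can be sketched as follows. The 2D setting, the array names `head` and `nxt`, and the requirement that the cutoff not exceed the cell size are assumptions for illustration. Storage is one integer per cell plus one per particle, instead of a per-particle array of neighbor indices.

```python
import numpy as np

NULL = -1  # end-of-list marker, playing the role of NULL on the slide

def build_cell_list(pos, cell_size, n_cells):
    """Return (head, nxt): head[c] is the first particle in cell c, nxt[i] the next one."""
    head = np.full(n_cells * n_cells, NULL, dtype=int)
    nxt = np.full(len(pos), NULL, dtype=int)
    cell = np.minimum((pos // cell_size).astype(int), n_cells - 1)
    for i, (cx, cy) in enumerate(cell):
        c = cx * n_cells + cy
        nxt[i] = head[c]   # push particle i onto the front of the cell's list
        head[c] = i
    return head, nxt

def neighbors(i, pos, head, nxt, cell_size, n_cells, cutoff):
    """Particles within `cutoff` of particle i, found by walking the 3x3 block of cells."""
    cx, cy = np.minimum((pos[i] // cell_size).astype(int), n_cells - 1)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            ax, ay = cx + dx, cy + dy
            if not (0 <= ax < n_cells and 0 <= ay < n_cells):
                continue
            j = head[ax * n_cells + ay]
            while j != NULL:   # follow the chain until the NULL marker
                if j != i and np.linalg.norm(pos[j] - pos[i]) < cutoff:
                    found.append(int(j))
                j = nxt[j]
    return sorted(found)
```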
Spiral slide
AOKI Lab.
16.7 million particles with 64 GPUs
Bunker shot simulation
AOKI Lab.
DEM using non-spherical particles
To represent more realistic rock shapes, non-spherical particles are used in DEM.
Each non-spherical particle is built from many spherical particles with rigid-body connections.
Using spherical particles
Using non-spherical tetrapod particles
Multiple GPU Scalability
• Conditions
  Particles: $2 \times 10^{6}$, $1.6 \times 10^{7}$, $1.29 \times 10^{8}$
  Domain decomposition: dynamic load balance using the slice-grid method
  Time integration: 2-stage Runge-Kutta
SPH for Fluid Dynamics
Particle interaction within a kernel radius h
First derivatives
h: kernel radius, W: kernel function
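A first-derivative estimate from neighbors within the kernel radius can be sketched as below. The slide's own formulas did not survive extraction, so this uses the common SPH form in which the gradient at particle i is the sum over j of V_j (f_j - f_i) ∇W_ij. The 2D quadratic kernel W(r) = 6/(πh²) (1 - r/h)², the per-particle volumes `volume`, and the function names are all assumptions.

```python
import numpy as np

def grad_kernel(d, h):
    """Gradient with respect to x_i of W(|d|), where d = x_i - x_j."""
    r = np.linalg.norm(d)
    if r == 0.0 or r >= h:
        return np.zeros(2)   # no contribution from self or from outside the radius
    dw_dr = -12.0 / (np.pi * h**3) * (1.0 - r / h)
    return dw_dr * d / r

def sph_gradient(i, pos, f, volume, h):
    """Estimate the gradient of field f at particle i from neighbors within h."""
    g = np.zeros(2)
    for j in range(len(pos)):
        g += volume[j] * (f[j] - f[i]) * grad_kernel(pos[i] - pos[j], h)
    return g
```

On a regular lattice with h equal to three particle spacings, this recovers the gradient of a linear field only approximately (to within a few percent); the accuracy depends on the kernel and on the ratio of h to the particle spacing.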
Improved SPH
A list of Particle Difference Operators: Generalization of Finite Difference Operators (Imoto, Tagami 2014)
• Interpolation
• Gradient
• Divergence
• Laplacian
Kernel: 2nd-order polynomial function (spiky shaped)
Improved SPH
• Explicit time integration using a predictor-corrector method
Predictor: (1)
Corrector: (2)
Temporary pressures are calculated from the Birch-Murnaghan equation: (3)
Positions are computed as follows: (4)
Pressures are computed as follows: (5)
A Dam Break Simulation
• Initial setting and parameters
[Figure labels: Water; Object; dimensions 12 m, 4.8 m, 2.2 m, 10 m, 6 m, 0.8 m]
Description of the Object Shape
• An object is represented by an arrangement of particles generated from CAD data
A Dam Break Simulation
72 M particles with 80 GPUs
Fluid-Structure Interaction
SUMMARY
• Particle methods (DEM/SPH), which are based on short-range interactions, are suitable for GPU computing, just as stencil computations are.
• Many successful granular simulations on the GPU-based supercomputer TSUBAME 2.0/2.5 have been shown.
• Fluid simulations using SPH are suitable for describing free-surface flows.
• Particle methods can easily be applied to fluid-structure interaction.