Possibilities of GPU use in the process of construction calculations
Pages 256-262
Computer-aided design (CAD) and computer-aided engineering (CAE) systems are significant tools in the modern construction industry. More detailed models require more computation to reach the desired accuracy, so the solver of sparse systems of linear algebraic equations is an important and time-consuming part of such software. Raising the performance of conventional clusters has become increasingly difficult. Graphics processing units (GPUs) can deliver many times the throughput of a standard CPU, especially in operations on massive data. The paper proposes a simple and effective technique for speeding up an existing solver by adding GPU computing.

The solver performs Cholesky factorization and is already efficiently parallelized with OpenMP. Profiling indicated that matrix multiplications executed by the standard BLAS library took up to eighty per cent of the solver's running time. Hence it was possible to distribute tasks between the CPU and the GPU dynamically with slight code modifications, using the standard BLAS interface.

Suitable matrix sizes for GPU offload were identified, because data transfer between the CPU and the GPU takes too long for small matrices: multiplying them on the GPU would slow down the solver. Allocation of pinned memory improved cooperation between the processing units, while enabling asynchronous transfer increased the load on the GPU. A CUDA stream was associated with every OpenMP thread to avoid queues of GPU calls. All these settings may differ considerably depending on the hardware and software available, so tests were run on multiple computer configurations.

To date, the factorization running time has been reduced by forty to sixty per cent. To further enhance the application, it is planned to implement multi-GPU support and to optimize the matrix multiplication algorithm.
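The size-dependent choice between CPU and GPU can be illustrated with a rough cost model. The sketch below is not the authors' code: the `Machine` figures, the fixed per-call overhead, and the function names are all hypothetical. It only shows why small multiplications are better left on the CPU once transfer time is counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical hardware figures; measure them on the target system."""
    cpu_gflops: float               # sustained CPU matrix-multiply rate, GFLOP/s
    gpu_gflops: float               # sustained GPU matrix-multiply rate, GFLOP/s
    pcie_gb_per_s: float            # host <-> device transfer rate, GB/s
    call_overhead_s: float = 20e-6  # assumed fixed cost of one GPU call


def gemm_times(m, n, k, hw, bytes_per_elem=8):
    """Estimate (cpu_seconds, gpu_seconds) for C[m x n] += A[m x k] * B[k x n]."""
    flops = 2.0 * m * n * k
    # A and B are copied to the device; C is copied there and back.
    moved_bytes = bytes_per_elem * (m * k + k * n + 2 * m * n)
    cpu = flops / (hw.cpu_gflops * 1e9)
    gpu = (hw.call_overhead_s
           + moved_bytes / (hw.pcie_gb_per_s * 1e9)
           + flops / (hw.gpu_gflops * 1e9))
    return cpu, gpu


def choose_device(m, n, k, hw):
    """Return "gpu" only when the estimated GPU time, transfer included, is lower."""
    cpu, gpu = gemm_times(m, n, k, hw)
    return "gpu" if gpu < cpu else "cpu"


if __name__ == "__main__":
    hw = Machine(cpu_gflops=50.0, gpu_gflops=500.0, pcie_gb_per_s=6.0)  # assumed values
    for size in (32, 256, 1024, 4096):
        print(size, choose_device(size, size, size, hw))
```

In a real solver the threshold would be found by timing on each configuration, as the abstract notes, because the balance shifts with the hardware and software available. Pinned memory and asynchronous transfers would effectively raise the transfer rate assumed here.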
DOI: 10.22227/1997-0935.2013.11.256-262