GPU optimization techniques to accelerate optiGAN: a particle simulation GAN.

Mach Learn Sci Technol

Department of Biomedical Engineering, University of California, Davis, Davis, CA, United States of America.

Published: June 2024

The demand for specialized hardware to train AI models has grown in tandem with model complexity in recent years. The graphics processing unit (GPU) is one such piece of hardware, capable of parallelizing operations on large chunks of data. Companies like Nvidia, AMD, and Google have been scaling up hardware performance as fast as they can. Nevertheless, a gap remains between the processing power that models require and the processing capacity that the hardware provides. To increase hardware utilization, the software has to be optimized too. In this paper, we present some general GPU optimization techniques that we used to efficiently train the optiGAN model, a generative adversarial network (GAN) capable of generating multidimensional probability distributions of optical photons at the photodetector face in radiation detectors, on an 8 GB Nvidia Quadro RTX 4000 GPU. We analyze and compare the performance of all the optimizations in terms of execution time and memory consumed, measured with the Nvidia Nsight Systems profiler. The optimizations gave approximately a 4.5x improvement in runtime performance compared with naive training on the GPU, without compromising model performance. Finally, we discuss optiGAN's future work and how we plan to scale the model on GPUs.
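To make the reported factor concrete, the short sketch below shows how a runtime speedup relates to the share of execution time saved. The helper names and the timings are hypothetical and chosen only so that the ratio equals 4.5; they are not measurements from the paper, and the sketch is not the authors' profiling workflow.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    if baseline_seconds <= 0 or optimized_seconds <= 0:
        raise ValueError("execution times must be positive")
    return baseline_seconds / optimized_seconds


def time_saved_fraction(factor: float) -> float:
    """Return the fraction of baseline execution time removed by a speedup factor."""
    if factor <= 0:
        raise ValueError("speedup factor must be positive")
    return 1.0 - 1.0 / factor


# Hypothetical timings (assumption): picked only to illustrate a 4.5x factor.
baseline = 90.0   # seconds for a naive training run
optimized = 20.0  # seconds for the optimized run

factor = speedup(baseline, optimized)
saved = time_saved_fraction(factor)
print(f"{factor:.1f}x faster, {saved:.1%} less time")
# prints "4.5x faster, 77.8% less time"
```

Read this way, a roughly 4.5x improvement means the optimized run takes a little under a quarter of the naive execution time.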

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170465
DOI: http://dx.doi.org/10.1088/2632-2153/ad51c9

Similar Publications

An automatic code generated C++/HIP/CUDA implementation of the (auxiliary) Fock, or Kohn-Sham, matrix construction for execution in GPU-accelerated hardware environments is presented. The module is developed as part of the quantum chemistry software package VeloxChem, employing localized Gaussian atomic orbitals. The performance and scaling characteristics are analyzed in view of the specific requirements for self-consistent field optimization and response theory calculations.

Review on GPU accelerated methods for genome-wide SNP-SNP interactions.

Mol Genet Genomics

December 2024

Department of Plant Sciences, North Dakota State University, Fargo, 58108, USA.

Detecting genome-wide SNP-SNP interactions (epistasis) efficiently is essential to harnessing the vast data now available from modern biobanks. With millions of SNPs and genetic information from hundreds of thousands of individuals, researchers are positioned to uncover new insights into complex disease pathways. However, this data scale brings significant computational and statistical challenges.

An AI dose-influence matrix engine for robust pencil beam scanning protons therapy.

Med Phys

December 2024

National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, People's Republic of China.

Background: Rapid planning is of tremendous value in proton pencil beam scanning (PBS) therapy in overcoming range uncertainty. However, the dose calculation of the dose influence matrix (D) in robust PBS plan optimization is time-consuming and requires substantial acceleration to enhance efficiency.

Purpose: To accelerate the D calculations in PBS therapy, we developed an AI-D engine integrated into our in-house treatment planning system (TPS).

Feasibility of reconstructing in-vivo patient 3D dose distributions from 2D EPID image data using convolutional neural networks.

Phys Med Biol

December 2024

School of Nuclear Science and Technology, University of Science and Technology of China, No.96, JinZhai Road Baohe District, Hefei, Anhui, China, Hefei, 230026, CHINA.

The primary purpose of this work is to demonstrate the feasibility of a deep convolutional neural network (dCNN) based algorithm that uses two-dimensional (2D) EPID images and CT images as input to reconstruct 3D dose distributions inside the patient. Approach: To generalize dCNN training and testing data, geometric and materials models of a VitalBeam accelerator treatment head and a corresponding EPID imager were constructed in detail in the GPU-accelerated Monte Carlo dose computing software, ARCHER. The EPID imager pixel spatial resolution ranging from 1.

Introduction: Capillaroscopy is a simple method of nailfold capillary imaging, used to diagnose diseases from the systemic sclerosis spectrum. However, the assessment of the capillary image is time-consuming and subjective. This makes it difficult to use for a detailed comparison of studies assessed by various physicians.
