memory.cu
This section documents the functions defined in memory.cu, which handle GPU memory allocation, host-to-device transfer of global data, and serialized workload calculations.
Functions
-
__host__ void transferParams()
Synchronizes global host simulation parameters with device memory symbols.
Copies physics constants, grid dimensions, lookup tables, and model-specific parameters from the host to the device using cudaMemcpyToSymbol.
- Returns:
void
-
__host__ int setMaxBlocks()
Configures the optimal number of blocks to be used by CUDA kernels and generates GPU diagnostics.
It multiplies the maximum number of blocks per multiprocessor by the total number of streaming multiprocessors (SMs) to determine the total maximum number of blocks that can be launched concurrently on the GPU.
- Returns:
The calculated maximum number of blocks for kernel launches.
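The arithmetic described above can be sketched on the host as follows. This is a minimal illustration, not the code in memory.cu: the `DeviceInfo` struct and `maxResidentBlocks` function are hypothetical names. In a CUDA build the two inputs would typically be read from `cudaGetDeviceProperties` (fields `multiProcessorCount` and `maxBlocksPerMultiProcessor`); whether `setMaxBlocks` obtains them that way is an assumption.

```cpp
// Hypothetical stand-in for the two device properties the description mentions.
struct DeviceInfo {
    int smCount;         // number of streaming multiprocessors (SMs)
    int maxBlocksPerSM;  // resident-block limit per multiprocessor
};

// Upper bound on concurrently resident blocks:
// per-SM block limit multiplied by the number of SMs.
int maxResidentBlocks(const DeviceInfo& dev) {
    return dev.maxBlocksPerSM * dev.smCount;
}
```

For example, a device with 80 SMs and a limit of 32 blocks per SM gives 80 × 32 = 2560 blocks.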
-
__host__ void cummulativePhotonsPerZone(unsigned long long *generated_photons_arr, unsigned long long *d_index_to_ijk)
Computes a cumulative sum of superphoton counts for zone-based photon sampling.
Generates a cumulative sum array where the value at zone i equals the total number of superphotons to be generated up to and including zone i-1.
- Parameters:
generated_photons_arr – Array containing the number of photons to generate in each zone.
d_index_to_ijk – Array storing the cumulative sum of photons per zone.
- Returns:
void
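The sum described above is an exclusive prefix sum: entry i holds the count of all photons in zones before i. A minimal host-side sketch, assuming plain `std::vector` storage rather than the raw pointers in the real signature; `exclusivePhotonScan` and `zoneOfPhoton` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Count = unsigned long long;

// Exclusive prefix sum: offsets[i] = perZone[0] + ... + perZone[i-1], offsets[0] = 0.
std::vector<Count> exclusivePhotonScan(const std::vector<Count>& perZone) {
    std::vector<Count> offsets(perZone.size());
    Count running = 0;
    for (std::size_t i = 0; i < perZone.size(); ++i) {
        offsets[i] = running;   // photons generated in zones 0 .. i-1
        running += perZone[i];
    }
    return offsets;
}

// Hypothetical helper: map a global photon index to its zone with a binary search.
// Assumes offsets is non-empty, offsets[0] == 0, and photonIndex is below the total count.
std::size_t zoneOfPhoton(const std::vector<Count>& offsets, Count photonIndex) {
    auto it = std::upper_bound(offsets.begin(), offsets.end(), photonIndex);
    return static_cast<std::size_t>(it - offsets.begin()) - 1;
}
```

With per-zone counts {3, 0, 5, 2}, the offsets are {0, 3, 3, 8}, and global photon index 3 falls in zone 2, because zone 1 generates no photons.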
-
__host__ unsigned long long photonsPerBatch(unsigned long long tot_nph, int *batch_divisions)
Calculates the batch size for GPU photon processing.
Evaluates the available GPU memory to determine the number of partitions. This prevents memory overflows by scaling the batch size against the footprint of the of_photon structure.
- Parameters:
tot_nph – Total number of photons to be processed in the simulation.
batch_divisions – Output parameter that receives the calculated number of batches.
- Returns:
The number of photons assigned to each individual GPU batch.
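One way to size batches against a memory budget is sketched below. It is an illustration under stated assumptions, not the logic of `photonsPerBatch` itself: the `BatchPlan` struct, the `planBatches` function, and the `usableFraction` safety margin are all hypothetical. In a CUDA build, the free-memory figure would typically come from `cudaMemGetInfo`, and the per-photon footprint from `sizeof` on the photon structure.

```cpp
#include <cstddef>

// Hypothetical result type: batch size plus the number of batches.
struct BatchPlan {
    unsigned long long photonsPerBatch;
    int divisions;
};

// Splits totalPhotons into the fewest equal batches that fit the memory budget.
// bytesPerPhoton must be greater than zero; usableFraction is an assumed safety margin.
BatchPlan planBatches(unsigned long long totalPhotons, std::size_t freeBytes,
                      std::size_t bytesPerPhoton, double usableFraction) {
    const auto budget = static_cast<unsigned long long>(freeBytes * usableFraction);
    unsigned long long capacity = budget / bytesPerPhoton;  // photons that fit at once
    if (capacity == 0) capacity = 1;                         // always make progress
    unsigned long long divisions = (totalPhotons + capacity - 1) / capacity;  // ceiling
    if (divisions == 0) divisions = 1;
    const unsigned long long perBatch = (totalPhotons + divisions - 1) / divisions;
    return {perBatch, static_cast<int>(divisions)};
}
```

For instance, 1000 photons of 8 bytes each, with 4000 free bytes and half of that usable, fit 250 at a time and therefore need 4 batches of 250.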
-
__host__ void allocatePhotonData(struct of_photonSOA *ph, unsigned long long size)
Allocates device memory for photon data using a Structure of Arrays (SoA) layout.
Initializes an of_photonSOA structure by allocating separate memory buffers on the GPU for every photon property, for both original and scattered photons in each batch.
Using a Structure of Arrays (SoA) instead of an Array of Structures (AoS) enables coalesced memory access patterns, which can significantly increase throughput for CUDA kernels.
- Parameters:
ph – Pointer to the of_photonSOA structure to be initialized.
size – The number of photon slots to allocate in each array.
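The layout difference behind the coalescing argument can be illustrated on the host. The sketch below is not the real `of_photonSOA` definition: the field names `energy` and `weight` and both struct names are hypothetical, and `std::vector` stands in for device buffers. It measures how far apart the same field of two consecutive photons sits in memory under each layout.

```cpp
#include <cstddef>
#include <vector>

// Array of Structures: all fields of one photon are adjacent (hypothetical fields).
struct PhotonAoS {
    double energy;
    double weight;
};

// Structure of Arrays: one contiguous buffer per field.
struct PhotonSoA {
    std::vector<double> energy;
    std::vector<double> weight;
    explicit PhotonSoA(std::size_t n) : energy(n), weight(n) {}
};

// Byte distance between the energy of photon 0 and photon 1 in the AoS layout.
std::ptrdiff_t aosEnergyStride(const std::vector<PhotonAoS>& p) {
    return reinterpret_cast<const char*>(&p[1].energy) -
           reinterpret_cast<const char*>(&p[0].energy);
}

// Same distance in the SoA layout: consecutive photons are consecutive doubles.
std::ptrdiff_t soaEnergyStride(const PhotonSoA& p) {
    return reinterpret_cast<const char*>(&p.energy[1]) -
           reinterpret_cast<const char*>(&p.energy[0]);
}
```

In the SoA layout the stride equals `sizeof(double)`, so neighbouring threads that read the same field of neighbouring photons touch adjacent addresses. In the AoS layout the stride is the size of the whole structure, and it grows with every field added.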
-
__host__ void freePhotonData(struct of_photonSOA *ph)
Deallocates device memory for an of_photonSOA structure.
This function releases all individual GPU memory buffers that were previously allocated by allocatePhotonData.
- Parameters:
ph – Pointer to the of_photonSOA structure whose device members are to be freed.
-
__host__ void createdPTextureObj(cudaTextureObject_t *texObj, double *dP, cudaArray_t *cuArray)
Creates a 3D texture object from a 4D data grid for the plasma primitive properties.
Converts input double data to float, uploads it to a 3D CUDA array, and initializes a texture object with point filtering and clamp addressing.
Note: The 4D data is mapped into a 3D extent by combining the nx and ny dimensions.
- Parameters:
texObj – Pointer to the resulting texture object that stores the plasma primitive properties.
dP – Input plasma properties.
cuArray – Pointer to the allocated 3D CUDA array resource.
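The two host-side steps described above, folding a 4D grid into a 3D extent and narrowing double to float, might look roughly like the sketch below. The memory layout is an assumption: the source only states that nx and ny are combined, so the row-major `[var][i][j][k]` ordering, the choice of nz as the texture width, and all names here (`Extent3D`, `foldExtent`, `index4D`, `index3D`, `toFloat`) are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Extent3D {
    std::size_t width, height, depth;
};

// Assumed folding: (nvar, nx, ny, nz) becomes depth = nvar, height = nx*ny, width = nz.
Extent3D foldExtent(std::size_t nvar, std::size_t nx, std::size_t ny, std::size_t nz) {
    return {nz, nx * ny, nvar};
}

// Linear offset in the assumed row-major 4D layout [var][i][j][k].
std::size_t index4D(std::size_t v, std::size_t i, std::size_t j, std::size_t k,
                    std::size_t nx, std::size_t ny, std::size_t nz) {
    return ((v * nx + i) * ny + j) * nz + k;
}

// Linear offset in a 3D array where x (width) varies fastest.
std::size_t index3D(const Extent3D& e, std::size_t x, std::size_t y, std::size_t z) {
    return (z * e.height + y) * e.width + x;
}

// Narrow the host data to float before uploading it.
std::vector<float> toFloat(const std::vector<double>& src) {
    std::vector<float> dst(src.size());
    std::transform(src.begin(), src.end(), dst.begin(),
                   [](double v) { return static_cast<float>(v); });
    return dst;
}
```

Under these assumptions, element (v, i, j, k) lands at texel (x = k, y = i*ny + j, z = v) with the same linear offset, so the folding requires no reordering of the data, only the type conversion.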
-
__host__ void create1DTextureObj(cudaTextureObject_t *texObj, double *ptr, cudaArray_t *cudaArray)
Creates a 1D texture object from a 1D data array.
Converts input double data to float, uploads it to a 1D CUDA array, and initializes a texture object with linear filtering and clamp addressing.
This is used for the modified Bessel function lookup table (\( K_2 \)).
- Parameters:
texObj – Pointer to the resulting texture object.
ptr – Input 1D data array.
cudaArray – Pointer to the allocated 1D CUDA array resource.
- Returns:
void
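To make the effect of linear filtering with clamp addressing concrete, the sketch below emulates a 1D fetch on the CPU. It follows the interpolation rule given in the CUDA programming guide for unnormalized coordinates (sample position shifted by 0.5, then a weighted average of the two neighbouring texels), but it ignores the reduced fixed-point precision of the hardware interpolation weight. The function name `linearFetchClamp` is hypothetical, and whether the table is sampled with normalized or unnormalized coordinates in this code base is not stated in the source.

```cpp
#include <cmath>
#include <vector>

// CPU emulation of a 1D texture fetch with linear filtering and clamp addressing.
// x is an unnormalized coordinate; texel i is centred at x = i + 0.5.
// Assumes table is non-empty.
float linearFetchClamp(const std::vector<float>& table, float x) {
    const float xb = x - 0.5f;          // shift so integers align with texel centres
    const float base = std::floor(xb);
    const float alpha = xb - base;      // interpolation weight in [0, 1)
    const long last = static_cast<long>(table.size()) - 1;
    auto clampIndex = [last](long i) { return i < 0 ? 0 : (i > last ? last : i); };
    const long i0 = static_cast<long>(base);
    return (1.0f - alpha) * table[clampIndex(i0)] + alpha * table[clampIndex(i0 + 1)];
}
```

With the table {0, 10, 20}, a fetch at x = 1.0 lies halfway between the first two texel centres and returns 5, while coordinates far outside the table return the nearest edge value because of clamping.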