config.h

Documentation for the main macros and preprocessor definitions used in the configuration of the GPUmonty code.

Macros

Defines

KFAC (9*M_PI*ME*CL/EE)

N_BLOCKS 1792: The number of thread blocks dispatched per kernel call. This defines the grid size and should ideally be a multiple of the number of Streaming Multiprocessors (SMs) on your GPU to maximize occupancy. This is set through automatic GPU tuning in the Makefile unless BLOCK_TUNING is set to 0 in the Makefile.

N_THREADS 256: The number of threads per block. This defines the block size and should be chosen as a multiple of the warp size (32 for NVIDIA GPUs) to ensure efficient execution up to 2048 depending on the GPU architecture.

N_THBINS 6: Number of theta bins for angular distribution for the spectral output binning.

MINW 1.e-12: Minimum dimensionless photon frequency (or energy) used to construct the Compton cross-section lookup table

MAXW 1.e15: Maximum dimensionless photon frequency (or energy) used to construct the Compton cross-section lookup table

MINT 1.e-4: Minimum dimensionless temperature ( \( \Theta_{\rm e} \)) used to construct the Compton cross-section lookup table

MAXT 1.e4: Maximum dimensionless temperature ( \( \Theta_{\rm e} \)) used to construct the Compton cross-section lookup table

NW 220: Number of energy bins for the Compton cross-section lookup table

NT 80: Number of dimensionless temperature ( \( \Theta_{\rm e} \)) bins for the Compton cross-section lookup table

HOTCROSS "./table/hotcross.dat": Standard path to the Compton cross-section table file

MAXGAMMA 12.: The integration cutoff multiplier for the electron Lorentz factor. MAXGAMMA tells the code how far into the high-energy “tail” of the electron distribution it needs to integrate to get an accurate result. The integration goes up to \(\gamma_\rm{e} = 1 + \text{MAXGAMMA} \Theta_{\rm e}\).

DMUE 0.05: The step size for the angular integration variable ( \( \mu_\rm{e} \)), that represents the cosine of the angle between the photon’s direction and the electron’s velocity in the Compton cross-section calculation.

DGAMMAE 0.05: The step size resolution for the electron Lorentz factor ( \( \gamma_\rm{e} \)) integration in the Compton cross-section calculation.

MMW 0.5: Mean molecular weight, in units of proton mass ( \( m_p \)).

NDIM 4: Number of dimensions in spacetime, time + 3 spatial dimensions.

KRHO 0: Mmnemonics for the density primitive variable ( \( \rho \)) array index.

UU 1: Mmnemonics for the internal energy primitive variable ( \( u \)) array index.

U1 2: Mmnemonics for the 1st-spatial velocity primitive variable ( \( u^1 \)) array index.

U2 3: Mmnemonics for the 2nd-spatial velocity primitive variable ( \( u^2 \)) array index.

U3 4: Mmnemonics for the 3rd-spatial velocity primitive variable ( \( u^3 \)) array index.

B1 5: Mmnemonics for the 1st-spatial magnetic field primitive variable ( \( B^1 \)) array index.

B2 6: Mmnemonics for the 2nd-spatial magnetic field primitive variable ( \( B^2 \)) array index.

B3 7: Mmnemonics for the 3rd-spatial magnetic field primitive variable ( \( B^3 \)) array index.

KEL 8: Mmnemonics for the electron temperature primitive variable ( \( T_e \)) array index.

KTOT 9: Mmnemonics for the Ktot

SMALL 1.e-40: Small number to avoid division by zero or logarithm of zero.

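NPRIM_INDEX(i, j) (j * NPRIM + i): Calculates the flat 1D index of element (i, j) of a 2D array stored in a 1D format. The first argument i varies fastest and the second argument j advances in strides of NPRIM.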
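The index mnemonics and NPRIM_INDEX are meant to be used together. The sketch below fills a flat primitive array for a few cells. It assumes NPRIM is 10 (one slot for each mnemonic from KRHO to KTOT) and that the second macro argument is a cell index; both are assumptions, and `make_prims` is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>

#define KRHO 0
#define B1   5
#define NPRIM 10                           /* ASSUMPTION: KRHO..KTOT, 10 slots */
#define NPRIM_INDEX(i, j) (j * NPRIM + i)  /* as documented above */

/* Hypothetical helper: allocate a flat primitive array for n_cells cells and
   set the density and first magnetic field component of each cell. */
static double *make_prims(int n_cells)
{
    double *p = calloc((size_t)n_cells * NPRIM, sizeof *p);
    assert(p != NULL);
    for (int cell = 0; cell < n_cells; cell++) {
        p[NPRIM_INDEX(KRHO, cell)] = 1.0 + cell;
        p[NPRIM_INDEX(B1, cell)]   = 0.1 * cell;
    }
    return p;  /* caller frees */
}
```

Because the macro arguments are not parenthesized in the definition shown, it is safest to pass plain variables or constants: an expression such as `NPRIM_INDEX(B1, cell + 1)` would expand to `cell + 1 * NPRIM + B1`.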

SLOOP_DEVICE(i, j, k) for (int i = 0; i < N1; i++) for (int j = 0; j < N2; j++) for (int k = 0; k < N3; k++): Macro to loop over all spatial grid points in device or host code, depending on the compilation context.
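A short usage sketch for SLOOP_DEVICE follows. The grid dimensions N1, N2 and N3 are given small hypothetical values here, and the flat index layout (k varying fastest) is an assumption made for the example only.

```c
#include <assert.h>

#define N1 4   /* hypothetical grid sizes for illustration */
#define N2 3
#define N3 2

#define SLOOP_DEVICE(i, j, k)        \
    for (int i = 0; i < N1; i++)     \
    for (int j = 0; j < N2; j++)     \
    for (int k = 0; k < N3; k++)

/* Visit every grid cell once and count them. */
static int count_cells(void)
{
    int n = 0;
    SLOOP_DEVICE(i, j, k) {
        n++;
    }
    assert(n == N1 * N2 * N3);
    return n;
}

/* Flat index of the last cell visited, ASSUMING k varies fastest. */
static int last_flat_index(void)
{
    int idx = -1;
    SLOOP_DEVICE(i, j, k) {
        idx = (i * N2 + j) * N3 + k;
    }
    return idx;
}
```

The macro declares its own loop variables, so the names passed as arguments must not already be declared in the enclosing scope.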

DLOOP for(k=0;k<NDIM;k++)for(l=0;l<NDIM;l++): Macro for a nested loop over all pairs of spacetime indices (k, l), used for rank-2 tensors. The loop variables k and l must already be declared in the enclosing scope.
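As an illustration of DLOOP, the sketch below contracts a rank-2 tensor with a four-vector, \( g_{kl} u^k u^l \). The function name `contract` is hypothetical and the Minkowski metric in the usage note is only a convenient test case.

```c
#include <assert.h>

#define NDIM 4
#define DLOOP for(k=0;k<NDIM;k++)for(l=0;l<NDIM;l++)

/* Compute g_{kl} u^k u^l by summing over both spacetime indices. */
static double contract(double g[NDIM][NDIM], const double u[NDIM])
{
    int k, l;        /* DLOOP expects k and l to be declared by the caller */
    double s = 0.;

    assert(g != 0 && u != 0);
    DLOOP s += g[k][l] * u[k] * u[l];
    return s;
}
```

For the Minkowski metric diag(-1, 1, 1, 1) and the four-velocity of an observer at rest, u = (1, 0, 0, 0), the contraction gives -1.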

MAX(a, b) (((a)>(b))?(a):(b)): Macro to compute the maximum of two values.

d_lmint (log10(MINT)): Base-10 logarithm of the minimum dimensionless temperature ( \( \Theta_{\rm e} \)) for the Compton cross-section table.

d_lminw (log10(MINW)): Base-10 logarithm of the minimum dimensionless photon frequency for the Compton cross-section table.

d_lT_min (log(TMIN)): Natural logarithm of TMIN, the minimum dimensionless electron temperature ( \( \Theta_{\rm e} \)) used in both \( K_2(1/\Theta_{\rm e}) \) calculations and synchrotron emissivity table generation.

d_dlw (log10(MAXW / MINW) / NW): Logarithmic step size for the Compton cross-section table in the photon frequency dimension.

d_dlT1 (1/(log(TMAX / TMIN) / (N_ESAMP))): Precomputed inverse of the logarithmic temperature step size for emissivity and \( K_2(1/\Theta_{\rm e}) \) calculations.

d_dlT2 (log10(MAXT/MINT) / NT): Logarithmic step size for the Compton cross-section table in the dimensionless temperature ( \( \Theta_{\rm e} \)) dimension.

KMAX (1.e7): Upper boundary of the dimensionless frequency grid used to precompute the synchrotron emissivity table.

KMIN (0.002): Lower boundary of the dimensionless frequency grid used to precompute the synchrotron emissivity table.

SMALL_VECTOR (1.e-30): Small vector magnitude used to avoid numerical issues in tetrad calculations.

EPSABS (0.): The maximum allowable absolute error of the numerical integration. A nonzero value provides a floor for the error: if the result of the integral is very close to zero, it prevents the integrator from iterating indefinitely in pursuit of an ever smaller relative error. With the default value of 0, no absolute floor is applied.

EPSREL (1.e-6): The maximum allowable relative error (as a fraction). For example, a value of \(10^{-6}\) means you want the result to be accurate to within \(0.0001\%\).
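A common way to combine the two tolerances in adaptive integrators (for example, the convention used by the GSL integration routines) is to stop once the error estimate satisfies \( |\text{err}| \le \max(\text{EPSABS}, \text{EPSREL}\,|I|) \). The sketch below encodes that rule. Whether GPUmonty applies exactly this criterion is an assumption, and `tolerance_met` is a hypothetical name.

```c
#include <assert.h>

#define EPSABS 0.
#define EPSREL 1.e-6

/* Absolute value without depending on the math library. */
static double absd(double x)
{
    return x < 0. ? -x : x;
}

/* Return 1 if the error estimate is within max(EPSABS, EPSREL * |result|),
   0 otherwise. ASSUMPTION: this mirrors the usual GSL-style stopping rule. */
static int tolerance_met(double result, double abserr)
{
    assert(abserr >= 0.);
    double tol = EPSREL * absd(result);
    if (tol < EPSABS)
        tol = EPSABS;
    return abserr <= tol;
}
```

Under this rule, with EPSABS equal to 0 an integral whose true value is exactly zero never meets the tolerance with a nonzero error estimate, which is the situation a nonzero EPSABS is meant to guard against.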

TMIN (THETAE_MIN): Minimum dimensionless electron temperature ( \( \Theta_{\rm e} \)) for the Bessel function ( \( K_2 \)) table calculation.

TMAX (1.e2): Maximum dimensionless electron temperature ( \( \Theta_{\rm e} \)) for the Bessel function ( \( K_2 \)) table calculation.

SCATTERINGS_PER_PHOTON (1): Number of scatterings each superphoton is allowed to undergo.

Note

This value serves as an estimate for memory allocation and should be adjusted to the requirements of the specific simulation. If the medium is optically thin, a value of 1 is sufficient. For optically thick scenarios, consider increasing it. A high number of scatterings per photon may increase both memory usage and the number of serialized batches.

NINT (40000): Number of data points in the Nint table used for solid-angle-averaged synchrotron emissivity calculations.

BTHSQMIN (1.e-8): Minimum value of the product \( B \Theta_{\rm e}^2 \) used in the Nint table.

BTHSQMAX (1.e9): Maximum value of the product \( B \Theta_{\rm e}^2 \) used in the Nint table.

MAX_LAYER_SCA (3): Maximum number of scattering layers allowed.

EPS (0.04): Epsilon parameter used in the photon geodesic integration to scale the step size. Decreasing this value increases accuracy but also computational cost.

FAST_CPY(in, out) {out[0] = in[0]; out[1] = in[1]; out[2] = in[2]; out[3] = in[3];}: Macro for fast copying of 4-element arrays.
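A minimal usage sketch for FAST_CPY, saving a four-component vector before it is modified. The function `save_state` and the array names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NDIM 4
#define FAST_CPY(in, out) {out[0] = in[0]; out[1] = in[1]; out[2] = in[2]; out[3] = in[3];}

/* Copy the four components of X into Xsaved, element by element. */
static void save_state(const double X[NDIM], double Xsaved[NDIM])
{
    assert(X != NULL && Xsaved != NULL);
    FAST_CPY(X, Xsaved);
}
```

Since the macro expands to a brace-enclosed block, writing `FAST_CPY(a, b);` as the unbraced body of an `if` that is followed by `else` will not compile; enclosing the call in braces avoids the problem.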

ETOL (1.e-3): Tolerance for the geodesic integration error.

MAX_ITER (2): Maximum number of iterations used by the photon push function during geodesic integration.

MAXNSTEP (1280000): Maximum number of integration steps for photon geodesic integration.