config.h
Documentation for the main macros and preprocessor definitions used in the configuration of the GPUmonty code.
Macros
Defines
-
N_BLOCKS 1792
The number of thread blocks dispatched per kernel call. This defines the grid size and should ideally be a multiple of the number of Streaming Multiprocessors (SMs) on your GPU to maximize occupancy. The value is set by automatic GPU tuning in the Makefile unless BLOCK_TUNING is set to 0 there.
-
N_THREADS 256
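The number of threads per block. This defines the block size and should be a multiple of the warp size (32 on NVIDIA GPUs) for efficient execution. The upper limit depends on the GPU architecture: CUDA allows at most 1024 threads per block on current NVIDIA GPUs, while up to 2048 threads can be resident on one SM.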
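The launch geometry implied by N_BLOCKS and N_THREADS can be checked with a few lines of host-side arithmetic. The following is a minimal sketch in plain C; the helper names and WARP_SIZE are hypothetical and not part of config.h.

```c
/* Launch-geometry arithmetic for the values documented above. */
#define N_BLOCKS  1792
#define N_THREADS 256
#define WARP_SIZE 32   /* hypothetical name; the warp size is 32 on NVIDIA GPUs */

/* Total number of threads in one kernel launch. */
long total_threads(void) {
    return (long)N_BLOCKS * N_THREADS;
}

/* Work items per thread when n_items are spread evenly over the grid
   (ceiling division). Illustrative only. */
long items_per_thread(long n_items) {
    long n = total_threads();
    return (n_items + n - 1) / n;
}

/* Nonzero when the block size is a whole number of warps. */
int block_is_warp_aligned(void) {
    return N_THREADS % WARP_SIZE == 0;
}
```

With the default values, one launch covers 1792 x 256 = 458752 threads.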
-
N_THBINS 6
Number of polar-angle (theta) bins used for the angular binning of the spectral output.
-
MINW 1.e-12
Minimum dimensionless photon frequency (or energy) used to construct the Compton cross-section lookup table
-
MAXW 1.e15
Maximum dimensionless photon frequency (or energy) used to construct the Compton cross-section lookup table
-
MINT 1.e-4
Minimum dimensionless temperature ( \( \Theta_{\rm e} \)) used to construct the Compton cross-section lookup table
-
MAXT 1.e4
Maximum dimensionless temperature ( \( \Theta_{\rm e} \)) used to construct the Compton cross-section lookup table
-
NW 220
Number of energy bins for the Compton cross-section lookup table
-
NT 80
Number of dimensionless temperature ( \( \Theta_{\rm e} \)) bins for the Compton cross-section lookup table
-
HOTCROSS "./table/hotcross.dat"
Standard path to the Compton cross-section table file
-
MAXGAMMA 12.
The integration cutoff multiplier for the electron Lorentz factor. MAXGAMMA sets how far into the high-energy tail of the electron distribution the code integrates to get an accurate result: the integration runs up to \( \gamma_{\rm e} = 1 + \text{MAXGAMMA} \, \Theta_{\rm e} \).
-
DMUE 0.05
The step size for the angular integration variable ( \( \mu_{\rm e} \)), the cosine of the angle between the photon's direction and the electron's velocity, in the Compton cross-section calculation.
-
DGAMMAE 0.05
The step size for the electron Lorentz factor ( \( \gamma_{\rm e} \)) integration in the Compton cross-section calculation.
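To illustrate how MAXGAMMA, DMUE and DGAMMAE bound and discretize the integration domain, here is a minimal midpoint-rule sketch in C. It assumes that the \( \gamma_{\rm e} \) step is DGAMMAE scaled by \( \Theta_{\rm e} \); that scaling, the function names, and the generic integrand are assumptions for illustration and are not taken from the GPUmonty source.

```c
#include <math.h>

#define MAXGAMMA 12.
#define DMUE     0.05
#define DGAMMAE  0.05

/* Upper integration limit: gamma_e = 1 + MAXGAMMA * Theta_e. */
double gamma_max(double thetae) {
    return 1. + MAXGAMMA * thetae;
}

/* Midpoint-rule sum of f(mue, gammae) over mue in [-1, 1] and
   gammae in [1, gamma_max(thetae)].
   ASSUMPTION: the gammae step is DGAMMAE * thetae. */
double integrate_mue_gammae(double (*f)(double, double), double thetae) {
    const double dgammae = DGAMMAE * thetae;
    const int n_mue = (int)lround(2. / DMUE);           /* 40 cells  */
    const int n_gam = (int)lround(MAXGAMMA / DGAMMAE);  /* 240 cells */
    double sum = 0.;
    for (int i = 0; i < n_mue; i++) {
        double mue = -1. + (i + 0.5) * DMUE;
        for (int j = 0; j < n_gam; j++) {
            double gammae = 1. + (j + 0.5) * dgammae;
            sum += f(mue, gammae) * DMUE * dgammae;
        }
    }
    return sum;
}

/* Trivial integrand: integrating 1 returns the area of the domain. */
double unit_integrand(double mue, double gammae) {
    (void)mue;
    (void)gammae;
    return 1.;
}
```

Under these assumptions each cross-section evaluation samples 40 x 240 = 9600 points, so halving either step size doubles the cost of building the table.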
-
MMW 0.5
Mean molecular weight, in units of proton mass ( \( m_p \)).
-
NDIM 4
Number of dimensions in spacetime, time + 3 spatial dimensions.
-
KRHO 0
Mnemonic for the density primitive variable ( \( \rho \)) array index.
-
UU 1
Mnemonic for the internal energy primitive variable ( \( u \)) array index.
-
U1 2
Mnemonic for the 1st-spatial velocity primitive variable ( \( u^1 \)) array index.
-
U2 3
Mnemonic for the 2nd-spatial velocity primitive variable ( \( u^2 \)) array index.
-
U3 4
Mnemonic for the 3rd-spatial velocity primitive variable ( \( u^3 \)) array index.
-
B1 5
Mnemonic for the 1st-spatial magnetic field primitive variable ( \( B^1 \)) array index.
-
B2 6
Mnemonic for the 2nd-spatial magnetic field primitive variable ( \( B^2 \)) array index.
-
B3 7
Mnemonic for the 3rd-spatial magnetic field primitive variable ( \( B^3 \)) array index.
-
KEL 8
Mnemonic for the electron temperature primitive variable ( \( T_e \)) array index.
-
KTOT 9
Mnemonic for the Ktot primitive variable array index.
-
SMALL 1.e-40
Small number to avoid division by zero or logarithm of zero.
-
NPRIM_INDEX(i, j) (j * NPRIM + i)
Calculates the flattened 1D index of element (i, j) of a 2D array stored in a 1D array, where i is the fastest-varying index with extent NPRIM.
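A short sketch of how NPRIM_INDEX addresses a flat primitive array. NPRIM is not defined on this page; the value 10 below is an assumption based on the ten mnemonics KRHO (0) through KTOT (9), and get_prim and fill_demo are hypothetical helpers.

```c
#define NPRIM 10   /* ASSUMPTION: ten primitives, KRHO (0) through KTOT (9) */
#define KRHO 0
#define UU   1
#define NPRIM_INDEX(i, j) (j * NPRIM + i)

/* Read primitive variable `var` of grid cell `cell` from a flat array. */
double get_prim(const double *p, int var, int cell) {
    return p[NPRIM_INDEX(var, cell)];
}

/* Fill a flat array so that each entry encodes its own (cell, var) pair. */
void fill_demo(double *p, int n_cells) {
    for (int cell = 0; cell < n_cells; cell++)
        for (int var = 0; var < NPRIM; var++)
            p[NPRIM_INDEX(var, cell)] = 100. * cell + var;
}
```

Because the macro body does not parenthesize its arguments, compound expressions should be wrapped by the caller, for example NPRIM_INDEX(UU, (cell + 1)).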
-
SLOOP_DEVICE(i, j, k) for (int i = 0; i < N1; i++) \ for (int j = 0; j < N2; j++) \ for (int k = 0; k < N3; k++)
Macro to loop over all spatial grid points in device or host code, depending on the compilation context.
-
DLOOP for(k=0;k<NDIM;k++)for(l=0;l<NDIM;l++)
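Macro for a nested loop over two spacetime indices k and l, each running from 0 to NDIM - 1 (used for rank-2 tensors). The loop variables k and l must be declared by the caller.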
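As an illustration of DLOOP, the sketch below contracts two four-vectors with a metric, \( g_{kl} a^k b^l \). The function name and the [NDIM][NDIM] storage of the metric are assumptions made for the example.

```c
#define NDIM 4
#define DLOOP for(k=0;k<NDIM;k++)for(l=0;l<NDIM;l++)

/* Contract two four-vectors with a metric: g_{kl} a^k b^l. */
double metric_dot(double g[NDIM][NDIM], const double a[NDIM],
                  const double b[NDIM]) {
    int k, l;   /* DLOOP does not declare its loop variables */
    double sum = 0.;
    DLOOP sum += g[k][l] * a[k] * b[l];
    return sum;
}
```

For the Minkowski metric diag(-1, 1, 1, 1) and a = b = (1, 0, 0, 0), the contraction is -1.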
-
MAX(a, b) (((a)>(b))?(a):(b))
Macro to compute the maximum of two values.
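Like any function-like macro of this form, MAX evaluates the larger argument twice. The sketch below makes that visible with a hypothetical counter function; it is only a usage caveat, not code from GPUmonty.

```c
#define MAX(a, b) (((a)>(b))?(a):(b))

int call_count = 0;   /* hypothetical counter for the demonstration */

/* Returns 1, 2, 3, ... on successive calls. */
int next_value(void) {
    return ++call_count;
}

/* Safe pattern: evaluate the expression once, then compare. */
int max_with_zero_once(void) {
    int v = next_value();
    return MAX(v, 0);
}
```

Writing MAX(next_value(), 0) directly would call next_value() twice, so arguments with side effects are best stored in a local variable first.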
-
d_lmint (log10(MINT))
Base-10 logarithm of the minimum dimensionless temperature ( \( \Theta_{\rm e} \)) for the Compton cross-section table.
-
d_lminw (log10(MINW))
Base-10 logarithm of the minimum dimensionless photon frequency for the Compton cross-section table.
-
d_lT_min (log(TMIN))
Natural logarithm of TMIN, the minimum dimensionless electron temperature ( \( \Theta_{\rm e} \)) used in both the \( K_2(1/\Theta_{\rm e}) \) calculations and the synchrotron emissivity table generation.
-
d_dlw (log10(MAXW / MINW) / NW)
Logarithmic step size for the Compton cross-section table in the photon frequency dimension.
-
d_dlT1 (1/(log(TMAX / TMIN) / (N_ESAMP)))
Precomputed inverse of the logarithmic temperature step size for emissivity and \( K_2(1/\Theta_{\rm e}) \) calculations.
-
d_dlT2 (log10(MAXT/MINT) / NT)
Logarithmic step size for the Compton cross-section table in the dimensionless temperature ( \( \Theta_{\rm e} \)) dimension.
-
KMAX (1.e7)
Upper bound of the dimensionless frequency grid used to precompute the synchrotron emissivity table.
-
KMIN (0.002)
Lower bound of the dimensionless frequency grid used to precompute the synchrotron emissivity table.
-
SMALL_VECTOR (1.e-30)
Small vector magnitude used as a threshold to avoid numerical issues in tetrad calculations.
-
EPSABS (0.)
The maximum allowable absolute error. This provides a "floor" for the error: if the result of the integral is very close to zero, a nonzero value prevents the integrator from working forever to reach an infinitely small relative error. With the default value of 0, only the relative tolerance EPSREL applies.
-
EPSREL (1.e-6)
The maximum allowable relative error (as a fraction). For example, a value of \(10^{-6}\) means you want the result to be accurate to within \(0.0001\%\).
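The two tolerances are usually combined as in GSL's adaptive integrators, which stop once the error estimate is at most max(EPSABS, EPSREL * |result|). The sketch below encodes that test; whether GPUmonty passes these values to GSL or applies the criterion itself is an assumption, and integral_converged is a hypothetical name.

```c
#include <math.h>

#define EPSABS (0.)
#define EPSREL (1.e-6)

/* Nonzero when the error estimate satisfies
   abserr <= max(EPSABS, EPSREL * |result|). */
int integral_converged(double result, double abserr) {
    double tol = EPSREL * fabs(result);
    if (tol < EPSABS) tol = EPSABS;
    return abserr <= tol;
}
```

For a result of 1.0, an error estimate of 5e-7 passes and 2e-6 does not. For a result of exactly 0, the tolerance collapses to EPSABS, which is 0 with the default value.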
-
TMIN (THETAE_MIN)
Minimum dimensionless electron temperature ( \( \Theta_{\rm e} \)) for the Bessel function ( \( K_2 \)) table calculation.
-
TMAX (1.e2)
Maximum dimensionless electron temperature ( \( \Theta_{\rm e} \)) for the Bessel function ( \( K_2 \)) table calculation.
-
SCATTERINGS_PER_PHOTON (1)
Number of scatterings each superphoton is allowed to undergo.
Note
This serves as an estimate for memory allocation and should be adjusted to the specific simulation. If the medium is optically thin, a value of 1 is sufficient; for optically thick scenarios, consider increasing it. A high number of scatterings per photon increases memory usage and the number of serialized batches.
-
NINT (40000)
Number of data points in the Nint table used for solid angle averaged synchrotron emissivity calculations.
-
BTHSQMIN (1.e-8)
Minimum value of the product \( B \Theta_{\rm e}^2 \) used in the Nint table.
-
BTHSQMAX (1.e9)
Maximum value of the product \( B \Theta_{\rm e}^2 \) used in the Nint table.
-
MAX_LAYER_SCA (3)
Maximum number of scattering layers allowed.
-
EPS (0.04)
Epsilon parameter used in the photon geodesic integration to scale the step size. Decreasing this value increases accuracy but also computational cost.
-
FAST_CPY(in, out) {out[0] = in[0]; out[1] = in[1]; out[2] = in[2]; out[3] = in[3];}
Macro for fast copying of 4-element arrays.
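A brief usage sketch for FAST_CPY. Note the argument order (source first, destination second); save_position is a hypothetical helper used only for illustration.

```c
#define NDIM 4
#define FAST_CPY(in, out) {out[0] = in[0]; out[1] = in[1]; out[2] = in[2]; out[3] = in[3];}

/* Copy a four-vector, e.g. to save a position before a trial step. */
void save_position(const double x[NDIM], double x_saved[NDIM]) {
    FAST_CPY(x, x_saved);
}
```

Because the macro expands to a brace block, writing FAST_CPY(a, b); as the body of an if that is followed by else fails to compile unless the call is wrapped in its own braces.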
-
ETOL (1.e-3)
Tolerance for the geodesic integration error.
-
MAX_ITER (2)
Maximum number of iterations in the photon push function of the geodesic integration.
-
MAXNSTEP (1280000)
Maximum number of integration steps for photon geodesic integration.