config.h

Documentation for the main macros and preprocessor definitions used in the configuration of the GPUmonty code.

Macros

Defines

KFAC (9*M_PI*ME*CL/EE)
N_BLOCKS 1792

The number of thread blocks dispatched per kernel call. This defines the grid size and should ideally be a multiple of the number of Streaming Multiprocessors (SMs) on your GPU to maximize occupancy. The value is set by automatic GPU tuning in the Makefile unless BLOCK_TUNING is set to 0 there.

N_THREADS 256

The number of threads per block. This defines the block size and should be a multiple of the warp size (32 for NVIDIA GPUs) to ensure efficient execution. The upper limit depends on the GPU architecture (current NVIDIA GPUs allow at most 1024 threads per block).
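As a rough illustration of how N_BLOCKS and N_THREADS combine, the sketch below computes the total number of threads in one launch and the largest number of work items a single thread would receive if the work were spread evenly over the grid. The helper names `total_threads` and `items_per_thread`, and the even-distribution scheme itself, are assumptions made for illustration and are not taken from the GPUmonty source.

```c
#include <assert.h>
#include <stddef.h>

#define N_BLOCKS  1792 /* documented default; normally set by the Makefile tuning */
#define N_THREADS 256  /* documented default */
#define WARP_SIZE 32   /* warp size on NVIDIA GPUs */

/* Total number of threads in one kernel launch (grid size * block size). */
static size_t total_threads(void)
{
    return (size_t)N_BLOCKS * (size_t)N_THREADS;
}

/* Hypothetical helper: the largest number of items any single thread
 * handles when n_items are spread evenly over the grid (ceiling division). */
static size_t items_per_thread(size_t n_items)
{
    size_t n = total_threads();
    assert(N_THREADS % WARP_SIZE == 0); /* block is a whole number of warps */
    return (n_items + n - 1) / n;
}
```

With the documented defaults, one launch holds 1792 * 256 = 458752 threads, so a hypothetical batch of one million items would give each thread at most three items.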

N_THBINS 6

Number of polar-angle (theta) bins used for the angular binning of the spectral output.

MINW 1.e-12

Minimum dimensionless photon frequency (or energy) used to construct the Compton cross-section lookup table.

MAXW 1.e15

Maximum dimensionless photon frequency (or energy) used to construct the Compton cross-section lookup table.

MINT 1.e-4

Minimum dimensionless temperature ( \( \Theta_{\rm e} \)) used to construct the Compton cross-section lookup table.

MAXT 1.e4

Maximum dimensionless temperature ( \( \Theta_{\rm e} \)) used to construct the Compton cross-section lookup table.

NW 220

Number of photon energy bins for the Compton cross-section lookup table.

NT 80

Number of dimensionless temperature ( \( \Theta_{\rm e} \)) bins for the Compton cross-section lookup table.

HOTCROSS "./table/hotcross.dat"

Standard path to the Compton cross-section table file.

MAXGAMMA 12.

The integration cutoff multiplier for the electron Lorentz factor. MAXGAMMA tells the code how far into the high-energy “tail” of the electron distribution it needs to integrate to get an accurate result. The integration goes up to \( \gamma_{\rm e} = 1 + \text{MAXGAMMA} \, \Theta_{\rm e} \).

DMUE 0.05

The step size for the angular integration variable ( \( \mu_{\rm e} \)), the cosine of the angle between the photon’s direction and the electron’s velocity, in the Compton cross-section calculation.

DGAMMAE 0.05

The step size for the electron Lorentz factor ( \( \gamma_{\rm e} \)) integration in the Compton cross-section calculation.
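To make the roles of MAXGAMMA, DMUE and DGAMMAE concrete, the sketch below sets up a midpoint-rule integration over the angle cosine in [-1, 1] and the Lorentz factor in [1, 1 + MAXGAMMA * Θe]. It is an illustration only: the function names are hypothetical, the integrand is a placeholder, and scaling the Lorentz-factor step as DGAMMAE * Θe is an assumption that should be checked against the actual GPUmonty source.

```c
#include <assert.h>

#define MAXGAMMA 12.
#define DMUE     0.05
#define DGAMMAE  0.05

/* Placeholder integrand. The real one would be the differential
 * cross-section weighted by the electron distribution function. */
static double unit_integrand(double mue, double gammae)
{
    (void)mue;
    (void)gammae;
    return 1.0;
}

/* Midpoint-rule integral of f over mu_e in [-1, 1] and
 * gamma_e in [1, 1 + MAXGAMMA * thetae].
 * Assumption: the gamma_e step is DGAMMAE * thetae. */
static double integrate_mue_gammae(double thetae, double (*f)(double, double))
{
    assert(thetae > 0.0);
    const double dgammae = DGAMMAE * thetae;
    const int n_mue = (int)(2.0 / DMUE + 0.5);            /* 40 steps  */
    const int n_gammae = (int)(MAXGAMMA / DGAMMAE + 0.5); /* 240 steps */
    double sum = 0.0;

    for (int i = 0; i < n_mue; i++) {
        double mue = -1.0 + (i + 0.5) * DMUE;          /* cell midpoint */
        for (int j = 0; j < n_gammae; j++) {
            double gammae = 1.0 + (j + 0.5) * dgammae; /* cell midpoint */
            sum += f(mue, gammae) * DMUE * dgammae;
        }
    }
    return sum;
}
```

With the placeholder integrand the result is simply the area of the integration domain, 2 * MAXGAMMA * Θe, which makes a convenient sanity check. Under these assumptions each table entry costs 40 * 240 = 9600 integrand evaluations, so halving either step size doubles the cost.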

MMW 0.5

Mean molecular weight, in units of proton mass ( \( m_p \)).

NDIM 4

Number of dimensions in spacetime, time + 3 spatial dimensions.

KRHO 0

Mnemonic for the density primitive variable ( \( \rho \)) array index.

UU 1

Mnemonic for the internal energy primitive variable ( \( u \)) array index.

U1 2

Mnemonic for the 1st-spatial velocity primitive variable ( \( u^1 \)) array index.

U2 3

Mnemonic for the 2nd-spatial velocity primitive variable ( \( u^2 \)) array index.

U3 4

Mnemonic for the 3rd-spatial velocity primitive variable ( \( u^3 \)) array index.

B1 5

Mnemonic for the 1st-spatial magnetic field primitive variable ( \( B^1 \)) array index.

B2 6

Mnemonic for the 2nd-spatial magnetic field primitive variable ( \( B^2 \)) array index.

B3 7

Mnemonic for the 3rd-spatial magnetic field primitive variable ( \( B^3 \)) array index.

KEL 8

Mnemonic for the electron temperature primitive variable ( \( T_e \)) array index.

KTOT 9

Mnemonic for the Ktot array index.

SMALL 1.e-40

Small number to avoid division by zero or logarithm of zero.

NPRIM_INDEX(i, j) (j * NPRIM + i)

Calculates the flat (1D) index of element (i, j) of a 2D array stored in a 1D format, with i as the fastest-varying index and NPRIM as the row length.
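A short sketch of how the macro might be used to read one primitive variable of one grid cell from a flat array. The value NPRIM = 10 (inferred from the ten mnemonics KRHO through KTOT), the reading of i as the variable index and j as the cell index, and the helper `get_prim` are all assumptions made for illustration.

```c
#include <assert.h>

#define NPRIM 10 /* assumption: ten primitive variables, KRHO..KTOT */
#define UU    1
#define KEL   8
#define NPRIM_INDEX(i, j) (j * NPRIM + i) /* as documented above */

/* Hypothetical helper: read primitive variable `var` of cell `cell`
 * from a flat array holding NPRIM consecutive values per cell. */
static double get_prim(const double *p, int var, int cell)
{
    assert(var >= 0 && var < NPRIM);
    return p[NPRIM_INDEX(var, cell)];
}
```

Because the macro arguments are not parenthesized in the definition shown, a compound argument such as `c + 1` would expand to `(c + 1 * NPRIM + i)`. Wrapping such arguments in parentheses at the call site, as in `NPRIM_INDEX(KEL, (c + 1))`, avoids the problem.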

SLOOP_DEVICE(i, j, k) for (int i = 0; i < N1; i++) for (int j = 0; j < N2; j++) for (int k = 0; k < N3; k++)

Macro to loop over all spatial grid points in device or host code, depending on the compilation context.
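The sketch below shows the macro driving a loop body. The grid sizes N1, N2 and N3 are given small hypothetical values here (in the real code they describe the simulation grid), and `count_cells` is an illustrative helper, not part of GPUmonty.

```c
/* Hypothetical grid dimensions, for illustration only. */
#define N1 4
#define N2 3
#define N3 2

#define SLOOP_DEVICE(i, j, k)                \
    for (int i = 0; i < N1; i++)             \
        for (int j = 0; j < N2; j++)         \
            for (int k = 0; k < N3; k++)

/* Visit every spatial grid point once and count the visits. */
static int count_cells(void)
{
    int n = 0;
    SLOOP_DEVICE(i, j, k) {
        n++; /* i, j, k are the cell indices inside the body */
    }
    return n;
}
```

The macro declares its own loop variables, so the names passed to it must not already be declared in the same scope.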

DLOOP for(k=0;k<NDIM;k++)for(l=0;l<NDIM;l++)

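Macro to loop over all pairs of spacetime indices (k, l), as needed for rank-2 tensors. The loop variables k and l are not declared by the macro and must exist in the enclosing scope.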
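As an example of the double loop, the sketch below contracts two four-vectors with a metric, \( g_{kl} a^k b^l \). The function name `contract` is hypothetical, and the integer variables k and l are declared explicitly because the macro itself does not declare them.

```c
#define NDIM 4
#define DLOOP for(k=0;k<NDIM;k++)for(l=0;l<NDIM;l++)

/* Hypothetical helper: returns g_{kl} a^k b^l. */
static double contract(double g[NDIM][NDIM],
                       const double a[NDIM], const double b[NDIM])
{
    int k, l; /* DLOOP expects these to be in scope */
    double s = 0.0;

    DLOOP s += g[k][l] * a[k] * b[l];
    return s;
}
```

For the Minkowski metric diag(-1, 1, 1, 1) and the vectors a = (2, 1, 0, 0) and b = (1, 3, 0, 0), the contraction is -2 + 3 = 1.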

MAX(a, b) (((a)>(b))?(a):(b))

Macro to compute the maximum of two values.

d_lmint (log10(MINT))

Base-10 logarithm of the minimum dimensionless temperature ( \( \Theta_{\rm e} \)) for the Compton cross-section table.

d_lminw (log10(MINW))

Base-10 logarithm of the minimum dimensionless photon frequency for the Compton cross-section table.

d_lT_min (log(TMIN))

Natural logarithm of TMIN, the minimum dimensionless electron temperature ( \( \Theta_{\rm e} \)) used in both \( K_2(1/\Theta_{\rm e}) \) calculations and synchrotron emissivity table generation.

d_dlw (log10(MAXW / MINW) / NW)

Logarithmic step size for the Compton cross-section table in the photon frequency dimension.

d_dlT1 (1/(log(TMAX / TMIN) / (N_ESAMP)))

Precomputed inverse of the logarithmic temperature step size for emissivity and \( K_2(1/\Theta_{\rm e}) \) calculations.

d_dlT2 (log10(MAXT/MINT) / NT)

Logarithmic step size for the Compton cross-section table in the dimensionless temperature ( \( \Theta_{\rm e} \)) dimension.

KMAX (1.e7)

Upper bound of the dimensionless frequency grid used to precompute the synchrotron emissivity table.

KMIN (0.002)

Lower bound of the dimensionless frequency grid used to precompute the synchrotron emissivity table.

SMALL_VECTOR (1.e-30)

Small vector magnitude used as a threshold to avoid numerical issues in tetrad calculations.

EPSABS (0.)

The maximum allowable absolute error of a numerical integral. This provides a “floor” for the error: if the result of the integral is very close to zero, it prevents the integrator from working forever to find an infinitely small relative error. With the value 0 shown here, the floor is switched off and only EPSREL limits the error.

EPSREL (1.e-6)

The maximum allowable relative error (as a fraction). For example, a value of \(10^{-6}\) means you want the result to be accurate to within \(0.0001\%\).
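How the two tolerances interact can be sketched with the stopping rule used by many adaptive integrators (for example, GSL's adaptive quadrature routines): stop once the estimated error is no larger than max(EPSABS, EPSREL * |result|). The function `converged` is hypothetical, and the assumption that GPUmonty's integrator applies exactly this rule should be verified against the source.

```c
#define EPSABS (0.)
#define EPSREL (1.e-6)
#define MAX(a, b) (((a)>(b))?(a):(b))

/* Hypothetical helper: returns 1 when the estimated absolute error
 * `abserr` satisfies the combined tolerance for `result`, else 0. */
static int converged(double result, double abserr)
{
    double magnitude = (result < 0.0) ? -result : result;
    return abserr <= MAX(EPSABS, EPSREL * magnitude);
}
```

Under this rule, a result of 1.0 is accepted with an error estimate of 5e-7 but rejected with 2e-6. Since EPSABS is 0, a result of exactly zero is accepted only when the error estimate is also zero.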

TMIN (THETAE_MIN)

Minimum dimensionless electron temperature ( \( \Theta_{\rm e} \)) for the Bessel function ( \( K_2 \)) table calculation.

TMAX (1.e2)

Maximum dimensionless electron temperature ( \( \Theta_{\rm e} \)) for the Bessel function ( \( K_2 \)) table calculation.

SCATTERINGS_PER_PHOTON (1)

Number of scatterings each superphoton is allowed to undergo.

Note

This value serves as an estimate for memory allocation and should be adjusted to the specific simulation requirements. If the medium is optically thin, a value of 1 is sufficient; for optically thick scenarios, consider increasing it. A high number of scatterings per photon increases memory usage and the number of serialized batches.

NINT (40000)

Number of data points in the Nint table used for solid angle averaged synchrotron emissivity calculations.

BTHSQMIN (1.e-8)

Minimum value of the product \( B \Theta_{\rm e}^2 \) used in the Nint table.

BTHSQMAX (1.e9)

Maximum value of the product \( B \Theta_{\rm e}^2 \) used in the Nint table.

MAX_LAYER_SCA (3)

Maximum number of scattering layers allowed.

EPS (0.04)

Epsilon parameter that scales the step size in the photon geodesic integration. Decreasing this value increases accuracy but also computational cost.

FAST_CPY(in, out) {out[0] = in[0]; out[1] = in[1]; out[2] = in[2]; out[3] = in[3];}

Macro for fast copying of 4-element arrays.
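A brief usage sketch follows. Note the argument order: the source comes first and the destination second, the opposite of `memcpy`. The function `save_position` and its parameter names are hypothetical.

```c
#define FAST_CPY(in, out) {out[0] = in[0]; out[1] = in[1]; out[2] = in[2]; out[3] = in[3];}

/* Hypothetical helper: copy a 4-component coordinate vector. */
static void save_position(const double X[4], double Xsaved[4])
{
    FAST_CPY(X, Xsaved); /* Xsaved[n] = X[n] for n = 0..3 */
}
```

Because the macro expands to a brace-enclosed block, writing `if (cond) FAST_CPY(a, b); else ...` without braces around the call will not compile. Enclosing the call in its own braces avoids this.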

ETOL (1.e-3)

Tolerance for the geodesic integration error.

MAX_ITER (2)

Maximum number of iterations in the photon push function used for geodesic integration.

MAXNSTEP (1280000)

Maximum number of integration steps for photon geodesic integration.