====================================================== Quick Start Guide for GPUmonty ====================================================== GPUmonty is a high-performance, `CUDA `_-accelerated Monte Carlo radiative transfer (MCRT) code designed for the spectral modeling of accreting black holes based on `igrmonty `_. Prerequisites ============= Before compiling, ensure your system has the following libraries installed and accessible: * **CUDA Toolkit:** Required for the ``nvcc`` compiler and GPU kernels (`Install Guide `_). * **GNU Scientific Library (GSL):** Required for various mathematical and statistical routines (`GSL Home `_). * **Hierarchical Data Format v5 (HDF5):** Required for reading GRMHD simulation snapshots (`HDF5 Home `_). Environment Configuration ------------------------- Locate the installation paths for these libraries on your system and update the corresponding variables in the ``Makefile``: .. code-block:: makefile CUDA_PATH = /usr/local/cuda GSL_PATH = /usr/local HDF5_INCLUDE = /usr/include/hdf5/serial HDF5_LIB = /usr/lib/x86_64-linux-gnu/hdf5/serial .. note:: The makefile is set to automatically find the **compute capability** of your GPU. Compute capability refers to the CUDA architecture version of your GPU (e.g., sm_86 for Ampere), which determines which GPU instructions and optimizations are used during compilation. In case you want to do it yourself, set ```AUTO_CC ?= 0``` and look for the compute capability on `Nvidia's website `_. After you have changed these settings, compile by typing: .. code-block:: bash make -j 15 In case you want to compile for debugging, use: .. code-block:: bash make BUILD_TYPE=debug CUDA Number of Blocks Configuration ----------------------------------- The build system includes an auto-tuning feature that detects the hardware specifications of the GPU on your current machine (specifically Device 0). During compilation, the ``Makefile`` triggers a probe (defined in ``GetGPUBlocks.mk``) that calculates the optimal number of blocks based on the GPU's multiprocessor count and blocks-per-multiprocessor limit. This process automatically updates the ``N_BLOCKS`` definition located in: ``src/config.h`` By default, this feature is **enabled**. If you wish to manually set ``N_BLOCKS`` to a fixed value in the config file, you can disable the auto-tuner by setting the ``GPU_TUNING`` flag to 0: .. code-block:: bash make BLOCK_TUNING=0 .. warning:: If you are running on a High Performance Computing (HPC) cluster, **do not compile on the login/head node**, as these nodes often lack GPUs or possess different hardware than the compute nodes. To ensure the auto-tuner detects the correct GPU architecture for your run, we recommend adding the compilation step directly inside your job submission script (e.g., Slurm or PBS script). Multi-Core Acceleration (OpenMP) -------------------------------- GPUmonty benefits from **OpenMP** for CPU-bound tasks such as data pre-processing and grid initialization. To enable multi-threaded CPU execution: .. code-block:: bash export OMP_NUM_THREADS=XX Replace ``XX`` with the desired number of threads (recommended: number of physical CPU cores). Configuration Parameters ------------------------ Simulation parameters are passed to the executable via a ``.par`` file. You can find a baseline configuration in ``/gpumonty/template.par``. To run a simulation with your custom parameters: .. code-block:: bash ./gpumonty -par path/to/your_file.par .. list-table:: Runtime Parameters :widths: 20 80 :header-rows: 1 * - Parameter - Description * - ``Ns`` - **Superphoton Count**: The approximated total number of photon packets to be generated. * - ``dump`` - **Data Path**: Relative or absolute path to the input GRMHD data file. * - ``spectrum`` - **Output Name**: Filename for the output spectral data (e.g., ``sane.spec``). * - ``MBH`` - **Black Hole Mass**: Mass of the central black hole in Solar Masses (:math:`M_\odot`). * - ``M_unit`` - **Mass Unit Scale**: Normalization factor (in grams) to scale dimensionless GRMHD density to physical CGS units. * - ``tp_over_te`` - **Proton-to-Electron Temperature Ratio**: Constant ratio (:math:`T_p/T_e`) used if a dynamic heating model is not active. * - ``Thetae_max`` - **Temperature Ceiling**: Numerical cap for the dimensionless electron temperature (:math:`\Theta_e = k_B T_e / m_e c^2`). * - ``scattering`` - **Boolean for Scattering**: Enable or disable scattering processes in the simulation. Analyzing the Output ==================== To facilitate data post-processing and visualization, an example Jupyter Notebook is provided in the repository. * **Notebook Location:** ``python/example.ipynb`` * **Workflow:** This tutorial guides you through opening output files, extracting spectral arrays, and generating plots. Spectral Data Structure ----------------------- When analyzing the raw results in Python, please note the relationship between luminosity and the observer's viewing angle: * **Indexing:** The luminosity array (``nuLnu``) is multi-dimensional; each index in the array corresponds directly to one of the ``theta_bins`` defined in your simulation. GRMHD Data File for Testing =========================== To reproduce tests using the same GRMHD input employed in the **GPUmonty paper**, download the dataset from `Prather et al. (2023) `_ via the Harvard Dataverse: * **Dataset:** `Harvard Dataverse (DOI: 10.7910/DVN/XZECPF) `_ After downloading, place the GRMHD data file in the ``data/`` directory and run: .. code-block:: bash ./gpumonty -par template.par The resulting spectrum should match the expected output shown below: .. figure:: ../python/expected_spectrum.png :width: 80% :align: center :alt: Expected spectrum output for GPUmonty **Expected Result:** The resulting spectrum showing the :math:`\nu L_\nu` distribution across frequencies. LICENSE ======= ``GPUmonty`` is free software: you can redistribute it and/or modify it under the terms of the **GNU General Public License** as published by the Free Software Foundation, either **version 2** of the License, or (at your option) any later version. See the `GNU General Public License `_ for more details.