Developer Documentation ======================= If you are new to developing ``21cmFAST``, please read the :ref:`contributing:Contributing` section *first*, which outlines the general concepts for contributing to development, and provides a step-by-step walkthrough for getting setup. This page lists some more detailed notes which may be helpful through the development process. Compiling for debugging ----------------------- When developing, it is usually a good idea to compile the underlying C code in ``DEBUG`` mode. This may allow extra print statements in the C, but also will allow running the C under ``valgrind`` or ``gdb``. To do this:: $ DEBUG=True pip install -e . See :ref:`installation:Installation` for more installation options. Developing the C Code --------------------- In this section we outline how one might go about modifying/extending the C code and ensuring that the extension is compatible with the wrapper provided here. It is critical that you run all tests after modifying _anything_ (and see the section below about running with valgrind). When changing C code, before testing, ensure that the new C code is compiled into your environment by running:: $ rm -rf build $ pip install . Note that using a developer install (`-e`) is not recommended as it stores compiled objects in the working directory which don't get updated as you change code, and can cause problems later. There are two main purposes you may want to write some C code: 1. An external plugin/extension which uses the output data from 21cmFAST. 2. Modifying the internal C code of 21cmFAST. 21cmFAST currently provides no support for external plugins/extensions. It is entirely possible to write your own C code to do whatever you want with the output data, but we don't provide any wrapping structure for you to do this, you will need to write your own. Internally, 21cmFAST uses the ``cffi`` library to aid the wrapping of the C code into Python. You don't need to do the same, though we recommend it. If your desired "extension" is something that needs to operate in-between steps of 21cmFAST, you can use the generators or single-field fucntions (see :ref:`tutorials/coeval_cubes`) to stop your calculation at the desired point, making any modifications to the structures before continuing the run. Modifying the C-code of 21cmFAST should be relatively simple. If your changes are entirely internal to a given function, then nothing extra needs to be done. A little more work has to be done if the modifications add/remove input parameters or the output structure. If any of the input structures are modified (i.e. an extra parameter added to it), then the corresponding class in ``py21cmfast.wrapper`` must be modified, usually simply to add the new parameter to the existing attributes using any desired functionality provided in the ``attrs`` package. For instance, if a new variable ``some_param`` was added to the ``matter_params`` struct in the ``ComputeInitialConditions`` C function, then the ``MatterParams`` class in the wrapper would be modified, adding``some_param=attrs.field(default=)`` to the class. If the default value of the parameter is dependent on another parameter, it should be placed after it's depenencies in the structure, and be given a default function vit the attrs default decorator e.g:: @DIM.default def _dim_default(self): return 3 * self.HII_DIM Here we return either the explicitly set ``DIM``, or 3 by the ``HII_DIM``. In addition, if the new parameter is not settable -- if it is completely determined by other parameters -- mark it as private with a leading underscore or make it a ``@property`` of the class. If you modify an output struct, which usually house a number of array quantities (often float pointers, but not necessarily), then you'll again need to modify the corresponding class in the wrapper. In particular, you'll need to add an entry for that particular array as an ``_arrayfield()`` to the attributes the class and set its shape in the ``new()`` method. Any output which depends on the new array must also have it added to its ``get_required_input_arrays()`` method. If the C backend uses new input or output fields, they must be added to the structs in the header files ``_inputparams_wrapper.h`` or ``_outputstructs_wrapper.h``. C Function Standards ~~~~~~~~~~~~~~~~~~~~ The C-level functions are split into two groups -- low-level "private" functions, and higher-level "public" or "API" functions. All API-level functions are callable from python (but may also be called from other C functions). All API-level functions are currently prototyped in ``_functionprototypes_wrapper.h``. To enable consistency of error-checking in Python (and a reasonable standard for any kind of code), we enforce that any API-level function which computes an output structure must return an integer status. In addition, any "return" objects must be modified in-place (i.e. passed as pointers). This enables Python to control the memory access of these variables, and also to receive proper error statuses (see below for how we do exception handling). We also adhere to the convention that "output" variables should be passed to the function as its last argument(s). In the case that _only_ the last argument is meant to be "output", there exists a simple wrapper ``_call_c_simple`` in ``wrapper/_utils.py`` that will neatly handle the calling of the function in an intuitive pythonic way. Coordinate Systems & Grid Resolution ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``21cmFAST`` Uses two grid resolutions specified in ``SimulationOptions`` during the simulation: A high-resolution ``DIM`` used to specify the ``InitialConditions``, which is downsampled to ``HII_DIM`` in the ``PerturbedField`` and is used for all other fields. The two grids are aligned as follows (using a ratio of ``DIM`` / ``HII_DIM`` = 4 as an example): .. image:: ./images/grid_schematic_4.png :width: 400px :align: center :alt: Schematic of the grid alignment When the ratio of the two grid dimensions is an integer, the centres of the low-resolution grid cells will always align with the centre of a high resolution cell. This allows us to easily downsample the grids using Fourier-space filters, since we can always place a low-pass filter at the centre of a high-resolution cell. Continuous positions such as ``HaloCatalog.halo_coords`` are specified on the range [0, ``BOX_LEN``]. And wrapping occurs such that the low-resolution cell at index 0 will contain the coordinates [``BOX_LEN(1 - 1/(2*HII_DIM))``, ``BOX_LEN``] and [0, ``BOX_LEN/(2*HII_DIM)``]. Functions and macros regarding indexing and positioning in both continuous space and discrete grids can be found in ``indexing.c`` and ``indexing.h``. Performance and Memory Profiling ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The simplest way to profile the code is using the package `py-spy ` and `speedscope `, simply install py-spy and run ``py-spy record --format speedscope -o [PATH_TO_PROFILE].json -- python [YOUR_PYTHON_SCRIPT].py``. Then drag the output JSON file into the speedscope web app to visualize the profiling data. Profiling the backend functions requires you to pass the ``--native`` flag to ``py-spy``, and more information will be available if you set ``PROFILE=TRUE`` when compiling ``21cmFAST``, which enables debug symbols. (Note 12/08/25: We have found that some versions of py-spy may have issues using the the ``--native`` flag, displaying stack errors. if you encounter this, try installing ``21cmFAST`` and ``py-spy`` on a Python 3.11 environment.) Memory usage can also be profiled simply using `memray `. After installing memray: ``memray run -o [PATH_TO_PROFILE] python [YOUR_PYTHON_SCRIPT].py`` ``memray flamegraph [PATH_TO_PROFILE]`` This will generate a HTML file that can be opened in a web browser to visualize memory usage. Running with gperftools ~~~~~~~~~~~~~~~~~~~~~~~ profiling can also be achieved using gperftools by compiling 21cmfast with the ``PROFILE`` flag then calling whichever script you use to run ``21cmFAST`` with the ``CPUPROFILE`` environment variable env CPUPROFILE=[PATH-TO-PROFILE] python some_script.py Profiles may then be viewed using ``pprof``, and the shared library file (located in the build directory after compiling), see https://gperftools.github.io/gperftools/cpuprofile.html for more information on pprof. Profile graphs may have unnamed functions at the top-level, representing the wrapper. However the lower-level functions will be mapped correctly. Running with Valgrind ~~~~~~~~~~~~~~~~~~~~~ Much more detailed performance statistics can be obtained using valgrind, which explicitly checks for memory leaks. To do this, install ``valgrind`` (we have tested v3.14+), which is probably available via your package manager. We provide a suppression file for ``valgrind`` in the ``devel/`` directory of the main repository. It is ideal if you install a development-version of python especially for running these tests. To do this, download the version of python you want and then configure/install with:: $ ./configure --prefix=/ --without-pymalloc --with-pydebug --with-valgrind $ make; make install Construct a ``virtualenv`` on top of this installation, and create your environment, and install all requirements. If you do not wish to run with a modified version of python, you may continue with your usual version, but may get some extra cruft in the output. If running with Python version > 3.6, consider running with environment variable ``PYTHONMALLOC=malloc`` (see https://stackoverflow.com/questions/20112989/how-to-use-valgrind-with-python ). The general pattern for using valgrind with python is:: $ valgrind --tool=memcheck --track-origins=yes --leak-check=full --suppressions=devel/valgrind-suppress-all-but-c.supp One useful command is to run valgrind over the test suite (from the top-level repo directory):: $ valgrind --tool=memcheck --track-origins=yes --leak-check=full --suppressions=devel/valgrind-suppress-all-but-c.supp pytest While we will attempt to keep the suppression file updated to the best of our knowledge so that only relevant leaks and errors are reported, you will likely have to do a bit of digging to find the relevant parts. Valgrind will likely run very slowly, and sometimes you will know already which exact tests are those which may have problems, or are relevant to your particular changes. To run these:: $ PYTHONMALLOC=malloc valgrind --tool=memcheck --track-origins=yes --leak-check=full --suppressions=devel/valgrind-suppress-all-but-c.supp pytest -v tests/:: > valgrind.out 2>&1 Note that we also routed the stderr output to a file, which is useful because it can be quite voluminous. There is a python script, ``devel/filter_valgrind.py`` which can be run over the output (`valgrind.out` in the above command) to filter it down to only have stuff from 21cmfast in it. Producing Integration Test Data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There are bunch of so-called "integration tests", which rely on previously-produced data. To produce this data, run ``python tests/produce_integration_test_data.py``. Furthermore, this data should only be produced with good reason -- the idea is to keep it static while the code changes, to have something steady to compare to. If a particular PR fixes a bug which affects a certain tests' data, then that data should be re-run, in the context of the PR, so it can be explained. Logging in C ~~~~~~~~~~~~ The C code has a header file ``logging.h``. The C code should *never* contain bare print-statements -- everything should be formally logged, so that the different levels can be printed to screen correctly. The levels are defined in ``logging.h``, and include levels such as ``INFO``, ``WARNING`` and ``DEBUG``. Each level has a corresponding macro that starts with ``LOG_``. Thus to log run-time information to stdout, you would use ``LOG_INFO("message");``. Note that the message does not require a final newline character. While there are several exceptions, a rule-of-thumb is to use ``DEBUG`` for output structure level information (printed once per compute functino), ``SUPER_DEBUG`` for smaller loop information (such as things once per filter radius), and ``ULTRA_DEBUG`` for individual cells (hopefully not *every* cell). Exception handling in C ~~~~~~~~~~~~~~~~~~~~~~~ There are various places that things can go wrong in the C code, and they need to be handled gracefully so that Python knows what to do with it (rather than just quitting!). We use the simple ``cexcept.h`` header file from http://www.nicemice.net/cexcept/ to enable a simple form of exception handling. That file itself should **not be edited**. There is another header -- ``exceptions.h`` -- that defines how we use exceptions throughout ``21cmFAST``. Any time an error arises that can be understood, the developer should add a ``Throw ;`` line. The ``ErrorKind`` can be any of the kinds defined in ``exceptions.h`` (eg. ``GSLError`` or ``ValueError``). These are just integers. Any C function that computes an output struct -- *must* be globally wrapped in a ``Try {} Catch(error_code) {}`` block. See ``GenerateICs.c`` for an example. Most of the code should be in the ``Try`` block. Anything that does a ``Throw`` at any level of the call stack within that ``Try`` will trigger a jump to the ``Catch``. The ``error_code`` is the integer that was thrown. Typically, one will perhaps want to do some cleanup here, and then finally *return* the error code. Python knows about the exit codes it can expect to receive, and will raise Python exceptions accordingly. From the python side, two main kinds of exceptions could be raised, depending on the error code returned from C. The lesser exception is called a ``ParameterError``, and is supposed to indicate an error that happened merely because the parameters that were input to the calculation were just too extreme to handle. In the case of something like an automatic Monte Carlo algorithm that's iterating over random parameters, one would *usually* want to just keep going at this point, because perhaps it just wandered too far in parameter space. The other kind of error is a ``FatalCError``, and this is where things went truly wrong, and probably will do for any combination of parameters. If you add a kind of Exception in the C code (to ``exceptions.h``), then be sure to add a handler for it in the ``_process_exitcode`` function in ``wrapper.py``. Maintaining Array State ~~~~~~~~~~~~~~~~~~~~~~~ Part of the challenge of maintaining a nice wrapper around the fast C-code is keeping track of initialized memory, and ensuring that the C structures that require that memory are pointing to the right place. Most of the arrays that are computed in ``21cmFAST`` are initialized *in Python* (using Numpy), then a pointer to their memory is given to the C wrapper object. To make matters more complicated, since some of the arrays are really big, it is sometimes necessary to write them to disk to relieve memory pressure, and load them back in as required. That means that any time, a given array in a C-based class may have one of several different "states": 1. Completely Uninitialized 2. Allocated an initialized in memory 3. Computed (i.e. filled with the values defining that array after computation in C) 4. Stored on disk 5. Stored *and* in memory. It's important to keep track of these states, because when passing the struct to the ``compute()`` function of another struct (as input), we go and check if the array exists in memory, and initialize it. Of course, we shouldn't initialize it with zeros if in fact it has been computed already and is sitting on disk ready to be loaded. Thus, the ``OutputStruct`` tries to keep track of these states for every array in the structure, using the ``Array`` and ``ArrayState`` classes. Every write/read/compute/purge operation self-consistently modifies the status of the array. It has been made difficult to unintnetionally modify the data in an array without properly changing the state. Arrays are frozen structures, so one must use the methods provided to alter their values or states. Purging/Loading C-arrays to/from Disk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As of v3.1.0, there are more options for granular I/O, allowing large arrays to be purged from memory when they are unnecessary for further computation. As a developer, you should be aware of the ``_get_required_input_arrays`` method on all ``OutputStruct`` subclasses. This is available to tell the given class what arrays need to be available at compute time in any of the input structs. For example, if doing ``PERTURB_ON_HIGH_RES``, the ``PerturbedField`` requires the hi-res density fields in ``InitialConditions``. This gives indications as to what boxes can be purged to disk (all the low-res boxes in the ICs, for example). Currently, this is only used to *check* that all boxes are available at compute time, and is not used to actually automatically purge anything. Note however that ``InitialConditions`` does have two custom methods that will purge unnecessary arrays before computing perturb fields or ionization fields. .. note:: If you add a new quantity to a struct, and it is required input for other structs, you need to add it to the relevant ``_get_required_input_arrays`` methods. Further note that as of v3.1.0, partial structs can be written and read from disk (so you can specify ``keys=['hires_density']`` in the ``.read()`` method to just read the hi-res density field into the object. Branching and Releasing ----------------------- The aim is to make 21cmFAST's releases as useful, comprehendible, and automatic as possible. This section lays out explicitly how this works (mostly for the benefit of the admin(s)). Versioning ~~~~~~~~~~ The first thing to mention is that we use strict `semantic versioning `_ (since v2.0). Thus the versions are ``MAJOR.MINOR.PATCH``, with ``MAJOR`` including API-breaking changes, ``MINOR`` including new features, and ``PATCH`` fixing bugs or documentation etc. If you depend on hmf, you can set your dependency as ``21cmFAST >= X.Y < X+1`` and not worry that we'll break your code with an update. To mechanically handle versioning within the package, we use `setuptools-scm `_. This stores the version in the git tag. There are many benefits to this -- one is that the version is unique for every single change in the code, with commits on top of a release changing the version. This means that versions accessed via ``py21cmfast.__version__`` are unique and track the exact code in the package (useful for reproducing results). To get the current version from command line, simply do ``python -m setuptools-scm`` in the top-level directory.