Contribution guide

Thank you for considering contributing to Kernel Tuner!

Reporting Issues

Not all contributions are code: creating an issue also helps us improve. When you create an issue about a problem, please ensure the following:

  • Describe what you expected to happen.

  • If possible, include a minimal example to help us reproduce the issue.

  • Describe what actually happened, including the output of any errors printed.

  • List the versions of Python, CUDA or OpenCL, and your C compiler, if applicable.
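When listing versions, a small sketch like the following can collect the basics to paste into an issue (the nvcc lookup is illustrative; substitute your own compiler or toolchain):

```python
# Sketch: collect version information for an issue report.
# The "nvcc" entry is an assumption; adjust to your toolchain.
import platform
import shutil
import sys

info = {
    "python": sys.version.split()[0],
    "platform": platform.platform(),
    "nvcc": shutil.which("nvcc") or "not found",  # CUDA compiler, if any
}
for key, value in info.items():
    print(f"{key}: {value}")
```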

Contributing Code

To contribute code to Kernel Tuner, please select an existing issue to work on, or create a new issue to propose a change or addition. For significant changes, first create an issue and discuss the proposed changes there. Then fork the repository, create a branch per change or addition, and open a pull request.

Kernel Tuner follows the Google Python style guide, with Sphinx-style docstrings for public module functions.
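As a sketch, a public function documented in that docstring style might look like this (the function itself is a made-up example, not part of Kernel Tuner):

```python
def clamp(value: float, low: float, high: float) -> float:
    """Clamp a value to the inclusive range [low, high].

    :param value: the value to clamp
    :param low: lower bound of the range
    :param high: upper bound of the range
    :returns: value limited to the range [low, high]
    """
    return max(low, min(value, high))


print(clamp(5.0, 0.0, 1.0))  # → 1.0
```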

Before creating a pull request please ensure the following:

  • You are working in an up-to-date development environment

  • You have written unit tests for your additions and all unit tests pass (run nox). If you do not have the required hardware, you can run nox -- skip-gpu, or more selectively nox -- skip-cuda, skip-hip, or skip-opencl.

  • The examples still work and produce the same (or better) results

  • An entry about the change or addition is created in CHANGELOG.md

If you are in doubt about where to put your additions to Kernel Tuner, please have a look at the design documentation, or discuss it in the issue regarding your additions.

Development environment

The following steps help you set up a development environment.

Local setup

Steps with sudo access (e.g. on a local device):

  1. Clone the git repository to the desired location: git clone https://github.com/KernelTuner/kernel_tuner.git, and cd to it.

  2. Prepare your system for building Python versions.
    • On Ubuntu, run sudo apt update && sudo apt upgrade, and sudo apt install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python-openssl git.

  3. Install pyenv:
    • On Linux, run curl https://pyenv.run | bash (remember to add the output to .bash_profile and .bashrc as specified).

    • On macOS, run brew update && brew install pyenv.

    • After installation, restart your shell.

  4. Install the required Python versions:
    • On some systems, additional packages may be needed to build Python versions. For example on Ubuntu: sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev wget libbz2-dev liblzma-dev lzma.

    • Install the Python versions with: pyenv install 3.8 3.9 3.10 3.11. We install all of these versions rather than just one so that we can test against all supported Python versions.

  5. Set the Python versions so they can be found: pyenv local 3.8 3.9 3.10 3.11 (replace local with global when not using the virtualenv).

  6. Set up a local virtual environment in the folder: pyenv virtualenv 3.11 kerneltuner (or whatever environment name and Python version you prefer).

  7. Install Poetry.
    • Use curl -sSL https://install.python-poetry.org | python3 - to install Poetry.

    • Make sure to add Poetry to PATH as instructed at the end of the installation.

    • Add the poetry export plugin with poetry self add poetry-plugin-export.

  8. Make sure that non-Python dependencies are installed if applicable, such as CUDA, OpenCL or HIP. This is described in Installation.

  9. Apply changes:
    • Re-open the shell for changes to take effect.

    • Activate the environment with pyenv activate kerneltuner.

    • Make sure which python and which pip point to the expected Python location and version.

    • Update Pip with pip install --upgrade pip.

  10. Install the project, its dependencies, and extras: poetry install --with test,docs -E cuda -E opencl -E hip, leaving out -E cuda, -E opencl or -E hip if these do not apply to your system. To go all-out, use --all-extras.
    • Depending on the environment, it may be necessary or convenient to install extra packages such as cupy-cuda11x / cupy-cuda12x, and cuda-python. These are currently not defined as dependencies for kernel-tuner, but can be part of tests.

    • Do not forget to make sure the paths are set correctly. If you’re using CUDA, the desired CUDA version should be in $PATH, $LD_LIBRARY_PATH and $CPATH.

    • Re-open the shell for changes to take effect.

  11. Check if the environment is set up correctly by running pytest and nox. All tests should pass, unless one or more extras were left out in the previous step, in which case the corresponding tests will be skipped gracefully.
    • [Note]: reading program counters and energy measurements sometimes requires changing the NVIDIA driver permissions. Check whether RmProfilingAdminOnly is set to 1 with cat /proc/driver/nvidia/params | grep RmProfilingAdminOnly. If so, follow these steps
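The interpreter checks from step 9 can also be done from Python itself. This sketch simply reports what is active; when the kerneltuner environment is activated, both paths should point inside that environment:

```python
# Sketch: confirm which interpreter and pip the shell resolves to.
import shutil
import sys

print("python: ", sys.executable)
print("pip:    ", shutil.which("pip") or "pip not on PATH")
print("version:", sys.version.split()[0])
```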

Cluster setup

Steps without sudo access (e.g. on a cluster):

  1. Clone the git repository to the desired location: git clone https://github.com/KernelTuner/kernel_tuner.git.

  2. Install Conda with Mamba (for better performance) or Miniconda (for traditional minimal Conda).
    • [Optional] if you are under quotas or are otherwise restricted by disk space, you can instruct Conda to use a different directory for saving environments by adding the following to your .condarc file:
      envs_dirs:
       - /path/to/directory
      
    • [Optional] both Mamba and Miniconda can be automatically activated via ~/.bashrc. Do not forget to add these (usually provided at the end of the installation).

    • Exit the shell and re-enter to make sure Conda is available, then cd to the Kernel Tuner directory.

    • [Optional] if you have limited user folder space, the Pip cache can be pointed elsewhere with the environment variable PIP_CACHE_DIR. The cache location can be checked with pip cache dir.

    • [Optional] update Conda if available before continuing: conda update -n base -c conda-forge conda.

  3. Set up a virtual environment: conda create --name kerneltuner python=3.11 (or whatever Python version and environment name you prefer).

  4. Activate the virtual environment: conda activate kerneltuner.
    • [Optional] to use the correct environment by default, execute conda config --set auto_activate_base false, and add conda activate kerneltuner to your .bash_profile or .bashrc.

  5. Make sure that non-Python dependencies are loaded if applicable, such as CUDA, OpenCL or HIP. On most clusters it is possible to load (or unload) modules (e.g. CUDA, OpenCL / ROCM). For more information, see Installation.
    • Do not forget to make sure the paths are set correctly. If you’re using CUDA, the desired CUDA version should be in $PATH, $LD_LIBRARY_PATH and $CPATH.

    • [Optional] the loading of modules and setting of paths is likely convenient to put in your .bash_profile or .bashrc.

  6. Install Poetry.
    • Use curl -sSL https://install.python-poetry.org | python3 - to install Poetry.

    • Add the poetry export plugin with poetry self add poetry-plugin-export.

  7. Install the project, its dependencies, and extras: poetry install --with test,docs -E cuda -E opencl -E hip, leaving out -E cuda, -E opencl or -E hip if these do not apply to your system. To go all-out, use --all-extras.
    • If you run into “keyring” or other seemingly odd issues, this is a known problem with Poetry on some systems. Run pip install keyring followed by python3 -m keyring --disable.

    • Depending on the environment, it may be necessary or convenient to install extra packages such as cupy-cuda11x / cupy-cuda12x, and cuda-python. These are currently not defined as dependencies for kernel-tuner, but can be part of tests.

    • Verify that your development environment has no missing installs or updates with poetry install --sync --dry-run --with test.

  8. Check if the environment is set up correctly by running pytest. All tests should pass, unless you are not on a GPU node or one or more extras were left out in the previous step, in which case the corresponding tests will be skipped gracefully.

  9. Set Nox to use the correct backend and location:
    • Run nox -- create-settings-file to automatically create a settings file.

    • In this settings file noxsettings.toml, change the venvbackend:
      • If you used Mamba in step 2, to mamba.

      • If you used Miniconda or Anaconda in step 2, to conda.

      • If you used Venv in step 2, to venv.

      • If you used Virtualenv in step 2, this is already the default.

    • Be sure to adjust this when changing backends.

    • The settings file also has envdir, which allows you to change the directory in which Nox caches environments; this is particularly helpful if you have a disk quota on your user directory.

  10. [Optional] Run the tests on Nox as described below.
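Assuming Miniconda from step 2 and a scratch directory for environments, the resulting noxsettings.toml from step 9 might look like the following sketch (the path is illustrative; the venvbackend and envdir keys are the ones described above):

```toml
# noxsettings.toml - sketch; adjust to your system
venvbackend = "conda"
envdir = "/scratch/username/nox-envs"   # illustrative path outside a quota'd home directory
```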

Running tests

To run the tests you can use nox (to run against all supported Python versions in isolated environments) and pytest (to run against the local Python version, see below) in the top-level directory. For full coverage, make Nox use the additional tests (such as cupy and cuda-python) with nox -- additional-tests.

The Nox isolated environments can take up to 1 gigabyte in size, so users tight on disk space can run nox with the small-disk option. This removes the other environment caches before each session is run (note that this makes the run take longer). A better option is to change the location where environments are stored, via envdir in the noxsettings.toml file.

Please note that the command-line options can be combined, e.g. nox -- additional-tests skip-hip small-disk. If you do not have fully compatible hardware or environment, you can use the following options:

  • nox -- skip-cuda to skip tests involving CUDA.

  • nox -- skip-hip to skip tests involving HIP.

  • nox -- skip-opencl to skip tests involving OpenCL.

  • nox -- skip-gpu to skip all tests on the GPU (the same as nox -- skip-cuda skip-hip skip-opencl), especially helpful if you don’t have a GPU locally.

Contributions you make to the Kernel Tuner should not break any of the tests even if you cannot run them locally!

Running with pytest tests against your local Python version and installed pip packages. In this case, tests that require PyCUDA and/or a CUDA-capable GPU will be skipped automatically if these are not installed or present. The same holds for tests that require PyOpenCL, CuPy, and CUDA. It is also possible to invoke pytest from the ‘Testing’ tab in Visual Studio Code to visualize the tests in your IDE.

The examples can be seen as integration tests for Kernel Tuner. Note that these will also use the installed package.

Building documentation

Documentation is located in the doc/ directory, with the source files used for building it in doc/source. Run make html in doc/ to generate the HTML pages in the doc/build/html directory. To inspect the documentation locally before committing, browse through the generated pages in doc/build/html.

To build the documentation you need at least the dependencies installed with --with docs. Pandoc is also required; you can install it on Ubuntu using sudo apt install pandoc and on Mac using brew install pandoc. For other setups, please see pandoc’s install documentation.

The documentation pages hosted online are built automatically using GitHub actions. The documentation pages corresponding to the master branch are hosted in /latest/. The documentation of the last release is in /stable/. When a new release is published the documentation for that release will be stored in a directory created for that release and /stable/ will be updated to point to the last release. This process is again fully automated using GitHub actions.