Installation
DataEval is a library that offers powerful metric classes and dataset analysis functions using NumPy and PyTorch as the primary backends.
Supported Python Versions
DataEval currently supports Python 3.10 and later.
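As a quick pre-install check, the following sketch reports whether the `python3` on your PATH meets the version requirement. The function name `check_python` is purely illustrative, and the sketch assumes the interpreter is invoked as `python3`; adjust the name if your system uses `python`.

```shell
# Print the interpreter version and whether it satisfies the >= 3.10 requirement.
# check_python is a hypothetical helper name, not part of DataEval.
check_python() {
  python3 -c 'import sys
ok = sys.version_info >= (3, 10)
status = "supported" if ok else "too old"
print("Python %d.%d: %s" % (sys.version_info[0], sys.version_info[1], status))
sys.exit(0 if ok else 1)'
}

# The function returns a non-zero status when the interpreter is too old.
check_python || echo "Install Python 3.10 or newer before continuing."
```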
PyTorch Dependency
DataEval requires PyTorch to be installed. By default, pip install dataeval
pulls PyTorch from PyPI, which includes CUDA support on Linux.
To install a specific PyTorch variant, use --extra-index-url to point pip
at the appropriate PyTorch wheel index:
# CPU only
pip install dataeval --extra-index-url https://download.pytorch.org/whl/cpu
# CUDA 11.8
pip install dataeval --extra-index-url https://download.pytorch.org/whl/cu118
# CUDA 12.8
pip install dataeval --extra-index-url https://download.pytorch.org/whl/cu128
See the PyTorch installation guide for all available PyTorch installation options.
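The variant-specific commands above differ only in the last path segment of the index URL. As an illustration, a small shell helper could pick the URL from a variant name and print the resulting command without running it. The names `torch_index_url` and `TORCH_VARIANT` are hypothetical and not part of DataEval or pip; only the three variants listed above are handled.

```shell
# Hypothetical helper: map a variant name to the matching PyTorch wheel index URL.
torch_index_url() {
  case "$1" in
    cpu|cu118|cu128) echo "https://download.pytorch.org/whl/$1" ;;
    *) echo "unknown PyTorch variant: $1" >&2; return 1 ;;
  esac
}

# TORCH_VARIANT is an assumed environment variable; it defaults to the CPU-only build.
TORCH_VARIANT="${TORCH_VARIANT:-cpu}"

# Dry run: print the install command for review instead of executing it.
echo "pip install dataeval --extra-index-url $(torch_index_url "$TORCH_VARIANT")"
```

Printing the command first makes it easy to confirm the index URL before anything is downloaded.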
Note: When installing from source using uv, you can use extras to select the PyTorch variant
(e.g., --extra cpu, --extra cu118, --extra cu128). See the source installation
instructions below for details.
Installing DataEval
Once you have chosen a PyTorch variant, install DataEval using whichever of the following methods you prefer.
Installing from pip
pip install dataeval
Installing from conda
conda install -c conda-forge dataeval
To install DataEval from source locally on Ubuntu using poetry, begin by ensuring poetry is installed in your Python environment.
pip install poetry
Pull the source down and change to the DataEval project directory.
git clone https://github.com/aria-ml/dataeval.git
cd dataeval
Install DataEval.
poetry install
Now that DataEval is installed, you can run commands in the Poetry virtual environment by prefixing shell commands with poetry run, or activate the virtual environment directly in the shell.
poetry env activate
To install DataEval from source locally on Ubuntu, you will need uv for Python environment management.
Pull the source down and change to the DataEval project directory.
git clone https://github.com/aria-ml/dataeval.git
cd dataeval
Install DataEval with development dependencies.
uv sync
Optionally, you can specify the version of Python and PyTorch CPU/CUDA support (cpu, cu118, cu128) using -p and --extra respectively.
For example, the following command installs DataEval in a Python 3.11 environment using only PyTorch with CPU support, and no development dependencies:
uv sync -p 3.11 --extra cpu --no-default-groups
Now that DataEval is installed, you can run commands in the uv virtual environment by prefixing shell commands with uv run, or activate the virtual environment directly in the shell.
source .venv/bin/activate
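Whichever method you used, it can be useful to confirm what ended up in the active environment. The sketch below is one way to do that using only the Python standard library (`importlib.metadata`); the function name `check_install` is illustrative, and `python3` is assumed to be the environment's interpreter.

```shell
# Report the installed DataEval and PyTorch versions, if present.
# check_install is a hypothetical helper name, not part of DataEval.
check_install() {
  python3 - <<'EOF'
from importlib.metadata import PackageNotFoundError, version

# Look up each distribution's metadata without importing the package itself.
for package in ("dataeval", "torch"):
    try:
        print(f"{package} {version(package)}")
    except PackageNotFoundError:
        print(f"{package} is not installed in this environment")
EOF
}

check_install
```

If the environment is not activated, run the same check through `uv run` or `poetry run` instead. PyTorch wheels installed from the PyTorch index typically carry a local version suffix such as `+cpu`, which gives a hint about which variant was installed.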