Configuring hardware: PyTorch devices and CPU processes¶
Problem Statement¶
DataEval provides global configuration settings to control computational resources and hardware acceleration. This guide shows how to configure the default PyTorch device and the maximum number of worker processes.
When to use¶
You need to specify GPU or CPU execution for PyTorch-based operations
You want to control the number of parallel worker processes
You need to optimize performance for your hardware configuration
What you will need¶
A Python environment with dataeval installed
Getting Started¶
import dataeval
Configuring the PyTorch device¶
DataEval lets you set the default PyTorch device that its PyTorch-based operations run on. The device is given as a string such as "cpu", "cuda", or "cuda:1". See torch.device for more information.
Set the default device to CPU¶
dataeval.config.set_device("cpu")
print(f"Current device for DataEval: {dataeval.config.get_device()}")
Current device for DataEval: cpu
Set the default device to CUDA GPU¶
dataeval.config.set_device("cuda")
print(f"Current device for DataEval: {dataeval.config.get_device()}")
Current device for DataEval: cuda
Set the default device to a specific CUDA GPU¶
dataeval.config.set_device("cuda:1")
print(f"Current device for DataEval: {dataeval.config.get_device()}")
Current device for DataEval: cuda:1
Reset the device to use PyTorch’s default device¶
dataeval.config.set_device(None)
print(f"Current device for DataEval: {dataeval.config.get_device()}")
Current device for DataEval: cpu
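When the same script has to run on machines with and without a GPU, it can help to compute the device string before passing it to set_device. The sketch below is only an illustration: choose_device is a hypothetical helper, not part of DataEval or PyTorch, and the fallback to the first GPU is an assumed policy.

```python
def choose_device(cuda_available: bool, device_count: int, preferred_index: int = 0) -> str:
    """Hypothetical helper: build a device string such as "cpu" or "cuda:1"."""
    if not cuda_available or device_count <= 0:
        # No usable GPU, so fall back to the CPU.
        return "cpu"
    if 0 <= preferred_index < device_count:
        return f"cuda:{preferred_index}"
    # Assumed policy: if the requested index does not exist, use the first GPU.
    return "cuda:0"


print(choose_device(cuda_available=False, device_count=0))
print(choose_device(cuda_available=True, device_count=2, preferred_index=1))
```

With PyTorch installed, the two inputs could come from torch.cuda.is_available() and torch.cuda.device_count(), and the returned string could then be passed to dataeval.config.set_device.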
Configuring maximum worker processes¶
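DataEval follows the maximum worker configuration conventions used by scikit-learn and joblib: a positive integer caps the number of worker processes, -1 uses all visible CPU cores, and None leaves the setting unset.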
Set the maximum number of worker processes¶
dataeval.config.set_max_processes(4)
print(f"Max processes: {dataeval.config.get_max_processes()}")
Max processes: 4
Set the maximum number of workers to all visible CPU cores¶
dataeval.config.set_max_processes(-1)
print(f"Max processes: {dataeval.config.get_max_processes()}")
Max processes: -1
Unset the maximum number of workers¶
dataeval.config.set_max_processes(None)
print(f"Max processes: {dataeval.config.get_max_processes()}")
Max processes: None
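The getter returns the stored value as it was set (4, -1, or None), not a resolved worker count. As a rough illustration of how a joblib-style value can be turned into a concrete number of workers, here is a minimal sketch. resolve_workers is a hypothetical helper, and the source does not describe how DataEval itself resolves these values. Treating None as a single worker is an assumption borrowed from joblib's n_jobs convention.

```python
import os
from typing import Optional


def resolve_workers(max_processes: Optional[int], cpu_count: Optional[int] = None) -> int:
    """Hypothetical helper: turn a joblib-style setting into a worker count."""
    # Fall back to 1 if the platform cannot report a core count.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    if max_processes is None:
        # Assumption: an unset value means a single worker, as in joblib.
        return 1
    if max_processes == 0:
        raise ValueError("max_processes must not be 0")
    if max_processes < 0:
        # joblib convention: -1 -> all cores, -2 -> all but one, and so on.
        return max(cores + 1 + max_processes, 1)
    return max_processes


print(resolve_workers(4, cpu_count=8))   # 4
print(resolve_workers(-1, cpu_count=8))  # 8
```

Passing cpu_count explicitly keeps the example deterministic. Leaving it out makes the helper query os.cpu_count() on the current machine.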
Using temporary context managers¶
Temporarily override the max processes setting using a context manager:
dataeval.config.set_max_processes(8)
print(f"Before context: {dataeval.config.get_max_processes()}")
with dataeval.config.use_max_processes(2):
    print(f"Inside context: {dataeval.config.get_max_processes()}")
    # Perform operations with max_processes=2
print(f"After context: {dataeval.config.get_max_processes()}")
Before context: 8
Inside context: 2
After context: 8
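The before/inside/after output shows the usual save-and-restore pattern for a temporary override. The following standalone sketch shows one way such a context manager can be written with contextlib. The function names mirror the DataEval calls for readability, but they are stand-ins defined here, and this is not DataEval's actual implementation.

```python
from contextlib import contextmanager
from typing import Iterator, Optional

# Hypothetical module-level setting standing in for a library's global config.
_max_processes: Optional[int] = None


def get_max_processes() -> Optional[int]:
    return _max_processes


def set_max_processes(value: Optional[int]) -> None:
    global _max_processes
    _max_processes = value


@contextmanager
def use_max_processes(value: Optional[int]) -> Iterator[None]:
    # Remember the current value so it can be restored on exit.
    previous = get_max_processes()
    set_max_processes(value)
    try:
        yield
    finally:
        # Restore the earlier value even if the body raised an exception.
        set_max_processes(previous)


set_max_processes(8)
with use_max_processes(2):
    print(get_max_processes())  # 2
print(get_max_processes())  # 8
```

The try/finally is the key design choice in this sketch: without it, an exception raised inside the with block would leave the temporary value in place.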