A temperature-based fan speed controller for Dell PowerEdge servers (tested on an R730, should work with most PowerEdges). Supports both local and remote hosts.
- Python 3 is installed.
- IPMI Over LAN is enabled in all used iDRACs (Login > Network/Security > IPMI Settings).
  - May not be needed if you're only managing the local machine.
- All hosts to be monitored must have the appropriate sensor tools installed as needed:
  - For monitoring local CPU: install and configure `lm-sensors`
  - For monitoring NVIDIA GPU: install `nvidia-smi`
  - For monitoring AMD GPU: install `rocm-smi`
  - Example output of `sensors` for a dual CPU system:

        coretemp-isa-0000
        Adapter: ISA adapter
        Core 0: +38.0°C (high = +69.0°C, crit = +79.0°C)
        Core 1: +46.0°C (high = +69.0°C, crit = +79.0°C)
        Core 2: +40.0°C (high = +69.0°C, crit = +79.0°C)
        Core 8: +43.0°C (high = +69.0°C, crit = +79.0°C)
        Core 9: +39.0°C (high = +69.0°C, crit = +79.0°C)
        Core 10: +39.0°C (high = +69.0°C, crit = +79.0°C)

        coretemp-isa-0001
        Adapter: ISA adapter
        Core 0: +29.0°C (high = +69.0°C, crit = +79.0°C)
        Core 1: +35.0°C (high = +69.0°C, crit = +79.0°C)
        Core 2: +29.0°C (high = +69.0°C, crit = +79.0°C)
        Core 8: +34.0°C (high = +69.0°C, crit = +79.0°C)
        Core 9: +33.0°C (high = +69.0°C, crit = +79.0°C)
        Core 10: +31.0°C (high = +69.0°C, crit = +79.0°C)

Clone the repo and run the installation script as root to configure the system or upgrade the already installed controller:

    git clone https://github.com/kuan909608/dell-idrac-fan-controller-gpu.git
    cd dell-idrac-fan-controller-gpu
    sudo ./install.sh [<installation path>]

The default installation path is /opt/fan_control and the service will be installed as fan-control.service. If a configuration file already exists, it will be renamed with a .old extension.
To deploy remote fan management with Docker (fan_control running on a separate host and only interacting with remote ones, see Notes on remote hosts), build the image in the repo and bind mount your own YAML config and SSH keys folder:

    git clone https://github.com/kuan909608/dell-idrac-fan-controller-gpu.git
    cd dell-idrac-fan-controller-gpu
    docker build -t fan_control .
    docker run -d --restart=always --name fan_control -v "./fan_control_config.yaml:/app/fan_control_config.yaml:ro" -v "./keys:/app/keys:ro" fan_control

Running this tool under a proper orchestrator is advised.
You can deploy this tool in two ways: systemd (bare-metal) or Docker. Please choose only one method for each host.

systemd (bare-metal):

- Recommended if you need direct access to hardware sensors (e.g., lm-sensors) on the host.
- Suitable for environments where you want the service to start automatically with the OS and be managed by systemd.
- The `install.sh` script automates dependency installation, venv setup, file copying, and systemd service configuration.

Docker:

- Recommended for remote-only management, or if you want to isolate the environment and simplify migration.
- If you need to access hardware sensors inside Docker, you must mount additional system directories (e.g., `/dev`, `/sys`). Example: `docker run ... -v /dev:/dev -v /sys:/sys ...`
- Make sure to mount your configuration file and SSH keys as shown above.
- For production, use an orchestrator for better reliability.

Notes:

- Do not enable both the systemd service and the Docker container on the same host at the same time. Running both may cause conflicts or resource contention.
- The `install.sh` script will overwrite existing files and the systemd service. Back up your configuration before running it.
- When using SSH keys in Docker, ensure proper permissions and security practices.

You can tune the controller's settings via the fan_control_config.yaml file in the installation directory.
The configuration file contains two main sections: general and hosts.
| Key | Description |
|---|---|
| `debug` | Toggle debug mode (print ipmitool commands instead of executing them, enable additional logging). |
| `interval` | How often (in seconds) to read the CPUs' and GPUs' temperatures and adjust the fans' speeds. |
| `temperature_control_mode` | Use max or avg to decide if fan control is based on the maximum or average temperature. |
| `cpu_temperature_command` | Shell command to get CPU temperatures (semicolon separated). |
| `gpu_temperature_command_nvidia` | Shell command to get NVIDIA GPU temperatures (semicolon separated). |
| `gpu_temperature_command_amd` | Shell command to get AMD GPU temperatures (semicolon separated). |
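
To make the semicolon-separated format and the max/avg choice concrete, here is a minimal Python sketch of how such output could be parsed and reduced to a single control temperature. The function names `parse_temperatures` and `control_temperature` are illustrative assumptions, not necessarily what the controller uses internally.

```python
from statistics import mean


def parse_temperatures(output):
    """Turn 'a;b;c' command output into a list of floats, skipping empty fields."""
    return [float(field) for field in output.strip().split(";") if field.strip()]


def control_temperature(readings, mode="max"):
    """Reduce readings to one value, following temperature_control_mode."""
    if not readings:
        raise ValueError("no temperature readings")
    if mode == "max":
        return max(readings)
    if mode == "avg":
        return mean(readings)
    raise ValueError(f"unknown temperature_control_mode: {mode!r}")


readings = parse_temperatures("38.0;46.0;40.0\n")
hottest = control_temperature(readings, "max")
```

With `max`, a single hot core drives the fans; with `avg`, one outlier has less influence, which may be quieter but reacts less strongly to hotspots.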
Each host object supports the following keys:
| Key | Description |
|---|---|
| `name` | Host name identifier. |
| `fan_control_mode` | Fan control mode, manual or automatic. |
| `temperatures` | List of temperature thresholds (in °C). Must have at least 2 values. |
| `speeds` | List of fan speeds (in %) for each threshold. Must have at least 2 values. |
| `hysteresis` | Hysteresis value in °C to prevent rapid fan speed changes. |
| `ipmi_credentials` | (Optional) IPMI login info for this host. |
| `ssh_credentials` | (Optional) SSH login info for this host. Supports host, username, password, and optional key_path for SSH key authentication. |
| `gpu_type` | (Optional) Supported GPU types, can be a string (e.g., nvidia) or an array (e.g., [nvidia, amd]). |
| `vms` | (Optional) List of VM objects. See below for VM object structure. |
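
The constraints in the table above can be checked before the controller starts. The following is a minimal sketch, assuming a host entry has already been loaded into a Python dict; `validate_host` is a hypothetical helper, and both the ascending-order check and the `manual` default for `fan_control_mode` are assumptions rather than documented behavior.

```python
def validate_host(host):
    """Return a list of problems found in one host entry (empty list means OK)."""
    problems = []
    temperatures = host.get("temperatures", [])
    speeds = host.get("speeds", [])
    if len(temperatures) < 2:
        problems.append("temperatures needs at least 2 values")
    if len(speeds) < 2:
        problems.append("speeds needs at least 2 values")
    if len(temperatures) != len(speeds):
        problems.append("temperatures and speeds must have the same length")
    # Assumption: thresholds are expected in ascending order.
    if list(temperatures) != sorted(temperatures):
        problems.append("temperatures should be in ascending order")
    # Assumption: a missing fan_control_mode is treated as manual.
    if host.get("fan_control_mode", "manual") not in ("manual", "automatic"):
        problems.append("fan_control_mode must be manual or automatic")
    return problems


ok = validate_host({"name": "host1", "temperatures": [40, 60, 80],
                    "speeds": [20, 50, 80], "hysteresis": 5})
```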
Each VM object supports the following keys:
| Key | Description |
|---|---|
| `name` | VM name identifier. |
| `ssh_credentials` | SSH login info for the VM. Supports host, username, password, and optional key_path for SSH key authentication. |
| `gpu_type` | Supported GPU types for the VM, can be a string (e.g., nvidia) or an array (e.g., [nvidia, amd]). |
If you only specify 2 pairs of temperatures and speeds (e.g., [40, 80] and [20, 80]), the system will automatically split them into multiple steps based on the hysteresis value.

The splitting logic:

- The range between `temp_min` and `temp_max` will be divided into intervals of `hysteresis * 2`.
- For each interval, a new threshold and corresponding speed will be generated, resulting in smoother fan speed transitions.

Example:

    temperatures: [40, 80]
    speeds: [20, 80]
    hysteresis: 5

This will be automatically expanded to:

    thresholds: [40.00, 50.00, 60.00, 70.00, 80.00]
    speeds: [20, 35, 50, 65, 80]

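
The splitting logic can be sketched in a few lines of Python. This is an illustration only: `expand_steps` is a hypothetical name, and linear interpolation of the speeds is an assumption that happens to reproduce the example above, not a confirmed detail of the controller.

```python
def expand_steps(temperatures, speeds, hysteresis):
    """Expand a two-point curve into steps spaced hysteresis * 2 apart."""
    if len(temperatures) != 2 or len(speeds) != 2 or hysteresis <= 0:
        return list(temperatures), list(speeds)  # nothing to expand
    temp_min, temp_max = temperatures
    speed_min, speed_max = speeds
    step = hysteresis * 2
    thresholds, expanded = [], []
    t = temp_min
    while t < temp_max:
        # Assumption: speeds are interpolated linearly between the two endpoints.
        fraction = (t - temp_min) / (temp_max - temp_min)
        thresholds.append(float(t))
        expanded.append(round(speed_min + fraction * (speed_max - speed_min)))
        t += step
    # Always finish exactly on the configured maximum.
    thresholds.append(float(temp_max))
    expanded.append(speed_max)
    return thresholds, expanded


thresholds, expanded_speeds = expand_steps([40, 80], [20, 80], 5)
```

For the example values this yields thresholds of 40, 50, 60, 70 and 80 with speeds of 20, 35, 50, 65 and 80.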
Example configuration:

    general:
      debug: False
      interval: 60
      cpu_temperature_command: "sensors | grep -E 'Core [0-9]+:' | awk '{print $3}' | sed 's/+//;s/°C//' | paste -sd ';' -"
      gpu_temperature_command_nvidia: "nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | paste -sd ';' -"
      gpu_temperature_command_amd: "rocm-smi --showtemp | grep -E 'Temp' | awk '{print $2}' | sed 's/[^0-9.]//g' | paste -sd ';' -"
    hosts:
      - name: host1
        temperatures: [40, 60, 80]
        speeds: [20, 50, 80]
        hysteresis: 5
        ipmi_credentials:
          host: 10.0.0.1
          username: admin
          password: password
        ssh_credentials:
          host: 10.0.0.2
          username: admin
          password: password
        gpu_type: nvidia
        vms:
          - name: vm1
            ssh_credentials:
              host: 10.0.0.3
              username: user
              password: password
            gpu_type: [nvidia]
      - name: host2
        temperatures: [35, 55, 75]
        speeds: [30, 60, 90]
        hysteresis: 5
        gpu_type: nvidia

- `temperature_control_mode` (in `general`): Set to `max` to use the highest temperature for fan control, or `avg` to use the average.
- `fan_control_mode` (in each host): Set to `manual` for script-controlled fan speed, or `automatic` to let hardware manage it.
- `gpu_type`: Can be a string or an array, e.g., `nvidia` or `[nvidia, amd]`.
- `ssh_credentials.key_path`: (Optional) Path to SSH private key for authentication.
- VM objects under `vms` also support `gpu_type` as an array.
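
As a rough illustration of what the manual and automatic fan control modes mean at the IPMI level, the sketch below builds `ipmitool` command lines. It assumes the raw byte sequences commonly documented by the community for these PowerEdge generations (`0x30 0x30 0x01 ...` to toggle manual control, `0x30 0x30 0x02 0xff <hex>` to set a speed); verify them for your model. The helper names are hypothetical and this is not necessarily how the controller issues the commands.

```python
import subprocess


def ipmitool_command(raw_bytes, credentials=None):
    """Build an ipmitool argv; credentials is an optional ipmi_credentials-style dict."""
    argv = ["ipmitool"]
    if credentials:  # talk to a remote iDRAC over LAN
        argv += ["-I", "lanplus", "-H", credentials["host"],
                 "-U", credentials["username"], "-P", credentials["password"]]
    return argv + ["raw"] + raw_bytes


def set_manual_mode(credentials=None):
    """Hand fan control to the script (assumed raw command)."""
    return ipmitool_command(["0x30", "0x30", "0x01", "0x00"], credentials)


def set_automatic_mode(credentials=None):
    """Give fan control back to the hardware (assumed raw command)."""
    return ipmitool_command(["0x30", "0x30", "0x01", "0x01"], credentials)


def set_fan_speed(percent, credentials=None):
    """Set all fans to a percentage, clamped to 0-100 and encoded as hex."""
    percent = max(0, min(100, int(percent)))
    return ipmitool_command(["0x30", "0x30", "0x02", "0xff", f"0x{percent:02x}"], credentials)


def run(argv, debug=False):
    """In debug mode only print the command, mirroring the debug setting."""
    if debug:
        print(" ".join(argv))
        return None
    return subprocess.run(argv, check=True)


run(set_fan_speed(20), debug=True)
```

Building the argument list separately from executing it keeps a debug mode trivial: the same list is either printed or passed to `subprocess.run`.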
- Each host can define its own temperature thresholds, fan speeds, and credentials.
- Hosts can include a `vms` list; each VM supports its own SSH credentials and GPU type.
- The script will collect GPU temperatures from both the host and all defined VMs, and use the highest temperature for fan control.

- The script collects CPU and GPU temperatures for each host and its VMs.
- If temperature data is missing or abnormal, the fan will run at the highest configured speed (the last value in `speeds`) for safety.
- The maximum value among all CPU/GPU temperatures (including VMs) is used as the control temperature.
- Fan speed is set according to the configured thresholds and speeds.
- All temperature readings and control actions are recorded in the internal state for monitoring and debugging.

Every interval seconds, the controller will collect CPU and GPU temperatures from all hosts and their VMs.
The highest temperature among all CPUs/GPUs (including VMs) is used as the control temperature to determine the fan speed.
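
The fail-safe rule for missing or abnormal temperature data can be sketched as a thin wrapper around the normal speed lookup. What counts as "abnormal" is not specified here, so the plausibility range of 0 to 120 °C below is purely an assumption, as are the names `safe_speed` and `pick_speed`.

```python
import math


def safe_speed(readings, speeds, pick_speed, low=0.0, high=120.0):
    """Return the highest configured speed unless every reading looks sane."""
    valid = bool(readings) and all(
        isinstance(r, (int, float)) and math.isfinite(r) and low <= r <= high
        for r in readings
    )
    if not valid:
        return speeds[-1]  # fail safe: last value in speeds
    # Normal path: the highest reading is the control temperature.
    return pick_speed(max(readings))


speeds = [20, 50, 80]
fallback = safe_speed([], speeds, pick_speed=lambda t: 20)
```

Failing toward the highest speed trades noise for safety: a broken sensor command makes the server loud rather than hot.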
Fan speed is determined by each temperature threshold and its corresponding speed. Any number of threshold/speed pairs can be configured (at least two), as long as both lists have the same length.
| Condition | Fan speed |
|---|---|
| Tmax ≤ Threshold1 | Speed1 |
| Threshold1 < Tmax ≤ Threshold2 | Speed2 |
| ... | ... |
| Tmax > ThresholdN | Highest configured speed (last value in speeds) |
If hysteresis is set for a given host, the controller will wait for the temperature to go below ThresholdN - hysteresis before lowering the fan speed.
For example: with a Threshold2 of 37°C and a hysteresis of 3°C, the fans won't slow down from Threshold3 to Threshold2 speed until the temperature reaches 34°C.
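
The table and the hysteresis rule combine into a small lookup function. The sketch below is one possible reading of that behavior, assuming speeds never decrease as thresholds rise; `select_speed` is a hypothetical name and the threshold values are made up around the 37 °C example.

```python
def select_speed(temp, thresholds, speeds, hysteresis=0, current_speed=None):
    """Pick a fan speed; only step down once temp clears the hysteresis band."""

    def lookup(t):
        for threshold, speed in zip(thresholds, speeds):
            if t <= threshold:
                return speed
        return speeds[-1]  # above the last threshold: highest configured speed

    target = lookup(temp)
    if current_speed is not None and target < current_speed:
        # Before slowing down, look the speed up as if it were `hysteresis`
        # degrees warmer, so the drop happens at threshold - hysteresis.
        target = min(current_speed, lookup(temp + hysteresis))
    return target


thresholds = [30, 37, 45]
speeds = [20, 40, 60]
still_fast = select_speed(36, thresholds, speeds, hysteresis=3, current_speed=60)
slowed = select_speed(34, thresholds, speeds, hysteresis=3, current_speed=60)
```

Here the fans stay at the higher speed at 36 °C and only drop to the Threshold2 speed once the temperature reaches 34 °C, while increases take effect immediately.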
This controller can monitor the temperature and change the fan speed of remote hosts too. The only caveat is that you'll need to extract the temperatures via an external command, for example over SSH. The controller expects such a command to return a semicolon-separated list of numbers parseable as floats, as the temperature commands in the configuration do.
The included example is a good fit for a remote Proxmox VE host: it connects to it via SSH and extracts the temperature of all CPU cores. This way you'll be able to manage that machine just as well as the local one without applying any hard-to-track modifications to the base OS.
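
One way to run a temperature command on a remote host is to shell out to the OpenSSH client. The sketch below assumes key-based authentication only (it does not cover password logins) and uses hypothetical helper names and a hypothetical key file name; the controller itself may use a different SSH mechanism.

```python
import subprocess


def ssh_argv(credentials, command):
    """Build an OpenSSH argv from an ssh_credentials-style dict."""
    argv = ["ssh", "-o", "BatchMode=yes"]  # never stop to prompt for a password
    if credentials.get("key_path"):
        argv += ["-i", credentials["key_path"]]
    return argv + [f"{credentials['username']}@{credentials['host']}", command]


def read_remote(credentials, command, timeout=10):
    """Run the command remotely and return its stdout (raises on failure)."""
    result = subprocess.run(ssh_argv(credentials, command), capture_output=True,
                            text=True, timeout=timeout, check=True)
    return result.stdout.strip()


# The key file name is hypothetical; /app/keys matches the Docker mount point.
argv = ssh_argv({"host": "10.0.0.2", "username": "admin",
                 "key_path": "/app/keys/id_ed25519"}, "sensors")
```

A timeout matters here: an unreachable host should surface as missing temperature data quickly instead of stalling the control loop.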
Major thanks go to NoLooseEnds's directions for the core commands, sulaweyo's ruby script for the idea of automating them, and nmaggioni's r710-fan-controller as the main forked project.
Note: The key difference of this script, other than handling remote hosts, is that it's based on the temperature of the CPUs' cores and not on the ambient temperature sensor on the server's motherboard.