Commit a0a146e

sradco and AI Assistant committed
Add multipath collector
Add a new disabled-by-default collector (--collector.multipath) that
exposes storage path health and redundancy metrics from two sources:

DM-Multipath (via multipathd Unix socket):
- node_multipath_daemon_up: multipathd reachability
- node_multipath_device_{info,active,size_bytes}: per-device state
- node_multipath_device_paths_{total,active,failed}: path counts
- node_multipath_device_path_faults_total: cumulative path faults
- node_multipath_path_{active,checker_state}: per-path detail

NVMe-oF Native Multipath (via /sys/class/nvme-subsystem/):
- node_multipath_nvme_subsystem_{info,paths_total,paths_live}
- node_multipath_nvme_path_state: per-controller connectivity

This fills a monitoring gap for storage connectivity. The existing NVMe
collector reports hardware health but is blind to fabric path failures.

Signed-off-by: Shirly Radco <sradco@redhat.com>
Co-authored-by: AI Assistant <noreply@cursor.com>
1 parent a1cbf81 commit a0a146e

File tree

7 files changed

+1387
-13
lines changed

README.md

Lines changed: 31 additions & 0 deletions
@@ -201,6 +201,7 @@ lnstat | Exposes stats from `/proc/net/stat/`. | Linux
 logind | Exposes session counts from [logind](http://www.freedesktop.org/wiki/Software/systemd/logind/). | Linux
 meminfo\_numa | Exposes memory statistics from `/sys/devices/system/node/node[0-9]*/meminfo`, `/sys/devices/system/node/node[0-9]*/numastat`. | Linux
 mountstats | Exposes filesystem statistics from `/proc/self/mountstats`. Exposes detailed NFS client statistics. | Linux
+multipath | Exposes DM-multipath and NVMe-oF path health from `multipathd` and `/sys/class/nvme-subsystem/`. | Linux
 network_route | Exposes the routing table as metrics | Linux
 pcidevice | Exposes pci devices' information including their link status and parent devices. | Linux
 perf | Exposes perf based metrics (Warning: Metrics are dependent on kernel configuration and settings). | Linux
@@ -339,6 +340,36 @@ echo 'role{role="application_server"} 1' > /path/to/directory/role.prom.$$
 mv /path/to/directory/role.prom.$$ /path/to/directory/role.prom
```

+### Multipath Collector
+
+The `multipath` collector exposes storage path health and redundancy from two
+independent data sources:
+
+- **DM-Multipath** — queries `multipathd` over its Unix socket for device-mapper
+multipath maps, path groups, and individual path states.
+- **NVMe-oF Native Multipath** — reads `/sys/class/nvme-subsystem/` for NVMe
+subsystem controller states, providing connectivity-layer visibility that the
+standard `nvme` collector does not cover.
+
+Either data source may be absent; the collector reports whatever is available.
+
+#### Flags
+
+Flag | Default | Description
+-----|---------|------------
+`--collector.multipath.socket-path` | `/run/multipathd.socket` | Path to the `multipathd` Unix socket.
+`--collector.multipath.timeout` | `5s` | Timeout for `multipathd` socket communication.
+
+#### Permissions
+
+The DM-Multipath portion requires read access to the `multipathd` Unix socket.
+Depending on the system, this may require running `node_exporter` as root or
+adding it to the appropriate group (commonly `disk` or the group owning
+`/run/multipathd.socket`).
+
+The NVMe subsystem portion only requires read access to sysfs, which is
+available to all users by default.
+
 ### Filtering enabled collectors

 The `node_exporter` will expose all metrics from enabled collectors by default. This is the recommended way to collect metrics to avoid errors when comparing metrics of different families.
Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
+{
+  "major_version": 0,
+  "minor_version": 1,
+  "maps": [
+    {
+      "name" : "mpathA",
+      "uuid" : "36001405d27e278d2c1e4938a3b4a8bb9",
+      "sysfs" : "dm-0",
+      "failback" : "immediate",
+      "queueing" : "off",
+      "paths" : 4,
+      "write_prot" : "rw",
+      "dm_st" : "active",
+      "features" : "0",
+      "hwhandler" : "1 alua",
+      "action" : "",
+      "path_faults" : 2,
+      "vend" : "LIO-ORG",
+      "prod" : "disk1",
+      "rev" : "4.0",
+      "switch_grp" : 1,
+      "map_loads" : 3,
+      "total_q_time" : 0,
+      "q_timeouts" : 0,
+      "path_groups": [
+        {
+          "selector" : "service-time 0",
+          "pri" : 50,
+          "dm_st" : "active",
+          "marginal_st" : "normal",
+          "group" : 1,
+          "paths": [
+            {
+              "dev" : "sda",
+              "dev_t" : "8:0",
+              "dm_st" : "active",
+              "dev_st" : "running",
+              "chk_st" : "ready",
+              "checker" : "tur",
+              "pri" : 50,
+              "host_wwnn" : "0x200100015b4a6702",
+              "target_wwnn" : "0x200300015b4a6702",
+              "host_wwpn" : "0x200200015b4a6702",
+              "target_wwpn" : "0x200400015b4a6702",
+              "host_adapter" : "0000:03:00.0",
+              "lun_hex" : "0x0000000000000000",
+              "marginal_st" : "normal"
+            },
+            {
+              "dev" : "sdb",
+              "dev_t" : "8:16",
+              "dm_st" : "active",
+              "dev_st" : "running",
+              "chk_st" : "ready",
+              "checker" : "tur",
+              "pri" : 50,
+              "host_wwnn" : "0x200100015b4a6702",
+              "target_wwnn" : "0x200300015b4a6702",
+              "host_wwpn" : "0x200200015b4a6702",
+              "target_wwpn" : "0x200500015b4a6702",
+              "host_adapter" : "0000:03:00.0",
+              "lun_hex" : "0x0000000000000000",
+              "marginal_st" : "normal"
+            }
+          ]
+        },
+        {
+          "selector" : "service-time 0",
+          "pri" : 10,
+          "dm_st" : "enabled",
+          "marginal_st" : "normal",
+          "group" : 2,
+          "paths": [
+            {
+              "dev" : "sdc",
+              "dev_t" : "8:32",
+              "dm_st" : "active",
+              "dev_st" : "running",
+              "chk_st" : "ghost",
+              "checker" : "tur",
+              "pri" : 10,
+              "host_wwnn" : "0x200100015b4a6703",
+              "target_wwnn" : "0x200300015b4a6703",
+              "host_wwpn" : "0x200200015b4a6703",
+              "target_wwpn" : "0x200400015b4a6703",
+              "host_adapter" : "0000:05:00.0",
+              "lun_hex" : "0x0000000000000000",
+              "marginal_st" : "normal"
+            },
+            {
+              "dev" : "sdd",
+              "dev_t" : "8:48",
+              "dm_st" : "failed",
+              "dev_st" : "running",
+              "chk_st" : "faulty",
+              "checker" : "tur",
+              "pri" : 0,
+              "host_wwnn" : "0x200100015b4a6703",
+              "target_wwnn" : "0x200300015b4a6703",
+              "host_wwpn" : "0x200200015b4a6703",
+              "target_wwpn" : "0x200500015b4a6703",
+              "host_adapter" : "0000:05:00.0",
+              "lun_hex" : "0x0000000000000000",
+              "marginal_st" : "normal"
+            }
+          ]
+        }
+      ]
+    },
+    {
+      "name" : "mpathB",
+      "uuid" : "36001405e4ec3a15c6724bb2aa3ecf54a",
+      "sysfs" : "dm-1",
+      "failback" : "immediate",
+      "queueing" : "off",
+      "paths" : 2,
+      "write_prot" : "rw",
+      "dm_st" : "active",
+      "features" : "0",
+      "hwhandler" : "1 alua",
+      "action" : "",
+      "path_faults" : 0,
+      "vend" : "LIO-ORG",
+      "prod" : "disk2",
+      "rev" : "4.0",
+      "switch_grp" : 0,
+      "map_loads" : 1,
+      "total_q_time" : 0,
+      "q_timeouts" : 0,
+      "path_groups": [
+        {
+          "selector" : "service-time 0",
+          "pri" : 50,
+          "dm_st" : "active",
+          "marginal_st" : "normal",
+          "group" : 1,
+          "paths": [
+            {
+              "dev" : "sde",
+              "dev_t" : "8:64",
+              "dm_st" : "active",
+              "dev_st" : "running",
+              "chk_st" : "ready",
+              "checker" : "tur",
+              "pri" : 50,
+              "host_wwnn" : "0x200100015b4a6702",
+              "target_wwnn" : "0x200300015b4a6702",
+              "host_wwpn" : "0x200200015b4a6702",
+              "target_wwpn" : "0x200400015b4a6702",
+              "host_adapter" : "0000:03:00.0",
+              "lun_hex" : "0x0001000000000000",
+              "marginal_st" : "normal"
+            },
+            {
+              "dev" : "sdf",
+              "dev_t" : "8:80",
+              "dm_st" : "active",
+              "dev_st" : "running",
+              "chk_st" : "ready",
+              "checker" : "tur",
+              "pri" : 50,
+              "host_wwnn" : "0x200100015b4a6702",
+              "target_wwnn" : "0x200300015b4a6702",
+              "host_wwpn" : "0x200200015b4a6702",
+              "target_wwpn" : "0x200500015b4a6702",
+              "host_adapter" : "0000:03:00.0",
+              "lun_hex" : "0x0001000000000000",
+              "marginal_st" : "normal"
+            }
+          ]
+        }
+      ]
+    }
+  ]
+}
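
As a reading aid for the fixture above, the following Python sketch shows one way per-device path counts such as `node_multipath_device_paths_{total,active,failed}` could be derived from JSON of this shape. It is not the collector's code (the collector itself is not shown in this excerpt). The function name `summarize_paths`, the abbreviated `SAMPLE` document, and the rule that a path is bucketed purely by its `dm_st` value are assumptions made for illustration.

```python
import json

# Abbreviated copy of the fixture: only the fields this sketch reads are kept.
SAMPLE = """
{
  "maps": [
    {"name": "mpathA", "path_groups": [
      {"paths": [{"dev": "sda", "dm_st": "active"},
                 {"dev": "sdb", "dm_st": "active"}]},
      {"paths": [{"dev": "sdc", "dm_st": "active"},
                 {"dev": "sdd", "dm_st": "failed"}]}
    ]},
    {"name": "mpathB", "path_groups": [
      {"paths": [{"dev": "sde", "dm_st": "active"},
                 {"dev": "sdf", "dm_st": "active"}]}
    ]}
  ]
}
"""


def summarize_paths(document: str) -> dict:
    """Return {map name: {"total", "active", "failed"}} for every map."""
    summary = {}
    for mp in json.loads(document).get("maps", []):
        # Flatten all path groups of the map into one list of dm_st values.
        states = [
            path.get("dm_st", "")
            for group in mp.get("path_groups", [])
            for path in group.get("paths", [])
        ]
        summary[mp["name"]] = {
            "total": len(states),
            # Assumption: dm_st alone decides whether a path is active or failed.
            "active": states.count("active"),
            "failed": states.count("failed"),
        }
    return summary


if __name__ == "__main__":
    for name, counts in summarize_paths(SAMPLE).items():
        print(name, counts)
```

With the fixture's data this yields four paths for `mpathA` (three active, one failed) and two active paths for `mpathB`, which matches the `"paths"` totals recorded on each map.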

collector/fixtures/sys.ttar

Lines changed: 111 additions & 13 deletions
@@ -2219,40 +2219,138 @@ Lines: 1
 Samsung SSD 970 PRO 512GB
 Mode: 444
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Path: sys/class/nvme/nvme0/serial
+Directory: sys/class/nvme/nvme0/nvme0c0n0
+Mode: 755
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme/nvme0/nvme0c0n0/ana_state
 Lines: 1
-S680HF8N190894I
+optimized
 Mode: 444
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Path: sys/class/nvme/nvme0/state
+Path: sys/class/nvme/nvme0/nvme0c0n0/nuse
 Lines: 1
-live
+488281250
 Mode: 444
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Directory: sys/class/nvme/nvme0/nvme0c0n0
+Directory: sys/class/nvme/nvme0/nvme0c0n0/queue
 Mode: 755
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Path: sys/class/nvme/nvme0/nvme0c0n0/ana_state
+Path: sys/class/nvme/nvme0/nvme0c0n0/queue/logical_block_size
 Lines: 1
-optimized
-Mode: 444
+4096
+Mode: 644
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 Path: sys/class/nvme/nvme0/nvme0c0n0/size
 Lines: 1
 3906250000
 Mode: 444
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Path: sys/class/nvme/nvme0/nvme0c0n0/nuse
+Path: sys/class/nvme/nvme0/serial
 Lines: 1
-488281250
+S680HF8N190894I
 Mode: 444
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Directory: sys/class/nvme/nvme0/nvme0c0n0/queue
+Path: sys/class/nvme/nvme0/state
+Lines: 1
+live
+Mode: 444
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Directory: sys/class/nvme-subsystem
 Mode: 755
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-Path: sys/class/nvme/nvme0/nvme0c0n0/queue/logical_block_size
+Directory: sys/class/nvme-subsystem/nvme-subsys0
+Mode: 755
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/iopolicy
 Lines: 1
-4096
+round-robinEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/model
+Lines: 1
+Dell PowerStoreEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Directory: sys/class/nvme-subsystem/nvme-subsys0/nvme0
+Mode: 755
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme0/address
+Lines: 1
+nn-0x200000109b123456:pn-0x100000109b123456EOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme0/state
+Lines: 1
+liveEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme0/transport
+Lines: 1
+fcEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Directory: sys/class/nvme-subsystem/nvme-subsys0/nvme1
+Mode: 755
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme1/address
+Lines: 1
+nn-0x200000109b123457:pn-0x100000109b123457EOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme1/state
+Lines: 1
+liveEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme1/transport
+Lines: 1
+fcEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Directory: sys/class/nvme-subsystem/nvme-subsys0/nvme2
+Mode: 755
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme2/address
+Lines: 1
+nn-0x200000109b123458:pn-0x100000109b123458EOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme2/state
+Lines: 1
+liveEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme2/transport
+Lines: 1
+fcEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Directory: sys/class/nvme-subsystem/nvme-subsys0/nvme3
+Mode: 755
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme3/address
+Lines: 1
+nn-0x200000109b123459:pn-0x100000109b123459EOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme3/state
+Lines: 1
+deadEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/nvme3/transport
+Lines: 1
+fcEOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/serial
+Lines: 1
+SN12345678EOF
+Mode: 644
+# ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Path: sys/class/nvme-subsystem/nvme-subsys0/subsysnqn
+Lines: 1
+nqn.2014-08.org.nvmexpress:uuid:a34c4f3a-0d6f-5cec-dead-beefcafebabeEOF
 Mode: 644
 # ttar - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 Directory: sys/class/power_supply
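
To illustrate what the `nvme-subsystem` fixture entries above are meant to exercise, here is a small Python sketch that walks a directory laid out like `/sys/class/nvme-subsystem/` and counts controllers per subsystem, roughly the information behind `node_multipath_nvme_subsystem_{paths_total,paths_live}`. It is an illustration only, not the collector's implementation. The helper names (`read_attr`, `subsystem_path_counts`, `demo`), the `nvme<N>` name filter, and the rule that only the state `live` counts as a live path are assumptions.

```python
import re
import tempfile
from pathlib import Path

# Assumption: controller entries are named nvme<N>; other entries are skipped.
CONTROLLER_NAME = re.compile(r"nvme\d+")


def read_attr(path: Path) -> str:
    """Read one sysfs-style attribute; return "" if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return ""


def subsystem_path_counts(root: Path) -> dict:
    """Return {subsystem: {"total", "live"}} for each subsystem under root."""
    counts = {}
    if not root.is_dir():
        # A missing directory means no data, which is not treated as an error.
        return counts
    for subsys in sorted(root.iterdir()):
        if not subsys.is_dir():
            continue
        states = [
            read_attr(entry / "state")
            for entry in sorted(subsys.iterdir())
            if entry.is_dir() and CONTROLLER_NAME.fullmatch(entry.name)
        ]
        counts[subsys.name] = {"total": len(states), "live": states.count("live")}
    return counts


def demo() -> dict:
    """Rebuild the controller states from the fixture in a temporary tree."""
    fixture_states = {"nvme0": "live", "nvme1": "live", "nvme2": "live", "nvme3": "dead"}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "nvme-subsystem"
        for name, state in fixture_states.items():
            ctrl = root / "nvme-subsys0" / name
            ctrl.mkdir(parents=True)
            # The fixture files have no trailing newline (ttar marks this with EOF).
            (ctrl / "state").write_text(state)
        return subsystem_path_counts(root)


if __name__ == "__main__":
    print(demo())
```

Run against the fixture's states, the sketch reports four controllers for `nvme-subsys0`, three of them live, with `nvme3` (state `dead`) as the one failed path.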
