If you are not running storagenode on a FreeBSD system connected to a UPS (especially a CyberPower one) managed by NUT, this does not apply to you and there is no reason to continue reading.
The problem
This is a “perfect storm” of circumstances that may result in data loss: when the UPS battery depletes, power gets yanked before the system is ready.
- NUT is designed for Linux, where by the time the shutdown -h command completes, the filesystem is synced and re-mounted read-only, at which point it’s safe to pull power. FreeBSD does not re-mount the filesystem; the kernel takes care of flushing caches and finalizing the filesystem after the shutdown command completes. So when NUT cuts UPS power as soon as the shutdown command completes, the result is an unclean shutdown. There is a parameter, offdelay, that defines the delay in seconds between the UPS kill command being issued and power being pulled. By default it’s 20 seconds, and unfortunately this is not enough for most servers.
- CyberPower UPSes, in particular, treat the offdelay parameter differently. They convert it to minutes and round down. Hence, the default 20-second value means zero in CyberPower’s world. It must be overridden to at least 60 to prevent the UPS from yanking power immediately.
- Separately, relying on the UPS’s critical battery alarm to shut down the server is a non-starter: most UPSes, especially when the battery deteriorates, do not provide enough runtime after the alarm to safely shut down the server. Most available CyberPower models don’t provide ways to calibrate the battery automatically and/or periodically, and when I contacted support asking how to do it, the rep spent 10 minutes searching for something, then claimed “this information is proprietary” and hung up. Therefore we want to configure shutdown manually, by remaining battery percentage or runtime thresholds.
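The CyberPower rounding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as described in this post (divide by 60, round down, treat the result as minutes); the function name is hypothetical, and actual firmware behavior may differ between models.

```python
def cyberpower_effective_offdelay(offdelay_seconds: int) -> int:
    """Return the delay (in seconds) the UPS would actually apply.

    Assumption (from the behavior described in this post): the UPS
    converts the requested value to whole minutes, rounding down.
    """
    if offdelay_seconds < 0:
        raise ValueError("offdelay must be non-negative")
    minutes = offdelay_seconds // 60  # integer division rounds down
    return minutes * 60


if __name__ == "__main__":
    # Compare the NUT default (20) with a few candidate overrides.
    for requested in (20, 59, 60, 120):
        effective = cyberpower_effective_offdelay(requested)
        print(f"offdelay={requested}s -> effective delay {effective}s")
```

Under this assumption, anything below 60 collapses to an immediate power cut, which is why the default of 20 is unsafe on these units.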
The solution
Add the following to ups.conf
Configure runtime monitoring
Ignore the low battery state reported by the UPS and instead go by remaining runtime and state of charge. In this example, we set the thresholds to 20% battery charge and 5 minutes of remaining runtime; shutdown is triggered by whichever threshold is reached first.
ignorelb
override.battery.charge.low = 20
override.battery.runtime.low = 300
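As a rough mental model of what these three settings do together, the low-battery decision can be sketched as below. This is not NUT's code: the function name is hypothetical, and whether the driver uses a strict or inclusive comparison is an assumption here (strict is used in this sketch).

```python
CHARGE_LOW_PERCENT = 20    # mirrors override.battery.charge.low
RUNTIME_LOW_SECONDS = 300  # mirrors override.battery.runtime.low


def is_low_battery(charge_percent: float, runtime_seconds: float,
                   charge_low: float = CHARGE_LOW_PERCENT,
                   runtime_low: float = RUNTIME_LOW_SECONDS) -> bool:
    """Return True once either threshold has been crossed.

    The UPS's own low-battery flag is deliberately not consulted,
    which is the point of ignorelb.
    Assumption: strict '<' comparison; the real driver may differ.
    """
    return charge_percent < charge_low or runtime_seconds < runtime_low


if __name__ == "__main__":
    print(is_low_battery(50, 900))  # plenty of charge and runtime
    print(is_low_battery(15, 900))  # charge below threshold
    print(is_low_battery(50, 200))  # runtime below threshold
```

Either condition alone is enough, so a deteriorated battery that still reports a high charge but a short runtime will trigger shutdown as well.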
CyberPower-specific offdelay override
CyberPower UPSes divide the value by 60, round down, and use the resulting number of minutes as the delay. In addition, on some models, the ondelay parameter must be set to zero to ensure proper power-up behavior. (What a crock of shit those devices are… No more buying them. But they are cheap.)
ondelay=0
offdelay=120
These values result in a 2-minute power-off delay (120/60) and a 10-second power-on delay (evidently the internal UPS default on my specific model) when power is restored.