nut-upsd connection lost #67
Comments
+1 |
1 similar comment
+1 |
Hello, the Docker tag used is latest = 2.8.1-r0
also have this alert if it helps
after a while:
after more time:
|
What is the output of this command (part of Dockerfile's
An important assumption is that the |
Hello, here is the requested information. Tag used: latest = 2.8.1-r0
|
I want to mention that nut-upsd:2.8.0-r4 works ok and has the following output:
|
The command output I want to see is
which expands the variable $NAME. It should not expand a non-existent variable $CP1500EPFCLCD; the following is incorrect:
Someone reported a bug with there being two
Does the symptom still appear when you comment out the NAME: CP1500EPFCLCD environment variable in docker-compose.yml? |
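For readers following along, here is a minimal Python sketch of how the `upsname@host:port` argument for upsc is put together from the NAME variable. The function name `upsc_target` is hypothetical, and the fallback name "ups" is an assumption based on the "Connected to UPS [ups]" log line that appears later in this thread once NAME is commented out.

```python
import os


def upsc_target(env=None, host="localhost", port=3493):
    """Build the `upsname@host:port` argument that upsc expects."""
    env = os.environ if env is None else env
    # Assumption: the container falls back to the UPS name "ups" when NAME
    # is unset, as the logs later in this thread suggest.
    name = env.get("NAME") or "ups"
    return f"{name}@{host}:{port}"


# The variable's *value* ends up in the target, never a variable named after it.
print(upsc_target({"NAME": "CP1500EPFCLCD"}))
print(upsc_target({}))
```

The point is only that the value of NAME is substituted once; a second expansion of that value as if it were another variable would produce an empty UPS name.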
Old 2.8.0-r4:
/ # upsc $NAME@localhost:3493 2>&1
Init SSL without certificate database
battery.charge: 100
battery.charge.low: 10
battery.charge.warning: 20
battery.mfr.date: CPS
battery.runtime: 3150
battery.runtime.low: 300
battery.type: PbAcid
battery.voltage: 24.0
battery.voltage.nominal: 24
device.mfr: CPS
device.model: CP1500EPFCLCD
device.serial: xxx
device.type: ups
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.serial: xxx
driver.parameter.synchronous: auto
driver.version: 2.8.0
driver.version.data: CyberPower HID 0.6
driver.version.internal: 0.47
driver.version.usb: libusb-1.0.26 (API: 0x1000109)
input.transfer.high: 260
input.transfer.low: 170
input.voltage: 232.0
input.voltage.nominal: 230
output.voltage: 232.0
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.delay.start: 30
ups.load: 13
ups.mfr: CPS
ups.model: CP1500EPFCLCD
ups.productid: 0501
ups.realpower.nominal: 900
ups.serial: xxx
ups.status: OL
ups.test.result: No test initiated
ups.timer.shutdown: -60
ups.timer.start: -60
ups.vendorid: 0764
New latest 2.8.1-r0:
/ # upsc $NAME@localhost:3493 2>&1
Init SSL without certificate database
battery.charge: 100
battery.charge.low: 10
battery.charge.warning: 20
battery.mfr.date: CPS
battery.runtime: 3120
battery.runtime.low: 300
battery.type: PbAcid
battery.voltage: 24.0
battery.voltage.nominal: 24
device.mfr: CPS
device.model: CP1500EPFCLCD
device.serial: xxx
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.serial: xxx
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1
driver.version.data: CyberPower HID 0.8
driver.version.internal: 0.52
driver.version.usb: libusb-1.0.26 (API: 0x1000109)
input.transfer.high: 260
input.transfer.low: 170
input.voltage: 230.0
input.voltage.nominal: 230
output.voltage: 230.0
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.delay.start: 30
ups.load: 13
ups.mfr: CPS
ups.model: CP1500EPFCLCD
ups.productid: 0501
ups.realpower.nominal: 900
ups.serial: xxx
ups.status: OL
ups.test.result: No test initiated
ups.timer.shutdown: -60
ups.timer.start: -60
ups.vendorid: 0764 |
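Comparing the two dumps above by eye is tedious, so here is a small hedged Python sketch that parses upsc output into a dictionary and lists the differences. The helper names `parse_upsc` and `diff_upsc` are made up for illustration; the only assumption about the format is the `key: value` layout visible in the dumps.

```python
def parse_upsc(text):
    """Parse upsc output (one `key: value` pair per line) into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        # Skip banner lines such as "Init SSL without certificate database".
        if sep and " " not in key.strip():
            values[key.strip()] = value.strip()
    return values


def diff_upsc(old, new):
    """Return (added keys, removed keys, changed values) between two dumps."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {
        key: (old[key], new[key])
        for key in sorted(old.keys() & new.keys())
        if old[key] != new[key]
    }
    return added, removed, changed


old_dump = parse_upsc("Init SSL without certificate database\n"
                      "driver.version: 2.8.0\nups.status: OL\n")
new_dump = parse_upsc("Init SSL without certificate database\n"
                      "driver.state: quiet\ndriver.version: 2.8.1\nups.status: OL\n")
print(diff_upsc(old_dump, new_dump))
```

Run against the full dumps, this would surface the keys that only the newer driver reports (driver.debug, driver.flag.allow_killpower, driver.state) alongside the version changes.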
I'll test this now. Does the symptom still appear when you comment out the NAME: CP1500EPFCLCD environment variable in docker-compose.yml? |
Commented out NAME: CP1500EPFCLCD in docker-compose.yml; after not even a few minutes the error came back. Logs:
docker logs nut-server
Network UPS Tools - UPS driver controller 2.8.1
Network UPS Tools - Generic HID driver 0.52 (2.8.1)
Using subdriver: CyberPower HID 0.8
USB communication driver (libusb 1.0) 0.46
WARNING: Needed to fix group access to filesystem socket of this driver, but failed; run the driver with more debugging to see how exactly.
Consumers of the socket, such as upsd data server, can fail to interact with the driver and represent the device: /var/run/nut/usbhid-ups-ups
Ignoring invalid pid number 0
Network UPS Tools upsd 2.8.1
listening on 0.0.0.0 port 3493
Connected to UPS [ups]: usbhid-ups-ups
Found 1 UPS defined in ups.conf
Network UPS Tools upsmon 2.8.1
0.000000 Ignoring invalid pid number 0
0.000028 [D1] Just failed to send signal, no daemon was running
0.000189 Using power down flag file /etc/killpower
0.000365 UPS: ups@localhost (primary) (power value 1)
0.000396 [D1] debug level is '1'
0.000621 [D1] Saving PID 25 into /run/upsmon.pid
0.000789 [D1] Succeeded to become_user(nut): now UID=100 GID=101
Init SSL without certificate database
0.001894 upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
0.001904 [D1] Trying to connect to UPS [ups@localhost]
0.002217 [D1] Logged into UPS ups@localhost
565.061320 Poll UPS [ups@localhost] failed - Data stale
565.061394 Communications with UPS ups@localhost lost
sh: wall: not found
570.062669 Poll UPS [ups@localhost] failed - Data stale
575.063161 Poll UPS [ups@localhost] failed - Data stale
580.063387 Poll UPS [ups@localhost] failed - Data stale
585.063580 Poll UPS [ups@localhost] failed - Data stale
590.064142 Poll UPS [ups@localhost] failed - Data stale
595.064605 Poll UPS [ups@localhost] failed - Data stale
600.065096 Poll UPS [ups@localhost] failed - Data stale
605.065808 Poll UPS [ups@localhost] failed - Data stale
610.066411 Poll UPS [ups@localhost] failed - Data stale
615.066849 Poll UPS [ups@localhost] failed - Data stale
620.067402 Poll UPS [ups@localhost] failed - Data stale
625.067720 Poll UPS [ups@localhost] failed - Data stale
630.068213 Poll UPS [ups@localhost] failed - Data stale
635.068767 Poll UPS [ups@localhost] failed - Data stale
640.069205 Poll UPS [ups@localhost] failed - Data stale
When the error happens:
/ # upsc $NAME@localhost:3493 2>&1
Init SSL without certificate database
Error: Data stale
When I restart the container, right before it errors out:
/ # upsc $NAME@localhost:3493 2>&1
Init SSL without certificate database
battery.charge: 100
battery.charge.low: 10
battery.charge.warning: 20
battery.mfr.date: CPS
battery.runtime: 13950
battery.runtime.low: 300
battery.type: PbAcid
battery.voltage: 24.0
battery.voltage.nominal: 24
device.mfr: CPS
device.model: CP1500EPFCLCD
device.serial: xxx
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.serial: xxx
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1
driver.version.data: CyberPower HID 0.8
driver.version.internal: 0.52
driver.version.usb: libusb-1.0.26 (API: 0x1000109)
input.transfer.high: 260
input.transfer.low: 170
input.voltage: 237.0
input.voltage.nominal: 230
output.voltage: 237.0
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.delay.start: 30
ups.load: 0
ups.mfr: CPS
ups.model: CP1500EPFCLCD
ups.productid: 0501
ups.realpower.nominal: 900
ups.serial: xxx
ups.status: OL
ups.test.result: No test initiated
ups.timer.shutdown: -60
ups.timer.start: -60
ups.vendorid: 0764
Let me know if anything else is needed; I'm happy to help. |
@hellcry37 @poliant @KC-inDomus please invoke
Other users: if this causes unwanted restarts for other UPS units, please make me aware of it here or in a new issue. This latest change is a bit risky so I might have to add protective logic for the default case. |
Does this fix the issue, or does it just restart the container when "Data stale" is detected? If the issue is not fixed, the container will end up restarting in a loop at some point, which will not help. What does MAXAGE: 25 do? |
If the issue is caused by the UPS unit taking longer than 15 seconds (the default value of MAXAGE) to report its data, then it will be fixed. Please change that parameter and see if the container stays running without restarts. Googling for this problem is inconclusive. |
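To make the MAXAGE explanation concrete, here is a deliberately simplified Python model of the staleness check. It is a sketch of the idea only, not upsd's actual implementation, and the function name `is_stale` is hypothetical: data counts as stale once the driver's last update is older than MAXAGE seconds.

```python
def is_stale(last_update, now, maxage=15.0):
    """True when the driver's last update is older than `maxage` seconds."""
    return (now - last_update) > maxage


# A driver that goes quiet for 20 seconds trips the default of 15,
# but not the MAXAGE: 25 setting suggested in this thread.
print(is_stale(last_update=0.0, now=20.0))
print(is_stale(last_update=0.0, now=20.0, maxage=25.0))
```

Under this model, raising MAXAGE only helps if the driver resumes reporting within the new window; a driver that stops updating entirely still ends up stale.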
will change it now and come back |
So far it seems to work OK. However, there is a warning, and I don't know whether it is of any importance: WARNING: Needed to fix group access to filesystem socket of this driver, but failed; run the driver with more debugging to see how exactly. I'll come back tomorrow to tell you if this worked as it should. Thank you for all your hard work and your patience! |
Ahhh, too soon. It just failed again and the container restarted:
74afb66911c9 instantlinux/nut-upsd:latest "/bin/sh -c /usr/loc…" 29 minutes ago Up 59 seconds (healthy) 0.0.0.0:3493->3493/tcp, :::3493->3493/tcp nut-server
docker logs nut-server
Network UPS Tools - UPS driver controller 2.8.1
Network UPS Tools - Generic HID driver 0.52 (2.8.1)
Using subdriver: CyberPower HID 0.8
USB communication driver (libusb 1.0) 0.46
WARNING: Needed to fix group access to filesystem socket of this driver, but failed; run the driver with more debugging to see how exactly.
Consumers of the socket, such as upsd data server, can fail to interact with the driver and represent the device: /var/run/nut/usbhid-ups-CP1500EPFCLCD
Network UPS Tools upsd 2.8.1
Ignoring invalid pid number 0
listening on 0.0.0.0 port 3493
Connected to UPS [CP1500EPFCLCD]: usbhid-ups-CP1500EPFCLCD
Found 1 UPS defined in ups.conf
Network UPS Tools upsmon 2.8.1
0.000000 Ignoring invalid pid number 0
0.000028 [D1] Just failed to send signal, no daemon was running
0.000403 Using power down flag file /etc/killpower
0.000758 UPS: CP1500EPFCLCD@localhost (primary) (power value 1)
0.000792 [D1] debug level is '1'
0.001066 [D1] Saving PID 26 into /run/upsmon.pid
0.001247 [D1] Succeeded to become_user(nut): now UID=100 GID=101
Init SSL without certificate database
0.002293 upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
0.002301 [D1] Trying to connect to UPS [CP1500EPFCLCD@localhost]
0.002624 [D1] Logged into UPS CP1500EPFCLCD@localhost
1695.185795 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
1695.185828 Communications with UPS CP1500EPFCLCD@localhost lost
sh: wall: not found
1700.186960 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
1705.187683 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
1710.188270 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
1715.188906 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
1718.291046 Signal 15: exiting
1718.292958 upsmon parent: read
Network UPS Tools - UPS driver controller 2.8.1
Network UPS Tools - Generic HID driver 0.52 (2.8.1)
Using subdriver: CyberPower HID 0.8
USB communication driver (libusb 1.0) 0.46
WARNING: Needed to fix group access to filesystem socket of this driver, but failed; run the driver with more debugging to see how exactly.
Consumers of the socket, such as upsd data server, can fail to interact with the driver and represent the device: /var/run/nut/usbhid-ups-CP1500EPFCLCD
Ignoring invalid pid number 0
Network UPS Tools upsd 2.8.1
listening on 0.0.0.0 port 3493
Connected to UPS [CP1500EPFCLCD]: usbhid-ups-CP1500EPFCLCD
Found 1 UPS defined in ups.conf
Network UPS Tools upsmon 2.8.1
0.000000 Ignoring invalid pid number 0
0.000068 [D1] Just failed to send signal, no daemon was running
0.000312 Using power down flag file /etc/killpower
0.000451 UPS: CP1500EPFCLCD@localhost (primary) (power value 1)
0.000494 [D1] debug level is '1'
0.000929 [D1] Saving PID 19 into /run/upsmon.pid
0.001216 [D1] Succeeded to become_user(nut): now UID=100 GID=101
Init SSL without certificate database
0.002909 upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
0.002927 [D1] Trying to connect to UPS [CP1500EPFCLCD@localhost]
0.003618 [D1] Logged into UPS CP1500EPFCLCD@localhost
125.016864 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
125.016908 Communications with UPS CP1500EPFCLCD@localhost lost
sh: wall: not found
130.017894 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
135.018449 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
140.019069 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
145.019420 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
150.019670 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
150.633707 Signal 15: exiting
150.635504 upsmon parent: read
Network UPS Tools - UPS driver controller 2.8.1
Network UPS Tools - Generic HID driver 0.52 (2.8.1)
USB communication driver (libusb 1.0) 0.46
Using subdriver: CyberPower HID 0.8
WARNING: Needed to fix group access to filesystem socket of this driver, but failed; run the driver with more debugging to see how exactly.
Consumers of the socket, such as upsd data server, can fail to interact with the driver and represent the device: /var/run/nut/usbhid-ups-CP1500EPFCLCD
Ignoring invalid pid number 0
Network UPS Tools upsd 2.8.1
listening on 0.0.0.0 port 3493
Connected to UPS [CP1500EPFCLCD]: usbhid-ups-CP1500EPFCLCD
Found 1 UPS defined in ups.conf
0.000000 Ignoring invalid pid number 0
Network UPS Tools upsmon 2.8.1
0.000062 [D1] Just failed to send signal, no daemon was running
0.000482 Using power down flag file /etc/killpower
0.000698 UPS: CP1500EPFCLCD@localhost (primary) (power value 1)
0.000770 [D1] debug level is '1'
0.001433 [D1] Saving PID 19 into /run/upsmon.pid
0.001898 [D1] Succeeded to become_user(nut): now UID=100 GID=101
Init SSL without certificate database
0.004543 upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
0.004565 [D1] Trying to connect to UPS [CP1500EPFCLCD@localhost]
0.005634 [D1] Logged into UPS CP1500EPFCLCD@localhost
So at the moment it will restart in a loop: work, fail, restart, work, fail, restart, ... |
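To quantify how long each run survives, the logs above can be summarized with a short Python sketch. It assumes the leading number on each upsmon line is seconds since that upsmon instance started, and that every "Network UPS Tools upsmon" banner marks a new run; the function name `first_stale_per_run` is hypothetical.

```python
import re

# Matches lines such as:
# "1695.185795 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale"
STALE = re.compile(r"^\s*(\d+\.\d+)\s+Poll UPS \[[^\]]+\] failed - Data stale")
START = "Network UPS Tools upsmon"


def first_stale_per_run(log_text):
    """Return seconds from each upsmon start to its first "Data stale" poll."""
    results = []
    waiting = False  # True after a start banner until the first stale line
    for line in log_text.splitlines():
        if START in line:
            waiting = True
            continue
        match = STALE.match(line)
        if match and waiting:
            results.append(float(match.group(1)))
            waiting = False
    return results


sample = """Network UPS Tools upsmon 2.8.1
1695.185795 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
1700.186960 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
Network UPS Tools upsmon 2.8.1
125.016864 Poll UPS [CP1500EPFCLCD@localhost] failed - Data stale
"""
print(first_stale_per_run(sample))
```

Feeding it the output of `docker logs nut-server` would give one number per run, which makes it easier to see whether a configuration change actually lengthens the time before the first stale poll.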
In the container under directory /etc/nut, you'll find the config files. Copy those off into a directory on your local host (
Check that the value of MAXAGE in
How often is the restart happening, after the first hour or so of running? If changes to some of those other variables have any effect, can you get it to stay stable for longer than several hours at a time? I'll be happy to add support for more variables once this is confirmed. |
I did not see the point of modifying the Docker setup so much just to create a volume and add files there, so I did all of this inside the container to get the data for you.
docker exec -it nut-server sh
# =======================================================================
# MAXAGE <seconds>
MAXAGE 25
So from this we can conclude the MAXAGE variable is applied correctly. The container restarts once every 1 to 6 hours; it is not a well-defined interval, to be honest. In HA it dropped on:
As you can see, there is no well-defined pattern :) |
switched to local config files mode! What should I test next? |
I can't think of what specific changes might address the problem. Searching online, I found a few discussions about the CyberPower stale-data problem posted over the past several years: Those, along with comments found in the distributed config files under /etc/nut, should help point you to the solutions others have found. Once you can get it stable 24hrs, post the resulting values here. Thanks! |
Whatever I do, it still fails. I'll just use the old version and move on with my life |
OK fair enough. Hopefully the restart logic is now stable enough to use, and that others will post their successful parameter values for this model or others that don't behave the same way as APC. |
Hello @instantlinux, I've been facing the stale-data issue with two CyberPower UPS. The I am mounting a custom By creating and mounting a
Both of the UPS devices have been running for 36 hours continuously without issue. I may try to lower the Here's the relevant details of my compose file in case this is helpful.
I'd be happy to test this out further with you to confirm the |
I have the same manufacturer and the UPS PowerWalker VI 1200 SHL, the only thing that reliably works for me is to soft-reset the port using |
I'm using the nut-upsd image on an RPi 4 for a Tecnoware Era Plus 1100 and everything works fine except in the following cases:
In both cases the port changes (e.g. from /dev/bus/usb/00X/00Z to /dev/bus/usb/00Y/00W).
To always have the same port, I created a rule so that the USB port is always mapped to /dev/myUPS, and I configured that port in my Dockerfile:
Even with this configuration I lose the connection to the UPS in both of the cases I reported above. Of course the first case is not a real one; I only used it for testing. I have also checked that when the USB is reconnected and/or the UPS re-establishes the connection, the symbolic link to the USB port is updated correctly.
Thanks for the support
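For anyone repeating the symlink check described above, here is a small hedged Python sketch that reports which device node a stable link currently points to. The path /dev/myUPS comes from the comment above; the function name `resolve_device` is hypothetical.

```python
import os


def resolve_device(link_path):
    """Return the real device node a symlink points to, or None if it is broken."""
    real = os.path.realpath(link_path)
    return real if os.path.exists(real) else None


# Example: compare the result on the host before and after replugging the UPS.
print(resolve_device("/dev/myUPS"))
```

Running it on the host before and after a reconnect shows whether the link followed the device to its new /dev/bus/usb path; a result of None means the link exists but its target is gone.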