something broke, yesterday - everything unavailable for all 4 of my devices #162

Open
Anto79-ops opened this issue Apr 25, 2025 · 9 comments
Labels
bug Something isn't working

Anto79-ops commented Apr 25, 2025

This has been running well here for a year or so; 1.8.5 is installed on four of my RPis, and I run the MQTT broker add-on in HA. As soon as I updated to 2025.4.4, all 4 of my devices went offline. The MQTT broker seems to be working, because I have other MQTT integrations, even on the same RPis, that are still working, so I'm not sure what is happening here. Here are the verbose logs for RPi-Reporter:

pi@raspberrypi:/opt/RPi-Reporter-MQTT2HA-Daemon $ python3 /opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py -d -v
[2025-04-25 09:19:23] - (DBG): --------------------------------------------------------------------
[2025-04-25 09:19:23] - ISP-RPi-mqtt-daemon.py v1.8.5
[2025-04-25 09:19:23] - Verbose enabled
[2025-04-25 09:19:23] - (DBG): Debug enabled
[2025-04-25 09:19:23] - (DBG): * init mqtt_client_connected=[False]
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 169, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 700, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 395, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 234, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/lib/python3.9/http/client.py", line 1259, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.9/http/client.py", line 1305, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.9/http/client.py", line 1254, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.9/http/client.py", line 1014, in _send_output
    self.send(msg)
  File "/usr/lib/python3.9/http/client.py", line 954, in send
    self.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 200, in connect
    conn = self._new_conn()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 181, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fa208f940>: Failed to establish a new connection: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 756, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 576, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='kz0q.com', port=80): Max retries exceeded with url: /daemon-releases (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa208f940>: Failed to establish a new connection: [Errno 110] Connection timed out'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 346, in <module>
    getDaemonReleases() # and load them!
  File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 315, in getDaemonReleases
    response = requests.request('GET', 'http://kz0q.com/daemon-releases', verify=False)
  File "/usr/local/lib/python3.9/dist-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/dist-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='kz0q.com', port=80): Max retries exceeded with url: /daemon-releases (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa208f940>: Failed to establish a new connection: [Errno 110] Connection timed out'))

There is nothing in the release notes for HA Core 2025.4.4 that suggests a change in MQTT, so I'm at a loss as to why all 4 of my devices cannot connect to the MQTT broker.

Any suggestions?

Here is the bug script output:

pi@raspberrypi:/opt/RPi-Reporter-MQTT2HA-Daemon $ cat genBugInfo-250425-094040.lst
# SCRIPT genBugInfo v1.1 run 25/04/25-09:40:40
# ----------------------------------------------------------------------

# /bin/cat /etc/apt/sources.list | /bin/egrep -v '#'

deb http://deb.debian.org/debian bullseye main contrib non-free
deb http://security.debian.org/debian-security bullseye-security main contrib non-free
deb http://deb.debian.org/debian bullseye-updates main contrib non-free

 ----

# /bin/cat /etc/apt/sources.list | /bin/egrep -v '#' | /usr/bin/awk '{ print $3 }' | /bin/grep . | /usr/bin/sort -u | head -1

bullseye

 ----

# /bin/uname -r

6.12.22-v8+

 ----

# /bin/hostname -f

raspberrypi

 ----

# /usr/bin/uptime

 09:40:40 up 11 min,  1 user,  load average: 0.16, 0.12, 0.07

 ----

# /sbin/ifconfig

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.162  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::2a32:80fe:af29:19ab  prefixlen 64  scopeid 0x20<link>
        inet6 fd95:2175:eb06:afe6:c1d4:8e45:c8b:2f09  prefixlen 64  scopeid 0x0<global>
        ether 00:e0:4c:68:05:39  txqueuelen 1000  (Ethernet)
        RX packets 29244  bytes 4561338 (4.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 928  bytes 104811 (102.3 KiB)
        TX errors 0  dropped 3 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 19  bytes 3677 (3.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 19  bytes 3677 (3.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


 ----

# /sbin/ifconfig | /bin/egrep 'Link|flags|inet|ether' | /bin/egrep -v -i 'lo:|loopback|inet6|\:\:1|127\.0\.0\.1'

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.162  netmask 255.255.255.0  broadcast 192.168.1.255
        ether 00:e0:4c:68:05:39  txqueuelen 1000  (Ethernet)

 ----

# /sbin/route

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.1.1     0.0.0.0         UG    202    0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     202    0        0 eth0

 ----

# /bin/ls -l /var/log/dpkg.log /var/log/dpkg.log.1 2>/dev/null

-rw-r--r-- 1 root root  7779 Apr 24 10:42 /var/log/dpkg.log
-rw-r--r-- 1 root root 18577 Apr  9 18:31 /var/log/dpkg.log.1

 ----

# /bin/grep 'status installed' /var/log/dpkg.log /var/log/dpkg.log.1 2>/dev/null | sort | tail -1

/var/log/dpkg.log:2025-04-24 10:42:03 status installed man-db:arm64 2.9.4-2

 ----

# /bin/df -m

Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/root          29668  4755     23675  17% /
devtmpfs              78     0        78   0% /dev
tmpfs                209     0       209   0% /dev/shm
tmpfs                 84     1        83   2% /run
tmpfs                  5     1         5   1% /run/lock
/dev/mmcblk0p1       255    65       191  26% /boot
tmpfs                 42     0        42   0% /run/user/1000

 ----

# /bin/df -m | /usr/bin/tail -n +2 | /bin/egrep -v 'tmpfs|boot'

/dev/root          29668  4755     23675  17% /

 ----

# ls -l /opt/vc/bin/vcgencmd /usr/bin/vcgencmd

ls: cannot access '/opt/vc/bin/vcgencmd': No such file or directory
-rwxr-xr-x 1 root root 22920 Mar 22  2023 /usr/bin/vcgencmd

 ----
Anto79-ops added the bug label Apr 25, 2025

@Dangermouse-UK

Don't blame Home Assistant! It is the check for updates that fails!

I have removed this check from one of mine that sits behind a firewall, because it can't resolve the update host, and that one survived. This awful feature causes problems for others and just needs removing. Just comment out the following line:

getDaemonReleases() # and load them!
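
For anyone who would rather keep the update check than delete it: the traceback above suggests the request at line 315 runs with no timeout and nothing catching the connection error, so an unreachable kz0q.com takes the whole daemon down. A minimal sketch of a more forgiving fetch is below. It is not the daemon's code; `fetch_release_list` and its `opener` parameter are hypothetical names, and it uses the standard library's `urllib` instead of `requests` only so that it runs without extra packages.

```python
import urllib.request


def fetch_release_list(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return the response body as text, or None if the host is unreachable.

    Hypothetical helper for illustration; not part of the daemon.
    `opener` is injectable so the failure path can be exercised offline.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError as exc:
        # urllib.error.URLError and TimeoutError are both OSError subclasses,
        # so DNS failures, refused connections and timeouts all land here.
        print(f"version check skipped: {exc}")
        return None


if __name__ == "__main__":
    releases = fetch_release_list("http://kz0q.com/daemon-releases")
    if releases is None:
        print("carrying on without release information")
```

With something along these lines, a dead update server would cost one log line instead of stopping the MQTT reporting.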

Anto79-ops changed the title from "HA Core update to 2025.4.4 broke something" to "something broke, yesterday - everything unavailable for all 4 of my devices" Apr 25, 2025

@Anto79-ops
Author

OK, thank you for confirming that! I changed the issue title.

And yes, your fix has brought everything back.

For me, it was line 346.

@martinjones12345 thank you!

@howiehowie93

howiehowie93 commented Apr 25, 2025

Came looking for the status check terminal command because my only remaining working Pi went offline at 09:57 this morning.

I just need to work out how to access the file to comment it out, I guess?

OK, so I edited the file with nano and commented out line 376 with a #, and it still failed to start!

Edit: I found the reload command, and it works again and can be seen in HA.

TY @Anto79-ops

@Anto79-ops
Author

Anto79-ops commented Apr 25, 2025

Here's what I did:

  1. cd /opt/RPi-Reporter-MQTT2HA-Daemon
  2. sudo nano -c ISP-RPi-mqtt-daemon.py
  3. scroll down to around line 340, look for getDaemonReleases() # and load them! and add a # before it
  4. save and exit the file editor
  5. sudo systemctl stop isp-rpi-reporter.service
  6. sudo systemctl daemon-reload
  7. sudo systemctl start isp-rpi-reporter.service

This worked for all 4 of my devices.

EDIT: there is no need to run daemon-reload (step 6), as per this comment
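
If you have several Pis to patch, the manual edit can be scripted. The sketch below is only one way to do it and makes assumptions: `disable_version_check` is a hypothetical helper, and instead of commenting the call out it swaps it for `pass`, so the file stays valid Python even where the call is the only statement inside an `if` block.

```python
import re

# Matches a line that starts (after indentation) with the call itself,
# so the "def getDaemonReleases():" definition line is left alone.
CALL = re.compile(r"^([ \t]*)getDaemonReleases\(\)")


def disable_version_check(source: str) -> str:
    """Return `source` with every getDaemonReleases() call replaced by `pass`."""
    patched = []
    for line in source.splitlines(keepends=True):
        match = CALL.match(line)
        if match:
            ending = "\n" if line.endswith("\n") else ""
            indent = match.group(1)
            patched.append(f"{indent}pass  # disabled: {line.strip()}{ending}")
        else:
            patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    path = "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py"
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    # Print the result for review rather than overwriting the file in place.
    print(disable_version_check(original))
```

Review the output and keep a copy of the original file before replacing it, then restart isp-rpi-reporter.service as in the steps above.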

@bsimmo

bsimmo commented Apr 25, 2025

Duplicate of #152
Duplicate of #114

@upsuper

upsuper commented Apr 26, 2025

You probably want to comment out this as well:

if timeNow > daemon_last_fetch_time + kVersionCheckIntervalInSeconds:
    getDaemonReleases() # and load them!

Otherwise it will start failing again in 12hrs.

@upsuper

upsuper commented Apr 26, 2025

Also, if you are only updating the .py file, there is no need to run daemon-reload, as the service definition has not changed.

@Dangermouse-UK

Dangermouse-UK commented Apr 26, 2025

You probably want to comment out this as well:

RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py

Lines 1899 to 1900 in 259e42c

if timeNow > daemon_last_fetch_time + kVersionCheckIntervalInSeconds:
    getDaemonReleases() # and load them!

Otherwise it will start failing again in 12hrs.

This isn't strictly needed, as I never commented this out originally, but I agree it is much more sensible and safe to do so.
For whatever reason, this if clause's condition doesn't get met (I assume because there is no last fetch time?).

However, I have now also commented this out, and if you are going to do so, make sure you comment out both lines! (I rushed this and only commented out the getDaemonReleases() call, which broke it again and proved to me this clause never gets evaluated.)
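
For reference, the breakage from commenting out only the call can be reproduced without the daemon. The sketch below is an illustration, not the daemon's code: `next_statement` is a made-up stand-in for whatever follows the `if` in the real file, and `parses` is a hypothetical helper. An `if` whose only body line is a comment is rejected when Python parses the file, before anything runs, so this kind of failure appears whether or not the condition would ever be true.

```python
# Only the call is commented out: the `if` is left with no body.
BODY_ONLY = (
    "if timeNow > daemon_last_fetch_time + kVersionCheckIntervalInSeconds:\n"
    "#    getDaemonReleases() # and load them!\n"
    "next_statement = 1\n"
)

# Both lines are commented out: nothing is left dangling.
BOTH_LINES = (
    "#if timeNow > daemon_last_fetch_time + kVersionCheckIntervalInSeconds:\n"
    "#    getDaemonReleases() # and load them!\n"
    "next_statement = 1\n"
)


def parses(source: str) -> bool:
    """Return True if `source` compiles; nothing in it is executed."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


if __name__ == "__main__":
    print("only the call commented out:", parses(BODY_ONLY))
    print("both lines commented out:", parses(BOTH_LINES))
```

Running `python3 -m py_compile ISP-RPi-mqtt-daemon.py` after editing is a quick way to catch this before restarting the service.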

@Anto79-ops
Author

Seems to be solved since the manual update of the file.

Should I close or reverse the changes?
