Node stopped receiving ingress, but it's not full

It would be good to have statistics on why a node is not receiving traffic. Here is an example.



The node is not receiving traffic.
There were 5 nodes like this.
I moved 4 of them to another IP and they started working immediately, i.e. traffic started flowing right away.
I tried a restart; it doesn't help.
I checked Neighbors: there were 8 nodes on this IP,
5 mine and 3 belonging to others.
Now I don't understand why there is almost no traffic on this particular IP, even after I halved the number of nodes. As you can see in the history, there used to be traffic, and a lot of it.
I suppose a neighbor with dead nodes has moved in and now the whole subnet is dead.
How can I investigate why?

For example, the neighbors' nodes may have a higher reputation.
It's fairly easy to check using the neighbor's IP:port: if AllHealthy: true, then yes, the reputation is good. Maybe customers just aren't uploading. It's hard for me to judge, since my nodes are full.
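
A minimal sketch of that check, assuming the node's public port answers a plain HTTP GET with a JSON body containing an AllHealthy field (the function names `all_healthy` and `fetch_health` are made up for illustration, and the URL form is an assumption):

```python
import json
import urllib.request


def all_healthy(body: str) -> bool:
    """Return True only if the JSON body contains "AllHealthy": true."""
    data = json.loads(body)
    return data.get("AllHealthy") is True


def fetch_health(host: str, port: int, timeout: float = 10.0) -> bool:
    """Fetch the status document from host:port and evaluate it.

    Assumption: the port replies to a plain HTTP GET with JSON.
    """
    url = f"http://{host}:{port}/"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return all_healthy(resp.read().decode("utf-8"))
```

Calling `fetch_health("<neighbor IP>", <port>)` would then return True for a neighbor that reports itself healthy. Note that this only reflects what the node reports, not how much traffic it gets.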

By the way, did you have a “less than requested” message in your logs?

2024-07-20T18:49:47+03:00 WARN piecestore:monitor Disk space is less than requested. Allocated space is {"bytes": 3871543457311}
Yes, I found it,
but
"Statuses": null,
"Help": "To access Storagenode services, please use DRPC protocol!",
"AllHealthy": true

At the same time, 1.65 TB is free on the 4 TB disk.
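
For anyone who wants to look for this warning in their own logs, here is a small sketch that extracts the byte count from such a line. It relies only on the line format quoted in this thread; the name `allocated_bytes` is hypothetical, and both straight and typographic quotes are accepted because forum software tends to convert one into the other.

```python
import re

# Matches the warning text and captures the number that follows "bytes".
PATTERN = re.compile(
    r'Disk space is less than requested\..*?["“]bytes["”]:\s*(\d+)'
)


def allocated_bytes(line: str):
    """Return the reported allocation in bytes, or None if the line does not match."""
    match = PATTERN.search(line)
    return int(match.group(1)) if match else None


line = ('2024-07-20T18:49:47+03:00 WARN piecestore:monitor Disk space is less '
        'than requested. Allocated space is {"bytes": 3871543457311}')
print(allocated_bytes(line) / 1e12)  # allocation in TB (base 10)
```

For the line above this gives roughly 3.87 TB, which is the allocation the node settled on, not the amount actually used.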

The node could have decided it was full if the information in the database had not been updated.
This check is performed only at startup. You can also see whether the node has reduced its allocation on the MND (it displays the information from the API, not from the config).

After the restart it showed me the same thing right away.
But the dashboard shows that the node is only half full.
The database seems to show the same.

The API shows the same as the dashboard.

I found that if I add up all the rows in the database for all satellites, even those that no longer exist, then the node would be full.
Screenshot 2024-07-22 135450

Yes, it could be the root cause; however, you said that the API is showing the correct value?

What would happen if you removed these obsolete rows manually or with storagenode forget-satellite --force <the satellites list here>?

I tried to remove the first row from the database, the one where the satellite ID is NULL, but the node started to show 0 space occupied, so I restored the file from backup.
forget-satellite was already run on this node in the past; I remember that elec wrote somewhere that these forgotten satellites weren't wiped from the DBs.

I think you need to delete only the rows with satellite_id filled. Those with NULLs seem to me like a header or a subtotal.
And if my guess is correct, you also need to reduce the amount in the row with the NULL satellite_id by the deleted amount.
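
The suggestion above could be sketched as follows. This runs against an in-memory stand-in, not a real node database: the table name `piece_space_used`, the `total` column, and the satellite IDs are all assumptions for illustration, so inspect the actual schema first and work on a backup copy with the node stopped.

```python
import sqlite3

# Hypothetical stand-in for the node's space-usage table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE piece_space_used (total INTEGER NOT NULL, satellite_id BLOB)"
)
conn.executemany(
    "INSERT INTO piece_space_used (total, satellite_id) VALUES (?, ?)",
    [
        (1000, None),           # NULL row: assumed to be the subtotal
        (600, b"active-sat"),
        (300, b"old-sat-1"),
        (100, b"old-sat-2"),
    ],
)


def drop_satellites(conn, obsolete_ids):
    """Delete rows of the given satellites and reduce the NULL row by the same amount."""
    if not obsolete_ids:
        return 0
    marks = ",".join("?" for _ in obsolete_ids)
    with conn:  # one transaction: all statements apply or none does
        (removed,) = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM piece_space_used "
            f"WHERE satellite_id IN ({marks})",
            obsolete_ids,
        ).fetchone()
        conn.execute(
            f"DELETE FROM piece_space_used WHERE satellite_id IN ({marks})",
            obsolete_ids,
        )
        conn.execute(
            "UPDATE piece_space_used SET total = total - ? "
            "WHERE satellite_id IS NULL",
            (removed,),
        )
    return removed


removed = drop_satellites(conn, [b"old-sat-1", b"old-sat-2"])
```

Doing the delete and the subtotal adjustment in one transaction avoids the state described earlier in the thread, where removing only one kind of row left the node reporting a wrong total.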

I still didn’t get an answer about these numbers in that database.

By the way,

it seems you calculated in binary units, not SI (base 10)?
i.e.
1.26924e11/1e9 = 126.924 (GB), not 118.21 GiB
4.26265e12/1e12 = 4.263 (TB)

And 2.06786e12/1e12 ~ 2.07 (TB), which matches what is displayed on the dashboard as Used
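
To make the difference between the two unit systems explicit, here is a short sketch using the byte counts from this thread (the helper names `to_si` and `to_binary` are just for illustration):

```python
def to_si(num_bytes: float, power: int) -> float:
    """Convert bytes to SI units: power=3 gives GB (1e9), power=4 gives TB (1e12)."""
    return num_bytes / 1000 ** power


def to_binary(num_bytes: float, power: int) -> float:
    """Convert bytes to binary units: power=3 gives GiB (2**30), power=4 gives TiB (2**40)."""
    return num_bytes / 1024 ** power


print(to_si(1.26924e11, 3))      # GB, the unit the dashboard uses
print(to_binary(1.26924e11, 3))  # GiB, what dividing by 1024 three times gives
print(to_si(4.26265e12, 4))      # TB
print(to_si(2.06786e12, 4))      # TB
```

The same byte count looks about 7% smaller in GiB than in GB, and about 9% smaller in TiB than in TB, which is enough to make the database and the dashboard appear to disagree.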

So perhaps just not enough traffic from the customers? Or do you see a complete zero ingress?

My nodes periodically get some ingress when the trash is cleaned up a little bit.

Yes, I calculated with 1024 bytes, and the bottom number is the total of all rows.
I deleted the rows for the old satellites, but nothing changed.

Yes, that's kind of expected if the reason is not that the node is reported as full. And it shouldn't be, according to your data from the API.
So, just low ingress from customers.

Ingress for yesterday was below 100 MB.

Mine are proud 470.00B (yes, bytes) ingress owners :sunglasses:

Interesting fact: all the other nodes I moved off this IP started to have normal ingress.

A doomed IP? :thinking: Interesting. I do not have any stat about such behavior. Especially if the restart didn’t help.

Can you provide me a NodeID to check?

1wzBAqEa947v5ATZJLivRGZmz5Wotk8ZbxgYQBpfQqcESTuy28


I provided it to our engineers to check what's going on with that node.