Node stopped receiving ingress, but it's not full

Vadim · July 22, 2024, 7:33am

It would be good to have statistics on why a node isn't receiving traffic. Here's an example.

A node isn't receiving traffic.
There were 5 like that.
I moved 4 of them to another IP and they started working right away, i.e. traffic started flowing immediately.
I tried a restart; it doesn't help.
I checked Neighbors: there were 8 nodes on this IP,
5 mine, 3 someone else's.
So now I don't understand why there's almost no traffic on this particular IP, even after I halved the number of nodes. As you can see in the history, there used to be traffic, and a lot of it.
I suspect a neighbor with dead nodes moved in and now the whole subnet is dead.
How can I investigate why?

Alexey · July 22, 2024, 7:40am

For example, the neighbors' nodes may have a higher reputation.
It's fairly easy to check using the neighbor's IP:port: if AllHealthy: true, then yes, the reputation is good. Maybe customers just aren't uploading. It's hard for me to judge, my nodes are full.
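A minimal sketch of that IP:port check, assuming the node's public port answers a plain HTTP GET with a small JSON document containing an `AllHealthy` field (the function names `all_healthy` and `check_node` are made up for illustration and are not part of any Storj tool):

```python
import json
import urllib.request


def all_healthy(body: str) -> bool:
    """Return True only if the JSON body carries AllHealthy == true."""
    data = json.loads(body)
    return data.get("AllHealthy") is True


def check_node(host: str, port: int, timeout: float = 5.0) -> bool:
    """Fetch the node's public port over plain HTTP (an assumption) and inspect the reply."""
    url = f"http://{host}:{port}/"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return all_healthy(resp.read().decode("utf-8"))
```

Calling `check_node("<neighbor IP>", <port>)` would then report whether that neighbor looks healthy; it says nothing about how much traffic customers are actually sending.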

By the way, did you happen to see a "less than requested" message in your logs?

Vadim · July 22, 2024, 7:46am

2024-07-20T18:49:47+03:00 WARN piecestore:monitor Disk space is less than requested. Allocated space is {"bytes": 3871543457311}
Yes, found it,
but
"Statuses": null,
"Help": "To access Storagenode services, please use DRPC protocol!",
"AllHealthy": true

Meanwhile, the disk has 1.65 TB free out of 4 TB.
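To find every occurrence of that warning and see what allocation the node settled on, a log scan along these lines may help. It is only a sketch: it assumes the log lines look like the one quoted above (timestamp first, JSON with a `bytes` field last, straight quotes as in a real log file), and `find_allocation_warnings` is a hypothetical helper name.

```python
import re

# Matches the warning text and captures the byte count from the trailing JSON.
PATTERN = re.compile(r'Disk space is less than requested.*?"bytes":\s*(\d+)')


def find_allocation_warnings(lines):
    """Yield (timestamp, allocated_bytes) for each matching log line."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # The timestamp is assumed to be the first whitespace-separated field.
            yield line.split()[0], int(match.group(1))
```

Passing an open log file to `find_allocation_warnings` yields one pair per warning, so a shrinking allocation over time is easy to spot.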

Alexey · July 22, 2024, 7:52am

The node could have decided that it's full if the information in the DB was not updated.
That check is performed only on startup. You can also see whether the node has reduced its allocation on the MND (it displays the information from the API, not from the config).

Vadim · July 22, 2024, 8:17am

After the restart it showed me the same thing right away.
But on the dashboard the node shows that it's only half full.
The database seems to show the same.

Vadim · July 22, 2024, 8:19am

The API shows the same as the dashboard.

Vadim · July 22, 2024, 10:56am

I found that if I add up all rows in the database for all satellites, even the ones that no longer exist, then the node would be full.
Screenshot 2024-07-22 135450

Alexey · July 23, 2024, 3:12am

Yes, that could be the root cause; however, you said that the API is showing the correct value?

What would happen if you removed these obsolete rows manually or with storagenode forget-satellite --force <the satellites list here>?

Vadim · July 23, 2024, 3:42am

I tried to remove the first row from the database, the one where the satellite id is NULL, but the node started to show 0 space occupied, so I restored the file from backup.
forget-satellite was already run on this node in the past. I remember that elec wrote somewhere that these forgotten satellites weren't wiped from the DBs.

Alexey · July 23, 2024, 3:54am

I think you need to delete only the rows with satellite_id filled. Those with NULLs look to me like a header or a subtotal.
And if my guess is correct, you also need to reduce the amount in the row with the NULL satellite_id by the deleted amount.
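The subtotal guess can be tested without modifying anything, by comparing the NULL row with the sum of the per-satellite rows. The sketch below assumes a table named `piece_space_used` with `total` and `satellite_id` columns; those names are an assumption about the node's database, the real schema may contain other special rows, and it is safest to run this against a copy of the file.

```python
import sqlite3


def subtotal_check(conn: sqlite3.Connection):
    """Return (sum of per-satellite rows, value of the NULL-satellite row)."""
    # Table and column names here are assumed, not confirmed by the thread.
    per_satellite = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM piece_space_used "
        "WHERE satellite_id IS NOT NULL"
    ).fetchone()[0]
    row = conn.execute(
        "SELECT total FROM piece_space_used WHERE satellite_id IS NULL"
    ).fetchone()
    null_row = row[0] if row else None
    return per_satellite, null_row
```

If the two numbers match, the NULL row behaves like a subtotal, and deleting per-satellite rows without adjusting it would leave the totals inconsistent.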

I still haven't received an answer about these numbers in that database.

Alexey · July 23, 2024, 4:02am

By the way,

it seems you calculated in binary units, not SI (base 10)?
i.e.
1.26924e11/1e9 = 126.924 (GB), not 118.21 GiB
4.26265e12/1e12 = 4.263 (TB)

And 2.06786e12 ~ 2.07 (TB), which matches what is displayed on the dashboard as Used
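The difference between the two conventions is easy to reproduce. This is a small illustrative sketch (the helper names are made up) using the byte counts quoted above:

```python
def to_gb(n_bytes: float) -> float:
    """SI gigabytes: 1 GB = 10**9 bytes."""
    return n_bytes / 1e9


def to_tb(n_bytes: float) -> float:
    """SI terabytes: 1 TB = 10**12 bytes."""
    return n_bytes / 1e12


def to_gib(n_bytes: float) -> float:
    """Binary gibibytes: 1 GiB = 2**30 bytes."""
    return n_bytes / 2**30


for value in (1.26924e11, 4.26265e12, 2.06786e12):
    print(f"{value:.5e} bytes = {to_gb(value):.3f} GB = {to_gib(value):.2f} GiB")
```

Dividing by 1024-based units gives a smaller number for the same byte count, which is why a sum computed that way will not line up with a dashboard that uses SI units.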

So perhaps there's just not enough traffic from customers? Or do you see completely zero ingress?

My nodes periodically get some ingress when the trash is cleaned up a little

Vadim · July 23, 2024, 4:23am

Yes, I calculated with 1024-based units, and the bottom number is the total of all rows.
I deleted the rows from old satellites, but nothing changed.

Alexey · July 23, 2024, 4:24am

Yes, that's more or less expected if the cause is not the node being reported as full. And according to your data from the API, it shouldn't be.
So, just low ingress from customers.

Vadim · July 23, 2024, 4:25am

Ingress for yesterday was below 100 MB.

Alexey · July 23, 2024, 4:26am

Mine are the proud owners of 470.00 B (yes, bytes) of ingress

Vadim · July 23, 2024, 4:27am

An interesting fact: all the other nodes I moved off this IP started to get normal ingress

Alexey · July 23, 2024, 4:28am

A doomed IP? Interesting. I don't have any stats on such behavior, especially since the restart didn't help.

Alexey · July 23, 2024, 7:50am

Can you provide me a NodeID to check?

Vadim · July 23, 2024, 7:52am

1wzBAqEa947v5ATZJLivRGZmz5Wotk8ZbxgYQBpfQqcESTuy28

Alexey · July 23, 2024, 8:42am

I passed it to our engineers to check what's going on with that node.