How can such overusage happen?

The multinode dashboard shows the actual free disk space, not the free space within the allocation.
The funny thing is that on the main page it shows the free space correctly.

What version of the storagenode binary are you running? Do you know when this started?

I have seen this on 2 different nodes. One was off by only a couple hundred MB, so I was not worried.
However, the huge overusage happened with a v1.49.5 node. I don’t know when it started, but I don’t think I had ever seen such an issue before 1.49.5.

And it is happening again. One node has 18 GB of overusage and I haven’t changed a thing.
This never happened before version 1.49.5 and seems to occur more frequently recently. But why?

Do you see which part causes the overusage: Trash or Used? Isn’t it just because of a lot of temporary files in the trash?

Used space is still below the assigned space.
It might be because of temporary trash files. But I don’t think the node should exceed its total assigned space, even if only temporarily. As an SNO I have no control over how much data gets deleted, so I cannot predict how much additional spare space I would have to reserve. And we still don’t know what happens if there is no space left at all in such a case.

Hmm. This is very strange. I just tried to reproduce it locally: I filled up all my storage nodes (using storj-up → a full local cluster) and deleted the files, and it seems to be fine:

{
  "used": 168140800,
  "available": 1000000000,
  "trash": 822749184,
  "overused": 0
}

Trash is full of garbage, but usage is low. I tested it and everything is still in the trash.

I am checking the code, trying to figure out how it is possible on your side…
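For reference, here is a minimal sketch of how those fields presumably fit together, assuming “available” here is the allocated capacity (an assumption on my part; the helper below is hypothetical, not the actual storagenode code):

package main

import "fmt"

// diskSpace mirrors the fields in the JSON output above. Treating Available
// as the allocated capacity is an assumption, not something confirmed here.
type diskSpace struct {
	Used      int64
	Available int64
	Trash     int64
}

// overused is a hypothetical helper: anything stored (pieces + trash) beyond
// the allocation counts as overuse, clamped at zero.
func overused(d diskSpace) int64 {
	over := d.Used + d.Trash - d.Available
	if over < 0 {
		return 0
	}
	return over
}

func main() {
	d := diskSpace{Used: 168140800, Available: 1000000000, Trash: 822749184}
	fmt.Println("overused:", overused(d)) // 0: used + trash still fits within the allocation
}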

Really, I have no idea why this is happening. I can only report it when it happens.
The bottom line is that for some reason the node keeps accepting data despite being full. Full means data + trash exceeds the assigned space. And this should not happen.
So maybe the satellite does not pick up in time that the node is full. Or the node does not send this information in time (or even sends wrong information). Or maybe trash deletion / garbage collection has trouble keeping up. I don’t know.
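Just to put made-up numbers on that first guess (both constants below are assumptions for illustration, not actual Storj settings): if the satellite only learns the node is full at the next capacity report, ingress during that gap still lands:

package main

import "fmt"

func main() {
	// Assumed interval between capacity reports and assumed ingress rate;
	// neither is an actual Storj constant, they only illustrate the idea.
	const reportIntervalMin = 60.0 // minutes between capacity updates
	const ingressMBPerMin = 5.0    // sustained ingress while still listed as having free space
	fmt.Printf("possible overshoot ≈ %.0f MB\n", reportIntervalMin*ingressMBPerMin) // ≈ 300 MB
}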

And another one: DISK SPACE LEFT: -131.43MB

Can confirm here too. Seen on a node still on v1.53.1 (it hasn’t auto-updated yet) and another that is on v1.54.2. The v1.54.2 node has dropped back to no overuse, but the v1.53.1 node still has some. It isn’t enough for me to worry about, but I am curious how this can happen, as I haven’t made any changes to the node size.
[Screenshot from 2022-05-16 12-09-48]

This is why it’s recommended to keep a good amount of extra capacity, just in case this happens… in the future I’m sure we will be able to go closer to the theoretical max.
Nodes are continually deleting and getting new ingress… 500 MB is ½ a GB,
so you are about 1/3600th off in relation to your capacity.
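Spelling that ratio out (the capacity here is inferred from the 1/3600 figure, not read from the screenshot):

package main

import "fmt"

func main() {
	// 500 MB of overuse being 1/3600 of the capacity implies roughly a 1.8 TB
	// allocation; this is an inference from the ratio above, not a stated figure.
	const overuseMB = 500.0
	capacityTB := overuseMB * 3600 / 1_000_000
	fmt.Printf("implied capacity ≈ %.1f TB\n", capacityTB) // ≈ 1.8 TB
}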

Keeping stuff within such narrow limits is quite a challenge in most things.
And these numbers are also estimates, I think…

Every time the filewalker runs, the node counts all files and their sizes, and then calculates from there until the next time the filewalker runs, usually on node restarts.
So over time it is bound to deviate.
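As a rough sketch of that principle (this is not the actual storagenode filewalker, just a simplified walk that adds up reported file sizes under a hypothetical blobs path):

package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
)

// usedSpace walks a directory tree and sums the reported file sizes, which is
// the basic idea behind the filewalker; the real code is more involved.
func usedSpace(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		total += info.Size() // logical size; actual on-disk usage may be larger
		return nil
	})
	return total, err
}

func main() {
	total, err := usedSpace("/path/to/storage/blobs") // hypothetical path
	if err != nil {
		fmt.Println("walk error:", err)
		return
	}
	fmt.Println("bytes counted:", total)
}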

OK, this makes sense to me. The overusage showed up after the nodes had been restarted. Interesting, though, that Storj allows the reporting to drift so much.

I don’t remember where I saw a specific recommended number, but what I went with was making about 93% of each disk available, with each disk dedicated to Storj. So the “unshared” 7%, which at most I’m only using for backups of some scripts and .bash_history, should be more than enough for any Storj overallocation.

Very interesting point. Today I needed to restart a node, and when I check now I see: Overused 11.93 GB :thinking:


Again, this isn’t much… let’s say a storage node of 2-5 TB:
a 12 GB deviation is less than 1%, and in the 5 TB case less than 0.25%.

Sure, computers should be able to count flawlessly, so it does seem like something is being missed…
Like, say, it might not account for sector sizes on the storage media…
if 2 KB is written, a 4 KB sector is still used; these kinds of things might offset it…
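A quick sketch of that sector effect (4096 bytes is just an assumed sector size):

package main

import "fmt"

// onDiskSize rounds a logical file size up to whole sectors, which is why a
// 2 KB write can still occupy a full 4 KB on disk.
func onDiskSize(logicalBytes, sectorBytes int64) int64 {
	if logicalBytes == 0 {
		return 0
	}
	sectors := (logicalBytes + sectorBytes - 1) / sectorBytes
	return sectors * sectorBytes
}

func main() {
	fmt.Println(onDiskSize(2048, 4096)) // 4096: half the sector is effectively wasted
}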

Though that seems too low to be the cause, as I would expect that to be many percent.
So what else can it be… metadata, perhaps. Metadata doesn’t take much space, but for each file that is written there will be metadata, and storage nodes do write a lot of small files.

Or it could be some kind of rounding error… like, say, one of the variables that is part of the calculation might not use the exact numbers.

Let’s say the filewalker adds up all your storage node’s files… that is a pretty enormous number.
So 2 TB would be 2,000,000,000,000 bytes, and adding a 2 MB file makes it +2,000,000.
Of course, the majority of files written initially are much smaller…
more like, say, 4,000 bytes.

So we use about 4 bytes just to record the size of a 4,000-byte file;
that in itself is 0.1% more information required to be stored…

But yeah, it does seem like there is most likely something that offsets it…
because at least in theory it should be able to do the math 100% accurately.

However, reality is always imprecise.

Also, the fact that it can go over might not mean it cannot stop if it actually runs out of space…
it may just go over because it can…

Don’t presume too easily. That’s a small node currently holding only around 200 GB.
And nevertheless, it does not matter whether 12 GB is a lot or not. If the disk runs out of space due to this overusage, it does not matter by how much the allocated space was exceeded.

You are right, at that level it certainly is a lot.
That doesn’t mean it can’t stop if it runs out of space…

But you are right to be concerned; running totally out of space could be really dangerous for things like the databases.

I suppose it could also be something like uploads that were already started when the node reaches its saturation point and starts to refuse uploads.

The uploads already begun would most likely be finished, I would assume…
and storage nodes can get a ton of uploads concurrently,
in the hundreds or even thousands depending on how busy it is…

I recently set my max concurrent to 100 and I still see uploads rejected on rare occasions.
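A back-of-the-envelope sketch of that worst case (the concurrency limit is the one mentioned above; the piece size is an assumed upper bound, not a protocol constant):

package main

import "fmt"

func main() {
	// If the free-space check only happens when an upload starts, every upload
	// already in flight can still be written out after the node turns "full".
	const maxConcurrent = 100       // the configured concurrency limit mentioned above
	const maxPieceBytes = 2_300_000 // assumed upper bound on a single piece
	overshoot := maxConcurrent * maxPieceBytes
	fmt.Printf("worst-case overshoot ≈ %d MB\n", overshoot/1_000_000) // ≈ 230 MB
}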

Yes, that is my major concern: that a node might crash when it reaches the space limit. Basically it means you cannot fully utilize an HDD and have to keep several GB free at all times just for these occasions. How much is safe? Also, it is not clear whether that overusage gets paid for. If it remains undetected until a filewalker run, I guess it does not get paid until it gets detected?

I know that some SNOs have run into the max capacity of the HDD without any major problems,
and then ended up on the forum complaining about their nodes not growing or not getting uploads.

So I don’t think you should be too worried about it; I certainly haven’t seen any signs of that, with the limited knowledge I have on the subject.

Also, I’m unsure whether the filewalker capacity calculations have anything to do with the actual payouts.
I think they are mostly there to keep track of how much capacity is used on a live basis.

The payouts themselves I would suspect to be calculated from the satellite databases of piece storage, which, from what I have been told, is then compared to a local calculation on the storage node.
So I guess the filewalker capacity number could play a part in verifying that the numbers the satellite comes up with are roughly right, but I would suspect the local part acts more like a checksum, just to easily see if something is crazily off.

In short, I think you will be paid regardless of what the filewalker says…

But to truly understand what is happening in these cases would most likely require extensive code review and a full understanding of all the involved storage node mechanics.

I know @BrightSilence has in the past dug deep into verifying that the earnings match up with the stored data, and in some cases found minor deviations which were later fixed.

I think that was the case when one of the satellites was months behind on its calculation of earnings, so people were essentially being paid earnings that had been earned months earlier.

If memory serves, it was due to the satellite being overworked, and Storj Labs migrated or upgraded their satellite infrastructure to fix the issue.

So it’s certainly not without precedent that payments can be affected by various issues.
But there are few people who can really give us the exact details of that stuff,
and it would require a great deal of work, I suspect.

@SGC is right: you get paid based on the pieces the satellite knows you have. So you will definitely get paid for overusage. Though by the time this starts to make any noticeable difference at all, you have bigger problems on your hands.

The time you were referring to was definitely not minor.


Oh how I miss the times of surge payout :slight_smile:

But yeah, they were $108 short on my node at the time.
It was corrected the next month though.

Since then things have stabilized a bit more. And while small differences still happen, they are insignificant at this point: just a few cents here and there, basically negligible. If you want to see these numbers for your own node, the earnings calculator automatically compares satellite payout stats to local node stats for you.


Lol, yeah, but I didn’t really consider it a true deviation, since the satellite was just far, far behind on the accounting. So at that point in time it was a deviation, but in the long term it was all accounted for…

It certainly did get people to sit up and take note…
I found it a bit funny how long it took before anyone noticed.

Yeah, and I had just joined the v3 network, so I got like $2 in surge in total, lol.
I just thought we would get surge last payout… but sadly not, and then crypto prices tanked afterwards.

So that wasn’t really a blessing, lol, even though people did see it as such at the time… I should have sold, but it also sucks to have no tokens when the prices eventually go back up :smiley:

Hopefully it will end up being a boon, eventually…
