API disk space used negative

With one of the latest updates it seems like the used-space calculation in the storagenode API has changed. Assume the following scenario: your node has 5 TB of data stored on disk, and in your config you set storage.allocated-disk-space to 1 TB.
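For reference, the allocation in that scenario would be set in the node's config.yaml roughly like this (the value shown is illustrative):

```yaml
# total disk space the node is allowed to use for pieces
storage.allocated-disk-space: 1.00 TB
```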

Previously the API correctly reported the used space of 5 TB; now it shows a negative amount, and I'm not sure how it is calculated.

Anybody else seen this?

I have a node with 7.0 TB used.

((curl http://127.0.0.1:14002/api/sno).Content | ConvertFrom-Json).satellites.id | %{"$_"; ((curl http://127.0.0.1:14002/api/sno/satellite/$_).Content | ConvertFrom-Json) | %{$_.storageSummary,$_.bandwidthSummary}}

118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW
867684838982.8856
188792576
1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE
1220013470465702.2
534805791744
121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6
92304750146855.62
48094575616
12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S
93660540289892.98
50200412416
12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs
227493662388082.22
80415960320
12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB
748961356923475.4
116561979392

Changed the 7.0 TB allocation to 1.0 TB and restarted the service:

((curl http://127.0.0.1:14002/api/sno).Content | ConvertFrom-Json).satellites.id | %{"$_"; ((curl http://127.0.0.1:14002/api/sno/satellite/$_).Content | ConvertFrom-Json) | %{$_.storageSummary,$_.bandwidthSummary}}

118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW
867684838982.8856
188792576
1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE
1220013470465702.2
534831304704
121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6
92304750146855.62
48099212032
12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S
93660540289892.98
50201259776
12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs
227493662388082.22
80418309120
12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB
748961356923475.4
116564298752

The diskSpace stat:

With a 7.0 TB allocation:

((curl http://127.0.0.1:14002/api/sno).Content | ConvertFrom-Json).diskSpace

         used     available     trash
         ----     ---------     -----
6998714426496 7000000000000 791169408

With a 1.0 TB allocation:

((curl http://127.0.0.1:14002/api/sno).Content | ConvertFrom-Json).diskSpace

        used     available     trash
        ----     ---------     -----
270179595904 1000000000000 791169408

So .diskSpace.used decreases as the allocation decreases.
And here is how it looks on the dashboard:

Before the change (allocation 7.0 TB)

After the change (7.0 TB → 1.0TB)

Ah, I'm not alone; looks like a bug then.

I have created a bug report here: https://github.com/storj/storj/issues/3942

It's overall a bad approach. To calculate used space we sum the space used by pieces for each satellite, so it can't drop from 7 TB to 1 TB just because you changed the allocated space; that's why you would get negative free space. But since we had negative-space problems before, we added a check to the code: if allocated - used - trash < 0, we recalculate free space from the directory's actual free space. To avoid these wrong calculations we are going to add a partial graceful exit for the extra pieces after the allocated space is decreased (it is on our roadmap). If you have any ideas or suggestions on how to handle this right now, we would gladly review them and try to implement them.
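A minimal sketch of the check described above, using the numbers from the 7.0 TB → 1.0 TB example earlier in the thread (this is my reading of the described workaround, not the actual storagenode code; the function and the 300 GB directory-free figure are made up for illustration):

```python
def free_space(allocated, used_by_pieces, trash, dir_free):
    """Sketch of the pre-1.13 workaround as described: the node normally
    reports allocated - used - trash, but a negative result falls back
    to the filesystem's actual free space."""
    free = allocated - used_by_pieces - trash
    if free < 0:
        # workaround branch: recalculate from the directory's free
        # space instead of reporting a negative number, which is why
        # derived "used" figures shift when the allocation shrinks
        free = dir_free
    return free

used, trash = 6_998_714_426_496, 791_169_408  # numbers from the example above

# 7.0 TB allocation: the allocation still covers the stored pieces
print(free_space(7_000_000_000_000, used, trash, dir_free=300_000_000_000))

# 1.0 TB allocation: the negative branch is taken and the (hypothetical)
# 300 GB of filesystem free space is reported instead
print(free_space(1_000_000_000_000, used, trash, dir_free=300_000_000_000))
```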

On my node the free space was negative (by 30 GB) running v1.11.1. After the update to v1.12.3, the node shows the correct used space, but now shows free space equal to the amount of physical space available on the disk. This might just be a reporting problem, since the node is not starting any new uploads to fill this "free" space.

Any chance this recalculation could be the problem?

I’d expect the following to happen once I decrease my allocated space below the used space:

  • My used space remains where it was, unchanged
  • My free space is now negative
  • NEVER ever should there be any graceful exit because of this change (!)
  • The node stops receiving new pieces and only serves existing pieces (if requested), until its used space is below the allocated amount again
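The expected behavior above could be sketched like this (illustrative Python, not the actual node code; names are invented):

```python
def node_status(allocated, used, trash):
    """Sketch of the proposed behavior: used space stays as measured,
    free space may go negative, and the node simply stops accepting
    new pieces while over-allocated."""
    free = allocated - used - trash   # may be negative, and that's fine
    accept_uploads = free > 0         # stop ingress; egress continues
    return {"used": used, "free": free, "accept_uploads": accept_uploads}

# 7 TB of pieces stored, allocation lowered to 1 TB:
s = node_status(1_000_000_000_000, 6_998_714_426_496, 791_169_408)
# used is unchanged, free is about -6 TB, and no new uploads are accepted
print(s)
```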

Actually, there is a discrepancy between the CLI and Web Dashboards:

Storage Node Dashboard ( Node Version: v1.12.3 )
======================
ID     
Status ONLINE
Uptime 1h18m37s

                   Available          Used        Egress      Ingress
     Bandwidth           N/A     600.66 GB     577.09 GB     23.56 GB (since Sep 1)
          Disk     339.41 GB       3.39 TB

[screenshot]

user@rock64:~$ df -H | grep /mnt/storj1
/dev/sdc1       4.0T  3.4T  340G  91% /mnt/storj1

(sorry to jump in on this thread, but it seems related to me)

Weird, since the calculations on the CLI and web dashboards are the same; I'm going to check why this keeps happening.


Thanks, we will discuss it with the team ASAP and decide on the next steps to improve this part!


Partial graceful exit would be a great feature, but I would prefer that it not trigger automatically. Sometimes it can be useful to briefly reduce the load on your node by lowering the allocation so it just stops accepting new uploads. But that doesn't mean you necessarily want to get rid of the data, so I suggest having a separate command to actually free up the space.


This is really confusing for the end user though, since depending on the situation, this number now means something completely different.

I recently had a fairly extreme version of this problem. I run one node on a Drobo device, which uses thin provisioning. It has a 16 TB volume, but far less physical disk space. I had only 1.9 TB assigned to Storj, which I now wanted to lower to 1.8 TB. (Drobos get slow when they fill up beyond 75%, so I wanted to lower it slightly so it would eventually drop below that threshold. I'm not in a hurry for this though, so I don't want to trigger a partial exit even if it were an option.)

This results in the CLI dashboard now showing the available space of the thin-provisioned volume, which is meaningless.
[screenshot]

The web dashboard is even more surprising!
[screenshot]
Apparently the node is currently using negative 13TB! :slight_smile:

For what it’s worth, it does look like the node doesn’t actually tell the satellites that it has free space, since it’s not getting new data. But it’s really confusing to track how much data is actually available or how far above the assigned space the node is.

As an end user I would by far prefer that the node ALWAYS display available space relative to the assigned space, even if that leads to negative numbers, since at least that makes it clear that the current usage is above the assigned space.
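One guess at where the negative 13 TB figure on the web dashboard might come from, assuming the fallback described earlier derives used space from the volume's free space, which a 16 TB thin-provisioned volume reports as far larger than the physical disk (all numbers here are assumptions for illustration):

```python
# Assumed: 1.8 TB allocated, and the thin 16 TB volume reporting
# roughly 14.8 TB "free" to the operating system.
allocated = 1_800_000_000_000
volume_free = 14_800_000_000_000

# If used space were derived as allocated - volume_free, the inflated
# free space of a thin-provisioned volume drives it far negative:
derived_used = allocated - volume_free
print(derived_used / 1e12)  # in TB
```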


I am seeing a less extreme version of the exact same problem as explained above. @Nikolai_Siedov did you manage to find time to look into this?

Looks like this commit in 1.13.2 has removed the temporary workaround that caused this issue.


Looks like that will be part of >= 1.14.0 though; it's not included in 1.13.3, only in 1.14.0-rc.

2 posts were split to a new topic: There is a discrepancy between the CLI and Web Dashboards in used space