Error getting current space used calculation

Hi @Egon,

My node has been up for 24 hours now and still does not recognize the space that was used before the update from 0.31.10 to 0.31.12. I am seeing 2.4 GB used now vs. nearly 2 TB actually used. Any suggestions? The node has been working fine otherwise: getting uploads and downloads, no failed audits. Upon node restart I get:

2020-01-31T19:24:15.177Z WARN piecestore:monitor Disk space is less than requested. Allocating space {"bytes": 1737681216768}
I did not have a problem with this before 0.31.12 was released.
Soon followed by:

2020-01-31T19:24:51.000Z        ERROR   piecestore:cache        error getting current space used calculation:   {"error": "lstat config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/3v/5yttgjqkjo3k5j37gs4mdbd4cnd5bjpwipctal6oyctv5vceia.sj1: bad message"}
2020-01-31T19:25:10.161Z        ERROR   blobscache      newSatPiecesTotal < 0   {"newSatPiecesTotal": -313856}
2020-01-31T19:25:10.161Z        ERROR   blobscache      newSatPiecesContentSize < 0     {"newSatPiecesContentSize": -313344}
2020-01-31T19:25:10.162Z        INFO    piecestore      deleted {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Piece ID": "GZ3YO4CFRB7NS7XBDOO34SYBH54R4B63J4I7TC2LIJUY22QXY7BQ"}
2020-01-31T19:25:18.581Z        ERROR   blobscache      newSatPiecesTotal < 0   {"newSatPiecesTotal": -158976}
2020-01-31T19:25:18.581Z        ERROR   blobscache      newSatPiecesContentSize < 0     {"newSatPiecesContentSize": -158464}

I’m going to be away for a few days and would hate to leave the node in a bad state. I still have at least 1.2 TB of free space, so I am not worried about it filling up in the meantime.

The first error message seems worrying:

2020-01-31T19:24:51.000Z ERROR piecestore:cache error getting current space used calculation: {"error": "lstat config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/3v/5yttgjqkjo3k5j37gs4mdbd4cnd5bjpwipctal6oyctv5vceia.sj1: bad message"}

This seems to indicate that there is something going wrong with the filesystem, disk or mounting.

In shell - "traversal failed: u: Bad message" when deleting an extremely large directory in Linux - Unix & Linux Stack Exchange, someone got the “bad message” error when there was filesystem corruption.

What is the exact setup for the node, i.e. is it using Docker with bind mounting? What is the disk?

Try checking the filesystem for corruption (see shell - "traversal failed: u: Bad message" when deleting an extremely large directory in Linux - Unix & Linux Stack Exchange). You may need to stop the storagenode if it’s running.
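For example, on a Docker-based node the whole check could look roughly like this (the container name, mount point and device below are assumptions, so adjust them to your setup, and make sure the filesystem is unmounted before running fsck):

# stop the node first (container name is an assumption)
docker stop -t 300 storagenode
# unmount the storage drive (device and mount point are placeholders)
sudo umount /mnt/storagenode
# check and repair the ext4 filesystem
sudo fsck -f /dev/sda1
# remount and start the node again
sudo mount /mnt/storagenode
docker start storagenode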

We’ll try to find other potential explanations for the error.


Hi Egon,

Sorry for taking so long to get back to you. Your hunch about the filesystem was correct: after running the consistency check, some errors were found and fixed, and the node is working properly again. What I find odd is that the node was working fine when I updated from v0.31.9 to v0.31.10 earlier on Jan 31, with no issues; the problem only came up when I updated from v0.31.10 to v0.31.12 later that day. There were no restarts and the system was stable during that time, and I always stop the node with the stop -t 300 option. I doubt it will be useful, but here is the output from fsck:

fsck output
user@rock64:~$ sudo fsck -cfk /dev/sda1
fsck from util-linux 2.29.2
e2fsck 1.43.4 (31-Jan-2017)
Checking for bad blocks (read-only test): done
storj1: Updating bad block inode.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'sw4ya375p2yzobnq2o2524supvudenzeynjn5icguodw45ndcq.sj1' in /v3alpha/storage/blobs/abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa/6v (182845651) has deleted/unused inode 185522922.  Clear<y>? yes
Entry 'vmoddrhex4cuusm462omlgrcdvad264qt6ohh4a2mvcfb42whq.sj1' in /v3alpha/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/v3 (183107920) has deleted/unused inode 185522916.  Clear<y>? yes
Entry 'lyeiv7q6ldn2o64gmgu4bemssp7ogjlljolomcy6y5c7ob6m3a.sj1' in /v3alpha/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/cc (183108055) has deleted/unused inode 185522915.  Clear<y>? yes
Entry 'nmzohfbmil234soiscuf4x76boycbkoq7oaikoyajappqwmtxq.sj1' in /v3alpha/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/w5 (183238890) has deleted/unused inode 185522917.  Clear<y>? yes
Entry 'himu2lbxqftawh5lqxplddrlmand6zrmc5j4vl57jmy5p2q6pq.sj1' in /v3alpha/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/st (183369738) has deleted/unused inode 185522914.  Clear<y>? yes
Entry '5yttgjqkjo3k5j37gs4mdbd4cnd5bjpwipctal6oyctv5vceia.sj1' in /v3alpha/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/3v (183369993) has deleted/unused inode 185522921.  Clear<y>? yes
Entry 'ifoql56domt26lbesby6loadwafcq7muifmmrqwpxr3ew7rsrq.sj1' in /v3alpha/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/4s (183369815) has deleted/unused inode 185522924.  Clear<y>? yes
Entry 'q7lf4u452i353qv7eiatznc2qbmumzfazkfg3a7qphf4ycuokq.sj1' in /v3alpha/storage/blobs/qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa/as (183763999) has deleted/unused inode 185522913.  Clear<y>? yes
Entry 'bgmh7pvpurt4yqrmk33wxuvcxb5jyze3q6h3q5gfgof3uqw7ja.sj1' in /v3alpha/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/3g (183894022) has deleted/unused inode 185522919.  Clear<y>? yes
Entry 'ug3c4kge3cfvsgsrth4pgvs4hnx7pemjs7oxz2r27vgeervxja.sj1' in /v3alpha/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/no (184044993) has deleted/unused inode 185522920.  Clear<y>? yes
Entry 'lkfhqh6lgr76zuz6zg7id2mgstdpuw5z2ebrb37gv4c6s6zvsq.sj1' in /v3alpha/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/e6 (184045497) has deleted/unused inode 185522918.  Clear<y>? yes
Entry 'onctlznq376s3l6bwh5gfzo2r4lgie6nf5j4reew5mwcvtlota.sj1' in /v3alpha/storage/blobs/qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa/fy (184162034) has deleted/unused inode 185522923.  Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 185522929
Connect to /lost+found<y>? yes
Inode 185522929 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522930
Connect to /lost+found<y>? yes
Inode 185522930 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522931
Connect to /lost+found<y>? yes
Inode 185522931 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522932
Connect to /lost+found<y>? yes
Inode 185522932 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522933
Connect to /lost+found<y>? yes
Inode 185522933 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522934
Connect to /lost+found<y>? yes
Inode 185522934 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522935
Connect to /lost+found<y>? yes
Inode 185522935 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522936
Connect to /lost+found<y>? yes
Inode 185522936 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522937
Connect to /lost+found<y>? yes
Inode 185522937 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522938
Connect to /lost+found<y>? yes
Inode 185522938 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 185522942
Connect to /lost+found<y>? yes
Inode 185522942 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186030414
Connect to /lost+found<y>? yes
Inode 186030414 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186128954
Connect to /lost+found<y>? yes
Inode 186128954 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186132228
Connect to /lost+found<y>? yes
Inode 186132228 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186132229
Connect to /lost+found<y>? yes
Inode 186132229 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186132230
Connect to /lost+found<y>? yes
Inode 186132230 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186132231
Connect to /lost+found<y>? yes
Inode 186132231 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186132232
Connect to /lost+found<y>? yes
Inode 186132232 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 186132234
Connect to /lost+found<y>? yes
Inode 186132234 ref count is 2, should be 1.  Fix<y>? yes
Pass 5: Checking group summary information
Block bitmap differences:  -(558731904--558732010) +(558761472--558761983) +(558765056--558765109) +(558765984--558766000) +(558767616--558767999) +(558768640--558769151) -(558770688--558771253) -(558772224--558772789) +(558781952--558782005) -(558794304--558794357) -(558809088--558809141) -(558820352--558820863) -742137734 -742138234 -(742139482--742139485)
Fix<y>? yes
Free blocks count wrong for group #17051 (11181, counted=11288).
Fix<y>? yes
Free blocks count wrong for group #17052 (10200, counted=12613).
Fix<y>? yes
Free blocks count wrong for group #17053 (11838, counted=12458).
Fix<y>? yes
Free blocks count wrong for group #22648 (6816, counted=6822).
Fix<y>? yes
Free blocks count wrong (463954884, counted=463958030).
Fix<y>? yes
Inode bitmap differences:  -(185522913--185522928)
Fix<y>? yes
Free inodes count wrong for group #22646 (33, counted=49).
Fix<y>? yes
Free inodes count wrong (242534050, counted=242534066).
Fix<y>? yes

storj1: ***** FILE SYSTEM WAS MODIFIED *****
storj1: 1661262/244195328 files (7.5% non-contiguous), 512796146/976754176 blocks

My setup is a rock64 4GB board running Debian Stretch, with a 4 TB external USB drive formatted ext4. It is auto-mounted at boot time with an fstab entry; the only unusual thing there is that I mount the drive with the noatime option. The share location is attached to the container with the bind mount type.
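For reference, the relevant parts look roughly like this (the UUID, mount point and paths below are placeholders rather than my exact values, and the docker run line is trimmed down to just the mount):

# /etc/fstab entry for the 4 TB ext4 data drive, mounted with noatime
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /mnt/storj  ext4  defaults,noatime  0  2

# the storage location is then bind-mounted into the container
docker run -d --name storagenode \
  --mount type=bind,source=/mnt/storj/storagenode,destination=/app/config \
  ... storjlabs/storagenode:beta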

Since the node seems able to recover the used space calculation once the filesystem inconsistencies are fixed, does this mean the node was stuck trying to read the size of a bad file and wouldn’t continue reading the rest of the pieces? I haven’t had any failed audits (yet). I feel like this is something the node should be able to recover from on its own by rebuilding the used space db automatically, although I will admit that I have only a loose understanding of what goes on behind the scenes.

Thanks again for your help! If you feel this needs further investigation, I would be happy to help with what I can. Also, thanks @Alexey for splitting this off into its own thread. I should have done so in the first place.

(edit: cleaned up fsck output)

At some point we introduced a bug that caused the storage node not to iterate over all the pieces on disk, which meant the issue went unnoticed. When it failed to read a piece, the blob calculation exited early, giving a false number for how much space was actually used. We’ll have at least two fixes to ensure that we get as good a result as possible.

However, I have no good guesses as to why the disk ended up in that state. Also, when there’s disk corruption, it isn’t something we can easily fix. The easiest thing we can add is to log “I’m unable to read the disk, please verify the disk is not corrupt.” It’s not quite clear what the action should be: if the corruption goes unnoticed, the node could end up failing audits or the state of the disk could worsen… then again, stopping the node altogether wouldn’t be great either. Trying to automatically repair the disk could make things worse as well.

I agree that the filesystem corruption likely has something to do with my setup. I will check my power supplies; perhaps there is a problem there. The surface test didn’t find any bad blocks, so at this point I am not suspecting the disk itself. It took me 10 hours to do the fs check, so I definitely don’t want a repeat of that once we hit production. Granted, most of that time was the surface test, so if I had only checked the filesystem, it would have had less of an impact.
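For anyone else running into this, the slow part was the read-only surface scan (-c), so a filesystem-only check should be much quicker, assuming the same ext4 drive on /dev/sda1 as above:

# what I ran: forced check (-f) plus read-only surface test (-c), keeping the existing bad block list (-k)
sudo fsck -cfk /dev/sda1
# filesystem-only check, skipping the slow surface scan
sudo fsck -f /dev/sda1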

The timing of the issue is something I can’t square away completely. In a case like this, is it the storage_usage db file that has become corrupt? Was it just dumb luck that the corruption hit a non-critical db?

Health of the overall system should definitely be the responsibility of the SNO. Trying to have the node software intervene would for sure cause more problems. I think the node should be prevented from running in this state, similar to the other database pre-flight checks. I don’t think the node should be happy to just start counting space over again when it knows there is a bad storage_usage db sitting there.

In my case it didn’t matter since I had plenty of free space, but if I hadn’t noticed the problem the node could have filled the drive to 100%. This becomes possible because when the node detects less space than allocated, it allocates 100% of the available free space. Maybe this auto-allocation should only ask for 90% of the available free space by default.
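To put rough numbers on it, using the free space figure from the WARN line earlier in the thread (just back-of-the-envelope arithmetic):

# the node detected ~1.74 TB free and allocated all of it; a 90% cap would allocate about 1.56 TB
echo "1737681216768 * 0.9" | bc
# 1563913095091.2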

My apologies if I have confused the way any of these processes/files relate to each other. Thanks again for your time.


In a case like this, is it the storage_usage db file that has become corrupt? Was it just dumb luck that the corruption hit a non-critical db?

In this case the pieces themselves were corrupt. That ended up having a side-effect on other calculations, which showed up as erroneous values in storage_usage.db. storage_usage.db itself is mostly a cache used to understand how much each satellite is storing on the disk.

It did, however, hit a critical database - the pieces. Due to the decentralized nature of downloads and the varying frequency with which different pieces are downloaded, the corrupt piece never ended up being read… or the storagenode responded with an error saying it wasn’t able to read it.

Maybe this auto-allocation should only ask for 90% of the free space available as a default.

We’ll add a configuration flag that lets you say “never accept uploads when there is less than N GB available on the disk”.
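To illustrate the idea, it could eventually look something like this in config.yaml (the flag name and value below are only placeholders to sketch the behaviour, not an existing option in v0.31.x):

# hypothetical example only; the final flag name may differ
storage2.monitor.minimum-disk-space: 100 GB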
