On restart - reverting back to 3GB storage

ItsHass · February 14, 2025, 10:10am

Hi all,

I have a new hard drive / new node - But it keeps reverting back to 3GB storage after restart… Do you know how I fix this?

Here are the DB list

-rwxrwxrwx 1 root root  36K Feb 14 10:01 bandwidth.db
-rwxrwxrwx 1 root root  32K Feb 14 10:02 bandwidth.db-shm
-rwxrwxrwx 1 root root    0 Feb 14 10:02 bandwidth.db-wal
-rwxrwxrwx 1 root root  24K Feb 14 10:01 garbage_collection_filewalker_progress.db
-rwxrwxrwx 1 root root  32K Feb 14 10:01 heldamount.db
-rwxrwxrwx 1 root root  16K Feb 14 10:01 info.db
-rwxrwxrwx 1 root root  24K Feb 14 10:01 notifications.db
drwxrwxrwx 4 root root 4.0K Feb 10 14:44 orders
-rwxrwxrwx 1 root root  32K Feb 14 10:01 orders.db
-rwxrwxrwx 1 root root  28K Feb 14 10:01 piece_expiration.db
drwxrwxrwx 5 root root 4.0K Feb 10 19:19 piece_expirations
-rwxrwxrwx 1 root root  24K Feb 14 10:01 piece_spaced_used.db
-rwxrwxrwx 1 root root  24K Feb 14 10:01 pieceinfo.db
-rwxrwxrwx 1 root root  24K Feb 14 10:01 pricing.db
-rwxrwxrwx 1 root root  32K Feb 14 10:02 pricing.db-shm
-rwxrwxrwx 1 root root    0 Feb 14 10:02 pricing.db-wal
-rwxrwxrwx 1 root root  24K Feb 14 10:01 reputation.db
-rwxrwxrwx 1 root root  32K Feb 14 10:02 reputation.db-shm
-rwxrwxrwx 1 root root    0 Feb 14 10:02 reputation.db-wal
-rwxrwxrwx 1 root root  32K Feb 14 09:31 satellites.db
-rwxrwxrwx 1 root root  32K Feb 14 09:31 satellites.db-shm
-rwxrwxrwx 1 root root  41K Feb 14 09:31 satellites.db-wal
-rwxrwxrwx 1 root root  24K Feb 14 10:01 secret.db
-rwxrwxrwx 1 root root  24K Feb 14 10:01 storage_usage.db
-rwxrwxrwx 1 root root  32K Feb 14 10:02 storage_usage.db-shm
-rwxrwxrwx 1 root root    0 Feb 14 10:02 storage_usage.db-wal
-rwxrwxrwx 1 root root  20K Feb 14 10:01 used_serial.db
-rwxrwxrwx 1 root root 400K Feb 14 10:01 used_space_per_prefix.db

blob storage showing:

directory:

I’m sure its something simple… I just can’t figure it out

nerdatwork · February 14, 2025, 11:56am

What is your docker run command and value for –storage.allocated-disk-space in config.yaml file?

LxdrJ · February 14, 2025, 12:07pm

Owner of folder is root which user runs docker?

ItsHass · February 14, 2025, 12:45pm

root runs it , and can see it opening the DB files

ItsHass · February 14, 2025, 12:46pm

-e STORAGE=“3.2TB” \

ItsHass · February 14, 2025, 12:47pm

seems just storage space used is the issue … - all other DBs seem to be fine…

but i can see it being opened / updated last mod on the same time i start the node

nerdatwork · February 14, 2025, 12:50pm

Also this

ItsHass · February 14, 2025, 1:14pm

# total allocated disk space in bytes
storage.allocated-disk-space: 2.00 TB

I believe this is ignored if the -e STORAGE is set ? maybe

Mark · February 14, 2025, 4:25pm

It sounds like you might be experiencing the used space file walker bug. High Trash usage - 27 TB - no more uploads - #29 by Alexey

I was just experimenting on my node and it looks like simply renaming or deleting the used_space_per_prefix.db and restarting the node does the trick.

After the file walker has had time to complete, the used space numbers on the dashboard should update to the correct numbers.

For some reason the used space numbers don’t save to the database on shutdown or restart but they do save to the database hourly. Also you might have to fix the used_space_per_prefix.db issued each time you restart if you want the file walker to run correctly.

ItsHass · February 14, 2025, 4:44pm

Aha …

So i did notice that i dont have a -wal , or -shm file additional in that first post of the db log.

i have removed that file - and restarted

now it looks like this

-rwxrwxrwx 1 root root   36864 Feb 14 16:39 bandwidth.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 bandwidth.db-shm
-rwxrwxrwx 1 root root   41232 Feb 14 16:39 bandwidth.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 11:43 garbage_collection_filewalker_progress.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 garbage_collection_filewalker_progress.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 garbage_collection_filewalker_progress.db-wal
-rwxrwxrwx 1 root root   32768 Feb 14 11:43 heldamount.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 heldamount.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 heldamount.db-wal
-rwxrwxrwx 1 root root   16384 Feb 14 11:43 info.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 info.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 info.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 11:43 notifications.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 notifications.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 notifications.db-wal
drwxrwxrwx 4 root root    4096 Feb 10 14:44 orders
-rwxrwxrwx 1 root root   32768 Feb 14 11:43 orders.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 orders.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 orders.db-wal
-rwxrwxrwx 1 root root   28672 Feb 14 11:43 piece_expiration.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 piece_expiration.db-shm
-rwxrwxrwx 1 root root   41232 Feb 14 16:39 piece_expiration.db-wal
drwxrwxrwx 5 root root    4096 Feb 10 19:19 piece_expirations
-rwxrwxrwx 1 root root   24576 Feb 14 16:39 piece_spaced_used.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 piece_spaced_used.db-shm
-rwxrwxrwx 1 root root   49472 Feb 14 16:39 piece_spaced_used.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 11:43 pieceinfo.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 pieceinfo.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 pieceinfo.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 11:43 pricing.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 pricing.db-shm
-rwxrwxrwx 1 root root   65952 Feb 14 16:39 pricing.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 15:29 reputation.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 reputation.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 reputation.db-wal
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 satellites.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 satellites.db-shm
-rwxrwxrwx 1 root root   41232 Feb 14 16:39 satellites.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 11:43 secret.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 secret.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 secret.db-wal
-rwxrwxrwx 1 root root   24576 Feb 14 11:43 storage_usage.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 storage_usage.db-shm
-rwxrwxrwx 1 root root   32992 Feb 14 16:39 storage_usage.db-wal
-rwxrwxrwx 1 root root   20480 Feb 14 11:43 used_serial.db
-rwxrwxrwx 1 root root   32768 Feb 14 16:39 used_serial.db-shm
-rwxrwxrwx 1 root root   41232 Feb 14 16:39 used_serial.db-wal
-rwxrwxrwx  1 root root  397312 Feb 14 16:39 used_space_per_prefix.db
-rwxrwxrwx  1 root root   32768 Feb 14 16:39 used_space_per_prefix.db-shm
-rwxrwxrwx  1 root root 4124152 Feb 14 16:39 used_space_per_prefix.db-wal

Mark · February 14, 2025, 4:46pm

So how do the numbers on your dashboard look now?

ItsHass · February 14, 2025, 4:47pm

We’re ALIVE !

Mark · February 14, 2025, 4:51pm

Excellent. Unfortunately the used space resets might keep happening until they fix the bug. Disabling the file walker should prevent it, unless you want to keep deleting that used_space_per_prefix.db file instead.

ItsHass · February 14, 2025, 5:30pm

So, on checking another node - I think i might have the issue with orders.db - used_serial.db and pieceinfo.db … is that part of the issue also ?

Mark · February 14, 2025, 5:33pm

I would say no. I am not aware of any issues with those other databases.

ItsHass · March 1, 2025, 11:42pm

@Alexey Do you know of any update regarding the ‘bug’ on the file walker with that db mentioned by @Mark

Mark · March 2, 2025, 12:02am

I was looking at the code recently to try to better understand the issue and I came across this:

github.com

storj/storj/blob/1d4e57d04d237b97d490cf0f4c00e770f2355ef9/storagenode/pieces/filewalker.go#L93


      
          	return fw.WalkAndComputeSpaceUsedBySatelliteWithWalkFunc(ctx, satelliteID, nil)
          }
          
          // WalkAndComputeSpaceUsedBySatelliteWithWalkFunc walks over all pieces for a given satellite, adds up and returns the total space used.
          // It also calls the walkFunc for each piece.
          // This is useful for testing purposes. Call this method with a walkFunc that collects information about each piece.
          func (fw *FileWalker) WalkAndComputeSpaceUsedBySatelliteWithWalkFunc(ctx context.Context, satelliteID storj.NodeID, walkFunc func(StoredPieceAccess) error) (satPiecesTotal int64, satPiecesContentSize int64, satPieceCount int64, err error) {
          	satelliteUsedSpacePerPrefix := make(map[string]PrefixUsedSpace)
          	var skipPrefixFunc blobstore.SkipPrefixFn
          	if fw.usedSpaceDB != nil {
          		// hardcoded 7 days, if the used space is not updated in the last 7 days, we will recalculate it.
          		// TODO: make this configurable
          		lastUpdated := time.Now().Add(-time.Hour * 168)
          		usedSpace, err := fw.usedSpaceDB.Get(ctx, satelliteID, &lastUpdated)
          		if err != nil && !errs.Is(err, sql.ErrNoRows) {
          			return 0, 0, 0, errFileWalker.Wrap(err)
          		}
          
          		for _, prefix := range usedSpace {
          			satelliteUsedSpacePerPrefix[prefix.Prefix] = prefix
          		}

It looks like the used space file walker is programmed to not do a file walk if it has already run in the last 7 days. So now I don’t know if it’s a bug or just unexpected behavior. Either way it can cause issues at times. I haven’t tried waiting and checking to see if it runs normally after waiting 7 days.

Alexey · March 2, 2025, 4:21am

You may track it

I believe @Mark is right - this is the reason. So you may also wait 7 days to check is it updated or not.