Disk usage discrepancy?

However, I believe some filewalkers (the trash one, for example, and perhaps the TTL collector) are still working with a normal priority.

That usually means that the collected stat was not written to the databases, so you need to figure out why. I believe you have errors related to the databases (malformed, “database is locked”, “file is not a database”, etc.).
If you also have errors related to the used-space-filewalker, then the databases will not be updated either. As a result, the stat on the dashboard is “reset” to whatever is stored in the databases.
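If you want to check the database files themselves, here is a minimal sketch (it assumes sqlite3 is installed, the node is stopped, and /path/to/storage is a placeholder for your storage location):

for db in /path/to/storage/*.db; do
  echo -n "$db: "
  sqlite3 "$db" "PRAGMA integrity_check;"   # a healthy database prints "ok"
done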


And may I please ask how I can reset the database?

Good morning,

Today a very strange thing happened with the maximum capacity of one of my nodes. After the lazyfilewalker finished (10 days later) and the databases were updated, I found that on the dashboard the total disk space is higher than what is configured in the config.yml (15.00 TB), and the strangest thing is that it is even greater than the maximum capacity of the hard drive (~16 TB).

(dashboard screenshots)

I understand that this is a bug, and if I restart the node and run the lazyfilewalker again it should be resolved. But why has the total available disk changed if I have it set to 15 TB?

I was moved to this topic, and now what? :smiley:
There are so many different problems here. Which one fits my case exactly?

You do not need to. Just enable the scan on startup if you disabled it (it’s enabled by default) and restart the node. Then make sure that you do not have errors related to the databases (search for error and database) or the filewalkers (search for error and used-space).
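If you did change that setting, a minimal sketch of turning it back on (I believe the option is storage2.piece-scan-on-startup; storagenode is assumed to be your container name):

# in config.yaml:
#   storage2.piece-scan-on-startup: true
# or as an extra flag after the image name in your docker run command:
#   --storage2.piece-scan-on-startup=true
# then restart the node:
docker restart -t 300 storagenode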


You likely checked the multinode dashboard, am I right?
If so, see

Exactly the one linked in my reply.
I would copy the content.

You may check for errors related to the databases and/or filewalkers in your logs:

docker logs storagenode 2>&1 | grep -i error | grep -E "database|used-space" | tail

The suggestion depends on what error you have. Also, I believe the filewalker works noticeably slower on xfs, or at least it seems so: Topics tagged xfs.


Hello Alexey, I don’t use the multinode dashboard; I always check the single-node dashboard. It has been a very strange thing. I’m going to restart the node and let it update the databases again. I will let you know when it has been updated.

Then it’s weird. We reverted the change for that kind of behavior for the SND. But the related code still reports, via the API, what the node currently decides to use as the allocation (it chooses the minimum of “allocated”, “used + free (in the allocation)” and “used + free (on the disk)”, so that it does not use more space than it should).
However, it takes the “used” value from the databases (because there is no way to ask the OS for the used space within the allocation unless you gave the whole volume to the node, and the node is not aware of that fact in any way).
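As an illustration only (the numbers are made up and this is not the actual node code), the selection boils down to taking the minimum of three values:

allocated=15000000000000          # from config.yml (15 TB)
used_free_alloc=16500000000000    # "used" (from the databases) + free within the allocation
used_free_disk=16000000000000     # "used" (from the databases) + free on the disk

min=$allocated
(( used_free_alloc < min )) && min=$used_free_alloc
(( used_free_disk < min )) && min=$used_free_disk
echo "$min"                       # what the node would report as allocated

Since the “used” part comes from the databases, errors there feed directly into the last two inputs.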

I see “database is locked” and filewalker errors. I have also seen that my uptime is at 92-94%, although the node is actually active. There are some random restarts after a few hours/days.

2024-07-10T06:29:35+02:00       ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "context canceled"}
2024-07-10T06:29:39+02:00       ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"Process": "storagenode", "satelliteID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "error": "context canceled"}
2024-07-10T06:29:41+02:00       ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"Process": "storagenode", "satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "error": "context canceled"}
2024-07-10T06:30:36+02:00       ERROR   filewalker      failed to get progress from database    {"Process": "storagenode"}
2024-07-10T06:30:38+02:00       ERROR   filewalker      failed to get progress from database    {"Process": "storagenode"}
2024-07-10T06:46:57+02:00       ERROR   piecestore      upload failed   {"Process": "storagenode", "Piece ID": "VWJMW7YIY3IJY76WIJL7OH35NSXLDZREN2JOIBLR7GI62MUEEXQA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "PUT", "Remote Address": "79.127.201.209:56894", "Size": 197376, "error": "pieceexpirationdb: database is locked", "errorVerbose": "pieceexpirationdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*pieceExpirationDB).SetExpiration:115\n\tstorj.io/storj/storagenode/pieces.(*Store).SetExpiration:587\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload.func6:483\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:541\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:294\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:167\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:109\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:157\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}
2024-07-10T08:43:26+02:00       ERROR   piecestore      upload failed   {"Process": "storagenode", "Piece ID": "IEU4ZUSOP7MLO5PY4U2IQSWHCZTJDPJ3N6GNRJMV4PNTTG43SPPA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "PUT", "Remote Address": "109.61.92.75:42512", "Size": 197376, "error": "pieceexpirationdb: database is locked", "errorVerbose": "pieceexpirationdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*pieceExpirationDB).SetExpiration:115\n\tstorj.io/storj/storagenode/pieces.(*Store).SetExpiration:587\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload.func6:483\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:541\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:294\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:167\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:109\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:157\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}
2024-07-10T09:32:12+02:00       ERROR   orders  failed to add bandwidth usage   {"Process": "storagenode", "satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "action": "GET_REPAIR", "amount": 375448832, "error": "bandwidthdb: database is locked", "errorVerbose": "bandwidthdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:76\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func2:254\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-07-10T09:32:12+02:00       ERROR   orders  failed to add bandwidth usage   {"Process": "storagenode", "satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "action": "GET", "amount": 146006272, "error": "bandwidthdb: database is locked", "errorVerbose": "bandwidthdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:76\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func2:254\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-07-10T09:32:12+02:00       ERROR   orders  failed to add bandwidth usage   {"Process": "storagenode", "satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "action": "GET", "amount": 23521792, "error": "bandwidthdb: database is locked", "errorVerbose": "bandwidthdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:76\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func2:254\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}

So, you need to try to optimize your disk subsystem. I do not know much about Unraid, but if it’s possible to add a cache in front of the slow disk, it should improve things.
Or you can at least move the databases to another, less loaded disk. This should help with the database issues and likely also reduce the IO load on the data disk, so the filewalker may then have a chance to finish its scans.
If moving the databases does not help with the filewalker, then your only remaining option is to disable the lazy mode so it can work with a normal IO priority.
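For the database move, a rough sketch (the paths and the container name storagenode are placeholders; I believe the option that controls the database location is storage2.database-dir, and for a docker node the new directory must also be mounted into the container):

docker stop -t 300 storagenode
mkdir -p /mnt/ssd/storagenode-dbs
cp -p /mnt/storagenode/storage/*.db /mnt/ssd/storagenode-dbs/
# point the node at the new location, e.g. in config.yaml:
#   storage2.database-dir: /mnt/ssd/storagenode-dbs
# (for docker, mount this directory into the container and use the in-container path instead)
docker start storagenode
# after confirming the node works from the new location, remove the old *.db files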

You need to search for FATAL and/or Unrecoverable errors in your logs.
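For example, following the same pattern as the command above (storagenode is assumed to be your container name):

docker logs storagenode 2>&1 | grep -E "FATAL|Unrecoverable" | tail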


hello

From my TrueNAS Scale, version Dragonfish-24.04.1.1:

there is a mismatch in the used space. Is this a well-known issue?

Storj App Version:
v1.68.2


For the app installation I used the new ixVolume function; see the screenshot too.


Thank you, I will give this a try :slight_smile:
I think the system is just too slow for two nodes with the new Storj workload.
It only has 8 GB of RAM and a 6th-gen i3.

How exactly would I disable the lazy mode?

The reasons are the same as explained in this post:
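To actually switch it off, a minimal sketch (assuming a docker node named storagenode; I believe the relevant option is pieces.enable-lazy-filewalker):

# in config.yaml:
#   pieces.enable-lazy-filewalker: false
# or as an extra flag after the image name in your docker run command:
#   --pieces.enable-lazy-filewalker=false
# then restart the node:
docker restart -t 300 storagenode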


A post was merged into an existing topic: Avg disk space used dropped with 60-70%

I’ve had to kind of reset my node (I was not able to get to the dashboard), so I ended up deleting my .db files, and when the issue was corrected it recreated the database files.

After being able to reconnect, the used space was reset to 0 and has been slowly increasing (it now sits at 87.54 GB).

Will the used space get back to where it was before the issue?

Yes.
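It should catch up once the used-space scan finishes; you can watch for that in the logs, for example (assuming a docker node named storagenode; the exact wording of the start/finish messages may vary between versions):

docker logs storagenode 2>&1 | grep -i "used-space" | grep -iE "started|finished|completed" | tail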

Also, please consider two things:

  1. You can search the forum for similar issues, so you can get a faster solution to your issue.
  2. Please don’t create another account just because your new account got limited. It’s meant to be limited to prevent bots from creating multiple accounts and spamming the forum.

Ref:

I suspect you created the accounts support22, support24, support25 and support26.
