Node (in docker) memory increase 4GB+ leading to container restart (v1.111.4)

Hi all,
I have been running a node in Docker on my Synology for quite a while, but lately I keep getting restart notifications. Looking into it, I can see that the memory usage keeps increasing until the container stops and restarts, and then it all begins again.
Memory usage as I write this is a whopping 4.11 GB.
My Node ID: 18638RiJMGcFbb3p1vmZvb28z9cfhPisLkqiB5Qk7SyFDj6HrE
I also see that I have 8.19 TB used, 1.67 TB free and 8.14 TB in trash.

Where do I start?

Looking into the log, I see the following:

2024-09-02T14:24:28Z INFO piecestore download started {Process: storagenode, Piece ID: 3WVWVXPCUIASRLSVAKXYJ6IMPRDNT764IZ7ZQBSOM3OXWWYJUE7A, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET_REPAIR, Offset: 0, Size: 10240, Remote Address: 172.29.0.1:43536}
2024-09-02T14:24:28Z INFO piecestore download started {Process: storagenode, Piece ID: QVCNOPQU627KAPZEKYOEET4WE2KAHZTD5ALGHIAUZOKGS24IJB2Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 17664, Remote Address: 172.29.0.1:43540}
2024-09-02T14:24:28Z INFO piecestore download started {Process: storagenode, Piece ID: DNNS5I472UOIONS66Y67SV6UPJLYFG3HSEAAMZ5JBBGSNX56T7KQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 9472, Remote Address: 172.29.0.1:43548}
2024-09-02T14:24:28Z INFO piecestore download started {Process: storagenode, Piece ID: QVCNOPQU627KAPZEKYOEET4WE2KAHZTD5ALGHIAUZOKGS24IJB2Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 17664, Remote Address: 172.29.0.1:43552}
2024-09-02T14:24:29Z INFO piecestore download started {Process: storagenode, Piece ID: 2IAAR3BELEG54HESKRY3OIUX56RZN5ZODI2FAQMLSDAVVB7BZWHQ, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET_REPAIR, Offset: 0, Size: 286976, Remote Address: 172.29.0.1:43556}
2024-09-02T14:24:29Z INFO piecestore download started {Process: storagenode, Piece ID: QVCNOPQU627KAPZEKYOEET4WE2KAHZTD5ALGHIAUZOKGS24IJB2Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 17664, Remote Address: 172.29.0.1:43558}
2024-09-02T14:24:29Z INFO piecestore download started {Process: storagenode, Piece ID: 5W3PM63TNI3WLI6KZAW3QOEOW6PGL2JSBCVS5WLRBUZ2DNOYXWYA, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 5120, Remote Address: 172.29.0.1:43562}
2024-09-02T14:24:29Z INFO piecestore download started {Process: storagenode, Piece ID: 6GPQFKDA44HJPC6TVRNE3XC53UCFGNJ3PJQYYHHQD3WI72FA35IQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 181248, Remote Address: 172.29.0.1:41472}
2024-09-02T14:24:29Z INFO piecestore upload started {Process: storagenode, Piece ID: RVEL6XFZGTNMYU5VZ6G566AUYY7PN447BEI5FRL3V7OCA4R7D57Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT_REPAIR, Remote Address: 172.29.0.1:43570, Available Space: 1668435229218}
2024-09-02T14:24:29Z INFO piecestore download started {Process: storagenode, Piece ID: QVCNOPQU627KAPZEKYOEET4WE2KAHZTD5ALGHIAUZOKGS24IJB2Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 17664, Remote Address: 172.29.0.1:43572}
2024-09-02T14:24:29Z INFO piecestore download started {Process: storagenode, Piece ID: QVCNOPQU627KAPZEKYOEET4WE2KAHZTD5ALGHIAUZOKGS24IJB2Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 17664, Remote Address: 172.29.0.1:43576}
2024-09-02T14:24:30Z INFO piecestore upload started {Process: storagenode, Piece ID: NE7HG4GZZ7GKNA64PXSNLZHNTJNX2PH3MHQSE4LTTCKETKTBNDCQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:39082, Available Space: 1668435229218}
2024-09-02T14:24:30Z INFO piecestore download started {Process: storagenode, Piece ID: VC22EKKCBPRP6US356PKSF3GYVAGZL3WBP5SYEXU5R22BARXR35A, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET_REPAIR, Offset: 0, Size: 15872, Remote Address: 172.29.0.1:43582}
2024-09-02T14:24:30Z INFO piecestore upload started {Process: storagenode, Piece ID: U5TRYY6Q2HAST2FC3NEM3Z5OGEVMJ6B2VAUO7XORWMQ4U4CPQUCQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT_REPAIR, Remote Address: 172.29.0.1:43574, Available Space: 1668435229218}
2024-09-02T14:24:30Z INFO piecestore download started {Process: storagenode, Piece ID: SAUNFKK7YNNR7C7NSEYO5CNUPEHEAKPDYOEZ4JM2ZVFYZKXPJJFQ, Satellite ID:

Right now I am running the checks from the “How to fix a database: disk image is malformed” article against the db files.
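For reference, the check I'm using is roughly the one from that support article, adjusted to my paths, so take the exact form with a grain of salt:

find . -maxdepth 1 -iname "*.db" -print0 | xargs -0 -n1 -I{} sh -c 'echo -n "{}: "; sqlite3 "{}" "PRAGMA integrity_check;"'

Run from the folder holding the *.db files, it prints each database name followed by ok when the integrity check passes.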

./pieceinfo.db: ok
./garbage_collection_filewalker_progress.db: ok
./piece_expiration.db: ok
./info.db: ok
./storage_usage.db: ok
find: ./piece_expiration.db-shm: No such file or directory
./satellites.db: ok
./reputation.db: ok
./pricing.db: ok
./secret.db: ok
./used_space_per_prefix.db: ok
./piece_spaced_used.db: ok
find: ./piece_expiration.db-wal: No such file or directory
./heldamount.db: ok
./bandwidth.db: ok
./notifications.db: ok
./used_serial.db: ok
./orders.db: ok

Your filesystem/disk array cannot keep up with the IO; the backlog accumulates in RAM until everything explodes.

Search the forum for advice on how to optimize the filesystem: disable sync, disable atime updates, move the databases elsewhere, and if it’s btrfs you may want to consider disabling parity/checksumming on the storj data (btrfs allows this at per-file granularity, but Synology exposes it only at the subvolume level, what they call a Share), etc.
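For the “move databases elsewhere” part specifically, the node has a config option for it. A minimal sketch, assuming the option name I remember is still current (verify it against your config.yaml) and a placeholder SSD path:

  • storage2.database-dir: /volumeX/storj-db

Stop the node, copy the existing *.db files to that location, set the option, then start the node again. In a docker setup the path has to be visible inside the container, i.e. bind-mounted.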

Will look into all this, BUT I have had the node running for years (reinstalled once, but with the same settings).
The volume the data sits on is ext4, a 4-disk RAID.
Why would it suddenly start doing this? I think it has to do with the recent updates somehow.

UPDATE1:
This is my fstab:

none /proc proc defaults 0 0
/dev/root / ext4 defaults 1 1
/dev/mapper/cachedev_0 /volume4 btrfs auto_reclaim_space,ssd,synoacl,noatime,nodev 0 0
/dev/mapper/cachedev_1 /volume3 btrfs auto_reclaim_space,ssd,synoacl,noatime,nodev 0 0
/dev/mapper/cachedev_2 /volume2 ext4 usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0,synoacl,noatime,nodev 0 0
/dev/mapper/cachedev_3 /volume1 btrfs auto_reclaim_space,ssd,synoacl,noatime,nodev 0 0

As you can see, volume2 already has the mentioned attributes (noatime etc.)… right?

UPDATE2:
OK, so I installed iotop and ran it. Only when starting the storagenode do multiple instances (between 4 and 10) of the PID sit at 60-99% IO, but the disk read/write rates are rarely over 1500 K/s and only sometimes hit 3.0 M/s.

Where do i go from here?

Synology has Resource Monitor, right? You can keep it running and see what precedes the failure. There may be a perfect storm of events that goes above what the drives can handle (Synology indexer, array scrub, SMART tests, any number of filewalkers, load from customers, etc.).

I don’t remember whether Synology keeps historical stats so you can go back and see what the IOPS/disk queue length/etc. were when the memory grew; if it doesn’t, start logging with Resource Monitor or another utility and wait for the issue to reproduce.

Those are the wrong stats to look at. MB/sec is irrelevant. You want IO operations per second, on each disk.

For example, writing just 15 kB pieces will already exceed what a disk can handle at 3 MB/s. Most storj pieces are smaller than that. And then there are metadata updates, parity, checksums, directory updates, etc.
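Rough arithmetic to make the point (ignoring metadata, parity and checksum overhead): 3 MB/s ÷ ~15 kB per piece ≈ 200 operations per second, which is already about all a single HDD can sustain. If you can SSH in and iostat happens to be available (it may not be on DSM by default), you can watch the per-disk operation counts directly:

iostat -x 5

and look at the r/s and w/s columns for each drive rather than the throughput.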

Your best bet is to enable metadata caching.

1 Like

Just installed Active Insight (yeah, I know about all the other issues) and Read IOPS for the 4 disks shows around 230-250 (Write IOPS 55-78).
I do notice that drive 6 seems higher across the board than the other 3 (read+write latency, IOPS and utilization).
Do you have any more info on the metadata cache?

Here you go. HDDs can sustain 200-270 IOPS on a good day (you can look up the actual number for your drives in their datasheet). They are already working at max capacity.

1 Like

I am very lucky that I can create a read/write cache, so I did this for volume2, but I cannot pin “all Btrfs metadata to an SSD cache” because the volume is ext4 and not Btrfs.
Will see tomorrow morning how things are.
Thanks for now @arrogantrabbit

1 Like

I have one node with a slow CPU and fairly slow storage (the storage is mapped over an NFS share on the network, which is explicitly not a recommended configuration, and I get scolded for it).

Anyway, sometimes the node gets “busy”, often when a lot of data is coming in (like the big tests lately), but also… maybe… when certain background chores are running at the same time, like used-space filewalkers and garbage collection.

During these times, the RAM used by the node starts to increase a lot.

I have a 900MB limit set in docker, and the node tends to hit that and get restarted.
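(For anyone wanting the same safety net: it is just the standard docker memory flag. A sketch of the relevant part of the run command, with everything else omitted:

docker run -d --memory=900m ... storjlabs/storagenode:latest

When the container hits the limit it gets killed and, with a restart policy set, relaunched.)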

Other nodes with more RAM can get over 4 GB of RAM usage at times, although at other times it’s more like 300 MB. And each node holds only around 12 TB of data at most.

Morning all,
OK, the node has been running “ok” all night; I have put a 4 GB cap on the memory.
Active Insight shows quite clearly that everything for drive 6 has been high over the last 10-11 hours:
Utilization 96-99% (other 3 drives 35-80%)
Drive read latency 70ms-138ms (other 3 drives 16ms-35ms)
Drive write latency 235ms-532ms (other 3 drives 18ms-50ms)

My questions:

  1. Does it look like my drive 6 is eventually going to fail, even though everything shows Healthy in Syno Storage Manager?
  2. What also bothers me is the waste: I have 8.18TB of data AND 8.21TB of trash. How can I clean this up?
  3. My node shows this type of log constantly:
|2024-09-03T07:14:41Z|INFO|piecestore|upload started|{Process: storagenode, Piece ID: YJMH55WYTN24HKLWHIMH6ZWMBEGHRDQR5IUB4H3CQI5O7C4B2TNA, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:43128, Available Space: 1605927285268}|
|2024-09-03T07:14:41Z|INFO|piecestore|uploaded|{Process: storagenode, Piece ID: YJMH55WYTN24HKLWHIMH6ZWMBEGHRDQR5IUB4H3CQI5O7C4B2TNA, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:43128, Size: 10752}|
|2024-09-03T07:14:41Z|INFO|piecestore|download started|{Process: storagenode, Piece ID: TQS42XUMQXTHCM4O6BTJ4VIG536RGO7BXJPN5YXHR5GJONMUT2VA, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET, Offset: 158720, Size: 32000, Remote Address: 172.29.0.1:45024}|
|2024-09-03T07:14:41Z|INFO|piecestore|upload started|{Process: storagenode, Piece ID: 7563CTYTJAZAQ7GUF5DSQWX52KTPPHOYC7SMWBOOHKH4KSEZ2QYA, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: PUT, Remote Address: 172.29.0.1:45952, Available Space: 1605927274004}|
|2024-09-03T07:14:41Z|INFO|piecestore|uploaded|{Process: storagenode, Piece ID: 7563CTYTJAZAQ7GUF5DSQWX52KTPPHOYC7SMWBOOHKH4KSEZ2QYA, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: PUT, Remote Address: 172.29.0.1:45952, Size: 7680}|
|2024-09-03T07:14:41Z|INFO|piecestore|upload started|{Process: storagenode, Piece ID: 5GSRQMQHUCHJO5K5OIDQXV6AJJJGDPCRIAZ6SHDDXD4HPPJTVQNQ, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: PUT, Remote Address: 172.29.0.1:49808, Available Space: 1605927265812}|
|2024-09-03T07:14:41Z|INFO|piecestore|uploaded|{Process: storagenode, Piece ID: 5GSRQMQHUCHJO5K5OIDQXV6AJJJGDPCRIAZ6SHDDXD4HPPJTVQNQ, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: PUT, Remote Address: 172.29.0.1:49808, Size: 13056}|
|2024-09-03T07:14:41Z|INFO|piecestore|downloaded|{Process: storagenode, Piece ID: UKLE257BFDKAV4TWZ6FUROSKB6POZF72TTHWFEQPGH3QHRBRZ44A, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET_REPAIR, Offset: 0, Size: 362752, Remote Address: 172.29.0.1:49902}|
|2024-09-03T07:14:41Z|INFO|piecestore|downloaded|{Process: storagenode, Piece ID: TQS42XUMQXTHCM4O6BTJ4VIG536RGO7BXJPN5YXHR5GJONMUT2VA, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET, Offset: 158720, Size: 32000, Remote Address: 172.29.0.1:45024}|

OK, so I changed some settings in config.yaml:

  • log.output: “/app/config/node.log”
  • storage2.piece-scan-on-startup: true
  • pieces.enable-lazy-filewalker: true
And now I see the following errors in the log.
|2024-09-03T08:23:20Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: G6LN5AHVYPRWOL5ZMMVEPCGEIH6AN4NXGHIYLVFX2R52CY7JYP6Q, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET, Offset: 0, Size: 768, Remote Address: 172.29.0.1:41888, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:23:20Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: YKO4YVO2M6KDBNNILGJR5P36B6XXJWDKQ4PXY64XDUNL43IRGPRQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 7168, Remote Address: 172.29.0.1:41860, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:23:20Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: ZEUGM4TTEU6XZKA7JMIPLB2RGUH3WX7C3EUYRGFUIHGOECNBOLWA, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET, Offset: 0, Size: 768, Remote Address: 172.29.0.1:41884, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:23:20Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: UTMWOPQTGIWMKJMWWVQJMGYVV3J7UHKMWTX5RAURMMSITDSOQ3AQ, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET, Offset: 0, Size: 256, Remote Address: 172.29.0.1:41850, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:23:38Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: OS3NLUOJH7IQKSG6YDU73I5Q7OTA75OLEYLYSDCWZFKEVWMC3YSQ, Satellite ID: 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6, Action: GET, Offset: 0, Size: 1024, Remote Address: 172.29.0.1:41968, error: trust: rpc: tcp connector failed: rpc: context canceled, errorVerbose: trust: rpc: tcp connector failed: rpc: context canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190}|
|2024-09-03T08:26:02Z|ERROR|piecestore|upload internal error|{Process: storagenode, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:26:02Z|ERROR|piecestore|upload failed|{Process: storagenode, Piece ID: 77GFO77ED37CSU656TKVQHDQOZTU4ETGDVULQ6T2YXPBQMAOQQ2A, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:44018, Size: 131072, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:28:32Z|ERROR|piecestore|upload internal error|{Process: storagenode, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:28:32Z|ERROR|piecestore|upload failed|{Process: storagenode, Piece ID: ZHLXQYYPMBDIOUKW3DVHGW4DJFBPAT3GAT3R74QRKAFHG75Z6KTA, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:45672, Size: 65536, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:29:35Z|ERROR|piecestore|upload internal error|{Process: storagenode, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:29:35Z|ERROR|piecestore|upload failed|{Process: storagenode, Piece ID: PXZ6RS6JMXF6JVRX4QTWLXTMFCEEXNXAD4DZXV3XBS3RQHT7MX3A, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:46348, Size: 196608, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:34:56Z|ERROR|piecestore|upload internal error|{Process: storagenode, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:35:00Z|ERROR|piecestore|upload failed|{Process: storagenode, Piece ID: KG6T6H7343NT5LI2BSTPSRY4EMOSSO77Z6ZUWC6FSTIU2EAMMZNQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:49880, Size: 655360, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:40:50Z|ERROR|piecestore|upload internal error|{Process: storagenode, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:40:50Z|ERROR|piecestore|upload failed|{Process: storagenode, Piece ID: FDVFP56Z5IVATKIU56EEKO24PYQ32TVLL23MOLUZ64RRU7FUEAEQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, Remote Address: 172.29.0.1:54010, Size: 262144, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:230}|
|2024-09-03T08:41:47Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Lazy File Walker: true, error: lazyfilewalker: signal: killed, errorVerbose: lazyfilewalker: signal: killed\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:41:47Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Lazy File Walker: false, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:41:47Z|ERROR|piecestore:cache|encountered error while computing space used by satellite|{Process: storagenode, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78, SatelliteID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S}|
|2024-09-03T08:41:47Z|ERROR|piecestore:cache|error getting current used space for trash: |{Process: storagenode, error: filestore error: failed to walk trash namespace 7b2de9d72c2e935f1918c058caaf8ed00f0581639008707317ff1bd000000000: context canceled, errorVerbose: filestore error: failed to walk trash namespace 7b2de9d72c2e935f1918c058caaf8ed00f0581639008707317ff1bd000000000: context canceled\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).SpaceUsedForTrash:273\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:100\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:41:47Z|ERROR|pieces:trash|emptying trash failed|{Process: storagenode, error: pieces error: lazyfilewalker: signal: killed, errorVerbose: pieces error: lazyfilewalker: signal: killed\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkCleanupTrash:195\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:436\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89}|
|2024-09-03T08:42:05Z|ERROR|contact:service|ping satellite failed |{Process: storagenode, Satellite ID: 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE, attempts: 1, error: ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup saltlake.tardigrade.io: operation was canceled, errorVerbose: ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup saltlake.tardigrade.io: operation was canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190}|
|2024-09-03T08:42:05Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: HW2P4EB54ZD3FY6ND6DMRKV77VQDBO4EKB4K7D2WQOL6XFSV6G6Q, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET_REPAIR, Offset: 0, Size: 2319360, Remote Address: 172.29.0.1:55018, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:42:05Z|ERROR|contact:service|ping satellite failed |{Process: storagenode, Satellite ID: 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6, attempts: 1, error: ping satellite: rpc: tcp connector failed: rpc: context canceled, errorVerbose: ping satellite: rpc: tcp connector failed: rpc: context canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190}|
|2024-09-03T08:42:05Z|ERROR|nodestats:cache|Get pricing-model/join date failed|{Process: storagenode, error: context canceled}|
|2024-09-03T08:42:05Z|ERROR|contact:service|ping satellite failed |{Process: storagenode, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, attempts: 1, error: ping satellite: context canceled, errorVerbose: ping satellite: context canceled\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:202\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:156\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: NE5RAYZ3UDQCBO5FP5YLXGYHA5LBNLOHI3E5BTYNBRMNMXCPQDFA, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET_REPAIR, Offset: 0, Size: 17664, Remote Address: 172.29.0.1:55006, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE, Lazy File Walker: true, error: lazyfilewalker: signal: killed, errorVerbose: lazyfilewalker: signal: killed\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE, Lazy File Walker: false, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|piecestore:cache|encountered error while computing space used by satellite|{Process: storagenode, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78, SatelliteID: 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE}|
|2024-09-03T08:42:05Z|ERROR|lazyfilewalker.used-space-filewalker|failed to start subprocess|{Process: storagenode, satelliteID: 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6, error: context canceled}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6, Lazy File Walker: true, error: lazyfilewalker: context canceled, errorVerbose: lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6, Lazy File Walker: false, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|piecestore:cache|encountered error while computing space used by satellite|{Process: storagenode, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78, SatelliteID: 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6}|
|2024-09-03T08:42:05Z|ERROR|lazyfilewalker.used-space-filewalker|failed to start subprocess|{Process: storagenode, satelliteID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, error: context canceled}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Lazy File Walker: true, error: lazyfilewalker: context canceled, errorVerbose: lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Lazy File Walker: false, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|piecestore:cache|encountered error while computing space used by satellite|{Process: storagenode, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78, SatelliteID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs}|
|2024-09-03T08:42:05Z|ERROR|lazyfilewalker.used-space-filewalker|failed to start subprocess|{Process: storagenode, satelliteID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, error: context canceled}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Lazy File Walker: true, error: lazyfilewalker: context canceled, errorVerbose: lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|pieces|used-space-filewalker failed|{Process: storagenode, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Lazy File Walker: false, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|piecestore:cache|encountered error while computing space used by satellite|{Process: storagenode, error: filewalker: context canceled, errorVerbose: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78, SatelliteID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S}|
|2024-09-03T08:42:05Z|ERROR|piecestore:cache|error getting current used space for trash: |{Process: storagenode, error: filestore error: failed to walk trash namespace 7b2de9d72c2e935f1918c058caaf8ed00f0581639008707317ff1bd000000000: context canceled, errorVerbose: filestore error: failed to walk trash namespace 7b2de9d72c2e935f1918c058caaf8ed00f0581639008707317ff1bd000000000: context canceled\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).SpaceUsedForTrash:273\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:100\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-09-03T08:42:05Z|ERROR|pieces:trash|emptying trash failed|{Process: storagenode, error: pieces error: lazyfilewalker: signal: killed, errorVerbose: pieces error: lazyfilewalker: signal: killed\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkCleanupTrash:195\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:436\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89}|
|2024-09-03T08:42:19Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: YKO4YVO2M6KDBNNILGJR5P36B6XXJWDKQ4PXY64XDUNL43IRGPRQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 7168, Remote Address: 172.29.0.1:55160, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:42:19Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: YKO4YVO2M6KDBNNILGJR5P36B6XXJWDKQ4PXY64XDUNL43IRGPRQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 7168, Remote Address: 172.29.0.1:55174, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|
|2024-09-03T08:42:19Z|ERROR|piecestore|download failed|{Process: storagenode, Piece ID: SCXOQ5C7U4J444YL4ITIVDIUN43L3N5SUE4BXNVTKBWUEM5URVRQ, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: GET, Offset: 0, Size: 217600, Remote Address: 172.29.0.1:55186, error: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled, errorVerbose: untrusted: unable to get signee: trust: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).VerifyOrderLimitSignature:146\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:64\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:666\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35}|

After searching some more on the forum, I changed the settings to:

  • pieces.enable-lazy-filewalker: false
  • storage2.piece-scan-on-startup: true

Will let it run a few hours…see what happens

1 Like

Another update.
So I have been reading a lot on the forum and have changed quite a lot to improve performance and reduce HDD activity for the node itself:

  • All *.db files are now on a separate NVMe volume
  • The “orders” directory is also on this separate NVMe volume
  • The “filestatcache” directory is also on this separate NVMe volume
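Concretely, for the database and orders part it looks roughly like this in config.yaml (option names as I understand them from the forum, and the in-container path is a placeholder, so double-check both):

  • storage2.database-dir: /app/dbs (a folder bind-mounted from the NVMe volume)
  • storage2.orders.path: /app/dbs/orders (as far as I recall this is the option for the orders folder)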

I did notice that some of the other directories are owned by “Rozz”, my NAS user, and not “root”. Is this a problem? @arrogantrabbit

Hey man, I have two Synologys running as well. For me the trick was indeed adding the SSD cache. I have 2 devices running in 2 different locations:
My older node is a DS1019+: 5 drives of 12.7 TB in SHR with 2-drive fault tolerance, so 38.2 TB in total, plus a 1 TB SSD cache (RAID 1, read-write) with Btrfs metadata pinned.
(I have allocated 25 TB to STORJ.)

My newer node is a DS1821+: 6 drives of 7.3 TB in SHR with 1-drive fault tolerance, so 36.3 TB in total, plus a 2 TB SSD cache (RAID 1, read-write) with Btrfs metadata pinned.
(I have allocated 30 TB to STORJ) and I can add 2 more drives if needed.

I haven’t done anything else. So just one volume with storj on it, no separate locations for the dbs or other directories. If none of the filewalkers etc. are running, volume utilization is 20-25% on the old one and between 10 and 30% on the new one. Funnily enough they have different spikes (if there’s ingress, egress doesn’t add much load).

Both have only 8 TB of data used at the moment: the new one because it is still filling up, the old one because it was somehow stuck (I think you saw the old thread), but once I re-activated the SSD cache all the filewalkers etc. went through and freed up the space for real, so it is also growing again.

Happy to exchange more, but I think if you keep things running and don’t see any fatal errors, it’ll ‘relax’ at some point; then you’ll only see the relevant deletes / bloom filter drops etc., and it should be good.

So just bear in mind that the current maximum node size recommended for storj is 24 TB.
What are the requirements for a Storage Node on V3? – Storj

So performance will get worse as the node fills up, and it may eventually reach a tipping point where it can’t keep up any more.

But also, the max size is based on the bloom filters: going too large means the garbage-cleanup process will leave too much unpaid, untrashed garbage on your node.

1 Like

No. Please, do not delete anything in the storage subfolder. The pieces could be audited from the trash, and if you delete them, your node could be disqualified.

This means that your node is unable to run the used-space filewalker with this option enabled:

Please disable it. To speed up the next run of the filewalker you may enable the badger cache:
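(If memory serves, that is this option in config.yaml; verify it against the linked post before relying on it.)

  • pieces.file-stat-cache: badger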

This is a recommended setup for this node, since the lazy filewalker is failing.

This means that you have already enabled the badger cache, so let it finish the scan. The next run will finish faster.

Perhaps it will not be an issue for a while, since we do not include expired TTL data in the BF.

@Alexey and the rest, thanks for the extra info.
I think I may have 2 problems:

  1. A possibly failing hard disk, since it is the only one of the 4 with huge read/write latency (115-220ms read and 156-360ms write) and constant 98-99% utilization.
  2. I will not manually delete anything, but in the 4 trash folders (/storage/trash) I do see files from October 2023, January 2024, June 2024, July 2024 and August 2024.

My questions then are:

  1. Am I right to assume the hard drive is failing, even though everything in Synology Storage Manager says it's healthy?
  2. Should I do anything manually with those trash files?
  3. Do any of you experts recommend anything else I can do?

Rest of the specs of my node:
DS1821+, 8 Seagate Exos X16s from 12 TB to 16 TB (4 drives in SHR/ext4 for Storj alone), R/W SSD cache for the Storj volume, *.db and filestat files on another separate NVMe volume.

Don’t throw away the drive yet. It may not be broken; I suspect it isn’t failing. Does Synology let you run an extended background SMART test? I’d kick one off, even though it will probably take forever to finish.

It’s running SHR on the four disks, which is… voodoo magic as far as I’m concerned (LVM, anyway), so there is a chance the slow disk is just holding slower or more fragmented data, or something else innocent.

Now that you have turned OFF the lazy filewalker, look at your logs for two things: “Retain” (capital R) and “empty”. The Retain function can take hours even in the best of circumstances, and emptying trash can take hours when dealing with these monster trash loads. With the lazy filewalker on it was failing, so hopefully now it isn’t.
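Something along these lines should do; the log path and container name are placeholders, so adjust them to where you pointed log.output:

grep -E "Retain|empty" /path/to/node.log
docker logs storagenode 2>&1 | grep -E "Retain|empty"   (if you are still logging to the container output)

If Retain entries show up and aren’t followed by errors, garbage collection is at least making progress.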

To give your array some breathing room, you can also drop your storj node size to below the current usage. Less ingress means less load.
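If you use the standard run command, that is just the STORAGE value, or the equivalent option in config.yaml; a sketch with the option name from memory, so double-check it:

docker run ... -e STORAGE="9TB" ...
  • storage.allocated-disk-space: 9.00 TB

While the node thinks it is over the allocation it should stop accepting new uploads.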

And remember, if expanding, try to use one node per disk instead of arrays.

1 Like

After some more HDD testing with HDSentinel, which also says the drive(s) are fine, maybe you ( @EasyRhino ) are right that the drive is OK and it simply has a lot of load on it.

Will look into the logs and update here; will also stop the node and do an extended SMART test.

For now, can I just change the capacity from 18TB to, say, 9TB and let the node do its thing?

For the future:
So it’s better to have 4 nodes, each with its own drive? How would I go about doing this? I can’t delete the array because of the data on it, and moving all the data to another volume would take days, I think.

UPDATE on the log:
No “Retain” lines or errors in my 3 MB log.