Cloned HDD OS to SSD which contained DB files - what to look for in logs

Hi all,

As per the title, for the longest time I was running (Win 10) 3 nodes on 3 drives (one of which holds the OS). All the nodes' .exe files are on the OS drive, with folders as below

The 2 nodes' data folders are on their own drives, as follows:

I have now purchased an SSD and cloned my HDD and booted. Everything seems to be running fine…

My question is: is the DB for each node currently in its data folder? And if so, is that best, or should I move their DBs to the SSD (and why)? If yes, is there a link or guide for doing that?

Second question: what should I look for in the logs to indicate anything going awry with my nodes? I'm planning on wiping my previous HDD and using it to hold the node currently on the OS drive, but I don't want to wipe the HDD until I'm certain everything was moved okay.

Thanks!

ERROR and WARN entries in the logs.
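If you'd rather not eyeball a multi-gigabyte file, tallying those levels can be scripted. A minimal Python sketch (the tab-delimited level field matches the log excerpts later in this thread, but verify against your own logs):

```python
from collections import Counter

def count_log_levels(path):
    """Stream a storagenode log and tally ERROR/WARN lines.

    Streaming line by line keeps memory use flat even on multi-GB logs.
    The level appears as a tab-delimited field after the timestamp.
    """
    counts = Counter()
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if "\tERROR\t" in line:
                counts["ERROR"] += 1
            elif "\tWARN\t" in line:
                counts["WARN"] += 1
    return counts
```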

Yes, that is the default.

Possibly, probably even, but it depends.

Warning - How to move DB’s to SSD on Docker - #2 by BrightSilence

Procedure - you would use the new SSD drive letter and path, instead of S:\Storj

Moving DBs is a process you can easily mess up. Unless you regularly have large amounts of trash or bad success rates, it's not worth it, especially because you will also introduce a new point of failure for your nodes. If your SSD fails, all nodes will initially fail, and while you can fix that, you will have lost all historic stats. So yeah, only do this if you're having issues with your nodes.


I'm trying to run the Success rate script - Now updated for new delete terminology since v1.29.3 tool to get the success rates, but it's just not completing. It's consuming all my RAM (I have 32 GB) and just churning. My log files are 3.5 GB and 7 GB respectively. I can't even complete the 3.5 GB one. Is there another tool, or anything I can do to speed it up?

You need to stop your node, rename the log file (or archive it and remove the source), start the node back up, wait several hours to accumulate stats, and then run the script.

My edited version of the PowerShell success rate script does not max out the RAM and is much quicker for large logs - Success rate script - Now updated for new delete terminology since v1.29.3 - #81 by Stob
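The speedup mostly comes from streaming the log line by line instead of loading the whole file into RAM, which is what triggers the pagefile thrashing described below. This is an illustrative Python sketch of the streaming idea, not the actual PowerShell script, and the matched phrases are placeholders to adapt to the real log wording:

```python
def download_success_rate(path):
    """Stream a node log and compute a rough download success rate.

    The "downloaded" / "download failed" markers are assumptions;
    substitute whatever your node version actually logs.
    """
    ok = failed = 0
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:  # constant memory, no matter how big the log is
            if "downloaded" in line:
                ok += 1
            elif "download failed" in line:
                failed += 1
    total = ok + failed
    return ok / total if total else None
```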


Thanks, I'll give it a shot after… maybe. Problem is, running the original script on a 32 GB, i7 machine (the CPU doesn't matter much since the script seems to be single-threaded) is taking well over 12 hours now. I started it at ~11pm; at 7:30am it was still going. I have taken the machine to work and it's running in the background. Now it is 1:16pm and still going strong lol.

It is currently on repair downloads. At this rate, it would take days to process my 7 GB file.

I have copied the files over so they are not being touched by the nodes.

I would say don’t persevere with the original script on a large log file. There’s just no need!

For reference, I just ran a 1.44 GB log from last month through my script and it came back after 6 minutes and 42 seconds…

By comparison, the original script running on the same log file on the same machine (Ryzen 3, 30 GB RAM, SSD storing the logs) took 5 hours and 38 minutes!…

…this was where the RAM filled and the pagefile access started on my HDD.


Okay wow, yeah, that was much faster; they both finished within the hour:


At that point it’s not 10-30% faster, it’s like 10-30x faster lol.

I'll keep this thread open for a bit longer in case any errors do show up in my logs, though.

2023-01-16T09:05:47.327-0500	ERROR	orders.121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6	failed to settle orders for satellite	{"satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: no such host", "errorVerbose": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: no such host\n\tstorj.io/storj/storagenode/orders.(*Service).settleWindow:254\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func1:205\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-01-16T09:05:47.327-0500	ERROR	orders.1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE	failed to settle orders for satellite	{"satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup saltlake.tardigrade.io: no such host", "errorVerbose": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup saltlake.tardigrade.io: no such host\n\tstorj.io/storj/storagenode/orders.(*Service).settleWindow:254\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func1:205\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-01-16T09:05:47.328-0500	ERROR	orders.12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S	failed to settle orders for satellite	{"satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: no such host", "errorVerbose": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: no such host\n\tstorj.io/storj/storagenode/orders.(*Service).settleWindow:254\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func1:205\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-01-16T09:05:47.344-0500	ERROR	orders.12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs	failed to settle orders for satellite	{"satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: no such host", "errorVerbose": "order: failed to start settlement: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: no such host\n\tstorj.io/storj/storagenode/orders.(*Service).settleWindow:254\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func1:205\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

These are the only errors of their kind I see so far - should I be concerned?

This was a while ago, and is now beyond the order submission time limit.

This suggests you had a DNS resolution problem on the 16th around 9:05am.
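A "no such host" error from `dial tcp: lookup` means the DNS lookup itself failed, not anything Storj-specific. If you want to verify that resolution is healthy again, a small Python check (the satellite hostnames come from the log lines above):

```python
import socket

def can_resolve(host):
    """Return True if the hostname currently resolves, False on DNS failure."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        return False

# Example:
# for sat in ("ap1.storj.io", "us1.storj.io", "eu1.storj.io",
#             "saltlake.tardigrade.io"):
#     print(sat, can_resolve(sat))
```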

The next thing to do is check that the Orders folder doesn't contain any files older than today.


In the archive folder or the unsent one?

Check for anything older than today in the unsent folder.

Files in the archive folder will have already been sent.
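That check can be scripted too; a minimal sketch, assuming your unsent orders live somewhere like `orders\unsent` under the node folder (the exact path depends on your setup):

```python
import os
from datetime import datetime, date

def files_older_than_today(folder):
    """Return names of files whose last-modified date is before today."""
    stale = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            modified = datetime.fromtimestamp(os.path.getmtime(path)).date()
            if modified < date.today():
                stale.append(name)
    return stale
```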


In that case, no, there is nothing older than today in the unsent folder. The archive has some from the 13th, though.

Looks like your download success rates are a little low. You may see some, but limited, improvement from moving the DBs to SSD. Just be sure to follow the steps carefully and don't mess it up; it's not worth losing your nodes over. We're talking probably a 2-5% improvement at best, and there may even be no noticeable difference.

Should I be worried about the stress, or the extra reads/writes significantly wearing out the SSD? I am using a Samsung 870 1 TB.

And the steps are: shut down the node, copy the files (which files exactly?) over to the desired location on the SSD, change the config file to point there, and restart the node?

For just the db files I wouldn't worry about that. It'd be different if you cached the entire node, which I don't recommend (I do this myself for some of my nodes, but it's a "do as I say, not as I do" situation).

There are instructions here: How to move DB’s to SSD on Docker
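If I understand the linked guide correctly, the core of the procedure on a Windows node is the `storage2.database-dir` option in `config.yaml`: stop the node, copy the existing `*.db` files to the new location, point the option at it, and start the node again. The path below is only an example, not a recommendation:

```yaml
# config.yaml - directory to store the node databases
# (copy the existing *.db files here while the node is stopped)
storage2.database-dir: "D:\\storj-dbs"
```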
