Windows node cannot start, stop, restart or reinstall. I will be disqualified :(

vladro · January 19, 2023, 5:07am

2 gb of ram is enough when starting your node and during the vetting period, so the first 6-12 months. after that it’s a good idea to add more ram. don’t forget that at peak times synology will use more ram for native services: active backup, download station, btrfs optimization.
my suggestion for your case: get a bigger ram module (check your synology model; it should support 4 or 8 gb if it’s based on celeron/atom). be smart and don’t purchase original synology ram but a compatible one (it should cost you 10-15$)

are u using windows core as OS?
you still need to check the windows event log; this error and behavior are related not to storj but to the OS
try to stop the storj service, delete the original storagenode.exe and replace it with a new one (for example 1.70), then restart the service

Ottetal · January 19, 2023, 8:55pm

(This message was typed 18-01-2023 late at night, my new account cannot post many comments in a day. Will post when I can)

Additional info: the storagenode.exe process uses a tonne (!) of CPU, but there is no disk or network activity.

As I’ve stated in another comment, this is a 4-core machine with 9GB vRAM, running as the sole VM on a VMware host on an Intel i3-5010U with 16GB pRAM

Ottetal · January 19, 2023, 9:01pm

@Alexey

Have you renamed/deleted the log file before start of the storagenode service?

Yes, I’ve deleted the log while the services were disabled.

Could your Synology run a docker?

My unit (rs819) cannot do it natively.

Usually an inability to stop the service means this host (VM?) has hardware issues. In this case it could be storage that is too slow, or problems with RAM.

I will investigate storage issues further. While I was unable to comment, I’ve moved the VM to another host with known good RAM DIMMs.

If you checked the filesystem from Windows and it fixed an issue, the service should start normally and should be able to stop without any issues.

I will look into this, thanks

@vladro

2 gb of ram is enough when starting your node and during the vetting period, so the first 6-12 months. after that it’s a good idea to add more ram. don’t forget that at peak times synology will use more ram for native services: active backup, download station, btrfs optimization.

Will StorJ use the VM’s RAM as cache, or the iSCSI target’s? Either way, the Synology is not at RAM capacity yet, and the VM has tonnes of RAM available.

my suggestion for your case: get a bigger ram module (check your synology model; it should support 4 or 8 gb if it’s based on celeron/atom). be smart and don’t purchase original synology ram but a compatible one (it should cost you 10-15$)

Would love to - but I am on an ARM-based Synology. I plan on using local storage in the future, and not networked storage.

are u using windows core as OS?

you still need to check the windows event log; this error and behavior are related not to storj but to the OS

try to stop the storj service, delete the original storagenode.exe and replace it with a new one (for example 1.70), then restart the service

No, regular old Win10. I will check up on the event log again - thanks. Good idea with the newer version - I can grab it from one of my other nodes. Will report back when I have additional findings

Ottetal · January 19, 2023, 9:13pm

Update: While my initial checks of the integrity of the iSCSI drive were A-OK, I never really verified anything. I can browse the drive just fine, but I cannot create any files. That is bad. I am running a chkdsk, and will report back when that is finished.
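
For anyone who wants to repeat the "browse works, create fails" check without clicking around in Explorer, here is a minimal sketch of a write probe in Python. The function name `can_write` and the probe file name are made up for illustration, and this is not part of the storagenode software. It only tells you whether a small file can be created, flushed, read back and deleted in a given folder.

```python
import os
import uuid


def can_write(directory: str) -> bool:
    """Return True if a small file can be created, synced, read back and deleted."""
    # Hypothetical probe file name; the random suffix avoids clobbering real files.
    probe = os.path.join(directory, f"write-probe-{uuid.uuid4().hex}.tmp")
    payload = b"write probe"
    try:
        with open(probe, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # force the data out of the OS cache
        with open(probe, "rb") as fh:
            ok = fh.read() == payload
        os.remove(probe)
        return ok
    except OSError:
        # Read-only volume, missing folder, I/O error, permission problem, ...
        return False
```

Calling `can_write(r"D:\storagenode")` (the path is just an example, use your own storage location) should return `False` on a volume in the state described above.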

Ottetal · January 20, 2023, 8:21am

Whelp, this might take some time. I hope it will finish in time.

@vladro - you were right on disk issues. Will mark your response as solution when/if this check finishes

Ottetal · January 22, 2023, 9:47am

Update: It has been ~40 hours since I started the check. It is around 50% finished, and I hope it will finish in time.

Notes: Synology SHR1 with 4x8TB IronWolf Pros on a Realtek RTD1296 can perform around 250 IOPS with nothing else going on, equating to 15-20MB/s scrubbing of the array.

With a 7TB iSCSI volume dedicated to the VM, I should be at a theoretical ~100 hours for the scrub. Windows reports around 130 hours for total time. I will update this comment when I have the result.
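
The estimate above is simple division, and a quick sketch makes it easy to redo with other numbers. This assumes decimal units (1 TB = 1,000,000 MB) and a constant throughput, which a real scan will not hold; `scan_hours` is just an illustrative name.

```python
def scan_hours(volume_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to read volume_tb terabytes at throughput_mb_s MB/s (decimal units)."""
    total_mb = volume_tb * 1_000_000
    return total_mb / throughput_mb_s / 3600


for rate in (15, 20):
    print(f"{rate} MB/s -> {scan_hours(7, rate):.0f} hours")
# 15 MB/s -> 130 hours
# 20 MB/s -> 97 hours
```

So the ~100 hour figure matches the 20MB/s end of the range, and the ~130 hours Windows reports matches the 15MB/s end.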

Disk activity in Windows fluctuates between 99-100%, with breaks of a second or so from time to time. CPU activity is at around 30% (mapped and reserved two real cores to the VM) with around 3.3GB RAM usage.

vladro · January 22, 2023, 10:39am

don’t worry about downtime, it’s not a big deal… also check the possibility of switching from ntfs to refs, it’s much better for bigger nodes (based on my experience)

Ottetal · January 22, 2023, 2:17pm

Hello again Vladro. I do think that is a good idea. When I switch to local storage, I will definitely keep that in mind. I will work on that in particular, when I have a 30 day window of constant uptime after this. This forum has many great resources on how to migrate data from one drive to another, so that should be no problem if I do reach it.

vladro · January 22, 2023, 3:35pm

yep, it was just a suggestion for the future after you recover your score

Alexey · January 23, 2023, 3:59am

I would not recommend ReFS:

jammerdan · January 23, 2023, 6:35am

I would not recommend ReFs:

Additionally I have had my own ReFS experience:

vladro · January 23, 2023, 5:02pm

no way refs should be used on a workstation os. on the other hand it’s a good option for a server os and some tasks (with hardware raid and some tiering)

jammerdan · January 23, 2023, 5:03pm

I had it running in a server.

vladro · January 23, 2023, 5:08pm

i know, it’s really a horrible scenario… we have a lot of tools and software to work with ntfs but only a few cli-based instruments for refs… but like storage spaces, with proper preparation it’s a better option for some scenarios

jammerdan · January 23, 2023, 5:13pm

It is great as long as it works.
It lacks maturity and ecosystem.

Ottetal · January 23, 2023, 5:51pm

@jammerdan, @Alexey, thank you for the opposing views on ReFS. I will read your stories and decide in the future.

Ottetal · January 24, 2023, 6:55am

I think I am done with this node. After 6 days and 7 hours worth of NTFS scanning with about 3% left to go, my girlfriend accidentally pulled the plug on the server rack. She was cleaning, saw dust behind the rack, moved the rack on its casters out of the way, and when the vacuum didn’t reach, she unplugged the power.

“It was only momentarily, what is the big deal?”

I’m flabbergasted. Not at her, because even though I explicitly told her to keep that plug in, she simply made a mistake. I’m flabbergasted at my own inability to plan for such events. I could have secured the plug better. I should have protected my gear behind a UPS, and I should not have left the casters on the rack unlocked.

I’ll rerun the NTFS scan.

After all, I am interested to see if it did indeed have a positive effect.

@vladro, @Stob, @Alexey, thank you so much for your prolonged help with my node. I’ve lost hope that I can recover early enough to not get disqualified, but I want to provide an answer as to the scan.

See you in ~6 days.

Ottetal · January 24, 2023, 8:01am

We’re going again. This time using the /x flag as well. I am unsure whether this will have a bad impact on performance in the future, but I am super curious.

IOPS seems to have increased along with scrub time. Maybe it will be four days and not six after all
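
For reference, `/x` tells chkdsk to force the volume to dismount first, and it implies `/f` (fix errors). The exact command line used here is not stated in the thread, so the sketch below is only an assumption of what such a run might look like. The drive letter and the helper names `chkdsk_command` and `run_chkdsk` are hypothetical, and the command has to be run from an elevated prompt on Windows.

```python
import subprocess


def chkdsk_command(drive: str, force_dismount: bool = True) -> list[str]:
    """Build a chkdsk command line for the given drive letter, e.g. 'D:'."""
    cmd = ["chkdsk", drive, "/f"]  # /f: fix errors found on the volume
    if force_dismount:
        cmd.append("/x")  # /x: force a dismount first (implies /f)
    return cmd


def run_chkdsk(drive: str) -> int:
    """Run chkdsk and return its exit code (Windows only, needs admin rights)."""
    return subprocess.run(chkdsk_command(drive), check=False).returncode
```

`chkdsk_command("D:")` gives `["chkdsk", "D:", "/f", "/x"]`. Stop the storagenode service before running it, since the forced dismount invalidates any open handles on the volume.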

Toyoo · January 24, 2023, 9:23pm

One of my friends broke up with his girlfriend because she used metal utensils on his non-stick wok. I hope this is not the case here

Ottetal · January 24, 2023, 9:37pm

I love her to death, and after this morning’s incident, I might just as well be dead so everything is in check.

I’ve purchased a UPS that is on its way. I hope that will alleviate future human error.