Storj + SMART long test takes 15 days!

Not sure what you mean.

  1. I get a lot of: no such file or directory - it probably means the node deleted that sj1 file in the meantime - no problem

  2. Twice I got: I/O error - it means the HDD can’t read the file because it contains an unreadable sector - I deleted these files. They can’t be recovered anyway.

  3. I never got: hash mismatch - it means the file is readable, but its content was altered - to produce this result I corrupted a copy of one sj1 file by hand just to see what the output of is-valid-sj1-blob would look like. The real storj node files were of course not harmed in this experiment. (A rough sketch of this three-way classification follows right after this list.)
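
For anyone who wants to reproduce this kind of sweep, below is a minimal Python sketch of the same three-way classification. BLOBS_DIR and verify_sj1_hash() are placeholders, not official Storj tooling; the hash stub only marks where the real content check (what is-valid-sj1-blob does) would go. The point is how the three outcomes map onto exceptions.

from pathlib import Path

# Hypothetical location of the node's blobs; adjust to your own setup.
BLOBS_DIR = Path("/mnt/storagenode/storage/blobs")

def verify_sj1_hash(data: bytes) -> bool:
    # Stub that always passes: the real content check needs knowledge of the
    # sj1 piece format and is what is-valid-sj1-blob actually implements.
    return True

missing, io_errors, mismatches = [], [], []
for path in BLOBS_DIR.rglob("*.sj1"):
    try:
        data = path.read_bytes()       # forces every sector of the file to be read
    except FileNotFoundError:
        missing.append(path)           # node deleted the piece in the meantime - harmless
    except OSError as exc:
        io_errors.append((path, exc))  # unreadable sector - candidate for deletion
    else:
        if not verify_sj1_hash(data):
            mismatches.append(path)    # readable, but the content was altered

print(f"missing: {len(missing)}, I/O errors: {len(io_errors)}, "
      f"hash mismatches: {len(mismatches)}")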

Long test on Exos 24TB SATA: 33 hours. :grin:


Oh, I thought you had actually gotten such messages. Otherwise everything should be fine.

Yes, that works out to a linear read speed of about 200 MB/s. Seems about right.

Just finished. It took 30 hours, which was the estimate given by the -c flag. 212 MiB/s.
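
For reference, the back-of-the-envelope arithmetic behind both figures, assuming the advertised 24 TB is decimal (10^12 bytes per TB):

capacity_bytes = 24e12              # 24 TB Exos, decimal terabytes

for hours in (33, 30):
    rate = capacity_bytes / (hours * 3600)
    print(f"{hours} h -> {rate / 1e6:.0f} MB/s ({rate / 2**20:.0f} MiB/s)")

# 33 h -> 202 MB/s (193 MiB/s)
# 30 h -> 222 MB/s (212 MiB/s)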

The estimated time is also given by smartctl -a, e.g.:

...
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 523) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
...
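
If you want to pull those estimates out programmatically, here is a small Python sketch that parses the smartctl -a output shown above; the device path is only an example, and smartctl usually needs root:

import re
import subprocess

DEVICE = "/dev/sda"  # example device path; point it at the drive you are testing

out = subprocess.run(
    ["smartctl", "-a", DEVICE], capture_output=True, text=True
).stdout

# The routine name and its estimate sit on consecutive lines, e.g.
# "Extended self-test routine" / "recommended polling time: ( 523) minutes."
pattern = re.compile(
    r"(Short|Extended|Conveyance) self-test routine\s*\n\s*"
    r"recommended polling time:\s*\(\s*(\d+)\) minutes"
)
for name, minutes in pattern.findall(out):
    print(f"{name} self-test estimate: {minutes} min (~{int(minutes) / 60:.1f} h)")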

Hi everyone, just wanted to jump in and share my experience. I was running into a similar issue with my setup, where my HDD was taking forever to finish the SMART test while still running the storage service. I tried temporarily reducing the disk activity by decreasing the allocated space, and that helped a bit, but it was still slow. In the end, I realized that I was over-complicating it! After reading through the comments here and some advice from @support, I switched to using short tests instead of long ones. Honestly, it worked like a charm. No need to stress about running a full check every time, especially when the disk is busy with other tasks like storj. Thanks to everyone for the suggestions, they’ve been super helpful! Also, has anyone else tried combining their storj node with an SSD to speed up metadata access?

With piecestore, I tried ZFS with an SSD special device for metadata and was impressed by how quickly e.g. the filewalkers finished. It would be my preferred way of setting up a piecestore.
However, now with hashstore, I don’t see much use for such tweaking, since there aren’t millions and millions of files to manage anymore.


Right, or we can abandon hashstore before it’s too late (extra code – extra bugs; this is effectively keeping pieces in a makeshift database; a better version of this, using an actual production-tested database, was tried before and abandoned due to stability issues in favor of a plain old filesystem proven over decades. Deletions taking a long time is not a problem that requires a solution.)

Furthermore, for most users an SSD on the array, either as a special device or as a cache, dramatically improves the experience; most arrays today have one, and there is no reason not to add it, if only for the sake of your users. In that case the storagenode gets to benefit from it for free.

You are right: the filewalker completes in minutes with tens of thousands of read IOPS; the filesystem is perfectly fine managing many small files. Storj is not a large workload at all.

I keep watching these threads in disbelief and amusement – why are they purposefully destabilizing the codebase with extra unnecessary features?


To significantly increase the speed (it is several times faster than with a piecestore engine). I think we may post a tech blog about it there: Storj Engineering Blog | Storj Engineering Blog.
I often hear from developers that we should promote it as the default storage engine, but there is no agreement on that yet.