Disk usage discrepancy?

I checked my disk with chkdsk /scan and /f and it didn't find any problems. Is there a clear instruction for how to give SSD space to the filewalker? This solution looks like it would work and would have good speed. The problem would be that the test data already takes 100% of the HDD's capabilities, no?

I believe there is not. But you need to google how to do it in your case.
There are two known implementations:

Perhaps. But the goal is to emulate the future load from the customers:

So, if we win, you likely would have the same usage in the future.

Hmmm, I just disabled the lazy filewalker and it's still the same. One more thing: when I restart my node it always goes back to 6.37 TB, and if I restart my second node, that one goes to 6.07 TB.

Disabling the lazy filewalker will not give you a positive result until each filewalker finishes without running into database issues along the way (like "database is locked", for example).
Lazy/not lazy only affects how fast they finish, see

Okay, so this happens to other people as well, so I have to wait until the filewalker does its job, and it should change how much space is shown as used afterwards. Okay, I'll wait.

Yes, you are correct. It affects many operators (their nodes, to be precise), but not all.
You need to keep an eye on the logs to make sure there are no errors related to filewalkers and databases, just to be on the safe side.

Yep. But we can't design software to run on a Commodore 64 just because someone's got a spare one in the closet.


Only a vocal minority has corruption issues, because of using an ill-suited or unsupported configuration. For the vast, overwhelming majority of the many thousands of node operators it's flawless.

lol :). I won’t even bother arguing with this nonsense.

By the way, Storj is not using databases for anything remotely important. They used to, but switched to relying purely on the filesystem for what matters. Databases today are pretty cosmetic and needed only to support the dashboard. (I'm aware of the expiration database and orders; my point still stands.) You can delete the databases at any time and nothing bad will happen.

This is plenty. If it does not perform, the configuration is crap; you need to fix it. Storj can't do it for you. I've been running nodes on much weaker hardware and they did not create a perceptible load on the machine (nor did they ever corrupt databases or cause any other issues). It's entirely a "you" problem.

You misunderstand the purpose of the project. If Storj paid market rates they could run a datacenter themselves. The point is to avoid wasting existing online idle capacity, and therefore any payment they make to you is 100% profit. You can't complain that you get too little free money.

Yes. And it does. But if your hardware is weak, or you don't know how to properly configure your overpowered machine for this quite simple use case, don't run the node. Storj does not recommend buying hardware for a storage node. They never promised that it would run on an abacus just because that is what you have in your closet.


Only a vocal minority has corruption issues, because of using an ill-suited or unsupported configuration. For the vast, overwhelming majority of the many thousands of node operators it's flawless.

See the quote from the "leader" below; clearly it isn't a "configuration" issue:

My DBs are on HDD too, and one of the nodes has had issues with "database is locked" errors, which I had not seen for the last several years.

By the way, Storj is not using databases for anything remotely important. They used to, but switched to relying purely on the filesystem for what matters. Databases today are pretty cosmetic and needed only to support the dashboard. (I'm aware of the expiration database and orders; my point still stands.) You can delete the databases at any time and nothing bad will happen.

Clearly, since the only "workaround" for a corrupted database is buying new hardware, you are 100% correct, lol.

You misunderstand the purpose of the project. If Storj paid market rates they could run a datacenter themselves. The point is to avoid wasting existing online idle capacity, and therefore any payment they make to you is 100% profit. You can't complain that you get too little free money.

I don't misunderstand the purpose of the project. I'm not complaining that I get "too little free money"; I am complaining that people have the audacity to expect me to fork over more $$ on hardware when the project pays jack shit and doesn't cover the added cost. This is also my point about using "spare hardware": I shouldn't have to incur additional costs to fix crap software.

Yes. And it does. But if your hardware is weak, or you don't know how to properly configure your overpowered machine for this quite simple use case, don't run the node. Storj does not recommend buying hardware for a storage node. They never promised that it would run on an abacus just because that is what you have in your closet.

My hardware is properly configured and runs just fine; the only issue I've ever had is with Storj. I never said I bought the hardware for a storage node; I have much more complicated use cases that it addresses. As I said, I gave it spare HDD space because I wasn't using it. The fact that removing Storj makes all the other "issues" go away tells me that the Docker container is garbage and causing conflicts with other parts of TrueNAS.

If you don't like Storj that much, why are you engaging? The quality of the software is what it is, no denying that. In fact, you should read @Alexey's message exactly as a statement that the software is not perfect. It's not likely to change short-term either; Storj Inc. has different priorities. Storj managed to prepare storage nodes to work on some configurations, and this already serves commercial purposes. As the needs grow, effort will be put into making it better, balanced against the availability of hardware where it already runs OK.

You can belittle the project as much as you want, but will it help? No.


It is. It's unreasonable to expect a single isolated HDD to handle the IOPS that a node requires. It's a configuration issue: this system, as configured, is not suitable to run the node.

It's a different use case, qualitatively different from storing movies and running virtual machines. It is a lot of non-localized I/O scattered across the datastore. If your system is not designed for low-latency random access (and, by extension, a good user experience regardless of workload, which improves many use cases), it won't perform well for a storage node. Don't run a storage node on it.

If you are interested in constructive feedback and not just venting, there are plenty of threads on performance optimization on this forum.


I am not entirely sure that you can be confident enough to state this.

My view is that there are thousands of operators out there, many of whom are not on the forum because they haven't really engaged much with the project. They just set it and forget it, and take a look at things occasionally.
By and large the software has been running happily, so they may have needed very little intervention over many months.

The locked database issues are very insidious in that you won't notice them unless:
1 - You like to “keep an eye” on the dashboard and start seeing inconsistent data
2 - You like to “keep an eye” on the general health of your hardware and notice “missing space” on your HDDs
3 - The node dies a horrible database-messy, disk-space-missing death.

Most of the posts in this forum seem to fall into the first two categories, because the forum membership is obviously skewed towards those of us who take a keener interest in the project for whatever reason.
We have no way of knowing how many nodes are silently approaching "category 3" and may start dropping off the network, possibly even unnoticed by their SNOs for days or weeks.

Now, I have no way of knowing whether my hypothesis is right either, but I think it is plausible, and we shouldn't be complacent in thinking that these issues are isolated and being blown out of proportion by an (occasionally puerile and unsophisticated) vocal minority.

Finally, on the issue of unsuitable or ill-configured hardware: whilst you are technically right, you also have to accept that hardware which seemed perfectly suitable at the time many of these nodes were deployed turns out not to have scaled well to the increased workload (e.g. my "lettuce node" with USB spinning rust and a small SD card).
Many of the less techie SNOs (like myself), and even some of the more technically minded ones, would have had no way of seeing this coming.
So this is part of the whole learning experience for both us and Storj.


I still believe that we should support all sorts of hardware setups, even suboptimal ones. The protocol itself should handle all the load pressure. And I like the new "choice of n" feature; it allows even weak devices to handle the load without crashing.
We still need improvements to address other challenges, like drives that cannot keep up. The filewalker should not fail, or if it has failed it should be restarted; if the database is locked but we have RAM, then keep this info in RAM until the database is unlocked, and flush the data from RAM to the database then, etc.
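A minimal sketch of that buffering idea in Go, assuming a SQLite-backed `*sql.DB`; the record type, table name, and retry behaviour are made up for illustration and are not the storagenode's actual code:

```go
package dbbuffer

import (
	"database/sql"
	"strings"
	"sync"
	"time"
)

// usageRecord is a hypothetical stand-in for whatever the node wants to persist.
type usageRecord struct {
	Satellite string
	Bytes     int64
	At        time.Time
}

type bufferedWriter struct {
	mu      sync.Mutex
	pending []usageRecord
	db      *sql.DB
}

// Add queues a record in RAM; it is persisted by the next successful Flush.
func (w *bufferedWriter) Add(r usageRecord) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.pending = append(w.pending, r)
}

// Flush tries to write everything queued so far in one transaction.
// If SQLite reports "database is locked", the records stay in RAM
// and are retried on the next call instead of being lost.
func (w *bufferedWriter) Flush() error {
	w.mu.Lock()
	batch := w.pending
	w.pending = nil
	w.mu.Unlock()
	if len(batch) == 0 {
		return nil
	}

	err := w.writeBatch(batch)
	if err != nil && strings.Contains(err.Error(), "database is locked") {
		// Put the batch back in front of anything queued meanwhile.
		w.mu.Lock()
		w.pending = append(batch, w.pending...)
		w.mu.Unlock()
	}
	return err
}

func (w *bufferedWriter) writeBatch(batch []usageRecord) error {
	tx, err := w.db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()
	for _, r := range batch {
		if _, err := tx.Exec(
			`INSERT INTO bandwidth_usage(satellite, bytes, at) VALUES (?, ?, ?)`,
			r.Satellite, r.Bytes, r.At); err != nil {
			return err
		}
	}
	return tx.Commit()
}
```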

If the database can’t keep up because of slow write speeds, wouldn’t that just end up filling the RAM up as well?

Yes, of course. But I do not have a better idea of how to resolve this congestion.
We could perhaps split the databases by satellite; however, I believe it would only buy us some time before we run into it again.

Yes, it’s a tricky one that someone much cleverer than I is going to have to think about.

And speaking of which, whilst I would imagine the workload at Storj is immense right now, it would be nice to hear some sort of noises from the higher-ups, with at least an acknowledgement of the issues and whether there is any vague plan to address them in the near or mid-term future.
An update to @elek 's operational updates post might be a good way of doing that :slight_smile:


Another idea. Again, I am not a coder, so I don't know if this could be done: it is my understanding that writing to a text file is less expensive than writing to a database.
So could we not journal the database requests to a (text?) file instead of executing them, and replay this log when the load is low again?
The current situation is crazy, because with a lot of ingress and egress the database operations also increase, so IOPS go through the roof exactly when they are most needed for the customers' data.
Instead, if we logged the database statements rather than executing them, we could execute them later when the load is lower.
And I think I have read that batched statements are better for SQLite. So maybe, before executing such a journal, it could be run through a statement optimizer to batch as many as possible.
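To make the idea concrete, here is a rough Go sketch of the journaling side. The line-delimited JSON format, file name, and `Entry` type are hypothetical, not existing storagenode code:

```go
package journal

import (
	"encoding/json"
	"os"
	"sync"
)

// Entry is one deferred database statement plus its arguments.
type Entry struct {
	SQL  string        `json:"sql"`
	Args []interface{} `json:"args"`
}

type Writer struct {
	mu   sync.Mutex
	file *os.File
	enc  *json.Encoder
}

// Open creates (or continues) an append-only journal file.
func Open(path string) (*Writer, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &Writer{file: f, enc: json.NewEncoder(f)}, nil
}

// Append records a statement for later replay instead of executing it now.
func (w *Writer) Append(sql string, args ...interface{}) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.enc.Encode(Entry{SQL: sql, Args: args})
}

func (w *Writer) Close() error {
	return w.file.Close()
}
```

The appeal is that an O_APPEND write is a single sequential operation, so it competes far less with customer traffic for IOPS than a random SQLite page write would.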

I'm not sure that this is fundamentally very different from the "page to RAM until there is less load on the DBs" solution that @Alexey mentioned.
What if there is no downtime? What if there is sustained load? (This seems plausible with the current test pattern.)
All you're doing is adding another step which may itself become congested, which increases complexity.

I fear something much more radical than that is needed. Either a fundamental rethink of piece-tracking or (and this might make more sense) more profound customisation and optimisation of the underlying database engine, as suggested in another post (can't remember by whom, but it might have been you, @jammerdan?).

A text file is unstructured by default, so parsing it requires many more CPU cycles than reading the same data from a structured file (like a database).
It also isn't effective with simultaneous usage (this is the exact reason why we have "database is locked" issues: the database is a file too, which needs to be accessed from different code paths).
So, this idea wouldn't solve anything but could make things even worse, unfortunately. Sorry, I always like reading your ideas though :slight_smile:.

But the idea of append-only records could be a solution, like the one described here:

This one could be a solution too; we already implemented something like that by caching bandwidth in RAM before flushing it to the database.
The journal itself could help too, because it's exactly an append-only file.

You are correct, it's the same suggestion, but perhaps more RAM-consuming than mine.

I'm with you on that conclusion.
It's possible that the badger experiments will give us another boost.


I don't know how much RAM would be needed, but it is normally limited and maybe better used or required for other things. Also, of course, it is volatile.

But, for example, we are writing Docker logs at the same speed customer data flows in, and obviously this does not hurt. So I think it should be possible to write database statements to a file as well.
And I am sure there will be quiet periods in which to replay such a log. It would decouple the database load from the actual data load.
This could give a lot of control over what to execute and when, instead of forcing load onto the databases when the filesystem needs all the IOPS it can get.

There are so many other things possible. You could execute the journal on a timed basis, like every hour, and signal to the satellite that a database replay is running so that the satellite will not send data. As I have no idea how big such a file would be and how much pressure it would put on the filesystem, I don't know whether that would be required, but the node could signal to the satellite to stop ingress for a couple of minutes.
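A hedged sketch of the replay side, reusing the hypothetical `Entry` format from the journal sketch above: the whole journal is executed in a single SQLite transaction (the batched usage SQLite prefers), then truncated, and optionally run on a timer:

```go
package journal

import (
	"bufio"
	"database/sql"
	"encoding/json"
	"os"
	"time"
)

// Replay runs every journaled statement inside a single transaction,
// then empties the journal. Note: JSON decodes numbers as float64;
// a real format would need to preserve argument types.
func Replay(db *sql.DB, path string) error {
	f, err := os.Open(path)
	if err != nil {
		if os.IsNotExist(err) {
			return nil // nothing to replay
		}
		return err
	}
	defer f.Close()

	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()

	dec := json.NewDecoder(bufio.NewReader(f))
	for dec.More() {
		var e Entry
		if err := dec.Decode(&e); err != nil {
			return err
		}
		if _, err := tx.Exec(e.SQL, e.Args...); err != nil {
			return err
		}
	}
	if err := tx.Commit(); err != nil {
		return err
	}
	// Only discard the journal once everything has been committed.
	return os.Truncate(path, 0)
}

// ReplayEvery is the "execute the journal on a timed basis" idea,
// e.g. once per hour.
func ReplayEvery(db *sql.DB, path string, interval time.Duration) {
	for range time.Tick(interval) {
		_ = Replay(db, path) // errors would be logged in real code
	}
}
```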

Unfortunately, it seems that the easiest solution, simply running the database statements while data flows in, does not work. So yes, maybe more complexity is required.
It seems that SQLite has a setting for this; I asked an AI:

Yes, SQLite has a feature that allows you to log database statements to a file instead of executing them. This is called the “echo mode” or “test mode”.

You can enable echo mode by setting the sqlite3_test_control function to SQLITE_TESTCTRL_LOG and providing a file pointer where the statements will be logged.

And according to the AI it can be read back:

Yes, you can replay a SQLite log file to execute the statements again. One way to do this is by using the .read command in the SQLite command-line shell.

It sounds like it could be implemented easily.