Update:
After the watchtower update to 1.102.3 last night, the trash has now been successfully deleted and the free disk space is shown correctly.
I'd say this still applies, albeit the definition of suitable hardware will shift. I.e. if you have a bunch of Raspberry Pis – forget it. If you have an actual server – sure. And that "sure" may extend to adding more drives to an existing setup.
FWIW I have 4 empty bays. If the 100Mbps traffic becomes reality I'll add 60TB without batting an eye.
The concern, however, is how many folks have an asymmetric connection. I.e. DOCSIS customers: gigabit down, 30 megabit up.
All that data that is being uploaded may one day need to be downloaded. That should be tested too.
You are right about this, but it is also true that for the Storj service to be of quality, it must be entrusted to SNOs who have invested in server-farm-level equipment and have connections of over 2Gbit upload and 10Gbit download. You're right that not everyone has a high-level connection, but since I'm also a Storj customer, if I pay I expect files to download quickly and file-search responses to be immediate or nearly so. If Storj were based only on home users, the quality of the service would not take off. Obviously it's just my opinion, certainly not shared by everyone, but in this jungle of cloud services the winner is the one who offers not only the best price for data storage but also the best quality of service.
Therefore anyone who is not keeping up with the speed and quality of the network, which is developing and expanding more and more, will be disqualified or will lose income because they will not be able to keep up with the other SNOs. Unfortunately that's the hard truth, but I suppose that's how it works.
I'd love an excuse to use 20TB+ HDDs! But I think the average node is growing around 400GB/month right now? So not going to fill existing disks for years…
Good question. I'm not sure how fast storage is used up. I'm getting 1-1.5TB of ingress, and I'm not sure how fast stuff gets deleted.
And here is another one:
A node claiming to run on 1.102.3 but still has the old hierarchy under trash without date folders.
/storage/trash/.trash-uses-day-dirs-indicator
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2a
Trash folders from all other satellites are empty.
What I don't know is whether there is an issue with it or if this is normal, for example if the migration maybe has not yet started.
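One way to see which layout a node is in: classify the entries in the satellite's trash directory. This is an unofficial sketch; `TRASH` is an assumption pointing at the example path quoted in this thread, so adjust it for your node.

```shell
#!/bin/sh
# Classify the immediate children of a satellite's trash directory:
# YYYY-MM-DD entries mean the per-day migration ran; two-character prefix
# entries (e.g. "2a") mean the old flat layout is still (also) present.
# TRASH is an assumption -- point it at your own satellite trash folder.
TRASH=${TRASH:-/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa}

classify() {
  name=$(basename "$1")
  case $name in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) echo "per-day dir: $name" ;;
    ??)                                         echo "old prefix dir: $name" ;;
  esac
}

for entry in "$TRASH"/*/; do
  if [ -d "$entry" ]; then
    classify "$entry"
  fi
done
```

Seeing both kinds of entries at once would match the mixed state described in this thread.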
The gc-filewalker does not seem to have a problem with the current folders:
2024-05-02T20:28:21Z INFO lazyfilewalker.gc-filewalker.subprocess gc-filewalker completed {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "piecesSkippedCount": 0, "Process": "storagenode", "piecesCount": 4804204, "trashPiecesCount": 78266, "piecesTrashed": 78266}
So the questions would be:
The filewalker logs show that it is ready to go by the date:
2024-05-02T12:01:46Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "dateBefore": "2024-04-25T12:01:46Z"}
Honestly: the number of different issues I am seeing on just a few of my nodes (I do not check them systematically for this kind of trash error) is far beyond what I expected and what I believe it should be. This is very concerning.
I see the same issue.
It created a new date folder without moving the old files. This happened earlier, but I thought a newer BF would move the older folders into per-day folders.
Well spotted.
I did not see the date folder in my satellite folder, but it is there.
And interesting enough I have all the subfolders in there:
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-05-02/2a
So I have:
/storage/trash/.trash-uses-day-dirs-indicator
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2a
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-05-02/2a
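To gauge how much sits in each of the two layouts, a rough per-layout piece count can help. A sketch, assuming the paths listed above (note that in `find -path` patterns, `?` does not cross a directory component of the fixed-length prefix, while `*` matches across slashes):

```shell
#!/bin/sh
# Rough piece counts per layout for one satellite (path from the post above).
# count_old: files directly under two-char prefix dirs (old flat layout).
# count_new: files anywhere under YYYY-MM-DD date dirs (migrated layout).
count_old() { find "$1" -maxdepth 2 -path "$1/??/*" -type f 2>/dev/null | wc -l; }
count_new() { find "$1" -path "$1/????-??-??/*" -type f 2>/dev/null | wc -l; }

sat="/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa"
echo "old-layout pieces: $(count_old "$sat")"
echo "per-day pieces:    $(count_new "$sat")"
```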
OMG. The next mess.
Edit: Both subfolders contain piece files /trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2a
as well as /trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-05-02/2a
Yes, both folders will have files as per their respective bloom filters. It's the migration to the per-day folder that messed up somewhere.
Latest GC log entry. This hasn't finished yet.
2024-05-02T22:03:13Z INFO lazyfilewalker.gc-filewalker.subprocess gc-filewalker started {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "createdBefore": "2024-04-23T17:59:59Z", "bloomFilterSize": 5069692}
The migration was a one-time thing. So maybe at some point it would be safe not to solve the underlying problem, but to wipe out the satellite trash folder to get rid of everything that is wrong in there and resume operation with date folders.
No, but now you can bring more hardware to the network. It's still not a requirement, it's an opportunity – see the difference?
We need more simple nodes across the world, not necessarily for you to increase your hardware. However, that will be accepted as well (we do not track whether it is economically viable for you or not, sorry about this).
We posted a message, then you need to decide what you would like to do (or do nothing, and this is OK too!). For example, I'll do nothing. I do not have physical access to my server, so I would not be able to expand (and I, of course, regret this, but well, that's life…)
For overcomplicated setups like virtualization or RAID or other layers – for sure. My setup does not suffer. What's the difference? My nodes followed the recommendations:
Yes, the OS is the worst choice (Windows), however, they work. And only that is important.
I have 2 Docker for Windows Desktop nodes and the GUI one; they work perfectly and can handle the load. Yes, this is an old 2019 server with an i7 and 32GB of RAM and simple WD (and one Toshiba) disks, three in total – everything is working perfectly. The server is in the EU (of some kind…), and not currently used for anything else (I moved my work to the cloud several years ago…).
Perhaps it's a nice idea, would you mind making a pull request?
I would disagree; a fleet of Raspberry Pis is very powerful. I have had one with 1GB of RAM, and it worked perfectly during the past stress tests (which were much more stressful, by the way – my Internet channel of 100Mbps was saturated at 100% for several days!), and it survived.
Also the deletion of this data afterward, so I would strongly disagree, with all respect.
We do not recommend investing under any circumstances, even now. So it's your risk and your opportunity.
Could you please prove your claim?
This is perhaps a bug and is discussed separately.
Either you mistyped the quoted part, or that reply wasn't meant for me. My setups are bare metal, one node per disk.
Flexing aside, that is not what the recent announcement suggested. The announcement suggested that a new prospective client, who can't be given a free one-month trial to upload/download their data and see if the network can keep up, has to wait for the OK from the Storj team after testing on a dozen petabytes of data, i.e. the 50% (my claim; we are still waiting on confirmation of the exact figure for the past month) of currently "unpaid" free-account data sitting on the network. If I were testing, I'd personally test with 10% of the data. Maybe the client doesn't have this much data, but the announcement disproves that. The announcement specifically states, and I quote, that "this will be the new normal": data will be continuously deleted from the production satellites and migrated to the testing satellite for the next year or so. It will stay there for a couple of weeks (to avoid paying out the whole month for it) and then be deleted as well. That means the new client will match the current usage at best, or, based on a logical 10% test, will far exceed the current network size. Or the client is simply overblowing their numbers.
That settles the use-what-you-have / go-by-Storj-requirements part. Either the network expands to match the usage, or the answer to clients is "sorry, we can't handle the load right now, check back later". As we have already established in another thread: the berry plant eventually runs out of free berries for anyone to pick.
Back to the topic at hand: the golden rule in life that should be taught in every single school class, whether it's history or physics, is:
once, it happens
twice, itâs a coincidence
thrice, itâs a pattern
Judging by the replies above, I'm not the only one who had the date-named folders mixed with prefix-named folders. My questions to the team are:
I'll take those, I guess.
While we would like to think that we are smart enough to foresee all possible bugs, it turns out we are still human. As to why there wasn't a fallback mechanism: the migration step is supposed to be extremely simple and quick (it consists only of three renames/moves), and in the unlikely event it was interrupted, it could be fixed manually.
But somehow the migration seems to have been kicked off multiple times all at once, and in some cases deletions were already being processed before the migration, leaving behind directories in the old location. This seems to be because filewalker subprocesses open the blob store at the same time as the main process, which wasn't caught because unfortunately our test systems don't spawn the filewalker subprocesses in the same way.
We don't have a fix yet. This situation is not considered ultra-high priority because the only data at risk of being lost is already in the trash. We definitely do see how it's a problem for our SNOs, though, and we would very much like to fix it. The plan is to recognize the recursive-directory problem and the unmigrated-directories problem and correct them automatically on node start.
If the amount of data in the trash is causing you problems, you can delete it. Ideally you should keep pieces from the last 7 days. It may be helpful if you keep the data, just so we can test our fixes, but you're under no obligation to do so.
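If it helps, the "delete but keep the last 7 days" advice can be sketched as a dry-run script. This is an unofficial sketch, not a supported tool: the satellite path is the example from this thread, it assumes the migrated per-day (YYYY-MM-DD) layout, and you should stop the node before actually deleting anything.

```shell
#!/bin/sh
# List per-day trash folders older than 7 days for one satellite.
# Folder names are UTC dates, so a plain string comparison works.
# Dry run by default -- swap the echo for rm -rf once you trust the output.
sat="/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa"
cutoff=$(date -u -d "7 days ago" +%Y-%m-%d)   # GNU date; BSD: date -u -v-7d +%Y-%m-%d

older_than() {  # older_than NAME CUTOFF -> success if NAME sorts before CUTOFF
  expr "$1" \< "$2" >/dev/null
}

for day in "$sat"/????-??-??/; do
  [ -d "$day" ] || continue
  if older_than "$(basename "$day")" "$cutoff"; then
    echo "would delete: $day"   # replace with: rm -rf "$day"
  fi
done
```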
Saw a video a couple of days back comparing the lack of finger pinch sensors on cybertruck vs other car manufacturers. That verifies that cybertruck is indeed designed by humans, while kia/honda/nissan/etc are designed by AI, and tesla will never compete on the AI front.
At some point in life you realize that Murphy's Law is one of the three fundamental laws of the universe: what can go wrong, will.
The second fundamental law of the universe is Sodd's Law, which is based on Murphy's Law: what can go wrong, will, at the worst possible time.
The third, I already mentioned in my previous reply.
There will be a point where the storagenode software decides to
rm -rf /$directory
I pray to $deity that that variable doesn't end up being empty. Or at least that a human can use an AI to predict that the variable will be empty at some point and only rm -rf if the variable isn't empty.
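The classic shell guard for exactly this failure mode is `${var:?}`, which makes the shell abort the command (and a non-interactive script) instead of expanding an unset or empty variable to nothing. A minimal demonstration, run in a subshell so the script itself survives:

```shell
#!/bin/sh
# Guarding a destructive delete: ${var:?message} aborts when the variable is
# unset or empty, so the path can never collapse to "/". The subshell here is
# only to demonstrate the failure without killing this script.
directory=""                          # imagine a bug left this empty

if ( rm -rf "/${directory:?must not be empty}" ) 2>/dev/null; then
  echo "deleted /$directory"
else
  echo "refused: directory was empty"   # prints this branch
fi
```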
The amount of trash isn't causing problems. It's only at 19TB for now. What's a brand new 20TB Ultrastar in the grand scheme of things? I'll keep the data for the next 3 months (since changing logos is a higher priority right now) until a proper fix is implemented.
Wasn't the recent test specifically tailored to an anticipated customer with a specific range of segment sizes?
A single disk has a ~200 IOPS limit. Since your 100Mbps connection was saturated and the node handled it well, each file sent must have been quite large, 125 kB or larger.
The median file size nodes hold today is 16 kB. If that wasn't a specifically tailored test, but just scaled-up existing customer traffic, your node would choke at about 10Mbps ingress.
It's simple math: median file size × IOPS ≈ max bandwidth.
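As a back-of-envelope helper, the relation above can be computed directly. The 200 IOPS figure and the 16 kB / 125 kB sizes are the assumptions from this post; the result is an upper bound, since a real piece write usually costs more than one disk operation (sync, metadata), which pulls the achievable rate further down.

```shell
#!/bin/sh
# Back-of-envelope: max ingress in Mbps = file size (bytes) x IOPS x 8 / 1e6.
# Treat the output as a rough upper bound, not a measurement.
max_mbps() {  # max_mbps FILE_SIZE_BYTES IOPS
  awk -v s="$1" -v i="$2" 'BEGIN { printf "%.1f\n", s * i * 8 / 1000000 }'
}

max_mbps 16000 200    # 16 kB median piece  -> 25.6 Mbps upper bound
max_mbps 125000 200   # 125 kB pieces       -> 200.0 Mbps upper bound
```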
If we delete the old trash manually, does that require running the used-space filewalker to update the databases?
Or is trash folder space updated by other walkers?
I keep USFW off, including lazy mode, so I need to know if I must enable it if I decide to clean the trash.