Great post @Dominick!
I’m wondering though, why are these two separate settings? Couldn’t you just have 1 setting to determine the number of transfers and let uplink figure out whether it should transfer segments from one file or multiple files? This would prevent the end user from having to do that balancing game you describe and would also deal more flexibly with varying file sizes in a single cp operation.
I can imagine cases where a user might have a preference, but I suspect it doesn’t matter in the general case.
I think it could be more automatic than that, though. Maybe an option to limit RAM and have it run as many up-to-64MB segment transfers as fit at the same time? Or a cap on the maximum number of concurrent segments (letting it move sequentially through the files as needed to max out the segment count). A 100GB file might consume all 48 slots; as it gets down to its last 64MB segments it would leave some slots free, which could start transferring the next file.
This assumes it can dynamically assign parallelism and transfers, which may not be possible. But even if not, it could hold off on new files (effectively putting some of the --transfers into a waiting state) whenever the target number of in-flight segments is reached, letting the count bounce around the target. That should always beat numbers that are consistently too high or too low on a mixed set of files.
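The idea above can be sketched in a few lines. This is purely illustrative pseudologic, not how uplink is actually implemented; all names (`MAX_SEGMENTS`, `transfer_segment`, the file list) are hypothetical. One global slot cap governs in-flight segments, a big file can occupy every slot, and slots freed by its tail segments immediately go to the next file:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

SEGMENT_MB = 64          # max segment size used by the network
MAX_SEGMENTS = 48        # the single knob: global in-flight segment cap

slots = threading.Semaphore(MAX_SEGMENTS)
done = []                # completed (file, segment) pairs, for illustration

def transfer_segment(name, idx):
    with slots:                       # wait for a free global slot
        done.append((name, idx))      # ...real code would upload here

def transfer_file(pool, name, size_mb):
    n_segments = -(-size_mb // SEGMENT_MB)   # ceil division
    return [pool.submit(transfer_segment, name, i)
            for i in range(n_segments)]

# Files are queued in order; a 100GB file can hog all 48 slots, and as
# its last segments finish, freed slots start segments of the next file.
files = [("big.bin", 1024), ("small1.bin", 10), ("small2.bin", 10)]
with ThreadPoolExecutor(max_workers=MAX_SEGMENTS) as pool:
    for name, size in files:
        transfer_file(pool, name, size)

print(len(done))  # 18 segments total: 16 + 1 + 1
```

The point is that the user tunes one number (the slot cap) and the scheduler decides the file/segment split on its own.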
I considered suggesting this, but it could mean many thousands of parallel transfers if the files are small, which could cause connection issues. You'd probably need to set both RAM and transfer limits then.
That’s exactly what I was suggesting. I don’t understand why the user now has to set separate limits. Whether the 48 segments come from 1 file, 2 files or 48 files has pretty much the same impact on RAM, CPU and network connections. This really should just be done automatically, especially since any combination can cause peaks now. If you set both settings to 8, concurrency fluctuates between 8 and 64 segment transfers depending on file sizes. Yet if your resources can handle 64, it should always be using 64.
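The back-of-the-envelope arithmetic behind this (rough numbers, not uplink internals): 48 in-flight segments cost roughly the same buffer memory no matter how they are split across files, while the two-knob setup makes concurrency swing with file size.

```python
SEGMENT_MB = 64

# 48 concurrent segments need roughly the same buffer memory
# whether they belong to 1 file or 48 files:
print(48 * SEGMENT_MB)        # 3072 MB of segment buffers either way

# With transfers=8 and parallelism=8, concurrency depends on file size:
transfers, parallelism = 8, 8
small_files = transfers * 1              # files < 64MB: 1 segment each -> 8
large_files = transfers * parallelism    # big files: 8 segments each -> 64
print(small_files, large_files)          # fluctuates between 8 and 64
```

So a workload mixing small and large files bounces between 8 and 64 concurrent segments, even though the resource cost per segment is constant.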
It may not be how it’s built currently, but the uplink determines which transfers to start, so it could definitely be made to work that way. What you described after that sentence is, I believe, how it already works if you only set the transfers parameter.
I retract my previous reply. Your statement is NOT true, at least according to the Storj white paper, page 11. And I quote:
“It should be economically advantageous to be a storage node operator not only by utilizing underused capacity but also by creating new capacity, so that we can grow the network beyond the capacity that currently exists.”
The purpose of Storj is not just for existing spare capacity but also NEW capacity. Node operators should then be incentivized to invest in new hardware. Meaning: the new investment should yield a fair amount of ROI.
As an aside, another quote:
“… it is required that storage node operators have sufficient incentive to maintain reliable and continuous connections to the network.”
Thus my description of preferable node operators stands:
“c. fairly paid decentralized operators who are incentivized to invest for and maintain a certain quality of service, and NOT to fake decentralization”
If Storj stays true to its own white paper, it must make changes to ensure that node operators are incentivized to make NEW investments.
New capacity can be added with hardware that is unused, or used for something else but online anyway. For example: mining rigs, game servers, home servers, even rented servers if they are not fully utilized. Even existing operators can just add another disk that would otherwise collect dust on a shelf.
So, new capacity ≠ buy hardware.
“added with more unused” still falls under the “underused” category. I think your interpretation is not at all what the white paper meant by “creating new capacity” and “be economically advantageous”.
already paid for and owned → underused
Also, at the current rate it takes years to fill a 4TB HDD, whose average life is just 3-4 years. You earn a few dollars, if not cents, a month on average. How “economically advantageous” is that?
It’s more than zero anyway. If you did not buy dedicated hardware and did not invest to keep it online 24/7, any income is profit.
The white paper says nothing about investments, so creating new capacity literally means adding free space and bandwidth to the network and being rewarded for it. It does not necessarily mean buying new hardware; you can use used, refurbished, or partially repurposed hardware.
The fact is, you can even run a storagenode on a router: Running node on OpenWRT router?
@Alexey Nah… I disagree.
But, hey, thank you for being active and resourceful.
Wow… none of that is even close to true. It’d take about 8 months to fill up 4TB, and most of that is because you lose months to vetting. This month my ingress was almost 0.9TB on a vetted node (not representative of an average month yet, but it is increasing over time). So no, it won’t take years.
Modern HDDs have an annual failure rate of about 1-2% for the first 5 years. It goes up slowly after that, sure, but the average lifespan of an HDD is well over those 5 years. Most of mine last over 10 years, and I’m even running 3 that are now over 15 years old. So no, not a lifespan of 3-4 years.
With 4TB filled you earn about $13 a month. Not cents. The first year a lot of time is spent filling up the HDD, so you make a total of about $80, with subsequent years earning you roughly $160 per year. Buying a 4TB HDD costs about $80. So you have earned your costs back in a single year. It is currently economically viable to expand as long as you don’t have additional running costs (hardware already online anyway). Even if the HDD only lasts 5 years, that’s an ROI of 900%.
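For anyone who wants to plug in their own rates, the arithmetic behind these numbers works out like this (the per-month figure is the poster's estimate, not an official rate, and the yearly total is rounded in the post):

```python
hdd_cost = 80            # 4TB drive, USD (poster's estimate)
monthly_when_full = 13   # USD/month for a full 4TB node (poster's estimate)

year1 = 80                              # mostly spent filling the drive
later_years = 12 * monthly_when_full    # 156, rounded to ~160 in the post

total_5y = year1 + 4 * later_years
print(total_5y)                  # 704 USD over 5 years
print(total_5y / hdd_cost)       # ~8.8x the hardware cost, i.e. ~900% ROI
```

The conclusion holds even with pessimistic tweaks: halving the monthly rate still returns several times the cost of the drive over 5 years.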
Well, us long-standing node operators have been making new investments… we definitely wouldn’t have started off with 100TB of space on Storj… probably started with 4TB or at best 8TB running a node… then as it gets nearly filled, we either find another underused HDD or buy a new one to provide more disk space.
I would say we are pretty well incentivized to invest with a long-term view…
It won’t take that long to fill 4TB currently… probably 1.5 years at most (a pretty pessimistic estimate actually) on a single IP… ROI on a brand-new HDD clears within 1.5 years, on a second-hand HDD within a year or less… taking bad disks etc. into consideration, 2 years ain’t that bad for a 100% return?
The bigger the initial sizing, the longer the ROI… you can’t do a big bang with Storj, even if you can get fire-sale HDDs from Chia farms shutting down… probably easier to do with crypto mining, where assets can be utilized immediately.
Yeah, I failed to recognize that. Your 0.9TB/month reconciles with @Knowledge’s comment.
This does not reconcile well with what I gather. Survival rate starts to dive in year 4. See this chart from Backblaze:
But, yeah, individual cases may differ.
So income from usage and egress is approximately 50/50? Perhaps my egress estimate was overly pessimistic.
Wow… now future looks a lot brighter. Thanks for sharing.
Yeah, I very likely was wrong. People say it should be ~1TB after vetting.
Thanks, this one is useful. I was mostly quoting Backblaze stats too, but they generally don’t use HDDs longer than 5 years; this image makes it pretty clear why. They probably don’t have a lot of stats beyond 5 years, though. Even at the end of the displayed 6 years, 2 out of 3 HDDs will have survived. That’s a little worse than my own anecdotal experience, but still plenty for a decent ROI.
The problem isn’t really whether an investment into buying an HDD is worth it. It’s more that you can’t just buy hundreds and assume to fill them up.
how long have you been waiting for your investment to pay off?
My investment has already paid off; I have been operating for around 3 years. I make new investments from my Storj income.
Still trying to fill my 10TB and I am well over my first year on dedicated Ubuntu hardware and gigabit fiber.
How full is it in that time?
Well, Storj Labs’ assumptions are wrong once again… maybe some SNOs have old disks lying around, but I don’t. I only use laptops with 500GB-1TB SSDs. And what are those old HDDs, 3-4TB? You fill those up really quickly, will run out of old HDDs in 1-2 years, and will run out of SATA and USB slots too. I think all committed SNOs end up investing in new hardware. Buying second-hand HDDs is not the way for me; they are too expensive and pretty worn out. The cheapest HDDs I could find in $/TB with good reliability that I can use in my machines are Exos 16TB.
Storj Labs expects expansion and at the same time no investment in new hardware. That’s pretty much impossible, and they should stop highlighting it.
If you want to do it as a hobby, yeah, use what you have, an old HDD, see what being an SNO is all about, have some fun. If you want to make money, you need many IP addresses, many good HDDs, and patience. The ROI is not great but is steadily increasing. In the long run the earnings per TB are decreasing, obviously, as are the price per TB of new hardware and the price per TB of cloud storage, and Storj is obligated to align itself with the general trend. So you will have to expand and buy bigger HDDs. Old bigger HDDs are not a thing.
I wouldn’t pay too much attention to data centers’ data about HDD reliability. I don’t know in what conditions they are used, at what temps, and for how long. I imagine they sit in pretty tight racks and the temps are over maybe 40-50°C most of the time. But maybe I’m wrong. As a home user, I can keep my HDD temps lower, close to 30-35°C. The workload is also smaller than a data center’s.
So as a conclusion, if you want to make money, investing in new hardware and bigger HDDs is the way to go. You can rely on your machines for more than 10 years. To make it a business? That depends on your investment funds, your expectations, and the resources available at your location. Starting a business on second-hand hardware is not a good business plan. For me, scaling up is not a big problem; my ISP can install as many fiber optic lines as I want at a reasonable price, so the number of IPs and the available bandwidth is not a problem. And I have more than one location. I don’t have the money to make it bigger for now, though. 7 machines in 7 locations are enough for the moment.
It doesn’t have to be old disks; it just needs to be unused capacity, like in your NAS, mining rig, game server, etc., on hardware that is online anyway.
The idea is similar to Airbnb, but for unused bandwidth and space.
Personal anecdote: I’ve been buying used HDDs exclusively for close to a decade now. The key is to buy from trustworthy vendors on a safe, buyer-friendly platform (read: eBay), from listings that post either individual drive SMART reports or other information about the disks’ origin. There are a lot of surplus stores selling datacenter drives; those drives are often replaced due to upgrades, not because they reached end of life.
Some disks may in fact still be under warranty, and either way you get a disk that is already past its first few months of successful operation and is thus vetted.
If my storage needs increase, I will add space. It makes sense to add more space than necessary to cover the immediate needs, and that extra space can be rented to Storj until my own demand catches up, at which point I’ll add more space. Rinse, repeat. As a result both Storj and I get more space, at the expense of buying slightly bigger drives than needed. With second-hand drives this cost difference is minor.
It mostly does not matter. These disks are so cheap that even if every third one fails within a month, you still save money.
I don’t bother babysitting it. My NAS is on the patio (it’s noisy, and I live in California with a mild climate) and has been for the past 5 years. Drive temperature fluctuates up and down 15°C daily and between 10°C and 55°C across seasons, according to SMART plots. Still within the datasheet. I haven’t noticed any adverse effects, and all the disks this NAS started with are still spinning (a mix of used 8TB WD and Seagate drives from eBay).
So, I would not worry about it.
The point being: I run this NAS anyway, and running Storj on the unused space costs me absolutely nothing. I add drives when I need more space, and adding slightly larger drives to grow free space faster is not much different. So it seems to work as designed: Airbnb for drives.