Updates on Test Data

This is why we usually recommend using a specialized VPN solution with port forwarding instead of a “free” VM+VPN. However, you had a new way to solve the CGNAT issue… for a while.

1 Like

I would if I could, but I can’t. At that location I can choose any available ISP out of one. Not my house, not my problem. But $2/month for AirVPN won’t break the bank.

At my house I do have a public IP and run two nodes that share it.

It’s all fun and games, of course. Across all my nodes I now get $50/month — a silly amount of money to bother with if it weren’t fun.

Basically, it’s a way to convince friends to run a NAS at their house (hey, Storj will offset the electricity cost) so we can back up to each other. A win-win. Storj is just a facilitator to get them to agree to start doing backups.

3 Likes

So when the switch to the actual customer takes place, will they be deleting all their data every couple of weeks and continuously uploading new data — essentially using Storj as a live backup?

I.e., will they only be constantly uploading (which we don’t get paid for at all)?

Or is this load testing just to verify that they can move their whole dataset over in a timely manner?

If the use case is going to change to huge ingress with a low TTL on the data, requiring huge bandwidth usage just to maintain stored data, then large nodes with lots of storage are never going to fill up. Those nodes get paid significantly less while also having higher costs, since they need faster internet to keep the nodes even partially full.

Now, if they are going to upload a shitload of data and then use it for a CDN or something that actually produces a lot of egress, that is fine, because egress is paid bandwidth, so we can use that payment to cover any bandwidth costs.

I understand that unlimited free ingress is a good selling point for the customer. However, it biases your service towards backup use cases, because there is no penalty for hammering the network constantly.
If there were some charge for ingress as well, you would get a higher proportion of customers that either upload once and download many times (CDN-like) or upload once and leave the data for a long time (archival).
I don’t think we want to see anyone uploading data many times and deleting it quickly (hot backups).

3 Likes

I’m assuming that everyone doing backups will be going by the established backup rules:

  1. One daily backup
  2. Keep one backup per week
  3. Keep one backup per month
  4. Keep one backup per year

no?
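That grandfather-father-son style rotation can be sketched roughly like this (the tier boundaries and the Sunday/first-of-month choices are my assumptions for illustration, not any particular backup tool’s defaults):

```python
from datetime import date, timedelta

def keep(backup_date: date, today: date) -> bool:
    """Grandfather-father-son style retention: keep dailies for a week,
    weeklies (Sunday backups) for a month, monthlies (first of the month)
    for a year, and yearlies (January 1st) indefinitely."""
    age = (today - backup_date).days
    if age < 7:
        return True                                   # daily tier
    if age < 31 and backup_date.weekday() == 6:
        return True                                   # weekly tier
    if age < 366 and backup_date.day == 1:
        return True                                   # monthly tier
    return backup_date.month == 1 and backup_date.day == 1  # yearly tier

# Count how many of the last 400 daily backups survive this policy
today = date(2024, 6, 1)
kept = [today - timedelta(days=d) for d in range(400)
        if keep(today - timedelta(days=d), today)]
print(len(kept), "backups retained out of 400 dailies")
```

The point for the network is that most of those dailies are deleted within days, while a thin tail of weeklies and monthlies sticks around.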

There are nodes for every purpose among us, don’t worry.
For me? That’s perfect, because like many people I have high download and rather low upload bandwidth.
I want to store as much as possible and send not so much.

At some point customers will saturate our HDDs, and then for every, say, day-90 file being deleted, a new day-0 upload comes in. So I think occupied space will settle at a fairly static value after that 90-day window. Nodes with faster download will fill their HDDs sooner, at the expense of the slowest nodes.
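The steady-state intuition can be put into numbers; a back-of-envelope sketch (the ingress figure here is an assumed example, not a real measurement):

```python
# With a fixed TTL, stored data plateaus: once the node has been ingesting
# for ttl_days, every new upload is roughly offset by an expiring one, so
# occupancy settles near ingress_rate * TTL.
def steady_state_tb(ingress_tb_per_day: float, ttl_days: int) -> float:
    return ingress_tb_per_day * ttl_days

print(steady_state_tb(0.1, 90))  # 0.1 TB/day with a 90-day TTL -> 9.0 TB
```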

And that’s positive again: it motivates SNOs to upgrade their internet plans, because with a higher plan the upload speed also increases, so it’s a double win for storJAY ;3
Or a triple win — ISPs should also like it if customers want to upgrade.

One more thing: there is a “penalty” for hammering us with ingress — the file lands on the HDD, and it costs money to keep it there ;>

and

I think some SNOs would rub their hands if their nodes can keep up better than others’.

1 Like

Backups of the storage node are useless: as soon as you bring it back online, it will be disqualified for the data lost since the backup was made.

1 Like

That’s not what I said. Backups from the client’s POV.

Ok, now I get it.
You are referring to:

Yes, I think the customers could have such a usage pattern too.

I think you haven’t had your morning coffee yet :slight_smile:

The client using Storj for their offsite backup — the logical continuation of the comment above mine.

1 Like

I would like to suggest clicking the “reply” button on the comment you are replying to, so the chain of thought is clear to other people. Discourse sometimes does not show it, but it helps.

It shows at the top right, like :point_down:

I think you may have clicked the “reply” button to the thread instead.

4 Likes

6 posts were split to a new topic: The link to the original topic sometimes not shown

That wouldn’t necessarily be profitable for Storj Labs either if we weren’t retaining data or getting egress.

Imagine a backup that you make daily. At the end of the week, you keep one of the daily backups as a weekly and delete the rest. After some weeks, you keep a monthly and delete the weekly.

Retention is determined by the customer’s needs. You get paid for the data you are storing; some you store for a short period, some you store longer. Nothing has changed here.

Now, imagine the backup software is run by individuals and not a central server at a business somewhere. You could have thousands of people in thousands of locations all moving data in and setting their own retention times simultaneously. Imagine most of these people backup at night around the same time. Thousands of backups at once.

Before anyone extrapolates this into questions about which backup software and who the individuals are: this is just an example. There are also multiple potential large customers, and they all have different needs. LittleSkunk and the team are looking at the possibilities if everyone signs up and how that will impact the network. And there will be more customers after that, and after that, and so on…

So, if bandwidth is an issue you should look into structuring your setup so it isn’t. The data will flow…

4 Likes

Good point. I’m wondering how often, if ever, does that happen. Most of their servers are half-utilized, according to their stats.

Just measured: I’m getting at least 200 Mbps down and 30 Mbps up (close to the max of the connection there).

On the other hand, they have some servers with higher guaranteed bandwidth, and it’s probably always possible to hop around and pick the one less loaded.

1 Like

This may have been answered, but what is the deal with the test data and its removal? It sounds like it is getting written to my node and then deleted shortly after, just to verify what I can handle. Is this data sticking around long enough for me to get paid for it? I don’t mind the new average of 15–18 TB of data a month, as long as I am getting paid. But if it’s getting removed before data usage payouts are determined, I would rather not needlessly write a couple hundred TB a year just for the sake of endless free testing.

2 Likes

The TBm part of the node dashboard comes from the satellite, doesn’t it? If so, then if your TBm is going up, you’re getting paid for the data.

It’s right above your post.

It is not going up anywhere near the rate my ingress would suggest. I will keep a closer eye on it. It’s hard to tell, as I have a lot of data coming in, but my trash is also growing at a steady clip. I just can’t tell if that’s this test data (I sort of assumed that when the test data expires it just disappears and does not go to trash) or if it’s cleanup finally happening with the lazy filewalker, trying to resolve my 5 TB disk usage discrepancy.

Yes, but…
Holy s**t, my two nodes are getting 1.2 TB together in just two days (same IP). My provider will not be happy about that…
For myself, on the other hand: bring it on!
And while doing defragmentation! One is sweating heavily (the mini-PC), the other is rather bored (the gaming PC) — except for the Storj drive, of course.

It wasn’t really clear in the opening discussion whether the data was all being cycled with the same time to live. I got the impression it was, which would mean that if the TTL were less than a month, the data would be removed before I got paid for it. From the quote you referenced, it sounds like all data has a random TTL?

If you store the data for a day, you still get paid 1/30 of the monthly storage rate for it (assuming a month has 30 days). It doesn’t have to be stored for a whole month to be paid.
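A worked example of that proration (the $1.50 per TB-month storage rate here is just an assumed illustration, not the actual payout rate):

```python
def prorated_payout(tb_stored: float, days_stored: float,
                    rate_per_tb_month: float = 1.5,
                    days_in_month: int = 30) -> float:
    """Storage is paid per TB-month, so short-lived data earns a prorated share."""
    return tb_stored * (days_stored / days_in_month) * rate_per_tb_month

print(round(prorated_payout(1.0, 1), 4))   # 1 TB held for one day  -> 0.05
print(round(prorated_payout(1.0, 30), 4))  # 1 TB held a full month -> 1.5
```

So even data with a one-day TTL isn’t stored for free; it just earns a small fraction of what a full month would.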

1 Like