Updates on Test Data

I thought that, independent of specific potential customers, they were going to maintain 3 months of estimated capacity at all times going forward. So even if their sales pipeline contracted… they'd still keep 3 months' worth filled (just a lower amount). I hope so.

If you look at a 3-month customer-stored-data graph and squint a bit… is it fair to say paid data is increasing by around 1.5PB/month?

1 Like

I think the best answer for now is that SNOs should expect data flows similar to what they have seen these past few weeks. There may be starts and stops like before. I don't think there is an exact upper target for SLC data right now. However, overall usage across all Sats should remain comparable to recent activity.

2 Likes

It’s almost 2 months old now, but there were targets:

It doesn't look like SLC has grown enough since that post for us to have filled all that space… so capacity-reservation data is probably still growing?

It depends. That data is meant as reserved capacity. If customers start to utilize some of that capacity, there will be less on SLC and more on the other Sats. But this is all fluid, as it depends on customer trends. We want the space to be available to the customers, not used up because we are holding it on SLC. So, if a customer is ready to utilize a chunk of that capacity, we have to have it available, and thus it gets released. The exact timing of when a customer uses that capacity isn't up to us. We are just making sure the capacity is there.

And, of course, the SLC data could go up if more reserved capacity is asked for, and it can go down as it gets released to be used. Thus it is fluid, and definitive pins on how much capacity will be in one place or another are not guaranteed.

3 Likes

When you say “throughput”, are we talking IOPS or bandwidth? I’m wondering whether I’ll need more HDD space, bandwidth, or IOPS in the future.

Replying to myself, as the math is confusing me. If the targets add up to 30PB of capacity-reservation… then to maintain it SLC would have to be uploading 1PB/day steady-state to keep up with 30-day-TTL data expiring… wouldn’t it?

If a 1Gbps Internet connection can move around 10TB/day… then SLC would have to fill a 100Gbps connection 24x7. I can’t imagine how much Storj is paying for that class of bandwidth.

Or did I add it up wrong? If targets are quarterly… wouldn’t it make sense to have a 90-day TTL… so the required upload rate would be 1/3rd? (You can still force-delete TTL files if you need the space back sooner).
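Working that through (a rough back-of-envelope, assuming the 30PB figure and decimal units, not official numbers):

```go
package main

import "fmt"

func main() {
	// Assumed figures from above: 30 PB of capacity-reservation data,
	// all carrying a 30-day TTL, replaced at steady state.
	const reservedPB = 30.0
	const ttlDays = 30.0
	ingestPBPerDay := reservedPB / ttlDays // 1 PB/day expires and must be re-uploaded
	fmt.Printf("steady-state ingest: %.1f PB/day\n", ingestPBPerDay)

	// A saturated 1 Gbps link moves 0.125 GB/s * 86400 s ≈ 10.8 TB/day.
	tbPerDayPerGbps := 0.125 * 86400.0 / 1000.0
	fmt.Printf("link required: ~%.0f Gbps sustained\n",
		ingestPBPerDay*1000.0/tbPerDayPerGbps) // ≈ 93 Gbps

	// A 90-day TTL on the same reserved capacity needs a third of that rate.
	fmt.Printf("with a 90-day TTL: %.2f PB/day\n", reservedPB/90.0)
}
```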

(Edit: Also, I want a pic of SLC’s upload hardware: it must be beefy!)

(Edit #2: Is this worse than I think… because to have space to hold 30PB of customer files… you’d have to upload around 2.7x that amount to the network? So 2.7PB SLC uploads/day?)
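On that expansion factor: Storj erasure-codes each segment, so the network-side upload is larger than the customer-side data. A quick check against the commonly cited 29-of-80 Reed-Solomon figures (the exact parameters are satellite configuration, so treat this as an estimate):

```go
package main

import "fmt"

func main() {
	// Commonly cited Reed-Solomon setup: any k=29 of n=80 stored pieces
	// reconstruct a segment (actual satellite settings may differ).
	const k, n = 29.0, 80.0
	expansion := n / k
	fmt.Printf("expansion factor: %.2fx\n", expansion)             // ≈ 2.76x
	fmt.Printf("network-side ingest: %.1f PB/day\n", 1.0*expansion) // per 1 PB/day of customer data
}
```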

(Edit #3: I guess I don’t need to know these answers. I’m content if I see my HDD fill a bit more every month… :money_mouth_face: )

I don’t think they can say to clients: “Hello, you want 10 PB of storage? OK, it will be available to you after 3 months, because of our test TTL.” :)

I think it is still possible to delete data without waiting for the TTL to expire.

1 Like

Yes, of course. It will be collected by the garbage collector; that should take about two weeks.

There are three points:

  1. Test the throughput
  2. Keep space used
  3. The test data should be auto-removed, without a garbage-collector pass or time waiting in the trash

Thus 90 days is not better, because it cannot help to test throughput if the node is full. Also, these customers have a usage pattern similar to what we are emulating with Saltlake, including segment sizes and levels of TTLs (of course, this is an estimate based on the information we have about how they plan to use the Storj network). Therefore, it is important to conduct the most accurate testing possible.
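For reference, the TTL is attached at upload time, so pieces simply expire on the nodes with no garbage-collection round trip. A minimal sketch with the storj.io/uplink Go library (the access grant, bucket, and key are placeholders, and the bucket is assumed to already exist):

```go
package main

import (
	"context"
	"log"
	"time"

	"storj.io/uplink"
)

func main() {
	ctx := context.Background()

	// Placeholder access grant; a real one comes from the satellite UI or CLI.
	access, err := uplink.ParseAccess("ACCESS_GRANT")
	if err != nil {
		log.Fatal(err)
	}
	project, err := uplink.OpenProject(ctx, access)
	if err != nil {
		log.Fatal(err)
	}
	defer project.Close()

	// Expires makes this TTL data: nodes drop the pieces once the
	// deadline passes, without waiting for garbage collection or trash.
	upload, err := project.UploadObject(ctx, "demo-bucket", "test-object",
		&uplink.UploadOptions{Expires: time.Now().Add(30 * 24 * time.Hour)})
	if err != nil {
		log.Fatal(err)
	}
	if _, err := upload.Write([]byte("test payload")); err != nil {
		log.Fatal(err)
	}
	if err := upload.Commit(); err != nil {
		log.Fatal(err)
	}
}
```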

And it has already allowed us to optimize our code for higher throughput than before; it has also helped to reveal several bugs and fix some of them.

6 Likes

Hello. It’s been almost 24 hours since test data ingress paused. I can see that SLC data stored is slowly going down.
Any updates?

As of right now I don’t have an update. My hope is on Monday I can provide an update.

4 Likes

That’s actually a good thing: It means TTL deletions are working! :stuck_out_tongue_winking_eye:

Some just don’t get that space back… but help is on the way!

1 Like

Not gonna lie, it’s more exciting when the test data is coming in fast and filling up drives and breaking things and causing errors.

8 Likes

Thank you. This is all I need. My main goal was to make sure it was not just my nodes on pause. :slight_smile:

1 Like

Taking a look at one of the big SNOs: it looks like they lose 15-20TB per day when SLC isn’t firing out data. Imagine emptying an entire HDD every day, yikes! :open_mouth:

1 Like

When ingress suddenly drops 97%, I always think there’s a problem with my node.

2 Likes

The good thing about having multiple nodes in multiple countries is that when ingress drops in all of them at the same time you’re a bit more relaxed… :slight_smile:

2 Likes

Do not panic. This is a usual day in distributed storage with unpredictable usage…
Just imagine: our storage is used by real humans. Do you still expect predictable results? Even for a day…

1 Like

None of my nodes is showing such a large drop; it’s more like 30-40%. :thinking: