Wow, that's weird [drop on "Disk Space Used This Month" graph]

clapsyourhand · November 23, 2020, 7:44pm

[Screenshot: 2020-11-23 20_43_36-Node Dashboard]

SGC · November 24, 2020, 9:25am

looks like this explains why it’s going all crazy right now…
they are migrating the satellite cluster and the accounting system is being replaced… so no wonder it’s acting up. you can read about it in the new changelog

kalloritis · November 24, 2020, 1:06pm

Yea, saw that. Not happy about being part of an unclean migration, but I understand it can happen. I’m just not sure why there wasn’t more testing to develop a fix to mitigate this. I mean, I’d hope there’s a CI/CD setup specifically for making sure things aren’t breaking too badly.

SGC · November 24, 2020, 4:10pm

well it’s to fix the issue we were pondering… or that’s how i understand it…

and it’s not critical infrastructure for the nodes, meaning it really can’t go bad, from our perspective…
and when implemented maybe this disk space used this month graph will finally stop being all weird because the new system will not require the same level of computation that the old system did…

ofc it’s critical for tardigrade that there is no double spend… but even if there was for a short time… it doesn’t matter too much… ofc i suppose in theory the worst case would be tardigrade denying uploads if the system truly failed…

and the reason it’s affecting the space used … graph might just be because the cluster / satellites are doing a lot of processing right now to convert into the new system or whatever…

i dunno how it works… but it doesn’t worry me one bit… i cannot imagine this affecting SNOs much

i’m more worried about when to update… i skipped the last update because there was a good deal of people having issues with the new orders.db or whatever… in 1.16.1
so hopefully i won’t get into trouble when going from 1.15.3 to 1.17.1
and hopefully more SNO’s won’t suffer when updating from 1.16.1 to 1.17.1

kalloritis · November 24, 2020, 4:39pm

I must have dodged the bullet on that one. 1.15.3 to 1.16.1 went fine and I don’t imagine major issues going to 1.17.4. That being said, stranger things have happened.

SGC · November 24, 2020, 5:37pm

from my understanding it was only an issue that hit very old nodes maybe… something with some of their orders being from an old version… so in theory i should be okay to just update…
but saw more than a few affected on the forum… ofc that’s how it goes when updating software infrastructure or whatever one calls it… core processes, functions, thingamajigs…

most likely it won’t be a huge issue since we didn’t see a new patched release, something which i sort of expected… found that a bit surprising, but maybe there was no fix for it and if 98% of the nodes were updated… then the damage had already happened…

1.17.4 … i think i need to read up on version numbers again… why is it .4 and not .1
i mean we went from 1.15.1 to 1.15.3 (because of a mistake) then we went back to 1.16.1 and now we go to 1.17.4
either i can’t count or whoever is in control of these version numbers can’t…

kalloritis · November 24, 2020, 5:49pm

I believe there was a 1.17.1 for a period, but I think there’s merit to 1.17.4:

baker · November 24, 2020, 5:51pm

Here are the “missing” releases:

Not every version is released outside Storj’s own development.

Overall, version numbers are pretty arbitrary. Chrome is on version 87 for example. It’s whatever works for the devs.

SGC · November 24, 2020, 5:58pm

i just don’t really understand why publicly released versions wouldn’t be sequential. sure, it’s more or less completely irrelevant, and ofc there may be some advantage to using the same versions for development as for public releases…

i was just wondering if there was something i missed about understanding how to read the version numbers, i can ignore the last number … if it helps the devs
just have to remember to write 1.17.x instead
does that mean devs have x factor?

ifraixedes · November 25, 2020, 10:25am

This is about why we roll out different patch versions to storage nodes (SN) on each minor version.

Let me give some context about SN rollout.

SN rollouts take time because we release them in ~24-hour steps, each of which makes the new version applicable to an increasing percentage of storage nodes. Currently, the steps are: 5%, 10%, 20%, 40%, 80%, 100%. Weekends and official company holidays are excluded from this cadence, which means that we don’t promise to move to the next percentage step on those days, although we sometimes do; when a rollout isn’t finished and runs into any of those days, it continues the following business day.
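
As a rough sketch of the cadence described above, the following Python computes which business day each percentage step would land on. It assumes exactly one step per business day, skips only Saturdays and Sundays (company holidays are not modeled), and ignores the occasional weekend step; `rollout_schedule` and `next_business_day` are hypothetical names, not part of any Storj tooling.

```python
from datetime import date, timedelta

# Percentage steps quoted in the post above.
ROLLOUT_STEPS = [5, 10, 20, 40, 80, 100]


def next_business_day(day):
    """Return the first Monday-to-Friday date strictly after `day`."""
    day += timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day


def rollout_schedule(start, steps=ROLLOUT_STEPS):
    """Pair each rollout percentage with the business day it would land on.

    Assumption: one step per business day; holidays are not modeled.
    """
    day = start
    while day.weekday() >= 5:  # a rollout started on a weekend waits for Monday
        day += timedelta(days=1)
    schedule = []
    for percent in steps:
        schedule.append((day, percent))
        day = next_business_day(day)
    return schedule


if __name__ == "__main__":
    for day, percent in rollout_schedule(date(2020, 11, 23)):
        print(day.isoformat(), f"{percent}%")
```

Under these assumptions, a rollout starting on Monday, November 23, 2020 would reach the 80% step on Friday the 27th and the 100% step on Monday the 30th.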

The storage nodes which follow the cursor (the current rollout percentage) are the ones that use the storage node updater, which currently are the ones installed through our Windows and Linux installers.

The ones using the Docker images and Watchtower are updated when we publish the new Docker images and Watchtower executes its check. The Docker images are published right after we deploy the 100% step.

But, why do we only roll out some versions?

Because sometimes we need or want to release some Satellite updates without selecting all the SN changes that we want to ship in the next release, and other times there aren’t any new changes for the storage node, so we wait until we have some before creating a new release.

Considering that some Satellite updates are released within a pretty short period of time, sometimes just hours, if we started a new SN rollout for each version, the SN release might never catch up with the latest version because of the ~24-hour percentage steps described above.
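
The catch-up argument can be illustrated with a small model. It assumes, hypothetically, that every new release restarts the rollout at the first step and that steps are exactly 24 hours apart; `max_percent_reached` is an illustrative name, and the release intervals in the example are made up.

```python
# Percentage steps and step length quoted earlier in this post.
ROLLOUT_STEPS = [5, 10, 20, 40, 80, 100]
STEP_HOURS = 24


def max_percent_reached(release_interval_hours,
                        steps=ROLLOUT_STEPS,
                        step_hours=STEP_HOURS):
    """Highest rollout step reached before a newer release arrives.

    Assumption: step i starts i * step_hours after the rollout begins,
    and a newer release restarts the rollout from the first step.
    """
    steps_completed = int(release_interval_hours // step_hours)
    index = min(steps_completed, len(steps) - 1)
    return steps[index]


if __name__ == "__main__":
    for interval in (6, 50, 200):
        print(f"new release every {interval}h ->",
              f"rollout reaches {max_percent_reached(interval)}%")
```

In this model a rollout only reaches 100% when releases are spaced at least five step lengths apart, which is why rolling out every version would leave most nodes permanently behind.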

I hope that this comment clarifies why SN releases aren’t consecutive.

rmon · November 29, 2020, 4:10pm

My stats this month show this curious drop in storage usage.

How can it happen?
I haven’t had downtime. And if it were due to somebody deleting and reuploading, I would have seen the effect on bandwidth (and bandwidth usage hasn’t skyrocketed).

dada181 · November 29, 2020, 4:14pm

That’s what happened to everyone

SGC · November 29, 2020, 7:44pm

it doesn’t graph disk space used, it’s graphing satellite storage calculation speeds…
you can basically superimpose rmon’s graph on top of mine and see the exact same thing.
nearly… the only thing you can take away from this graph is the average it gives, it could be represented in a single line of the avg over the days and it would be more useful and accurate… lol
makes me sad and angry every time i look at it… it’s just mass broadcast confusion.

just look at total disk space used and then generally that’s about what you’ve had for a month… and what you will get paid for…
ofc the less data stored the less accurate it will be because it can change rather quickly, like in dada181’s case… because it’s a new node.
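
To illustrate the point about averages: a minimal sketch with made-up daily readings, where a dip on one day is followed by a catch-up spike on the next. The numbers and the `flat_average` helper are hypothetical, not taken from any node's dashboard.

```python
from statistics import mean


def flat_average(daily_values):
    """Replace each daily reading with the mean over the whole period."""
    avg = mean(daily_values)
    return [avg] * len(daily_values)


if __name__ == "__main__":
    # Hypothetical readings: day 3 dips, day 4 spikes to compensate.
    readings = [4.0, 4.0, 1.0, 7.0]
    print(flat_average(readings))
```

The dip and the spike cancel out in the mean, which is the single flat line the post suggests would be more useful than the jagged daily graph.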