Changelog v1.6.4

Storj has set it, but that is not a guarantee that we did it correctly. Please go ahead and check the monkit data.

This is not correct. For a double spend, the uplink needs 29 storage nodes to have dropped the serial. It is very unlikely that they all drop the same random serial number.

And if the SQLite DB is locked, we suspend the node? I thought that was the reason we moved away from it. I am a bit surprised that we now want to implement something that will suspend nodes again.

I’m not implying that we go back to the previous method. I am simply stating that deleting them creates the opportunity for SNOs not to be paid for the same download, as you pointed out (if they were deleted by oldest timestamp). The same applies if they were deleted randomly; it can still occur.

Does this mean the serial number is exactly the same for all 29 storage nodes?

1 Like

It wouldn’t matter. If at least 29 nodes haven’t forgotten about the previously used serial, whether they are the same or different, the download will fail. If the size of the mem pool is chosen well (and I’m sure some thought has gone into that), this would be a virtual impossibility. It’s a game of chance; sure, there is a possibility that 29 nodes will drop the serial for the same segment. But if a customer would fail 99 out of 100 double-spend attempts, it really isn’t worth it, especially if that means you only get to download the same thing twice in a short time span. We’re arguing about a really theoretical problem. If you’re worried about it, try to exploit it and see how far you get.
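Just to put a rough, purely illustrative number on it: if each of the 29 nodes holding the needed pieces had independently dropped a given serial with probability 0.5 (which is already absurdly pessimistic), the chance that all 29 dropped that same serial would be 0.5^29 ≈ 2 × 10^-9. With a reasonably sized mem pool the per-node drop probability is far lower than that, so the real odds shrink accordingly.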

5 Likes

I’ve got no idea what you guys are actually talking about lol… but it sure sounds like something that could shut down the entire network if it isn’t working correctly…

So maybe just remember to check that it won’t choke the entire system on some kind of failure…

I mean, security measures are good, but they can be dangerous for network reliability, while people being able to download stuff twice for 24 hours because something fails doesn’t seem that dangerous to me…

Of course, that might just be due to my lack of understanding of what this system/scheme actually does…

Just saying, keep in mind stuff will fail, and if it’s not something that can be exploited in the short term, it’s better that the system allows continued network operation and informs a maintenance crew that something is broken, even if some low-to-medium-level exploits become opened up…

Of course, one would have to determine what the exploits could actually be… Double unpaid bandwidth usage seems kinda low priority compared to maybe being able to get all 29 pieces of a customer’s data… block… which would be bad… even if it should require more energy than the sun puts out in 30 years to decrypt it on a perfect computer… but yeah… well, they would still be getting closer to abusing something :smiley:

1 Like

You can make your post better by following these:

  1. Storj Whitepaper V3

  2. GitHub - storj/storj: Ongoing Storj v3 development. Decentralized cloud object storage that is affordable, easy to use, private, and secure.

5 Likes

They can’t

To give the storage node more chances to submit them to the satellite.

But of course, the uplink can download twice, three times, and so on, with new orders (and new serials) each time.

It was an example of the duration of an exploit before it gets patched… not related to any existing timers in the code, which I don’t understand nor care to understand…

24 hours seems like a perfectly reasonable time to patch a critical bug…

I was just trying to say that it’s dangerous to put in code that can shut down a network if one aims for 99.99% uptime, which is what this sounded like…

It seems we interpret “shut down the network” a little bit differently. For me, “shut down the network” means being unable to upload to or download from the network.
This is not a change that can do that.
For other possible bugs in the release we have a slow rollout process, where we update nodes slowly.

It has already been helpful with

1 Like

I just understood the serials as being the control over what can be transmitted or not… sounds important for tracking stuff… which serials usually are… but whatever, sorry I commented.

The serial would exist only in one case: the download has happened. So, no customers are affected.
There is a possibility that all 29 nodes will drop the same serial number within the hour, at exactly the time a malicious uplink tries to cheat.
It’s virtually impossible.

6 Likes

And let’s not forget that the only prize at the end of this long tunnel of modding the uplink software to retry downloads with the same serial first, failing most of the time, wasting performance, causing delays, wasting bandwidth, etc., is… you get to download a tiny fraction of data a second time for free within 1 hour. But in order to maybe get a rare piece downloaded for free, you would have to try it for all of them, because there is no way for the customer to know which serials were dropped.

If you had workloads where you often need data more than once in an hour, it would be a million times cheaper to just implement a local cache and avoid having to download it again in the first place. It’s nearly impossible to exploit, and there is nothing to really gain from it.

It’s a non-existent problem.

3 Likes

It’s also important to add that the supposed issue of deleted serials only happens under two conditions:

  1. The node has greater than 1 MB of stored serials in the RAM database.
  2. The node reboots or restarts.

My node sees about 2000 serials per hour. The current 48 hour database is about 11 MB. So, a RAM database for 1 hour is very unlikely to exceed 1 MB.
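As a rough back-of-the-envelope check (assuming the history grows more or less linearly and the in-memory entries aren’t dramatically larger than the on-disk ones, which is an assumption on my part): 11 MB / 48 h ≈ 0.23 MB per hour, i.e. roughly a 4× margin under the 1 MB default for a node with this amount of traffic.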

Node reboots and restarts occur, but not often. I’ve automated about 3 restarts a month.

So, for the two possible conditions enabling a very unlikely double spend for no benefit to the cheater, one will almost never occur and the other will only occur very rarely.

However, it should be pointed out that if the RAM database were an order of magnitude smaller and a “ring” filter, as someone suggested, were applied to the timestamp expiration on the serials… cheating would be much more likely, since all 29 pieces would become available for double spend across the network in a predictable pattern. Thus the random selection of serials for deletion is significantly better than time-based deletion… if, and only if, the serial storage exceeds 1 MB in 1 hour.
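For anyone who finds the random-vs-oldest argument easier to see in code, here is a minimal toy sketch of the random-eviction idea (my own illustration, not the actual storagenode implementation): when the cap is reached, a random entry is evicted, so different nodes end up forgetting different serials, whereas oldest-first eviction would make every node under similar load forget the same serials at the same time.

```go
package main

import (
	"fmt"
	"math/rand"
)

// boundedSerials is a toy, capacity-limited set of serial numbers.
// When it is full, adding a new serial evicts a *random* existing one,
// so which serials survive differs from node to node.
type boundedSerials struct {
	max     int
	serials []string
}

func (b *boundedSerials) add(serial string) {
	if len(b.serials) >= b.max {
		victim := rand.Intn(len(b.serials))
		// overwrite the victim with the last element and shrink the slice
		b.serials[victim] = b.serials[len(b.serials)-1]
		b.serials = b.serials[:len(b.serials)-1]
	}
	b.serials = append(b.serials, serial)
}

// seen reports whether a serial is still remembered, i.e. a repeat
// download using it would be rejected.
func (b *boundedSerials) seen(serial string) bool {
	for _, s := range b.serials {
		if s == serial {
			return true
		}
	}
	return false
}

func main() {
	set := &boundedSerials{max: 3}
	for i := 0; i < 5; i++ {
		set.add(fmt.Sprintf("serial-%d", i))
	}
	fmt.Println(set.serials, set.seen("serial-0"))
}
```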

4 Likes

Thanks for joining the conversation, but I think you missed the point.
The issue mentioned is quite a technical one and I think without understanding the entire workings of the change we’re all just making assumptions.

Thanks for your response, but that wasn’t an assumption.

I also did state that I had no real clue about it. My only intention was to inspire people to think and to assume that everything will fail, and that predicting/assuming failure of all the individual parts, and engineering to expect this, equals future reliability.

One thing that doesn’t really make sense then is that little skunk said the serials were used to avoid exploits, but if they are used after the fact as a logging feature… then they shouldn’t be exploitable.

Maybe the prize, as you say, might be that if one can download a piece, say, 1 hour afterwards, then it might be possible for someone to collect all 29 pieces without being the actual owner of the pieces… Of course, they would then have the difficult task of decrypting the data, but the assumption that there is nothing gained by an exploit is a security failure in the making…

Sure, what one can imagine a malicious actor could do with an exploit might not be useful for anything, but when a malicious actor digs around enough, eventually they can collect enough exploits and figure out a new way to combine them into something useful… meaning a malicious exploit.

Let’s not start making things up now. The only “exploit” serials are meant to protect against is the same customer downloading the same segment again within an hour without paying for the second download. There is no way anyone else can get their hands on the data just because serials are dropped. The rights-management systems for that haven’t been touched with this change.

1 Like

I did a few tests with my node. If I reduce the serial cache to 50 KB, I can see that the monkit metric is working.

delete_random_serial,scope=storj.io/storj/storagenode/piecestore/usedserials rate=0.435235
delete_random_serial,scope=storj.io/storj/storagenode/piecestore/usedserials total=3440.000000

With a cache size of 100 KB I don’t see it anymore. I am running 6 nodes. 2 of them are unvetted and don’t get full traffic at the moment, but let’s assume the incoming traffic is split across 6 nodes. With about 600 KB I should be more than safe right now. The default is 1 MB.

However, I still recommend that you run this test on your own storage node. I would expect that the more data you are holding, the more serial numbers you will have to process. I have increased the serial number cache to 1 GB on my storage node just because I can.
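If you only want to watch this one metric instead of reading the whole dump, here is a rough sketch of how it could be pulled programmatically. It assumes the node’s debug endpoint is enabled (the debug.addr setting in the config, if I remember the key correctly), that 127.0.0.1:7777 is just a placeholder for whatever address you configured there, and that /mon/stats is the monkit stats page that produced the lines quoted above.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Placeholder address: use whatever you configured as the node's debug address.
	resp, err := http.Get("http://127.0.0.1:7777/mon/stats")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Print only the used-serials lines, e.g. the delete_random_serial rate/total.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.Contains(line, "delete_random_serial") {
			fmt.Println(line)
		}
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}
```

If it prints nothing while the node is handling traffic, the cache hasn’t had to drop any serials yet.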

2 Likes

It would be good to know roughly what the recommended ratio is for setting this value based on data stored. Or better yet, based on max available data in TBs. :+1:

Could you elaborate a little on what these 2 numbers are?

An indicator of whether or not your storage node has dropped a serial number because of memory limitations.