Changelog v1.6.4

> For docker nodes? For update?

To a short question asked without understanding the background, I will give a short answer: Yes.

The long answer would be: We haven't published v1.6.3 on Docker, so there is no need to push out v1.6.4 right now. The number of Windows nodes that have to deal with the v1.6.3 bug is relatively low. It could get worse with v1.6.4. → Slow rollout even for the bugfix. Maybe not as slow as usual, but in the end I don't see a reason to update the v1.6.3 nodes to v1.6.4 within a few minutes. We shouldn't rush it; we should make sure the bugfix is working as expected.


For the phased rollout it might be good to stick with the same seed and just reset the cursor, so affected nodes on 1.6.3 get this version first.

I take it, it will randomly start dropping "used" serial numbers, rather than new serial numbers that haven't expired yet within the 1 hr.

Ideally these serial numbers should be dropped oldest-timestamp-first, rather than at random.

Nope, for two reasons:

  1. Tracking the timestamp requires a lot of additional memory.
  2. If you drop the oldest timestamp first, then I can cheat you and download my file twice without having to pay for it. If you drop a random serial number, that might be a serial number from 10 minutes ago, but from my perspective it is very unlikely that 29 storage nodes dropped the same serial number. Even if you dropped it, I can't abuse it.
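The random-drop behavior described above can be sketched as a capped in-memory set that evicts an arbitrary entry when full. This is a minimal illustration only, not Storj's actual implementation; all names here are made up:

```go
package main

import "fmt"

// SerialTracker is a hypothetical sketch of a seen-serials set with a fixed
// capacity. It stores no timestamps, keeping per-serial memory minimal.
type SerialTracker struct {
	capacity int
	seen     map[string]struct{}
}

func NewSerialTracker(capacity int) *SerialTracker {
	return &SerialTracker{capacity: capacity, seen: make(map[string]struct{})}
}

// Add records a serial; it returns false if the serial was already seen,
// i.e. a replayed (double-spend) attempt.
func (t *SerialTracker) Add(serial string) bool {
	if _, ok := t.seen[serial]; ok {
		return false
	}
	if len(t.seen) >= t.capacity {
		// Evict an arbitrary entry. Go randomizes map iteration order, so
		// taking the first key of a range behaves like a random drop.
		for k := range t.seen {
			delete(t.seen, k)
			break
		}
	}
	t.seen[serial] = struct{}{}
	return true
}

func main() {
	t := NewSerialTracker(3)
	fmt.Println(t.Add("a")) // true
	fmt.Println(t.Add("a")) // false: duplicate rejected
	t.Add("b")
	t.Add("c")
	t.Add("d")               // capacity exceeded: one arbitrary serial evicted
	fmt.Println(len(t.seen)) // 3
}
```

Because the evicted serial is unpredictable, an attacker replaying an order cannot know which nodes have forgotten it.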

I'm in the middle of a graceful exit on one of my Windows nodes. I stopped the update service until the process ends. Right move?

You should be able to update just fine. GE will resume after the update is done. If you're on 1.6.3 I think it's probably better to get that update, as there is a known issue in that one. If you're on 1.5.2 you may as well finish the GE on that version.


I'm on 1.6.3 :frowning:
So, should I just keep the update service running?
Are we sure it won't break the GE process? I have quite an amount in held…

I'm sure that Storjlings have said in the past that a restart is fine and the process will proceed as long as you don't stay offline too long. That's good enough for me, but I haven't done a code review on this. You'll have to decide for yourself whether you want to rely on that. But the alternative is sticking with a version that has caused fatal errors on some nodes and took them offline. So that doesn't sound like a great idea either. I would take the update when it's available, but perhaps @littleskunk can advise on this.

  1. Wouldn't a ring buffer automatically drop the oldest orders without tracking any timestamps?
  2. If the oldest timestamp is dropped, in what situation can you cheat me and download files twice that isn't covered by the unlikelihood of 29 storagenodes all dropping the same orders? (Because then there must be a situation where all 29 storagenodes can't submit the orders and all drop the same ones. The only possibility I can think of is a satellite downtime of more than 1 hour.)
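For what it's worth, point 1 is true in the narrow sense: a ring buffer does drop the oldest entry without storing timestamps, because age is implicit in the write position. A minimal sketch of that idea (purely illustrative, not the storagenode's code):

```go
package main

import "fmt"

// ringBuffer overwrites the oldest serial when full; no timestamps are
// stored, since the slot about to be overwritten is always the oldest.
type ringBuffer struct {
	buf  []string
	next int
	full bool
}

func newRingBuffer(size int) *ringBuffer {
	return &ringBuffer{buf: make([]string, size)}
}

// add stores a serial and returns the serial it evicted, if any.
func (r *ringBuffer) add(serial string) (evicted string) {
	if r.full {
		evicted = r.buf[r.next] // the oldest entry is overwritten
	}
	r.buf[r.next] = serial
	r.next = (r.next + 1) % len(r.buf)
	if r.next == 0 {
		r.full = true
	}
	return evicted
}

func main() {
	r := newRingBuffer(3)
	for _, s := range []string{"s1", "s2", "s3", "s4"} {
		if old := r.add(s); old != "" {
			fmt.Println("evicted oldest:", old) // prints "evicted oldest: s1"
		}
	}
}
```

The catch is that this eviction order is fully predictable, which is exactly the property that makes oldest-first exploitable for a replay.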

serial numbers != orders

OK, I guess I may not have understood what serial numbers and orders are about in that context. So I'll just leave it at that.

Should I just leave the updater service running on my Windows node that's running a GE?

Could you please explain what serial numbers are and how they work?
I think a lot of the questions in this thread would be gone after that…

My understanding is that:

  1. Serial Numbers are used by the customer to retrieve stored files.
  2. Orders are submitted to the satellite for purposes of tracking payment for SN actions.

it's voodoo i tell ye… all voodoo
i think the whole idea with the serial vs timestamp is that each serial is unique while timestamps can be the same… thus you could reuse a timestamp across multiple hosts, while a one-time unique serial can only ever be used once…

but i dunno… it was just what i thought made sense when i read it… maybe wrong… not that i really need to know the details :smiley: interesting subject… but really understanding the storj programming… just think about how complex your local storage solution is… this is just online, using much more unreliable storage and higher latencies…

i can imagine the programming must be… complicated, to put it mildly… might even end up putting zfs to shame lol… eventually… so understanding it without months of deep dives into the subject is unlikely…

I believe it's like this. When a customer wants to download a file, it gets a signed download order from the satellite for the nodes it wants to download from. That download order has a serial number. The signature is checked by the node to make sure the satellite knows about this download and will pay the node for it. The serial is used to make sure that download can't be requested more than once. Without a serial, the customer could use the same download order over and over again and the node would only be paid once.

The reason why removing a random serial from memory is better is mostly that there is never a moment when you can be sure the serial is no longer in memory on the nodes you want to download from. If, after an hour or several hours, you always remove the oldest one, you could determine a moment after which it is likely that all nodes have dropped that serial number and you can try again. With it being random, there is always a high likelihood that several nodes still know about it and your download would fail, even if a long time has gone by.


How could the double spend / cheat occur if the node still retains orders for 48 hours as per the above?
Did this new change create the possibility of a random used serial being double spent? Maybe this needs further review so it can't occur, and it should be in line with the highlighted statement above about retaining orders for 48 hours. Randomly removing them creates the possibility of a double spend, and I don't understand why this was accepted when SNOs wouldn't be compensated correctly in that circumstance.

Surely instead of randomly deleting a used serial (which now allows that serial to be double-spent), the better way would be to write the data from memory to the SQLite serials DB and free up the memory pool for the next batch. Process memory → SQLite DB as a regular dump, rather than deleting from memory without recording in SQLite that the serial was used.

Do we now count chickens before they hatch? I have added the monkit information and the config value to the changelog for a reason. If you are concerned about double spends, you can adjust them and even verify the impact. I don't see a reason to talk about hypothetical double spends. We have added the monkit data for a reason.


I don't think it should be up to the SNO to decide the appropriate value for the mempool for these serial numbers. Storj should set this so that the mempool is never full and a used serial is never deleted, because if it is, that serial could be used again for a double spend. Instead of a delete, it should be written to the SQLite DB and be included in the 48-hour window where orders are kept.