Drop Bloom filters, bring back direct deletes

Toyoo · August 15, 2024, 1:41am

Server-side copies. For each delete you need to check whether the customer has another reference to the removed segment somewhere, and if so, not send the delete request to the node after all.
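To make the check concrete, here is a minimal sketch of what "is there another reference?" could look like. It is not how Storj actually implements it; `SegmentIndex`, `add_reference` and `delete_reference` are hypothetical names, and the in-memory counter just stands in for whatever metadata lookup the real system would need.

```python
from collections import Counter


class SegmentIndex:
    """Hypothetical in-memory stand-in for a per-segment reference index."""

    def __init__(self):
        self._refs = Counter()

    def add_reference(self, segment_id):
        # Called on upload and again on every server-side copy.
        self._refs[segment_id] += 1

    def delete_reference(self, segment_id):
        """Drop one reference.

        Returns True only when no references remain, i.e. when it would be
        safe to send delete requests to the nodes holding the pieces.
        """
        if self._refs[segment_id] <= 0:
            raise KeyError(segment_id)
        self._refs[segment_id] -= 1
        if self._refs[segment_id] == 0:
            del self._refs[segment_id]
            return True
        return False


index = SegmentIndex()
index.add_reference("seg-1")   # original upload
index.add_reference("seg-1")   # server-side copy
print(index.delete_reference("seg-1"))  # False: the copy still needs the pieces
print(index.delete_reference("seg-1"))  # True: now the nodes could be told to delete
```

The point is only that every direct delete has to pay for this lookup first, and that the index has to be kept for every segment whether or not it is ever copied.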

This was a conscious decision by Storj some time ago to make deletions faster.

Seems so. Having a separate index just to track whether a given piece is part of multiple files might effectively double the size of the database. That would be a lot of resources.

Contacting one entity, then some tens of entities, is slower than contacting one entity due to plain old network latency. Besides, you need to contact all nodes, not just the fast ones like with uploads and downloads.
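A rough back-of-the-envelope sketch of the latency argument, with made-up round-trip times. The function names and the numbers are assumptions for illustration only, and "coordinator" is just a label for the one entity contacted first.

```python
def coordinator_only_ms(coordinator_rtt_ms):
    """Delete is acknowledged after a single round trip to one entity."""
    return coordinator_rtt_ms


def wait_for_all_nodes_ms(coordinator_rtt_ms, node_rtts_ms):
    """Contact one entity, then every node in parallel.

    The slowest node decides when the operation finishes.
    """
    return coordinator_rtt_ms + max(node_rtts_ms)


def wait_for_fastest_ms(coordinator_rtt_ms, node_rtts_ms, needed):
    """Contact one entity, then nodes in parallel, but stop once the
    `needed` fastest have answered (as uploads and downloads can)."""
    return coordinator_rtt_ms + sorted(node_rtts_ms)[needed - 1]


# Made-up numbers: four responsive nodes and one very slow one.
node_rtts = [20, 25, 30, 40, 900]

print(coordinator_only_ms(50))                # 50
print(wait_for_fastest_ms(50, node_rtts, 3))  # 80
print(wait_for_all_nodes_ms(50, node_rtts))   # 950
```

With numbers like these, one slow node dominates the direct delete, while a transfer that only needs the fastest few never notices it.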

Consider that uploads and downloads have to transfer data. This makes their design a trade-off between latency and bandwidth: contacting the nodes directly costs more latency, but the total bandwidth is also higher. You don’t care about bandwidth for deletions. You do for data transfers.