I talked to a heavy AWS S3 user today. He complained that they’d love to see some form of atomic snapshots for whole buckets, and AWS S3 does not offer anything like that. They apparently keep tens of millions of objects in a single bucket, and those objects keep changing. Any iteration over all items in the bucket is bound to take hours, so the bucket changes while you are still listing it. They’d love to be able to make an atomic snapshot of a bucket’s state for backup and investigation purposes.
Is this possible with server-side copies on Storj?
Yeah, for the use case he was describing, he really insisted on atomicity. He is working around the problem now by setting up an EBS volume with LVM and taking LVM snapshots instead, but he was clearly annoyed that he has to do so.
I think the problem can be solved easily: any change to a file in Storj is a new file, so the old files just need to be kept, every instance of them. It takes space (a lot of space), but you can recover every change.
The idea was to make a snapshot to be able to debug the state of the application and to make a consistent offline backup. So these snapshots would be short-lived.
The client can always set up some time-based rule, like deleting all old versions after 3 months and keeping only the latest and 1 previous version. Moreover, if the snapshots are short-lived, the client can just delete them. When you change a file in Storj you always need to upload a new file, and then you decide whether to delete the old one or not; it does not happen by itself.
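To make that rule concrete, here is a rough client-side sketch. Everything in it is made up for illustration: `prune_versions` is a hypothetical helper, and the `(uploaded_at, version_id)` tuples stand in for whatever bookkeeping the client keeps about its own uploads. It is not a Storj or S3 API.

```python
from datetime import datetime, timedelta

def prune_versions(versions, now, max_age=timedelta(days=90), keep_latest=2):
    """Split versions of one object into (kept, expired) version ids.

    versions: iterable of (uploaded_at, version_id) tuples (hypothetical
    client-side bookkeeping, not something a storage service returns).
    The newest `keep_latest` versions are always kept; anything else is
    kept only while it is younger than `max_age`.
    """
    ordered = sorted(versions, key=lambda v: v[0], reverse=True)
    kept, expired = [], []
    for index, (uploaded_at, version_id) in enumerate(ordered):
        if index < keep_latest or now - uploaded_at <= max_age:
            kept.append(version_id)
        else:
            expired.append(version_id)
    return kept, expired
```

The client would then delete whatever ends up in `expired` itself, since nothing removes old uploads automatically.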
The difficulty is not in storing files; the difficulty is in knowing exactly what the state of the bucket was at snapshot time. That is, this file existed and had this content, that file didn’t exist yet.
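One way to picture what is needed: if every change carried a single, bucket-wide sequence number, a snapshot would just be "the sequence number at this instant", and the state could be reconstructed from the history later. The toy in-memory model below is only a sketch under that assumption; `VersionedBucket` and its methods are invented names, not how Storj or S3 work.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Version:
    seq: int                  # bucket-wide, monotonically increasing change number
    content: Optional[bytes]  # None marks a delete (tombstone)

class VersionedBucket:
    """Toy model: every put/delete appends a version, nothing is overwritten."""

    def __init__(self):
        self._seq = 0
        self._history: Dict[str, List[Version]] = {}

    def put(self, key: str, content: bytes) -> None:
        self._seq += 1
        self._history.setdefault(key, []).append(Version(self._seq, content))

    def delete(self, key: str) -> None:
        self._seq += 1
        self._history.setdefault(key, []).append(Version(self._seq, None))

    def snapshot(self) -> int:
        # Taking a snapshot is atomic and O(1): it only records a number.
        return self._seq

    def state_at(self, snap: int) -> Dict[str, bytes]:
        """Return the keys and contents that were visible at snapshot `snap`."""
        state = {}
        for key, versions in self._history.items():
            latest = None
            for version in versions:  # versions are appended in seq order
                if version.seq > snap:
                    break
                latest = version
            if latest is not None and latest.content is not None:
                state[key] = latest.content
        return state
```

Walking the history for `state_at` can still take hours on tens of millions of objects, but that no longer matters: the answer is fixed by the snapshot number, no matter how long the walk takes or what changes in the meantime.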
Snapshots take up space as well, since they work the same way (some way of keeping what has changed).
But yeah, since in Storj every change is a new file, Storj is basically copy-on-write (COW). Adding snapshots to that should not be too difficult: just keep the old data until the snapshot is deleted.
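Roughly what I have in mind, as a toy in-memory sketch rather than anything resembling the real satellite or uplink code (`CowBucket` and all its methods are hypothetical): a snapshot pins the blobs it references, and old data is only reclaimed once neither the live bucket nor any snapshot points at it.

```python
class CowBucket:
    """Toy copy-on-write bucket: writes never modify existing blobs."""

    def __init__(self):
        self._blobs = {}      # blob id -> bytes, immutable once written
        self._live = {}       # key -> blob id currently visible
        self._snapshots = {}  # snapshot name -> frozen {key: blob id}
        self._next_id = 0

    def put(self, key, data):
        self._next_id += 1
        self._blobs[self._next_id] = data  # new blob, old one is untouched
        self._live[key] = self._next_id
        self._collect()

    def delete(self, key):
        self._live.pop(key, None)
        self._collect()

    def create_snapshot(self, name):
        # Copying the mapping is fine for a toy; a real system would need
        # something cheaper than O(n) for tens of millions of keys.
        self._snapshots[name] = dict(self._live)

    def delete_snapshot(self, name):
        del self._snapshots[name]
        self._collect()

    def read(self, key, snapshot=None):
        mapping = self._live if snapshot is None else self._snapshots[snapshot]
        return self._blobs[mapping[key]]

    def blob_count(self):
        return len(self._blobs)

    def _collect(self):
        # Drop blobs that neither the live view nor any snapshot references.
        referenced = set(self._live.values())
        for mapping in self._snapshots.values():
            referenced.update(mapping.values())
        for blob_id in list(self._blobs):
            if blob_id not in referenced:
                del self._blobs[blob_id]
```

So a short-lived snapshot only costs the space of whatever gets overwritten or deleted while it exists, and that space comes back as soon as the snapshot is dropped.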