Is there a way to get the hash of the file either through Storj or S3 gateway?
My use case is that I want to verify whether my local file has already been uploaded to Storj. I also want to check whether my local file is identical to the copy on Storj.
Thanks!
I don’t believe this is functionality native to Storj or S3 implementations in general. I’ve seen folks implement this by storing files using their hash and then having another object reference or link to the hashed version. This way, you can quickly check if a file with the same hash exists and that the current file reference points to the current hash.
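As a rough illustration of that layout, here is a minimal Python sketch. It is not tied to any Storj or S3 API: a plain dict stands in for the bucket, and the `blobs/` and `refs/` key prefixes and the `HashIndexedStore` class are hypothetical names chosen for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class HashIndexedStore:
    """In-memory stand-in for a bucket: a dict mapping object keys to bytes."""

    def __init__(self):
        self.objects = {}

    def upload(self, name: str, data: bytes) -> bool:
        """Store data under its hash and point `name` at that hash.

        Returns True if the content had to be uploaded, False if an object
        with the same hash was already present.
        """
        digest = sha256_hex(data)
        blob_key = f"blobs/{digest}"  # hypothetical key layout
        uploaded = blob_key not in self.objects
        if uploaded:
            self.objects[blob_key] = data
        # The small reference object records which hash `name` points to.
        self.objects[f"refs/{name}"] = digest.encode()
        return uploaded

    def is_current(self, name: str, data: bytes) -> bool:
        """Check whether the reference for `name` points at the hash of data."""
        ref = self.objects.get(f"refs/{name}")
        return ref is not None and ref.decode() == sha256_hex(data)
```

With a real bucket, the existence check and the reference lookup would each be one small request, so the local file never has to be re-uploaded or downloaded just to compare it.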
That’s a clever trick to add it on. It should work in most scenarios unless the intention was to check the integrity of the remote file. If that was the intention, I can add that it’s basically not possible for the remote data to change. If a node changed the data in its piece, the erasure coding would no longer work for that piece, and other pieces would simply be used instead when recombining the data into the original file. In the extreme case that too many pieces have been altered to reconstruct the file, you would get an error. You would never get data that doesn’t match the original file.
However, to my knowledge that extreme case has never happened, because Storj ensures data is repaired long before that could ever occur.
Does Storj S3 support tags, or S3 MD5 verification?
Could I check upload size and check it’s the same size as the local file size at a minimum?
Storj does support attaching custom metadata to objects. Here’s a post that shows how duplicati adds information such as LastAccess and LastModification:
Similarly, you could add an md5sum or sha256sum value to the custom metadata to indicate what the object’s signature is/should be.
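A minimal Python sketch of the comparison side of that idea, assuming the uploader stored the digest under a custom metadata key. The key name `sha256sum` and the helper names are only illustrative, and `remote_metadata` is assumed to be a plain dict of the object's custom metadata, however you fetched it.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a local file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_remote(path, remote_metadata, key="sha256sum"):
    """Compare the local file's hash with the one stored in custom metadata.

    `key` is a hypothetical metadata name; use whatever you set at upload.
    Returns False when the key is missing, since nothing can be verified.
    """
    expected = remote_metadata.get(key)
    return expected is not None and expected.lower() == file_sha256(path)
```

At upload time you would compute `file_sha256(path)` once and attach it as custom metadata; later checks then only need a metadata request rather than a download.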
Some cloud storage systems do have this as a feature, see e.g. the “Hash” column in this table: Overview of cloud storage systems
We definitely support using ETags in Gateway-MT using HeadObject (HTTP HEAD) requests, similar to S3. Behind the scenes this updates object Metadata. In the native Uplink API, you can call StatObject, which returns SystemMetadata including a Created timestamp. If you’re looking for simple remote change detection, this should work. StatObject optionally also returns CustomMetadata, which can be tailored to specific use cases. You can get a sense of this using the uplink CLI tool and the command uplink ls --access $ACCESS sj://bucket/myfile --expanded
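For the size question asked earlier, a cheap first check could look like the Python sketch below. It assumes you already have a HeadObject-style response as a dict with a `ContentLength` field in bytes (the shape boto3's `head_object` returns); the function name is hypothetical. A matching size does not prove the contents are identical, so treat it as a quick filter before a hash comparison.

```python
import os


def size_matches(path, head_response):
    """Return True if the local file size equals the remote ContentLength.

    `head_response` is assumed to be a dict shaped like an S3 HeadObject
    response. A missing ContentLength is treated as "no match".
    """
    remote_size = head_response.get("ContentLength")
    return remote_size is not None and os.path.getsize(path) == remote_size
```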