Verify Hash of file

Is there a way to get the hash of the file either through Storj or S3 gateway?

My use case: I want to verify whether my local file has already been uploaded to Storj. I also want to check whether my local file is identical to the copy stored on Storj.

Thanks!


I don’t believe this is functionality native to Storj or S3 implementations in general. I’ve seen folks implement this by storing files using their hash and then having another object reference or link to the hashed version. This way, you can quickly check if a file with the same hash exists and that the current file reference points to the current hash.
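A minimal sketch of the hash-as-key idea described above, assuming SHA-256 as the hash. The `by-hash/` prefix and the helper names `sha256_of_file` and `content_key` are made up for illustration and are not part of any Storj or S3 API:

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_key(path, prefix="by-hash/"):
    """Hypothetical key layout: the object is stored under its own hash."""
    return prefix + sha256_of_file(path)
```

With a layout like this, checking whether a local file is already uploaded comes down to checking whether an object exists at `content_key(local_path)`. A second, human-readable object (or a small index) would then hold the hash that the "current" file name points to.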


That’s a clever trick to add it on. It should work in most scenarios, unless the intention was to check the integrity of the remote file. If that was the intention, I can add that it’s basically not possible for the remote data to change. If a node changed the data in its piece, erasure coding would no longer work for that piece, and when the data is recombined into the original file, other pieces would simply be used instead. In the extreme case that too many pieces have been altered to reconstruct the file, you would get an error. You would never get data that doesn’t match the original file.

However, to my knowledge that extreme case has never happened, because Storj ensures data is repaired long before that could ever occur.

Does Storj S3 support tags?

or S3 md5 verification?

Could I check upload size and check it’s the same size as the local file size at a minimum?
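For the size question, here is a rough sketch assuming a boto3-style S3 client that has already been configured for your S3-compatible gateway (the endpoint and credentials are up to you and not shown here). The function name `same_size` is just for illustration. It relies on `head_object` returning a `ContentLength` field, as the boto3 S3 client does:

```python
import os


def same_size(client, bucket, key, local_path):
    """Compare the local file size with the size reported by HeadObject.

    `client` is assumed to be a boto3-style S3 client. A missing object
    raises an error from the client, which is not handled here.
    """
    head = client.head_object(Bucket=bucket, Key=key)
    return head["ContentLength"] == os.path.getsize(local_path)
```

Keep in mind that a matching size only rules out obvious mismatches such as a truncated upload. Two different files can have the same size, so this is a minimum check, not proof that the contents are identical.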

Storj does support attaching custom metadata to objects. Here’s a post that shows how Duplicati adds information such as LastAccess and LastModification:

Similarly, you could add an md5sum or sha256sum value to the custom metadata to indicate what the object’s signature is or should be.
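As a sketch of that approach through the S3-compatible gateway, assuming a boto3-style client: `put_object` accepts a `Metadata` dict and `head_object` returns it again. The metadata key `sha256` and the function names are my own choices for illustration, not a Storj convention:

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Hex SHA-256 of a local file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_with_hash(client, bucket, key, local_path):
    """Upload the file and record its SHA-256 as custom metadata."""
    digest = file_sha256(local_path)
    with open(local_path, "rb") as f:
        client.put_object(
            Bucket=bucket,
            Key=key,
            Body=f,
            Metadata={"sha256": digest},  # hypothetical metadata key
        )
    return digest


def matches_remote(client, bucket, key, local_path):
    """True if the stored 'sha256' metadata equals the local file's hash.

    Objects uploaded without the metadata simply compare as not matching.
    """
    head = client.head_object(Bucket=bucket, Key=key)
    remote = head.get("Metadata", {}).get("sha256")
    return remote == file_sha256(local_path)
```

Note that this compares the local file against the hash recorded at upload time. It does not re-hash the remote data, so it only helps if every uploader sets the metadata consistently.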


Some cloud storage systems do have this as a feature, see e.g. the “Hash” column in this table: Overview of cloud storage systems


We definitely support using ETags in Gateway-MT using HeadObject (HTTP HEAD) requests, similar to S3. Behind the scenes this updates object Metadata. In the native Uplink API, you can call StatObject, which returns SystemMetadata including a Created timestamp. If you’re looking for simple remote change detection, this should work. StatObject also optionally returns CustomMetadata, which can be tailored to specific use cases. You can get a sense of this using the uplink CLI tool and the command uplink ls --access $ACCESS sj://bucket/myfile --expanded
