Sharing ML Dataset via read-only access grant

We have two access types: access grants and access keys. You can use uplink to register an access grant, which turns it into an access key.

If you are going to share credentials publicly, I recommend using an access key and not an access grant. Even though an access key can’t be used with libuplink, an access grant has one significant flaw that we have yet to resolve.

If you share an access grant in a way that a malicious storage node operator can observe, that operator can pair your access grant with a modified storage node and a modified uplink, and have both sides of the operation collude to lie and send your bill sky high. Essentially, a malicious storage node operator who holds an access grant pointing to data that lives on their own storage node can modify an uplink to claim that huge amounts of data have been requested, without any of it actually being transferred. In this scenario, the storage node gets paid through your project’s bill.

An access key can be used with hosted services such as link.<region>.storjshare.io or gateway.<region>.storjshare.io, but cannot be used with libuplink directly. When you run uplink share --url, that command implicitly creates the kind of access key that you’re looking for (read-only, public, etc.), and it should be safe to share without worrying about this attack. You can share folders and files this way. rclone, through its S3 integration, should also be able to contact a gateway using an access key like this.
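To make the link-sharing option a bit more concrete, here is a rough Python sketch of how a shareable URL for a dataset file or folder could be put together once you have an access key. Treat it as an illustration only: the /s/ and /raw/ path layout is my assumption about what the link service expects, and the region "us1", the key ID, the bucket, and the helper name linkshare_url are all made up for the example. The URL printed by uplink share --url is the one to trust.

```python
from urllib.parse import quote


def linkshare_url(region, access_key_id, bucket, key="", raw=False):
    """Build a link-sharing URL for an object or a folder prefix.

    Assumption: the link service serves browsable pages under /s/ and
    direct downloads under /raw/. Check this against the URL that
    `uplink share --url` prints for your own data.
    """
    host = f"link.{region}.storjshare.io"
    mode = "raw" if raw else "s"
    # Percent-encode each piece; keep "/" inside the object key so that
    # folder structure survives in the URL.
    return "https://{}/{}/{}/{}/{}".format(
        host,
        mode,
        quote(access_key_id, safe=""),
        quote(bucket, safe=""),
        quote(key, safe="/"),
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(linkshare_url("us1", "jexamplekeyid", "ml-datasets", "imagenet/"))
    print(linkshare_url("us1", "jexamplekeyid", "ml-datasets",
                        "imagenet/train 01.tar", raw=True))
```

Only the access key ID goes into the URL here. The access grant itself never appears, which is the point of sharing this way.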

This is a temporary solution to this collusion problem, and we’re still thinking about a better way to detect and prevent it.

Anyway, summary: please use an access key and don’t share your access grants.
