Update on public sharing of Access Grants

Yukon · October 17, 2023, 4:41pm

Hi,
This is a follow-up to this topic: Sharing publicly an API key for read only access

In summary, I needed to share (read, list only, no write, no delete permissions) the contents of a bucket publicly, and I was looking for the best (safest) way to do that.

I configured my sharing as recommended there, registering a previously created Access Grant (in the uplink CLI) as S3 credentials, as explained in Alexey's answer: https://forum.storj.io/t/sharing-publicly-an-api-key-for-read-only-access/20612/6?u=yukon

It worked fine, but there is one aspect I would like to avoid. With S3 gateway access, I have to go through a single server (the gateway) to reach the distributed, decentralized data. This is a potential risk: if the gateway has a problem and is not accessible, or is compromised by some bad actor (public or private), I lose access to my data.

For this reason I would like to use the Access Grant (generated in the CLI) for native access instead of the S3 gateway, but NOT if publicly sharing the access grant still has the vulnerabilities highlighted in this post by jtolio: Sharing ML Dataset via read-only access grant - #3 by jtolio

This was discussed at the time, but as almost a year has passed, I would like confirmation, if possible, of whether this concern/risk still applies, or whether it is now possible to share an Access Grant publicly without these (or other) risks.

Thanks!

jtolio · October 17, 2023, 5:09pm

The situation in Sharing ML Dataset via read-only access grant - #3 by jtolio is still the state of the software.

However - we don’t often get this request, so finding a solution hasn’t been prioritized. What is your overall use case? If we can understand the circumstances here it might help inform a potential easy solution (such as rate limited access grants or something else)

Yukon · October 17, 2023, 5:49pm

Thanks for your prompt answer.

I am building a kind of knowledge database in plain text files, with some images or short videos, but mainly text files, that needs to be directly accessible to anyone who is interested in it. So I want to share a read-only bucket with the general public.

I will of course have write access to the bucket, but everyone else (the public) will have only read and list permissions.

I would prefer to use an Access Grant generated in the uplink CLI, as it is faster than S3, more decentralized (and safer).

Alexey · October 18, 2023, 3:12am

There are at least two ways to implement this:

Set up a static website for this bucket/prefix and use your own domain, as simple as http://your.domain.tld/your-object.png across the site (so the base URL would be http://your.domain.tld/; if you have a paid account you may also use TLS to have an https:// domain)
Share the bucket(s)/prefix(es) and generate a read-only URL

uplink share --url --not-after=+1h sj://my-bucket/my-prefix/

Replace /s/ with /raw/ in the generated URL and use it as the base URL for the static content on your site.
Of course, you may use a read-only access grant directly, without registering it on Gateway-MT, but with the mentioned risks.
Another way is to run your own self-hosted S3-compatible gateway in static website mode behind your own reverse proxy with TLS configured.
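
The /s/ to /raw/ replacement can be scripted. The following is a minimal sketch, assuming the generated URL has the shape https://link.storjshare.io/s/<key>/<bucket>/<prefix>/; the helper name to_raw_url and the example key are hypothetical, and the real URL is whatever uplink share --url prints.

```python
from urllib.parse import urlsplit, urlunsplit


def to_raw_url(share_url: str) -> str:
    """Turn a linksharing '/s/...' URL into its '/raw/...' form."""
    parts = urlsplit(share_url)
    if not parts.path.startswith("/s/"):
        raise ValueError(f"not a /s/ share URL: {share_url!r}")
    # Swap only the leading path segment; a later '/s/' inside the
    # object key is left untouched.
    raw_path = "/raw/" + parts.path[len("/s/"):]
    return urlunsplit(parts._replace(path=raw_path))


# Hypothetical URL, standing in for the output of `uplink share --url ...`.
base = to_raw_url("https://link.storjshare.io/s/EXAMPLEKEY/my-bucket/my-prefix/")
print(base)
```

Replacing only the leading segment avoids corrupting object keys that happen to contain /s/ further down the path.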

Yukon · October 18, 2023, 9:04pm

Thank you for your answer.
I forgot one important detail in my use case description, though: the objects need to be accessed through the client file system, like a network drive, not through URLs. This is due to the software used to view the content.

So, I guess I should stick to the S3 credentials for now, even with the potential “weakness” of a single gateway as pass-through.

But, please correct me if I am wrong: when accessing and downloading an object using an Access Grant from the uplink or rclone CLI, I understand that I first connect to and pass through a Storj server, from which the distributed/decentralized parts of an object are retrieved, put together and sent (encrypted) to my client. Is that right? If so, this is also a single point of access, just as the S3 gateway is. Is my understanding correct?

Another aspect of S3, if I am not wrong, is that the encryption is done on the server side. If so, the information exchanged between the client and the gateway could be visible, or at least, if HTTPS is used (I guess?), there is some moment in the server/gateway when the data to/from the client is decrypted and re-encrypted between Storj encryption and TLS encryption. Am I wrong? If that is the case, this could be a moment of potential (although unlikely) risk.

It is not the confidentiality of the content that worries me the most (as I want it to be publicly accessible anyway), but the privacy of the client accessing it. Maybe an rclone crypt remote could help?

Thanks!

Alexey · October 19, 2023, 4:56am

Not a single byte goes through Storj services in this case. Your uplink library contacts the satellite only to get a list of the nodes that store pieces of the file's segments; the uplink library then contacts these nodes directly and downloads the pieces, combines them locally into segments using Reed-Solomon decoding, combines the segments into the file, and decrypts the file using your locally stored encryption key.
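
To make the "pieces are recombined locally" step concrete, here is a toy sketch. It does not use Reed-Solomon: it swaps in a much simpler 2-of-3 XOR parity scheme, and all function names are hypothetical. It only illustrates that a client holding enough pieces can rebuild a segment by itself, even when one node is unreachable; decryption is left out.

```python
def _xor(p: bytes, q: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))


def split_with_parity(segment: bytes) -> list[bytes]:
    """Split a segment into two data pieces plus one XOR parity piece."""
    if len(segment) % 2:
        segment += b"\x00"  # pad to an even length; the caller keeps the true length
    half = len(segment) // 2
    a, b = segment[:half], segment[half:]
    return [a, b, _xor(a, b)]


def rebuild(pieces: dict[int, bytes], length: int) -> bytes:
    """Rebuild the segment from any two of the three pieces (indices 0, 1, 2)."""
    if len(pieces) < 2:
        raise ValueError("need at least two of the three pieces")
    a = pieces[0] if 0 in pieces else _xor(pieces[1], pieces[2])
    b = pieces[1] if 1 in pieces else _xor(pieces[0], pieces[2])
    return (a + b)[:length]


segment = b"encrypted segment bytes"
pieces = split_with_parity(segment)
# Pretend the node holding piece 0 is unreachable; the client still rebuilds locally.
available = {1: pieces[1], 2: pieces[2]}
print(rebuild(available, len(segment)) == segment)  # True
```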

But if you use the linksharing service or Gateway-MT, your data will go through them (they are distributed services, so the client will be served by the instance closest to its location), and all recombination and decryption happens on them. The client gives them either the access grant or the Access Key (which allows them to decrypt the encrypted access grant stored on Gateway-MT, see Understanding Server-Side Encryption - Storj Docs). This access grant allows them to get the pieces from the nodes, combine them back into the file and decrypt it with the encryption key provided inside the access grant (see Understand and Manage Access Grants - Storj Docs). They then provide the decrypted file to the client that requested it with either an access grant or an Access Key.

If you did not make the S3 credentials public (the flags --register --public or --url in your uplink share command), you will also need a Secret Key to be able to decrypt the access grant and use it to get the file (i.e. when you ran uplink share --register without the --public flag or generated S3 credentials in the satellite UI).

Yukon · October 19, 2023, 6:52am

Hi, thanks, I see.

Then, in case I use an Access Grant (not the S3 gateway):

For the paid tier, is it possible to set a bandwidth limit or an invoice limit (thus limiting the bandwidth/traffic)? That way, if some bad actor wants to abuse the Access Grant, they cannot go beyond the imposed limit (and maybe, by the way, the abuse could become exposed…)
For the free tier, I understand that this bad actor could not produce a high (or even small) invoice, as the account is free. Is that correct? If so, what could they do? Maybe exhaust all the monthly egress traffic, so that neither I nor anyone else could download/upload anything until the next month?

Thanks

Alexey · October 19, 2023, 7:19am

Yes, it is possible, see: Managing Projects on the Storj Console - Storj Docs

Yes, it will be limited by the free tier limits (25GB of storage, 25GB of egress/mo and 10,000 segments).
However, only the egress allowance is renewed every month; the storage and segment limits are totals. So, if your account is full, it will remain full the next month, but you will be able to download another 25GB during that month.
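
As a rough illustration of how quickly a monthly egress allowance can be used up, the sketch below counts how many complete downloads of a bucket fit into it. The 25 GB default comes from the limits quoted above; the bucket sizes and the function name are hypothetical.

```python
def full_downloads_per_month(bucket_size_gb: float, egress_limit_gb: float = 25.0) -> int:
    """Number of complete bucket downloads that fit into the monthly egress allowance."""
    if bucket_size_gb <= 0:
        raise ValueError("bucket size must be positive")
    return int(egress_limit_gb // bucket_size_gb)


# Hypothetical bucket sizes, in GB.
print(full_downloads_per_month(0.5))  # 50
print(full_downloads_per_month(2))    # 12
```

With a small, mostly-text bucket the allowance covers many legitimate downloads, but a script that downloads in a loop could still exhaust it well before the month ends.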

Yukon · October 20, 2023, 5:40pm

Thanks.

Reading jtolio's answer in Sharing ML Dataset via read-only access grant - #3 by jtolio, a question comes to mind: if the shared Access Grant is read and list only, is the described flaw still valid? Or is this malicious behaviour only possible if the Access Grant has write permissions?

Thanks!

jtolio · October 20, 2023, 8:02pm

Oh, certainly if you share an access grant with modifying permissions then the situation is worse - anyone with the access grant can do whatever they want to your account.

However, in the context of my post, I’m specifically talking about read access, in which a colluding uplink and storage node can drive an egress bill through the roof. That’s possible with read only access, yes.

Yukon · October 21, 2023, 1:21pm

Tx

Regardless of the type of access shared (as read-only), is there a way to limit it (by time, by amount of data, or by reducing download speed, maybe), based on the client's IP, for example?
I am wondering whether there is some protection against someone trying to block the account by using up all the allowed monthly egress traffic (whether by downloading repeatedly or by abusing the flaw discussed above).

If not, then the difference between sharing an Access Grant and an Access Key is minor IF using the free tier or, in the case of a paid account, if traffic is limited to a low value. In both cases, I would lose access (due to the egress limit being reached by an abuser), regardless of the “flaw” of the Access Grant.

Thanks

Alexey · October 22, 2023, 4:05am

Yes, but only in your own hosted web server, which will work as a gate for authorization.
Neither linksharing nor Gateway-MT reads the caveats in the access grant; they simply use it to upload/download your data. Access control is implemented on the satellite, but the traffic does not go through the satellite (it flows directly between the uplink and the nodes), so it currently cannot be limited by the access grant.
Thus you need to place your own controlling server in front, which will gate the traffic based on IPs.
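
A controlling server of this kind could keep a per-IP byte budget and refuse requests once it is spent. The following is only a sketch under assumptions: the class name, the fixed-window policy and the numbers are hypothetical, nothing here is part of Storj, and the actual proxying of objects is left out.

```python
import time


class PerIpEgressBudget:
    """Allow each client IP at most `max_bytes` of downloads per fixed time window."""

    def __init__(self, max_bytes: int, window_seconds: float, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the policy can be tested without waiting
        self._usage = {}    # ip -> (window_start, bytes_served_in_window)

    def allow(self, ip: str, size: int) -> bool:
        """Return True and record the transfer if `ip` may still download `size` bytes."""
        now = self.clock()
        start, used = self._usage.get(ip, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0  # the previous window expired; start a fresh one
        if used + size > self.max_bytes:
            self._usage[ip] = (start, used)
            return False
        self._usage[ip] = (start, used + size)
        return True


# Hypothetical policy: 100 MB per IP per hour.
budget = PerIpEgressBudget(max_bytes=100_000_000, window_seconds=3600)
print(budget.allow("203.0.113.7", 5_000_000))    # True: within the budget
print(budget.allow("203.0.113.7", 99_000_000))   # False: would exceed 100 MB
```

The gate would call allow() before serving each object and answer with an error (for example HTTP 429) when it returns False.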

Yukon · October 23, 2023, 3:42pm

OK, and in case the egress is used up all the time (I am thinking of some bad actor who wants to block access by downloading “constantly”), can I still revoke the read-only access (whether it is an access grant or an access key) if all the egress traffic is used?
It would be good to also be able to limit the maximum egress traffic in the free tier.

BTW, is there a limit on simultaneous connections to a bucket?

It is a pity that limiting by IP is not integrated or easier to implement, especially since sharing access is already considered in the design. That would be perfect.

EDIT:
And last, please, do you see any differences regarding the privacy of the client accessing the objects between using the native (CLI) method and the S3 gateway? It seems to me that the S3 gateway method is less private (i.e., someone could know which project the client is connecting to, which is what matters to me regarding privacy). But I am not sure, so I would appreciate it if you could explain the differences.

Thanks!

Alexey · October 24, 2023, 4:05am

Of course, you can still revoke it.

But when you generate the next access grant for the same path, you will need to slightly modify the caveats (because the access grant will be identical given the same API key, the same encryption phrase and the same caveats), for example by adding --not-before -1h to the uplink share command.
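
Why identical inputs yield an identical (and therefore still revoked) grant can be shown with a toy model. This sketch assumes a macaroon-style HMAC chain over the caveats; it is not the actual Storj access grant serialization, and the secret and caveat strings are hypothetical placeholders.

```python
import hashlib
import hmac


def grant_signature(api_secret: bytes, caveats: list[str]) -> str:
    """Chain an HMAC over the caveats, macaroon style, and return the final tag as hex."""
    tag = hmac.new(api_secret, b"root", hashlib.sha256).digest()
    for caveat in caveats:
        # Each caveat is keyed by the previous tag, so the result depends on all of them.
        tag = hmac.new(tag, caveat.encode(), hashlib.sha256).digest()
    return tag.hex()


secret = b"hypothetical-api-secret"
revoked = grant_signature(secret, ["readonly", "prefix=my-bucket/my-prefix/"])
same = grant_signature(secret, ["readonly", "prefix=my-bucket/my-prefix/"])
fresh = grant_signature(secret, ["readonly", "prefix=my-bucket/my-prefix/", "not-before=-1h"])

print(revoked == same)   # True: same inputs reproduce the revoked grant
print(revoked == fresh)  # False: one extra caveat yields a different grant
```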

It's limited to the free tier defaults. To be able to change them you need a PRO account, i.e. a payment method added to your account.

The connection limit is per project, not per bucket.

From the privacy perspective it is better to use a native integration or a self-hosted S3 gateway, because your encryption keys do not leave your local system.
However, since you are publicly sharing content from the bucket(s)/prefix(es), you deliberately give away the decryption key (derived from your encryption phrase) and a derived API key in the access grant, so it does not really matter which method you use (except for the discussed risks of published access grants, especially root ones). But registering that access grant on Gateway-MT gives you a more abstract Access Key, which gives an attacker no clue about the caveats, the derived API key or the derived decryption key. The other way is to use your own self-hosted S3 gateway in website mode with a reverse proxy, or a website with the native integration libraries; then nothing will be exposed at all.

No one can figure out the account or project used without your account credentials. The content cannot be seen even with your credentials; it is still encrypted with your encryption phrase.