As many of us know, storing data on-chain is not viable for anything that isn’t very small, or for data that you want to keep private. Therefore, off-chain storage is needed. However, there are many foreseeable use cases where access to that data would be controlled by on-chain logic. From the research I have done, there is currently no existing solution to this problem; there is no way for a smart contract to act as access control for off-chain data (except in the case of Filecoin - which is not hot storage and would not serve many of these use cases).
I’ve been thinking about this a lot and wanted to propose an idea in the hope that people on this forum, with their better knowledge of the Storj codebase, could try to poke holes in it. Obviously I am going to talk about the Storj codebase in ways that it does not currently function - this is just a “what if?”. The big issue in coming up with this idea is that data stored on-chain is public, even if a variable is marked private - that is the nature of a public blockchain - so we cannot just store access tokens there. What we can do is use asymmetric encryption to hide data from everyone but the intended recipient.
In Storj, a user would create a client ID and secret (an asymmetric public/private key pair) and deploy a smart contract that stores the client ID, which is used to encrypt messages back to Storj. The smart contract has a function containing some logic (e.g. has the caller paid some value?). Once that logic validates, the function creates an auth token/access key (this can be a one-off access key, or have some other form of expiry, or none), encrypts it with the caller’s public key and returns it to the caller. The function also encrypts the token/key with the client ID and emits it in an event. The Storj platform would have an event listener connected to the smart contract; it receives the message encrypted with the client ID, decrypts it with the client secret and adds the token/key to the authorisation database. The caller can then make an HTTP request to the resource with the token/key, which should be accepted.
Importantly, no sensitive info is stored publicly - the unencrypted access key/token only exists in memory on the EVM which is erased after the function terminates. With the knowledge I have of the EVM, and a brief review of the Storj codebase, I cannot see any reason why this would not work - besides its reliance on functionality that does not exist. I really look forward to anyone’s criticism and feedback.
What problem are you trying to solve? I do not see a use case here except using a blockchain because it’s fancy and cool.
The only problem which would be nice to solve is having an encrypted, decentralized and fast database to store users’ metadata.
None of today’s blockchains is fast enough because of consensus, except maybe zkSync Era. However, I’m not sure that even its performance would be enough for a database. And it’s also public. Even encryption would not help if all the data were publicly available - you could engage hundreds of supercomputers, millions of smart devices or a few quantum computers to break it.
Always a good question to ask. The value I see in this idea is that you can provide and control access to data without having to rely on a third party/single point of failure. The reliability and security of Storj is one of its main selling points, however (my understanding is) there is still a central point of failure - being that a person controls authorisation and data within their account.
You are right in that blockchain is often implemented not because it is better but because it is new and cool. At the moment however, there is no existing solution that I know of that will enable someone to create a fully decentralised access control to off-chain data.
This is very true and something that I overlooked. If I have some time, I will mock up a crude implementation to test.
Not all data would be publicly available, but the access keys/tokens would be, so your point is still valid. One possible way around this is to make the auth token single use - it can be used once and is then no longer valid. If a new token is granted each time someone requests access, a malicious actor has no time to break the encryption, whatever resources they may have. Additionally, breaking encryption with lots of smart devices, some supercomputers or a few quantum computers is a problem facing cybersecurity/cryptography generally, so I think I would be unequipped to solve it here. If someone had the resources to break such encryption, I think they would probably go after something more lucrative too (depending on what would be stored with this, though).
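To make the single-use idea concrete, here is a minimal sketch in Python, assuming the party that authorises requests keeps an in-memory record of outstanding tokens. `SingleUseTokenStore` and its methods are hypothetical names for illustration; nothing like this exists in the Storj codebase as far as I know.

```python
import hashlib
import secrets


class SingleUseTokenStore:
    """Hypothetical in-memory store of outstanding tokens (not a Storj API)."""

    def __init__(self):
        # Only SHA-256 digests are kept, so the store never holds raw tokens.
        self._pending = set()

    def issue(self) -> str:
        """Create a random token and remember its digest as 'not yet used'."""
        token = secrets.token_urlsafe(32)
        self._pending.add(hashlib.sha256(token.encode()).hexdigest())
        return token

    def redeem(self, token: str) -> bool:
        """Accept a token exactly once; any later use of it is rejected."""
        digest = hashlib.sha256(token.encode()).hexdigest()
        if digest in self._pending:
            self._pending.remove(digest)
            return True
        return False


store = SingleUseTokenStore()
token = store.issue()
print(store.redeem(token))  # first use is accepted
print(store.redeem(token))  # replaying the same token is rejected
```

This only shows the bookkeeping. It does not address how the token would reach the authorising party without being readable on-chain.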
Thank you for engaging in this discussion. I look forward to any other input you may have.
Access is already controlled by you: only you know the API key, the satellite and the encryption key (derived from your encryption phrase). Without these three main components of an access grant, nobody has access to your data. The access grant may also contain permissions and bucket/prefix/object key restrictions, see Access Management - Storj DCS Docs
So storing only access grants on a blockchain solves nothing, because this information is not enough: you also need the Metadata (where the pieces of the segments of your encrypted objects are stored, and the hashes of root API keys).
At the moment all that Metadata is stored on the satellite, distributed across many datacenters.
It is exactly this Metadata that is needed to get the addresses of the nodes storing the pieces of your objects. In a fully decentralized system this data must be consistent and accessible everywhere.
The next point: you need some server instance that will audit and repair data, updating the addresses for segments in the Metadata when nodes become unavailable. Without that you get Filecoin - your data may be corrupted or deleted, because they do not have an audit and repair service.
So you must run your own full node somewhere, or on every device from which you want to have access (and now you get the Sia way, but even they do not store your Metadata on their blockchain; it’s stored locally on your full node).
Currently no blockchain is fast enough to work as a database, because of consensus. Your access to data, and its consistency and reliability, will depend on the speed of the chosen blockchain and of your full nodes (otherwise you would need to share with other full nodes at least the keys to decrypt the addresses of the nodes for each segment of your objects).
If you do not run a full node, or run it only periodically (and it must sync with the blockchain), your data will eventually be lost.
On a public blockchain? It must be. Yes, it can be encrypted, but anyone can download it and try to brute-force the decryption without any limits.
This is a single point of failure/point of centralisation that I am talking about. If other people are using this to store data, they need to trust me to provide and control access accordingly. If this could be delegated to a smart contract on-chain, then the access is controlled deterministically and securely.
Let me explain this with an example. Imagine a protocol that lets users upload and share videos. These videos must be stored off-chain (e.g. on Storj), but the protocol enables creators to earn revenue per view (i.e. a user must pay some fee to watch). Therefore the protocol must ensure the user has paid before they can access the video. Enforcement of payment can be done on-chain, but granting access cannot - as you said, access is controlled by someone (me, if I were the one to implement this). That is where this idea comes into play: can we detach the control of access from a person/entity and have it validated deterministically but securely? Without that functionality, you are right in saying there is no use case for blockchain - the only value is if you are able to implement the entire solution in a decentralised manner.
In saying that, I am convinced this idea fails on the other points you make. The metadata, the auditing and repairing of data, the encryption of node addresses and the speed of blockchains all raise complex issues that I do not have an answer for.
Thank you so much for your responses, they have been so very educational.
At the expense of slowing down your ability to store and access data; plus you need a full node to get grants on each device which should access your data.
You are just missing the main advantage of the access grant - it contains all the needed permissions, and they are not stored anywhere, so the only point of failure is you: if you share it with someone, they will have the same access that is encapsulated in the access grant, including the derived encryption key. The access grant is a macaroon, a self-contained cookie, a kind of token. It doesn’t need to be stored on the satellite; it should be stored on the device which needs to have access to your data.
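To illustrate what "self-contained" means here, below is a simplified sketch of the general macaroon idea in Python: restrictions (caveats) are chained into an HMAC signature, so a holder can narrow a token without contacting the issuer, and the verifier needs only the root key. This is an illustration only. The function names, the caveat strings (`op=`, `prefix=`, `not_after=`) and the dictionary layout are assumptions made for this sketch and are not the actual Storj access grant format.

```python
import hashlib
import hmac


def _chain(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step; each signature becomes the key for the next caveat."""
    return hmac.new(key, message.encode(), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    """Issuer creates an unrestricted token from its secret root key."""
    return {"id": identifier, "caveats": [], "sig": _chain(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    """Any holder can add a caveat without knowing the root key."""
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _chain(token["sig"], caveat),
    }


def _caveat_holds(caveat: str, context: dict) -> bool:
    key, _, value = caveat.partition("=")
    if key == "op":
        return context.get("op") == value
    if key == "prefix":
        return context.get("path", "").startswith(value)
    if key == "not_after":
        return context.get("now", 0) <= int(value)
    return False  # unknown caveats fail closed


def verify(root_key: bytes, token: dict, context: dict) -> bool:
    """Verifier recomputes the chain from the root key, then checks every caveat."""
    sig = _chain(root_key, token["id"])
    for caveat in token["caveats"]:
        sig = _chain(sig, caveat)
    if not hmac.compare_digest(sig, token["sig"]):
        return False
    return all(_caveat_holds(c, context) for c in token["caveats"])


root = b"root-secret-known-only-to-the-verifier"
shared = attenuate(attenuate(mint(root, "key-1"), "op=read"), "prefix=videos/")
print(verify(root, shared, {"op": "read", "path": "videos/a.mp4"}))    # allowed
print(verify(root, shared, {"op": "delete", "path": "videos/a.mp4"}))  # denied
```

Removing a caveat from the list without the matching earlier signature makes verification fail, which is why the restrictions can travel with the token instead of being stored on a server.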
And storing it on a public blockchain doesn’t make sense - anyone can download your access grant and will have access. So this access grant should be encrypted, or you should not use a public blockchain. You would then need some controlling server that would at least decrypt the encrypted access grant downloaded from the blockchain and then download the data from Storj DCS for the user. In that case, why bother with a blockchain at all? And if you delegate decryption to the client, what will stop them from decrypting the access grant and having access for free?
That’s another feature, and you may implement it right now - you need to write a smart contract containing the access grant, and it will control this level of access, like the number of views and so on.
The access grant allows you to specify only read, write, delete and list operations, not-before and not-after times, paths (buckets, prefixes, object keys (names)), the satellite address, the derived API key and the derived encryption key. So you need to implement an extension of access grants on your site to control this kind of access, like the number of views. And in this case I do not see why you need a blockchain for that, except maybe to collect revenue? However, it’s a very specific use case, not a generic one.
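As a rough sketch of such an extension, assuming a controlling server of your own that counts paid views and hands out a short-lived, read-only restriction per view: `ViewLedger`, `credit` and `authorize` are hypothetical names, and turning the returned restriction into a real restricted access grant is deliberately left out.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ViewLedger:
    """Hypothetical server-side record of paid views per (user, object key)."""

    remaining: dict = field(default_factory=dict)

    def credit(self, user: str, key: str, views: int) -> None:
        """Record that a user has paid for some number of views of an object."""
        slot = (user, key)
        self.remaining[slot] = self.remaining.get(slot, 0) + views

    def authorize(self, user: str, key: str, ttl_seconds: int = 300, now=None):
        """Consume one paid view and describe the restriction to grant.

        Returns None when the user has no views left for this object.
        """
        slot = (user, key)
        left = self.remaining.get(slot, 0)
        if left <= 0:
            return None
        self.remaining[slot] = left - 1
        now = time.time() if now is None else now
        # Read-only, one object, valid only for a short window.
        return {"op": "read", "path": key, "not_after": int(now) + ttl_seconds}


ledger = ViewLedger()
ledger.credit("alice", "videos/a.mp4", views=1)
print(ledger.authorize("alice", "videos/a.mp4"))  # restriction for one view
print(ledger.authorize("alice", "videos/a.mp4"))  # None: no paid views left
```

The view counter lives on the server because the grant itself can only express operations, paths and time limits.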
This is it, the part I missed. Perhaps I need to read up more on macaroons, but what you’ve explained there makes it pretty clear. And then your mention of a controlling server to decrypt the encrypted access grant makes perfect sense.
Worth noting, this video example is not something I am going to implement, especially now haha. Thank you for your help - I am going to mark this as a solution, but I do not want that to discourage further discussion from anyone on this topic.