Can an attacker store potentially incriminating data unencrypted on a node?

Hey friends!

So, a couple of things on “FrameUp” and, more generally, on the potential impact of illegal data stored on the network.

The FrameUp paper analyzed Storj v2, which handled replication and encryption very differently from v3. Most of the issues FrameUp discusses are not possible with v3 - all data on storage nodes is stored encrypted and then Reed-Solomon encoded.

I think this problem is unlikely to be a big deal (especially with the v3 changes) and I’m with @monty in that if someone did want to do this, there are higher impact ways to do it that don’t involve us.

That said, @570RJ and @BrightSilence are right - a modified Uplink could be created to disable encryption, and the first k Reed-Solomon pieces are just splits of the original data (or the modified Uplink could skip Reed-Solomon entirely), so it is still possible.
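To make the "first k pieces are just splits" point concrete, here is a minimal sketch of a systematic erasure code. It is not Reed-Solomon and not Storj's actual encoder: it is a toy stand-in with k data shards plus a single XOR parity shard, and the function name `systematic_encode` is made up for illustration. It shares the one property that matters here, namely that the data shards are verbatim slices of the input.

```python
from typing import List


def systematic_encode(data: bytes, k: int) -> List[bytes]:
    """Toy systematic code: k plain data shards plus one XOR parity shard.

    Hypothetical helper for illustration only. Real Reed-Solomon adds more
    parity shards, but in its systematic form the first k shards are the
    same kind of plain slices produced here.
    """
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]

    # The parity shard is the byte-wise XOR of all data shards.
    parity = bytearray(shard_len)
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return shards + [bytes(parity)]


# If a modified uplink skips encryption, the data shards are readable as-is.
plaintext = b"unencrypted payload chosen by a modified uplink"
pieces = systematic_encode(plaintext, k=4)
```

Whoever holds one of the first k pieces holds a readable slice of whatever was uploaded, which is why encryption before encoding matters.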

Additionally, none of the countermeasures suggested in the FrameUp paper really work. The paper discusses measuring entropy and similar checks, but every countermeasure of this style is defeated if the “incriminator” simply stores illegal data encrypted, but with a widely shared encryption key.
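A rough sketch of why an entropy check doesn't help: the check can tell plaintext from ciphertext, but it cannot tell whether the key is secret or posted publicly. Everything here is an assumption for illustration. The cipher is a toy XOR keystream derived from SHAKE-256 (not what Storj uses and not something to deploy), and the names `shannon_entropy` and `toy_encrypt` are made up.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher built on SHAKE-256; applying it twice decrypts.

    Illustrative stand-in for a real cipher, not a vetted construction.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


readable = b"this text is plainly readable and has low byte entropy. " * 200
shared_key = b"key posted on a public forum"  # known to everyone

ciphertext = toy_encrypt(shared_key, readable)

plain_entropy = shannon_entropy(readable)     # low, typical of text
cipher_entropy = shannon_entropy(ciphertext)  # close to 8 bits per byte

# The ciphertext passes any entropy threshold, yet anyone holding the
# widely shared key can turn it back into the original data.
decrypted = toy_encrypt(shared_key, ciphertext)
```

So a node that rejected low-entropy pieces would still accept this upload, and the data would still be readable by anyone who knows the key.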

There are other countermeasures the FrameUp authors didn’t propose. One of them would work a bit better, but would add more resource usage and load to storage nodes. We could design the system such that storage nodes encrypt the data they receive at rest with a random encryption key, return that key to the uploader, and then throw it away. This would mean that all data on the storage node would be encrypted (and not with a widely known key), and the storage node wouldn’t be able to decrypt it. The downside is more metadata overhead (the Uplink and Satellite need to keep these keys) and more resource usage (a storage node like a Raspberry Pi with unaccelerated symmetric encryption could be impacted more than it already is). If we did this, we’d want to select the encryption scheme very carefully, weighing the likelihood of this attack against the potential performance impact.
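Roughly, the node-side idea above could look like the sketch below. To be clear, this is not something the network does today. The `StorageNode` class, its method names, and the toy SHAKE-256 XOR cipher are all assumptions made up for illustration; a real design would use a vetted cipher chosen with low-power nodes in mind.

```python
import hashlib
import secrets
from typing import Dict


def xor_keystream(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (SHAKE-256 keystream); stand-in for a real cipher."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class StorageNode:
    """Hypothetical node that encrypts at rest and forgets the key."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def store(self, piece_id: str, piece: bytes) -> bytes:
        key = secrets.token_bytes(32)  # fresh random key per piece
        self._blobs[piece_id] = xor_keystream(key, piece)
        # The key is returned to the uploader and never saved on the node.
        return key

    def retrieve(self, piece_id: str) -> bytes:
        # The node can only hand back the encrypted bytes.
        return self._blobs[piece_id]


node = StorageNode()
piece = b"piece bytes exactly as the uplink sent them"

returned_key = node.store("piece-1", piece)  # uplink/satellite must keep this
at_rest = node.retrieve("piece-1")           # what actually sits on disk
recovered = xor_keystream(returned_key, at_rest)
```

The extra 32 bytes per piece that the Uplink and Satellite have to track is the metadata overhead mentioned above, and the per-piece encryption pass is the added CPU cost on the node.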

This solution itself is not without flaws - the incriminator could simply widely share the key the storage node returned. Further, even if that problem could be solved, traffic analysis would still reveal which nodes store which accessed data unless we directed all traffic through Tor, which would definitely be prohibitively slow. I’m not sure there’s a fully general solution to the problem, if the main goal of the network is to store data for others and let them access it how they want.

So, in summary: we think the likelihood is low and we don’t view this as a big risk, and the changes in v3 have certainly helped to some degree. There is a little more we could do technically, but additional work comes with tradeoffs we would need to evaluate, and there is still no panacea here. To some degree, these problems are endemic to any form of cloud storage, and participants in a distributed or decentralized storage system are no exception.

We recognize there is some risk to storage node operators, and we’ll continue to review our agreements and ensure that we are providing SNOs the greatest legal protection possible. What customers store, whether legal or not, is our problem, and should not be SNOs’ problem. We do intend to stand by and support our storage node operators, especially in cases like these where older rules collide with newer technology.
