Potential hurdles of GDPR in relation to tardigrade-using consumer application (Germany and world-wide)

I’m developing a product which leverages the storj/tardigrade network.
Since I’m based in Germany (though the target group is world-wide), I was wondering whether anyone has experience/information about how the GDPR might get in the way. (Sensitised by this post, where the (potential) issue is mentioned too.)

Obviously there should not be an issue, since all data is encrypted and sharded, but arguably the GDPR is yet another example of a law “throwing the baby out with the bath water” (which probably causes more problems than it solves). But I digress…

Summarising: would I need to do a lot of due diligence before going to market, or is there already “prior art” which may point to there not being a significant concern?


I don’t know much about the German implementation of the GDPR, but the UK Information Commissioner’s Office (ICO) has produced some helpful checklists related to data storage, encryption and retention policies:


I work for a multinational company. I am not a lawyer, yet I have to work with our legal department on a regular basis. In my experience Germany has the most stringent privacy laws; even IP addresses are considered personal information. I doubt the folks at Storj/Tardigrade will provide advice that can be construed as legal advice. I think you need to talk to a lawyer or data protection officer in Germany to understand what the pitfalls are. I know that when we store data for German citizens (or EU citizens) we have many hoops to jump through, including annual penetration tests of our systems. Even loss of access to stored data (including encryption by ransomware) triggers the compliance requirement to provide breach notification.


Thanks for the info and links both!
As a first step I’m merely looking for experiences and anecdotal evidence about this topic, so thanks for your insights. I’m aware that it won’t spare me the hassle of actually getting more official legal advice as well :wink:

As a matter of thought experiment: Nextcloud seems to be doing quite well, not sure how they handled this. But then, I suppose their case is different because they just provide software for data access (at least for the self-hosted tier) and don’t actually store data. In the case of Tardigrade I suspect that the focus of attention is the satellite and its metadata database? So then one would have to make sure to prove security of the satellite data and any other user account data.

Another thought is (probably as a first possible step to be able to postpone GDPR compliance?) to have users explicitly waive the GDPR on signup, but I’m not sure that optionality is even legally possible.

Maybe this helps a bit: DSGVO/GDPR: Leitlinien für die Auswahl von Cloud Storage | Computer Weekly

GDPR does not forbid cloud storage but it aims to make sure that specific processes and procedures are in place to protect data.

I don’t think having users waive the GDPR can be done, as it would contradict the aims of the GDPR.

The way I see it your biggest challenge will be the requirement for data processing agreements.

https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/contracts-and-liabilities-between-controllers-and-processors-multi/when-is-a-contract-needed-and-why-is-it-important/#2

When does the GDPR say a contract is needed?

The GDPR says that a contract is needed in two circumstances.

Firstly, Article 28(3) states that:

Processing by a processor shall be governed by a contract or other legal act under Union or Member State law, that is binding on the processor with regard to the controller…

This means every time a controller uses a processor to process personal data, there must be a written contract that binds the processor to the controller in respect of its processing activities.

Article 28(3) could be complied with not only by a direct contract between the controller and the processor, but also by other legally binding contractual arrangements (for example, a set of contracts between multiple parties) provided the processor is ultimately bound, as a matter of contract law, to each controller in respect of the particular processing.

Secondly, Article 28(4) states that:

Where a processor engages another processor for carrying out specific processing activities on behalf of the controller, the same data protection obligations as set out in the contract or other legal act between the controller and the processor as referred to in paragraph 3 shall be imposed on that other processor by way of contract or other legal act…

This means that every time a processor uses another processor (a sub-processor), there must be a written contract between the processor and the sub-processor. The terms of the contract that relate to Article 28(3) must offer an equivalent level of protection for the personal data as those that exist in the contract between the controller and the processor.

Further Reading

Relevant provisions in the GDPR – See Articles 28(3), 28(4) and 28(9) and Recital 81

Basically this says you need to have a data processing agreement with Storj Labs that binds them to the legal obligations set out under GDPR. But furthermore… nodes could be seen as sub-processors in this context, so this also implies a requirement for processing agreements between Storj Labs and SNOs. This is obviously a much harder hurdle to tackle. I previously asked about this requirement in one of the town halls, but the response was a little evasive. It basically came down to there being no compliance with GDPR at the moment, but they’re looking into it. To me, this seems like a near impossible hurdle, as requiring legal contracts between SNOs and Storj Labs would basically make it impossible for just anyone to contribute. The law clearly doesn’t deal with encryption very well, since strict control of keys means you don’t have to trust a data processor to ensure data is secure. For now, unfortunately it seems that using Tardigrade for personal data isn’t an option in the EU.

@jocelyn, it would be nice if this topic could be addressed again in the next town hall? Perhaps there has been some progress I’m not aware of. I’m sure Katherine has been thinking about these subjects a lot over the past months.

Disclaimer: While I have dealt with GDPR in my daily work, I am not a lawyer or legal expert of any kind. The above should not be taken as legal advice. Don’t make any decisions without consulting a lawyer to talk through your specific situation.


This is the bit. Storj might not be a processor under the terms of the GDPR legislation, as they cannot read or ‘process’ the data. However, the linkshare service and other services which provide decrypted access could push them into the processor classification.

The definition of processing in GDPR is very broad and includes just receiving and storing data. As I mentioned before, the details on encrypted data are lacking; GDPR doesn’t really deal with encryption. As far as I know this is still an open question. But there are suggestions that PII should be treated as PII whether it is encrypted or not. The reasoning is that GDPR differentiates between anonymized data and pseudonymized data. Anonymized means there is absolutely no way to relate the data back to an individual. Pseudonymized means you would need additional data to relate the data back to an individual. Encrypted data could be considered pseudonymized in that context, as it requires an encryption key to decrypt. Unfortunately, while GDPR doesn’t consider anonymized data to be PII (personally identifiable information), it makes no such exception for pseudonymized data. Whether this interpretation will actually hold up in court still has to be tested. European law is often extremely vague and the different implementations in member countries really don’t help make it any clearer. So… data processing agreements may or may not be required. It would really be great to get some information from a lawyer experienced in this field.
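To make the anonymized/pseudonymized distinction a bit more concrete, here is a minimal Python sketch. It uses a keyed hash (HMAC) instead of encryption, purely as an illustration of "you need additional data to relate it back". The helper names `pseudonymize` and `relink` and the sample key are made up for this example and say nothing about how Tardigrade actually handles data.

```python
import hashlib
import hmac
from typing import List, Optional


def pseudonymize(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed token (hypothetical helper)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def relink(token: str, candidates: List[str], key: bytes) -> Optional[str]:
    """Whoever holds the key can match a token back to a known identifier."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymize(candidate, key), token):
            return candidate
    return None


key = b"demo-key-not-a-real-secret"  # assumption: illustrative key only
token = pseudonymize("alice@example.com", key)
known = ["bob@example.com", "alice@example.com"]
```

With the right key, `relink(token, known, key)` finds the original identifier again; with a different key it finds nothing. That is roughly why such data would count as pseudonymized rather than anonymized: the link to the person still exists for whoever holds the extra piece of information.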

At my place of work (NL), we don’t store encrypted PII without a data processing agreement with the processor. But that’s also because we would rather do too much than too little. It doesn’t mean the law actually requires it.

Disclaimer: While I have dealt with GDPR in my daily work, I am not a lawyer or legal expert of any kind. The above should not be taken as legal advice. Don’t make any decisions without consulting a lawyer to talk through your specific situation.

I’ve placed this on our internal Slack to discuss at an upcoming GDPR guild meeting. Thx!


From what I understand, Storj is clearly the controller of the data as defined by the GDPR, as they determine how it is processed, distributed and stored by their software, not to forget the satellites they operate.
I don’t think there is a twist that could make the GDPR not applicable for them.

I am seeing an additional challenge. Maybe you could get off the hook of the GDPR rules for the encrypted data pieces (one could argue these are not personal information, as they are not only encrypted but also segmented, so nobody can read them and they cannot be personally identifiable information).
But the Storj nodes interact directly with the up- and downloader. Therefore they process IPs (they even show up in the logs sometimes). As IPs are considered personally identifiable information, there is no question that Storj nodes process such information.
And this at least would make the GDPR fully applicable to nodes, in my opinion. It could be considered differently if the nodes only interacted with central gateways operated by Storj, but this is not how Tardigrade works.

I don’t, however, agree that nodes are sub-processors. As Storj is probably to be considered a controller in any case, the nodes would be normal processors. But anyway, I think this does not change anything in terms of responsibility.

I also do not agree that there need to be written contracts, as the GDPR states “contract or other legal act”, which could be an approved certification process. However, of course this does not reduce the obligations that stem from the GDPR.

As Storj itself has stated that it is not GDPR compliant at the moment, it turns into the tough question of whether they can even legally offer their service in the EU currently (which matters for SNOs as well as for EU citizens who are using Tardigrade), or whether they are at risk of fines from data protection authorities at any time. I wonder if the SNOs could then be fined as well? Generally it is considered that if the product or service is offered within the EU, then the data processing needs to comply with the GDPR, whether or not the company is physically located there.

@jocelyn: Let me bring back 2 topics that I have started, that are connected with this one:
Maybe being a member in an association like Eco could help to address and resolve all those questions and uncertainties: Increasing Storjs exposure to potential partners and customers
Need for certification of GDPR compliance:
Tardigrade independent 3rd party certification / audits?

I believe it’s not a big problem, although I’m not an expert on this field. I’ll just list what is on my mind.

Tardigrade itself is the only party that can identify a customer, and only if that customer used a credit card to pay for the service; otherwise a customer does not need to enter any identifiable information. Here Tardigrade has to put up a declaration that this data is only used for processing payments.

EDIT: Forgot the email address. But that one is only used to contact the customer for account related things.

As for SNOs: Since the data is encrypted on the customer side there is no identifiable information except the IP Address (and let’s be fair, unless you are an ISP there is no way to recover real information from that), everything else is just scrambled data. This can also be a declaration for the SNO that the IP is just stored (log files) and not processed. If the IP Address is not necessary for running the service it also does not need to be stored.
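As a rough illustration of the "don’t store what you don’t need" idea for log files, here is a minimal Python sketch that truncates an IP address before it gets written anywhere. The function name `mask_ip` and the chosen prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions for this example; this is not something the storagenode software is known to do, and whether a truncated address still counts as personal data is a legal question the code can’t answer.

```python
import ipaddress


def mask_ip(raw: str) -> str:
    """Zero the host part of an address before logging (hypothetical helper).

    IPv4 addresses are cut to a /24 and IPv6 addresses to a /48.
    Both prefix lengths are illustrative choices.
    """
    addr = ipaddress.ip_address(raw)
    prefix = 24 if addr.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def format_log_line(event: str, peer_ip: str) -> str:
    """Build a log line that only ever contains the masked address."""
    return f"{event} peer={mask_ip(peer_ip)}"
```

For example, `format_log_line("download failed", "203.0.113.77")` produces a line containing `203.0.113.0` instead of the full address, which is usually still enough to spot a misbehaving network range.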

I wonder. The linkshare service does not work without your consent. It can’t access your data without the Token that you have to provide. Also it can not identify you as a person. Anyone can enter that Token.

If I store my own data on Tardigrade, yes.
If I store the data of my customers on Tardigrade, then those customers may not even know that their data is stored on Tardigrade or that the linkshare service is used.

That is part of your own GDPR declaration. Tardigrade does not have anything to do with your customers.

You and your customers are the only ones who can create the token for linkshare.

These are typical tasks of processors. The controller for Storj would be the Tardigrade customer. They collect the data, decide what happens with it and are also responsible for removing the data when applicable. Amazon would be doing all the things you mention, but just for their internal storage infrastructure. That doesn’t make them the controller either. Both would be processors under GDPR.

While it is true that IPs are considered PII, it doesn’t necessarily mean that there needs to be explicit consent to process them. GDPR allows collection of information if it is of legitimate interest to the collector. Web servers log IPs all the time on this basis. And in the case of nodes, IPs are only logged for certain errors, which could definitely be considered a legitimate interest similar to logging IPs for security purposes on web servers.

Either way this needs to be legally binding. I don’t think this gives much flexibility at all.

They absolutely can. Customers just have to be aware of this and not store PII data on Tardigrade for now. But it’s absolutely no problem to store other content on Tardigrade.

I think we need to separate out Tardigrade as a collector and controller of data from their own customers and the platform as used by customers as a processor of data. While both are important, I think the latter was the core question being asked here. And in order for Storj Labs to operate in the EU the former needs to be in place already. So I kind of assumed that has already been settled.

Thanks all for the abundant information. Obviously it would be a great help if Storj would consider this topic as an issue for wider adoption / certification.

In the meantime (trying as much as possible to find a maximum pragmatic and minimum resistance approach), I’d focus on the following from above:

  • satellites are processors
  • storagenodes are processors
  • IPs are PII (which is unfortunate)

Now as a thought experiment: if there were no satellites, the problem would be as small as (legal) BitTorrent, there would be only P2P and an implicit consent from each SNO/uplink user (recall that you can see peer IPs in BT clients) to communicate with these peers. The company offering the infrastructure (Storj) would be unrelated to this PII (except for user account management, but this is true for every web-shop out there).

So that would move focus to the satellite, which coordinates Tardigrade’s P2P and as such not only passes PII between SNOs, but also stores relationships between them. Even though there’s no more information in there than “IP1 is related to IP2 by storing an unidentifiable data shard”, this might be an issue, also because, as was pointed out above, the GDPR doesn’t really give credit for encryption. Although I would find it hard to believe that there is no way to get encryption to help the case, because it is the only way available to any entity processing data to enable confidentiality/privacy/security.

Since this appears a complicated avenue, let’s go back to “how others appear to do it”. Remember how many people use e.g. W*, G*, F* communication platforms; here’s just one (quickly found, arguably not the best or authoritative) article to sketch the idea: Is WhatsApp in breach of the GDPR? A lawyer's view. What caught my attention is the mention of article 6 (regarding consent and obligations). Spoiler: W* is far from GDPR compliant in the author’s view, and he’s probably quite right.

But it may put us on the right track:

Suppose:

  • every user of the platform is guaranteed to have given specific consent about the extent to which PII is shared (which in our case could be something like “limited to your IP address within the network, (highly/securely) encrypted and segmented data only accessible with your encryption key”)
  • it is not possible that this data is (inadvertently) shared with entities that have not given this consent

Given customary measures (I guess e.g. passing security audits to cover the second point), and defining only two types of entities: (1) uplink/storagenode users (tardigrade/SNO), (2) satellite operators (Storj company), one would only need to ensure that:

  • explicit consent is obtained from the above entities for sharing data within the above scope between them
  • technical measures are in place to avoid leaking of defined data beyond the defined entities

And that sounds pretty doable… Or am I missing something obvious?

EDIT: @BrightSilence seems to have summarised more aptly and come to a more or less similar conclusion. One thing is still a bit troubling:

This would on one hand “limit the use case” for users to store company/professional data on their Tardigrade account, but this would be their responsibility and not Storj’s.

I feel like this may or may not be relevant to the discussion, so I wanted to add it to be sure. It was mentioned several times that data is encrypted and split up over nodes. While the first part (encryption) is definitely true, there are exceptions to the splitting. Specifically, very small segments that would be smaller than the metadata required on the satellite are instead stored inline on the satellite and not on storage nodes. While this almost certainly won’t apply to most database backups containing PII, it very well might apply to backups of individual communications, which can also contain PII. Of course in any case encryption is still applied. So again it kind of boils down to the question of whether encryption is enough to not require data processing agreements. I’m almost certain that the splitting up in itself holds very little value when you apply the laws as they are described, but it’s worth keeping in mind that this mitigation also doesn’t always apply.
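The inline exception described above boils down to a simple size check, sketched here in Python. The threshold value and the names `INLINE_THRESHOLD_BYTES` and `placement` are placeholders invented for this example; the real cut-off is whatever the satellite is configured with, and I am not quoting an actual Storj setting.

```python
# Assumption: placeholder threshold, not a confirmed satellite setting.
INLINE_THRESHOLD_BYTES = 4096


def placement(segment_size: int, threshold: int = INLINE_THRESHOLD_BYTES) -> str:
    """Say where an (already encrypted) segment of the given size would end up."""
    if segment_size < 0:
        raise ValueError("segment size cannot be negative")
    if segment_size <= threshold:
        # Small segment: kept whole on the satellite, so no splitting happens.
        return "inline on satellite"
    # Larger segment: split into pieces and distributed over storage nodes.
    return "split across storage nodes"
```

So a backup of a single short message, say `placement(900)`, would land inline on the satellite under this assumed threshold, while a multi-megabyte database dump would be split across nodes. In both cases the content is still encrypted on the client side first.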