[ARTICLE] Tardigrade Thursday - Security Access Part 1

A New Standard in Data Security

Tardigrade is the world’s first enterprise-grade, decentralized cloud storage service, powered by the Storj open source platform. Through decentralization, Tardigrade is more secure, more performant, more affordable, and more private, all by default.

Privacy and security are critical for distributed and decentralized systems. This is why we implemented technologies in the platform to ensure data remains secure and private in a trustless environment.

Pushing end-to-end encryption to the client side, adding macaroon-based API keys, and giving developers simple yet granular control together reduce the threat surface on your critical data to almost zero. With Tardigrade, developers can move access management from a centralized server to the edge.

The Power Behind Tardigrade Secure Storage

A Multilayered Approach for Secure Encryption

Tardigrade utilizes multiple encryption technologies that work in concert to ensure your data is secure, giving you absolute control over how your data is accessed.

Tardigrade Encryption Methodology

All data stored on Tardigrade is end-to-end encrypted on the client side. This means users control all encryption keys, which results in an extremely private and secure data store. Both the objects and the associated metadata are encrypted using randomized, salted, path-based encryption keys. The randomized keys are then encrypted with the user’s encryption passphrase. Neither Storj Labs nor any Storage Node has access to keys, data, or metadata.

Client-Side Encryption

Every object stored on the platform is first split into 64 MB segments, then encrypted using one of two included encryption schemes: AES-256-GCM (the default) or Secretbox. The encryption scheme is designed to be pluggable, meaning developers may also integrate custom encryption schemes if requirements dictate it.

This encryption is designed to avoid using the same keys for content encryption of different files and different segments of the same file. This is advantageous because it makes sharing of encrypted files more secure and it doesn’t put other segments or files at risk if one of them is compromised.

Path-Based Encryption

Paths are encrypted in a hierarchical and deterministic way using the root encryption key. Each path component is encrypted separately based on information derived from previous path components.

Consider an unencrypted path p made up of path elements p1/p2/…/pn. The end goal is to generate an encrypted path e, which is made up of elements e1/e2/…/en. We achieve this using the process shown.

The order of listed items is determined by the encrypted paths stored on the Satellite. Listed items are always returned in order of their encrypted path names, so they won’t be in alphabetical order once the paths are decrypted.

This method of path encryption allows users to share content under their path with another user without revealing anything at a higher level.
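
To make the sharing property concrete, here is a minimal sketch of hierarchical secret derivation. It assumes each secret is computed as an HMAC-SHA512 of the previous secret and the next path component; the hash choice, the function names, and the sample root secret are illustrative assumptions, not the platform's actual implementation. It shows only the derivation of secrets, not the encryption of the path components themselves.

```python
import hashlib
import hmac
from typing import List


def derive_path_secrets(root_secret: bytes, path: str) -> List[bytes]:
    """Return [s0, s1, ..., sn] for an unencrypted path p1/p2/.../pn.

    Assumption for illustration: s_i = HMAC-SHA512(s_(i-1), p_i).
    """
    secrets_chain = [root_secret]
    for component in path.split("/"):
        previous = secrets_chain[-1]
        digest = hmac.new(previous, component.encode("utf-8"), hashlib.sha512).digest()
        secrets_chain.append(digest)
    return secrets_chain


def continue_from(shared_secret: bytes, sub_path: str) -> bytes:
    """Derive the secret for a sub-path, starting from a shared prefix secret."""
    return derive_path_secrets(shared_secret, sub_path)[-1]


# Hypothetical root secret and path, used only for this example.
root = b"hypothetical root secret"
owner_view = derive_path_secrets(root, "photos/2020/trip.jpg")

# The owner shares only the secret for the prefix "photos/2020".
shared = owner_view[2]

# The recipient can derive everything below that prefix...
recipient_secret = continue_from(shared, "trip.jpg")
# ...but cannot walk back up to the secret for "photos" or to the root,
# because HMAC is a one-way function.
```

In this sketch, someone holding the secret for photos/2020 arrives at the same secret for trip.jpg as the owner does, while learning nothing about sibling prefixes such as photos/2019.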

Content and Metadata Encryption

When a user uploads a file, it’s read one segment at a time on the client side. Before each segment is split up, erasure encoded, and stored on remote Storage Nodes, a random content-encryption key is generated. A starting nonce equal to the segment number is created and used, along with the random key, to encrypt the segment data.
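
The per-segment bookkeeping described above can be sketched as follows. This is only an illustration of the key and nonce handling, not of the cipher itself: the 12-byte nonce length, big-endian encoding, zero-based segment numbering, and the name plan_segments are all assumptions made for the example.

```python
import secrets
from typing import Iterator, Tuple

KEY_SIZE = 32    # bytes, i.e. a 256-bit content-encryption key
NONCE_SIZE = 12  # bytes; assumed here, a common AES-GCM nonce length


def plan_segments(data: bytes, segment_size: int) -> Iterator[Tuple[bytes, bytes, bytes]]:
    """Yield (segment, content_key, starting_nonce) for each segment of data."""
    for number, offset in enumerate(range(0, len(data), segment_size)):
        segment = data[offset:offset + segment_size]
        # A fresh random key per segment, so no two segments share a key.
        content_key = secrets.token_bytes(KEY_SIZE)
        # Starting nonce derived from the segment number (zero-based here).
        starting_nonce = number.to_bytes(NONCE_SIZE, "big")
        yield segment, content_key, starting_nonce


# Tiny segment size so the example is easy to follow; the article uses 64 MB.
plan = list(plan_segments(b"x" * 10, segment_size=4))
```

Each tuple would then be handed to the chosen authenticated cipher, which is left out of this sketch.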

Next, we generate the derived key, dk, which we define with s_{n+1} = HMAC(s_n, “content”), where dk = K(s_{n+1}) and s_n is the last secret generated from the file path using the technique detailed above. We add one more dimension of key derivation for content encryption to ensure a user can’t derive the access key to an unshared file that has the same prefix.
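
A small sketch of this extra derivation step follows. The article does not define K here, so the example stands in a plain SHA-256 hash for it; that choice, the use of HMAC-SHA512, and the sample value for the last path secret are assumptions for illustration only.

```python
import hashlib
import hmac


def hmac_step(secret: bytes, label: str) -> bytes:
    """One derivation step: HMAC-SHA512(secret, label). Hash choice is assumed."""
    return hmac.new(secret, label.encode("utf-8"), hashlib.sha512).digest()


def derive_key(secret: bytes) -> bytes:
    """Placeholder for K: turns a secret into a 32-byte key (assumed: SHA-256)."""
    return hashlib.sha256(secret).digest()


def content_key(last_path_secret: bytes) -> bytes:
    """Compute dk = K(s_(n+1)), where s_(n+1) = HMAC(s_n, "content")."""
    next_secret = hmac_step(last_path_secret, "content")
    return derive_key(next_secret)


# Hypothetical value standing in for s_n, the last secret from the file path.
s_n = hashlib.sha512(b"hypothetical last path secret").digest()
dk = content_key(s_n)
```

Because dk sits one HMAC step below the last path secret, handing out dk for a single file does not reveal that path secret, so it cannot be used to derive secrets for longer paths that share the same prefix.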

Each segment has metadata associated with it on the Satellite. Segment metadata includes the random key used to encrypt a segment’s content. We encrypt the random key with the derived key (dk) and a randomly generated nonce. The nonce is stored along with the encrypted content
key in the segment metadata. This way, we use a different random encryption key for each segment, but anyone with access to the derived key can decrypt those keys.

The encryption algorithm used for content and metadata is easily configurable between AES-GCM and Secretbox, which are both authenticated encryption algorithms. This means that if any encrypted data is tampered with, the client downloading the data will detect it when the data is decrypted. The layers of content encryption mean only you know what you’ve stored on the platform.
