Tardigrade Thursday thread: Meaning of Decentralization

I just showed what I see, when you show a picture. So, we need to explain it.

So, please explain it.

Well, yes. At least if all Tardigrade satellites exchanged the data among themselves, it would be an improvement. Though it would be real decentralization if all satellites (even those not operated by Tardigrade) exchanged data, so that if one satellite operator goes down the data is still accessible.

Or, I guess the data could somehow be tied to the client instead of the satellite. That is, I upload a file, signing it with my key. Then, I can download the same file from the nodes with my key, even if there is no satellite at all (or, let’s say, one satellite operated by some random guy and definitely not the original satellite).
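
To make the idea above a bit more concrete, here is a minimal Python sketch. It is not anything Storj or Tardigrade actually implements. It assumes hypothetical nodes that index blobs by an owner ID derived from the client's key, stores full copies on every node for simplicity, and uses an HMAC tag as a stand-in for a real signature (a real design would presumably use public-key signatures so that nodes could verify uploads too). All names (`Node`, `owner_id`, `make_tag`, `upload`, `download`) are made up for illustration.

```python
import hashlib
import hmac


class Node:
    """Hypothetical storage node: keeps blobs indexed by (owner ID, file name)."""

    def __init__(self):
        self.blobs = {}

    def put(self, owner, name, data, tag):
        self.blobs[(owner, name)] = (data, tag)

    def get(self, owner, name):
        return self.blobs.get((owner, name))


def owner_id(key: bytes) -> str:
    # Public identifier derived from the client's secret key (an assumption).
    return hashlib.sha256(b"owner-id:" + key).hexdigest()


def make_tag(key: bytes, name: str, data: bytes) -> bytes:
    # HMAC stands in for a signature; it binds the data to the key and the name.
    return hmac.new(key, name.encode() + b"\0" + data, hashlib.sha256).digest()


def upload(nodes, key, name, data):
    tag = make_tag(key, name, data)
    for node in nodes:
        node.put(owner_id(key), name, data, tag)


def download(nodes, key, name):
    # No satellite involved: ask the nodes directly and accept the first copy
    # whose tag verifies under the client's key.
    for node in nodes:
        found = node.get(owner_id(key), name)
        if found is None:
            continue
        data, tag = found
        if hmac.compare_digest(tag, make_tag(key, name, data)):
            return data
    raise LookupError(f"no valid copy of {name!r} found")
```

In this sketch the client only needs its key and a list of nodes to get the file back, and a node that lost or altered its copy is simply skipped.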

I agree about running one’s own satellite. Faced with a choice between that and storing data on their own servers, I think most would choose their own servers.

However, the way the current system works, the data might as well be stored on the satellite itself - it does not add any reliability to distribute the data among multiple nodes.

I am not saying that the satellite is unreliable or anything like that. However, from the viewpoint of the client, storing data on Amazon or storing it on Tardigrade is exactly the same as far as reliability is concerned. Maybe Amazon is even more reliable, being a bigger company and probably using older, time-proven technologies.

I think comparison with BitTorrent is correct here. The satellite acts like the tracker. However, the difference is that usually, I can download the same file using many trackers. Here, I would have to upload the file to every satellite and pay for multiple copies - the nodes would not even dedup the data. So, each satellite acts like a separate file server, no matter if the files themselves are stored in the satellite itself or distributed to the nodes.

There is always a balance between the ideal state and reality.
For me, decentralization means:

  1. Speed (because it’s distributed)
  2. My data is not stored in a single datacenter
  3. I have access from anywhere
  4. Nobody except me can read my data

I want to have distributed access too (in case of a cluster failure), but I do not want to run a full node to achieve that, unless it increases the speed, is as simple as a single binary, and does not require downloading a full blockchain (or whatever) first.

Thanks for your explanation. We just have different expectations of “full decentralization” and different use cases, as I mentioned before.

Let’s compare with traditional, “centralized” services, like Amazon, Google etc.

Amazon is fast, but this is probably faster.

I do not know how Amazon stores the data, but it’s probably replicated as well. Not sure about different datacenters though.

Same with Amazon.

Same with Amazon, though you may have to encrypt the files as a separate step.

So, is Amazon “decentralized”?

Here’s what Amazon and Storj have in common though:

  1. People with guns come to the company office and order it to delete my data - my data is gone.
  2. Something really bad happens to the central server - my data is gone.
  3. The company goes out of business or decides to terminate this particular service - my data is gone.

Yeah, I guess we have a different definition of “decentralized” than Storj Labs does.

Not by default. Only within a cluster in one selected datacenter.

No, you need to encrypt it first with an additional tool, then upload the data.

I’m here to share my own opinion, not Storj Labs’ opinion.

What I want is to get an explanation of decentralization from the community.

This will bring up a new problem. This is a paid service. Each satellite operator could have its own pricing for access. However, that is part of decentralization too: we would need decentralized payments as well. And perhaps for data exchange?

OK, sorry for assuming.

OK, yeah, it’s a separate service, but someone can have his files in multiple Amazon Availability Zones.
And I assume each satellite is in multiple datacenters.

Which is why I wrote “as a separate step”. The only difference here is the tool used to upload the files, not the server. Storj-uplink encrypts the files itself, there is no reason why an S3-uplink cannot do the same.

I use the term “decentralization” the way it is used in regards to P2P networks.

  1. Napster was centralized.
  2. BitTorrent is an interesting thing:
    2a. Private trackers are “separate networks”. One goes down, all its files are inaccessible. However, a specific file is usually available in multiple trackers.
    2b. DHT (trackerless) is decentralized - as long as there are some nodes with the file, the file is accessible. There can even be multiple “hash-only” trackers that help coordinate the nodes, but are not critical.

IMO, Storj operates the same way as private trackers do, except that here, my files are only available on one satellite. If it goes down, my files go down with it.

Yes, I guess that’s why the alternatives (Sia, Filecoin etc) are more complicated.

Yeah, something like that. For example, let’s say a satellite goes down. I could use my key with another satellite and “move my files” there. The satellite checks if it can access the nodes with my data and then accepts the move or doesn’t. Now I pay that satellite instead of the original one.

Or, a satellite could be just an access provider - I want to download my files, I connect to any satellite with my key and download them (and pay for it of course).
I have to contract one satellite to do repairs and audits (and pay it for storage), but I can switch to another if it goes down, as long as I do it in a reasonable time (or the nodes will delete my data because they did not get paid).

In these cases, if something really bad happened, I could set up my own satellite to just pull my files out of the nodes (and pay them) for the last time instead of knowing that my files are out there, but there is no way to access them.
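
The “move my files” check described above could look roughly like the following Python sketch. This is only an illustration under assumptions, not how Storj satellites work: `Segment`, `Satellite`, and `accept_move` are hypothetical names, and the `required` threshold (how many pieces must be reachable for a segment to count as recoverable) is my own assumption, since the post only says that the satellite checks whether it can access the nodes holding the data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Segment:
    piece_nodes: List[str]  # IDs of nodes holding pieces of this segment
    required: int           # minimum reachable pieces needed to rebuild it


@dataclass
class Satellite:
    reachable_nodes: Set[str]
    files: Dict[Tuple[str, str], List[Segment]] = field(default_factory=dict)

    def accept_move(self, owner: str, name: str, segments: List[Segment]) -> bool:
        """Take over a file only if every segment is still recoverable."""
        for seg in segments:
            reachable = sum(1 for n in seg.piece_nodes if n in self.reachable_nodes)
            if reachable < seg.required:
                return False  # refuse: this satellite could not serve the file
        self.files[(owner, name)] = segments
        return True
```

A satellite that accepts the move would then be the one paid for audits and repairs; one that refuses leaves the client free to try another.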

I have a couple of questions for folks here:

  1. Not counting massive web-mail hosts like Gmail, is email sufficiently decentralized?
  2. How about Mastodon? Is Mastodon a sufficiently decentralized Twitter replacement?

Folks here would like to see your position and your vision, so please don’t be afraid to share them.

An email service is centralized: it usually runs on one server, and even Gmail is centralized, since the service depends on one company. The protocol itself is decentralized, however, since I can run my own email server and not depend on any service provider. Gmail may go away, but my emails on my server will not disappear.

I did not know about it until now, so my opinion is based on the short “how it works” video on the site and some quick reading.
It looks like Mastodon is essentially something like separate forums (called communities), but here users of one community can send messages to a user of another community, like email.
So, if one community disappears, the content there is lost, but the users can still reach each other (I assume a user can either join multiple communities or leave one and join the other while keeping his identity).
So no, not really “decentralized”. I mean, OK, not as centralized as Twitter, but still.

In comparison, however, Storj, as far as my files are concerned, looks as centralized as Twitter. It does not matter if there are 5000 satellites run by 4000 companies. If the satellite with my files goes away, so do my files. So, each satellite is its own isolated Twitter.

Here is an example of a decentralized open-source service.

Mastodon is decentralized. I think it’s important to think about the distinction between decentralization and having different networks. Because Mastodon is decentralized, everyone can access other instances, and every instance can be owned by a person or community. (However, the content is not stored in a decentralized way: if one community goes down, its content is gone. But the network itself is decentralized.) [If I understood it correctly from that 2-minute video clip on their website.]
Tardigrade has different networks in which the data is decentralized. The control of the network is centralized. Each satellite creates its own network and since the satellites don’t talk to each other, each satellite is an isolated network. Therefore Tardigrade itself is not a decentralized network but offers multiple separate networks for decentralized storage.
It could become a decentralized network if the satellites exchanged data and I could access my data from any satellite, at least from a technical point of view.
Tardigrade would still be owned by a single corporation and is therefore still a centralized solution, just as Gmail is Google’s centralized solution for mail even though the service may be spread across multiple servers, clusters, and datacenters. Tardigrade likewise stores its data in a decentralized way, just a different one.

To be honest, our position and vision with regard to this topic is best explained by this blog post: The Electric Car Example Applied to Decentralized Cloud Storage

Ultimately, I’m not super interested in “decentralization” “purity tests.” Instead, decentralization to me is just a tool, one way to solve a set of problems. It might be worth laying out the set of problems one is interested in solving, and then go about solving them with all tools available. Let’s not lose the forest for the trees - decentralization is useless if it makes no one’s lives better, so let’s start with a list of ways we want to make people’s lives better and then go and solve those problems without constraining our solution space.

To me, I’m interested in data durability and reliability across a large number of dimensions, and separately and personally, I’m interested in breaking the hegemony of internet oligopoly providers. The internet should be by and for the public, and Google, Amazon, Microsoft, Facebook, etc, have too much control.

For me, the future of the internet looks much less centrally controlled, and as we outline in the blog, the first step to that is to provide a compelling and better alternative to Amazon, Google, etc, by their own metrics (speed, durability, etc), and then the second step (one of our main goals for next year) is to build an ecosystem around that alternative (community-run Satellites).

Sure, I would prefer an economically sustainable, enterprise-grade, 100% decentralized (byzantine fault tolerant) service, but purely from an engineering perspective I don’t believe that those three goals are all simultaneously achievable, today. Fundamentally, AWS S3-throughput scale byzantine fault tolerant performance is just really hard. This might be possible soon! I’m especially interested in zkRollups. But that’s still too far out for us to build a business on, for where we are in the product cycle, today. Instead, we believe we can achieve our end goals (focusing on the desired result, not which tools we use) of a rock-solid service while providing a better alternative to the current internet oligopolic hegemony with (b), below:

[Image: diagram including the option referred to as (b)]

(b) worked for email for the last 60(!) years. Anyone can run their own email server (though I admit that analogy is getting harder due to DKIM/DMARC/SPF).

I’m sorry, I can’t take this position seriously. Email is a poster-child of a robust, decentralized service. It’s especially decentralized if Mastodon is. But that’s all semantics. Instead of arguing about definitions of words, we should figure out what change we want to make in the world and do that, independent of the words we use.

So that’s my position.

The email protocol as a whole may be decentralized, but my messages on Google’s servers (or even my own server) are not.

And IMO Tardigrade is a good service, I just think that calling it, in its current state, “decentralized” is misleading. Just like it would be misleading to call Gmail “decentralized”, even if Google replicates my messages to multiple servers.

Sure, I agree us-central-1.tardigrade.io is like Gmail, no question, but it is our goal to build email, not Gmail. The fact that we’re building Gmail as well is a part of the strategy.

To me, some of the discussion around wanting a storage platform to be S3-level performant and 100% decentralized is a bit like asking for flying cars. Sure! They’re possible in theory, but there are a ton of engineering and logistical challenges. Flying cars have been just years away for decades. Community Satellites are coming up on our roadmap of course, but decentralizing a specific Satellite itself (making Gmail itself be decentralized, in the email analogy) is an engineering feat that does not yet exist. I think the argument it should exist would be much easier to have if there were any examples to point to. There are none, and I’m counting Sia (Sia nodes are like our community Satellites, when we have them, Sia is probably the closest here), and Filecoin (I keep trying to get Filecoin architects to read https://www.usenix.org/legacy/events/hotos03/tech/full_papers/blake/blake.pdf, I think they’re destined for disaster) as non-examples. They’re good attempts at exploring the design space of possibilities, but I don’t think they hit the desired end state out of the park by any definition.

Yes, sorry, I forgot about the first sentence of my message, and so it didn’t even fit with the rest.
Email as a network is indeed decentralized like Mastodon is. The data however is just as centralized as Mastodon.

And even though I think Tardigrade is just as centralized as Google and could easily “control” content (encryption aside, Storj Labs has authority over the data and can delete it or shut down the network), the data is decentralized and it’s a good business model. I don’t think it is necessary to have a fully decentralized distributed network. That is simply not possible at the moment. So personally I never wanted Storj Labs to achieve everything. The current model is great, and if satellites could sync their data it would be even better. But more is not really necessary.

I can only comment on what is here now. Maybe after the 5 year plan is achieved, it will be different.

There should be a way for a client to get his metadata from one satellite and upload it to another, effectively “moving his files” to that satellite without having to download, delete and re-upload them.
Or have a local backup of the metadata in case the satellite disappears.
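
As a rough illustration of the local-backup idea, here is a small Python sketch. It assumes the metadata can be represented as a plain dictionary mapping file names to piece locations, which is a simplification I am making up for the example. `export_metadata` and `import_metadata` are hypothetical helpers, not part of any Storj tool.

```python
import hashlib
import json


def export_metadata(records: dict) -> str:
    """Serialize file-to-pieces records with a checksum for later verification."""
    payload = json.dumps(records, sort_keys=True)
    checksum = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps({"payload": payload, "sha256": checksum})


def import_metadata(backup: str) -> dict:
    """Load a backup, refusing it if the checksum does not match."""
    wrapper = json.loads(backup)
    payload = wrapper["payload"]
    if hashlib.sha256(payload.encode()).hexdigest() != wrapper["sha256"]:
        raise ValueError("metadata backup is corrupted")
    return json.loads(payload)
```

The exported string could be kept locally and later handed to another satellite, or to one’s own, if the original satellite disappears.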

Oh, I totally agree and want to do this. This was originally slated for production release but had to get pushed back. We have so much to do! Ultimately the way I see it, relative to the whitepaper, we’re still very much a work in progress, but the support of everyone here in the community is why we even have a chance to pursue adding future features like this! So thanks to all of you!