Don’t forget about Filecoin for important backups, guys

OK…

However, Filecoin is the… “Airbnb” of Web3.0 per the Holon article.

Why do I need to message this specific human on a slack channel in order to find his service node on Filecoin?

Dishonest entities have been known to do some interesting things…

What I’m looking for, as a generalized buyer of assured data storage, is an independently verified and automated method of ensuring that a dishonest provider isn’t going to be a problem.

Storj does this.

Storj actually does ensure that dishonest SNOs will absolutely not be a problem.

How does Filecoin do this in a way that is transparent and independently assured to the client?

The blockchain doesn’t have the ability to do that… because the blockchain is only reactive to a problem.

The standards don’t do that… because the standards are not enforceable in real time along with assured data recovery if a dishonest provider is on the network.

Without a repair algorithm and a verifiable — and paid for — method of recovery, there can be no assurance of data integrity.


Assurance defined:

https://csrc.nist.gov/glossary/term/information_assurance

Measures that protect and defend information and information systems by ensuring their availability, integrity, authentication, confidentiality, and non-repudiation. These measures include providing for restoration of information systems by incorporating protection, detection, and reaction capabilities.


I think we need to account for the environmental impact too.
Since FC (currently) has only block rewards for storage providers and the FC blockchain uses PoW, it has a high impact on the environment.
Even if they implement a more robust audit system and the possibility to repair corrupted data sometime in the future, they still use replication as a reliability mechanism and PoW for rewards.
See The Green Case for Storj

So I personally think that FC cannot be used for robust, long-lived backups. Not only can the data be lost or corrupted, it will also harm the environment.


Yeah man, thanks for caring about my important backups, I sure would not wanna lose those!

Oh :see_no_evil:

Oh you mean like the unpaid test net satellites introduced here and here.

Yeah, I wasn’t counting those on the Storj end either. All other satellites are paid as well. That still doesn’t provide evidence that the data stored on either paid satellites or paid filecoin storage providers is actual “real” data and not random noise. It just means someone is paying for it. In both cases.

I appreciate the links btw. But it seems the vast majority of data usage is slingshot
image
I’m a little surprised to see this add up to 37PiB, when elsewhere the total usage was reported as 33PiB (though 37PB, maybe it’s a unit confusion). However, I don’t want to nitpick, but it seems that without that one use case there isn’t much else left.
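
On the unit question: a quick check suggests the two figures may well be the same number in different units. This is only the standard binary/decimal conversion (1 PiB = 2^50 bytes, 1 PB = 10^15 bytes); the helper name `pib_to_pb` is mine, and it assumes the 33 figure really was reported in PiB.

```python
# Binary vs. decimal storage units.
PIB = 2**50   # bytes in one pebibyte (PiB)
PB = 10**15   # bytes in one petabyte (PB)


def pib_to_pb(pib: float) -> float:
    """Convert a size in pebibytes to petabytes."""
    return pib * PIB / PB


# Assumption: the "33" total was in PiB, the "37" total in PB.
print(f"33 PiB = {pib_to_pb(33):.2f} PB")  # prints "33 PiB = 37.15 PB"
```

So 33 PiB and roughly 37 PB would describe the same amount of data, which fits the unit-confusion guess.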

I don’t really know why this is still a point of discussion. All you have to do is read the Filecoin FAQ.

After I made a deal with a storage provider and sent my data to them, how exactly is the data supposed to be recoverable and healable if that storage provider goes down?
Automatic repair of faulted data is a feature we’ve pushed off until after the mainnet launch. For now, the way to ensure resiliency is to store your data with multiple storage providers, to gain some level of redundancy. If you want to learn more about how we are thinking about repair in the future, here are some notes.
How do I know that my storage provider will not charge prohibitively high costs for data retrieval?
To avoid extortion, always ensure you store your data with a fairly decentralized set of storage providers (and note: it’s pretty difficult for a storage provider to be sure they are the only person storing a particular piece of data, especially if you encrypt the data).
Miners currently provide a ‘dumb box’ interface and will serve anyone any data they have. Maybe in the future, storage providers will offer ACLs and logins and such, but that requires that you trust the storage provider. The recommended (and safest) approach here is to encrypt data you don’t want others to see yourself before storing it.

They clearly acknowledge that file loss and even extortion are valid concerns, and their response is that you can mitigate that by using multiple storage providers, basically punting the responsibility back to you as the client. If you trust a single entity there is a risk of loss; if you trust 2, there is a smaller risk of loss, but that risk doesn’t just go away. And with no repair integrated (yet), availability can only get worse and not better.
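
To make the “worse and not better” point concrete, here is a minimal sketch under loud assumptions: each provider fails independently with the same made-up probability `p` per period, and “repair” means topping the replica count back up after every period in which at least one copy survived. Neither the numbers nor the function names come from Filecoin; they are purely illustrative.

```python
def loss_without_repair(p: float, replicas: int, periods: int) -> float:
    """Chance that all replicas are gone after `periods`,
    when a failed replica is never replaced."""
    # A single replica is dead if it failed in any of the periods.
    replica_dead = 1 - (1 - p) ** periods
    return replica_dead ** replicas


def loss_with_repair(p: float, replicas: int, periods: int) -> float:
    """Chance of loss when the replica count is restored after every
    period, provided at least one copy survived that period."""
    survive_one_period = 1 - p ** replicas
    return 1 - survive_one_period ** periods


# Hypothetical inputs: 2% failure chance per period, 24 periods.
P, PERIODS = 0.02, 24
for n in (1, 2, 3):
    print(
        f"{n} replica(s): "
        f"no repair {loss_without_repair(P, n, PERIODS):.4f}, "
        f"with repair {loss_with_repair(P, n, PERIODS):.6f}"
    )
```

In this toy model, adding replicas lowers the risk in both cases, but without repair the risk keeps growing with every period, because lost copies are never replaced.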

It strikes me that there is really nothing decentralized about the data storage itself unless you make it decentralized for yourself. Though that really feels to me like “I stored my data with copies on AWS, GCP and Azure, so it’s decentralized!”. What is decentralized is the deal making processes, payment and data auditing. But data storage really isn’t unless you as a user ensure a distributed set of storage operators. And how many replications should you upload then? If you create 3 then you already have a larger expansion factor than the maximum on Storj (about 2.7) and you’re still nowhere near the 11 9’s of durability that Storj offers.
Let’s just link a reference here: https://oceanstore.cs.berkeley.edu/publications/papers/pdf/erasure_iptps.pdf
And if you’re ok with first party info (section 3.4): https://www.storj.io/storjv3.pdf
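
As a rough illustration of the replication versus erasure coding argument, here is a sketch comparing the two. Assumptions: pieces fail independently with a hypothetical probability `p`, nothing is repaired, and the 29-of-80 scheme is picked only because its expansion factor lands near the ~2.7 mentioned above. It is a one-shot toy model, so it does not reproduce any published durability figure.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Stored bytes per byte of original data for a k-of-n scheme."""
    return n / k


def replication_loss(p: float, copies: int) -> float:
    """Data is lost only if every full copy fails."""
    return p ** copies


def erasure_loss(p: float, k: int, n: int) -> float:
    """Data is lost if fewer than k of the n pieces survive."""
    q = 1 - p  # survival probability of one piece
    return sum(comb(n, i) * q**i * p ** (n - i) for i in range(k))


P = 0.1  # hypothetical per-piece failure probability
print(f"3 replicas : expansion {expansion_factor(1, 3):.2f}, "
      f"loss {replication_loss(P, 3):.3e}")
print(f"29-of-80 EC: expansion {expansion_factor(29, 80):.2f}, "
      f"loss {erasure_loss(P, 29, 80):.3e}")
```

Under these assumptions the erasure-coded layout uses less space than three full copies while its loss probability is many orders of magnitude smaller, which is the trade-off both linked documents argue for.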

I think it’s possible to build services on top of filecoin to implement erasure coding and repair. At which point that can probably be done with less stringent requirements on expansion factor, because of the higher requirements for individual nodes. But you would be reliant on some sort of orchestrator… say a satellite perhaps? At which point the remarks on Storj not being as decentralized would immediately apply to filecoin as well. I personally don’t think that is an issue, especially if there are several such services to choose from. Which is also why I hope to see more development on support for storj community satellites soon.

But getting back to the remark that there is no widespread evidence of file loss. You (probably unintentionally) actually linked to pages yourself that refer to data loss for Slingshot, when you linked to their pages about repair functionality.

Slingshot Repair is an effort dedicated to restoring the copies of Slingshot data that were lost as part of recent events on Dec 13, 2021. Unlike the Restore program, this program is targeted at the data that still has at least 1 replica in an active deal in the Filecoin network (there are ~54TiBs of such unique data).

So ~54TiBs of data that still has a replica… but what is that ominous Restore program they are referring to, for which this is apparently not the case? Let’s look at the linked doc.

Objective
The Slingshot Restore Program is an initiative to restore ~12.5PiB of Slingshot data on Filecoin as quickly as possible.

Ooooh, only 12.5PiB. Or roughly 1/3rd of all data on filecoin. No big deal.
Perhaps I’m misreading this… but that sounds highly problematic.

All of those links basically only talk about block rewards and going by your own words customers are barely paying anything into the system compared to these block rewards. Could you respond to the question of what happens when those dry up? Because these links don’t provide an answer from what I can tell.

So that’s a link to a press release about an investment fund that doesn’t mention any certifications or SLAs at all. I did some Googling to see if I could find that anywhere else… but no. Do you have actual evidence of this? Because it kind of feels like you just linked something random and unrelated.

You didn’t post an example of clients with use cases of hundreds of PiB. You posted a link that shows there is one client accounting for almost 80% of all data stored, at 30PiB. The rest are 6PiB or lower. Where are the multiple clients with intentions of hundreds of PiB?

It is, you count all paid data on Filecoin, but add an additional requirement for Storj. All data on the trusted satellites is paid for. There is 9.6PiB of paid data on Storj. That’s what you should be comparing to by your own definition.


Does Storj publish a list of all of its SNOs? Why in the world would Filecoin publish this information? This is Web3, where participants can choose their level of public information or choose to stay anonymous. You are clearly thinking with a centralized authority mindset. Some do choose to publish their information, some don’t; anyone familiar with blockchain understands this.

Obviously you are far more comfortable relying on a central trusted authority. Why not just use AWS or GCP? They do sharding and encryption. I think you miss the whole point of Web3 and open marketplaces and open transparent systems.

Actually it is not PoW but proof of storage: Proof-of-Spacetime (PoSt). You might want to Google the green aspects of FC. Several of the providers use data centers with 100% renewable energy; it’s hard to imagine all SNOs use green energy.

The Holon “article” claimed Filecoin is the Airbnb of decentralized storage…

So…

Where is the Airbnb experience to be found?



I know this thread is just going to go in circles for ever. That’s fine, and sort of the point.

You run a Filecoin node. I didn’t know exactly what that meant before this thread. I greatly appreciate the information you’ve provided on Filecoin. I also greatly appreciate you supporting the Storj network by running a few nodes.

You seemed to want an honest debate of the merits of Storj vs. Filecoin. Hopefully the debate didn’t get too personal. I don’t think that was the intent of anyone, certainly not me. For myself, I just didn’t know enough to enter the debate without first getting some pointers to how Filecoin works and what’s involved in running a Node.

I’ve gotten what I was looking for… a better understanding of what I was looking at and finding about Filecoin.

I’ve posted WAY TOO MANY replies on this thread. So, this will be my last reply on this one.

Some of this USC data is still being ingested.

Yes, just as I pointed out, there are mitigations.

Actually Filecoin is decentralized, as I pointed out with examples above. Storj is not at all decentralized despite the marketing claims and is no more than AWS or GCP with outsourced JBODs.

Actually this shows one case, not ‘widespread loss’ and certainly no evidence has been presented of ‘many complaints about file loss’.

And it is nearly restored already. Customers will not have noticed given the replicas stored on other providers, which is part of the requirements of Slingshot. Don’t skip over this either:

The Filecoin network similarly helps ensure resiliency despite localized failures by:

  • Creating many copies of data across many different geographic locations (the Slingshot competition, for example, rewards up to 10 independent copies of data to increase resiliency)
  • Replicating data stored in the network across many independent storage providers (clients usually store deals with 4-6 independent providers)
  • Repairing deals in light of failures by creating new replicas with new storage providers when an SP goes offline, maintaining a high storage redundancy rate over time.

I saw a cut sheet from Jonathan and took note of it. You asked for an example of a storage provider and I provided it. Now you can choose to ignore it or follow up.

This whole thread started as a result of me correcting inaccurate statements. It does not do the Storj community any benefit by throwing out strawmen and false assertions.

I know you know this isn’t relevant, because you know how Storj works. As a customer, your trust relationship is not with the individual node operators, but with the satellite. If community satellites pop up that don’t provide any information on their repair architecture and controls, I wouldn’t entrust my data to them either. With Storj satellites you get an SLA with uptime and 11 9’s of durability. You know exactly what you get.

Because on filecoin your trust relationship IS with the storage provider. I don’t care if filecoin publishes it or the storage provider themselves. But someone should. Otherwise you just entrust your data blindly to some entity.

Neither @anon27637763, nor I have even hinted towards this. We’ve just asked for some guarantees. We’re comfortable with knowing what we put trust into, before we send important data there. That’s all.

Again with the moving goal posts… come on man…

I agree that FC is certainly not as bad as PoW systems. But it still requires a lot higher expansion factor (HDD space) and processing power than Storj does.

And for what it’s worth, my nodes use no power at all. There isn’t a device running or a disk spinning that wouldn’t have been had I not been running Storj. Pretty hard to beat that.

I would like to just say the same. Debate is how we learn. I appreciate both of you for the info provided!

Right, mitigations, not solutions. And it’s up to you as a customer to implement them. That’s a massive difference compared to Storj. And that’s fine, but let’s not beat around the bush. This is a win in the Storj column.

You keep saying you’ve pointed things out when you didn’t. You as a customer must store data in multiple places. You could do that with any old set of data centers. Filecoin is just small centralized datacenters on a decentralized deal platform, with the added bonus of not being able to trust them based on airtight SLAs. (See I can do oversimplification too)

Airbnb is decentralized living! (I know, Airbnb isn’t on the blockchain. But you know… business idea! Time for a proof of meatspace consensus algorithm! :slight_smile: )

Yep, just a third of all data stored. No biggie.

Then what am I misreading here?

Unlike the Restore program, this program is targeted at the data that still has at least 1 replica in an active deal in the Filecoin network (there are ~54TiBs of such unique data).

That reads to me as saying the Restore program is for the data that DOESN’T still have at least 1 replica in an active deal. And that Restore program covered 1/3rd of all data on filecoin.
I don’t know whether that means they recovered it from the original source or from storage of expired deals.

Did I though? Or did I specifically ask for one that has these certifications?

Hmmm… You could have just scrolled up after I addressed that you didn’t answer my question.

See, I will take these accusations seriously if you point out where I set up a strawman or made a false assertion. I believe all I did was ask for examples and evidence, and provide examples of my own. But I’m not perfect, I’m open to being corrected. But I can’t respond to these open accusations that don’t address what I actually said.

I’m even asking you where I misread things because you provide some counter evidence. That’s genuine, if I misunderstand, I want to know. When you linked to an article that didn’t contain an answer to the question I asked, I tried to Google it myself to give you the benefit of the doubt. I don’t think I have been unreasonable towards you, despite you dismissing both me and @anon27637763 as “just not knowing enough” or suggesting ulterior motives. But if you disagree, please be specific and I’ll address it.

I have pointed out what decentralization and Web 3 means several times in this thread.
Key attributes of a Web3 decentralized system:

  • User control - Clients can decide who to store with, at a price they choose, with as many providers as they choose, with providers that meet their requirements
  • Public Blockchain - Provides security, auditability and transparency of [storage] transactions, not trusting a central authority to control the system
  • Open Source - code and communities transparent, democratic and distributed decision making
  • Verifiable - without requiring special access
  • Permissionless - access not dependent upon special privileges
  • Trustless - algorithm and actions determine trust not central authorities
  • Native built-in payments - transactions based on tokens

Actually, it was one replica of 1/3 of the stored data. Actually not a big deal except for the storage provider that suffered a $1m penalty and is actively being supported by the community to restore his lost data.

Yes and once again you ignore that I mentioned that there is nothing inherently decentralized about the DATA STORAGE part of filecoin, unless you as a customer store your data in more than one place. Which is just true. And your lecture on web3, which really wasn’t necessary, doesn’t address any of that.

I’ve quoted twice now the exact text that said this was about data that didn’t have at least 1 replica left on filecoin and asked you specifically how I misread that if it wasn’t correct. You failed to address that too.

It seems we’ve reached the end of usefulness of this conversation and we’re just talking past each other now. So I’m gonna follow beast’s example, thank you for the conversation and wish you a nice day!

You simply never presented any evidence of your claim. I just gave you a list of the attributes of Web3 decentralization and Filecoin is fully decentralized. You are mistaken to believe that decentralized means that the storage media is highly distributed. If that were the case then Azure and AWS and GCP would by your definition be decentralized. I have yet to see you counter any of my assertions on this. This is not simply my opinion, but as you read third party material you will see this is the case.

I think the thread is stuck.
There is no SLA, there is no data decentralization, it has only a rudimentary audit once a day, there is no repair, and it uses replication as a reliability mechanism and high-end, powerful specialized hardware, which harms the environment.

Backups may be lost (this has happened several times, and is even addressed in their FAQ) unless you use several providers.

As a customer, I would say that FC is not ready for important backups yet.

Maybe it’s ok for mining, but not for storage at this stage.
