Disk usage discrepancy?

This command will work only if you add all untrusted satellites to the untrusted list, as specified in the article. If you did not, you need to provide them directly in the command, like this:

You should not disable the filewalker, only the lazy mode.

Okay that seems to work for me:

I have enabled the filewalker again and hope that will help with the file count.

I guess I will have to keep watching to see if the used space decreases.

I know :wink:

I am also considering creating a YouTube video to explain it.

But until then, here is some more context:

The Satellite (metadata server) and the Storagenodes should agree on which pieces are stored.

There are multiple solutions for this. For example, with Apache Hadoop HDFS / Ozone, the Storagenodes (they have different names there, but I use the Storj names here) report the stored pieces back to the metadata server.

This has hard scalability issues, as the reports can be very large. To fix this, they implemented incremental reports, which have their own problems…

Storj uses the opposite direction: the Satellite sends the list of stored pieces to the Storagenodes, and all pieces which are not in the list can be deleted.

But the full list would still be huge (gigabytes). Instead of a huge list, Storj uses Bloom filters, which are a probabilistic data structure.

It can categorize each piece as:

  • surely can be deleted (the piece is definitely not in the filter)
  • should be kept (the piece matches the filter, which may occasionally be a false positive)

The Bloom filter is very small (around 1-2 MB), but in exchange it may miss some deletes (it is never wrong about the files which should be kept). Eventually, though, all the deleted files will be removed. (0.5-1.5% overhead is possible, but seeing 7 TB used space vs 4 TB reported by the satellite is a bug.)
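Roughly, the garbage-collection decision on the node looks like this. The sketch below is illustrative only: a toy Bloom filter built on salted FNV hashing with made-up piece IDs, not the actual storagenode code.

```go
// Illustrative sketch only: a toy Bloom-filter-based GC decision, not the
// actual storagenode code. Piece IDs are plain strings here for simplicity.
package main

import (
	"fmt"
	"hash/fnv"
)

// bloomFilter is a tiny Bloom filter using k salted FNV-1a hash functions.
type bloomFilter struct {
	bits []bool
	k    int
}

func newBloomFilter(mBits, k int) *bloomFilter {
	return &bloomFilter{bits: make([]bool, mBits), k: k}
}

// positions derives k bit positions for a piece ID by salting the hash input.
func (b *bloomFilter) positions(pieceID string) []int {
	pos := make([]int, b.k)
	for i := 0; i < b.k; i++ {
		h := fnv.New64a()
		fmt.Fprintf(h, "%d:%s", i, pieceID)
		pos[i] = int(h.Sum64() % uint64(len(b.bits)))
	}
	return pos
}

// Add marks a piece the satellite still knows about ("keep this").
func (b *bloomFilter) Add(pieceID string) {
	for _, p := range b.positions(pieceID) {
		b.bits[p] = true
	}
}

// MightContain returns false only when the piece is definitely not in the
// filter; a true result may be a false positive.
func (b *bloomFilter) MightContain(pieceID string) bool {
	for _, p := range b.positions(pieceID) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	// The satellite builds the filter from the pieces it still tracks ...
	keep := newBloomFilter(1<<20, 7)
	keep.Add("piece-A")
	keep.Add("piece-B")

	// ... and the node walks its stored pieces: anything not matching the
	// filter is definitely no longer needed and can go to the trash.
	for _, stored := range []string{"piece-A", "piece-B", "piece-deleted-by-customer"} {
		if keep.MightContain(stored) {
			fmt.Println("keep (possibly a false positive):", stored)
		} else {
			fmt.Println("move to trash:", stored)
		}
	}
}
```

The key property is that MightContain can return a false positive (a deletable piece survives one more round), but never a false negative, which is why a piece that should be kept is never deleted by mistake.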

An issue was just created to double-check the current behavior / parameters of the full Bloom filter.

@Alexey

It needs to pass several GC filewalker runs, so it may take several weeks.

Just to understand: after having run the forget-satellite command, do I then need to wait a week for GC to come and clean it up? I still have not regained a good amount of space, so this is still an issue :slight_smile:

Hello elek!

We all know that satellites have performance problems.
What the reason is - the Storj architecture, the desire to save satellite resources, or both - unfortunately, this topic has not been covered.
Yes, this probably does not matter for us node operators, because before the Bloom filter there was no such thing.

To delete or not to delete - that is the question.
For the node operator this does not amount to 1%; in my case, for example, it is 15-20% on old nodes.

So saving satellite resources with the Bloom filter is good, but it seems that node operators do not receive payment for those extra pieces they store.

There are also a few more points that are not entirely clear to me.

  1. Why, after shutting down the satellites, did Storj not simply release a new version of the software that would remove any traces of the decommissioned satellites? As it is, it is monkey work for operators to go and check whether the directories have been deleted and to execute additional commands.

  2. Why does the filewalker always start walking from scratch - why can't it store its progress per satellite and resume not from the beginning, but from where it previously stopped?

  3. I think the Bloom filter has not passed the reality check, and we need to return to the previous deletion scheme, perhaps sending not single deletions but whole batches of deletions in one command. And it should be a transactional model, not a probabilistic one. Otherwise, unaccounted data on the nodes will only grow.

Unaccounted data is the word.
Do I understand correctly: a customer deletes data. Instead of it being deleted immediately and the space on the SNO's node being freed, it takes a week until the Bloom filter has been prepared and sent out. Then the data is moved to the trash, where it resides for another week. And additionally there is still data left to delete, because the Bloom filter does not cover it all, so it needs several passes, i.e. weeks, until the data is finally cleared? And all of that unpaid for the SNO?

The Bloom filter does not delete transactionally - guaranteed and documented - but probabilistically: maybe it deletes, maybe it does not. These are two big differences.

I would rather avoid this kind of generic statement. Satellites are working well, thanks to continuous improvements. Usage is also increasing, therefore newer and newer challenges have to be solved.

The reason behind the async deletes can be found in the design doc.

Because of decentralization, especially federation, which is a form of decentralization. What you describe is possible, but the Storj software tries to be generic and usable with any satellite, depending on the decision of the Storage Node Operator.

Again, I am a technical person; my main focus is solving technical problems and helping to understand technical questions. Yours is a very generic statement, and based on the technical facts, I have a slightly different opinion.

But I understand your unhappiness: you are affected by a bug and are trying to help with the fix.

From a technical point of view, Bloom filters are working well for most of the nodes, but nodes with a huge number of segments might be affected by a bug (which became a bug only thanks to the expansion of the network). Please follow the linked issue for the progress of the fix. (Thanks to your contribution of the piece list, it is now easy to test any changes.)

Sorry, I didn't get it completely.
So if our nodes have a disk usage discrepancy, the problem will resolve itself by running the node for several weeks, so GC and the filewalker can do their work?

Yes, just stay tuned.

We are aware of one problem, and working on fixes.

  • Small discrepancies are possible (due to the architecture, and because it's a highly distributed system: we couldn't have one snapshot view from the same moment).
  • Only big (>15M pieces per Satellite) Storagenodes are affected by the (known) bug.
  • If you have small Storagenodes and large discrepancies → check whether you deleted the data of the legacy Satellites.
  • If you have a small Storagenode and large discrepancies for active Satellites: let me know your numbers (I believe there is no such case).
I have read the short discussion on GitHub regarding the 45M/25M node.

Even if a larger Bloom filter reduced the number of pieces to around 25M, it would eventually grow again to around 45M legitimate pieces, assuming of course they stay the same size. In that case the filter needs to be large enough to still be effective at 45M pieces that should be kept.
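To get a feel for the numbers: the classic Bloom filter sizing formula is m = -n·ln(p)/(ln 2)² bits, where n is the number of kept pieces and p the false-positive rate. The small sketch below computes the filter size for 25M vs 45M pieces; the 10% false-positive target is an assumption for illustration, not a confirmed satellite parameter.

```go
package main

import (
	"fmt"
	"math"
)

// bloomSizeBytes returns the optimal Bloom filter size in bytes for n items
// at false-positive probability p, using m = -n*ln(p)/(ln 2)^2 bits.
func bloomSizeBytes(n, p float64) float64 {
	mBits := -n * math.Log(p) / (math.Ln2 * math.Ln2)
	return mBits / 8
}

func main() {
	// A 10% false-positive target is an assumption for illustration only.
	const p = 0.10
	for _, n := range []float64{25e6, 45e6} {
		fmt.Printf("%.0fM pieces -> ~%.1f MB filter\n", n/1e6, bloomSizeBytes(n, p)/1e6)
	}
}
```

Under these assumptions that comes out to roughly 15 MB for 25M pieces and 27 MB for 45M pieces - well above the 1-2 MB mentioned earlier, which would be consistent with why very large nodes run into trouble.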

Even larger nodes probably will exist.

I can’t even imagine how many files could be on a full node running one of those new 22TB drives?

(maybe nobody has managed to fill one yet?)

I would expect at least 60-70M files.
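(A rough sanity check, assuming an average piece size of about 330 KB - an assumption, not a measured network value: 22 TB ÷ 330 KB ≈ 66 million pieces, so 60-70M files is plausible.)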

Well, I could count for you - I've got a shy 8 TB and am slowly filling a 10 TB and a 14 TB drive. But I need a command for the Windows GUI, please, anyone - a PowerShell one that produces stats for each satellite folder - because I have a sneaking suspicion that the problem exists on smaller nodes as well, and I want to help find some clues.

So, until a new Bloom filter design is out and we see the results, I would speculate that under the current conditions it is better to run multiple small nodes on one big drive than a single one?
For example, we should split a 22 TB drive into 3 nodes of 7 TB each?
I don't know how the performance of the entire cluster of nodes would be affected, but this seems the logical conclusion.

The node from my earlier post recently got 500 GB marked as trash, as seen above.

I see this as a huge step forward.

Still a bad idea: one drive can't handle 3 nodes - not even with caching, I guess.