Trash comparison thread

Friend’s node in Vancouver, Canada.
Node ID: 1MhTUzg1fzdoZBAU1MNveQq1WpL7FyfQ3X5MDKBxsuXHUaJrba


7.2G    ./pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
6.1M    ./abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa
13G     ./ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa
9.1G    ./v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa
7.1G    ./qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa
230G    ./6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa
266G    .

The biggest one is europe-north-1. But you seem to have more than me on all satellites. So it’s not satellite specific, which to me indicates it’s likely also not specific to certain customers.

1.7G    v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa
189M    pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
574M    qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa
5.3G    6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa
225M    abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa
809M    ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa

forgive me, i’m not even sure i know what a long tail cancellation means…

is that files that complete, but then the satellites cancel them… after the fact…??

well, from what we discussed earlier, if memory serves, and from what my trash looks like, that surely doesn’t seem to be the case… or i win most…

my trash numbers seem stable at around 6-7GB, with a spike to 15GB a few days back…

also it was mostly in jest… i don’t understand the mechanics of it…

but i noticed in the bandwidth comparison thread that the ratio between ingress and stored data wasn’t the same on all nodes, so it just made sense that it had to be related to cancelled uploads in some way…

the details of it… as you well know i’m oblivious to the programming side, so i cannot really speak the lingo or make any arguments on that level…

i’m sure the storjlings will figure out the details and solve it… since it seems to hit some nodes very hard… and others it doesn’t seem to affect at all

i don’t mean to offend anyone or need to prove anything, i just enjoy the community and figuring things out just for the fun of it.
something which i’m sure is shared by quite a few around here


It’s the result of over-provisioned segment uploads: 110 pieces are started, but only 80 are finished. The remaining 30 are long tail cancellations. That’s mostly what the success rates are about as well.
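
To make the over-provisioning concrete, here’s a minimal sketch of that arithmetic (just a sketch; the 110/80 figures are the ones above, and reading the network-wide average success rate as simply finished/started is my simplification):

package main

import "fmt"

func main() {
    // figures from the post above: 110 piece uploads started per segment,
    // only 80 kept once enough have finished
    started := 110
    kept := 80
    canceled := started - kept // the "long tail" that gets canceled

    fmt.Printf("long tail cancellations per segment: %d\n", canceled)
    // network-wide, roughly kept/started of all started piece uploads
    // finish, so an average node's upload success rate hovers around this
    fmt.Printf("average upload success rate: ~%.0f%%\n",
        100*float64(kept)/float64(started))
}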

This is different from the zombie segments mentioned earlier. Those appear when a file has multiple segments (>64MB) and earlier segments have finished, but the upload is interrupted before the file is completed. The segment being uploaded at that moment is canceled on all nodes and won’t end up on them, but the previously finished segments for that same file stay behind and get caught by the zombie reaper on the satellite (apparently twice a week) and then by garbage collection on the nodes’ end.
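
As a toy illustration of the difference (the segment count and structure here are made up for illustration, not the actual uplink or satellite code):

package main

import "fmt"

// Toy model of an interrupted multi-segment upload.
func main() {
    totalSegments := 5 // a file well over 64MB, split into several segments
    interruptedAt := 3 // the connection drops while segment 3 is in flight

    for i := 0; i < totalSegments; i++ {
        switch {
        case i < interruptedAt:
            // finished before the interruption: stays on the nodes as a
            // zombie segment until the satellite's zombie reaper flags it
            // and garbage collection removes it on the nodes' end
            fmt.Printf("segment %d: zombie (finished, but file never committed)\n", i)
        case i == interruptedAt:
            // canceled on all nodes, never ends up stored anywhere
            fmt.Printf("segment %d: canceled in flight\n", i)
        default:
            fmt.Printf("segment %d: never started\n", i)
        }
    }
}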

Btw, I didn’t mean to put you down, it was just not yet the moment to say “told you so”. :wink: You may well end up being right, we just don’t really have the proof yet to pin it down exactly. It may be true that long tail cancellations don’t normally end up in trash… but perhaps there are scenarios in which they can end up there. If you look at the trash sizes per satellite, they do seem to coincide with the satellites that have had the most uploads recently (excluding repair). So it does seem to be something related to uploads. Of course zombie segments are also related to uploads.

takes a lot to put me down… xD i don’t really attribute a lot of meaning to individual words, because, well, in most communication only around 10% is really understood between people… so yeah… once one realizes that’s true, it sort of becomes a crapshoot to try and successfully get the point across to those who actually have a chance of understanding what one really meant…

i’m not saying the words don’t make sense… only that the true intent is more often misunderstood than clearly understood… we are simply biased and see all kinds of patterns and things we want, don’t want, or have different understandings of words… because how do we describe words… with more sequences of words… which again are explained by more sequences of words…

it becomes a pretty confusing and at times very specialized language between all branches of people… hell, you cannot have 5 people working together for a week before the group starts developing its own sort of language… in-jokes, slang and so on

alas i digress… back on point…

so in relation to the zombie hordes: i’m on a wide open internet connection and you likewise, if memory serves… so more speed equals less time exposed and thus fewer zombies… could be as simple as that…

but yeah i dunno… you explained it very elegantly tho… good job
it can be very difficult to relay complex topics

If it were due to zombie segments, it should not be targeted. It should affect all SNOs, since Storj is distributed; that was the case when stefanbenten’s satellite did the zombie purge at the end of June. Right now, I think it is different.

I tend to agree. There is likely some other source of garbage that hasn’t yet been identified. But Stefan did say he would look at some logs. I guess it might help to provide some node IDs for that. Let me get to my node and I’ll add mine to this post.

Edit: 12aYrWFmJqrmhN3zgkvANBTsj2DdLwf2aZC8T5t7CrNazHahKXW
Just noticed that today my trash actually jumped from just over 2GB to close to 10GB. Still not nearly as much as @litori for example.

I updated the node ID, directory sizes, location, and picture for my friend’s node and mine in the posts above.


Node in Romania
Node ID: 1benqCWaV7p4FveprRQGsE8qWXz83xa7WADGDvKv2joNEtTHNr


4.0G    ./qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa
776M    ./pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
63G     ./6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa
6.1G    ./v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa
7.7G    ./ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa
3.3M    ./abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa
81G     .

Node ID: 1FHNA5TLoetSbvd2CW7MTDmL8QD8GRpfAuRGjN4vszadCt2WZ8

Based in The Netherlands


  • 133G 6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa
  • 492M abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa
  • 11G pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
  • 13G qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa
  • 18G ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa
  • 18G v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa

This is a bit of a summary i did of the

https://forum.storj.io/t/bandwidth-utilization-comparison-thread

if you track TheMightyGeek, who has the slowest internet connection of all of those we tracked at the time…
sorry there isn’t more data… it was what i could find with ease…
will do an updated summary soon and try to track this further…

alas

TheMightyGeek’s ingress doesn’t correspond to what his stored total should be the day after…
it should be: past day’s stored + ingress = present day’s stored, and so on and so forth…

his is clearly off, by a significant amount…
10 july: 105 ingress… okay maybe that one was inaccurate… the data isn’t perfect

one can see it from the 11th to the 12th:
on the 11th he has 1.73 TB stored + 117 GB ingress
on the 12th he has 1.82 TB instead of the expected ~1.84 TB, so like 25% of the ingress never got added to stored data

i haven’t had time to dig through all the posts and check if this is a real thing tho… but i really should. i noticed it a few times when working on the summary; not sure if it’s useful for anything, i just figured it was cancelled uploads, but i suppose it might be them zombies…
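
to spell out the check i’m doing, roughly (numbers from the 11th/12th above; this ignores deletes, which could also explain part of the gap):

package main

import "fmt"

func main() {
    // numbers from the 11th and 12th above
    storedDay1 := 1.73e12 // 1.73 TB stored on the 11th
    ingress := 117e9      // ~117 GB ingress
    storedDay2 := 1.82e12 // 1.82 TB stored on the 12th

    expected := storedDay1 + ingress
    shortfall := expected - storedDay2

    fmt.Printf("expected stored: %.2f TB\n", expected/1e12)
    fmt.Printf("actual stored:   %.2f TB\n", storedDay2/1e12)
    fmt.Printf("shortfall: %.0f GB (~%.0f%% of ingress)\n",
        shortfall/1e9, 100*shortfall/ingress)
}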

10 july
Dragonhogan   - ingress 107,53 - egress 67,20 =  6,75 ‰ of stored 10,47 TB
TheMightyGeek - ingress 105,78 - egress  7,13 =  6,81 ‰ of stored  1,6  TB
kevink        - ingress 109,26 - egress 47,75 = 12,76 ‰ of stored  3,74 TB
SGC           - ingress 109,89 - egress 47,02 =  4,07 ‰ of stored 11,54 TB

11 july
SGC           - ingress 118,39 - egress 59,78 =  5,13 ‰ of stored 11,64 TB
TheMightyGeek - ingress 117,14 - egress 11,09 =  6,41 ‰ of stored  1,73 TB
kevink        - ingress 117,66 - egress 44,19 = 11,5  ‰ of stored  3,84 TB
dragonhogan   - ingress 115,75 - egress 73,64 =  6,97 ‰ of stored 10,56 TB

12 july
SGC           - ingress 114,58 - egress 71,56 =  6,09 ‰ of stored 11,75 TB
TheMightyGeek - ingress 113,97 - egress 11,2  =  6,15 ‰ of stored  1,82 TB
kevink        - ingress 113,68 - egress 37,18 =  9,40 ‰ of stored  3,95 TB
dragonhogan   - ingress 111,00 - egress 68,49 =  6,41 ‰ of stored 10,68 TB
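
and just in case the ‰ column is unclear: i read it as egress (GB) relative to stored (TB)… a quick recomputation of the 12 july rows, assuming that reading is right (it matches within rounding):

package main

import "fmt"

func main() {
    // 12 july rows from the table above: egress in GB, stored in TB
    type row struct {
        name     string
        egressGB float64
        storedTB float64
    }
    rows := []row{
        {"SGC", 71.56, 11.75},
        {"TheMightyGeek", 11.2, 1.82},
        {"kevink", 37.18, 3.95},
        {"dragonhogan", 68.49, 10.68},
    }
    for _, r := range rows {
        // egress_GB / stored_GB * 1000 simplifies to egress_GB / stored_TB
        permille := r.egressGB / r.storedTB
        fmt.Printf("%-14s %5.2f ‰\n", r.name, permille)
    }
}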

not sure if it’s actually useful for anything tho… if it’s a real thing… it just caught my eye when going through the numbers, figured it might be useful for something…

This is a trash thread and there are no trash numbers there.


the discrepancy between ingress and stored has to go somewhere…

but yeah these numbers were taken before i realized it was connected to the trash issue… the plan is to include that… but i’ve been waiting for the next update to get correct success rates,
and maybe a solution to the trash issue, thus making the whole point of it irrelevant.

The difference is canceled uploads. If your node gets canceled during an upload, it will delete the almost-completed piece.
So if the ingress > delta between the two dates, then this is simply due to cancellation during uploads.


6 posts were split to a new topic: My node was causing a memory leak and has been killed

Here are some trash data/stats I posted back in May in this thread: ERROR pieces:trash emptying trash failed. However, I wasn’t getting that “emptying trash failed” error in my own logs.

and that’s why it’s the guy with the slowest internet connection who sees the largest effect of it…

does delete mean it goes to trash… or does everything go through trash before deletion?

Okay peeps… let’s do this thing…

what is your internet connection speed and how high are your trash levels…
Go

i’m
Inet = 400mbit/400mbit
trash = 6-9 GB avg, rare peaks at 15 GB
node age is 4½ months

what’s your internet bandwidth?

Cable down: 500 up: 40
