I am going to install that on one of my nodes to get some early test results. Tomorrow would be the new release cut. It is a risk (it could break my test node), and for the current 64 MB uploads on SLC it makes no sense. However, the moment customers sign up and start uploading with the file mix we simulated earlier, I would have to increase the priority by a lot. I don’t have control over customer signups, but I can reduce the time it takes to get this PR merged and rolled out.
And my node is begging for these kinds of tests anyway. It wants to be treated like a real QA node. It was already looking into drugs. I’d better keep it busy to make sure it has no time to even think about that.
Can I install a beta on a Linux Docker node too? I have a node that I think got hit hard (probably 10+ TB off) and want to test whether something will fix it soon.
That sounds like a smart improvement, and exactly the type of housekeeping you can offload to the filesystem. Save the databases for the tasks that need to be clever.
This fix isn’t going to help you; it can’t recover the lost TTL entries. What it can do is make sure that, 30 days after it is rolled out, the cleanup will work better.
Can I kindly request that “file not found” errors be suppressed during TTL cleanup? It’s pretty clear this append-only method makes it impossible to remove entries. That’s fine, but the long error lines are very wasteful when files aren’t found. And if this is how it’s intended to work when data is removed before the TTL expires, it certainly isn’t an error.
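If it helps, here is a minimal sketch of how a cleanup step could treat a missing file as a non-error; the `removeExpiredPiece` helper and the path are made up for illustration and are not the actual storagenode code:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// removeExpiredPiece deletes a piece file whose TTL has expired. A missing
// file is treated as success, since the piece may already have been removed
// by garbage collection or trash cleanup before the TTL fired.
func removeExpiredPiece(path string) error {
	err := os.Remove(path)
	if errors.Is(err, os.ErrNotExist) {
		// Data removed before the TTL expired: expected, not an error.
		return nil
	}
	return err
}

func main() {
	if err := removeExpiredPiece("/tmp/does-not-exist.sj1"); err != nil {
		fmt.Println("cleanup error:", err)
	} else {
		fmt.Println("piece gone (already deleted or just removed)")
	}
}
```

Whether it should be dropped entirely or logged at debug level is a separate question, but either way it wouldn’t need to be an error line.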
Again, for the next 30 days there will be no change in behavior. This fix isn’t going to clean up any existing garbage; it will make sure there is less garbage to deal with after 30 days.
I have multiple nodes with similar issues. I believe the test data’s TTL expired but the data was not deleted, and at the same time the Bloom filter cannot clear it.
// SetExpiration sets an expiration time for the given piece ID on the given satellite. If pieceSize
// is non-zero, it may be used later to decrement the used space counters without needing to call
// os.Stat on the piece.
The inode will have to be read by the OS anyway to remove the file: how do you remove a file without figuring out how many data blocks to set free?
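For what it’s worth, here is a minimal sketch of how a cached piece size could be used the way that comment describes, so the node process skips its own os.Stat even though the kernel still touches the inode during the unlink. The `expirationRecord` type and `reclaim` function are assumptions for illustration, not the real implementation:

```go
package main

import (
	"fmt"
	"os"
)

// expirationRecord is a hypothetical record written alongside SetExpiration,
// optionally carrying the piece size so later cleanup can skip os.Stat.
type expirationRecord struct {
	path      string
	pieceSize int64 // 0 means "unknown, stat the file instead"
}

// reclaim removes the piece and returns how many bytes to subtract from the
// used-space counter, preferring the cached size over an extra stat call.
func reclaim(rec expirationRecord) (int64, error) {
	size := rec.pieceSize
	if size == 0 {
		// Fallback path: no cached size, so pay for one os.Stat per piece.
		info, err := os.Stat(rec.path)
		if err != nil {
			return 0, err
		}
		size = info.Size()
	}
	if err := os.Remove(rec.path); err != nil {
		return 0, err
	}
	return size, nil
}

func main() {
	freed, err := reclaim(expirationRecord{path: "/tmp/example.sj1", pieceSize: 2319872})
	fmt.Println("freed bytes:", freed, "err:", err)
}
```

The saving is one fewer syscall per piece from userspace, not avoiding the inode read inside the kernel.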
I would also be worried about the default size of the LRU cache: 1000 is close to the popular default of 1024 max open files per process on many systems. Though given how Go handles I/O, they may have solved this in a different way.
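Assuming those cache entries really do pin open file handles (which I haven’t verified), a quick way to see how much headroom the process limit leaves on Linux/macOS:

```go
// A tiny check of the process file-descriptor limit versus a 1000-entry cache.
package main

import (
	"fmt"
	"syscall"
)

func main() {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		fmt.Println("getrlimit:", err)
		return
	}
	const cacheSize = 1000 // the default LRU size mentioned above
	fmt.Printf("soft limit: %d, hard limit: %d\n", lim.Cur, lim.Max)
	fmt.Printf("headroom if every cache entry held an open file: %d descriptors\n",
		int64(lim.Cur)-cacheSize)
}
```

If eviction closes the handle, the worry is bounded by the cache size, but at the 1024 default that still leaves only about 24 descriptors for sockets, databases and logs.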
It’s been a few weeks since we filled out that form about BFs not clearing enough data, and my nodes are still full and holding onto what looks like about 50% uncollected data. While the BFs are coming through and being processed successfully, the total amount of lingering data doesn’t seem to be shrinking as much as I’d hoped.
Does anyone have any insights on the findings from @elek’s report? If the BFs are indeed working as expected and should be more efficient now (since they should be smaller, given I shouldn’t be storing that much according to the satellites), when might we see this extra data finally start to clear out?
It’s really important we get this sorted so we can free up space and get back to accumulating more paid data. Thanks so much for any updates!
40% unpaid data is still better than some of my nodes. I have a few bad nodes with even more unpaid data, up to 70%. Most of my nodes have around 30% unpaid data.