Copied my node, now i've gotten at least 1 failed audit

bleep me sideways…
i just had an audit fail out of 542, less than 24 hours after booting up my node in its new location on the same drive, so it seems my rsync failed to copy the node correctly…

used the rsync command below, ran it like 3-4 times before shutting down the node, until it started finishing in good time, and i added the --delete parameter after checking exactly what it did xD
then i ran that, which also finished in good time… then shut down the node and ran it again a few times…

then i tried to compare the sizes of the datasets… but that was basically impossible because of different compression settings on zfs… and because the destination was just a folder in an already used dataset
it looked kinda alright so i figured rsync most likely worked like it was supposed to…
i may have been able to verify they were actually the same, and i guess i should have…
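in hindsight, a checksum-based dry run or a plain file count would have verified the copy regardless of zfs compression… something like this (just a sketch, using the same source/destination as my rsync command below; du --apparent-size is the GNU form, FreeBSD uses du -A):

# dry run with checksums: itemizes anything that still differs, should print little or nothing if the copy is complete
rsync -rcn -i /zPool/storj /zPool/storagenodes/storj

# or compare file counts / apparent (uncompressed) sizes instead of on-disk usage
find /zPool/storj -type f | wc -l
du -s --apparent-size /zPool/storj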

i’ve also run a scrub with no errors, so all checksums are correct and i should be able to rule out on-disk data corruption… it has to be the copy of the node folder that went wrong.
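(for reference, that was just the standard scrub, pool name taken from my paths:)

zpool scrub zPool         # start the scrub
zpool status -v zPool     # watch progress and check for checksum errors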

also i still have the old data, so in theory i should be able to fix it… at least if my node gave me the option xD, keep calm and keep storjing… i’m sure it will be fine, a few failed audits never hurt anyone right lol

used this command to copy the data…
how did i go wrong???

rsync -u -avHAXx --delete /zPool/storj /zPool/storagenodes/storj --progress -W -B=8192
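for context, roughly the sequence i followed (a sketch, not my exact shell history; the node is assumed to be a docker container named “storagenode”):

# a couple of passes while the node is still running, until they finish quickly
rsync -avHAXx --progress -W -B 8192 /zPool/storj /zPool/storagenodes/storj

# stop the node, then final passes with --delete so the destination matches exactly
docker stop -t 300 storagenode
rsync -avHAXx --delete --progress -W -B 8192 /zPool/storj /zPool/storagenodes/storj

worth noting: without a trailing slash on the source, rsync copies the storj directory itself into the destination, so the data ends up in /zPool/storagenodes/storj/storj and the new node config has to point there.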

also i strongly assume my node will most likely survive and that there isn’t a way to fix this…
but i would like to avoid this happening again and help prevent others from doing the same.

Is it a failed audit or just a failed attempt that got retried (and succeeded) later?

Also, you found a compression that works with encrypted data???

right, i should check my log… never had a failed audit before… so i doubt the piece data will be there, i would guess the copy was not 100%.
zfs with compression doesn’t compress the encrypted data, but it does compress the unused space in the blocks written, and thus the same data takes up different amounts of space…

i switched from lz4 to zle so that zfs wouldn’t try to compress the encrypted data all the time and just does zero-length encoding, basically using shorthand for writing long runs of zeros in blocks and files…
it only gives me about 2% less written space when working with 256K records.
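for anyone wanting to replicate it, those are just dataset properties (the dataset name here is a guess based on my paths, and only data written after the change picks up the new compression/recordsize):

zfs set compression=zle zPool/storagenodes     # zero-length encoding only, no real compression attempts on the encrypted pieces
zfs set recordsize=256K zPool/storagenodes     # 256K records
zfs get compression,recordsize,compressratio zPool/storagenodes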

looking at the logs now… apparently i dunno how to find a failed audit in my logs xD
trying to figure out what to search for, will post the relevant log entries when i find them
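(for anyone else searching: failed audits show up as failed GET_AUDIT downloads, so something like this should find them — assuming a docker node named storagenode, adjust if you log to a file)

docker logs storagenode 2>&1 | grep GET_AUDIT | grep failed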

This looks odd…

2020-05-03T16:49:13.464Z INFO piecestore download started {"Piece ID": "2G3ZYVAJLEOITB7YH2HN4B26KS4AGOA4Z24PWTKJRX3I47NBIXIA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "GET"}
2020-05-03T16:49:21.408Z INFO piecestore downloaded {"Piece ID": "2G3ZYVAJLEOITB7YH2HN4B26KS4AGOA4Z24PWTKJRX3I47NBIXIA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "GET"}
2020-05-03T16:54:12.963Z INFO piecestore download started {"Piece ID": "2G3ZYVAJLEOITB7YH2HN4B26KS4AGOA4Z24PWTKJRX3I47NBIXIA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "GET"}
2020-05-03T16:54:22.978Z ERROR piecestore download failed {"Piece ID": "2G3ZYVAJLEOITB7YH2HN4B26KS4AGOA4Z24PWTKJRX3I47NBIXIA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "GET", "error": "usedserialsdb error: database is locked", "errorVerbose": "usedserialsdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doDownload:523\n\tstorj.io/storj/storagenode/piecestore.(*drpcEndpoint).Download:471\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:995\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:66\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}

looks like most of my 80 failed download entries show that database-locked issue, as posted above.
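quick way to count them (same container-name assumption as above):

docker logs storagenode 2>&1 | grep -c "database is locked"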

========== AUDIT ==============
Critically failed:     0
Critical Fail Rate:    0.000%
Recoverable failed:    1
Recoverable Fail Rate: 0.184%
Successful:            542
Success Rate:          99.816%
========== DOWNLOAD ===========
Failed:                80
Fail Rate:             0.675%
Canceled:              224
Cancel Rate:           1.889%
Successful:            11551
Success Rate:          97.436%
========== UPLOAD =============
Rejected:              9
Acceptance Rate:       99.990%
---------- accepted -----------
Failed:                0
Fail Rate:             0.000%
Canceled:              17238
Cancel Rate:           19.572%
Successful:            70835
Success Rate:          80.428%
========== REPAIR DOWNLOAD ====
Failed:                0
Fail Rate:             0.000%
Canceled:              0
Cancel Rate:           0.000%
Successful:            14
Success Rate:          100.000%
========== REPAIR UPLOAD ======
Failed:                0
Fail Rate:             0.000%
Canceled:              271
Cancel Rate:           18.261%
Successful:            1213
Success Rate:          81.738%
========== DELETE =============
Failed:                0
Fail Rate:             0.000%
Successful:            1580
Success Rate:          100.000%

Look in the other threads about the locked-database issue. If it is an issue.

Maybe your audit is OK after all


looks like it might be a storagenode software issue in 1.3.3,
because of course i also updated while the node was down anyway…