Machine freezing for short periods

After roughly 1h15 with the --filestore.write-buffer-size="4MiB" option, the node seems to be dying again: load average: 299.51, 282.63, 210.63.

Besides, something really suspicious happens from time to time. I'm not sure whether it was the case with previous configurations, but the node gradually takes up a very large amount of RAM (1000MB+), up to the point where usage suddenly drops back to almost nothing, and then it rises again (I saw it above 1580MB once).

When checking the logs, it appears that the start-up routine is triggered several times, not only when the node first starts. For instance, here is an excerpt from the middle of this test “session”:

[START OF THE LOG FILE]
2020-06-10T13:22:10.975Z        INFO    Configuration loaded    {"Location": "/app/config/config.yaml"}
2020-06-10T13:22:10.980Z        INFO    Operator email  {"Address": "..."}
2020-06-10T13:22:10.981Z        INFO    Operator wallet {"Address": "..."}
2020-06-10T13:22:11.842Z        INFO    Telemetry enabled
2020-06-10T13:22:11.946Z        INFO    db.migration    Database Version        {"version": 39}
2020-06-10T13:22:12.988Z        INFO    preflight:localtime     start checking local system clock with trusted satellites' system clock.
2020-06-10T13:22:13.870Z        INFO    preflight:localtime     local system clock is in sync with trusted satellites' system clock.
2020-06-10T13:22:13.871Z        INFO    trust   Scheduling next refresh {"after": "5h41m38.286636482s"}
2020-06-10T13:22:13.871Z        INFO    bandwidth       Performing bandwidth usage rollups
2020-06-10T13:22:13.871Z        INFO    Node 12VHJRHeqGmnHsD3bcV2bkpfbv8jwER8NtX7h2wpgED8qDD6oHS started
2020-06-10T13:22:13.871Z        INFO    Public server started on [::]:28967
2020-06-10T13:22:13.872Z        INFO    Private server started on 127.0.0.1:7778
2020-06-10T13:22:15.446Z        INFO    piecestore      download started        {"Piece ID": "O4PHHU64KAJOZINH255MGEWIUCG7XLY2YYKDF2OWN2ZR6ZA2FYIQ", "Satellite ID": "12EayR...
2020-06-10T13:22:16.326Z        INFO    piecestore      downloaded      {"Piece ID": "O4PHHU64KAJOZINH255MGEWIUCG7XLY2YYKDF2OWN2ZR6ZA2FYIQ", "Satellite ID": "12EayRS2V1kEsW...
[...]
2020-06-10T14:22:39.917Z        INFO    piecestore      upload started  {"Piece ID": "KZCZESNGC7CAXUKYTHUEQDN2DPOWEAG7ZA5ZSM6UNVTO6QIHFN2A", "Satellite ID": "12rfG3sh9NCWiX...
2020-06-10T14:22:40.271Z        INFO    piecestore      upload started  {"Piece ID": "BZW7E3WAVNX6QNH4EZI2W46ZEJBFV5RI47LS7WXGLT7HVFKPPETQ", "Satellite ID": "121RTSDpyNZVcE...
2020-06-10T14:23:34.760Z        INFO    Configuration loaded    {"Location": "/app/config/config.yaml"}
2020-06-10T14:23:34.763Z        INFO    Operator email  {"Address": "..."}
2020-06-10T14:23:34.763Z        INFO    Operator wallet {"Address": "..."}
2020-06-10T14:23:35.792Z        INFO    Telemetry enabled
2020-06-10T14:23:37.745Z        INFO    db.migration    Database Version        {"version": 39}
2020-06-10T14:23:43.470Z        INFO    preflight:localtime     start checking local system clock with trusted satellites' system clock.
2020-06-10T14:23:44.382Z        INFO    preflight:localtime     local system clock is in sync with trusted satellites' system clock.
2020-06-10T14:23:44.382Z        INFO    bandwidth       Performing bandwidth usage rollups
2020-06-10T14:23:44.388Z        INFO    trust   Scheduling next refresh {"after": "4h56m45.640427281s"}
2020-06-10T14:23:44.390Z        INFO    Node 12VHJRHeqGmnHsD3bcV2bkpfbv8jwER8NtX7h2wpgED8qDD6oHS started
2020-06-10T14:23:44.390Z        INFO    Public server started on [::]:28967
2020-06-10T14:23:44.390Z        INFO    Private server started on 127.0.0.1:7778
2020-06-10T14:23:44.603Z        INFO    piecestore      upload started  {"Piece ID": "ANB5SA53KLJYDQIS3XXIM5OOTLONEEEJ7KQPGYJUPVXNIMNIQHUA", "Satellite ID": "12L9ZFwhzVpuEK...
2020-06-10T14:23:44.662Z        INFO    piecestore      upload started  {"Piece ID": "BHQCQJ4IJSAK6CXMKHSB7YAVWPUZ2LBLGADZ7K7CQM76BPS4XIQQ", "Satellite ID": "12rfG3sh9NCWiX...
[...]
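
To see how often this happens over a whole log file rather than by scrolling, a small script can list every start-up and the time between them. This is only a rough sketch: it assumes each start-up begins with a "Configuration loaded" line and that every line starts with a timestamp in the format shown above. The function names are my own, and the two sample lines are copied from the excerpt.

```python
from datetime import datetime, timedelta

# Assumption: every start-up sequence begins with this message.
START_MARKER = "Configuration loaded"
# Timestamp layout used at the start of each log line above.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

def find_startups(lines):
    """Return the timestamp of every line that contains START_MARKER."""
    stamps = []
    for line in lines:
        if START_MARKER in line:
            raw = line.split()[0]  # first whitespace-separated field
            stamps.append(datetime.strptime(raw, TIME_FORMAT))
    return stamps

def gaps_between(stamps):
    """Return the time elapsed between each pair of consecutive start-ups."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# Two lines taken from the excerpt above.
SAMPLE = [
    '2020-06-10T13:22:10.975Z        INFO    Configuration loaded    {"Location": "/app/config/config.yaml"}',
    '2020-06-10T14:22:40.271Z        INFO    piecestore      upload started  {"Piece ID": "..."}',
    '2020-06-10T14:23:34.760Z        INFO    Configuration loaded    {"Location": "/app/config/config.yaml"}',
]

startups = find_startups(SAMPLE)
for stamp in startups:
    print("start-up at", stamp.isoformat())
for gap in gaps_between(startups):
    print("time between start-ups:", gap)
```

To run it on a real log, pass an open file instead of SAMPLE, e.g. find_startups(open(path)) where path points at the saved log. Short, regular gaps would suggest the process is being killed and restarted rather than restarting itself once.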

Is the storagenode software crashing, or something? (It is still running v1.5.2, as it has not auto-updated yet, although watchtower is running.)


Maybe my disk is not in good shape. Or… SMR disks really are bad for anything other than casual home use.