The app tries to write data that seems to be already there, or, as in the other post, maybe the bandwidth_usage_rollups table is missing here as well.
Run the sqlite3 container (replace /mnt/storagenode with your path to the data):
docker run --rm -it -v /mnt/storagenode/storage/info.db:/data/info.db sstc/sqlite3 sqlite3 ./info.db
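If you want a quick sanity check that the bandwidth_usage_rollups table actually exists (not part of the original instructions, just the standard sqlite3 dot-command), you can list the tables first:
.tables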
And execute this SQL script:
-- list the hourly aggregates in bandwidth_usage that collide with rows already present in bandwidth_usage_rollups
SELECT datetime(strftime('%Y-%m-%dT%H:00:00', created_at)) created, bu.satellite_id sat_id, bu.action act, SUM(bu.amount) amt
FROM bandwidth_usage bu, bandwidth_usage_rollups bur
WHERE datetime(created_at) < datetime(CURRENT_TIMESTAMP)
AND bur.interval_start = created
AND bur.satellite_id = sat_id
AND bur.action = act
GROUP BY created, sat_id, act;
Exit from sqlite3: .exit
Post the result.
Also, it would be better to have the exact date when your storagenode was updated.
For the watchtower: docker logs watchtower 2>&1 | grep -A 3 -B 3 Found
The temporary solution is to delete the duplicated rows. Take a date from the result of the query above and insert it into the query below:
Replace the path with yours.
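Before deleting anything, it might be wise to make a copy of info.db once the node is stopped (step 1 below). This is a generic precaution rather than part of the original instructions, and the path is just an example:
cp /disk2/storj/storage/info.db /disk2/storj/storage/info.db.bak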
1. docker stop -t 300 storagenode
2. docker run --rm -it -v /disk2/storj/storage/info.db:/data/info.db sstc/sqlite3 sqlite3 ./info.db
and execute:
delete from bandwidth_usage_rollups where interval_start = 'put-the-date-with-time-here';
For example: delete from bandwidth_usage_rollups where interval_start = '2019-07-24 15:00:00'; (replace the date with your result!)
3. .exit
4. docker start storagenode
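After the node is started again, you could check that it no longer flaps and that the error is gone; this is just a generic check, not from the original post:
docker ps
docker logs storagenode 2>&1 | grep -i error | tail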
I’m seeing an issue similar to the ones reported for v0.15.2 however I’m still seeing the issue with v0.15.3 as well.
The container comes up for only around 30 seconds, during which time the dash actually appears pretty normal with the Last Contact time updating.
The grand total of the logs produced is below.
2019-07-19T02:46:56.066Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-19T02:46:56.099Z INFO Operator email: REDACTED
2019-07-19T02:46:56.099Z INFO Operator wallet: REDACTED
2019-07-19T02:46:57.551Z INFO running on version v0.15.3
2019-07-19T02:46:57.554Z INFO db.migration Latest Version {"version": 13}
2019-07-19T02:46:57.556Z INFO Node 12txo9zvG5XkqYp3T7QgR1DPcf31zEWgM1R4yvMd3e99GRp17jh started
2019-07-19T02:46:57.556Z INFO Public server started on [::]:28967
2019-07-19T02:46:57.556Z INFO Private server started on 127.0.0.1:7778
2019-07-19T02:46:57.557Z INFO vouchers Checking vouchers
2019-07-19T02:46:57.565Z INFO vouchers Requesting voucher {"satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2019-07-19T02:46:57.565Z INFO vouchers Requesting voucher {"satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2019-07-19T02:46:57.565Z INFO vouchers Requesting voucher {"satellite": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW"}
2019-07-19T02:46:57.754Z INFO running on version v0.15.3
2019-07-19T02:47:00.806Z INFO piecestore:monitor Remaining Bandwidth {"bytes": 4490899859456}
2019-07-19T02:47:11.202Z ERROR vouchers Error requesting voucher {"satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "voucher: unable to connect to the satellite: context canceled", "errorVerbose": "voucher: unable to connect to the satellite: context canceled\n\tstorj.io/storj/storagenode/vouchers.(*Service).request:132\n\tstorj.io/storj/storagenode/vouchers.(*Service).Request:116\n\tstorj.io/storj/storagenode/vouchers.(*Service).RunOnce.func1:103\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2019-07-19T02:47:11.202Z ERROR vouchers Error requesting voucher {"satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "voucher: unable to find satellite on the network: context canceled", "errorVerbose": "voucher: unable to find satellite on the network: context canceled\n\tstorj.io/storj/storagenode/vouchers.(*Service).request:127\n\tstorj.io/storj/storagenode/vouchers.(*Service).Request:116\n\tstorj.io/storj/storagenode/vouchers.(*Service).RunOnce.func1:103\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2019-07-19T02:47:11.203Z ERROR vouchers Error requesting voucher {"satellite": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "error": "voucher: unable to find satellite on the network: context canceled", "errorVerbose": "voucher: unable to find satellite on the network: context canceled\n\tstorj.io/storj/storagenode/vouchers.(*Service).request:127\n\tstorj.io/storj/storagenode/vouchers.(*Service).Request:116\n\tstorj.io/storj/storagenode/vouchers.(*Service).RunOnce.func1:103\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
ERROR: 2019/07/19 02:47:11 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:47:11 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:47:11 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:47:11 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:47:11 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:47:11 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
As an extra curiosity, on the first run after a reboot the pattern is slightly different. The node comes up for nearly 5 minutes and reports only the following.
2019-07-19T02:54:16.797Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-19T02:54:16.829Z INFO Operator email: REDACTED
2019-07-19T02:54:16.829Z INFO Operator wallet: REDACTED
2019-07-19T02:54:19.714Z INFO running on version v0.15.3
2019-07-19T02:54:19.915Z INFO db.migration Latest Version {"version": 13}
2019-07-19T02:54:19.916Z INFO Node 12txo9zvG5XkqYp3T7QgR1DPcf31zEWgM1R4yvMd3e99GRp17jh started
2019-07-19T02:54:19.916Z INFO Public server started on [::]:28967
2019-07-19T02:54:19.916Z INFO Private server started on 127.0.0.1:7778
2019-07-19T02:54:19.922Z INFO vouchers Checking vouchers
2019-07-19T02:54:19.998Z INFO vouchers Requesting voucher {"satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2019-07-19T02:54:19.998Z INFO vouchers Requesting voucher {"satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2019-07-19T02:54:19.998Z INFO vouchers Requesting voucher {"satellite": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW"}
2019-07-19T02:54:20.109Z INFO running on version v0.15.3
2019-07-19T02:54:36.999Z INFO vouchers Voucher request denied. Vetting process not yet complete
2019-07-19T02:54:38.625Z INFO vouchers Voucher request denied. Vetting process not yet complete
2019-07-19T02:54:41.689Z INFO vouchers Voucher request denied. Vetting process not yet complete
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
ERROR: 2019/07/19 02:59:58 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing
It then falls back into the pattern in the first post, flapping every 30 seconds.
I had the same issue… What worked for me was to stop and remove all Storj containers (including watchtower), then remove all Docker images for storagenode and watchtower and re-download them.
After that it went back to normal and 0.15.3 started working fine.
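For reference, that sequence might look roughly like this, assuming the default container names storagenode and watchtower and the storjlabs images (adjust the names, tags, and your usual docker run command to your own setup):
docker stop -t 300 storagenode
docker stop watchtower
docker rm storagenode watchtower
docker rmi storjlabs/storagenode:beta storjlabs/watchtower
docker pull storjlabs/storagenode:beta
docker pull storjlabs/watchtower
Then start the node again with your usual docker run command.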
Please edit your config.yaml to enable the debug level of logging: replace info with debug, save the configuration file, and restart the container: docker restart -t 300 storagenode
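The relevant line in config.yaml should look something like this (assuming the key is log.level, as in the generated default config; uncomment it if it is commented out):
log.level: debug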
Then search for the FATAL errors: docker logs storagenode 2>&1 | grep -i FATAL
For Powershell: docker logs storagenode 2>&1 | sls fatal
Aren’t I going to need to delete every timestamp in that long list?
Something like:
-- bulk variant: delete every rollup row whose interval_start appears in the duplicate list from the query above
DELETE FROM bandwidth_usage_rollups WHERE interval_start IN (
    SELECT created FROM (
        SELECT datetime(strftime('%Y-%m-%dT%H:00:00', created_at)) created, bu.satellite_id sat_id, bu.action act, SUM(bu.amount) amt
        FROM bandwidth_usage bu, bandwidth_usage_rollups bur
        WHERE datetime(created_at) < datetime(CURRENT_TIMESTAMP)
        AND bur.interval_start = created
        AND bur.satellite_id = sat_id
        AND bur.action = act
        GROUP BY created, sat_id, act
    )
);
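One cautious way to run it (not from this thread, just a standard SQLite pattern) is inside a transaction, so you can see how many rows would be removed before committing:
BEGIN;
DELETE FROM bandwidth_usage_rollups WHERE interval_start IN (
    SELECT created FROM (
        SELECT datetime(strftime('%Y-%m-%dT%H:00:00', created_at)) created, bu.satellite_id sat_id, bu.action act, SUM(bu.amount) amt
        FROM bandwidth_usage bu, bandwidth_usage_rollups bur
        WHERE datetime(created_at) < datetime(CURRENT_TIMESTAMP)
        AND bur.interval_start = created
        AND bur.satellite_id = sat_id
        AND bur.action = act
        GROUP BY created, sat_id, act
    )
);
SELECT changes();  -- number of rollup rows the DELETE removed
Then run COMMIT; to keep the deletion, or ROLLBACK; to undo it.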