Docker Container restarting every X seconds after 0.15.2 upgrade

Hello,

I’m currently getting the following in my docker container logs:

2019-07-17T15:54:43.733898500Z 2019-07-17T15:54:43.733Z INFO Public server started on [::]:28967
2019-07-17T15:54:43.733902000Z 2019-07-17T15:54:43.733Z INFO Private server started on 127.0.0.1:7778
2019-07-17T15:54:43.764094100Z 2019-07-17T15:54:43.764Z INFO running on version v0.15.2
2019-07-17T15:54:58.608310000Z 2019-07-17T15:54:58.608Z INFO piecestore:monitor Remaining Bandwidth {"bytes": 96429487211008}
2019-07-17T15:55:03.568580900Z 2019-07-17T15:55:03.568Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-17T15:55:03.579248000Z 2019-07-17T15:55:03.579Z INFO Operator email: MYEMAIL@gmail.com
2019-07-17T15:55:03.579265700Z 2019-07-17T15:55:03.579Z INFO Operator wallet: WALLETADDRESS
2019-07-17T15:55:04.004487200Z 2019-07-17T15:55:04.004Z INFO running on version v0.15.2
2019-07-17T15:55:04.047156500Z 2019-07-17T15:55:04.047Z INFO db.migration Latest Version {"version": 13}
2019-07-17T15:55:04.047306600Z 2019-07-17T15:55:04.047Z INFO vouchers Checking vouchers
2019-07-17T15:55:04.047693300Z 2019-07-17T15:55:04.047Z INFO Node MYNODEID started
2019-07-17T15:55:04.047739700Z 2019-07-17T15:55:04.047Z INFO Public server started on [::]:28967
2019-07-17T15:55:04.047790900Z 2019-07-17T15:55:04.047Z INFO Private server started on 127.0.0.1:7778
2019-07-17T15:55:04.080621800Z 2019-07-17T15:55:04.080Z INFO running on version v0.15.2
2019-07-17T15:55:17.353479300Z 2019-07-17T15:55:17.353Z INFO piecestore:monitor Remaining Bandwidth {"bytes": 96429487211008}
2019-07-17T15:55:22.700088800Z 2019-07-17T15:55:22.699Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-17T15:55:22.710743100Z 2019-07-17T15:55:22.710Z INFO Operator email: MYEMAIL@gmail.com
2019-07-17T15:55:22.710872800Z 2019-07-17T15:55:22.710Z INFO Operator wallet: WALLETADDRESS
2019-07-17T15:55:23.006364000Z 2019-07-17T15:55:23.005Z INFO running on version v0.15.2
2019-07-17T15:55:23.075473900Z 2019-07-17T15:55:23.075Z INFO db.migration Latest Version {"version": 13}
2019-07-17T15:55:23.076533500Z 2019-07-17T15:55:23.076Z INFO vouchers Checking vouchers
2019-07-17T15:55:23.078147400Z 2019-07-17T15:55:23.078Z INFO Node MYNODEID started
2019-07-17T15:55:23.078166700Z 2019-07-17T15:55:23.078Z INFO Public server started on [::]:28967
2019-07-17T15:55:23.078171500Z 2019-07-17T15:55:23.078Z INFO Private server started on 127.0.0.1:7778
2019-07-17T15:55:23.109352200Z 2019-07-17T15:55:23.109Z INFO running on version v0.15.2
2019-07-17T15:55:37.764466200Z 2019-07-17T15:55:37.764Z INFO piecestore:monitor Remaining Bandwidth {"bytes": 96429487211008}
2019-07-17T15:55:43.793401000Z 2019-07-17T15:55:43.793Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-17T15:55:43.812908300Z 2019-07-17T15:55:43.812Z INFO Operator email: MYEMAIL@gmail.com
2019-07-17T15:55:43.813027300Z 2019-07-17T15:55:43.812Z INFO Operator wallet: WALLETADDRESS
2019-07-17T15:55:44.126199300Z 2019-07-17T15:55:44.126Z INFO running on version v0.15.2
2019-07-17T15:55:44.182618400Z 2019-07-17T15:55:44.182Z INFO db.migration Latest Version {"version": 13}
2019-07-17T15:55:44.183293000Z 2019-07-17T15:55:44.183Z INFO vouchers Checking vouchers
2019-07-17T15:55:44.184615100Z 2019-07-17T15:55:44.184Z INFO Node MYNODEID started
2019-07-17T15:55:44.184638900Z 2019-07-17T15:55:44.184Z INFO Public server started on [::]:28967
2019-07-17T15:55:44.184645200Z 2019-07-17T15:55:44.184Z INFO Private server started on 127.0.0.1:7778
2019-07-17T15:55:44.217831200Z 2019-07-17T15:55:44.217Z INFO running on version v0.15.2
2019-07-17T15:56:00.573527400Z 2019-07-17T15:56:00.573Z INFO piecestore:monitor Remaining Bandwidth {"bytes": 96429487211008}
2019-07-17T15:56:05.375435000Z 2019-07-17T15:56:05.375Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-17T15:56:05.399833500Z 2019-07-17T15:56:05.399Z INFO Operator email: MYEMAIL@gmail.com
2019-07-17T15:56:05.399886300Z 2019-07-17T15:56:05.399Z INFO Operator wallet: WALLETADDRESS
2019-07-17T15:56:05.694238500Z 2019-07-17T15:56:05.694Z INFO running on version v0.15.2
2019-07-17T15:56:05.738579900Z 2019-07-17T15:56:05.737Z INFO db.migration Latest Version {"version": 13}
2019-07-17T15:56:05.738607100Z 2019-07-17T15:56:05.738Z INFO vouchers Checking vouchers
2019-07-17T15:56:05.751971200Z 2019-07-17T15:56:05.751Z INFO Node MYNODEID started
2019-07-17T15:56:05.751989900Z 2019-07-17T15:56:05.751Z INFO Public server started on [::]:28967
2019-07-17T15:56:05.751993300Z 2019-07-17T15:56:05.751Z INFO Private server started on 127.0.0.1:7778
2019-07-17T15:56:05.770000500Z 2019-07-17T15:56:05.769Z INFO running on version v0.15.2
2019-07-17T15:56:21.284246900Z 2019-07-17T15:56:21.284Z INFO piecestore:monitor Remaining Bandwidth {"bytes": 96429487211008}
2019-07-17T15:56:29.032344300Z 2019-07-17T15:56:29.032Z INFO Configuration loaded from: /app/config/config.yaml
2019-07-17T15:56:29.045259000Z 2019-07-17T15:56:29.045Z INFO Operator email: MYEMAIL@gmail.com
2019-07-17T15:56:29.045350600Z 2019-07-17T15:56:29.045Z INFO Operator wallet: WALLETADDRESS
2019-07-17T15:56:29.415674700Z 2019-07-17T15:56:29.415Z INFO running on version v0.15.2
2019-07-17T15:56:29.577496500Z 2019-07-17T15:56:29.569Z INFO db.migration Latest Version {"version": 13}
2019-07-17T15:56:29.594719100Z 2019-07-17T15:56:29.594Z INFO vouchers Checking vouchers
2019-07-17T15:56:29.605761200Z 2019-07-17T15:56:29.605Z INFO Node MYNODEID started
2019-07-17T15:56:29.605793700Z 2019-07-17T15:56:29.605Z INFO Public server started on [::]:28967
2019-07-17T15:56:29.605802500Z 2019-07-17T15:56:29.605Z INFO Private server started on 127.0.0.1:7778
2019-07-17T15:56:29.649829300Z 2019-07-17T15:56:29.649Z INFO running on version v0.15.2

I’m also seeing this pretty frequently:

2019-07-17T15:52:01.810385700Z ERROR: 2019/07/17 15:52:01 pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing

Essentially, I launch my dashboard, it runs for 3-5 seconds, and then it drops back to the PowerShell prompt. When this happens, I usually don’t see an error in the logs; the one above only comes up occasionally. Any ideas?
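For anyone who wants to put a number on the "every X seconds" part: the log above prints "Configuration loaded from" roughly every 20 seconds, so the gap between those lines gives the restart cadence. Below is a minimal sketch, assuming each such line marks one container start and that the log was captured with Docker timestamps enabled (as in the output above). The function names are hypothetical, not part of any Storj or Docker tool.

```python
from datetime import datetime

# Marker assumed to appear once per container start (taken from the log above).
START_MARKER = "Configuration loaded from"


def parse_docker_ts(line):
    """Parse the leading Docker timestamp, e.g. 2019-07-17T15:55:03.568580900Z."""
    stamp = line.split(" ", 1)[0].rstrip("Z")
    date_part, _, frac = stamp.partition(".")
    # Docker prints nanoseconds; datetime only keeps microseconds.
    frac = (frac + "000000")[:6]
    return datetime.strptime(f"{date_part}.{frac}", "%Y-%m-%dT%H:%M:%S.%f")


def restart_intervals(lines):
    """Return the seconds elapsed between consecutive start markers."""
    starts = [parse_docker_ts(line) for line in lines if START_MARKER in line]
    return [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    # Four lines copied from the log above.
    sample = [
        "2019-07-17T15:55:03.568580900Z 2019-07-17T15:55:03.568Z INFO Configuration loaded from: /app/config/config.yaml",
        "2019-07-17T15:55:04.047693300Z 2019-07-17T15:55:04.047Z INFO Node MYNODEID started",
        "2019-07-17T15:55:22.700088800Z 2019-07-17T15:55:22.699Z INFO Configuration loaded from: /app/config/config.yaml",
        "2019-07-17T15:55:43.793401000Z 2019-07-17T15:55:43.793Z INFO Configuration loaded from: /app/config/config.yaml",
    ]
    for seconds in restart_intervals(sample):
        print(f"restart after {seconds:.1f} s")
```

A steady interval like this points at something killing the process on a timer or shortly after startup, rather than at random load-related crashes, but that is only a hint and not a diagnosis.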

I’ve moved my info.db out of my Data directory, and the node has recreated one. Now my node seems stable.

I’m running Windows Server 2016 Datacenter

Is there any issue with having deleted my info.db (it was 5GB+)? I see my bandwidth reset, but since a network wipe has occurred, this should be one of the rare cases where that is OK, right?

The node has been up for 20 minutes and I haven’t had the “pickfirstBalancer: failed to NewSubConn: rpc error: code = Canceled desc = grpc: the client connection is closing” failure at all. I’ll report back if it returns, but I’m guessing it was caused by a corrupt info.db. :frowning:
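The corruption guess can be tested rather than assumed. Below is a minimal sketch, assuming info.db is a SQLite file (not stated in this thread) and that the check is run against the moved copy while nothing else is writing to it. The function name and the example path are hypothetical.

```python
import sqlite3
from pathlib import Path


def integrity_report(db_path):
    """Run SQLite's built-in integrity check and return its result rows."""
    # Open read-only so the check cannot modify or create the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Raised when the file is too damaged to be read as a database at all.
        return [f"error: {exc}"]
    finally:
        conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Hypothetical location of the moved database; adjust to your setup.
    print(integrity_report("info.db.old"))
```

A healthy SQLite database returns the single row "ok". Anything else lists the problems found, which would be more useful to post here than a guess.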

I assume you meant you moved the original info.db out rather than flat out deleting it, yes? If you’ve deleted it, I believe that could have a lasting impact for your node regardless of the data wipe.

In the future, ask before acting. I can’t tell you if there is any lasting effect. Even just moving it and rebuilding one can make it near impossible to merge databases again. So I advise taking a little more caution in these scenarios.
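One cautious way to follow this advice is to take a timestamped copy of the database before moving or changing anything, so the original stays exactly where the node expects it. Below is a minimal sketch, assuming the node (the container) is stopped first so the file is not being written to. The function name and both paths are hypothetical.

```python
import shutil
import time
from pathlib import Path


def backup_database(db_path, backup_dir):
    """Copy db_path into backup_dir under a timestamped name and return the new path."""
    source = Path(db_path)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"{source.stem}-{stamp}{source.suffix}"
    # copy2 keeps the modification time, and the original file stays in place.
    shutil.copy2(source, target)
    if target.stat().st_size != source.stat().st_size:
        raise OSError(f"backup {target} is incomplete")
    return target


if __name__ == "__main__":
    # Hypothetical paths; adjust to your own data and backup locations.
    print(backup_database("info.db", "db-backups"))
```

This only preserves the file. It does not answer whether a rebuilt database can be merged with the old one later.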

Yes, I meant moved. I still have the old one, but my node is acting normally now. Who would be able to tell me whether this could cause problems for my node in the future?

Well that didn’t resolve it. 3.5 hours later, the container restarted.

Can you give us some info on what kind of hardware you are using?

Please update to the latest version and post your new logs: docker logs --tail 20 storagenode

I was already on 0.15.3, but ran my upgrade script to ensure I was on the latest. Here’s the log snippet after the node came back online: https://drive.google.com/file/d/1rP-_x2QZWS74OAbVRZy4nWi7pehUXlOV/view?usp=sharing

As for system specs:
Windows Server 2016 Datacenter
MSI Z170A Gaming M5
Intel i7-6700k
Drives are connected through an LSI HBA card (16e)
Storj Storage is using StableBit DrivePool disks (2x4TB).

The server also runs other storage through the HBA (roughly 80TB raw), with multiple SSD cache drives and an NVMe OS disk.

Not sure what else I can provide. Thanks for investigating :slight_smile:

Looks like your problem is solved with 0.15.3

2 posts were merged into an existing topic: Error Codes: What they mean and Severity Level [READ FIRST]
