2 nodes on Windows Server / Hyper-V / Linux setup issue

Hi guys, 8 days ago I enabled my second node. Both nodes run on the same Windows Server machine, under Hyper-V, as 2 separate Linux VMs.

Both dashboards were showing online. Unfortunately, as soon as I enabled the second node, my first node stopped receiving/sending any data. I thought that was normal since it is full. Today I rebooted node 1 and it looks like it is getting some data again, but unfortunately its dashboard had been frozen for all 8 days. After the reboot, I see “europe-north-1.tardigrade.i …” at 63% online, and I got 3 dollars more in the current month’s earnings.

Now I cannot connect to the dashboard for node 2; it also shows port 14003 as not reachable. Can you advise whether my ports or the command structure are OK?

Over those 8 days the second node got 35 GB. All satellites were at 100%.
I do not want to experiment, since I could get my node 1 disqualified. That one has been running since the beginning of v3 and it is full of data.

Any advice, please?




[screenshot: Node_2]

Do you use the same identity for both nodes (don’t do it!)?

2 Likes

No, it is not the same. Different email, different auth token, different identity generated.

I’m afraid an audit score below 60% usually means the node already got disqualified on the corresponding satellites :persevere:

Whenever scores start dropping below 95/90%, it’s urgent to investigate and fix issues.

Like @zuik I would have thought such behavior could be caused by 2 nodes using the same identity, which is a fast road to disqualification for both nodes. It’s a real shame satellites can’t detect that 2 nodes are using the same identity and warn the SNO accordingly :confused:


Anyway, I’m not sure that’s what’s wrong in your case, as your two nodes seem to point to different identity folders: “/home/[user]/.local/share/storj/identity/storagenode” & “/local/share/Identity/storagenode”.
So unless they contain the same identity files, or one folder is a symlink to the other, or the 2 identities got mixed up… that’s probably not the issue.
That said, because these look like default folder locations, it could have been easy to mix up destinations between nodes, I guess. A good practice is to store identities on the storage disks, so they sit right next to the data they belong to.
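For instance, a sketch of what that could look like, assuming node 2’s data disk is mounted at /mnt/storj/node2 (a hypothetical path, adjust to your setup):

 # Move the identity next to the data it belongs to:
 mv ~/.local/share/storj/identity/storagenode /mnt/storj/node2/identity
 # Then bind-mount that folder into the container in the docker run command:
 #  --mount type=bind,source=/mnt/storj/node2/identity,destination=/app/identity \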

Did you check your nodes’ logs? There are probably plenty of errors popping up in them that could shed some light on what went wrong, and help you figure out what to fix.
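If the nodes run in Docker, the recent log lines can be pulled like this (the container names are whatever you passed to --name; storagenode / storagenode2 here are just examples):

 docker logs --tail 200 storagenode
 docker logs --tail 200 storagenode2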

Node 1 was disqualified on those 3 satellites over a year ago, around the time they added the suspension/audit graphs. Back then it was also attributed to the SMR HDD.
Now we have the graphs, but unfortunately they froze for 8 days.

For those 8 days the dashboard was showing 99% online and the same amount of money for this month. Only after the reboot on day 8 did it update to 60% and +$3.

I set up node 2 on Linux; everything was done on that VM. I did not copy any data between the two Linux instances. I will check the log.

Log:
-----------------------------------------------------------Logs Node2---------------------------
@storj_node_2:~$ screen
JEW3X2BRHTAC6W35L6CYS5YAQLFKJRUMOWZJCNV46MSM6B4WA"}
2021-06-27T16:33:37.601Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Piece ID”: “RTNGVVGK5MT36LGACLQ5PV2A2YMVRNQHKS6LYMIMUHRS3QZH2MGA”}
2021-06-27T16:34:13.009Z INFO piecestore upload started {“Piece ID”: “4PSY4S3PD4KDJCPS6ASITI6TZZJP5MGFEFY2ZDBTLY36SOZ2ZIBA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Available Space”: 6964070917632}
2021-06-27T16:34:13.385Z INFO piecestore uploaded {“Piece ID”: “4PSY4S3PD4KDJCPS6ASITI6TZZJP5MGFEFY2ZDBTLY36SOZ2ZIBA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 181504}
2021-06-27T16:34:18.547Z INFO piecestore upload started {“Piece ID”: “YJSYESJ6QBYEB4EABUXL2YDU4I3RV3HZTDVKFVNFPMULLJQBANPA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Available Space”: 6964070735616}
2021-06-27T16:34:18.892Z INFO piecestore uploaded {“Piece ID”: “YJSYESJ6QBYEB4EABUXL2YDU4I3RV3HZTDVKFVNFPMULLJQBANPA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 181504}
2021-06-27T16:34:18.894Z INFO piecestore upload started {“Piece ID”: “B3Z7TSAXC7YG74VFL54WNQWSHN6MYTKIXHUXGXC24B73ZNRQ2APQ”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT”, “Available Space”: 6964070553600}
2021-06-27T16:34:23.616Z INFO piecestore uploaded {“Piece ID”: “B3Z7TSAXC7YG74VFL54WNQWSHN6MYTKIXHUXGXC24B73ZNRQ2APQ”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT”, “Size”: 362752}
2021-06-27T16:34:57.650Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “CWFHRHPACKUCCOKJSSVTRV4ERTZRK566XTEM6Z6ZJL4Y6EDQWCKA”}
2021-06-27T16:37:05.722Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “MNWTF5JU5RQ46F7S3MUYG7ABJBYPTWGI67KQRGEWFBPQNGJ72CVA”}
2021-06-27T16:38:06.326Z INFO piecestore upload started {“Piece ID”: “KZYL454PLC3W7CEKR6ET7BZLRGIJDMQHVMVP7AZQWSBUD2F2R25Q”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT_REPAIR”, “Available Space”: 6964070190336}
2021-06-27T16:38:16.408Z INFO piecestore uploaded {“Piece ID”: “KZYL454PLC3W7CEKR6ET7BZLRGIJDMQHVMVP7AZQWSBUD2F2R25Q”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT_REPAIR”, “Size”: 1620224}
2021-06-27T16:38:55.576Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “EPDMSB77W3Q7EZLQWBWCZ7L6CTUBI2MUA7IEMFPYKYR324DBWNWA”}
2021-06-27T16:38:55.577Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “Y7TDVQX3A3XW5D7C2OCK3GJ6BOLWHONINANCIMJEYWXQRF2FSDQQ”}
2021-06-27T16:38:55.578Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “IU3ORNF2CY2WSVF5K5A3WEK3PJT4UKR5BC3WGI4XLDENAA7M6GGA”}
2021-06-27T16:39:14.157Z INFO piecestore upload started {“Piece ID”: “DQ4RATSDNX7SMSOLYCSFXANVKMYLF7OECW7OU5TNIUCDGRDREITA”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT”, “Available Space”: 6964068569600}
2021-06-27T16:39:14.588Z INFO piecestore uploaded {“Piece ID”: “DQ4RATSDNX7SMSOLYCSFXANVKMYLF7OECW7OU5TNIUCDGRDREITA”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT”, “Size”: 60416}
2021-06-27T16:39:23.302Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Piece ID”: “DQ4RATSDNX7SMSOLYCSFXANVKMYLF7OECW7OU5TNIUCDGRDREITA”}
2021-06-27T16:40:23.410Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “L6X7P2R6MVZMTVXXPAL5GTHDZ7RHE6TUQZPKMZTKBFPN34FK6L4A”}
2021-06-27T16:40:23.411Z INFO piecedeleter delete piece sent to trash {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “NRJ77KUYPFWZCWD7GRITL5TM3DGCNWMVFYI7DWHYZPR7BDHEMEGA”}
2021-06-27T16:40:47.973Z INFO piecestore upload started {“Piece ID”: “WF2OXZFVFPDW7HEN2ODY3SG4SQKWD5N5NTBJMFHWUFKRXTNPZ3VA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Available Space”: 6964068508672}
2021-06-27T16:40:48.360Z INFO piecestore uploaded {“Piece ID”: “WF2OXZFVFPDW7HEN2ODY3SG4SQKWD5N5NTBJMFHWUFKRXTNPZ3VA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 181504}
2021-06-27T16:40:54.116Z INFO piecestore upload started {“Piece ID”: “EZOVLHPGUHVTKEYBN2ESXEKFJJRYCH4DF5CFCKNBSNXR3ANAKCGA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Available Space”: 6964068326656}
2021-06-27T16:40:54.612Z INFO piecestore uploaded {“Piece ID”: “EZOVLHPGUHVTKEYBN2ESXEKFJJRYCH4DF5CFCKNBSNXR3ANAKCGA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 181504}
2021-06-27T16:42:10.842Z INFO piecestore upload started {“Piece ID”: “WCUS3CY3U5YRFFCUKB64M2WTXHFU63HG7C4EDEHUAYZYDDX4FOAQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Available Space”: 6964068144640}
2021-06-27T16:42:11.228Z INFO piecestore uploaded {“Piece ID”: “WCUS3CY3U5YRFFCUKB64M2WTXHFU63HG7C4EDEHUAYZYDDX4FOAQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 181504}
@storj_node_2:~#

Based on your docker run command, aren’t you forwarding traffic from node 2 to the same port node 1 runs on? The 28968:28967 part?

That was the only way I got node 2 online.

Node1
docker run -d --restart unless-stopped -p 28967:28967 -p 14002:14002
Node 1 is working OK with that command.

Node2
docker run -d --restart unless-stopped --stop-timeout 300 \
 -p 28968:28968/tcp \
 -p 28968:28968/udp \
 -p 127.0.0.1:14003:14003
Is that correct, given that I have the second node on the same LAN, connected to the same router?

This wouldn’t be right. Ports on the right-hand side of the colons are internal (container) ports: they should never change. The command you showed earlier is the right one for node 2, with ports configured as follows:

[...]
 -p 28968:28967/tcp \
 -p 28968:28967/udp \
 -p 127.0.0.1:14003:14002 \
[...]

All nodes can be on the same LAN, no problem with that. In fact, they can even be on the same machine behind the same local IP. As long as port redirections are set up correctly on the ISP router, nodes can basically be anywhere on your local network.
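For reference, here’s a sketch of how node 2’s port-related flags could look; the external address is a placeholder and the rest of your usual parameters stay unchanged:

[...]
 -p 28968:28967/tcp \
 -p 28968:28967/udp \
 -p 127.0.0.1:14003:14002 \
 -e ADDRESS="your.external.address:28968" \
[...]

The container-side ports 28967 and 14002 never change; only the host-side ports and the ADDRESS (which must advertise the externally forwarded port, 28968 here) differ between nodes. On the router, forward external port 28968 (TCP and UDP) to this VM.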

The log excerpt you showed seems okay. Could you search the logs for lines that contain “error” or “warning”?
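Something like this should do it, assuming the Docker setup (replace the container name with yours):

 docker logs storagenode2 2>&1 | grep -iE "error|warn" | tail -n 50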

1 Like

Please do not forward the dashboard’s port - it has no protection and anyone in the world could see your private information. Use secured remote access instead: How to remote access the web dashboard - Node Operator
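For example, one of the options described there is an SSH tunnel from your PC to the VM. Assuming the host’s port 14003 is bound to 127.0.0.1 and maps to the container’s dashboard port, it would look roughly like this (user and IP are placeholders):

 ssh -L 14003:127.0.0.1:14003 user@<VM local IP>
 # then open http://localhost:14003 in the browser on your PC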

Regarding the remaining parts, all looks good. The nodes are behind the same public IP, so they will share ingress and together can receive only as much as one node in total. So the stopped ingress on node 1 looks right (it is full, so new data goes to node 2).

By the way - you don’t need several Linux VMs to run storage nodes; you can run them both on the same VM, but with different disks and identities.
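For example, two containers on the same VM could differ only in name, host ports and mount paths (the paths below are just an illustration):

 docker run -d [...] --name storagenode \
  -p 28967:28967/tcp -p 28967:28967/udp -p 127.0.0.1:14002:14002 \
  --mount type=bind,source=/mnt/disk1/identity,destination=/app/identity \
  --mount type=bind,source=/mnt/disk1/data,destination=/app/config \
  [...]

 docker run -d [...] --name storagenode2 \
  -p 28968:28967/tcp -p 28968:28967/udp -p 127.0.0.1:14003:14002 \
  --mount type=bind,source=/mnt/disk2/identity,destination=/app/identity \
  --mount type=bind,source=/mnt/disk2/data,destination=/app/config \
  [...]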

Hi Alexey, I will try that “secured remote access”.

When I pull up the CLI dashboard, egress/ingress has changed, so node 2 is getting some data. 1400X is just for the web interface with the audit/suspension scores and the month’s earnings. Any advice to get that working?
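For reference, I pull the CLI dashboard like this (storagenode2 is just what I named the container):

 docker exec -it storagenode2 /app/dashboard.sh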

I also tried (local IP):1400X and got “This site can’t be reached”. The issue is somewhere in that command line structure or in that VM.

@Pac node 2 produces just that small amount of logs, and there are no “error” or “warning” entries, compared to node 1.

Replace

 -p 127.0.0.1:14003:14002 \

with

 -p 14003:14002 \

and your node’s dashboard will be available from the local network - and, since you have a port forwarding rule for that port, to everyone in the world.

1 Like

Hi Alexey, I will leave it as is for now. That makes sense: I used the command line structure from my old node 1, and when it was created a long time ago that part was missing. It was probably added later for security. Thank you, guys! This can be closed now. Now I need to think about security. Thank you, Alexey!

1 Like