Support for problematic node

support27 · July 31, 2024, 9:36pm

That’s why I have muted users who have responded in such ways previously. Unfortunately, it seems that bad happens more often than good.

Not one person has explained to me what is meant by “on 30s”. (Perhaps I have missed that, but I have looked, and I am only waiting for the poster’s response; there is a time difference between my current location and theirs.) Please feel free to explain what is meant by it.

I have addressed every question (perhaps I have missed some, but not intentionally; it gets very convoluted when multiple threads are merged into one and I do not know who is asking what of whom) and posted responses to requests (e.g., system specifications most recently). I have provided answers. However, as I may not know what is wanted, would it not be better to request specific information, as opposed to my posting everything? That makes it much more difficult to get through, especially as I am not aware of what is being looked for. (You are certainly correct that I am not experienced in Storj, and I have never claimed to be.)

That is certainly not the case. First I was directed to create a support helpdesk account and provide details there. Then, when that was done and I was putting details in there, I got threatened with having that account deleted. It was them that DIRECTED ME to set up the account, and then the account gets deleted for actually using it? How is that supportive? And that, I believe (I could be wrong), is by the actual support manager staff / department!? I am only observing, but that process seems to be a giant ***********.

How that relates to what you say I do not see (perhaps, for those specific examples, it applies more to the actual support staff than to a forum member).

Stez · July 31, 2024, 9:55pm

What I think @Alexey was trying to say was “up [it] by 30s” or “by 30s” instead of “on 30s”.

[It] in this context points to the config.yaml file and specifically this line:

“# how long to wait for a storage directory readability verification to complete”

In this case you would uncomment the line below it and change the default value from ‘1m0s’ to ‘1m30s’. In case you don’t know what uncommenting means: it’s removing the hash sign (#) at the start of the line, which makes that particular line active.

The chance of this making any difference for your node is about 0.1%, I would argue.
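For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch of the uncomment-and-change step. It is only an illustration: `set_option` is a made-up helper name, the key name is taken from the config lines quoted in this thread, and the function works on the file's text as a string, so reading and writing config.yaml (and keeping a backup) is left to you.

```python
import re


def set_option(config_text: str, key: str, value: str) -> str:
    """Uncomment `key` if it is commented out and set it to `value`.

    If the key is not present at all, append it as a new line.
    Hypothetical helper for illustration; not part of any Storj tooling.
    """
    # Match the key at the start of a line, with an optional leading "#".
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"[ \t]*:.*$",
        re.MULTILINE,
    )
    new_line = f"{key}: {value}"
    if pattern.search(config_text):
        # Replace only the first occurrence; other lines stay untouched.
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + separator + new_line + "\n"


sample = (
    "# how long to wait for a storage directory readability verification to complete\n"
    "# storage2.monitor.verify-dir-readable-timeout: 1m0s\n"
)
updated = set_option(sample, "storage2.monitor.verify-dir-readable-timeout", "1m30s")
print(updated)
```

The explanatory comment line is left as it is; only the option line loses its leading `#` and gets the new value. The node still has to be restarted afterwards.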

support27 · July 31, 2024, 10:31pm

The only thing I have that is similar is the following (which is uncommented already; I did know what that means)

I believe that was done via another suggestion (possibly @Alexey) and if that is the case, that may be the language barrier he referenced.

# how frequently to verify the location and readability of the storage directory
storage2.monitor.verify-dir-readable-interval: 1m30s

Mark · August 1, 2024, 12:19am

There is another setting available for timeout. If it’s not in your config file, you may manually add it yourself and edit it as desired. On my node, it looks like:

# how long to wait for a storage directory readability verification to complete
# storage2.monitor.verify-dir-readable-timeout: 1m0s

As you already know, it will need to be uncommented if you intend to customize this setting.

Alexey · August 1, 2024, 7:20am

As I said above, in your config.yaml.

Alexey:

You need to increase the writable check timeout on 30s,
Fatal Error on my Node
storage2.monitor.verify-dir-writable-timeout: 1m30s
You need to save the config after the change and restart the node.

See above.

Open cmd.exe or PowerShell as an Administrator and execute the command:

chkdsk c: /f

Alexey · August 1, 2024, 7:25am

This is the wrong one. It’s for the readability check interval, not the timeout.
Do you also have readable check failures in your logs? Because I saw that you have only writeable check failures:

So you need to add/uncomment another option and increase the default timeout from 1m0s to 1m30s:

Julio · August 1, 2024, 8:04am

He means after you change the aforementioned.
The reference is to the config.yaml lines, “storage2.monitor.verify-dir-writable-interval: 5m0s” and
“storage2.monitor.verify-dir-writable-timeout: 2m30s” ← my defaults, that I prefer.

I believe you will see the default check interval at about 2m0s, and the default writable timeout set to 1m0s (thus the phrasing of the error you see).

Alexey is recommending that you increase it from that default to 1m30s, then restart the node. If that specific error persists, increase it in 30s increments to 2m0s, and so on. Do not exceed 5 minutes. After adjusting the config file, a restart is of course necessary for the option to take effect. While it should be obvious, I feel compelled to point out that you would need to set the check interval to at least the timeout value; otherwise the software could f# up and step all over itself, or use default values anyway due to the error.
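The two rules of thumb above (keep the timeout at or below 5 minutes, and keep the check interval at least as long as the timeout) can be sanity-checked with a small Python sketch. The function names are made up for illustration, the 5m0s cap is just the limit suggested in this post, and the parser only understands simple `h`/`m`/`s` values like the ones quoted in this thread.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '1m30s' or '5m0s' (whole hours, minutes, seconds only)."""
    if not re.fullmatch(r"(\d+[hms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    seconds = sum(
        int(number) * _UNIT_SECONDS[unit]
        for number, unit in re.findall(r"(\d+)([hms])", text)
    )
    return timedelta(seconds=seconds)


def check_writable_settings(timeout: str, interval: str, max_timeout: str = "5m0s"):
    """Return a list of problems with a timeout/interval pair (empty if none)."""
    problems = []
    timeout_value = parse_duration(timeout)
    interval_value = parse_duration(interval)
    if timeout_value > parse_duration(max_timeout):
        problems.append(f"timeout {timeout} is above the suggested cap of {max_timeout}")
    if interval_value < timeout_value:
        problems.append(f"interval {interval} is shorter than timeout {timeout}")
    return problems


print(check_writable_settings("1m30s", "5m0s"))  # no problems expected
print(check_writable_settings("2m30s", "2m0s"))  # interval shorter than timeout
```

An empty list means the pair passes both checks; it says nothing about whether the disk can actually keep up.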

But that is only a bandaid. The error indicates that the IOPS your node needs to push to your disk get so backed up that the OS still hasn’t flushed the data and can’t keep up, so the app stops itself. If it isn’t writing what it should have, you’d be subject to later random audits, and your node would simply not have the data to answer them, because it never got it.
That specific node is simply choking to death…
I think you dismissed this very important part:

Don’t waste time. You need to at least separate the .db folders from your actual node’s data hdd, even then you’ll probably have to defrag that disk.
The rest is patience.

Julio · August 1, 2024, 8:17am

By the way… lol… no, I don’t want to buy it! I have Enterprise junk here. Nevertheless, if you did buy it for this project, that was not a good move… and if you have just started this project, I’d seriously just nuke that node and start over. It’s just been junk test data this last month and a half… and it will continue.

1/16 of 1 cent.

Alexey · August 2, 2024, 4:55am

It would be 5m0s if they hadn’t changed it. Right now they have specified only the check interval, and for readable checks instead of writeable ones.

support27 · August 2, 2024, 1:42pm

What is the right one? I do not know.

Alexey:

So you need to add/uncomment another option and increase the default timeout from 1m0s to 1m30s:
Fatal Error on my Node
storage2.monitor.verify-dir-writable-timeout: 1m30s
You need to save the config after the change and restart the node.

Done.

support26 · August 2, 2024, 10:36pm

No, just some extra Dell servers (too outdated for the live environment). Zero investment; it was more a curiosity when I started the nodes. I don’t recall exactly, frankly. All of them have been running a while, which is why I was hesitant about / worried about losing the node. I think the payouts on the servers are several hundred each at this point; I never cashed them out.

support26 · August 2, 2024, 10:41pm

Don’t waste time on what? Changing to an SSD?
The databases are in a different location; that has been done. How do I know if it is working?

Alexey · August 3, 2024, 2:17am

You wouldn’t have any errors related to the databases in your logs.

support26 · August 3, 2024, 10:39am

How would I know that??

Mitsos · August 3, 2024, 10:43am

You have extra servers that aren’t suitable for a live environment (= some knowledge about separating production and staging/development environments), yet you don’t know how to check the logs for an error?

I’ve been following this thread silently, but at this point I would suggest that running a Storj node is perhaps not the best fit for you. It’s not a fully managed turnkey solution that you never have to bother with; it requires at least some basic system administration and networking skills.

support26 · August 3, 2024, 11:12am

They are certainly suitable for a live environment. They are not suitable for the live environment they came from (which is what was meant by the original post, to clarify).

Absolutely, how would I know the storj system?

Not a very welcoming / helpful mentality to have, for a community support forum.
Unfortunately, compared with the few forum members offering genuine assistance, the proscribing mentality of the others seems to be the norm / majority on here.

Mitsos · August 3, 2024, 11:14am

People have been trying for the past two weeks to provide as much information and help as they can. We can’t get any more helpful than that, short of doing the install/maintenance for you.

Alexey · August 4, 2024, 3:42am

From your logs. You already know how to check them. You should not have any errors related to the databases and/or filewalkers.
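If reading through the whole log by eye is too much, a rough Python sketch like the one below can narrow it down. It is only a plain text filter built on assumptions: the keywords simply mirror the two things named above, the sample lines are invented placeholders rather than real node output, and `find_suspect_lines` is a made-up name, so adjust the matching to whatever your own log actually looks like.

```python
def find_suspect_lines(log_lines, keywords=("database", "filewalker")):
    """Return lines that mention an error together with one of the keywords.

    Case-insensitive substring matching only; the real log layout is not
    assumed, so expect some false positives and negatives.
    """
    hits = []
    for line in log_lines:
        lowered = line.lower()
        if "error" in lowered and any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits


# Invented placeholder lines, NOT real storagenode output.
sample_log = [
    "INFO  example: something routine happened\n",
    "ERROR example: database problem (placeholder text)\n",
    "ERROR example: filewalker problem (placeholder text)\n",
    "ERROR example: unrelated problem (placeholder text)\n",
]

for hit in find_suspect_lines(sample_log):
    print(hit)
```

To run it against a real file, pass the open file object instead of `sample_log`, for example `find_suspect_lines(open(path_to_your_log, encoding="utf-8", errors="replace"))`, where `path_to_your_log` is wherever your node writes its log. If nothing is printed, that suggests there are no matching errors, at least for these keywords.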

support26 · August 4, 2024, 1:45pm

I do have errors in the logs. This is what I have been trying to fix. (Again, unable to post the results here)

Alexey · August 5, 2024, 3:06am

Please post at least one of them between two new lines with three backticks, like this:

```
error here
```