Failed to encode json: unsupported value

hatred · January 27, 2023, 9:11am

My node constantly writes error messages to the log.

ERROR	console:endpoint	failed to encode json response	{"Process": "storagenode", "error": "consoleapi storagenode: json: unsupported value: NaN", "errorVerbose": "consoleapi storagenode: json: unsupported value: NaN\n\tstorj.io/storj/storagenode/console/consoleapi.(*StorageNode).Satellite:110\n\tnet/http.HandlerFunc.ServeHTTP:2084\n\tgithub.com/gorilla/mux.(*Router).ServeHTTP:210\n\tnet/http.serverHandler.ServeHTTP:2916\n\tnet/http.(*conn).serve:1966"}

Restarting or recreating the container doesn’t help. The node runs version 1.70.2. What should I do?

Alexey · January 29, 2023, 6:50am

Please try to check your databases:

hatred · January 29, 2023, 7:18am

Checking the database is a rather complicated procedure. Do you think that this error occurs for this reason?

Alexey · January 29, 2023, 7:23am

Almost all stats are taken from the databases, so if the databases are corrupted or contain wrong values, the code may produce various errors.

hatred · January 29, 2023, 7:36am

Well, thank you. Do you have instructions on how to move the databases to an SSD? Is it easier to do this via a bind mount or via the configuration file?

Alexey · January 29, 2023, 8:56am

You should do both
See

You can also pass the storage2.database-dir parameter as the argument --storage2.database-dir after the image name instead of modifying the config file.

fbl · January 31, 2023, 11:15pm

Hi. I am getting the same error on a fairly recently set up storagenode. The issue has appeared since somewhere around v1.69.2.
It looks like something has changed with the JSON encoder used here: storj/storagenode.go at 6f87ea801ff88ec0fa2fb2f0983e94b27e395115 · storj/storj · GitHub

For some reason my storage_usage.db contains entries with invalid data for one of the satellites:

sqlite> select * from storage_usage;
2023-01-19 00:00:00+00:00|HRE�*�_\�8�^� |0.0|0001-01-01 00:00:00+00:00
2023-01-21 00:00:00+00:00|HRE�*�_\�8�^� |0.0|0001-01-01 00:00:00+00:00
[..]

This, in turn, causes the JSON encoder to fail. Removing those entries temporarily fixed the issue, but as they are regenerated at least on every restart, the issue appears again after some time.

Roberto · February 1, 2023, 6:26am

But if I have corrupted db’s, do I have to repair them or will the statistics work again the following month?

ItsHass · February 1, 2023, 12:25pm

I’m seeing this too…

‘known issue’ ?

nothing else wrong with the node as such… it just appears sometimes in the logs

Cheers
Hass

jennifer · February 1, 2023, 7:33pm

Have you gotten a “database disk image is malformed” error in your logs?

jennifer · February 1, 2023, 7:34pm

Please try to fix your database if you can. Here is the documentation.

jennifer · February 1, 2023, 7:43pm

@hatred How has your troubleshooting gone? Are you still seeing the same error?

fbl · February 1, 2023, 7:47pm

Not recently. The PRAGMA integrity_check; did not bring up any issues for any of the database files.

jennifer · February 1, 2023, 7:53pm

Ok thanks for checking, I’m looking into the recent changes to the usage data field to see if there’s an edge case bug.

fbl · February 1, 2023, 8:33pm

It looks like the issue has been there before, but manifested differently.

The issue only appears in combination with a Prometheus Exporter, so only if the API endpoint is being used. In particular, it only happens when loading Statistics for a particular Satellite via /api/sno/satellite/<id>. (Only with the satellite which has rows with invalid data in storage_usage.db)

With v1.68.2 the API would return an error message in the API response. With v1.69.2 and later, the API response is empty and an error message is printed on stderr instead.

I have tried it again today and I am not able to reproduce the issue anymore. The root cause of this issue probably is the invalid data in storage_usage.db, but I was unable to figure out how this is generated in a reasonable amount of time.

Vadim · February 1, 2023, 9:29pm

I use API requests every day, and I saw something like this on version 1.68. For some reason the error occurred only in the first part of the day; later it worked fine. So I think there is a divide by zero somewhere, or a too-small number triggers the error.
This error also affected only some of my nodes. I have 70 of them, and about 30-40% were affected.
I don’t have Prometheus; I have my own app that reads the JSON from the API.

Code_Breaker · February 4, 2023, 2:55am

@Vadim et al.,

I believe you are correct that there might be a divide by zero condition occurring. There was also an issue where the calculation wasn’t using the correct units, which could cause a problem similar to the one you mentioned with dividing by too small a number. The second issue was discovered and is resolved in the next release. If you still have problems after upgrading, please let us know and we can investigate further.

Hope this message finds you doing well!

Roberto · February 5, 2023, 1:48pm

I checked the databases. I had to copy them to Windows. They look OK.

hatred · February 6, 2023, 6:09pm

On most nodes it disappeared, but it appears on others.

hatred · February 21, 2023, 10:54pm

It’s back again:

failed to encode json response	{"Process": "storagenode", "error": "consoleapi storagenode: json: unsupported value: NaN", "errorVerbose": "consoleapi storagenode: json: unsupported value: NaN\n\tstorj.io/storj/storagenode/console/consoleapi.(*StorageNode).Satellite:110\n\tnet/http.HandlerFunc.ServeHTTP:2084\n\tgithub.com/gorilla/mux.(*Router).ServeHTTP:210\n\tnet/http.serverHandler.ServeHTTP:2916\n\tnet/http.(*conn).serve:1966"}

Is there any fix for it this time?