Order: unexpected EOF

@moby I still have the old unsent order files; they are on my desktop.

That was quick! Nice job!


Another thing that would be helpful for us is if anyone could provide me with an example of a corrupted file that failed to send and is causing this issue. If you would like to help, please send a corrupted orders file to me at moby@storj.io
@donald.m.motsinger do you still have the corrupted file?


I just sent you all 4 corrupted files I have.


Received. Thank you :slight_smile: They should help a lot in my investigation.

It is not clear to me how to solve this error in the log :sweat_smile:
What to do? :thinking:

Windows 10 installation

Unfortunately the logs do not provide enough information yet to be able to tell you precisely what to do, but hopefully we can make them clearer in the future. Basically, one or more of your unsent orders files has been corrupted. There is one unsent orders file for every hour, but presumably the corrupted one(s) are before the log you posted above (2020-09-10T11:53:51.382-0300) - maybe you can narrow down which files it could be based on that.

An easier solution might be to move all the files out of the directory (don’t delete, just move), then put them back in one at a time or in batches to send them out - the node should be configured to attempt sending every 5 minutes by default, but it can be reconfigured.

By the way, since you have timestamps for the logs and can figure out when you first saw this error, you should be able to make the process a little easier by only messing with files before that point. You can do that by taking the last portion of the unsent file which looks like “1599692400000000000” (representing the timestamp of the order creation hour) and plugging that in here: https://play.golang.org/p/0HBazvcfYPa

Once the unsent files causing the error have been found, what should be done with those files? Do I leave them there or is it fixed just by moving and returning?

What was the cause of the corruption? :face_with_symbols_over_mouth:

I sent my corrupted files :sweat_smile:

By the time the fix is rolled out, those files will be expired, so I do not think there is much benefit in keeping them around. But the fix should submit any orders at the beginning of the file that have not been corrupted. Thank you for sending your files :slight_smile:

@sorry2xs as far as I understand, corruption is always going to be a possibility when it comes to writing files like this, even if there are no bugs in the code. However, we are discussing ways of improving the order saving system so that the consequences related to this type of corruption will be minimal.

Even after moving all unsent files to another folder, I still get:

ERROR orders listing orders {"error": "order: unexpected EOF", "errorVerbose": "order: unexpected EOF\n\tstorj.io/storj/storagenode/orders.readLimit:515\n\tstorj.io/storj/storagenode/orders.(*FileStore).ListUnsentBySatellite.func1:238\n\tpath/filepath.walk:360\n\tpath/filepath.walk:384\n\tpath/filepath.walk:384\n\tpath/filepath.Walk:406\n\tstorj.io/storj/storagenode/orders.(*FileStore).ListUnsentBySatellite:196\n\tstorj.io/storj/storagenode/orders.(*Service).sendOrdersFromFileStore:398\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders:192\n\tstorj.io/storj/storagenode/orders.(*Service).Run.func1:139\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

I did stop the node, moved ALL unsent files to another folder then restarted my node but it didn’t work.

I am really confused about how that could happen. That code path will only get run when there are order files > 1hr old inside the orders/unsent folder. How long after you restarted did this error occur?

Did you leave that other folder inside the original unsent folder? If so, take the new folder out.

@moby The error shows up at every 5 min interval.

It's on a different drive altogether.

Update: Since nothing was working, I checked the databases, and bandwidth.db had the wrong number of indexes, so I fixed it. Now it did send orders, but it also gave the above error.

Yes, once you see it for the first time, you will see it every 5 minutes, but I am curious how long passed between when you started the node after moving all the files out and seeing the error for the first time - it should have been at least an hour.

This shouldn’t have anything to do with the bandwidth.db file. You will only see it if there are no more orders in the database to send. At that point, it will try to send any filestore order limits. So if you do ls <storagedir>/orders/unsent, how many files do you see?

I know, but I followed the steps that SNOs before me performed and it didn’t work, hence my attempt to check everything.

As of now I see 10 files. I assume those are 2 files for each of the 5 satellites, for the current and previous hour. I haven’t copied any of the older files back, since doing so makes the node start giving that error again.

Had the same issue and this solved it. Thanks guys!
