What about latency and mapped drives?

j2geu · April 23, 2020, 11:50pm

Hello,

I don’t see any latency specification in the Storj node requirements. Is latency relevant or not? If yes, what is the maximum latency?

Is it possible to use a mapped network drive for a Storj node, either on the local network or in the cloud? It would be mounted like any other drive. The main problem is latency.

Thanks.

Floxit · April 23, 2020, 11:57pm

Well, I’m not the best expert on the forum, but latency does count, since our nodes compete on response time. That includes the lag between the satellites and your geographic location, between your connection and your ISP’s relay, the Ethernet cable between the router and your device, and your own hardware performance, mostly disk latency (that’s why some SNOs use an SSD cache or, more dangerously, a RAM cache, or both, but that also increases the risk of data corruption or of wearing out another disk). So I think a disk attached directly (by SATA/SAS) to the node device is still the fastest. A remotely connected disk is very bad for I/O performance and reliability for this kind of node activity. I think the Storj staff strongly discourage it because it is riskier.

But in the worst case, you won’t be disqualified for high latency (unless the node times out during an audit or uptime check). You’ll lose the race when downloads and uploads show “canceled” in the logs, which lowers your rate of successful upload/download traffic (and thus your chances to receive and send pieces). A “canceled” transfer means other nodes were faster than yours, and no more nodes were needed to collect the piece for resiliency or for the customer.

Node requirements as posted in the blog, also presented as checkboxes when you fill out the form for your node at https://storj.io/sign-up-node-operator/ :

Minimum Recommended Storage Node Requirements:

A minimum of one (1) processor core dedicated to each storage node service

A minimum of 500 GB with no maximum of available space per node

2 TB of bandwidth available per month; unlimited preferred

5 Mbps bandwidth upstream

25 Mbps bandwidth downstream

Online and operational 99.3 % of the time per month (MAX total downtime of 5 hours monthly)

Preferred Storage Node Requirements:

A minimum of one (1) processor core dedicated to each node service

A minimum of 8 TB and a maximum of 24 TB of available space per node

16+ TB of unmetered bandwidth available per month; unlimited preferred

100 Mbps bandwidth upstream

100 Mbps bandwidth downstream

Online and operational 99.5% of the time per month

Source: We need great storage node operators for the V3 network! Have you got what it takes to succeed?
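As a quick check on the uptime figures above, here is a minimal sketch that converts an uptime percentage into allowed downtime per month. The 30-day month and the function name are my own assumptions, not something the blog post specifies.

```python
# Assumption: a 30-day month (720 hours); the blog post does not say which month length it uses.
HOURS_PER_MONTH = 30 * 24


def max_downtime_hours(uptime_percent, hours_in_month=HOURS_PER_MONTH):
    """Return the hours per month a node may be offline at the given uptime percentage."""
    return hours_in_month * (100.0 - uptime_percent) / 100.0


print(round(max_downtime_hours(99.3), 2))  # minimum requirement, roughly the quoted 5 hours
print(round(max_downtime_hours(99.5), 2))  # preferred requirement
```

Under that assumption, 99.3 % works out to about 5 hours, which matches the "MAX total downtime of 5 hours monthly" quoted above.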

About the pieces and node “latency” competition:

When a file is uploaded to the network, it’s first broken into 256 MB segments. The 256 MB segments are encrypted client side by the Uplink client, then broken up into 95 erasure-coded pieces. The Uplink client requests 95 storage nodes to which it will stream the pieces from the Satellite. The Satellite performs a statistical analysis of the nodes and then returns the list of 95 nodes to the Uplink client. The Uplink client attempts to upload the 95 pieces but stops after 80 of the pieces are uploaded. Even though the Satellite returns the best 95 storage nodes, the Uplink client further optimizes for the fastest 80. Of those 80 pieces, only 29 are actually needed to recover the file based on the Reed-Solomon erasure coding scheme.

Subsequently, when the Uplink client attempts to retrieve the segment while downloading the file, it requests the segment from the Satellite. The Satellite performs a statistical analysis of the nodes holding the segment pieces and returns a list of 35 nodes. The Uplink client requests the pieces held by the 35 nodes, but stops after receiving 29 pieces from the fastest 29 nodes.
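To make the race concrete, here is a rough sketch of the rule described in the two paragraphs above: a node's transfer counts only if it finishes among the first 80 of 95 nodes (upload) or the first 29 of 35 (download). The function name and the transfer times are made up for illustration. A real transfer depends on more than one latency number, and this is not how the Uplink client is actually implemented.

```python
# Numbers taken from the quoted blog text.
UPLOAD_REQUESTED, UPLOAD_KEPT = 95, 80
DOWNLOAD_REQUESTED, DOWNLOAD_KEPT = 35, 29


def wins_race(my_time_ms, other_times_ms, pieces_needed):
    """True if fewer than `pieces_needed` other nodes finish strictly before ours."""
    faster = sum(1 for t in other_times_ms if t < my_time_ms)
    return faster < pieces_needed


# Hypothetical completion times for the 94 other nodes in an upload: 10 ms .. 103 ms.
others = list(range(10, 104))

print(wins_race(85, others, UPLOAD_KEPT))  # 75 nodes are faster, so the piece is kept
print(wins_race(95, others, UPLOAD_KEPT))  # 85 nodes are faster, so the transfer is canceled
```

In other words, only the slowest 15 of 95 nodes lose an upload race, and the slowest 6 of 35 lose a download race.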

You can also look at this article, which explains how many pieces are sent to the nodes and how many are required before the “race finishes” for the slowest nodes: Reputation Matters When it Comes to Storage Nodes

j2geu · April 24, 2020, 7:01am

Thank you very much.
So it is not a requirement, but a node that is too slow will never download or send anything.

Floxit · April 24, 2020, 4:39pm

Probably not “never”, but if your node is slower than 90% of the nodes receiving/sending the piece, your chances of getting or sending it could be really low, making the node less profitable or not profitable at all.

For statistics, there are easy-to-run scripts that show the rates you’re getting: Script: Calculate Success Rates for Audit, Download, Upload, Repair
So you can check your rate anytime, especially after a full month of activity (the first month won’t be the most representative, because new nodes need to be “vetted” by the satellites before they get normal traffic).

If your setup is not failing audits and uptime checks, you should be fine to run it.
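For reference, the arithmetic behind a success rate is simple. The sketch below is not the linked script, and the counts are hypothetical. It only assumes you have already counted how many transfers of one kind succeeded, were canceled, or failed.

```python
def success_rate(successful, canceled, failed):
    """Return the percentage of transfers that succeeded, or 0.0 if there were none."""
    total = successful + canceled + failed
    if total == 0:
        return 0.0
    return 100.0 * successful / total


# Hypothetical upload counts for illustration only.
print(success_rate(successful=900, canceled=90, failed=10))  # 90.0
```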

donald.m.motsinger · April 24, 2020, 4:45pm

SMB and NFS are not supported, only iSCSI. Search the forum for more details.

cdhowie · April 24, 2020, 6:27pm

Note that SMB and NFS are not supported specifically for the SQLite databases. It is possible to have just the blob storage on SMB/NFS without problems, but then you have your node’s data (blobs and databases) split across two drives, which increases your chances of losing your node to a disk failure.

If the databases are on local redundant storage and the blobs are on an SMB/NFS share, that should work fine.

j2geu · April 24, 2020, 6:58pm

Hi, this is great news! Do you have a tutorial for putting the databases on the system drive and the blobs on the SMB share?

Thank you very much.

cdhowie · April 24, 2020, 8:15pm

You’d need to mount the four directories under storage from SMB/NFS:

blobs
garbage
temp
trash

However, note that there are pitfalls here (what if Storj adds another directory, like they did with garbage?) and if you don’t get the configuration exactly right you will lose your node.

This is one of those cases where I’d advise against doing this unless you can do it without a tutorial, because otherwise I fear it will be way too easy to make a simple mistake that destroys the node.
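One mistake of that kind is starting the node while a share is not mounted, so that it sees empty local directories. As a hedged illustration, a pre-start check could look like the sketch below. The storage path and the function name are hypothetical, the directory list is just the four names above and may change, and this is not an official Storj tool.

```python
import os

# The four directories listed above; Storj may add others later.
EXPECTED_DIRS = ("blobs", "garbage", "temp", "trash")


def unmounted_dirs(storage_root, names=EXPECTED_DIRS, is_mount=os.path.ismount):
    """Return the names under storage_root that are NOT currently mount points."""
    return [n for n in names if not is_mount(os.path.join(storage_root, n))]


if __name__ == "__main__":
    # Hypothetical path; replace it with your node's actual storage directory.
    missing = unmounted_dirs("/path/to/storage")
    if missing:
        print("Do not start the node, not mounted:", ", ".join(missing))
    else:
        print("All expected directories are mounted.")
```

This only confirms that something is mounted at each path. It cannot tell whether it is the right share.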

Also, as I mentioned in my prior post, the databases would need to be stored on a redundant volume (any of the redundant RAID levels, ZFS/btrfs in redundant configurations, etc.) or you are increasing your chances of losing the node because now the data is on two disks. The odds of one of two disks failing are higher than the odds of one disk failing.

KernelPanick · April 25, 2020, 4:34pm

And the odds of one of two redundant arrays failing are lower than the odds of one disk failing.
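Both statements about the odds can be checked with a back-of-the-envelope calculation. The sketch below assumes independent disk failures, a made-up 5% failure probability per disk over some period, and two-disk mirrors as the "redundant arrays". It ignores rebuild windows and correlated failures, so treat it as an illustration only.

```python
def any_fails(p_each, count):
    """Probability that at least one of `count` independent units fails."""
    return 1.0 - (1.0 - p_each) ** count


def mirror_fails(p_disk, disks_in_mirror=2):
    """Probability that every disk in a mirror fails (independence assumed)."""
    return p_disk ** disks_in_mirror


p = 0.05  # hypothetical per-disk failure probability

one_disk = p                                  # everything on a single disk
two_disks = any_fails(p, 2)                   # blobs and databases on separate single disks
two_mirrors = any_fails(mirror_fails(p), 2)   # blobs and databases each on a 2-disk mirror

print(f"one disk: {one_disk:.4f}")
print(f"either of two disks: {two_disks:.4f}")
print(f"either of two mirrors: {two_mirrors:.4f}")
```

With these assumptions, two single disks are riskier than one disk, and two mirrors are safer than one disk.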