Understanding Storj

  • How many deals are created when a file is uploaded?
  • What factors determine the number of deals created?
  • Can we configure or set a parameter to control the number of deals?
  • How is redundancy maintained in erasure coding? Into how many chunks is a file divided, and how are these parameters decided? How does the system select storage providers?
  • Is the redundancy level set by default, or can users configure it?
  • I plan to upload files of 1GB, 5GB, and 10GB to Storj. What is the best method for this—using the CLI or the object browser?
  • I am unable to install the CLI on my Windows machine. Which documentation should I follow if I only want to upload and retrieve data using the CLI?
  • When data is uploaded to the Storj platform, does it get directly delivered to all storage providers, or is there an intermediary step?
  • When downloading data, does the process happen through the Storj portal, or is it downloaded directly from the storage providers?

The vast majority of your questions are answered in the white paper: https://static.storj.io/storjv3.pdf

The web UI is not designed to be the primary interface for interacting with the service. You can use any of a multitude of third-party tools: some support Storj natively, and some via S3.


We do not use the term "deal". Your libuplink uploads each segment of the encrypted file (64MiB or less) to 80 uncorrelated nodes across the globe (the current default), so the number of nodes used depends on the number of segments, which in turn depends on the chunk size used and the file size.

It would not make sense: such a parameter would limit how large a segment could be and how many segments you could upload. If the file were even slightly larger than this enforced limit, the upload would fail.
However, you can request different erasure-coding numbers (and thus a different expansion factor) instead of the current default of 80/29 (any 29 pieces out of 80 are enough to reconstruct the segment) via Sales, because it's a premium service.
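To put numbers on the 1GB, 5GB, and 10GB files from your question, here is a minimal back-of-the-envelope sketch in Go (assuming the default 64MiB segment size and the default 80/29 erasure-coding numbers described above; this is arithmetic only, not an API you call):

```go
package main

import "fmt"

const (
	segmentSize = 64 << 20 // 64 MiB: default maximum segment size
	totalPieces = 80       // default number of pieces uploaded per segment
	minPieces   = 29       // any 29 pieces are enough to reconstruct a segment
)

func main() {
	for _, size := range []int64{1 << 30, 5 << 30, 10 << 30} { // 1 GiB, 5 GiB, 10 GiB
		segments := (size + segmentSize - 1) / segmentSize // ceiling division
		fmt.Printf("%2d GiB -> %3d segments -> %5d pieces (80 different nodes per segment)\n",
			size>>30, segments, segments*totalPieces)
	}
	// Expansion factor: how much raw data is stored on the network per byte uploaded.
	fmt.Printf("expansion factor = %d/%d ≈ %.2f\n",
		totalPieces, minPieces, float64(totalPieces)/float64(minPieces))
}
```

A 1 GiB file therefore becomes 16 segments, a 5 GiB file 80 segments, and a 10 GiB file 160 segments, each segment spread over 80 nodes.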

It depends on the chunk size used, which in turn depends on the chosen tool. Storj-native tools use 64MiB chunks by default, because that matches the underlying segment size. Other tools may have their own defaults; most S3-compatible tools use 5MiB by default, but some of them allow you to specify it. We would recommend a chunk size of 56MiB-64MiB (56MiB so that you still end up within 64MiB after the expansion factor).
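For example, with an S3-compatible tool built on the MinIO Go client you could set the part size explicitly when talking to the Storj-hosted gateway. This is only a sketch: the bucket, object, and credentials are placeholders, and other tools expose the same knob under different names (e.g. rclone's --s3-chunk-size):

```go
package main

import (
	"context"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// gateway.storjshare.io is the Storj-hosted S3-compatible gateway;
	// the access/secret keys come from the S3 credentials generated for your project.
	client, err := minio.New("gateway.storjshare.io", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Set an explicit multipart part size instead of the tool's small default,
	// per the chunk-size recommendation above.
	_, err = client.FPutObject(context.Background(), "my-bucket", "backup.tar", "./backup.tar",
		minio.PutObjectOptions{PartSize: 64 << 20})
	if err != nil {
		log.Fatal(err)
	}
}
```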

It depends on the settings for your bucket (you cannot configure them via the UI, but changes can be requested via support; however, it's a premium service, so you may want to start with Sales first).
The current default values are 110/80 for uploads and 39/29 for downloads. For example, when you upload, your libuplink selects 110 random nodes from unique /24 subnets of public IPs that have a higher reputation and success rate for your location, then starts uploads in parallel; when the first 80 finish, all remaining uploads are cancelled. Downloads work the same way, but start with 39 nodes and finish with the fastest 29 (because any 29 of the 80 pieces are enough to reconstruct the segment). So you always upload to, and download from, the fastest nodes for your location.
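Conceptually, this long-tail cancellation works like the sketch below. This is not Storj's actual implementation, just an illustration of starting more transfers than needed and cancelling the stragglers once enough have finished:

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// uploadPiece stands in for uploading one erasure-coded piece to one node.
// It sleeps a random amount of time, or aborts early if the context is cancelled.
func uploadPiece(ctx context.Context, node int) error {
	select {
	case <-time.After(time.Duration(rand.Intn(300)) * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	const started, needed = 110, 80 // start 110 transfers, keep the first 80 to finish

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	results := make(chan int, started)
	var wg sync.WaitGroup
	for node := 0; node < started; node++ {
		wg.Add(1)
		go func(node int) {
			defer wg.Done()
			if uploadPiece(ctx, node) == nil {
				results <- node
			}
		}(node)
	}

	// As soon as `needed` pieces have landed, cancel the remaining "long tail".
	for done := 0; done < needed; done++ {
		<-results
	}
	cancel()
	wg.Wait()
	fmt.Println("segment stored: fastest", needed, "of", started, "nodes kept")
}
```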

Users can request a custom redundancy by contacting Sales (it's a premium service). The default is a balanced redundancy, so most customers never change it.

For up to a thousand files it doesn't matter much, though the CLI is usually significantly faster. If you have more than a thousand files, a CLI tool such as rclone or uplink would be best. Some GUI apps, like Cyberduck or FileZilla, may work well too.

It depends on the chosen integration. If you use the Storj-native integration, your libuplink encrypts the file, splits it into segments (64MiB or less), erasure-codes them, breaks them into pieces, and uploads those pieces directly to uncorrelated nodes across the globe.
See the white paper linked above for details.
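For reference, a minimal native upload with the storj.io/uplink Go library looks roughly like the sketch below (bucket name, object key, and file path are placeholders; the access grant is read from an environment variable here):

```go
package main

import (
	"context"
	"io"
	"log"
	"os"

	"storj.io/uplink"
)

func main() {
	ctx := context.Background()

	// The access grant embeds the satellite address, API key, and encryption passphrase.
	access, err := uplink.ParseAccess(os.Getenv("STORJ_ACCESS_GRANT"))
	if err != nil {
		log.Fatal(err)
	}

	project, err := uplink.OpenProject(ctx, access)
	if err != nil {
		log.Fatal(err)
	}
	defer project.Close()

	if _, err := project.EnsureBucket(ctx, "my-bucket"); err != nil {
		log.Fatal(err)
	}

	// Start an upload; encryption, segmentation, and erasure coding happen client-side.
	upload, err := project.UploadObject(ctx, "my-bucket", "backup.tar", nil)
	if err != nil {
		log.Fatal(err)
	}

	f, err := os.Open("./backup.tar")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if _, err := io.Copy(upload, f); err != nil {
		_ = upload.Abort()
		log.Fatal(err)
	}
	if err := upload.Commit(); err != nil {
		log.Fatal(err)
	}
}
```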

However, if you use the S3-compatible protocol, it uses server-side encryption, unlike the Storj-native integration with its end-to-end encryption. Your S3-compatible tool uploads files unencrypted (over an encrypted connection) to the S3-compatible gateway, which then does the same as the libuplink tools: it encrypts the file, slices it into segments, erasure-codes them, slices them into pieces, and uploads them to uncorrelated nodes across the globe, using your access grant decrypted from the provided S3 credentials.
You may also take the best of both worlds: use S3-compatible tools with your own self-hosted S3-compatible gateway, running locally or in your infrastructure, so your access grant will not leave your network.
