What are these repair costs? Is there a way to quantify them? I understand normal repair costs as meaning that pieces have to be downloaded from SNOs, which get paid for that.
But that is not the case here. The upload agent already has all pieces locally. Uploading to Tardigrade should not come with costs.
Only reconstructions come to mind, which probably means CPU resources plus disk space.
Am I overlooking something?
Unfortunately, the upload to the network is not free for most cloud computing providers; outbound bandwidth usually costs money.
If you mean that the service provider should act as a repair worker, then it’s not desirable, because the worker requires direct access to the satellite’s database and thus must be trusted by the Tardigrade satellite (and have keys).
The other way is to act as an uplink on behalf of the customer; in that case the service provider must have the customer’s keys (or access grant).
Both ways are not good and require trust, and would be much more complicated with all the documents related to data security. It’s better to avoid any trust requirements between the parties.
The simplest workable solution would be to upload 35 pieces to the network and let the repair workers do their job for a repair fee, or upload all 80 required pieces and not pay that fee.
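For a rough feel of that trade-off, here is a back-of-the-envelope sketch. The Reed-Solomon numbers (29 pieces needed to reconstruct a segment, roughly 35 as the repair trigger, 80 placed by a normal upload) are my assumptions based on the commonly cited defaults and may differ per satellite:

```go
package main

import "fmt"

func main() {
	const (
		required = 29.0 // pieces needed to reconstruct a segment (assumed default)
		repair   = 35.0 // pieces to upload if repair workers fill in the rest
		optimal  = 80.0 // pieces placed by a normal, fully redundant upload
	)

	// Expansion factor = bytes sent to the network per byte of customer data.
	fmt.Printf("upload 35 pieces: %.2fx expansion, repair fee applies\n", repair/required)
	fmt.Printf("upload 80 pieces: %.2fx expansion, no repair fee\n", optimal/required)
}
```

So the choice is roughly 1.21x of upload traffic plus a repair fee versus 2.76x of upload traffic with no fee.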
I think that’s a very good summary. Easy to understand, straightforward (even cost-wise).
I think it’d be a nice thing too
The only remaining problem is the path to the bucket and the account (API key). The pieces themselves do not contain such info.
So we can’t avoid giving the agent an access grant to upload those pieces to the bucket. But in the case of pre-encrypted pieces we can give only a temporary write access, without exposing even the derived encryption key.
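For the temporary write access, a minimal sketch with the Go uplink library could look like this. The bucket name, prefix and the two-week expiry are made up for illustration; the point is just that the agent gets an upload-only, time-limited grant derived from the owner’s full grant:

```go
package main

import (
	"fmt"
	"log"
	"time"

	"storj.io/uplink"
)

func main() {
	// The full access grant stays with the data owner and is never handed out.
	access, err := uplink.ParseAccess("<full-access-grant>") // placeholder
	if err != nil {
		log.Fatal(err)
	}

	// Derive a restricted grant: upload only, limited to one bucket/prefix,
	// expiring after two weeks.
	restricted, err := access.Share(
		uplink.Permission{
			AllowUpload: true,
			NotAfter:    time.Now().Add(14 * 24 * time.Hour),
		},
		uplink.SharePrefix{Bucket: "migration", Prefix: "lto-archive/"},
	)
	if err != nil {
		log.Fatal(err)
	}

	serialized, err := restricted.Serialize()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("hand this to the upload agent:", serialized)
}
```

The uplink CLI offers similar restriction options for sharing, if you don’t want to touch Go.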
Here is a nice AWS case study for such a service (which seems to be required to get big data into the cloud):
As the DAM system was being implemented, the Rock Hall was undertaking a project to modernize its aging LTO storage and offsite backup for the preservation of large digital files, by moving the files to the cloud. Many of the LTO tapes were not easily accessible due to hardware and software failures and onsite storage limitations. Parnin says, “As the tech landscape progressed from the non-digital to digital, our archival storage system had become unmanageable and unsustainable.”
Using Amazon S3 and S3 Glacier Deep Archive provided the Rock Hall with the confidence that its digital media would be preserved and easily accessible at an affordable price. However, the Rock Hall still needed to recover the data on the LTO tapes. Working with AWS, the project team ingested the files into S3 Glacier Deep Archive via six AWS Snowball Edge Storage Optimized devices. Using AWS Snowball Edge helped to address common challenges with large-scale data transfers, including high network costs, long transfer times, and security concerns.
The Rock Hall worked with Tape Ark and its strategic partner Seagate Technology (Seagate Powered by Tape Ark) to extract all the data from the LTO tapes and load it onto the AWS Snowball Edge devices. Tape Ark then sent the Snowball devices loaded with data back to AWS, and the data went right into the Rock Hall’s Amazon S3 bucket. Once the Rock Hall’s digital media was in Amazon S3 on the AWS Cloud, Amazon S3 lifecycle policies were set up to automatically move the media files into S3 Glacier Deep Archive to help optimize storage costs for files that were rarely accessed. For the Rock Hall, the process was effortless, with Tape Ark managing the end-to-end migration.
@Alexey: I don’t know if the new Gateway MT could help with that. But if you read the above case study, I believe a solution is really needed to get big data on board. I think nobody in their sane mind would even try to upload 300 TB x 2.7 = 810 TB over a normal internet connection. Even with an SDSL fibre connection with 1 Gbit upload speed, which is among the fastest you can get here, it would take 75 days.
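Just to make the arithmetic explicit, here is the same estimate as a small sketch (assuming the full 2.7x expansion happens on the customer’s side and the 1 Gbit/s uplink is fully saturated with no protocol overhead):

```go
package main

import "fmt"

func main() {
	const (
		dataTB    = 300.0 // size of the archive
		expansion = 2.7   // client-side Reed-Solomon expansion factor
		gbitPerS  = 1.0   // upload speed of the connection
	)

	onWireTB := dataTB * expansion // data actually sent to the network
	seconds := onWireTB * 1e12 * 8 / (gbitPerS * 1e9)
	fmt.Printf("%.0f TB on the wire, about %.0f days at line rate\n", onWireTB, seconds/86400)
}
```

That prints 810 TB and roughly 75 days, and a real transfer would be slower than line rate.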
From my view there needs to be a way to prepare everything locally on disks, send them to a Storj Labs-approved data center, and have them upload it within a couple of days or even hours.
Yes, Gateway MT may help here, but only for the service provider, since their bandwidth would be utilized only 1x instead of 2.7x.
Everything else will remain the same. The only problem there is the unencrypted data.
In the case of Gateway MT it can also only be server-side encrypted at the moment; client-side encryption for Gateway MT is not implemented yet.
Just noted that in the meantime Wasabi also offers a hardware solution to help customers transfer large amounts of data onto their storage:
Disappointed that their ball isn’t in the shape of a ball.
Balls are overrated.
It seems that for some reason such devices must be named ‘*ball’:
https://www.backblaze.com/b2/solutions/datatransfer/fireball.html
Bringing this back to attention.
Now, with the commercial operators, Storj has professional data center partners that could act as uploaders for customers who want to migrate large quantities of on-premises data without uploading it themselves.
The idea is still the same and similar to what all the other cloud providers offer:
- Customer wants to migrate data, contacts Storj
- Storj sends “device” to customer
- Customer moves data onto device
- Customer sends device to the nearest Storj commercial operator
- Commercial operator connects device and uploads data
- Commercial operator sends “device” back to Storj
All of this of course in a way that data is encrypted and remains encrypted at all times and nobody else than the customer has access to the unencrypted data.
We have at least this one Partner, who can help with it:
I think we also have other Partners and guides to do the same:
- Flexify
- How to migrate from Backblaze to Storj - Storj Docs
- Google Drive to Storj: The Future of Secure and Decentralized Cloud Storage
and many others. The simplest one is to use rclone sync.
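If you want to script such a migration, a minimal Go wrapper around that same rclone sync call might look like the sketch below; the remote names `b2` and `storj` are placeholders for whatever remotes you have configured in rclone:

```go
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	// rclone sync copies everything from the source bucket to the Storj bucket
	// and skips objects that are already identical on the destination.
	cmd := exec.Command("rclone", "sync",
		"b2:source-bucket",  // placeholder source remote:bucket
		"storj:dest-bucket", // placeholder Storj remote:bucket
		"--progress",        // show live transfer statistics
		"--transfers", "16", // run transfers in parallel
	)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatalf("rclone sync failed: %v", err)
	}
}
```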
See also
I do not think that we have Partners or Customers who want to migrate their data physically (using a device) so far.
On the other hand, there has to be a demand for it. Otherwise Google, AWS and others would not offer it:
Amazon even offered a truck full of storage devices that customers could order to have petabytes of data moved into AWS.
https://www.datacenterdynamics.com/en/news/aws-retires-snowmobile-truck-based-data-transfer-service/
Satellite firm DigitalGlobe – later acquired by Maxar – was a named Snowmobile customer, using the service to move 100PB of satellite imagery to AWS.
Yes, they may contact our sales team to have this feature too, I believe.
That demand could be fading
To be honest, I always thought that it’s not effective. Now they have just confirmed my feelings.
What is your idea for how to move 100 PB into the cloud? Generally, I mean.
And I wonder what Storj’s suggestion would be to a customer with, let’s say, a really huge archive of tens of petabytes on premises who wants to move it to Storj?
That’s what I like about those devices the competition offers. The customer basically has to do nothing. He receives the right device for his use case, plugs it in, moves the data onto it, and ships it to the approved data center without worries.
Here is another extreme case in which shipping hard drives was the only viable solution:
Generally, the field of astronomy sounds like it could be an interesting case for Storj, like many other fields of science where huge amounts of data get created and moved around the world. Cloudferro seems to successfully acquire clients in these fields, and their object storage seems expensive:
In the past I made a suggestion for how Storj could eventually get some visibility in the science and research space:
But there must also be trade fairs or conferences for the technologies that scientists and researchers use, where Storj could present its solution and get in contact with them.
Heh, “nothing”. Moving storage itself is only a fraction of what a prospective customer would need to do. If you store petabytes of data, you probably have tons of pipelines, batch jobs, data warehousing solutions and such that need to be reconfigured and moved to take advantage of new data location. There might be reporting tools that might need to be recreated from scratch simply because the tool just doesn’t offer integration with new mode of storage. You need to reconfigure your networking layer to provide bandwidth and change firewall options to allow traffic on new routes—often for each large and small piece of code that was connecting to the old storage solution. You need to prepare a new storage observability layer from scratch, so that you can monitor costs, look for new fault points and detect security incidents.
“Nothing”
Please try to understand the context.
I believe our engineers will figure that out in the conversation with the exact customer who requires this service. Those customers will likely contact the team anyway, at least from the sales/pricing perspective.