Can file transfers resume?

Is there already a built-in function that resumes uploads or downloads? Sometimes a file transfer gets stuck and needs to be restarted, or a customer has a large file transfer and wants to pause it and resume it later.
Is that possible, or does it mean a complete restart from zero?
Thinking about it: since what is effectively being transferred are shards, shouldn't it be easy, for a broken download say, to just download the remaining missing shards instead of downloading all shards again?

I think it was introduced to Tardigrade a few versions back… you should be able to find it in the changelogs… but I'm certainly not sure…

It's a restart from zero in the current version; however, it uploads the part transferred before the interruption a little bit faster, if your file is larger than one segment (64 MiB).
I don't think it's designed like that, but that's my own experience.

Resuming might be important for potential customers. I came across it here https://www.signiant.com/products/jet/ as a potential requirement from large-data customers while doing a bit of research for my suggestion here: The digital cinema as potential Tardigrade use case?

From a general perspective it does not sound too hard to implement a resume feature, as data is uploaded and downloaded in parts anyway. So if a user has downloaded 40 of 80 parts and the transfer gets interrupted, it should be possible to make the software download only the missing 40 parts (or whatever number is required to successfully create the complete file). The same goes for uploads.
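The bookkeeping such a resume feature needs can be sketched in a few lines of Python. Everything here is hypothetical: `download_part()` stands in for the real per-part fetch, and the part count just mirrors the 40-of-80 example above:

```python
# Sketch of resume bookkeeping: persist which parts are already done,
# so an interrupted transfer only fetches the missing ones next run.
# download_part() is a hypothetical stand-in for the real fetch call.

import json
import os

TOTAL_PARTS = 80
STATE_FILE = "download.state"

def download_part(index: int) -> bytes:
    """Stand-in for the real per-part download call."""
    return b"data-%d" % index

def load_done() -> set:
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return set(json.load(f))
    return set()

def save_done(done: set) -> None:
    with open(STATE_FILE, "w") as f:
        json.dump(sorted(done), f)

def resume_download() -> int:
    """Fetch only the missing parts; return how many were fetched."""
    done = load_done()
    missing = [i for i in range(TOTAL_PARTS) if i not in done]
    for i in missing:
        download_part(i)   # fetch only what is still missing
        done.add(i)
        save_done(done)    # checkpoint after every part
    return len(missing)
```

With 40 parts checkpointed, a "resume" only fetches the other 40 instead of all 80.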

Such a resume feature could save Tardigrade customers time and bandwidth costs.

Just had another incident like that:

upload failed: ecclient error: successful puts (74) less than success threshold (80)

After that, the transfer restarted from 0, for just 6 missing puts. If I understand this correctly, it would be much smarter if the transfer did not start from 0 but transmitted only the remaining 6 pieces.
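To illustrate the arithmetic: with a success threshold of 80 and 74 successful puts, a smarter retry would only need to place the 6 missing pieces. A toy sketch (the piece tracking here is purely hypothetical, not how uplink actually works):

```python
# Toy illustration: with a success threshold of 80 and 74 successful
# puts, a smart retry would only need to place the 6 missing pieces,
# not re-upload all 80 from scratch.

def pieces_to_retry(success_threshold: int, successful: set) -> list:
    """Return the piece indices still needed to reach the threshold."""
    return [i for i in range(success_threshold) if i not in successful]

successful_puts = set(range(74))   # pieces 0..73 landed
missing = pieces_to_retry(80, successful_puts)
print(len(missing))                # -> 6
```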

maybe something to do with the data being live… dunno how erasure coding really works… but resuming might not be as simple as one might think…

but yeah, without a doubt it would certainly make sense to have a resume function; anything else is just wasteful and annoying. :smiley:

Yeah, but imagine you upload a 150 GB movie and at 99% it tells you: sorry, I have uploaded 79 of 80 required pieces and am starting from zero now!.. :crazy_face: :crazy_face: :crazy_face: :crazy_face:

This does not sell. At least in the case of uploads you don't have to pay for the traffic, as ingress is free. But what about interrupted downloads? Hopefully a user only has to pay egress for completely successful downloads.

Just had an even better one:

upload failed: ecclient error: successful puts (78) less than success threshold (80)

Aaaaaaaaaaaaaaaand restart!!!
:confused: :woozy_face: :confounded:

It was an issue during the dawn of the internet; even some FTP servers wouldn't allow resuming transfers… it was hell, and of course some servers would have timeouts, so insanely slow downloads taking hours or days would of course drop the connection.

And of course from their perspective it's like: oh well, we conserve our bandwidth by dropping / clearing connections after a set amount of time… and users are like: oh well, it crashed, I'll just download it all again… which doesn't really help the bandwidth usage when something is seeing too much traffic.

Of course it might have kept their web server afloat so they didn't serve a 404.

Even Windows didn't have resume on copy before, like, Windows 8, I think.
Copying or moving data in Windows is usually a mess though: if you move stuff and then it fails for whatever reason (network, USB, weird disk stuff), then you've got files that might be incomplete and you don't know which… it's just hell.

Which is why I basically stopped using the move function altogether unless it's on the same drive, because then it's just a metadata correction and doesn't take much.

I was moving 4 TB over my 1 Gbit network TWICE… at least the network was stable, but still… I think most of the methods for moving data today are antiquated; there should be much more information, many more tests and verifications one could see, to ensure the data lands at the destination and is okay…
Because it really seems like the failure rate when moving data is pretty high… like more than 1 out of 10k or 100k.

It's not often I see it; of course it might also have to do with HDDs, as the media isn't perfect: put massive amounts of data on them and you will get errors, and then you keep copying stuff and the data degrades without one knowing…

Data transfer is a complex thing, but resume is a must, IMO…

You can develop a tool that supports "resume" in a sense. Your tool can divide the file into 1 GB parts and upload each part as a separate file. When an upload fails, you restart only the last 1 GB. On the download side, you do the opposite. You can even stream.

However, uplink itself should support retry (not to be confused with resume). It already chunks the file and uploads those chunks in sequence. When one of them fails, it should be able to retry the last chunk's upload. That does not sound technically difficult at all. :face_with_monocle:
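A sketch of what that chunk-level retry could look like, with a hypothetical `put_chunk()` standing in for the real upload call:

```python
# Sketch of chunk-level retry: instead of restarting the whole file
# when one chunk upload fails, retry just that chunk a few times
# before giving up. put_chunk() is a hypothetical callback.

import time

class UploadError(Exception):
    pass

def upload_with_retry(chunks, put_chunk, max_retries=3):
    """Upload chunks in sequence, retrying each failed chunk."""
    for index, chunk in enumerate(chunks):
        for attempt in range(max_retries):
            try:
                put_chunk(index, chunk)
                break                      # this chunk is done
            except UploadError:
                if attempt == max_retries - 1:
                    raise                  # out of retries: give up
                time.sleep(0)              # real backoff would go here
```

Note only the failed chunk is retried; chunks that already succeeded are never re-sent.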

At the moment we recommend using rclone for such a task. It already works like this (it can retry).

Does rclone retry the whole file or just the last segment? If it is the segment, then it seems that's the answer the OP was looking for.

The last piece (within a transfer), I would say. But it can retry a whole file (or batch) too.
It's not a resume, so it would not cover the OP's needs.

And again:

upload failed: ecclient error: successful puts (79) less than success threshold (80)

For more than 10 hours now I have been trying to upload 3 files via FileZilla, each of them around 2 GB, uploading simultaneously. The transfer keeps restarting for this reason. Many times it's so close to the 80 threshold that I wonder whether this is a coincidence.

There is also quite some other load on the machine; I'll see later, when that drops, whether the FileZilla uploads go through or not.

You need to reduce the number of parallel transfers to 2, I believe.

I wanted to do the same test of uploading 3 files using FileZilla. I also used 3 files of around 2 GB each, to try to see if I could reproduce the same thing.

Monitoring the usage with task manager.

I'm also doing this on a wireless setup.

So far my files are completing within about 14 minutes, and I didn't see any errors when transferring 3 files at the same time.

This highly depends on upstream speed and the quality of the channel/router. If you have an upstream of more than 40 Mbit, any transfer should be successful.

So should I limit my bandwidth to 40 Mbit and do it again?

This is somewhat related to the type of connection. For example, I have a UTP line to the provider's switch somewhere in my house, with a symmetrical channel of 100 Mbit up/down.
Even if I shape my channel on the router down to 20 Mbit, any transfer is still successful.
Without a limitation, my router and the ISP's switch are able to handle 10 parallel transfer threads at full usage of my bandwidth, without issues, via WiFi from my laptop.
Unfortunately some people have problems transferring even via a wired connection.

Alright, I just wanted to help debug that, but it for sure won't help on my connection… I thought that by using one of my wireless access points for the test, it would at least fail for sure.