Upload refactoring

You’ve been super helpful so far in looking into these things. I haven’t been able to locally reproduce it, so can I ask you to continue to be helpful and try out the uplink from this release: https://github.com/storj/storj/releases/tag/untagged-1466ba86c93426837233?

It comes with two main changes:

  1. a concurrent segment limit that should prevent the OOM issues
  2. a --upload-log-file flag that will output debug logging information to the specified file about what is going on with the upload.

It would be super helpful if you could do the 1TB file upload with the --upload-log-file flag set (something like --upload-log-file 1TB.log). I’d also prefer if it was done with --maximum-concurrent-pieces 1300 --long-tail-margin 50.

Thanks!

Sure thing,

Though https://github.com/storj/storj/releases/tag/untagged-1466ba86c93426837233 directs me to a 404 page :slight_smile:

Th3Van.dk

Dang. Must require some permissions, and I didn’t notice because I was logged in. How about Release v1.80.4 · storj/storj · GitHub?

Works, downloading uplink v1.80.4 now…

I’ll start the 1TB test again, with the options you suggested.

Th3Van.dk

I’ve started the 1 TB file upload using Uplink v1.80.4 and you can follow the process via the log file : Uplink upload script - Log File (refresh manually)

Starting upload of 1t-file1 using uplink v1.80.4 @ 20230606001904

SHA256 of uplink-1804 : 03b4b18dbf1bcf311119ed1af1afeb19c1b38ed114e940f17d86e1276a5b6009

- Command : ./uplink-1804 cp --maximum-concurrent-pieces 1300 --long-tail-margin 50 1t-file1 sj://test/1t-file1-v1804-20230606001904
- 2023-06-06T00:19:16 UTC - Uplink mem usage : 921 MiB - 25.07 MB / 1.00 TB

Stats should update every 15 min.

The link to the uplink (v1.80.4) debug log file should be on its way in a PM, @zeebo

Th3Van.dk

I have canceled the upload, since uplink was consuming way too much memory (+350 GB); an OOM would therefore have happened before the 1TB was completely uploaded.

Starting upload of 1t-file1 using uplink v1.80.4 @ 20230606001904

SHA256 of uplink-1804 : 03b4b18dbf1bcf311119ed1af1afeb19c1b38ed114e940f17d86e1276a5b6009

- Command : ./uplink-1804 cp --maximum-concurrent-pieces 1300 --long-tail-margin 50 1t-file1 sj://test/1t-file1-v1804-20230606001904
- 2023-06-06T00:19:16 UTC - Uplink mem usage : 921 MiB - 25.07 MB / 1.00 TB
- 2023-06-06T00:34:17 UTC - Uplink mem usage : 13829 MiB - 6.13 GB / 1.00 TB
- 2023-06-06T00:49:17 UTC - Uplink mem usage : 26269 MiB - 13.89 GB / 1.00 TB
- 2023-06-06T01:04:17 UTC - Uplink mem usage : 35708 MiB - 21.34 GB / 1.00 TB
- 2023-06-06T01:19:17 UTC - Uplink mem usage : 45928 MiB - 29.25 GB / 1.00 TB
- 2023-06-06T01:34:18 UTC - Uplink mem usage : 56409 MiB - 37.95 GB / 1.00 TB
- 2023-06-06T01:49:18 UTC - Uplink mem usage : 66822 MiB - 46.77 GB / 1.00 TB
- 2023-06-06T02:04:18 UTC - Uplink mem usage : 76977 MiB - 55.23 GB / 1.00 TB
- 2023-06-06T02:19:19 UTC - Uplink mem usage : 87587 MiB - 64.55 GB / 1.00 TB
- 2023-06-06T02:34:19 UTC - Uplink mem usage : 99304 MiB - 74.08 GB / 1.00 TB
- 2023-06-06T02:49:19 UTC - Uplink mem usage : 112389 MiB - 84.56 GB / 1.00 TB
- 2023-06-06T03:04:19 UTC - Uplink mem usage : 128469 MiB - 97.10 GB / 1.00 TB
- 2023-06-06T03:19:20 UTC - Uplink mem usage : 144156 MiB - 109.99 GB / 1.00 TB
- 2023-06-06T03:34:20 UTC - Uplink mem usage : 158735 MiB - 122.40 GB / 1.00 TB
- 2023-06-06T03:49:20 UTC - Uplink mem usage : 174161 MiB - 135.34 GB / 1.00 TB
- 2023-06-06T04:04:21 UTC - Uplink mem usage : 190436 MiB - 148.57 GB / 1.00 TB
- 2023-06-06T04:19:21 UTC - Uplink mem usage : 205926 MiB - 162.00 GB / 1.00 TB
- 2023-06-06T04:34:21 UTC - Uplink mem usage : 221807 MiB - 175.79 GB / 1.00 TB
- 2023-06-06T04:49:21 UTC - Uplink mem usage : 241597 MiB - 191.79 GB / 1.00 TB
- 2023-06-06T05:04:22 UTC - Uplink mem usage : 259104 MiB - 207.16 GB / 1.00 TB
- 2023-06-06T05:19:22 UTC - Uplink mem usage : 274204 MiB - 220.45 GB / 1.00 TB
- 2023-06-06T05:34:22 UTC - Uplink mem usage : 296471 MiB - 236.96 GB / 1.00 TB
- 2023-06-06T05:49:23 UTC - Uplink mem usage : 339521 MiB - 264.45 GB / 1.00 TB

Th3Van.dk
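
To put a number on the growth in the log above, here is a minimal Python sketch that parses a few of those lines and estimates how much memory uplink gained per GB uploaded. The sample lines are copied from the log; the regular expression, the helper names and the decimal unit conversion (1 GB = 1000 MB) are assumptions made for illustration, not part of the original upload script.

```python
import re

# Three lines copied verbatim from the log above (first, a middle, and last sample).
LOG = """\
- 2023-06-06T00:19:16 UTC - Uplink mem usage : 921 MiB - 25.07 MB / 1.00 TB
- 2023-06-06T03:04:19 UTC - Uplink mem usage : 128469 MiB - 97.10 GB / 1.00 TB
- 2023-06-06T05:49:23 UTC - Uplink mem usage : 339521 MiB - 264.45 GB / 1.00 TB
"""

# Hypothetical pattern matching the line layout shown in the log.
LINE_RE = re.compile(r"Uplink mem usage : (\d+) MiB - ([\d.]+) (MB|GB|TB) /")

# Assumption: the progress column uses decimal units.
TO_GB = {"MB": 1e-3, "GB": 1.0, "TB": 1e3}


def parse_samples(log):
    """Return a list of (uploaded_gb, memory_mib) tuples, in log order."""
    samples = []
    for match in LINE_RE.finditer(log):
        mem_mib = int(match.group(1))
        uploaded_gb = float(match.group(2)) * TO_GB[match.group(3)]
        samples.append((uploaded_gb, mem_mib))
    return samples


def growth_mib_per_gb(samples):
    """Average memory growth between the first and the last sample."""
    (gb_first, mib_first), (gb_last, mib_last) = samples[0], samples[-1]
    return (mib_last - mib_first) / (gb_last - gb_first)


samples = parse_samples(LOG)
rate = growth_mib_per_gb(samples)
print(f"{rate:.0f} MiB of memory per GB uploaded")
```

With these samples the estimate comes out at roughly 1280 MiB per GB, i.e. memory grows about in step with the amount of data uploaded, which is what one would expect if uploaded data were being retained rather than freed.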

Great! Thanks for these results. I was able to track down the memory leak: private/storage/streams/buffer: fix memory leak · storj/uplink@72bcffb · GitHub. Hopefully we’ll get that included in upcoming releases ASAP.

One thing I can’t figure out or reproduce, though, is that your upload seems to heavily limit the concurrency even though the flags appear to be set. For example, the logs show that you only ever have 2 segments in flight at a time, and at most around 180 pieces. It’s like something is heavily limiting the concurrency. Running the same binary on my machine with the same flags, I get a much different picture. Here are two graphs of the first 10000 data points tracking how many pieces are being uploaded:

Th3Van:

Me:

I can’t explain it. Can you tell me what operating system it’s running on? Can you maybe double check the spelling of the flags?

Thanks again! We’re getting closer!

root@server030:/disk103/uplink-test#  lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:        22.04
Codename:       jammy

If I run
./uplink-1804 cp --maximum-concurrent-pieces 1300 --long-tail-margin 50 10g-file1 sj://test/10g
and compare it to
./uplink-1804 cp --maximum-concurrent-pieces 1 --long-tail-margin 1 10g-file1 sj://test/10g
the second one takes way longer to upload the same file, so I’d say that the spelling of the flags should be correct.

Now, to dig further into this issue, I’ll be setting up a dedicated server (AMD 32 Cores / 128 GB ram / 4 x Kioxia CD8 6,4 TB NVMe (Raid 0) / 100Gbit uplink NIC) this weekend, with a fresh standard install of Ubuntu (22.04.2), and run some more tests to see if that gives me better speed performance, using Uplink v1.80.2

I’ll post back here as soon as I’m done testing.

Th3Van.dk

Ok, I’ve set up a dedicated server for testing uplink, since running uplink v1.80.4 on the primary storj server was very slow for some reason, compared to @zeebo’s upload speed.

The dedicated hardware setup :

- CPU model name ------- : AMD Ryzen Threadripper 3970X 32-Core Processor
- CPU socket(s) -------- : 1
- CPU core(s) per socket : 32
- CPU min MHz ---------- : 2200.0000
- CPU max MHz ---------- : 3700.0000
- System memory -------- : 131757 MB
- Harddisk ------------- : 4 x Kioxia CD8 6,4 TB NVMe (Raid 0)
- Harddisk controller -- : Broadcom MegaRAID 9670W-16i
- Internet / uplink ---- : 100 Gbit/s
- OS ------------------- : Freshly installed Ubuntu 22.04.2 LTS (no sysctl option changed/added)

First I tested uplink v1.77.2 vs. v1.80.4, using the flags suggested by zeebo for v1.80.4 :

  • v1.77.2 : --parallelism 20
  • v1.80.4 : --maximum-concurrent-pieces 1300 --long-tail-margin 50

http://www.th3van.dk/benchmark/uplink/script-log-file-1t-file1-v1804-20230611081923.log :

Starting upload of 1t-file1 using uplink v1.77.2 @ 20230611081934

SHA256 of uplink-1772 : 8f41f40789fac6e6e8542badbc098c80a0431714e2ec51be3f702fede5e48d5d

- Command : ./uplink-1772 cp --parallelism 20 1t-file1 sj://test/1t-file1-v1772-20230611081923
-
- 2023-06-11T08:19:57 UTC -- CPU : 18.63, 4.26 , 2.09  -- Uplink mem usage (VSZ) : 3181 MiB -- Upload speed : 12465 Mbit/s -- Uploaded : 8.89 GB / 1.00 TB
- 2023-06-11T08:21:58 UTC -- CPU : 58.93, 24.43, 9.75  -- Uplink mem usage (VSZ) : 3315 MiB -- Upload speed : 12480 Mbit/s -- Uploaded : 60.03 GB / 1.00 TB
- 2023-06-11T08:23:58 UTC -- CPU : 64.67, 37.94, 16.47 -- Uplink mem usage (VSZ) : 3320 MiB -- Upload speed : 11952 Mbit/s -- Uploaded : 110.13 GB / 1.00 TB
- 2023-06-11T08:25:58 UTC -- CPU : 64.91, 46.90, 22.37 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12548 Mbit/s -- Uploaded : 160.25 GB / 1.00 TB
- 2023-06-11T08:27:59 UTC -- CPU : 63.44, 52.62, 27.46 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12434 Mbit/s -- Uploaded : 210.29 GB / 1.00 TB
- 2023-06-11T08:29:59 UTC -- CPU : 65.11, 56.84, 32.06 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12002 Mbit/s -- Uploaded : 260.21 GB / 1.00 TB
- 2023-06-11T08:31:59 UTC -- CPU : 64.42, 59.44, 36.03 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12528 Mbit/s -- Uploaded : 311.08 GB / 1.00 TB
- 2023-06-11T08:34:00 UTC -- CPU : 64.17, 60.99, 39.43 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12463 Mbit/s -- Uploaded : 361.30 GB / 1.00 TB
- 2023-06-11T08:36:00 UTC -- CPU : 64.96, 62.39, 42.57 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12271 Mbit/s -- Uploaded : 411.53 GB / 1.00 TB
- 2023-06-11T08:38:00 UTC -- CPU : 64.52, 63.25, 45.39 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 11794 Mbit/s -- Uploaded : 461.59 GB / 1.00 TB
- 2023-06-11T08:40:01 UTC -- CPU : 64.66, 63.50, 47.64 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 11084 Mbit/s -- Uploaded : 511.52 GB / 1.00 TB
- 2023-06-11T08:42:01 UTC -- CPU : 65.37, 64.19, 49.82 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 11875 Mbit/s -- Uploaded : 561.50 GB / 1.00 TB
- 2023-06-11T08:44:01 UTC -- CPU : 65.69, 64.69, 51.75 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12144 Mbit/s -- Uploaded : 612.26 GB / 1.00 TB
- 2023-06-11T08:46:02 UTC -- CPU : 65.20, 64.82, 53.37 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12583 Mbit/s -- Uploaded : 662.10 GB / 1.00 TB
- 2023-06-11T08:48:02 UTC -- CPU : 65.93, 65.22, 54.91 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 11600 Mbit/s -- Uploaded : 711.86 GB / 1.00 TB
- 2023-06-11T08:50:02 UTC -- CPU : 65.64, 65.29, 56.19 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 11979 Mbit/s -- Uploaded : 761.59 GB / 1.00 TB
- 2023-06-11T08:52:03 UTC -- CPU : 64.80, 65.02, 57.20 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 11977 Mbit/s -- Uploaded : 811.30 GB / 1.00 TB
- 2023-06-11T08:54:03 UTC -- CPU : 65.10, 65.18, 58.21 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12274 Mbit/s -- Uploaded : 861.00 GB / 1.00 TB
- 2023-06-11T08:56:03 UTC -- CPU : 65.98, 65.50, 59.18 -- Uplink mem usage (VSZ) : 3321 MiB -- Upload speed : 12489 Mbit/s -- Uploaded : 910.64 GB / 1.00 TB
- 2023-06-11T08:58:03 UTC -- CPU : 65.30, 65.30, 59.87 -- Uplink mem usage (VSZ) : 3388 MiB -- Upload speed : 12295 Mbit/s -- Uploaded : 960.99 GB / 1.00 TB

- Upload time (seconds) : 2408

Finished upload of 1t-file1 using uplink v1.77.2 @ 2023-06-11T08:59:42 UTC

---------------------------------------------------------------------------------
Waiting 900 sec...
---------------------------------------------------------------------------------

Starting upload of 1t-file1 using uplink v1.80.4 @ 20230611091444

SHA256 of uplink-1804 : 03b4b18dbf1bcf311119ed1af1afeb19c1b38ed114e940f17d86e1276a5b6009

- Command : ./uplink-1804 cp --maximum-concurrent-pieces 1300 --long-tail-margin 50 1t-file1 sj://test/1t-file1-v1804-20230611081923
-
- 2023-06-11T09:15:56 UTC -- CPU : 5.07, 3.98, 21.63 -- Uplink mem usage (VSZ) : 12256 MiB -- Upload speed : 3305 Mbit/s -- Uploaded : 5.46 GB / 1.00 TB
- 2023-06-11T09:17:56 UTC -- CPU : 8.43, 5.79, 20.15 -- Uplink mem usage (VSZ) : 29652 MiB -- Upload speed : 2275 Mbit/s -- Uploaded : 14.16 GB / 1.00 TB
- 2023-06-11T09:19:57 UTC -- CPU : 9.20, 6.97, 18.84 -- Uplink mem usage (VSZ) : 45747 MiB -- Upload speed : 2096 Mbit/s -- Uploaded : 22.41 GB / 1.00 TB
- 2023-06-11T09:21:57 UTC -- CPU : 6.35, 7.15, 17.50 -- Uplink mem usage (VSZ) : 61642 MiB -- Upload speed : 1505 Mbit/s -- Uploaded : 30.40 GB / 1.00 TB
- 2023-06-11T09:23:57 UTC -- CPU : 6.61, 6.71, 16.06 -- Uplink mem usage (VSZ) : 77736 MiB -- Upload speed : 1167 Mbit/s -- Uploaded : 37.93 GB / 1.00 TB
- 2023-06-11T09:25:57 UTC -- CPU : 7.08, 6.94, 15.01 -- Uplink mem usage (VSZ) : 92264 MiB -- Upload speed : 1716 Mbit/s -- Uploaded : 45.95 GB / 1.00 TB
- 2023-06-11T09:27:57 UTC -- CPU : 9.01, 7.76, 14.33 -- Uplink mem usage (VSZ) : 110315 MiB -- Upload speed : 1686 Mbit/s -- Uploaded : 53.96 GB / 1.00 TB
- 2023-06-11T09:27:59 UTC -- Uplink mem usage > 100GB (VSZ) - terminating uplink...

- Upload time (seconds) : 799 -- Upload failed !!!!!

Finished upload of 1t-file1 using uplink v1.80.4 @ 2023-06-11T09:28:03 UTC

Th3Van.dk

Since we still have the OOM bug in v1.80.4, I’m not able to run a full upload test using that particular version of uplink. I’ve set the script to terminate Uplink if mem usage (VSZ) runs higher than 100GB, as a safety precaution.

Now, if I run the upload test again using v1.80.4 with no flags, it looks like the upload is a bit faster.
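
For readers who want a similar safety net, here is a minimal Python sketch of a VSZ watchdog. It is not the script used for these tests (that script is not shown in this thread); the function names, the 100 GiB threshold expressed in MiB, and the one-second polling interval are all assumptions. It relies on the Linux `/proc/<pid>/status` file, whose `VmSize` line reports the virtual memory size in kB.

```python
import os
import signal
import time

# Assumption: "100GB" is treated as 100 GiB here; the real script's threshold is unknown.
LIMIT_MIB = 100 * 1024


def parse_vmsize_mib(status_text):
    """Extract VmSize (virtual memory size) in MiB from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB
            return kib / 1024
    raise ValueError("VmSize line not found")


def over_limit(status_text, limit_mib=LIMIT_MIB):
    """True if the process described by status_text exceeds the limit."""
    return parse_vmsize_mib(status_text) > limit_mib


def watch(pid, limit_mib=LIMIT_MIB, interval_s=1.0):
    """Poll a process and send SIGTERM once its VSZ exceeds limit_mib.

    Returns True if the process was terminated by the watchdog,
    False if it went away on its own.
    """
    while True:
        try:
            with open(f"/proc/{pid}/status") as status_file:
                text = status_file.read()
            if over_limit(text, limit_mib):
                os.kill(pid, signal.SIGTERM)
                return True
        except (FileNotFoundError, ProcessLookupError, ValueError):
            # Process exited (or is a zombie without a VmSize line).
            return False
        time.sleep(interval_s)
```

A typical use would be to start uplink with `subprocess.Popen(...)` and call `watch(process.pid)` from the same script. Note that VSZ counts reserved address space, so it can sit well above the memory the process actually has resident.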
http://www.th3van.dk/benchmark/uplink/script-log-file-1t-file1-v1804-20230611213157.log :

Starting upload of 1t-file1 using uplink v1.80.4 @ 20230611213207

SHA256 of uplink-1804 : 03b4b18dbf1bcf311119ed1af1afeb19c1b38ed114e940f17d86e1276a5b6009

- Command : ./uplink-1804 cp 1t-file1 sj://test/1t-file1-v1804-20230611213157
-
- 2023-06-11T21:33:48 UTC -- CPU : 3.37, 1.33, 0.49 -- Uplink mem usage (VSZ) : 20076 MiB -- Upload speed : 3485 Mbit/s -- Uploaded : 9.58 GB / 1.00 TB
- 2023-06-11T21:35:48 UTC -- CPU : 7.14, 3.17, 1.25 -- Uplink mem usage (VSZ) : 45157 MiB -- Upload speed : 3617 Mbit/s -- Uploaded : 22.21 GB / 1.00 TB
- 2023-06-11T21:37:49 UTC -- CPU : 9.81, 5.72, 2.44 -- Uplink mem usage (VSZ) : 73828 MiB -- Upload speed : 4324 Mbit/s -- Uploaded : 36.42 GB / 1.00 TB
- 2023-06-11T21:39:49 UTC -- CPU : 6.95, 6.36, 3.10 -- Uplink mem usage (VSZ) : 91030 MiB -- Upload speed : 2065 Mbit/s -- Uploaded : 45.90 GB / 1.00 TB
- 2023-06-11T21:41:50 UTC -- CPU : 8.13, 6.74, 3.63 -- Uplink mem usage (VSZ) : 109731 MiB -- Upload speed : 2248 Mbit/s -- Uploaded : 54.89 GB / 1.00 TB
- 2023-06-11T21:41:51 UTC -- Uplink mem usage > 100GB (VSZ) - terminating uplink...

- Upload time (seconds) : 586 -- Upload failed !!!!!

Finished upload of 1t-file1 using uplink v1.80.4 @ 2023-06-11T21:41:54 UTC

Th3Van.dk

Uplink debug log files have been sent to zeebo in a PM

Upload times for 1 TB file :

  • v1.77.2 - 2408 seconds @ ~ 12000 Mbit/s
  • v1.80.4 - 799 seconds (only 54 GB got uploaded before OOM) @ ~ 1950 Mbit/s
  • v1.80.4 (no flags) - 586 seconds (only 54 GB got uploaded before OOM) @ ~ 3147 Mbit/s
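
One thing worth keeping in mind when reading these numbers: the Mbit/s figures are the speeds reported in the logs, while the "Uploaded" column tracks how much of the file itself has been sent. The short Python sketch below compares the two for each run. The payload sizes are the last "Uploaded" values from the logs above; treating 1 TB as 1000 GB is an assumption, and so is the function name.

```python
# (label, payload uploaded in GB, wall-clock seconds, reported average Mbit/s)
RUNS = [
    ("v1.77.2", 1000.0, 2408, 12000),            # assumes 1 TB = 1000 GB
    ("v1.80.4 with flags", 53.96, 799, 1950),    # last "Uploaded" sample before termination
    ("v1.80.4 no flags", 54.89, 586, 3147),      # last "Uploaded" sample before termination
]


def payload_mbit_per_s(gigabytes, seconds):
    """File data actually transferred, expressed in Mbit/s."""
    return gigabytes * 1e9 * 8 / seconds / 1e6


for label, gigabytes, seconds, reported in RUNS:
    payload = payload_mbit_per_s(gigabytes, seconds)
    ratio = reported / payload
    print(f"{label}: payload {payload:.0f} Mbit/s, "
          f"reported {reported} Mbit/s, ratio {ratio:.1f}")
```

For all three runs the reported speed is several times the payload rate. A plausible explanation, not confirmed anywhere in this thread, is that the reported figure is measured on the network interface and therefore includes the redundancy added by erasure coding, so the two columns should not be compared directly.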

Although uplink v1.80.4 is now in fact uploading faster than it did on the primary server I used before, it’s still not as fast as v1.77.2

Please don’t hesitate to let me know, if you want me to run more tests, with other uplink/Ubuntu flags or other versions of uplink.

On a personal level - I’d really like this to upload at the best possible speed, since I’m planning on using Uplink in a Veeam Cloud Connect solution that is (hopefully) going live this year :crossed_fingers:

Th3Van.dk
