Minimum S3 multipart size is not enforced

Amazon S3 has a minimum multipart part size of 5 MiB. Apparently that limit is not enforced on the S3MTGW. I just tested HashBackup with partsize 1K (normally HB gives an error, but I removed the error check), and Storj let me create a file with this small 1K part size without error. Uploading a 42K file this way means 42 x 1K segments, 80 pieces each, so the upload with HB is very slow (854 bytes/s) and downloading with uplink is also very slow (6.7 KiB/s).

There should be an error check for minimum part size, because very small part sizes on a multipart upload allow a denial-of-service attack. I suppose it would allow even smaller part sizes than 1K. See below for Amazon’s error if part size is < 5 MiB.
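A minimal sketch of the kind of check being asked for, assuming the S3 rule that every part except the last must be at least 5 MiB. The name `check_part_sizes` and the constant are hypothetical and are not taken from Storj or HashBackup code.

```python
# 5 MiB; the same value Amazon reports as MinSizeAllowed (5242880).
MIN_PART_SIZE = 5 * 1024 * 1024

def check_part_sizes(part_sizes, min_size=MIN_PART_SIZE):
    """Return (part_number, size) for the first undersized part, or None.

    part_sizes: part sizes in bytes, in upload order.
    The last part is exempt, as it is on Amazon S3.
    """
    for number, size in enumerate(part_sizes[:-1], start=1):
        if size < min_size:
            return (number, size)
    return None

# 42 parts of 1 KiB each, like the test in this post: part 1 is already too small.
print(check_part_sizes([1024] * 42))

# A 12 MiB object split into 5 MiB parts passes: only the last part is short.
print(check_part_sizes([5 * 1024 * 1024, 5 * 1024 * 1024, 2 * 1024 * 1024]))
```

Running the check when the upload is completed, rather than per part, is one way to know which part is the last one.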

[jim@mb hbrel]$ py backup.py -c sj backup.py
Backup directory: /Users/jim/hbrel/sj
Backup start: 2021-10-27 02:21:22
Using destinations in dest.conf
# for /Users/jim/hbrel/sj/DESTID numchunks=1
This is backup version: 0
Dedup not enabled; use -Dmemsize to enable
/
/Users
/Users/jim
/Users/jim/hbrel
/Users/jim/hbrel/backup.py
/Users/jim/hbrel/sj
/Users/jim/hbrel/sj/inex.conf
# for /Users/jim/hbrel/sj/arc.0.0 numchunks=42
Cache size: 29 MB (3600 pages)
Waiting for destinations: sjs3
Copied arc.0.0 to sjs3 (42 KB 49s 854 bytes/s)
Writing hb.db.0
# for /Users/jim/hbrel/sj/hb.db.0 numchunks=5
Waiting for destinations: sjs3
Copied hb.db.0 to sjs3 (4.8 KB 12s 375 bytes/s)
# for /Users/jim/hbrel/sj/dest.db numchunks=36
Waiting for destinations: sjs3
Copied dest.db to sjs3 (36 KB 38s 962 bytes/s)

Time: 8.2s
CPU:  0.1s, 1%
Wait: 101.3s, 1m 41s
Mem:  58 MB
Checked: 7 paths, 136192 bytes, 136 KB
Saved: 7 paths, 136192 bytes, 136 KB
Excluded: 0
Dupbytes: 0
Compression: 68%, 3.2:1
Efficiency: 0.00 MB reduced/cpusec
Space: +42 KB, 79 KB total
No errors

[jim@mbp ~]$ uplink cp sj://hbtest/testmb/arc.0.0 x
41.55 KiB / 41.55 KiB [------------------------------------------------------------------------------------------------] 100.00% 6.74 KiB p/s
Downloaded sj://hbtest/testmb/arc.0.0 to x
[jim@mbp ~]$ ls -l x
-rw-r--r--  1 jim  staff  42544 Oct 27 02:42 x

Amazon:

[jim@mb hbrel]$ py backup.py -c hbs3 ../test10
Backup directory: /Users/jim/hbrel/hbs3
Backup start: 2021-10-27 02:58:51
Using destinations in dest.conf
This is backup version: 0
Dedup not enabled; use -Dmemsize to enable
/
/Users
/Users/jim
/Users/jim/hbrel
/Users/jim/hbrel/hbs3
/Users/jim/hbrel/hbs3/inex.conf
/Users/jim/test10
Cache size: 29 MB (3600 pages)
Waiting for destinations: s3
dest s3: error #1 of 9 in send arc.0.0: [S3ResponseError] S3ResponseError: 400 Bad Request
<Error><Code>EntityTooSmall</Code><Message>Your proposed upload is smaller than the minimum allowed size</Message><ProposedSize>1048576</ProposedSize><MinSizeAllowed>5242880</MinSizeAllowed><PartNumber>1</PartNumber><ETag>f4be738b24def28664ab0e947bc7abb0</ETag><RequestId>9Y3HACXJH6Y3AX6J</RequestId><HostId>QPa7MWgvs9x5MpauzNG59ZyPu73UH61JeEXPsSPqjuvA9ZaeKklQvCFMejjg3tppCEMN9POx35k=</HostId></Error>
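For reference, a client can turn this error body into a readable message. This is a sketch using only the Python standard library; `parse_s3_error` is a hypothetical helper name, and the body is a trimmed copy of the response above.

```python
import xml.etree.ElementTree as ET

def parse_s3_error(body):
    """Return the child elements of an S3 <Error> document as a dict."""
    root = ET.fromstring(body)
    return {child.tag: child.text for child in root}

# Trimmed copy of the EntityTooSmall body shown above.
body = (
    "<Error><Code>EntityTooSmall</Code>"
    "<Message>Your proposed upload is smaller than the minimum allowed size</Message>"
    "<ProposedSize>1048576</ProposedSize>"
    "<MinSizeAllowed>5242880</MinSizeAllowed>"
    "<PartNumber>1</PartNumber></Error>"
)

err = parse_s3_error(body)
if err.get("Code") == "EntityTooSmall":
    print("part %s is %s bytes; minimum is %s" % (
        err["PartNumber"], err["ProposedSize"], err["MinSizeAllowed"]))
```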

Possibly related: I tried to use the share trick to verify the number of pieces, but it didn’t quite work. It displayed the file size, but said there are 0 pieces. Maybe it’s an access issue, I'm not sure. Here’s the share URL

And you will pay for that :slight_smile: not only with hits to performance, but also money: Usage Limit Increases - Storj DCS


Thank you for the input! A minimum part size checking feature is code-complete and slated for an upcoming Satellite release. This check had previously been implemented on Gateway-MT, but was recently disabled due to issues it created in edge cases where there is high network latency.


I have been testing with many segments yesterday and today and have a question about billing. My account page says:

My First Project
Estimated Total $0.29

Storage ($0.004 per Gigabyte-Month)
Oct 1 - Oct 27
0.86 Gigabyte-month
$0.00

Egress ($0.007 per GB)
Oct 1 - Oct 27
41.75 GB
$0.29

Objects ($0 per Object-Month)
Oct 1 - Oct 27
88.09 Object-month
$0.00

Where does the charge for excessive segments show up? I’m still on the free tier and it looks like I can go pretty crazy on segments without accruing charges, but maybe I’m just not near the limits.
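As a quick check of the figures on the account page, here is a sketch that assumes each line item is simply usage times the listed rate, rounded to cents. That rounding rule is an assumption, not documented billing behavior. Under it the three listed items account for the whole $0.29, and none of them is a segment charge.

```python
def line_cost(usage, rate):
    """Usage times rate, rounded to cents (assumed rounding rule)."""
    return round(usage * rate, 2)

# (label, usage, rate) copied from the account page quoted in this post.
items = [
    ("Storage", 0.86, 0.004),   # Gigabyte-month at $0.004
    ("Egress", 41.75, 0.007),   # GB at $0.007
    ("Objects", 88.09, 0.0),    # Object-month at $0
]

total = 0.0
for label, usage, rate in items:
    cost = line_cost(usage, rate)
    total += cost
    print("%-8s $%.2f" % (label, cost))
print("Total    $%.2f" % total)
```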

A related accounting question: does the accounting work by scanning my account every hour, or are logs kept when objects are created and deleted? If there is a poll every hour, it seems it would be possible to go nuts creating and deleting objects and segments and never be charged. I.e., the degraded-service attack I mentioned would not cost anything.

With the nearest release I hope

Did this bug get fixed? Small multipart uploads are no longer working, which is fine, but the error message is not good (if that’s what’s causing this):

Traceback (most recent call last):
  File "/hb.py", line 154, in <module>
  File "/destcmd.py", line 442, in main
  File "/destcmd.py", line 241, in dotest
  File "/s3dest.py", line 1024, in sendfile
  File "/s3dest.py", line 1001, in sendmulti
  File "/opt/lib/python2.7/site-packages/boto/s3/multipart.py", line 319, in complete_upload
    self.id, xml)
  File "/opt/lib/python2.7/site-packages/boto/s3/bucket.py", line 1779, in complete_multipart_upload
    headers=headers, data=xml_body)
  File "/opt/lib/python2.7/site-packages/boto/s3/connection.py", line 668, in make_request
    retry_handler=retry_handler
  File "/opt/lib/python2.7/site-packages/boto/connection.py", line 1071, in make_request
    retry_handler=retry_handler)
  File "/opt/lib/python2.7/site-packages/boto/connection.py", line 1028, in _mexe
    raise BotoServerError(response.status, response.reason, body)
BotoServerError: BotoServerError: 500 Internal Server Error
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InternalError</Code><Message>We encountered an internal error, please try again.</Message><Key>hbtest/test.tmp</Key><BucketName>hbtest</BucketName><Resource>/hbtest/test.tmp</Resource><RequestId>16B6942493190103</RequestId><HostId></HostId></Error>

Please see Amazon S3’s error message for too-small parts above for reference. Throwing a general 500 error and asking to retry is not that helpful. :slight_smile:

Have you created a bug report?
I didn’t find it :man_shrugging:

See above.


This may be because the error returned from the satellite/uplink library which Gateway-MT uses is not being mapped to an appropriate response, and the default is 500 for anything unrecognized. I’m taking a look at this now, and an issue has been created here: Minimum part size errors should return an appropriate response · Issue #105 · storj/gateway-mt · GitHub
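To illustrate the failure mode described here: when a gateway maps backend errors to S3 responses through a lookup with a catch-all default, any error missing from the lookup surfaces as a 500. The sketch below is purely illustrative. `PartTooSmallError`, `ERROR_MAP`, and `to_s3_response` are hypothetical names and are not Gateway-MT's actual code.

```python
class PartTooSmallError(Exception):
    """Hypothetical backend error for an undersized part."""

# Known backend errors mapped to (HTTP status, S3 error code).
ERROR_MAP = {
    PartTooSmallError: (400, "EntityTooSmall"),
}

def to_s3_response(exc):
    """Map a backend exception to (status, code); unrecognized errors fall back to 500."""
    for exc_type, response in ERROR_MAP.items():
        if isinstance(exc, exc_type):
            return response
    return (500, "InternalError")

# With the mapping in place the client sees a 400 it can act on.
print(to_s3_response(PartTooSmallError("part 1 is too small")))

# Anything not in the map still falls through to the generic 500.
print(to_s3_response(RuntimeError("unrecognized backend error")))
```

Without the `PartTooSmallError` entry, the first call would also return the generic 500, which matches the response in the traceback above.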
