Can we increase the part number beyond 2^31?

Hi, I wanted to check if it’s possible to increase the maximum number of parts beyond 2³¹. I’m trying to upload a 130 GB Parquet file using multipart upload, but it’s exceeding this limit.

I haven’t touched this in a long time, but I had to tell my S3 client to upload larger chunks/objects/parts (I can’t remember the exact wording) — I think at the time 64 MB was ideal.


Yes, Roxor — I think it’s the part size, not the number of parts, correct?

So here’s my use case, Roxor — I have around 150 GB of compressed CSV files in Storj, where each file corresponds to a specific date. I’m trying to convert each CSV file into a single Parquet file (one Parquet file per date). I have approximately 500 such files. Do you have any recommendations or suggestions for handling this efficiently?

The current data engineering industry recommendation is to write smallish Parquet files (100 MB–1 GB each) and build an Iceberg table on top of them. A single 130 GB file is almost never a good choice.


Hello @akki9413688,
Welcome to the forum!

Please take a look at S3 Compatibility - Storj Docs

The Limits section in particular.

Please also note that the current Legacy tier has a per-segment fee, and each part takes at least one segment, so from a billing perspective it’s better to use a 64 MB part size and consume fewer segments rather than more.
E.g. 2³¹ × $0.0000088 = $18,897.86 just for that number of segments. So I would suggest increasing the chunk size for the multipart upload to 64 MB, which results in 2,344 segments for a 150 GB file. I also suspect your upload is failing because you are using the default 5 MiB chunk size, which results in roughly 30,000 parts for a 150 GB file — more than our S3 limit of 10,000 parts.
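For reference, the figures above can be sanity-checked with a few lines of Python (assuming decimal units, i.e. 150 GB = 150 × 10⁹ bytes and 64 MB = 64 × 10⁶ bytes, matching the arithmetic in the post):

```python
# Back-of-the-envelope check of the part counts and the segment-fee figure.
FILE_SIZE = 150 * 10**9          # 150 GB file
SEGMENT_FEE = 0.0000088          # $ per segment (Legacy tier, per the post)

# Default ~5 MB parts: well over the 10,000-part S3 limit.
parts_default = FILE_SIZE // (5 * 10**6)
# 64 MB parts: comfortably under the limit (ceiling division).
parts_64mb = -(-FILE_SIZE // (64 * 10**6))
# Worst case from the thread title: 2**31 segments.
worst_case_fee = round(2**31 * SEGMENT_FEE, 2)

print(parts_default)   # 30000
print(parts_64mb)      # 2344
print(worst_case_fee)  # 18897.86
```

With boto3, for example, the part size is controlled by the `multipart_chunksize` parameter of `boto3.s3.transfer.TransferConfig`; other S3 clients expose an equivalent setting.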