S3 gateway multipart upload with presigned URLs (JavaScript)

Dear community,

We are building a web app using Storj, and we currently have problems with multipart uploads and presigned URLs on the S3 gateway.

Specifically, we are getting back a malformed response from the uploadPart call (Class: AWS.S3 — AWS SDK for JavaScript).

The response is:

The problem with that response is that it doesn’t return an ETag, which we need in order to complete the multipart upload.

I suspect it has something to do with CORS. We are trying to interact with the gateway from our localhost dev environment.

Any help would be greatly appreciated.

2 Likes

Hi @3eeaeb9ae56916c4c0be

Could you provide a JS snippet of all the steps that you are doing?

We’d like to see how you’re creating the pre-signed URLs, how you’re creating the multipart upload, and how you upload the parts using them afterwards.

Thank you.

Sure. I am using SvelteKit…

Client side:

export async function getStorjSignedUrl(path: string, filename: string) {
    const requestOptions = {
        method: 'GET',
        redirect: 'follow' as RequestRedirect
    };
      
    try {
        const response = await fetch(`http://localhost:5173/api/storage/storj-signed-url?path=${path}&filename=${filename}`, requestOptions)
        return await response.json()
    } catch (error) {
        console.log('error', error)
    }
}

export async function uploadFileStorj(readableStream: ReadableStream<Uint8Array>, filename: string, path: string) {
    //assume that filename & path are free (this should be checked in frontend)
    const createMPRequestOptions = {
        method: 'GET',
        redirect: 'follow' as RequestRedirect
    };
      
    try {
        //create multipart upload
        const createMPResponse = await fetch(`http://localhost:5173/api/storage/storj-create-multipart-upload?path=${path}&filename=${filename}`, createMPRequestOptions)
        const {UploadId} = await createMPResponse.json()

        console.log('upload id', UploadId)

        //get signed urls for every part of the incoming stream
        //then use the signedurls to start the upload
        let partNumber = 1
        const uploadPromises = []
        // ReadableStream is async-iterable in Node and newer browsers; the cast keeps TS's DOM types happy
        for await (const chunk of readableStream as unknown as AsyncIterable<Uint8Array>) {
            const getSignedUrlRequestOptions = {
                method: 'GET',
                redirect: 'follow' as RequestRedirect
            };
            
            const getSignedUrlResponse = await fetch(`http://localhost:5173/api/storage/storj-signed-url?path=${path}&filename=${filename}&partnumber=${partNumber}&uploadid=${UploadId}`, getSignedUrlRequestOptions)
            const signedUrl = await getSignedUrlResponse.text()
            console.log(partNumber)
            console.log('signed url', signedUrl)

            //upload every part of the incoming stream

            const uploadPartRequestOptions = {
                method: 'PUT',
                body: chunk,
                redirect: 'follow' as RequestRedirect
            };
            // push the promise without awaiting so the parts upload concurrently
            // (calling fetch a second time here would upload every part twice)
            uploadPromises.push(fetch(signedUrl, uploadPartRequestOptions))
            partNumber++
        }
        const resUploadParts = await Promise.all(uploadPromises.map(async (item) => {
            const response = await item
            console.log(response.body)
            return response.json()
        }))
        console.log("result upload parts", resUploadParts)
        //complete multipart upload
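        // NOTE: the params below are the placeholder sample from the AWS docs
        // (hard-coded ETags and UploadId) and are never sent anywhere yet.
        // A real implementation would collect the ETag of each uploaded part
        // and send the completion request, e.g. via another API endpoint that
        // calls CompleteMultipartUploadCommand server-side.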
        const completeUploadParams = {
            Bucket: "examplebucket", 
            Key: "bigobject", 
            MultipartUpload: {
             Parts: [
                {
               ETag: "\"d8c2eafd90c266e19ab9dcacc479f8af\"", 
               PartNumber: 1
              }, 
                {
               ETag: "\"d8c2eafd90c266e19ab9dcacc479f8af\"", 
               PartNumber: 2
              }
             ]
            }, 
            UploadId: "7YPBOJuoFiQ9cz4P3Pe6FIZwO4f7wN93uHsNBEw97pl5eNwzExg0LAT2dUN91cOmrEQHDsP3WA60CEg--"
           }
    } catch (error) {
        console.log('error', error)
    }
    
}

API endpoint http://localhost:5173/api/storage/storj-signed-url:

import type { RequestHandler } from "@sveltejs/kit";
import { generatePath } from "$lib/utils";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
import { UploadPartCommand } from "@aws-sdk/client-s3";
import { S3 } from "@aws-sdk/client-s3";
import { STORJ_ACCESS_KEY, STORJ_SECRET_ACCESS_KEY, STORJ_ENDPOINT, STORJ_REGION, STORJ_BUCKET } from '$env/static/private';

export const s3 = new S3({
    credentials: {
        accessKeyId: STORJ_ACCESS_KEY,
        secretAccessKey: STORJ_SECRET_ACCESS_KEY
    },
    endpoint: STORJ_ENDPOINT,
    forcePathStyle: true,
    region: STORJ_REGION
});

export const GET: RequestHandler = (async ({ url }) => {
    const path = url.searchParams.get('path') ?? generatePath()
    const filename = url.searchParams.get('filename') ?? generatePath()
    const partNumber = Number(url.searchParams.get('partnumber'))
    const uploadId = url.searchParams.get('uploadid')
    console.log(uploadId)
    console.log(partNumber)
    if(uploadId === null) {
        console.log(new Error("no upload id specified"))
        return new Response("no upload id specified", { status: 400 })
    } else {
        const bucketParams = {
            Bucket: STORJ_BUCKET,
            Key: `${path}/${filename}`,
            PartNumber: partNumber,
            UploadId: uploadId
        };
        const command = new UploadPartCommand(bucketParams)
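        // note: getSignedUrl defaults to a 900 second expiry; pass { expiresIn } as a third argument to change it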
        const signedUrl = await getSignedUrl(s3, command)
        return new Response(signedUrl)
    }
})

API endpoint http://localhost:5173/api/storage/storj-create-multipart-upload:

import { json, type RequestHandler } from "@sveltejs/kit";
import { generatePath } from "$lib/utils";
import { S3 } from "@aws-sdk/client-s3";
import { STORJ_ACCESS_KEY, STORJ_SECRET_ACCESS_KEY, STORJ_ENDPOINT, STORJ_REGION, STORJ_BUCKET } from '$env/static/private';

export const s3 = new S3({
    credentials: {
        accessKeyId: STORJ_ACCESS_KEY,
        secretAccessKey: STORJ_SECRET_ACCESS_KEY
    },
    endpoint: STORJ_ENDPOINT,
    forcePathStyle: true,
    region: STORJ_REGION
});

export const GET: RequestHandler = (async ({ url }) => {
    const path = url.searchParams.get('path') ?? generatePath()
    const filename = url.searchParams.get('filename') ?? generatePath()
    const bucketParams = {
        Bucket: STORJ_BUCKET,
        Key: `${path}/${filename}`,
    };
    let res
    try {
        res = await s3.createMultipartUpload(bucketParams)
    } catch (error) {
        console.log(error)
    }
    return json(res)
})
1 Like

I’m a bit confused, and I have to say that I don’t have that much experience in this area, but if you are using the pre-signed URL for uploading parts, I don’t understand why you have to call uploadPart?

Oh, well I have to supply a command to getSignedUrl… and UploadPartCommand is the correct one, I would say. I later do a simple PUT to the signed URL.
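So condensed, the flow is (partData standing in for one chunk of the stream):

const command = new UploadPartCommand({ Bucket, Key, PartNumber, UploadId });
const signedUrl = await getSignedUrl(s3, command);          // server side
await fetch(signedUrl, { method: 'PUT', body: partData });  // client side, plain PUT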

This also seems to be the correct approach, as described on Stack Overflow: node.js - migrate multipart upload with presigned urls from aws javascript sdk v2 to v3 - Stack Overflow

According to UploadPart - Amazon Simple Storage Service, the ETag is returned as part of the response headers.
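So after PUTting a part to the pre-signed URL, you read it from the headers rather than the body, roughly like this:

const resp = await fetch(signedUrl, { method: "PUT", body: partData });
// the header value keeps its surrounding quotes, e.g. "\"d8c2...\""
const etag = resp.headers.get("etag");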

Also, it seems you’re still using the v2-style call with s3.createMultipartUpload(bucketParams). In AWS SDK v3 you’d do:

const command = new CreateMultipartUploadCommand(bucketParams);
const resp = await s3Client.send(command);

Here’s an Express server example:

var express = require("express");
var router = express.Router();

var {
  S3Client,
  CreateMultipartUploadCommand,
  UploadPartCommand,
} = require("@aws-sdk/client-s3");
const { getSignedUrl } = require("@aws-sdk/s3-request-presigner");

const s3Client = new S3Client({
  credentials: {
    accessKeyId: process.env.ACCESS_KEY,
    secretAccessKey: process.env.SECRET_KEY,
  },
  region: "global",
  endpoint: "https://gateway.storjshare.io",
});

router.post("/presign-multipart", async function (req, res, next) {
  try {
    const bucketParams = {
      Bucket: "demo-bucket",
      Key: `${req.body.path}/${req.body.filename}`,
      PartNumber: req.body.partNumber,
      UploadId: req.body.uploadId,
    };
    const command = new UploadPartCommand(bucketParams);
    const signedUrl = await getSignedUrl(s3Client, command, {
      expiresIn: 3600,
    });
    res.json({ url: signedUrl });
  } catch (err) {
    console.error(err);
    res.status(500).end();
  }
});

router.post("/create-multipart", async function (req, res, next) {
  try {
    const bucketParams = {
      Bucket: "demo-bucket",
      Key: `${req.body.path}/${req.body.filename}`,
    };
    const command = new CreateMultipartUploadCommand(bucketParams);
    const resp = await s3Client.send(command);

    res.json(resp);
  } catch (err) {
    console.error(err);
    res.status(500).end();
  }
});

Calling the endpoints on the server would look something like this:

#!/bin/bash

UPLOAD_ID=$(curl http://localhost:3001/create-multipart -d 'path=test' -d 'filename=file' | jq -r .UploadId)
# echo $UPLOAD_ID
URL=$(curl http://localhost:3001/presign-multipart -d 'path=test' -d 'filename=file' -d 'partNumber=1' -d "uploadId=$UPLOAD_ID" | jq -r .url)
# echo $URL

curl -v -X PUT --data-binary @part1.txt $URL

And the response of that last curl is (notice the ETag among the headers):

< HTTP/2 200 
< accept-ranges: bytes
< content-security-policy: block-all-mixed-content
< date: Thu, 05 Jan 2023 23:08:07 GMT
< etag: "871c68189b326a94de95e219f660448e"
< server: MinIO
< vary: Origin
< x-amz-request-id: 17378C55F9E21B04
< x-xss-protection: 1; mode=block
< content-length: 0
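To finish the object you’d then complete the multipart upload with the ETags you collected. A minimal sketch of such a route (the endpoint name and its parts parameter are my own invention, not something the snippets above already have; CompleteMultipartUploadCommand also comes from "@aws-sdk/client-s3"):

router.post("/complete-multipart", async function (req, res, next) {
  try {
    const command = new CompleteMultipartUploadCommand({
      Bucket: "demo-bucket",
      Key: `${req.body.path}/${req.body.filename}`,
      UploadId: req.body.uploadId,
      // expects parts as a JSON string: [{ "ETag": "...", "PartNumber": 1 }, ...]
      MultipartUpload: { Parts: JSON.parse(req.body.parts) },
    });
    const resp = await s3Client.send(command);
    res.json(resp);
  } catch (err) {
    console.error(err);
    res.status(500).end();
  }
});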
2 Likes

@danw Thanks for your detailed answer!

I can see that the last curl is the one that uploads part @part1.txt to the server - could you give me an example of how to actually upload all parts from a JS ReadableStream to the server?

I adjusted the API endpoint http://localhost:5173/api/storage/storj-create-multipart-upload:

import { json, type RequestHandler } from "@sveltejs/kit";
import { generatePath } from "$lib/utils";
import { S3, CreateMultipartUploadCommand } from "@aws-sdk/client-s3";
import { STORJ_ACCESS_KEY, STORJ_SECRET_ACCESS_KEY, STORJ_ENDPOINT, STORJ_REGION, STORJ_BUCKET } from '$env/static/private';

export const s3 = new S3({
    credentials: {
        accessKeyId: STORJ_ACCESS_KEY,
        secretAccessKey: STORJ_SECRET_ACCESS_KEY
    },
    endpoint: STORJ_ENDPOINT,
    forcePathStyle: true,
    region: STORJ_REGION
});

export const GET: RequestHandler = (async ({ url }) => {
    const path = url.searchParams.get('path') ?? generatePath()
    const filename = url.searchParams.get('filename') ?? generatePath()
    const bucketParams = {
        Bucket: STORJ_BUCKET,
        Key: `${path}/${filename}`,
    };
    const command = new CreateMultipartUploadCommand(bucketParams)
    let res
    try {
        res = await s3.send(command)
    } catch (error) {
        console.log(error)
    }
    return json(res)
})

Still the same response…

As the response status is 200 but its type is "cors", could it be that Storj doesn’t expose "ETag" cross-origin? ETag isn’t among the CORS-safelisted response-header names (https://fetch.spec.whatwg.org/#cors-safelisted-response-header-name), so the gateway would have to list it in Access-Control-Expose-Headers explicitly.

Here is a Stack Overflow thread showing how this would be done on the server side: json - How to fix this error I’m getting from Amazon s3: ETagMissing: No access to ETag property on response. Check CORS configuration to expose ETag header - Stack Overflow
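If that’s the cause, I guess we’d have to add ETag to the bucket’s CORS ExposeHeaders, something like this (assuming the gateway accepts PutBucketCors - I haven’t verified that against Storj):

import { PutBucketCorsCommand } from "@aws-sdk/client-s3";

await s3.send(new PutBucketCorsCommand({
  Bucket: STORJ_BUCKET,
  CORSConfiguration: {
    CORSRules: [{
      AllowedMethods: ["PUT"],
      AllowedOrigins: ["http://localhost:5173"],
      AllowedHeaders: ["*"],
      // make the ETag header readable from cross-origin JavaScript
      ExposeHeaders: ["ETag"],
    }],
  },
}));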

Can you paste the output of the etag header?

console.log("etag", resp.headers.get("etag"));

I’m not having any trouble getting the etag, so I’m not sure what’s different on your end.

I do see the preflight CORS request, but it’s followed by the upload request (I don’t do anything special here, see code below).

From the upload request I can see the etag in the response in the console. You’ll also notice access-control-expose-headers includes Etag, so the S3 gateway seems to be configured correctly.

Not exactly a JS ReadableStream example, but it uploads a file in JavaScript in the browser. It’s more or less the curl example above converted to JavaScript.

import "./App.css";
import { useState } from "react";

function App() {
  const [loading, setLoading] = useState(null);
  const onSubmit = async (event) => {
    event.preventDefault();

    setLoading(true);
    try {
      let file = event.target[0].files[0];
      let url = `http://localhost:3001/create-multipart`;
      let params = new URLSearchParams({
        filename: file.name,
        path: "test",
      });
      let resp = await fetch(url, {
        method: "POST",
        body: params,
      });
      let createMultipartResp = await resp.json();

      console.log("create-multipart resp", createMultipartResp);

      url = `http://localhost:3001/presign-multipart`;
      params = new URLSearchParams({
        filename: file.name,
        path: "test",
        uploadId: createMultipartResp.UploadId,
        partNumber: 1,
      });
      resp = await fetch(url, {
        method: "POST",
        body: params,
      });
      let presignResp = await resp.json();

      resp = await fetch(presignResp.url, {
        method: "PUT",
        headers: {
          "Content-Type": file.type,
        },
        body: file,
      });
      console.log("resp", resp);
      console.log("etag", resp.headers.get("etag"));
    } catch (err) {
      console.error(err);
    } finally {
      setLoading(false);
    }
  };
  return (
    <div className="App">
      <header className="App-header">
        <form onSubmit={onSubmit}>
          <input type="file" />
          <br />
          <button disabled={loading}>Upload!</button>
        </form>
      </header>
    </div>
  );
}

export default App;

Console output

create-multipart resp {$metadata: {…}, Bucket: 'demo-bucket', Key: 'test/Screen Shot 2022-10-05 at 3.54.43 PM.png', UploadId: '3svduYfAsQBXCtczhd58JgEhu5XTsJBqHY6J617e2DyKoTwYk2…d18Tpit5pCHGmMVj5KdSzc8wVU4cb6puUUJK1uHDeuUFw7BAD'}

resp Response {type: 'cors', url: 'https://demo-bucket.gateway.storjshare.io/test/Scr…5KdSzc8wVU4cb6puUUJK1uHDeuUFw7BAD&x-id=UploadPart', redirected: false, status: 200, ok: true, …}

etag "b855383d825388a64c782b30c330a822"
3 Likes

I can confirm that I’ve done exactly this flow with Storj using JS + Elixir, so it does work.

import * as React from "react";
import * as Sentry from "@sentry/browser";
import axios from "axios";
import { filesize } from "filesize";
import axiosRetry from "axios-retry";
import log from "loglevel";
import BusyIndicator from "./BusyIndicator";
import PQueue from "p-queue";
import {
  stubTrue,
  constant,
  cond,
  isEmpty,
  sumBy,
  reduce,
  map,
  values,
  orderBy,
  flow,
  keyBy,
  each,
  reject,
} from "lodash/fp";
import produce from "immer";

const UPLOAD_CONCURRENCY = 3;

const uploadAxios = axios.create();
axiosRetry(uploadAxios, { retries: 3, retryDelay: axiosRetry.exponentialDelay });

const uploadProgressAxios = axios.create();
axiosRetry(uploadProgressAxios, { retries: 10, retryDelay: axiosRetry.exponentialDelay });

/**
 * The queue ensures we limit the number of uploads to Storj that are
 * occurring at the same time. Too many and we simply overload the user's
 * connection.
 *
 * We reject any parts that are already marked with completed_at; no need to upload
 * them again. They will however still be marked as completed in the UI.
 */
const queue = new PQueue({ concurrency: UPLOAD_CONCURRENCY, autoStart: true });

const logUploadPart = (uploadPart, level, message) => {
  log[level](`Part number [${uploadPart.partNumber}]: ${message}`);
};

export default function UploadPerform({ userDetails, uploadDetails, onError, onComplete }) {
  const partProgress = React.useRef(null);
  const [lastProgressAt, setLastProgressAt] = React.useState(null);

  /**
   * Called to create the upload job on the part before placing it in the
   * queue.
   */
  const createUploadJob = React.useCallback((uploadPart, data) => {
    return async () => {
      /**
       * Perform the upload to Storj. We embed this code in an anonymous
       * function so it can be executed by the queue system.
       */

      // record the start time of the upload
      const nextPartProgress = produce(partProgress.current, (draftPartProgress) => {
        draftPartProgress[uploadPart.id] = {
          ...draftPartProgress[uploadPart.id],
          startedAt: Date.now(),
        };
      });

      partProgress.current = nextPartProgress;

      try {
        logUploadPart(uploadPart, "info", "Starting upload");

        const uploadResponse = await uploadAxios({
          method: "put",
          url: uploadPart.uploadUrl,
          data,
          headers: {
            "Content-Type": uploadDetails.mimeType,
          },
          onUploadProgress: async (progressEvent) => {
            /**
             * Calculate the progress of the upload to Storj and store it in
             * the partProgress ref. We also store the last progress event as
             * setting this will trigger a rerender of the React component.
             *
             * Finally, we report the progress of this piece to the server so
             * that the dashboard can display the upload progress. If this
             * reporting fails, it is not fatal to the app.
             */

            // Guard on not being able to calculate progress yet
            if (!progressEvent.loaded || !progressEvent.total) {
              return;
            }

            const progress = progressEvent.loaded / progressEvent.total;

            // If we cannot calculate progress as part of this upload, bail out
            if (!progress) {
              return;
            }

            const elapsedSeconds = (Date.now() - partProgress.current[uploadPart.id].startedAt) / 1000;
            const bytesCompleted = data.size * progress;
            const bytesPerSecond = bytesCompleted / elapsedSeconds;

            const nextPartProgress = produce(partProgress.current, (draftPartProgress) => {
              draftPartProgress[uploadPart.id] = {
                ...draftPartProgress[uploadPart.id],
                progress,
                bytesPerSecond,
              };
            });

            partProgress.current = nextPartProgress;
            setLastProgressAt(Date.now());

            try {
              logUploadPart(uploadPart, "info", `Reporting progress to the server: ${progress}`);

              await uploadProgressAxios({
                data: {
                  progress,
                },
                method: "put",
                url: `/api/upload_parts/${uploadPart.id}/progress`,
              });
            } catch (e) {
              logUploadPart(uploadPart, "error", `Failed to report progress to server, this is not fatal`);
              Sentry.captureException(e);
            }
          },
        });

        /**
         * Once the upload to Storj has completed, we then inform the FileYeet
         * backend that this piece has been completed. We pass the etag that was
         * given to us by Storj as this is needed to join the multipart upload
         * later.
         */
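        // The raw ETag header value is a quoted string (e.g. "\"b855...\"");
        // JSON.parse is a convenient way to strip the surrounding quotes.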
        const etag = JSON.parse(uploadResponse.headers.etag);

        try {
          logUploadPart(uploadPart, "info", "Notifying server that part upload was completed");

          const { data } = await uploadProgressAxios({
            data: {
              etag,
            },
            method: "put",
            url: `/api/upload_parts/${uploadPart.id}`,
          });

          const nextPartProgress = produce(partProgress.current, (draftPartProgress) => {
            draftPartProgress[uploadPart.id] = {
              ...draftPartProgress[uploadPart.id],
              completedAt: data.completedAt,
              progress: data.progress,
            };
          });

          partProgress.current = nextPartProgress;
          setLastProgressAt(Date.now());
        } catch (e) {
          logUploadPart(uploadPart, "error", "Failed to report to server that upload part completed");
          Sentry.captureException(e);
          onError(e, "We had a problem performing the upload, please try again");
        }
      } catch (e) {
        logUploadPart(uploadPart, "error", "Failed to upload part to S3");
        Sentry.captureException(e);
        onError(e, "We had a problem performing the upload, please try again");
      }
    };
  });

  /**
   * Start jobs for each part to upload to the backend
   */
  React.useEffect(() => {
    if (userDetails === null || uploadDetails === null) {
      return;
    }

    (async () => {
      /**
       * Initialize a map of progress based on id, this is updated when we
       * start uploading to Storj. We use a ref here because of stale scope
       * when running the jobs in the queue.
       */
      const initProgress = flow(
        orderBy(["partNumber"], ["asc"]),
        map((uploadPart) => {
          return {
            ...uploadPart,
            progress: 0.0,
            startedAt: null,
            bytesPerSecond: null,
          };
        }),
        keyBy("id")
      )(uploadDetails.uploadParts);

      partProgress.current = initProgress;

      flow(
        reject("completedAt"),
        each((uploadPart) => {
          const file = userDetails.file.slice(uploadPart.partStart, uploadPart.partStart + uploadPart.partSize);
          const job = createUploadJob(uploadPart, file);
          queue.add(job);
        })
      )(uploadDetails.uploadParts);

      setLastProgressAt(Date.now());

      await queue.onIdle();

      onComplete(null);
    })();
  }, [userDetails, uploadDetails]);

  if (lastProgressAt === null) {
    return <BusyIndicator message="Getting ready to upload..." />;
  }

  const uploadParts = values(partProgress.current);

  const renderedProgress = flow(
    orderBy(["partNumber"], ["asc"]),
    map((uploadPart) => {
      const partProgressPercentage = uploadPart.progress * 100;

      return (
        <div key={uploadPart.id}>
          <div className="w-full flex items-center">
            <div className="w-full h-6 bg-slate-900 relative border-slate-700 border">
              <div className="h-6 bg-slate-600" style={{ width: `${partProgressPercentage}%` }} />
            </div>
          </div>
        </div>
      );
    })
  )(uploadParts);

  const amountUploaded = reduce((uploaded, uploadPart) => {
    return uploaded + uploadPart.partSize * uploadPart.progress;
  }, 0)(uploadParts);

  const totalProgress = amountUploaded / userDetails.file.size;

  // Sum the speed of all non-complete parts (i.e. those that are being uploaded currently)
  const sumBytesPerSecond = flow(
    reject(({ bytesPerSecond, completedAt }) => {
      return bytesPerSecond === null || completedAt !== null;
    }),
    cond([
      [isEmpty, constant(null)],
      [stubTrue, sumBy("bytesPerSecond")],
    ])
  )(uploadParts);

  const gridCols = (() => {
    switch (uploadParts.length) {
      case 1:
        return "grid-cols-1";
      case 2:
        return "grid-cols-2";
      case 3:
        return "grid-cols-3";
      case 4:
        return "grid-cols-4";
      default:
        return "grid-cols-5";
    }
  })();

  return (
    <div className="text-center border border-white/20 p-8">
      <div className="text-center mb-4">
        <span className="text-white/80 block mb-2 text-3xl">{userDetails.uploaderFilename}</span>
      </div>
      <span className="text-white/90 text-2xl block mb-4">
        <span className="block mb-2 text-2xl">{Math.floor(totalProgress * 100)}% done</span>
        <span className="block mb-2 text-base">
          {sumBytesPerSecond === null ? "Calculating speed..." : `${filesize(sumBytesPerSecond)}/s`}
        </span>
      </span>
      <div className="mb-4">
        <div className={`grid border border-white/20 ${gridCols}`}>{renderedProgress}</div>
      </div>
    </div>
  );
}

It’s a big chunk of code, but you can see I’m being given an etag back in the header.

3 Likes

If you want to see what the flow looks like from the browser, you can create an account here and perform an upload:

Open the network connections developer tools and you’ll be able to see the exchange of payloads between my server <> browser <> storj

I am such a fool - Sorry!

Turns out I just didn’t read "etag" correctly from the headers. I was trying to get it directly via .etag instead of via headers.get("etag") :confused:
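In other words:

console.log(resp.etag);                // undefined - fetch doesn't expose headers as properties
console.log(resp.headers.get("etag")); // e.g. "b855383d825388a64c782b30c330a822"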

Thanks a lot for your help y’all!

2 Likes

Nice!

I have a feeling I had a couple of other gotchas along the way with this flow, so I’m happy to help if you have any other questions.

2 Likes