S3-like public bucket

Hi everyone,

I am new to Storj and am trying to create an S3-like public bucket to share with others. I successfully created a public bucket with my account. I’d like to access it using Python, so I wrote this code:

import s3fs
import xarray as xr
print(xr.__version__)

# Storj S3-compatible endpoint (from your provided details)
s3_endpoint = 'https://gateway.storjshare.io'

# Access credentials (from your provided details)
access_key = "zzzz"
secret_key = "xxxx"

# Create a s3fs filesystem with your credentials
fs = s3fs.S3FileSystem(
    key=access_key,
    secret=secret_key,
    client_kwargs={'endpoint_url': s3_endpoint}
)

# The bucket name and folder path
bucket_name = 'dtree-zarr'
object_key = 'Guaviare.zarr'

# Map the Zarr store inside the bucket to a key-value store
file = s3fs.S3Map(f'{bucket_name}/{object_key}', s3=fs)

dtree = xr.backends.api.open_datatree(
    file, 
    engine='zarr', 
    consolidated=True, 
    chunks={}
)

And it worked. The question is, do I need to expose the credentials for this public bucket if I want to access it?

I tried using anon=True, which works for all public buckets in AWS, but I got an Access Denied error.

Am I doing something wrong?

Thanks in advance for your help and time

Does this help? You shouldn’t have to share any credentials.

If you also need to allow uploads to the bucket, then you have several options:

  • give them S3 credentials that allow at least PUT (write permission); perhaps you do not want to allow them to list and/or delete;
  • run your own S3-compatible gateway and expose it either through a reverse proxy or directly. To allow uploads you would still need to share its generated S3 credentials (they are generated locally when you set up the gateway and do not change later, unless you remove the config and set it up again). If you only need to use the bucket as a static website, you may run the gateway in static website mode;
  • implement your own site that uses the bucket as storage.

Thanks for your prompt reply.

I am not sure if I understood correctly, but I was looking for a place to host a small dataset to share with other people, not necessarily friends or colleagues. The idea was to test data streaming with a minimal reproducible example.

I think I will look for a different alternative.

Thanks once more for your time and help!

So you want to create a bucket writeable by the whole world? Or do you want to host readonly data accessible by the world?

It’s hard to give advice if requirements are so vague.

Sorry for the confusion. I just want to create a read-only bucket with a small dataset to be queried/consulted by others.

Have a look at this

And specifically,

  • Path-based linkshare - displays a list of objects with a shared path in a browser. This feature allows sharing a folder of objects. When clicked in a browser, any of the objects will be displayed individually on a Linkshare web page
  • Direct download Linkshare - a URL to directly access and download an object via the internet without loading a web page
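As a rough illustration of the difference between the two link types, the sketch below rewrites a page-style Linkshare URL into a direct-download one. It assumes that Linkshare page URLs have the form https://link.storjshare.io/s/<access>/<bucket>/<path> and that swapping the leading `s` segment for `raw` yields the direct-download form; please verify this against the Storj docs. The helper name `to_raw_url` and the `EXAMPLEKEY` access value are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_raw_url(share_url: str) -> str:
    """Turn a page-style Linkshare URL into a direct-download URL.

    Assumption: the path looks like /s/<access>/<bucket>/<key...> and the
    direct-download variant uses /raw/ in place of /s/.
    """
    parts = urlsplit(share_url)
    segments = parts.path.split("/")  # segments[0] is "" because of the leading "/"
    if len(segments) < 2 or segments[1] != "s":
        raise ValueError(f"not a page-style Linkshare URL: {share_url}")
    segments[1] = "raw"
    return urlunsplit(parts._replace(path="/".join(segments)))


# Hypothetical share link to one object of the Zarr store from this thread
share = "https://link.storjshare.io/s/EXAMPLEKEY/dtree-zarr/Guaviare.zarr/.zmetadata"
raw = to_raw_url(share)
print(raw)
```

A URL of this kind can be fetched by any plain HTTP client without S3 credentials, since the access is carried in the link itself.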

Is this what you need?

I think it is close to what I want. However, I’d like to provide s3 bucket-like object storage that can be accessed with tools like fsspec and s3fs, where Python users can mount and access S3-like buckets as local file systems. This will enable users to open and query datasets efficiently using data streaming (HTTP requests) instead of downloading the whole dataset locally.

Perhaps this s3fs guide will help: Connecting s3fs to Storj - Storj Docs

Thanks, @heunland, for sharing this. It’s what I’ve been looking for. I wonder if access keys and secret keys are still needed for public buckets.

If you share the local mount via the S3 interface of your site/application, you don’t need to provide keys. Users would need to set your server address and port as the endpoint in their app if they want to use the AWS S3 SDK.

However, you may skip the whole s3fs mount complication (I do not understand why you would need it at all) by running a self-hosted S3-compatible Storj gateway in static website mode. It provides an interface for read-only S3 requests: it returns XML or, if you put an index.html in the bucket, renders that page. If users need access through the AWS S3 SDK rather than plain HTTP, they would need to set the endpoint to your instance.

The other way is to configure rclone and run rclone serve s3; see rclone serve s3 for details. If you don’t provide an Access Key and Secret Access Key in either the S3 gateway or the rclone configuration, it will allow anonymous access.

To disallow writes, you may restrict the access grant you use so that it has no write permission.
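From the client side, anonymous access to such a self-hosted endpoint might look like the sketch below. It is only an illustration under stated assumptions: the endpoint URL `http://localhost:8080` is a placeholder for wherever your gateway or rclone instance listens, the helper names are hypothetical, and whether the server actually accepts unsigned requests depends on how it was configured.

```python
def anonymous_s3_options(endpoint_url: str) -> dict:
    """Build s3fs/fsspec options for unsigned (credential-free) requests."""
    return {
        "anon": True,  # do not sign requests with any access key
        "client_kwargs": {"endpoint_url": endpoint_url},
    }


def open_anonymous_store(endpoint_url: str, bucket: str, key: str):
    """Return a key-value mapping over bucket/key on the given endpoint."""
    import s3fs  # third-party; imported lazily so the options helper works without it

    fs = s3fs.S3FileSystem(**anonymous_s3_options(endpoint_url))
    return s3fs.S3Map(f"{bucket}/{key}", s3=fs)


# Hypothetical endpoint of a self-hosted, read-only S3 interface
options = anonymous_s3_options("http://localhost:8080")
print(options)
```

The mapping returned by `open_anonymous_store("http://localhost:8080", "dtree-zarr", "Guaviare.zarr")` could then be passed to the same xarray call as in the first post, with no keys embedded in the shared code.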