Feature Request: Lifecycle Management

Is there any form of lifecycle management of files stored in a bucket?

Backblaze has it - as does Amazon

1 Like

You can set an expiration date for objects, if that is what you mean. Developers can then build such a service around Storj DCS if their requirements call for it.
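For example, with the uplink CLI you can set the expiration at upload time (a minimal sketch; the bucket and file names are placeholders, and the docs describe the accepted time formats):

uplink cp --expires +720h backup.tar sj://my-bucket/backup.tar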

2 Likes

Through the GUI?
I don’t see a way of doing this.

I guess there might be a way in the CLI, but I am not using that.

The Storj GUI is not meant to be full featured. The interface provided is mainly to give developers and others a high-level understanding of the tools. Most of the functionality is available at the command line or via S3 connectivity from existing (and new) applications that leverage Storj, and which features they utilize is determined by the developers and what they need.

1 Like

No, the GUI doesn’t have this feature; you need to use the uplink CLI, for example to set an expiration at upload time (see Setting Object Lifecycles - Storj Docs).

The other way is to generate an access grant/S3 credentials with a TTL built in, i.e.

uplink share --max-object-ttl 24h --readonly=false --not-after=none sj://my-bucket

then use this access grant in any of your tools. All objects uploaded with this access grant will expire after 24h.
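For instance (a sketch; the grant value and paths are placeholders):

uplink cp --access <generated-access-grant> backup.tar sj://my-bucket/backup.tar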
You may also generate S3 credentials with the same behavior:

uplink share --max-object-ttl 24h --readonly=false --not-after=none --register sj://my-bucket
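The registered credentials can then be used in an rclone remote roughly like this (a sketch; the remote name is a placeholder and the endpoint is the public Storj gateway):

[storj-s3]
type = s3
provider = Other
access_key_id = <access key returned by the command above>
secret_access_key = <secret key returned by the command above>
endpoint = https://gateway.storjshare.io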
2 Likes

We’ve just hit this problem when trying to migrate customers from both S3 and Backblaze. Both have lifecycle policies on the buckets that delete the files automatically after 30 days.

We are a big user of rclone and this doesn’t seem to support the expiry feature described above.

Hello @dogsbody,
Welcome to the forum!

You have two options:

  1. Provide the TTL header when you use rclone to upload data, specifying a delta or the exact date when the uploaded objects should be automatically deleted (see the example after this list)
  2. Create the access grant/S3 credentials with a TTL. All objects uploaded using this access grant or these S3 credentials will have this TTL.
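For option 1, something along these lines should work (a rough sketch only; if I recall correctly the gateway honours an Object-Expires metadata header, but check the docs linked below for the exact header name and accepted value formats; the remote, bucket, and path names are placeholders):

rclone copy --header-upload "X-Amz-Meta-Object-Expires: +720h" /local/path storj-s3:my-bucket/my-prefix/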

See Setting Object Lifecycles - Storj Docs.

1 Like

I would like to revive this request. I have objects that need to expire after a certain amount of time has passed after the object is marked as deleted in a versioned bucket. Currently I have to manually clear out old/deleted objects using rclone. I don’t have a way to predict when the objects should expire at upload time, as they may be arbitrarily deleted. All objects have a 30d compliance lock applied at upload time, which is extended as needed.

With Backblaze this deletion happened automatically based on the defined object lifecycle rule such as “delete old versions after x days”. This is the feature that I am missing.

1 Like

We have a feature to upload objects with a TTL, and you can also create an access grant/S3 credentials which carry this TTL; docs are linked above.

However, Object Lock and TTL are mutually exclusive, so you should use either Object Lock or a TTL, not both.

Please also note that rclone doesn’t delete versions; it puts a delete marker, as described in the article Working with delete markers - Amazon Simple Storage Service.
To delete versions you need to specify these options, e.g.

rclone delete --s3-versions --s3-version-deleted --min-age 30d --rmdirs -P storj-s3:my-bucket/my-prefix/
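If you want to preview what would be removed first, the same filters can be combined with rclone’s --dry-run flag, e.g.

rclone delete --dry-run --s3-versions --s3-version-deleted --min-age 30d --rmdirs -P storj-s3:my-bucket/my-prefix/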

You can still delete the object and all its versions even if they were uploaded with a TTL. But if you use the Object Lock feature, you will not be able to upload with a TTL. And depending on the Object Lock mode (Compliance, Governance, or Legal Hold) you might not be able to delete a version at all until its lock expires.

Thank you for the rclone command, that is actually very helpful. I was using
rclone backend cleanup-hidden storj-s3:backup
before, and it errors on every object that is still locked because it doesn’t check whether an object is locked before trying to delete it. The software I am using is Kopia. It automatically extends the object locks to prevent backups from being deleted, and puts delete markers on objects that are no longer needed. It does not remove hidden objects after their lock expires, so I have to use rclone for that step. This is where the lifecycle management feature would normally remove the deleted (hidden) objects after the lock expires.

I’m aware of the TTL feature, but it doesn’t work in this case as the object lock and TTL are mutually exclusive as you said. It’s also impossible to tell how long a certain object will need to exist at upload time, so no TTL.

1 Like

I’m putting this out here for anyone else who decides to try these commands.
I used the rclone command rclone delete --s3-versions --s3-version-deleted --min-age 30d --rmdirs -P storj-s3:my-bucket/my-prefix/ you gave me, and it deleted every object that was more than 30 days old (by adding a delete marker to them). This broke my backup. I was able to recover by removing all the delete markers to restore the versioned objects. Just putting this out here in case anyone else has the same problem.

I believe the issue here is that rclone delete automatically puts a delete marker on versioned objects. It then chose objects that were more than 30 days old and deleted them that way; it was not removing already deleted (hidden) objects.

rclone backend cleanup-hidden does sort of work, but rclone doesn’t support object locks, so it fails in a safer way. It will delete the unlocked objects with delete markers only, but then it tries to delete objects that are still locked, starting with the latest version, which is always the delete marker; it then gets a 403 when it tries to remove the underlying object. So it actually ends up undeleting all the objects that are still locked. I’m looking into scripting this myself, as rclone can’t handle it. rclone hasn’t implemented this because every other S3 provider has basic lifecycle management features that take care of this edge case automatically.

You are correct: rclone puts a delete marker with the delete command unless all versions are requested to be deleted (which is not the case when you use a filter like “older than 30 days”). However, I expected that it would delete all previous versions as well. It seems I was wrong.

I do not have any suggestions for how to overcome this at this time, and I will share your feedback with the team.

Solid request. In M&A work, especially when we’re evaluating digital infrastructure during a business sale, lack of lifecycle visibility can become a blocker. Features that help define and control data/service lifecycles would be a win from both operational and compliance angles.

(Posted from the perspective of someone active in business brokerage — Phoenix, Peterson Acquisitions)

Hello @katehiggins,
Welcome to the forum!

The lifecycle feature does exist: Setting Object Lifecycles - Storj Docs

Please clarify the use case in M&A for having it per-bucket, where it’s not possible to do the same per-object?

Hi everyone,

TL;DR: There is currently no way to automatically remove objects from a versioned bucket that are no longer locked and are marked as deleted. Right?

I also stumbled upon this problem while trying to set up immutable S3 backups with Storj and TrueNAS.

TrueNAS has an integrated job that allows sync, meaning it will delete objects in the cloud that no longer exist at the source.

If you consider backups managed by software like Duplicati or Veeam, this is really great, as they will handle the deletion of old backups in the chain and TrueNAS will handle the delete markers in the cloud during a sync operation.

My problem, however, is that versions marked as deleted will linger, and be billed, for as long as manual deletion takes, since there is no lifecycle management in versioned buckets. Or am I missing something?

This resulted in a 300 GB backup chain, which should only be about 30% bigger in the cloud due to versioning, growing to 2 TB in the cloud, because every version ever created still existed in the bucket.

Hi @Kazumsan,
Welcome to the forum!
I’ve forwarded your post internally and someone will return with a reply.

Hello @Kazumsan,
Welcome to the forum!

There are many tools for that: Deleting Buckets Using Different Tools - Storj Docs
However, I believe you mean without running any additional command? If so, then you can either upload with a TTL, or integrate a TTL into the access grant/S3 credentials, and expiring versions will be deleted automatically by the system.
But if you also enable Object Lock, then you will be unable to use the TTL feature; they are mutually exclusive.
If you want to use the Object Lock feature with housekeeping, then you should use tools which support this correctly, like Veeam.
I think many backup tools may have this feature if they are able to recognize and use the Object Lock feature properly; otherwise it will be up to you to delete expired versions after them.
However, if you use a backup tool which is aware of neither versioning nor Object Lock, and it is not a simple sync tool (which is actually not a backup solution, it’s a replication solution), then you must not use object versioning or Object Lock even if a TTL would be possible: you may corrupt the tool’s hash structure and it would be impossible to restore from such a backup copy. This is because you cannot be sure which index file or pack can be safely deleted, so an automatic TTL may destroy it.

For that case you may enable object versioning and use a TTL integrated into the S3 credentials. As a result it could be called a backup solution, because you would be able to restore an older version of a single file, for example. However, it would likely consume more storage and more segments, driving your bill up (you would have several copies of the same object as its versions).

So I would recommend configuring the TrueCloud Task instead; it’s restic with a UI, so it will have snapshots and will pack smaller files into bigger chunks, reducing your storage and segment costs. But in that case you must not use versioning, because versioning is already integrated into that backup solution.
Object Lock could be enabled, though, but you will have the same problem: you will not know which versions with expired locks are safe to delete. You may assume that versions are safe to delete if there is a delete marker, but it’s difficult to automate: you need to select only objects with a delete marker and delete only the versions of those objects. So it’s better not to use this feature either in that case.
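For reference, that selection could be scripted roughly like this against the S3 gateway with the aws CLI (a rough sketch only, not a supported tool; the endpoint and bucket name are placeholders, it assumes your S3 credentials are already configured for the aws CLI, and versions still under a Compliance lock will simply fail to delete with an access denied error):

BUCKET=my-bucket
ENDPOINT=https://gateway.storjshare.io

# collect the keys whose latest version is a delete marker
aws --endpoint-url "$ENDPOINT" s3api list-object-versions --bucket "$BUCKET" \
  --query 'DeleteMarkers[?IsLatest==`true`].Key' --output text | tr '\t' '\n' | sort -u |
while read -r KEY; do
  case "$KEY" in ""|None) continue ;; esac  # skip the aws CLI's empty/None output
  # delete every data version of that key; still-locked versions are rejected by the gateway
  aws --endpoint-url "$ENDPOINT" s3api list-object-versions --bucket "$BUCKET" --prefix "$KEY" \
    --query "Versions[?Key=='$KEY'].VersionId" --output text | tr '\t' '\n' |
  while read -r VID; do
    case "$VID" in ""|None) continue ;; esac
    aws --endpoint-url "$ENDPOINT" s3api delete-object --bucket "$BUCKET" --key "$KEY" --version-id "$VID"
  done
done

The delete markers themselves could afterwards be removed the same way using the DeleteMarkers[].VersionId entries, once all data versions of a key are gone.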