How to manage backup size

clemoun
clemoun Member Posts: 9
edited March 19 in Digital Asset Management

My users are uploading and deleting a lot of assets to Cloudinary.

To avoid any deletion horror stories, I activated the automatic backup to an AWS S3 bucket. However, the backup just keeps growing and costs will keep rising forever... which is not super convenient for any company, as you can imagine :-)

Is there any way to make sure the backup keeps an "efficient" size? It looks like Cloudinary only offers 2 choices: either back up everything forever or nothing at all.

For instance, backing up assets for x months would make sense: if no one notices they disappeared after such a long time, then they were rightfully deleted and I no longer need to keep a backup version of them.


Is that something that could be done with a bit of code? Am I the only one with that kind of issue? 😅

Answers

  • Wissam
    Wissam Member, Cloudinary Staff Posts: 103

    Hi there,

    Cloudinary doesn’t directly offer a built-in feature to set a retention period for backups.

    Would you like to delete old assets and then delete these assets from your backup?

    You can use the bulk delete feature to achieve that :

    It's a manual process that you can do from time to time.

    Hope this is helpful!

    Regards,

    Wissam

  • clemoun
    clemoun Member Posts: 9

    Thanks Wissam for your answer !

    What I need is to bulk delete backed-up assets that were deleted more than (say) 6 months ago.

    I am not sure I understand what the bulk delete feature allows regarding this issue. It seems to me it only allows deleting currently active assets based on their creation dates or tags...

    Unless there is a way to add a tag to deleted/backed-up assets and then bulk delete them? Is that possible?

  • Wissam
    Wissam Member, Cloudinary Staff Posts: 103

    Hi there,

    As of now, Cloudinary’s automatic backup feature retains backups indefinitely, and there isn’t a built-in option to automatically delete backups after a specific period (e.g., 6 months). However, I understand your concern about managing storage costs effectively.

    Here are some considerations and potential approaches:

    1. Custom Retention Policy:
      • Implement a custom solution using code to manage backups based on your desired retention period.
      • Store asset metadata (including upload timestamp) in a database or log.
      • Periodically (e.g., monthly), run a script to identify assets older than the retention period.
      • Delete the backups for those assets from your backup storage.
    2. Lifecycle Policies in Backup Storage:
      • If you’re using your own backup storage (e.g., Amazon S3 or Google Cloud Storage), set up lifecycle policies there.
      • Configure rules to automatically delete backups older than your desired retention period.
      • This approach is storage-specific and doesn’t rely on Cloudinary’s features.
    3. Tagging Deleted Assets:
      • Although Cloudinary doesn’t directly support tagging deleted assets, you can add a custom tag to assets before deletion (e.g., “to_be_deleted”).
      • Then, use the bulk delete feature in the Cloudinary console to filter by this tag and delete the corresponding backups.
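
The selection step of approach 1 could be sketched as follows. This is only an illustration under stated assumptions: `deletion_log` is a hypothetical mapping from asset ID to a timestamp that you record yourself (Cloudinary does not provide it in this form), and whether you store the upload time or the deletion time is your own choice.

```python
from datetime import datetime, timedelta, timezone


def expired_backups(deletion_log, retention_days, now=None):
    """Return the asset IDs whose recorded timestamp is older than the retention period.

    deletion_log: hypothetical dict {asset_id: timezone-aware datetime} kept by you.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    # Keep only the entries recorded strictly before the cutoff date.
    return sorted(
        asset_id for asset_id, recorded_at in deletion_log.items() if recorded_at < cutoff
    )
```

The returned IDs are only candidates; actually removing their backups is a separate step that depends on where the backups live.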

    Remember that custom solutions require additional development effort, but they allow you to tailor your backup strategy to fit your specific needs. Feel free to choose the approach that best aligns with your requirements!

    Regards,

    Wissam

  • clemoun
    clemoun Member Posts: 9

    Thanks Wissam,

    I still have technical questions regarding solution 1 and 2.


    Regarding solution 1, could you be more specific about the last part, where I programmatically delete a backed-up asset? Should I use the Admin API and the route DELETE /resources/backup/:asset_id? I have read that the Admin API is rate-limited hourly, so it would not be possible to run this extensively...


    Regarding solution 2, this page https://cloudinary.com/documentation/backups_and_version_management says explicitly in a note: "When using your own backup storage, (...) no life-cycle policy (archiving, deletion), no versioning, and no object locks should be enforced on that location."

    Can you nevertheless confirm this could still be a valid strategy?

  • Cloudinary_John_H
    Cloudinary_John_H Cloudinary Staff Posts: 51

    Hi Clemoun,

    Here is some more context on solutions 1 and 2 mentioned by Wissam above.

    These are both getting into the realm of custom solutions that go against our general guideline of not modifying the backups that are stored in your own S3/Google storage bucket. We generally advise against deleting the assets that are backed up in your own bucket, because you will then not be able to restore those assets to Cloudinary later, even though we have a record that the asset should be backed up and restorable.

    Some more context on solution 1: https://support.cloudinary.com/hc/en-us/articles/202520872-How-can-I-delete-assets-from-my-account

    You can delete a single asset using the destroy method of the Upload API or delete multiple assets using the deleteResources method of the Admin API. Single destroy calls from the Upload API aren't rate limited. The Admin API can take a list of 100 assets at a time, and your Admin API is rate limited to 5000 calls per hour.
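
To make those limits concrete: at 100 assets per call and 5000 calls per hour, the upper bound is 100 × 5000 = 500,000 assets per hour, assuming no other Admin API calls use the same budget. A minimal batching sketch follows; `delete_call` is a placeholder for whatever function performs one Admin API delete (for example the Python SDK's `cloudinary.api.delete_resources`, which is an assumption about your setup and not something shown in this thread).

```python
def batches(public_ids, size=100):
    """Yield consecutive slices of at most `size` public IDs."""
    for start in range(0, len(public_ids), size):
        yield public_ids[start:start + size]


def delete_in_batches(public_ids, delete_call, max_calls=5000):
    """Call `delete_call` once per batch, stopping at the hourly call budget.

    Returns the public IDs that were not processed, so they can be retried later.
    """
    all_batches = list(batches(public_ids))
    for batch in all_batches[:max_calls]:
        delete_call(batch)  # one Admin API call per batch of up to 100 IDs
    # Flatten whatever did not fit in this hour's budget.
    return [pid for batch in all_batches[max_calls:] for pid in batch]
```

As stated later in this thread, these calls remove assets from Cloudinary's listing but do not delete any backup.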

    The final line "Delete the backups for those assets from your backup storage." refers to going into your remote bucket, locating the asset that was removed from Cloudinary, and also deleting the backed up version from your remote bucket. We don't offer an API for this, you would need to use the API of your remote bucket provider.
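
For an S3 bucket, that provider-side deletion might look like the sketch below. It assumes `s3_client` is a boto3 S3 client (`boto3.client("s3")`) and that you already know the object keys of the backed-up files; how Cloudinary lays out keys in your bucket is not described in this thread, so `keys` is a hypothetical input you would need to work out yourself.

```python
def delete_backup_objects(s3_client, bucket, keys):
    """Delete the given object keys from `bucket`, 1000 keys per request.

    Returns the number of keys submitted for deletion.
    """
    submitted = 0
    for start in range(0, len(keys), 1000):
        chunk = keys[start:start + 1000]  # S3 accepts at most 1000 keys per DeleteObjects call
        s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in chunk], "Quiet": True},
        )
        submitted += len(chunk)
    return submitted
```

Keep in mind the caveat above: once a file is gone from the bucket, Cloudinary may still list the asset as restorable even though the restore can no longer succeed.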

    Some more context on solution 2:

    The same caveat applies: "We generally advise to not delete the assets that are backed up in your own bucket because the implication is that you will not be able to restore those assets to Cloudinary later even though we have a record that the asset should be backed up and restorable." You may be able to automatically clean up assets that have been backed up for a longer period of time, such as 2 years, which would help to reduce your remote bucket bill. However, this is generally not advised from our perspective, as it could lead to questions about why an asset isn't restoring when you need to restore it from your remote bucket.
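
For reference, an automatic cleanup of this kind on S3 would typically be an expiration lifecycle rule. The sketch below assumes `s3_client` is a boto3 S3 client; the rule ID and the 730-day figure in the usage note are illustrative choices. Note that Cloudinary's documentation, quoted earlier in this thread, says no lifecycle policy should be enforced on the backup location, so this goes against the official guidance.

```python
def expiration_rule(days, prefix=""):
    """Build one S3 lifecycle rule that expires objects `days` after their creation."""
    return {
        "ID": f"expire-backups-after-{days}-days",  # illustrative rule name
        "Filter": {"Prefix": prefix},  # an empty prefix matches every object in the bucket
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


def apply_expiration(s3_client, bucket, days, prefix=""):
    """Install the rule on `bucket`; this replaces any existing lifecycle configuration."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [expiration_rule(days, prefix)]},
    )
```

A call such as `apply_expiration(client, "my-backup-bucket", 730)` would correspond to the "2 years" example. S3 counts the days from when each object was created in the bucket, not from when the asset was deleted in Cloudinary, so still-active assets would lose their backups too.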


    What we see customers do more often are the steps in this article:

    https://support.cloudinary.com/hc/en-us/articles/205228402-Storage-cleanup-best-practices

    Specifically step 4 "Clearing images that haven't been accessed for a while". Our bulk-delete tool https://console.cloudinary.com/settings/settings/bulk_delete/new can be used to clear old assets from your Cloudinary account periodically with the "last accessed" option. This can be used to delete assets that no one has viewed for a period of time. Then, once that is done, on the Cloudinary side we can wipe all your backups from your remote bucket and re-run the "initial backup". (To do this, open a ticket here https://support.cloudinary.com/hc/en-us/requests/new, and ask us to clear your existing backups and re-run the initial backup process). The end result here is that only assets that are currently in your Cloudinary account will end up backed up once the process is complete, reducing your remote bucket bill.

    Hopefully that all makes sense, please let me know if you have any further questions.

  • clemoun
    clemoun Member Posts: 9
    edited March 20

    Thanks a lot John for those details ! Things are getting clearer.

    Unfortunately, the final proposition you are making will not fit my needs as users are relying on my service to store images that could be last accessed a long time ago but still needed on their part.

    Thanks a lot for clarifying the fact that the presence of a backed up file in Cloudinary's listing AND the actual existence of this file in my S3 bucket are two separate things. I did not have that in mind, though this makes sense now.

    One last thing I need to clarify is what Wissam and you mean by "deleting an asset", whether through:

    1. The destroy method of the Upload API
    2. The deleteResources method of the Admin API
    3. The bulk Delete interface

    If this asset is a backed-up asset, do the 3 methods above work? Or do they only work on active assets? The documentation makes it seem (to me) like they only work on active assets.

    My main focus here is to limit the size of my backed-up assets. As you are both referring to those 3 methods (and not the DELETE /resources/backup/:asset_id route, which is the only one I found clearly devoted to backed-up assets), it seems like I could delete backed-up assets from Cloudinary's listing using any of the 3 above.

  • clemoun
    clemoun Member Posts: 9

    Can someone please answer my last question ?

    The documentation is not crystal clear on whether the following methods work on backed-up assets:

    1. The destroy method of the Upload API
    2. The deleteResources method of the Admin API
    3. The bulk Delete interface

    Wissam's and John's answers make it seem like those 3 methods would allow deleting a backed-up asset from Cloudinary's listings. But a confirmation would be most welcome...

  • Wissam
    Wissam Member, Cloudinary Staff Posts: 103

    Hi Clemoun,

    The 3 delete options delete resources from Cloudinary's listing, but won't delete any backup.

    To delete backup versions of a resource, you can use this method:

    Admin API Reference | Cloudinary

    You can read more about delete resources in the following documentation :

    https://cloudinary.com/documentation/delete_assets

    Please let me know if you have additional questions.

    Regards,

    Wissam

  • clemoun
    clemoun Member Posts: 9

    @Wissam: I am sorry, but this is still not clear to me.

    Could you please tell me which of these 3 actions those delete options perform:

    A) delete an asset considered active (not deleted) in Cloudinary listings

    B) delete an asset considered deleted (but restorable) in Cloudinary listings

    C) delete the actual backed-up file in my Amazon S3 storage


    My understanding is :

    • The destroy method of the Upload API
    • The deleteResources method of the Admin API
    • The bulk Delete interface

    only do A, and neither B nor C. => Is that correct?


    To do B, the only way is to use the DELETE /resources/backup/:asset_id route described here => is that correct?

    To do C, the only way is to do it myself on my Amazon S3 bucket => is that correct?

  • Wissam
    Wissam Member, Cloudinary Staff Posts: 103

    Hi there,

    Sorry for not being clear.

    Once you use one of these 3 options (destroy method, deleteResources and bulk Delete), it will delete the assets (meaning they will be considered deleted and no longer accessible, and you can restore them from your backup). Please note that if the deletion was done through the bulk delete interface, it can be restored by contacting the Support center.

    To achieve B, you can indeed use the DELETE /resources/backup/:asset_id route as described in Cloudinary’s documentation.

    For C, you are correct that you would need to manage the deletion of files directly in your Amazon S3 bucket.
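
For action B, a request to that route could be built roughly as in the sketch below. Several details are assumptions to verify against the Admin API reference before use: the `version_ids[]` parameter name and passing it in the query string are not confirmed anywhere in this thread, and the cloud name, credentials and IDs in the usage note are placeholders. The function only constructs the request; it does not send it.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def backup_delete_request(cloud_name, api_key, api_secret, asset_id, version_ids):
    """Build (but do not send) a DELETE /resources/backup/:asset_id request."""
    # Assumption: backed-up versions are selected with a `version_ids[]` parameter.
    query = urlencode([("version_ids[]", version_id) for version_id in version_ids])
    url = (
        f"https://api.cloudinary.com/v1_1/{cloud_name}"
        f"/resources/backup/{asset_id}?{query}"
    )
    # The Admin API authenticates with HTTP Basic auth (API key and API secret).
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return Request(url, method="DELETE", headers={"Authorization": f"Basic {token}"})
```

Passing the result of `backup_delete_request("my_cloud", "key", "secret", "some_asset_id", ["some_version_id"])` to `urllib.request.urlopen` would send it. Each such request counts against the hourly Admin API rate limit mentioned earlier in the thread.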

    I hope it is clearer now.
    Please let me know if you have follow up questions.

    Regards,

    Wissam

  • clemoun
    clemoun Member Posts: 9

    Thanks Wissam.

    This is much clearer now !

    Have a great day