Delete all archives from an AWS S3 Glacier vault instead of deleting one by one

I am looking for a way to delete multiple archives from AWS S3 Glacier.
I found ways to delete archives by ID one at a time, using AWS PowerShell, the AWS CLI, and the REST API.
Is there a way to delete all archives in one execution, for example by using a wildcard for the archive IDs?

There is no single API call or AWS-provided one-click option for this task.
You can write a simple script yourself, or use one of the many existing examples (like https://gist.github.com/veuncent/ac21ae8131f24d3971a621fac0d95be5).
You have to generate the vault inventory and then send a delete request for each archive ID.
You can call the AWS APIs directly as well.
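For example, a minimal boto3 sketch that walks an already-retrieved inventory and deletes each archive; the region, vault name, and inventory job ID below are placeholders, and it assumes the inventory-retrieval job has already completed:

    import json
    import boto3

    # Placeholders: region, vault name, and the ID of a completed inventory-retrieval job.
    glacier = boto3.client("glacier", region_name="us-east-1")
    VAULT = "my-vault"
    INVENTORY_JOB_ID = "example-inventory-job-id"

    # Download the inventory produced by the inventory-retrieval job.
    output = glacier.get_job_output(accountId="-", vaultName=VAULT, jobId=INVENTORY_JOB_ID)
    inventory = json.loads(output["body"].read())

    # Glacier has no bulk delete, so send one delete request per archive ID.
    for archive in inventory["ArchiveList"]:
        glacier.delete_archive(accountId="-", vaultName=VAULT, archiveId=archive["ArchiveId"])
        print("Deleted", archive["ArchiveId"])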
During development, you can make REST API calls to Glacier using Postman:
https://medium.com/@nisarg.1611/how-to-call-aws-s3-glacier-rest-apis-from-postman-e3db527508d5
Please update if you have found any other way.

Related

Ways to Send Snowflake Data to a REST API (POST)

I was wondering if anyone has sent data from Snowflake to an API (POST Request).
What is the best way to do that?
I'm thinking of using Snowflake to unload (COPY INTO) to Azure blob storage and then creating a script to send that data to an API. Or I could just use the Snowflake API directly, all within a script, and avoid blob storage.
Curious about what other people have done.
To send data to an API you will need to have a script running outside of Snowflake.
Then with Snowflake external functions you can trigger that script from within Snowflake - and you can send parameters to it too.
I did something similar here:
https://towardsdatascience.com/forecasts-in-snowflake-facebook-prophet-on-cloud-run-with-sql-71c6f7fdc4e3
The basic steps in that post are:
Have a script that runs Facebook Prophet inside a container that runs on Cloud Run.
Set up a Snowflake external function that calls the GCP proxy that calls that script with the parameters I need.
In your case I would look for a similar setup, with the script running within Azure.
https://docs.snowflake.com/en/sql-reference/external-functions-creating-azure.html
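For reference, Snowflake's external function protocol POSTs a JSON batch to your remote service and expects exactly one result row back per input row. A minimal sketch of such a handler in Python (using Flask, with a placeholder target API; the Azure/GCP proxy wiring is covered by the links above) might look like:

    import requests
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Placeholder: the REST API you ultimately want to POST each row to.
    TARGET_API = "https://example.com/ingest"

    @app.route("/snowflake-proxy", methods=["POST"])
    def snowflake_proxy():
        # Snowflake external functions send a body like:
        #   {"data": [[0, <arg1>], [1, <arg1>], ...]}
        # where the first element of each row is the row number.
        rows = request.get_json()["data"]

        results = []
        for row in rows:
            row_number, payload = row[0], row[1]
            resp = requests.post(TARGET_API, json=payload, timeout=30)
            # Snowflake expects one result per input row, keyed by row number.
            results.append([row_number, resp.status_code])

        return jsonify({"data": results})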

Fetch thousands of files from S3/minio with a single-page webapp (no server)

I'm developing a single-page app for image annotation. Each .jpg file is stored on S3/MinIO, coupled with a .xml file (Pascal VOC notation) that describes the coordinates and positions of each annotation associated with the image.
I'd like to fetch all the xml data so I can filter my image results within the webapp (based on ReactJS). But thousands of requests to an S3 server directly from a web app seem a bit odd to me; nevertheless, I would prefer to avoid any "middleware" server (like python/flask or nodejs) and rely on the ReactJS app alone.
I haven't been able to find any way to download the content of all the xml files with a single ajax call; do you have any ideas for addressing this kind of issue?
The S3 API doesn't provide a way to fetch multiple objects in a single operation. As you suggested in your question, your application will need to handle this logic itself by first getting a list of the objects and then iterating through that list.
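For illustration, this is the same list-then-fetch pattern your React app would use with the AWS JavaScript SDK, shown here as a short Python/boto3 sketch for brevity (the endpoint, bucket, and prefix are placeholders; the endpoint_url form also works against MinIO):

    import boto3

    # Placeholders: MinIO/S3 endpoint, bucket, and key prefix.
    s3 = boto3.client("s3", endpoint_url="http://localhost:9000")
    BUCKET = "annotations"

    xml_payloads = {}
    paginator = s3.get_paginator("list_objects_v2")

    # 1) List all objects under the prefix, then 2) fetch each .xml file individually.
    for page in paginator.paginate(Bucket=BUCKET, Prefix="voc/"):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".xml"):
                body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
                xml_payloads[obj["Key"]] = body.decode("utf-8")

    print("Fetched %d annotation files" % len(xml_payloads))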
Alternatively, consider storing the xml files in a single archive so they can be fetched with one request.

Uploading images to S3 with React and Elixir/Phoenix

I'm trying to scheme how I'm going to accomplish this and so far I have the following:
I grab a file on the front end and, on submit, send the file name and type to the back end, which generates a presigned URL. I send that to the front end, which then uploads the file.
The issue here is that when I generate the presigned URL, I want to commit the UUID filename that will go to S3 to my database via the back end. I don't know whether the front end will successfully complete the upload. I can think of some janky ways to garbage-collect this, but I'm wondering: is there a typically prescribed way to do this that doesn't introduce the possibility of failures the back end isn't aware of?
Yes, there's an alternative way. You can configure your bucket to send an event whenever an object is created or updated, and route that event either to an SNS topic or to AWS Lambda.
From there you can make a request to a webhook in your Phoenix app, which can insert the record into the database.
The advantage is that the event only arrives once the file has actually been created.
For more info, you can read the following: https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
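A minimal sketch of such a Lambda handler in Python (the webhook URL is a placeholder, and the payload shape is whatever your Phoenix endpoint expects):

    import json
    import urllib.request

    # Placeholder: your Phoenix app's webhook endpoint.
    WEBHOOK_URL = "https://example.com/api/s3-webhook"

    def handler(event, context):
        # S3 "ObjectCreated" notifications deliver one or more records per event.
        for record in event["Records"]:
            payload = {
                "bucket": record["s3"]["bucket"]["name"],
                "key": record["s3"]["object"]["key"],  # the UUID filename
                "size": record["s3"]["object"].get("size"),
            }
            req = urllib.request.Request(
                WEBHOOK_URL,
                data=json.dumps(payload).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            urllib.request.urlopen(req, timeout=10)
        return {"statusCode": 200}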
The way I'm currently handling this is as such:
Compress the image client side.
Send the image to the backend application server.
Create a UUID on the backend.
Send the image from the backend to S3, using the UUID as the key.
On success, put the UUID into the database.
Respond to the client with the UUID so it can display the image.
By following these steps, you don't introduce error into your database.
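A sketch of steps 3-5, written in Python with boto3 purely for illustration (the back end in question is Elixir/Phoenix, and the bucket name and database helper below are placeholders):

    import uuid
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-image-bucket"  # placeholder bucket name

    def save_key_to_database(key):
        # Placeholder for the real DB insert (step 5).
        pass

    def store_image(image_bytes, content_type):
        # Step 3: create the UUID on the back end.
        key = str(uuid.uuid4())
        # Step 4: upload the image to S3 using the UUID as the object key.
        s3.put_object(Bucket=BUCKET, Key=key, Body=image_bytes, ContentType=content_type)
        # Step 5: record the UUID only after the upload has succeeded.
        save_key_to_database(key)
        return key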

Download large file on Google App Engine Python

On my appspot website, I use a third-party API to query a large amount of data. The user then downloads the data as a CSV. I know how to generate a CSV and download it. The problem is that because the file is huge, I get the DeadlineExceededError.
I have tried increasing the fetch deadline to 60 seconds (urlfetch.set_default_fetch_deadline(60)). It doesn't seem reasonable to increase it any further.
What is the appropriate way to tackle this problem on Google App Engine? Is this something where I have to use Task Queue?
Thanks.
DeadlineExceededError means that your incoming request took longer than 60 seconds; it is not about your UrlFetch call.
Deploy the code that generates the CSV file into a different module that you set up with basic or manual scaling. The URL to download your CSV will become http://module.domain.com
Requests can run indefinitely on modules with basic or manual scaling.
Alternatively, consider creating a file dynamically in Google Cloud Storage (GCS) with your CSV content. At that point, the file resides in GCS and you can generate a URL from which users can download the file directly. There are also other options for different auth methods.
You can see documentation on doing this at
https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/
and
https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/functions
Important note: do not use the Files API (which was a common way of dynamically creating files in blobstore/GCS), as it has been deprecated. Use the Google Cloud Storage Client API referenced above instead.
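A minimal sketch with that client library, assuming placeholder bucket and object names:

    import cloudstorage as gcs

    def write_csv_to_gcs(rows):
        # Placeholders: bucket name and object path.
        filename = "/my-bucket/exports/report.csv"

        # Stream the CSV straight into GCS instead of building it in the request handler.
        with gcs.open(filename, "w", content_type="text/csv") as gcs_file:
            for row in rows:
                gcs_file.write(",".join(row) + "\n")

        # The object now lives at gs://my-bucket/exports/report.csv and can be served
        # to the user via a download URL generated for that object.
        return filename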
Of course, you can delete the generated files after they've been successfully downloaded and/or you could run a cron job to expire links/files after a certain time period.
Depending on your specific use case, this might be a more effective path.

Browser-based file upload to AWS S3 and encoding: server-client workflow

I'm writing a single-page web app (AngularJS) and a server back end (node.js). The communication between them is done via REST.
Currently I'm trying to implement the following scenario:
Upload big files from the browser to a public S3 bucket.
Copy the uploaded file to a private S3 bucket.
Transcode the uploaded file to an HTML5-compatible format (AWS Elastic Transcoder).
Store a meta-object about the file in a DB to access later.
I'm racking my brain to come up with a solid design for the communication / data workflow between server and client, but I keep getting stuck on the following questions:
Should the file meta-object be stored at the end or at the beginning of the process? If at the beginning, do I have to store and handle some state information?
Who should start copying uploaded files to the private bucket, the server or the client? If it is the server, how can the client be informed that the job succeeded?
Who starts the transcoding process? If it is the server, how can the client be informed that the job succeeded?
How would you do this?
There is a pretty good tutorial that describes the use case you are planning to implement: http://www.bitcodin.com/blog/2015/02/create-mpeg-dash-hls-content-for-amazon-s3-and-cloudfront/
If your transcoding system has a RESTful API (like bitcodin, which is used in that tutorial, or any other service), you can build your application client-side and use API calls to check the state of your transcodings, etc. Using the API, you can also do the same server-side, whichever fits you better.
I personally would store the metadata at the beginning of the process, as that is the point at which you generate the "asset" in your database/CMS/etc.
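For illustration only, a rough polling sketch against a hypothetical transcoding status endpoint (the URL, job ID, and response fields are assumptions, not bitcodin's actual API):

    import time
    import requests

    # Hypothetical endpoint; substitute your transcoding service's real status API.
    STATUS_URL = "https://api.example-transcoder.com/jobs/{job_id}/status"

    def wait_for_transcoding(job_id, poll_seconds=10, timeout_seconds=3600):
        deadline = time.time() + timeout_seconds
        while time.time() < deadline:
            status = requests.get(STATUS_URL.format(job_id=job_id), timeout=30).json()
            # Once the job finishes (or fails), the server or client can react:
            # copy to the private bucket, store the meta-object, notify the UI, etc.
            if status.get("state") in ("FINISHED", "ERROR"):
                return status
            time.sleep(poll_seconds)
        raise RuntimeError("Transcoding job %s did not finish in time" % job_id)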
