How to call a streaming API from Azure Logic Apps?

Is there any way that I can use Azure Logic Apps to call streaming APIs? The API doesn't support a chunking parameter, so the Range header is not an option.
I am looking for logic similar to the Python Requests library's Response.iter_content(), so that I can iterate over the response data in chunks and write them back to a database.
My desired workflow: the Logic App is triggered once a day at a specific time. It calls the streaming API using the HTTP connector and streams in all the data (the data is much larger than the HTTP connector's internal buffer size). The API supports streaming requests, and I am able to iterate through the response in chunks using Python's Response.iter_content(chunk_size=1048576) to fetch it all.
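For reference, the Python pattern I'm trying to reproduce looks roughly like this (the URL and the database write are placeholders):

```python
import requests

# Placeholder endpoint; the real API URL and auth are omitted here.
STREAM_URL = "https://example.com/api/stream"

# stream=True tells requests not to download the whole body up front.
with requests.get(STREAM_URL, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    # Iterate over the body in 1 MiB chunks instead of buffering it all.
    for chunk in resp.iter_content(chunk_size=1048576):
        if chunk:  # skip keep-alive chunks
            write_chunk_to_db(chunk)  # placeholder for the DB write
```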
PS: I know I can use an Azure Function and have it triggered within my Logic App workflow, but I'd prefer to use Logic Apps' native connectors.

Related

Connecting API to database

I am using this API to build an app (in Xcode), and the maximum number of calls per day is 5,000. The way I have currently built the app for testing purposes is to call the API every time the user refreshes the data, so I am running out of calls per day. I was wondering how to connect an API to a database like Firebase, then update the data in the database maybe 4 times a day at specific times. When users refresh, they would pull data from the database instead. I'm new to programming and am not sure if this is the best solution; I would appreciate it if anyone could direct me to more resources. Thanks!
This is the api I am using: https://projects.propublica.org/api-docs/congress-api/?
Edit: Would something like this also mean I would build a REST API? https://github.com/unitedstates/congress It is a repository that includes data-importing scripts and scrapers. I'm guessing this isn't compatible with Swift, but is it compatible with building a REST API on AWS or Firebase?
You can use AWS (Amazon Web Services). Their free tier includes many of their services for free (for 12 months, with usage limits), including the ones I would recommend for this project:
Make an AWS account.
Use S3 storage buckets to host a datafile.
Use API Gateway to make an API.
Use Lambda to run a Python/JavaScript function in the cloud which connects the API with the S3 bucket (your data).
Use IAM to create roles and permissions so that the S3 bucket, API, and Lambda scripts can communicate.
Here's how you set up the API: https://www.youtube.com/watch?v=uFsaiEhr1zs
Here's how you read the S3 bucket: https://www.youtube.com/watch?v=6LvtSmJhVRE
You can also work with these tools to set up an API that PUTs data to the S3 bucket and updates the data regularly.
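As a rough sketch of the Lambda step (not the only way to wire it), a handler behind an API Gateway proxy integration can serve the S3 data file; the bucket and key names below are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder names; use whatever bucket/key you created in S3.
BUCKET = "my-congress-data"
KEY = "latest.json"

def lambda_handler(event, context):
    """Return the cached data file from S3 through API Gateway."""
    obj = s3.get_object(Bucket=BUCKET, Key=KEY)
    body = obj["Body"].read().decode("utf-8")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": body,
    }
```

A second, scheduled Lambda can refresh the file in S3 a few times a day, so the app never hits the rate-limited API directly.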

What Google Cloud component to use for my client-server pub/sub?

I'm building an app on Google Cloud that takes users' audio recordings. Users record 1 or more audio clips and upload them to the backend, which processes the clips, runs a machine learning prediction model I built, and returns an integer back to the user for each uploaded clip. Processing and predicting one piece of audio takes about 10 seconds. Users can upload 20 audio clips at a time.
What I have so far is:
HTML, JavaScript, and CSS on the client side. The upload functionality is async, using fetch, and returns a promise.
The backend runs on Google App Engine (Python 3.7) with Firebase Authentication, Google Cloud Storage, and Cloud Logging.
The processing and prediction run on Google Cloud Functions.
My question is as follows:
Since it might take up to 200-300 seconds for the processing to complete, how should I handle the task once users hit the upload button? Is a simple request-response enough?
I have investigated the following:
Google Cloud Tasks. This seems inappropriate, because the client actually needs to know when the processing is done, and there is really no callback when the task is done.
Google Cloud Pub/Sub. There is a callback for when the job is done (subscribe), but it's server side. This seems more appropriate for server-to-server communication than for client-server.
What is the appropriate piece of tech to use in this case?
There are two ways to improve the user experience.
First, on the processing side, you can parallelize. Each prediction should be handled by the same Cloud Function. In App Engine, use multi-threaded processing that invokes your Cloud Function for a single audio clip, and do that 20 times in parallel (see the sketch below). I don't know how to achieve that with async in Python, but I know that you can.
Then, if you implement that, you wait for all the audio clips to finish processing before sending a response to your users; the total should be between 15 and 20 seconds.
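A minimal sketch of that fan-out using a thread pool, assuming the Cloud Function is HTTP-triggered at a placeholder URL and returns a JSON score:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

# Placeholder HTTP-triggered Cloud Function endpoint.
PREDICT_URL = "https://REGION-PROJECT.cloudfunctions.net/predict"

def predict_one(clip_url):
    """Ask the Cloud Function to process a single audio clip."""
    resp = requests.post(PREDICT_URL, json={"clip": clip_url}, timeout=60)
    resp.raise_for_status()
    return resp.json()["score"]  # assumed response shape

def predict_all(clip_urls):
    # Fan out one request per clip; up to 20 clips run concurrently,
    # so total latency approaches that of the slowest single clip.
    with ThreadPoolExecutor(max_workers=20) as pool:
        return list(pool.map(predict_one, clip_urls))
```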
If you use Cloud Run, you can use streaming (or partial HTTP responses). Therefore, you could send a partial response as soon as you get a response from your Cloud Functions, whichever audio clip finishes first (the 3rd can finish before the 1st).
As you note, Cloud Pub/Sub is more appropriate for server-to-server communication, since there is currently a limit of 10,000 subscriptions per topic per project (this may change in the future).
Firebase Cloud Messaging may be a more appropriate solution for client-server messaging.
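A sketch of the server side with the firebase_admin SDK, assuming the client has registered an FCM device token and sent it to the backend:

```python
import firebase_admin
from firebase_admin import messaging

# Uses the default service-account credentials on App Engine.
firebase_admin.initialize_app()

def notify_done(device_token, clip_id, score):
    """Push a finished prediction to the client that uploaded the clip."""
    message = messaging.Message(
        token=device_token,          # token the client registered with FCM
        data={                       # FCM data payloads must be strings
            "clip_id": clip_id,
            "score": str(score),
        },
    )
    messaging.send(message)
```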

How to stream data from database via REST API?

I have large data stored in a Postgres database and I need to send it to the client via a REST API using Django. The requirement is to send the data in chunks and not load the entire content into memory at once. I understand that there is a StreamingHttpResponse class in Django, which I will explore. But are there any other, better options? I've heard about Kafka and Spark for streaming applications, but the tutorials I've checked for these two tend to involve streaming live data, like interacting with Twitter data, etc. Is it possible to stream data from a database using either of these? If so, how do I then integrate it with REST so that clients can interact with it? Any leads would be appreciated. Thanks.
You could use Debezium or Apache Kafka Connect to bulk-load your database into Kafka.
Once the data is there, you can put a Kafka consumer either within your Django application or outside of it, and make REST requests as messages are consumed. Spark isn't strictly necessary, and shouldn't be used within Django.
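For the StreamingHttpResponse route mentioned in the question (separate from the Kafka approach above), a minimal sketch with a hypothetical model; queryset.iterator() uses a server-side cursor on Postgres, so rows are fetched in batches rather than materialized all at once:

```python
import json

from django.http import StreamingHttpResponse

from myapp.models import Record  # hypothetical model


def stream_records(request):
    """Stream all records as newline-delimited JSON, one row at a time."""
    def row_iter():
        # iterator() fetches rows in batches via a server-side cursor
        # instead of loading the whole queryset into memory.
        for rec in Record.objects.iterator(chunk_size=2000):
            yield json.dumps({"id": rec.id, "name": rec.name}) + "\n"

    return StreamingHttpResponse(row_iter(),
                                 content_type="application/x-ndjson")
```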

How to perform database queries in async microservices?

I have a problem with one concept about async microservices.
Assume that all my services are subscribed to some event bus, and I have an exposed API Gateway which takes HTTP requests and translates them to AMQP messages.
How do I handle GET requests to my API Gateway? Should I use RPC? For a single entity it's OK, but what about search or filtering (e.g. get games by genre from the Games service)?
I'm thinking about using RPC to get single entities by id, and creating a separate Search service with Elasticsearch which will expose some GET endpoints to the API Gateway. But maybe there's a simpler solution for my problem. Any ideas?
Btw, is it correct to translate HTTP requests from the API Gateway into AMQP messages?
Please take a look at Event Sourcing and CQRS.
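For the RPC-over-AMQP part of the question, a minimal client sketch with pika, in the style of the RabbitMQ RPC tutorial; the queue name and payloads are illustrative, not a fixed contract:

```python
import json
import uuid

import pika  # AMQP client library


class GamesRpcClient:
    """Minimal RPC-over-AMQP client, as called from an API Gateway."""

    def __init__(self):
        self.conn = pika.BlockingConnection(
            pika.ConnectionParameters("localhost"))
        self.channel = self.conn.channel()
        # Exclusive auto-named queue to receive replies on.
        result = self.channel.queue_declare(queue="", exclusive=True)
        self.callback_queue = result.method.queue
        self.channel.basic_consume(queue=self.callback_queue,
                                   on_message_callback=self.on_response,
                                   auto_ack=True)
        self.response = None
        self.corr_id = None

    def on_response(self, ch, method, props, body):
        # Match the reply to our request via the correlation id.
        if props.correlation_id == self.corr_id:
            self.response = json.loads(body)

    def get_game(self, game_id):
        self.response = None
        self.corr_id = str(uuid.uuid4())
        self.channel.basic_publish(
            exchange="",
            routing_key="games.get",  # queue the Games service listens on
            properties=pika.BasicProperties(
                reply_to=self.callback_queue,
                correlation_id=self.corr_id,
            ),
            body=json.dumps({"id": game_id}),
        )
        while self.response is None:
            self.conn.process_data_events()
        return self.response
```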

Collecting Data from public mobile application

I'd like to collect information from a mobile application I created. The app allows users to use it without authentication, and I'd like to collect the data into a highly-available service such as AWS SQS so that I don't miss any data.
The application is always connected to the internet, so there is no need for offline collection of the data.
What bothers me is how I can send the data in a secure manner, so that users are not able to send fake data to the same endpoint I'm using.
Google Analytics is not a fit here because I need access to the raw data, not only aggregates of it.
You should look into STS for getting temporary access credentials from your app instead of hard-coding AWS credentials into it.
The fact that your application does not require authentication does not necessarily mean you are at increased risk of a malicious actor sending bad data to your service. If your app had authentication, it would still be possible for a malicious actor to reverse-engineer the requests and send bad data using the authenticated credentials.
While sending data directly to SQS is a valid option, you could also send the data into SNS if you want to have the ability to fan out to multiple systems such as multiple SQS queues.
You could also look into using API Gateway + Lambda as the service called from your app, even if the Lambda function only sends the data to SQS, as this would allow for additional processing flexibility in the future, such as validating input with additional logic before it is sent to SQS. However, this type of logic could just as easily be performed when the messages are pulled off the queue.
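A sketch of that API Gateway + Lambda front door, with a trivial validation step before the SQS write; the queue URL and the expected payload field are placeholders:

```python
import json

import boto3

sqs = boto3.client("sqs")

# Placeholder queue URL; in practice, taken from the SQS console.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/app-events"

def lambda_handler(event, context):
    """API Gateway proxy handler: validate the payload, then forward to SQS."""
    payload = json.loads(event.get("body") or "{}")
    # Example of the extra server-side validation mentioned above.
    if "event_name" not in payload:
        return {"statusCode": 400, "body": "missing event_name"}
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))
    return {"statusCode": 202, "body": "queued"}
```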
