I'm fairly new to anything that isn't strictly front end, so after reading the Google Pub/Sub docs and doing a few searches, it's not clear to me whether using it with React is possible.
My use case is I (hypothetically) have tens of thousands of people on my webpage at a time that all need to be told at the same time that some external event occurred (the message would be very small).
I know Google Firestore has a listener feature, but with these requirements it would no longer fall within the free tier. I've seen libraries that allow Google Pub/Sub to be used with IoT devices, so I'm confused about why I can't find any resources on using it in the browser.
Creating a Cloud Pub/Sub subscriber in the frontend would be an anti-pattern for several reasons. First of all, the quota limits only allow 10,000 subscriptions per topic. Since you say you have tens of thousands of people on the web page at a time, you would not be able to create enough subscriptions for this case. Additionally, subscriptions created when users come to the website would not be able to get any notifications from before the time the subscription was created; Cloud Pub/Sub only guarantees delivery of messages published after the subscription was successfully created. Finally, you'd have the issue of security and authentication. In order to start a subscriber from the client, you'd need to pass it credentials that it could use. If you use separate credentials for each webpage viewer, then you'd have to create these credentials on the fly and revoke them when the user disappears. If you use the same credentials across all of the subscribers, then one subscriber could intercept the feed of another subscriber.
Overall, Cloud Pub/Sub is designed for the torrents use case: fewer feeds with a lot of data that has to be processed by fewer subscribers. What you are talking about is the trickles use case: a small number of messages that need to be distributed among a large number of subscribers with individual ACLs. Firebase Cloud Messaging is the product designed for this latter case.
While it is true that Cloud Pub/Sub is on the path for Google Cloud IoT, it is used on the publish side: many devices send their events to a topic that can be processed by subscribers. Note that these messages from devices don't come directly into Cloud Pub/Sub; they go through a Cloud IoT server and that server is what publishes the messages to Cloud Pub/Sub. Device authentication is done via Cloud IoT and not via permissions on Cloud Pub/Sub topics. The delivery of messages to IoT devices is not done with Cloud Pub/Sub.
Service A will run in EKS with replicated instances across several pods. This service does many things; among them, it is meant to subscribe to a Salesforce Streaming API that implements pub/sub for publishing messages.
The Salesforce Streaming API follows the Bayeux protocol. The Bayeux protocol is implemented with CometD, which is essentially a long-polling-over-HTTP strategy.
Long story short, this pub/sub pattern is based on a long-lived request from the subscriber to the publisher's server. The publisher only responds when there's an update, and the subscriber then sends a new long-lived request right away.
I'm worried about the way AWS EKS would deal with this type of subscription, especially considering there will be multiple replica instances of service A running at all times. I need to guarantee each message is processed only once.
Maybe the inherent load-balancing in EKS will handle this? After googling around and reading for half a day I haven't found a concrete answer, and I don't yet have access to the EKS resources to set up a test.
My question:
Are my worries about this EKS-to-external long-polling pub/sub setup justified? Do I need to add additional AWS elements into the picture?
I'm working with a Django web app deployed on Google App Engine flexible environment.
I'm streaming my data into BigQuery while processing requests in my views, using bigquery.Client(). But I don't think this is the best way to do it. Do I need to delegate this process outside of the view (using Pub/Sub, tasks, Cloud Functions, etc.)? If so, please suggest a suitable architecture: which GCP product I should use, how to connect it, and what to read.
Based on your comment, I would recommend Cloud Run.
Cloud Run is a serverless, container-based product. You write a web server (that handles your POST request), wrap it in a container, and deploy it on Cloud Run.
With a brand-new feature named "always on", the CPU is not throttled after the response is sent (throttling is the normal behavior). With always on, you keep the full CPU until the Cloud Run instance is shut down (usually after 15 minutes, but it can be sooner).
The benefit of this feature is that you can return the response to the client immediately and then continue processing your data asynchronously to store it in BigQuery (in streaming mode).
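The "respond first, keep processing afterwards" idea can be sketched roughly as below. This is only a minimal illustration, not a Cloud Run or BigQuery recipe: `BackgroundStreamer`, `submit`, `close` and the `sink` callable are hypothetical names, and `sink` stands in for whatever actually writes the rows (for instance a small wrapper around the BigQuery client's streaming insert). It also assumes the instance keeps its CPU after the response is sent, as described above.

```python
import queue
import threading
from typing import Callable, Dict, List

Row = Dict[str, object]


class BackgroundStreamer:
    """Accept rows quickly and hand them to a slow sink on a worker thread."""

    def __init__(self, sink: Callable[[List[Row]], None], batch_size: int = 50) -> None:
        # `sink` is an assumption: any callable that stores a batch of rows.
        self._sink = sink
        self._batch_size = batch_size
        self._queue: "queue.Queue" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, row: Row) -> None:
        """Called from the request handler; returns without waiting for the sink."""
        self._queue.put(row)

    def close(self) -> None:
        """Flush whatever is still queued and stop the worker."""
        self._queue.put(None)  # sentinel telling the worker to finish
        self._worker.join()

    def _run(self) -> None:
        batch: List[Row] = []
        while True:
            item = self._queue.get()
            if item is None:
                break
            batch.append(item)
            # Send when the batch is full or nothing else is waiting.
            if len(batch) >= self._batch_size or self._queue.empty():
                self._sink(batch)
                batch = []
        if batch:
            self._sink(batch)
```

In a view, the handler would call `submit(row)` and return its response right away, while the worker thread drains the queue in the background. Error handling and retries for a failing sink are left out of this sketch.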
We are developing an app that uses the Gmail API to synchronize the e-mails of our users. We are relying on watch to get change notifications through a PubSub, as recommended in the documentation.
Everything is okay, and we are receiving the notifications correctly.
However, like many software companies, we have a staging environment to test our new features. We have a staging Google OAuth2 client with a different Client ID / Client Secret to authenticate to Google, and a staging PubSub topic/subscription to receive notifications.
If I connect my Gmail account in the staging environment, everything works fine: I receive the notifications in staging. If I then connect the same Gmail account in the production environment, I receive the notifications in production, but the staging notifications stop coming. The same happens the other way round.
I thought that by using a different client and a different PubSub, we could get the notifications in both environments. That doesn't seem to be the case. Maybe Google limits the subscription per Google Cloud project?
Has anyone already run into this limitation, or does anyone have more information about it?
Best regards,
François Voron
I am trying to use Google App Engine as a mediator between a mobile platform and a popular cloud storage service. The mobile app tells App Engine which parts of a particular file it wants from the cloud storage; App Engine should then fetch that file data, process it, and extract the requested parts to send back to the mobile app. Yes, it has to be set up this way: the mobile OS is unable to read files of this particular format, but App Engine can, and this particular cloud storage is integrated with a required desktop software.
The issue: processing the file and extracting the data exceeds the 60-second response limit, and the Task Queue cannot return data to the mobile app that made the original request. In most cases, the data would be ready to return in 1-3 minutes. I realize that the Channel API could let me receive real-time messages via a web view when the data is ready, but this API is very expensive, since I would need to allow for thousands of connections a day and each user has to have their own channel, per the docs. Should I look into polling (outside the Channel API)? What design models, methods, or even other services should I look into? (I have been using GAE because of its ease of use, automatic scaling, and security; I'm a one-man show.)
The product relies on a capability that only exists in Java to process the data. Thanks.
You could return a transaction id to the client, and then let the client periodically ping your server with that id to see if the long process is complete.
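A rough sketch of that transaction-id polling pattern is shown below. It is an in-process illustration only: `JobTracker`, `start`, `status` and `poll_until_finished` are made-up names, the work runs on a plain thread, and the state lives in a dictionary. On App Engine the long process would more likely run as a queued task and the state would live in a datastore, which this sketch does not attempt to show.

```python
import threading
import time
import uuid
from typing import Any, Callable, Dict, Tuple


class JobTracker:
    """Start long-running work and let callers ask for its state by id."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: Dict[str, Tuple[str, Any]] = {}

    def start(self, work: Callable[[], Any]) -> str:
        """Kick off `work` in the background and return a transaction id at once."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = ("pending", None)
        threading.Thread(target=self._run, args=(job_id, work), daemon=True).start()
        return job_id

    def _run(self, job_id: str, work: Callable[[], Any]) -> None:
        try:
            outcome: Tuple[str, Any] = ("done", work())
        except Exception as exc:  # report the failure instead of hanging forever
            outcome = ("failed", str(exc))
        with self._lock:
            self._jobs[job_id] = outcome

    def status(self, job_id: str) -> Tuple[str, Any]:
        """What the status endpoint would return for a given id."""
        with self._lock:
            return self._jobs.get(job_id, ("unknown", None))


def poll_until_finished(tracker: JobTracker, job_id: str,
                        interval: float = 0.05, max_attempts: int = 100) -> Tuple[str, Any]:
    """Client side: ask periodically until the job is no longer pending."""
    for _ in range(max_attempts):
        state, payload = tracker.status(job_id)
        if state != "pending":
            return state, payload
        time.sleep(interval)
    return "pending", None
```

For a process that takes 1-3 minutes, the client would use a polling interval of several seconds rather than the short default used here.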
App Engine 'Backend' instances do not have the 60-second limit. You can see the comparison between a normal frontend instance and a backend instance here: https://developers.google.com/appengine/docs/java/backends/
I am currently using Google App Engine as my mobile application back end. I have a few tasks that cannot be performed in the GAE environment (mainly image recognition using OpenCV). My intention is to retain GAE and use AWS to perform these specific tasks.
Is there a simple way to pass specific tasks from GAE to AWS? E.g. a task queue?
You could either push tasks from GAE towards AWS, or have your AWS instances pull tasks from GAE.
If you push tasks from GAE towards AWS, you could use URLFetch to push your data towards your AWS instances.
If you prefer to have your AWS instances pull tasks from GAE, you could have your GAE instances put their tasks in the GAE Pull Queue, and then have your AWS instances use the Task Queue REST API to lease tasks from the queue.
In either case, the AWS instance could report back the processing result through a simple POST request to your GAE servlets, or by inserting tasks via the above-mentioned REST API, which would later be leased by your GAE instances. The latter could be useful if you want to control the rate at which your GAE app processes the results.
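To make the pull option more concrete, here is a small in-memory sketch of lease semantics: a worker leases tasks for a fixed time, deletes them once processed, and any task that is not deleted before its lease runs out becomes available to other workers again. This is not the Task Queue REST API; `LeasedPullQueue`, `Task` and their methods are hypothetical names used only to illustrate the idea.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    task_id: int
    payload: str
    # Time until which the task is reserved; -inf means "never leased".
    lease_expires_at: float = float("-inf")


class LeasedPullQueue:
    """In-memory stand-in for a pull queue with leases."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the behavior can be tested
        self._tasks: Dict[int, Task] = {}
        self._ids = itertools.count(1)

    def add(self, payload: str) -> int:
        """Producer side (the GAE app in the scenario above)."""
        task = Task(next(self._ids), payload)
        self._tasks[task.task_id] = task
        return task.task_id

    def lease(self, lease_seconds: float, max_tasks: int) -> List[Task]:
        """Worker side: reserve up to `max_tasks` tasks that nobody holds."""
        now = self._clock()
        leased: List[Task] = []
        for task in self._tasks.values():
            if len(leased) >= max_tasks:
                break
            if task.lease_expires_at <= now:
                task.lease_expires_at = now + lease_seconds
                leased.append(task)
        return leased

    def delete(self, task_id: int) -> None:
        """Worker side: confirm the task is done so it is never handed out again."""
        self._tasks.pop(task_id, None)
```

The lease is what keeps a crashed worker from losing work: an unfinished task simply reappears after the lease expires, so processing should be safe to repeat.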
Disclaimer: I'm a lead developer on the AppScale project.
One way that you could go is with AppScale - it's an open source implementation of the App Engine APIs that runs over Amazon EC2 (as well as other clouds). Since it's open source, you could alter the AppServer that we ship with it to enable OpenCV to be used. This would require you to run your App Engine app in AWS, but you could get creative and have a copy of your app running with Google, and have it send Task Queue requests to the version of your app running in AWS only when you need to use the OpenCV libraries.
Have you considered using Amazon Simple Queue Service (SQS)? http://aws.amazon.com/sqs/
You should be able to add items to the queue from GAE using a standard HTTP client.
Sure. App Engine has a Task Queue, where you can put your tasks by simply implementing DeferredTask. In that task you can make requests to AWS.
Your intention to keep the application in GAE and use AWS for the few tasks that cannot be performed in GAE seems like a reasonable approach to me.
I'd like to share a few ideas along with some resources to answer the main part of your question:
Is there a simple way to pass specific tasks from gae to AWS? E.g. A task queue?
If you need GAE and AWS to perform the task all the time (24/7), then your application will definitely depend on a batch schedule or a task queue. Both are available in GAE.
However, if you can arrange for the tasks to be collected in GAE and performed by AWS at intervals (say, twice a day for less than an hour each time), you may not need them, as long as you can have GAE put the data on Google Cloud Storage (GCS) as public.
For this scenario, you need to set up an AWS EC2 instance on an on/off schedule and let the instance run a boot script using cloud-init to collect the data through your domain, pointed to GCS (c.storage.googleapis.com), like so:
wget -q --read-timeout=0.0 --waitretry=5 --tries=400 \
--background http://your.domain.com/yourfile?q=XXX...
Once it has the data from GCS, AWS can perform these specific tasks. Then let it trigger GAE to clean up the data and put the result back on GCS, ready to be used by your mobile application back end.
Following are some options to consider:
Note that not all EC2 instance types are suitable for an on/off schedule. I recommend using EC2-VPC/EBS if you want to set up an AWS EC2 instance on an on/off schedule.
You may not need to set up EC2 at all if you can have AWS Lambda perform the task instead. The cost is lower: a task running twice a day, typically for less than 3 seconds with memory consumption up to 128 MB, typically costs less than $0.0004 USD/month.
As a result of rearranging your application in GAE and having AWS perform some of the tasks, your bill might end up rising, so try to optimize the instance class in GAE.
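The Lambda estimate above can be checked with a little arithmetic. The two prices below are assumptions, not figures from this answer: they are the commonly published on-demand rates ($0.0000166667 per GB-second and $0.20 per million requests), they ignore the free tier, and they can differ by region and over time. The function name is made up for the illustration.

```python
# Assumed on-demand prices; check the current AWS Lambda pricing page.
GB_SECOND_PRICE_USD = 0.0000166667
REQUEST_PRICE_USD = 0.20 / 1_000_000


def monthly_lambda_cost(runs_per_day: int, seconds_per_run: float,
                        memory_mb: int, days: int = 30) -> float:
    """Estimate one month of Lambda cost for a small scheduled task."""
    invocations = runs_per_day * days
    # Compute is billed by memory size times duration, in GB-seconds.
    gb_seconds = invocations * seconds_per_run * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_PRICE_USD + invocations * REQUEST_PRICE_USD


# Twice a day, 3 seconds per run, 128 MB of memory.
cost = monthly_lambda_cost(runs_per_day=2, seconds_per_run=3, memory_mb=128)
print(f"{cost:.6f}")  # prints 0.000387
```

Under these assumptions, 60 runs a month come to 22.5 GB-seconds, which is in line with the figure of less than $0.0004 USD/month.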