Google Application Engine - Using URL Fetch Service

I've looked at http://code.google.com/appengine/docs/java/urlfetch/overview.html
but the code does not show a pooling example.
I mean, if I want to fetch www.example.com/1.html, www.example.com/2.html, www.example.com/3.html, ..., www.example.com/1000.html,
I'd have to open and close 1000 connections.
I think I could just open one 'keep-alive' connection, issue 1000 requests, and then close it.
That should be faster,
but I have no idea how to do that using url.openStream().

The URLFetch service operates at a higher level of abstraction than individual connections, and the native Python and Java libraries that use it are modified to use this service. As such, you have no direct control over connections - but you can expect that the underlying service will keep connections open when it deems it appropriate.
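For comparison, this is roughly what the connection reuse described in the question looks like outside App Engine, where you do manage connections yourself. It is only a sketch of HTTP keep-alive using Python's standard library, not something the URL Fetch service exposes. The host and paths are the question's placeholders, and build_paths, fetch_all and conn_factory are names invented for this illustration.

```python
import http.client


def build_paths(count):
    """Return ['/1.html', '/2.html', ..., '/<count>.html']."""
    return ["/%d.html" % i for i in range(1, count + 1)]


def fetch_all(host, paths, conn_factory=http.client.HTTPConnection):
    """Fetch every path over one connection, relying on HTTP/1.1 keep-alive.

    conn_factory is a hypothetical hook so the function can be tried
    without a network; by default it creates a real HTTP connection.
    """
    conn = conn_factory(host)
    results = {}
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read in full before the next request
            # can be sent on the same connection.
            results[path] = (response.status, response.read())
    finally:
        conn.close()
    return results
```

A call such as fetch_all("www.example.com", build_paths(1000)) would then issue all requests through one connection object, provided the server keeps the connection open.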

Unfortunately, as the docs for Java App Engine say, at this time "The Java API for the URL Fetch service only supports synchronous requests". The Python version of App Engine does support async requests, so, if porting to Python is just unthinkable, you may wait in the reasonable hope that such functionality will eventually be in the Java side too. After all, the Python version has been around for a year more, so of course it's more mature, stable, and function-rich.
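On the Python side, asynchronous fetches are started with urlfetch.create_rpc and urlfetch.make_fetch_call and collected with get_result. The following is a minimal sketch under that assumption; fetch_concurrently and its urlfetch parameter are invented here so the function can also be tried with a stand-in object outside the App Engine runtime.

```python
def fetch_concurrently(urls, urlfetch=None, deadline=10):
    """Start one asynchronous fetch per URL, then collect the status codes.

    urlfetch defaults to App Engine's google.appengine.api.urlfetch module;
    the parameter is a hypothetical hook for trying the sketch elsewhere.
    """
    if urlfetch is None:
        # Only importable inside the App Engine Python runtime or SDK.
        from google.appengine.api import urlfetch

    pending = []
    for url in urls:
        rpc = urlfetch.create_rpc(deadline=deadline)
        urlfetch.make_fetch_call(rpc, url)  # returns without waiting
        pending.append((url, rpc))

    # get_result() blocks until that particular fetch has finished.
    return {url: rpc.get_result().status_code for url, rpc in pending}
```

All requests are in flight before the first get_result call, so the total wait is closer to the slowest fetch than to the sum of all of them.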

Related

Can I use a WebSocket protocol to send and receive data from Cloud Firestore using an ESP32S3 Using ESP-IDF C

Google is deprecating Cloud IoT, so that is not an option.
https://cloud.google.com/iot/docs/release-notes
"Cloud IoT Core will be retired on August 16, 2023. After August 15, 2023, the documentation for IoT Core will no longer be available."
I would like to use Firebase - Firestore for my backend. It takes all the hassles out of keeping a server up and running, scalability etc.
I managed to send data after login and authentication from an ESP32S3 using ESP-IDF in C (note: not Arduino, and not C++), and would like to know whether I can use a WebSocket for the communication instead once authentication is done. If so, can you give me a code example or pointers?
With a websocket, I can send data to my own server hosted in Europe, in less than 400ms.
With Firestore, there is a large HTTP header, that includes the API key, and also the Auth Token, a large amount of data, quite a lot of handshaking going on over HTTPS, and eventually the data is sent. This takes more than 1400ms.
We are weighing items in a farming scenario, and need to weigh very frequently, and the 1400ms with fast internet is not acceptable.
So if I could still go with Firebase Authentication, and Firestore for data, I would probably be able to get it even faster than 400ms if I could use a WebSocket client connection with the Firestore document store. If needed, I can use the Refresh Token to renew the Auth Token every 3600s as Firebase requires, and thus keep the socket connection up. That also takes quite long, but it is less of a hassle because it happens only about once every 55 minutes.
Any pointers or advice will be appreciated.

Firestore supports multiple SDKs and wire protocols, but none of them work over web sockets. The closest you can get with Firestore would be its REST API, which is documented here. It's not the easiest protocol to work with though, so I recommend using the API explorer that is built into the documentation to create examples for yourself.
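As a rough illustration of what a REST call to create a document looks like, here is a sketch in Python rather than ESP-IDF C. It only builds the request and does not send it. The project ID, collection name, field names and token are placeholders, and to_firestore_fields and build_create_request are names invented for this example.

```python
import json
import urllib.request

FIRESTORE_ROOT = "https://firestore.googleapis.com/v1"


def to_firestore_fields(reading):
    """Wrap plain Python values in Firestore's typed-value JSON encoding."""
    fields = {}
    for key, value in reading.items():
        if isinstance(value, bool):
            fields[key] = {"booleanValue": value}
        elif isinstance(value, int):
            # 64-bit integers travel as strings in the REST encoding.
            fields[key] = {"integerValue": str(value)}
        elif isinstance(value, float):
            fields[key] = {"doubleValue": value}
        else:
            fields[key] = {"stringValue": str(value)}
    return fields


def build_create_request(project_id, collection, reading, id_token):
    """Build (but do not send) a POST that creates one document."""
    url = "%s/projects/%s/databases/(default)/documents/%s" % (
        FIRESTORE_ROOT, project_id, collection)
    body = json.dumps({"fields": to_firestore_fields(reading)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The Firebase Authentication ID token is sent as a bearer token.
            "Authorization": "Bearer " + id_token,
            "Content-Type": "application/json",
        },
    )
```

The same URL, headers and JSON body apply from any HTTP client, including one on the ESP32. Reusing one TLS connection for consecutive requests, where the client supports it, avoids repeating the handshake for every reading.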

Long-running script on Google App Engine

I'm attempting to create a microservice on Google App Engine that is not intended to handle HTTP requests.
Instead, I was hoping to have a continuously running Python script that monitors a remote queue--RabbitMQ, to be precise--and sends out an api-call to another service as tasks are pushed to the queue.
I was wondering, firstly, is it possible to run a script upon deployment--one that did not originate with a user action/request?
Secondly, how would I accomplish this?
Thanks in advance for your time!

You can deploy your "script" as a manually scaled module -- see https://cloud.google.com/appengine/docs/python/modules/ -- with exactly one instance. As the docs say, "When you start a manual scaling instance, App Engine immediately sends a /_ah/start request to each instance"; so, just set that module's handler for /_ah/start to the handler you want to run (in the module's yaml file and the WSGI app in the Python code, using whatever lightweight framework you like -- webapp2, falcon, flask, bottle, or whatever else... the framework won't be doing much for you in this case save the one-off routing).
Note that the number of free machine hours for manual scaling modules is limited to 8 hours per day (for the smaller, B1 instance class; proportionally fewer for larger instance classes), so you may need to upgrade to paid-app status if you need to run for more than 8 hours.
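A minimal sketch of such a handler, assuming a plain WSGI app with no framework. make_app and run_worker are hypothetical names; run_worker stands for whatever function holds the script's main loop.

```python
def make_app(run_worker):
    """Return a WSGI app whose /_ah/start handler runs run_worker().

    run_worker is a hypothetical callable holding the script's main loop.
    """
    def application(environ, start_response):
        if environ.get("PATH_INFO") == "/_ah/start":
            run_worker()  # returns only when the loop ends
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"worker finished"]
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    return application
```

The module's yaml file would then route /_ah/start to the object returned by make_app, for example a module-level application = make_app(main_loop), where main_loop is your own function.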

Like @brant said, App Engine is designed to handle HTTP requests. It's not a perfect fit for background jobs, unless you wrap your logic into one HTTP request.
Further, App Engine will emit an error when the response times out; the limit depends on your scaling settings. If you want to try it, consider basic or manual scaling.
For this type of workload, I would suggest you use a VM.

I think there are a few problems with this design.
First, App Engine is designed to be an HTTP request processor, not a RabbitMQ message processor. GAE is intended for many small requests, not one long-running process.
Second, "RabbitMQ should not be exposed to the public internet, it wasn't created for such use case."
I would recommend that you keep the RabbitMQ clients on the same internal network as the RabbitMQ broker, and have the clients send HTTP requests to App Engine.
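A sketch of that arrangement, assuming the third-party pika client for RabbitMQ and a hypothetical App Engine endpoint URL. build_forward_request and consume_forever are names invented for this example, and the queue name and host are placeholders.

```python
import urllib.request


def build_forward_request(body, endpoint):
    """Build the HTTP POST that hands one queue message to App Engine."""
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


def consume_forever(queue_name, endpoint, host="localhost"):
    """Consume queue_name and forward each message; blocks until interrupted."""
    # Third-party client, imported lazily so the helper above works without it.
    import pika

    def on_message(channel, method, properties, body):
        # urlopen raises on 4xx/5xx responses, so the ack below is
        # reached only after App Engine has accepted the task.
        urllib.request.urlopen(build_forward_request(body, endpoint)).close()
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.basic_consume(queue=queue_name, on_message_callback=on_message)
    channel.start_consuming()
```

This consumer runs next to the broker on the internal network, so only outbound HTTP leaves it and RabbitMQ itself is never exposed.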

Mobile application backend

I'm currently developing a mobile application that will fetch data from server by request (page load) or by notification received (e.g. GCM).
Currently I'm starting to think about how to build the backend for that app.
I thought about using PHP to handle the HTTP requests to my database (MySQL) and return the response as JSON. As I see it, there are many ways to implement such a server, and I would like to hear thoughts about my ideas for implementations:
1. Create a single PHP page that will receive an Enum/Query, execute it, and send the results.
2. Create a PHP page for every query that needs to be made.
Which of my implementations should I use? If none, please suggest another. Thank you.
P.S. This server will only be used as a fetcher for SQL and push notifications. If you have any suggestions or past experience about how to do this (framework, language, anything that comes to mind), I'd be happy to learn.

You can use the PHP REST Data Services framework: https://github.com/chaturadilan/PHP-Data-Services

I am also looking for information about how to power a web and mobile application that has to get and save data on the server.
I've been working with a PHP framework, Yii, and I know that this framework, like others, lets you create an API/web service.
APIs can be SOAP or REST; you should read about the differences between them to see which is best for mobile. I think the most important difference is that for SOAP you need a SOAP client library on the device you are connecting from, whereas for REST you just make an HTTP request to the URL.
I have built a SOAP API with Yii, which is quite easy, and I have used it to communicate between two websites that get and put data in the same database.
As for your question about using one file or multiple files for every request: in the case of SOAP built on Yii, you normally define all the functions available to the API on the server side in a single file (a controller), and to connect to that web service you end up doing:
$client = new SoapClient("url/of/webservice");
$result = $client->methodName($param1, $param2); // plus any further parameters
So basically, from your client you can run any method defined on the server side with the parameters you wish.
Assuming that you are used to programming PHP in the "classic way", I suggest you start learning a framework. There are many reasons to do so, but in the end it is because the code will be cleaner and more stable. For example:
You shouldn't be writing manual queries (though sometimes you must); instead, you can use the framework's models to handle data validation and storage in the database.
Here are some links:
http://www.larryullman.com/series/learning-the-yii-framework/
http://www.yiiframework.com/doc/guide/1.1/en/topics.webservice
http://www.yiiframework.com/wiki/175/how-to-create-a-rest-api/
As I said, I am also looking to learn how to better power a mobile application. I know this can be achieved with an API, but I don't know if that is the only way.

create a single php page that will receive an Enum/Query, execute and send the results.
I created a single PHP file named api.php that does exactly this; the project is named PHP-CRUD-API. It is quite popular.
It should take care of the boring part of the job and provide a sort of framework to get started.
Talking about frameworks: you can integrate the script in Laravel, Symfony or SlimPHP to name a few.
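To make the single-endpoint idea concrete, here is a minimal sketch, written in Python rather than PHP and using the standard library's sqlite3 as a stand-in for MySQL. The table, the query names and handle_request are all invented for illustration. The point is that the client sends the name of a predefined query plus parameters, never raw SQL.

```python
import json
import sqlite3

# Hypothetical whitelist: the client sends a query *name*, never raw SQL.
QUERIES = {
    "user_by_id": "SELECT id, name FROM users WHERE id = ?",
    "all_users": "SELECT id, name FROM users ORDER BY id",
}


def handle_request(conn, name, params=()):
    """Run the named query and return the rows as a JSON string."""
    sql = QUERIES.get(name)
    if sql is None:
        return json.dumps({"error": "unknown query"})
    # Parameters are bound by the driver, not concatenated into the SQL.
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return json.dumps({"rows": rows})
```

Adding a new query then means adding one entry to the whitelist instead of creating another page.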

Apache Httpclient and the Cloud

I want to put a scraping service that uses Apache HttpClient in the cloud. I read that problems are possible with Google App Engine, as its direct network access and thread creation are prohibited. What about other cloud hosting providers? Does anyone have experience with Apache HttpClient in the cloud?

AppEngine has threads and direct network access (HTTP only). There is a workaround to make it work with HttpClient.
Also, if you plan to use many parse tasks in parallel, you might check out Task Queue or even mapreduce.
Btw, there is a "misfeature" in GAE that you can not fully set custom User-agent header on your requests - GAE always adds "AppEngine" to the end of it (this breaks requests to certain sites - most notably iTunes).

It's certainly possible to create threads and access other websites from CloudFoundry; you're just time limited for each process. For example, if you take a look at http://rack-scrape.cloudfoundry.com/, it's a simple rack application that inspects the 'a' tags from Google.com:
require 'rubygems'
require 'open-uri'
require 'hpricot'
run Proc.new { |env|
  doc = Hpricot(open("http://www.google.com"))
  anchors = (doc/"a")
  [200, {"Content-Type" => "text/html"}, [anchors.inspect]]
}
As for Apache HttpClient, I have no experience with it, but I understand the original Commons HttpClient is no longer maintained; it was superseded by Apache HttpComponents.

http request from Google App Engine

I'm trying to make http requests from my Google App Engine webapp, and discovered I have to use URLConnection since it's the only whitelisted class. The corresponding Clojure library is clojure.contrib.http.agent, and my code is as follows:
(defroutes example
  (GET "/" [] (http/string (http/http-agent "http://www.example.com")))
  (route/not-found "Page not found"))
This works fine in my development environment- the browser displays the text for example.com. But when I test it out with Google's development app server:
phrygian:example wei$ dev_appserver.sh war
2010-09-28 14:53:36.120 java[43845:903] [Java CocoaComponent compatibility mode]: Enabled
...
INFO: The server is running at http://localhost:8080/
It just hangs when I load the page. No error, or anything. Any idea what might be going on?

http-agent creates threads so that might be why it does not work.
From the API documentation:
Creates (and immediately returns) an Agent representing an HTTP
request running in a new thread.
You could try http-connection, which is a wrapper around HttpURLConnection, so this should work.
Another alternative is to try clj-http. The API seems to be a bit more high-level, but it uses Apache HttpComponents which might be blacklisted.
I am guessing http.async.client is a definite no-go due to its strong asynchronous approach.

You might want to try appengine.urlfetch/fetch from appengine-clj (http://github.com/r0man/appengine-clj, also in clojars)
