Reactive REST (WebFlux) on Google App Engine: how to deploy?

I have a reactive REST application that uses Spring Boot WebFlux, with Jetty as the backend server.
How do I expose Flux and Mono REST endpoints through App Engine? Is there a way to do it?
Does Google App Engine support its own reactive Java-based API endpoints?

Just deploy it the way you would normally deploy your application. App Engine is, in effect, a managed container platform. What it provides is therefore not a custom runtime but an operating system (a lightweight version of Linux from Google) plus scaling, networking, and security.
The reactive REST part comes from two things:
Number 1: how the OS (any OS, Windows does it too) performs I/O operations (reading/writing from sockets, reading/writing from disk, etc.), which are inherently asynchronous.
Number 2: how Java interacts with I/O. Before Java 1.4, I/O was artificially blocked by the Java runtime. Java 1.4 introduced NIO (New I/O, usually read as non-blocking I/O), and Java 7 extended it with NIO.2. That gave Java programmers the ability to interact with the OS I/O multiplexer (select(), poll(), and similar calls) through the Selector API.
Servers like Netty use that API to avoid the thread-per-request model that was such a bottleneck for scaling.
You have to be careful, though, because Tomcat as used by Spring MVC still defaults to the thread-per-request servlet model. That is why the default HTTP server for spring-webflux is Netty, while for spring-mvc it is Tomcat.
Bottom line: App Engine changes nothing. If you deploy your app to any OS that supports NIO, you are good to go.
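The multiplexing idea is not specific to Java. As a minimal sketch, here is the same pattern with Python's `selectors` module, which stands in for Java NIO's `Selector` (this is not what Netty does internally). The function name `serve_ready_connections` and the upper-casing "handler" are made up for illustration, and in-process socket pairs stand in for real HTTP clients.

```python
import selectors
import socket


def serve_ready_connections(server_sockets):
    """Answer one request per connection from a single thread.

    The selector reports which sockets are readable, so the thread never
    sits blocked on any one client.
    """
    sel = selectors.DefaultSelector()
    for sock in server_sockets:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    handled = 0
    while handled < len(server_sockets):
        events = sel.select(timeout=1.0)
        if not events:  # nothing became readable in time; give up
            break
        for key, _mask in events:
            conn = key.fileobj
            request = conn.recv(1024)
            conn.sendall(request.upper())  # trivial stand-in for a request handler
            sel.unregister(conn)
            handled += 1
    sel.close()
    return handled


# Three in-process connections stand in for three clients (assumption for the demo).
pairs = [socket.socketpair() for _ in range(3)]
for i, (_server_side, client) in enumerate(pairs):
    client.sendall(f"request {i}".encode())

handled = serve_ready_connections([server_side for server_side, _ in pairs])
replies = [client.recv(1024).decode() for _, client in pairs]
print(handled, replies)

for server_side, client in pairs:
    server_side.close()
    client.close()
```

One thread serves all three connections here; a thread-per-request server would have needed three.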

Related

Does Google App Engine support multiprocessing via python, and does the DB support multiple writes in localhost?

Regarding a production environment, I would like to know whether the Python standard environment (2.7) on Google App Engine supports code that uses multiprocessing and pooling with Google's Datastore. Or should MapReduce be used instead?
Regarding the development environment on localhost, I would also like to know how to avoid a database lock when writing to the same database from processes started in different shell terminals.
Thanks
You can have a look at this post on Google Groups, where it is confirmed that multiprocessing is not available in Google App Engine (GAE) Standard environment, but you can implement it in GAE Flexible. You might also be interested in this post about parallel execution in GAE, and Tasklets in particular with a Cloud Datastore example.
Regarding database lock:
Updates are actually done within a datastore transaction, and NDB by default will retry the operation three times before failing altogether. It is recommended that you update an entity group at most once per second. If you are seeing database locks, then you're probably doing something wrong. We implemented a version of the "fork-join queue" described by Brett Slatkin in his 2010 data pipelines talk, which is a method of "joining" many updates to the same entity so that they can all be applied at once at a controlled rate: https://www.youtube.com/watch?v=zSDC_TU7rtc&feature=youtu.be&t=33m37s
Also, see the discussion going on here:
How to deal with eventual consistency in fork-join-queue
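To make the "join many updates into one write" idea concrete, here is a simplified in-memory sketch. It is not Slatkin's task-queue implementation and does not use NDB. `UpdateBatcher` is a hypothetical name, and a plain dict stands in for the datastore.

```python
import threading
from collections import defaultdict


class UpdateBatcher:
    """Collects increments per entity key and applies them as one write per flush."""

    def __init__(self, store):
        self._store = store                # dict standing in for the datastore (assumption)
        self._pending = defaultdict(int)   # key -> summed, not yet written delta
        self._lock = threading.Lock()
        self.writes = 0                    # number of simulated datastore writes

    def add(self, key, delta):
        # Cheap and safe to call from many threads; no datastore access here.
        with self._lock:
            self._pending[key] += delta

    def flush(self):
        # Swap out the pending batch, then write each entity exactly once.
        with self._lock:
            batch, self._pending = self._pending, defaultdict(int)
        for key, delta in batch.items():
            self._store[key] = self._store.get(key, 0) + delta
            self.writes += 1
        return len(batch)


def bump(batcher, times):
    for _ in range(times):
        batcher.add("counter", 1)


store = {}
batcher = UpdateBatcher(store)
workers = [threading.Thread(target=bump, args=(batcher, 25)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

flushed = batcher.flush()
print(store, batcher.writes)
```

A hundred increments end up as a single write. In a real setup the flush would run on a schedule that respects the one-write-per-second guideline for an entity group.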

Is Socket.io Ideal for chat module

I am working on an AngularJS and Node.js based application for use within an organization, and I have to implement chat functionality in it. As far as I know, Socket.io is the best solution for instant messaging and reliability. But I have a few doubts about it. As I understand it, when we use socket programming (Socket.io in my case), a port is reserved for each and every connection. What if the organization is very large? Will it still work? On the server side I am using Express.js. Will Socket.io create extra load on the server?
Should I go with Socket.io or HTTP?
Thanks.
HTTP polling for any sort of interactive timing is enormously inefficient. You will have tens of thousands of clients repeatedly asking your server, "do you have anything new for me?" and the server regularly responding "no, nothing yet".
webSockets (which socket.io uses as the transport) were invented precisely because they are more efficient for two way, interactive communication than HTTP polling.
Modern servers can be configured to handle hundreds of thousands of simultaneous webSocket connections. How many a single server of yours can actually handle in the real-life working of your application depends upon dozens of factors, none of which you've disclosed in your question. But selecting webSocket/socket.io is not a bad architectural choice for two-way chat. That's the kind of application it was invented for, because it's generally better than HTTP polling at that sort of thing.
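A back-of-the-envelope sketch of why polling is so wasteful. The workload numbers (20,000 clients, a poll every 2 seconds, 30 messages per client per hour) are assumptions chosen for illustration, and the function names are made up.

```python
def polling_requests_per_second(clients, poll_interval_s):
    """Requests the server must answer each second when every client polls."""
    return clients / poll_interval_s


def empty_poll_fraction(messages_per_client_per_hour, poll_interval_s):
    """Share of polls answered with 'no, nothing yet'.

    Assumes each useful poll delivers exactly one message.
    """
    polls_per_hour = 3600 / poll_interval_s
    useful_polls = min(messages_per_client_per_hour, polls_per_hour)
    return 1 - useful_polls / polls_per_hour


# Assumed workload: 20,000 clients, one poll every 2 s, 30 messages per client per hour.
rate = polling_requests_per_second(20_000, 2)
wasted = empty_poll_fraction(30, 2)
print(f"{rate:.0f} requests/s, {wasted:.1%} of polls return nothing")
```

Under these assumptions the server answers 10,000 requests every second, and roughly 98% of them carry no data. With a persistent webSocket connection, the server sends only when there is a message.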
See these references:
What are the pitfalls of using Websockets in place of RESTful HTTP?
Ajax vs Socket.io
Can this technology stack scale?
Do HTML WebSockets maintain an open connection for each client? Does this scale?
600k concurrent websocket connections on AWS using Node.js
Node.js w/1M concurrent connections!
HTML5 WebSocket: A Quantum Leap in Scalability for the Web
For beginners, chat using socket.io is really simple to understand and integrate. However, the amount of bandwidth will depend heavily on the amount of data you're going to send from the server, and how much data the client will send. The bandwidth usage will also depend on which Socket.IO transport you're using, and the heartbeat interval of your application.
The performance impact also varies with the type of application you're running and the capability of your machine and/or network. However, 5000+ clients will have a considerable impact on performance regardless of your computer's capabilities, unless you are scaling the application across multiple cores.
You can refer to this link for more details.
Go with Socket.io. It is very well suited to highly interactive applications like a chat module. With WebSocket, after the initial handshake there is no per-message negotiation, and the connection remains open for as long as the user stays connected to the server. The per-message overhead is significantly less than with HTTP/HTTPS requests.

Platform as a Service to handle tens of thousands of simultaneous long term network connections

Is there a Platform as a Service (PaaS, e.g. Google App Engine or Windows Azure) that for a reasonable cost can be used to run a server for relaying peer to peer "real time" communication between clients?
This system will in my case be used to relay (small amounts of) network traffic to and from small home automation gadgets with limited resources programmed in embedded C, to Android and iOS apps. In a few years I expect several tens of thousands of simultaneous connections.
The reason I am looking for a PaaS solution and not IaaS is that I would like to minimize the time and expertise needed for virtual computer, OS and server application maintenance.
Because of the resource constraints of the home automation gadget, a solution like PubNub is not possible. I have a few thousand bytes of available program flash for my embedded C code, so the protocol used would have to be pretty basic (e.g. raw TCP or UDP, HTTP or WebSockets).
Using "long polling" with Google App Engine (GAE) would be too expensive, as they bill for the whole duration of the connection even if almost no traffic is transferred. GAE supports sockets, but only outgoing sockets and not listening sockets on the server. Is it possible to get around this limitation somehow, e.g. by sending a UDP packet to GAE first (to punch a hole in the user's firewall) and then having GAE initiate an outgoing socket back to the home automation gadget or Android/iOS app?
Or do you see any other possible solutions using the PaaS aspects of Windows Azure or other PaaS providers?
Any tips or possible solutions are greatly appreciated!
AMQP seems like it would fit your protocol needs, and the Apache Qpid/Proton project has some client libraries; their C code might meet your needs. On the service side you could test things out using Azure Service Bus, since it speaks AMQP. If that didn't meet your needs, you could host a worker role and run one of the AMQP clients in there.
Another option to consider is ZeroMQ. It has a lot of very simple client APIs, and building a relay service that runs in a worker role would be a trivial amount of code (see the Java sample and the C# sample). Those samples use an "inproc" transport, and I'm guessing you would want to switch that to TCP.
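To give a feel for how small such a relay can be, here is a rough sketch using plain TCP and Python's `asyncio` rather than ZeroMQ or AMQP. The `Relay` class, the one-line "channel id" handshake, and the `gadget-42` id are all assumptions made up for this example: the first two peers that send the same id get their byte streams connected to each other.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from one peer to the other until the source closes."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


class Relay:
    def __init__(self):
        self.waiting = {}  # channel id -> (reader, writer) of the first peer

    async def handle(self, reader, writer):
        # Hypothetical handshake: the first line names the channel to join.
        channel = (await reader.readline()).strip()
        peer = self.waiting.pop(channel, None)
        if peer is None:
            self.waiting[channel] = (reader, writer)  # wait for the other side
            return
        peer_reader, peer_writer = peer
        await asyncio.gather(pipe(reader, peer_writer), pipe(peer_reader, writer))


async def demo():
    relay = Relay()
    server = await asyncio.start_server(relay.handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # The "gadget" connects first and announces its channel.
    gadget_reader, gadget_writer = await asyncio.open_connection("127.0.0.1", port)
    gadget_writer.write(b"gadget-42\n")
    await gadget_writer.drain()

    # The "app" joins the same channel and sends a command.
    _app_reader, app_writer = await asyncio.open_connection("127.0.0.1", port)
    app_writer.write(b"gadget-42\nlamp on\n")
    await app_writer.drain()

    message = await asyncio.wait_for(gadget_reader.readline(), timeout=2)
    gadget_writer.close()
    app_writer.close()
    server.close()
    return message


print(asyncio.run(demo()))
```

A production relay would also need authentication, timeouts for peers that never get a partner, and cleanup of abandoned channels. None of that is shown here.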

Is there a framework for distributing browser automation testing over a cluster of EC2 instances?

I am aiming to simulate a large number of 'real users' hitting and realistically using our site at the same time, and ensuring they can all get through their use cases. I am looking for a framework that combines some EC2 grid management with a web automation tool (such as GEB/WATIR). Ideal 'pushbutton' operation would do all of this:
Start up a configurable number of EC2 instances (using a specified AMI preconfigured with my browser automation framework and test scripts)
Start the web automation framework test(s) running on all of them, in parallel. I guess they would have to be headless.
Wait for completion
Aggregate results
Shut down EC2 instances.
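The shape of that 'pushbutton' workflow can be sketched as follows. This is only an outline: `run_campaign` and the `launch`, `run_tests`, and `terminate` callables are hypothetical placeholders for whatever EC2 client and browser automation tool you end up using, and the demo uses fakes instead of real instances.

```python
from concurrent.futures import ThreadPoolExecutor


def run_campaign(instance_count, launch, run_tests, terminate):
    """Launch instances, run tests on all in parallel, aggregate, always clean up."""
    instances = []
    try:
        for _ in range(instance_count):
            instances.append(launch())
        # pool.map blocks until every test run has finished (the "wait" step).
        with ThreadPoolExecutor(max_workers=instance_count) as pool:
            results = list(pool.map(run_tests, instances))
    finally:
        # Shut down whatever was started, even if a test run raised.
        for instance in instances:
            terminate(instance)
    passed = sum(1 for ok in results if ok)
    return {"instances": len(instances), "passed": passed, "failed": len(results) - passed}


# Fakes standing in for real cloud and browser calls (assumptions for the demo).
launched_ids = iter(range(100))
terminated = []


def fake_launch():
    return next(launched_ids)


def fake_run_tests(instance_id):
    return instance_id % 2 == 0  # pretend even-numbered instances pass


summary = run_campaign(4, fake_launch, fake_run_tests, terminated.append)
print(summary, sorted(terminated))
```

The `finally` block matters most here, because instances left running after a failed test run keep costing money.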
While not a framework per se, I've been really happy with http://loader.io/
It has an API for your own custom integration, reporting and analytics for analysis.
PS. I'm not affiliated with them, just a happy customer.
However, in my experience, you need to do both load testing and actual client testing. Even loader.io will only hit your service from a handful of hosts. And, it skips a major part (the client-side performance from a number of different clients' browsers).
This video has more on that topic:
http://www.youtube.com/watch?v=Il4swGfTOSM&feature=youtu.be
BrowserMob used to offer such a service. It looks like they got acquired.

Anyone up to creating a tomcat based alternative for GAE?

If we had the possibility to run GAE app without any code change on our servlet engine that would be great because:
if Google changes its billing policy, or if the current policy doesn't fit our app's needs, we can just move to our own server
we can do things that are not allowed in GAE, accepting the compromise of 1 JVM and 1 DB
We don't actually need a distributed system, but rather a real-time system with synchronization, true locking mechanisms, other servers/software installed on the server machine, a socket interface, etc.
Such a package should include at least:
TomCat (or equivalent)
DataNucleus Access Platform
(Task Queue service)
Any idea whether it's easy to build such a thing, or whether it already exists somewhere?
Thanks
Good question - GAE is excellent, but it has considerable limitations, so I think it is a good idea to keep your options open. With that in mind here are some options.
http://appscale.cs.ucsb.edu/
"AppScale is a platform that allows users to deploy and host their own Google App Engine applications. It executes automatically over Amazon EC2 and Eucalyptus as well as Xen and KVM. It has been developed and is maintained by the RACELab at UC Santa Barbara."
There is also TyphoonAE but it is Python specific so probably not useful for you.
Also take note of the Siena project...
http://www.sienaproject.com/index.html
This is supposed to provide GAE/J users with a persistence API that is better suited to the GAE Datastore than JDO/JPA, but is still portable to other platforms.
