Snowplow NodeJS Tracker: snowplow-tracker vs snowplow-tracker-core

We have a Node.js/Express application on top of which we have implemented Snowplow analytics, and we are migrating away from Google Analytics. We now want to configure a JS tracker in the Node.js code, but we are having difficulty choosing between the two available Node.js trackers.
My question is - what are the differences between the two snowplow-tracker-* npm modules? I understand that snowplow-tracker is a more detailed implementation with more abstraction. But what are the features or level of complexity one should look at when choosing one over the other?
I'm looking at:
Complexity of application
Performance overhead between the two npm packages
Any particular features excluded from snowplow-tracker-core that one might want to use
Thanks!!

I answered this on the user group. My answer:
The core module contains shared functionality used by the client-side JavaScript Tracker, the snowplow-tracker module, and the Segment.io integration. It isn't really intended to be used directly and excludes some fairly important functionality, like methods to actually send events. You should probably use the snowplow-tracker module, also known as the Node.js Tracker.
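For reference, here is a minimal sketch of how the snowplow-tracker module is typically wired up (the collector endpoint, port, namespace and app ID below are placeholders, and the exact signatures may vary between releases, so check the README of the version you install):

    // Minimal sketch; endpoint, port, namespace and app ID are placeholders.
    var snowplow = require('snowplow-tracker');

    // The emitter buffers events and sends them to your collector.
    var emitter = snowplow.emitter(
      'collector.example.com',   // collector endpoint (placeholder)
      'http',                    // protocol
      8080,                      // port (placeholder)
      'POST',                    // method
      5,                         // buffer size: flush every 5 events
      function (error, body, response) {
        if (error) {
          console.error('Snowplow request failed', error);
        }
      }
    );

    // The tracker wraps the core functionality and adds the sending logic
    // that snowplow-tracker-core leaves out.
    var tracker = snowplow.tracker([emitter], 'my-namespace', 'my-app-id', false);

    tracker.trackPageView('http://www.example.com', 'Example page');

Everything the core module does (building events, setting common fields) happens underneath this API; the emitter is the part you would have to write yourself if you used snowplow-tracker-core directly.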

Related

Change runtime from Python to Go in App Engine standard environment

I have a website on App Engine that is 99% static. It is running on the Python 2.7 runtime. Now the time has come to evolve this webapp, and since I have almost no Python code in it, I'd prefer to write it in Go instead.
Can I change runtime from Python 2.7 to Go, while keeping the project intact? Specifically, I want to keep the same app-ID, the same custom domain attached to it, the same SSL certificate, and so on.
What do I have to do in order to do that? I surely have to change the runtime in app.yaml. Is there anything else?
Bonus question: will such change happen without a downtime?
I'd be grateful for any links to documentation on exactly that (swapping runtime on a live app). I can't find any.
Specify the new runtime as well as a new value for version in app.yaml. When deployed, you'll have an older version that is Python and a newer version that is Go. There won't be any downtime (same as when deploying a newer Python version).
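For illustration, a hedged sketch of what the Go app.yaml might look like in the legacy standard environment (the handler pattern and api_version value should be checked against the SDK release you are using):

    # app.yaml for the new Go version (sketch; values are illustrative)
    runtime: go
    api_version: go1

    handlers:
    - url: /.*
      script: _go_app

Deploying this under a new version name leaves the existing Python version serving traffic until you explicitly switch over.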
Rather than trusting links/docs (which may be out of date or not 100% exactly what you're trying to do), why not create a new GAE-Std project for testing purposes and try it yourself? Having a GAE-Std test project is good for testing new functionality (especially by other testers who won't have access to the dev environment on your laptop).
The GAE services offer complete code isolation, so it should be possible to simply deploy a new version of the service, which can be written in a different language or even use a different GAE (standard/flex) environment. Personally I didn't go through a language change, but I did go through a split of a single-service app into a multi-service one, and I see no reason why the same principles wouldn't apply.
Maybe develop the new version as a separate app first, to be able to test it properly without risking an accidental impact on the old version, and only after that bring the code in as a new version of the old app. That would be using GAE project isolation. You can, in fact, test the entire version migration as a separate app if you so desire, without even touching the existing app. I am using this technique - a separate app ID - to implement a staging environment for my app, completely isolated from my production app; see How to copy / clone entire Google App Engine Project
Make sure to not switch traffic to the new version at deployment time. This keeps the app working with the old version. Test first that the new version works as expected using Targeted routing. Then maybe use Splitting traffic across multiple versions to perform A/B testing with just a small percentage of the traffic going to the new version. Finally, when happy with the results, switch all traffic to the new version.
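As a rough sketch of that flow with the gcloud tooling (version names are placeholders; older SDKs expose the same operations through appcfg.py and the Cloud Console):

    # Deploy without routing traffic to the new version
    gcloud app deploy app.yaml --version go-1 --no-promote

    # Targeted routing: hit the new version directly for testing, e.g.
    #   https://go-1-dot-YOUR_PROJECT_ID.appspot.com

    # Split traffic, e.g. 10% to the new Go version
    gcloud app services set-traffic default --splits go-1=0.1,py-1=0.9

    # When happy with the results, send all traffic to the new version
    gcloud app services set-traffic default --splits go-1=1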
You need to pay special attention to the app-level configs (dispatch, cron, queue, datastore indexes), shared by all services/versions. They need to be functionally equivalent in the 2 versions. The service isolation doesn't apply to them, only project isolation can ensure no impact to the old version.
There should be no need to make any change to the app ID, custom domain mapping or SSL config. The above mentioned tests should confirm that.
A few potentially interesting posts related to re-working services/modules:
Converting App Engine frontend versions to modules
Google App Engine upgrading part by part
Migrating to app engine modules, test versions first?
Advantages of implementing CI/CD environments at GAE project/app level vs service/module level?

What is the general practice for an Express and React based application: keeping the server and client code in the same or different projects/folders?

I am from a Microsoft background where I always used to keep server and client applications in separate projects.
Now I am writing a client-server application with Express as the back end and React as the front end. Since I am totally a newbie to these two tools, I would like to know:
What is the general practice? Keeping the Express (server) code base and React (client) code base as separate projects, or keeping the server and client code bases together in the same project? I could not think of any pros & cons of either approach.
Your recommendations are welcome!
PS: Please do not mark this question as opinion-based; I believe I have a valid reason to ask for recommendations.
I would prefer keeping the server and client as separate projects, because that way we can easily manage their dependencies, dev dependencies and unit test files.
Also, if we need to move to a different front-end framework at a later point, we can do that without disturbing the server.
In my opinion, it's probably best to have separate projects here. But you made me think a little about the "why" for something that seems obvious at first glance, but maybe is not.
My expectation is that a project should mostly be organized one-to-one around building a single type of target, whether that be a website, a mobile app, or a backend service. Projects are usually an expression of all the dependencies needed to build or otherwise output one functioning, standalone software component. Build and testing tools in the software development ecosystem are organized around this convention, as are industry expectations.
Even if you could make the argument that there are advantages to monolithic projects that generate multiple software components, you are going against people's expectations and that creates the need for more learning and communication. So all things being equal, it's better to go with a more popular choice.
Other common disadvantages of monolithic projects:
greater tendency for design to become tightly coupled and brittle
longer build times (if using one "build everything" script)
takes longer to figure out what the heck all this code in the project is!
It's also quite possible to make macro-projects that work with multiple sub-projects, and in a way have the benefits of both approaches. This is basically just some kind of build script that grabs the output of sub-project builds and does something useful with them in a combination, e.g. deploy to a server environment, run automated tests.
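As a sketch of that kind of macro-project glue, assuming a repository with client and server sub-folders (the names here are made up), a root package.json can simply delegate to the sub-projects:

    {
      "name": "my-app-umbrella",
      "private": true,
      "scripts": {
        "build:client": "npm --prefix client run build",
        "build:server": "npm --prefix server run build",
        "build": "npm run build:client && npm run build:server",
        "test": "npm --prefix client test && npm --prefix server test"
      }
    }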
Finally, all devs should be equipped with tools that let them hop between discrete projects easily. If there are pains in doing this, it's best to solve them without resorting to a monolithic project structure.
Some examples of practices that help with developing React/Node-based software that relies on multiple projects:
The IDE easily supports editing multiple projects. And not in some cumbersome "one project loaded at a time" way.
Projects are deployed to a repository that can be easily used by npm or yarn to load in software components as dependencies.
Use "npm link" to work with editable local versions of sub-projects all at once. More generally, don't require a full publish and deploy action to have access to sub-projects you are developing along with your main React-based project.
Use automated build systems like Jenkins to handle macro tasks like building projects together, deploying, or running automated tests.
Use versioning scrupulously in package.json. Let each software component have its own version number and follow the semver convention, which indicates when changes may break compatibility.
If you have a single team (developer) working on front and back end software, then set the dependency versions in package.json to always get the latest versions of sub-projects (packages).
If you have separate teams working on the front-end and back-end software, you may want to limit the dependency versions to a given major version using a semver range in package.json. (Basically, you want some protection from breaking changes.)
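A hedged sketch of the npm link workflow and the semver ranges mentioned above (package and directory names are placeholders):

    # In the sub-project you are editing (e.g. a shared components package)
    cd ~/work/my-shared-components
    npm link

    # In the consuming React project, point at the local working copy
    cd ~/work/my-react-app
    npm link my-shared-components

    # In package.json, a caret range stays within one major version:
    #   "my-shared-components": "^1.2.0"   (accepts 1.x updates, not 2.0.0)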

Google App Engine upgrading part by part

I have a complex App Engine service that was written in PHP, and now I want to migrate it to Python part by part.
Let's say that my service has 2 parts: /signIn/.... and /data/.... I just want to migrate the /signIn/ part first, then /data/ later.
However, since my service is big, I want to build the new /signIn/ part in Python first, then use Traffic Splitting to do some A/B testing on this part.
My problem is that Traffic Splitting can be applied to versions only, so my old and new versions have to be in the same module, and the same module means they would have to be written in the same language (I was wrong here, see the updated part below). But I am migrating from PHP to Python.
What is the best solution for me?
Thanks,
Solution
With Dan Cornilescu's help, this is what I did:
Split the app into 2 modules: default and old-version.
Dispatch /signIn/ to the default module and the rest to the old-version module (see the dispatch.yaml sketch after these steps).
Make another version of /signIn/ (the default module) in Python.
Configure Traffic Splitting to slowly increase the percentage of requests going to the Python part. This allows us to test and make sure no serious bugs appear.
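A minimal dispatch.yaml sketch for the dispatch step above (a sketch only; module names follow the ones used here, and the URL patterns may need adjusting):

    # Route /signIn/ to the default module, everything else to the old PHP code.
    dispatch:
      - url: "*/signIn/*"
        module: default
      - url: "*/*"
        module: old-version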
Note:
The /signIn/ part must be in the default module, since GAE's traffic splitting works on the default module only.
I confirmed that we can have 2 versions in different languages for the same module.
One possible approach is to split your PHP app into modules as a 1st step. It's not a completely wasted effort; most of that will be needed anyway just to allow your app to work in multiple modules, unrelated to the language change. I suspect this is actually why you can't use A/B testing - a mismatch between the modules. Unavoidable.
Once the split in modules is done then you can go on with your 2nd step - switching the language for selected module(s), with A/B testing as you intended.
A braver approach is to mix the 2 and write the /signin/ module directly in Python. On the PHP side you'd just remove the /signin/ portion (part of the earlier mentioned 1st step). It should work pretty well as long as you're careful to only use language-independent means for inter-module communication/operation: request paths, cookies, datastore/memcache keys, etc. A good module split would almost certainly ensure that.
You have testing options other than A/B, like this one: https://stackoverflow.com/a/33760403/4495081.
You can also have the new code/module able to serve the same requests as the old one, side-by-side/simultaneously and using a dispatch.yaml file to finely control which module actually serves which requests. This may allow a very focused migration, potentially offering higher testing confidence.
I'm also not entirely sure you can't actually have 2 versions of the same module in different languages - the versions should be pretty standalone instances, each serving their own requests in their own way using, at the lowest layer, the language-independent GAE infra services. AFAIK nothing stops a complete app re-write and deployment, with the same version or not - I've done that when learning GAE. But I didn't switch languages, it's true. I'd give it a try, but I don't have time to learn a new language right now :)

Parallel Module Deployment using App Engine SDK

TL;DR Is there a way to deploy App Engine modules in parallel?
I've built a Go application using Google's App Engine SDK for Go. This application defines multiple modules. These modules are self-contained and do not depend on one another.
When I attempt to deploy the modules to the Google Cloud, I can't help but notice that the modules are uploaded sequentially. This would be fine if deployment were relatively quick, but each module requires its own redundant compilation of the Go binary. Hence, on top of the regular upload time, I have to wait for my app to compile [module count] x [compilation time] every time I want to deploy.
The obvious (quick) solution is to deploy in parallel, so I created a simple bash script to deploy each module independently. The problem I immediately encountered with this "solution" was an HTTP 500 response from the App Engine API. The whole umbrella application, spanning all the modules, seems to "lock" whenever any individual module is updated. This scenario creates a race condition, under which only the first module to trigger a deploy succeeds and the others fail.
I fear that this is a holdover from the legacy languages on App Engine. Since every module uses the same Go binary, there is no real need to compile the same code multiple times. Repeated compilation is redundant, and there is no way to circumvent the lock.
One hypothetical solution, which I have only a vague understanding of, is to compile in parallel and deploy in series. I imagine that this approach would involve taking apart the configuration tool and reworking it to execute in the aforementioned manner - though I can't say for sure (yet).
Any help here would be much obliged. Thanks!
You can deploy to another "version" of your App Engine app, and then, when all modules are deployed, do a very fast version switch.
Versions also allow for traffic splitting if you need/want that kind of thing.
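A rough sketch of that flow with the gcloud tooling (the module yaml file names and the version label are placeholders; whether the per-app lock still serializes the uploads themselves is worth verifying against the current SDK):

    # Deploy every module under one new version, without shifting traffic
    gcloud app deploy module1.yaml module2.yaml module3.yaml --version v2 --no-promote

    # Once all modules are up, switch traffic to the new version
    gcloud app versions migrate v2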

Merge standalone webapp and GAE in Go

I'm working on a very simple web app, written in Go language.
I have a standalone version and am now porting it to GAE. It seems like only very small changes are needed, mainly concerning the datastore API (in the standalone version I just need files).
I also need to include appengine packages and use init() instead of main().
Is there any simple way to merge both versions? As there is no preprocessor in Go, it seems like I must write a GAE-compatible API for the standalone version, use this mock module for the standalone build, and use the real API for the GAE version. But that sounds like overkill to me.
Another problem is that GAE might be using an older Go version (e.g. the recent Go release uses the new template package, but GAE uses the older one, and they are incompatible). So, is there any chance to handle such differences at build time or at runtime?
Thanks,
Serge
UPD: GAE now uses the same Go version (r60) as the stable standalone compiler, so the abstraction layer can be really simple now.
In broad terms, use abstraction. Provide interfaces for persistence, and write two implementations for that, one based on the datastore, and one based on local files. Then, write a separate main/init module for each platform, which instantiates the appropriate persistence interface, and passes it to your main application to use.
My immediate answer would be (if you want to maintain both GAE and non-GAE versions) that you use a reliable VCS which is good at merging (probably git or hg), and maintain separate branches for each version. The GAE API fits in reasonably well with Go, so there shouldn't be too many changes.
As for the issue of different versions, you should probably maintain code in the GAE version and use gofix (which is unfortunately one-way) to make a release-compatible version. The only place where this is likely to cause trouble is if you use the template package, which is in the process of being deprecated; if necessary you could include the new template package in your GAE bundle.
If you end up with GAE code which you don't want to run on Google's servers, you can also look into AppScale.
