exporting data for analytics use in SaaS - export

We are a SaaS product and we would like to offer per-user data exports that can be used with various analytical (BI) tools like Tableau or PowerBI. Instead of managing all those exports manually, we thought of using a cloud database such as AWS Redshift (which would be part of our service). But then it is not clear how a user would access those databases, unless we do some kind of SSO integration with AWS.
So - what is the best practice for exporting data for analytics use in SaaS products?

In this case you can build your security into your backend API layer.
First, set up processes to load your data into Redshift, then make sure that only your backend API server/cluster has access to Redshift (e.g. through a VPC with no external IP access to Redshift).
Now that you have your data, you can validate your user as usual through your backend service. When a user requests a download through the backend API, the backend can create a query that extracts from Redshift only the correct data based on the user's security role. To make this possible you may need to build some kind of security column into your Redshift data model.
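A minimal sketch of that backend-side filtering, in Python. Everything named here is an assumption for illustration: the `orders` table, the `tenant_id` security column, the `USER_TENANTS` lookup, and the in-memory `sqlite3` database that stands in for Redshift so the example runs without a cluster. Note that a Redshift driver may use a different parameter placeholder than sqlite3's `?`.

```python
import csv
import io
import sqlite3

# Hypothetical mapping from an already-authenticated user to a tenant.
# In a real backend this would come from the session or user database.
USER_TENANTS = {"alice": "tenant_a", "bob": "tenant_b"}


def export_for_user(conn, username):
    """Return a CSV string containing only the rows the user's tenant may see."""
    tenant_id = USER_TENANTS.get(username)
    if tenant_id is None:
        raise PermissionError(f"unknown user: {username}")
    # The tenant filter is applied by the backend with a bound parameter,
    # so the caller cannot widen the query.
    cur = conn.execute(
        "SELECT order_id, amount FROM orders "
        "WHERE tenant_id = ? ORDER BY order_id",
        (tenant_id,),
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["order_id", "amount"])
    writer.writerows(cur.fetchall())
    return out.getvalue()


def demo_connection():
    """In-memory sqlite3 database standing in for the Redshift cluster."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, amount REAL, tenant_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "tenant_a"), (2, 25.5, "tenant_b"), (3, 7.25, "tenant_a")],
    )
    return conn
```

Calling `export_for_user(demo_connection(), "alice")` returns only the `tenant_a` rows; the user never talks to the database directly, which is the point of keeping it inside the VPC.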

I am assuming getting data into Redshift is not a problem.
What you are looking for, if I understand correctly, is an OEM solution.
The problem is how to mimic the security model you have in place for your SaaS offering.
That depends on how complex your security model is.
If it is as simple as authenticating the user, who then has access to all tenant data, or the data can easily be filtered per user, things are simple for you. Trusted authentication will allow you to authenticate that user, and user filtering will allow you to show him everything he has access to.
But here is the kicker: if your security is really complex, it can become really difficult to mimic it within these products.
For integrating Tableau, this link will help:
https://tableau.github.io/embedding-playbook/#
Power BI is a product I am not a fan of. I tried to embed a view in one of my applications and data refresh was a big issue.
It's almost like they want you to be an Azure shop for real-time reporting. (I like GCP more.)
If you create the APIs and populate datasets, then they have crazy restrictions like 1MB/sec etc.
In other instances datasets can be refreshed only 8 times.
I gave up on them.
Very recently I got a call from Sisense and they seemed promising as well from an OEM perspective. You might want to try them.

Related

How do I authenticate users of a web-app to access GCP data relevant only to them?

I have spent 3 days researching this problem and cannot find a solution or similar use case that shows how to solve the problem, so any pointers would be greatly appreciated.
I am creating a web-app that uses Google Cloud Storage and Bigquery. A user registers on the web app and then can upload data to Cloud Storage and Big Query. Two users could be from the same company and therefore should be able to view the same data - i.e. Jack and Jill work for company A and if Jack uploads a massive dataset via this app, Jill should also be able to view it later.
Another scenario will be I have two completely separate clients with users using this web-app. If users from Company A upload data, users from Company B should not be able to view Company A's data, and vice versa. But users from the same company should be able to view the data within their company.
Currently, I have an app that works for a single company. This has a React front-end that uses Firebase for authentication. Once the user is logged in, they can use the app which sends off API calls to a Flask back-end that does some error checking and authentication checking and then fires off an API call to GCP. This uses a service account and the key is loaded as an environment variable in the environment in which the Flask app is running.
However, if Company B want to use the app now, both Company A and Company B will be able to see each other's data and visualize it through the app. In addition, they will be sharing a project (I would like to change this to allocate billing more easily to have each client have their own project).
I ultimately want to get this app onto Kubernetes and ensure that each company is independent of each other, however, do not want to have to have separate URL's for every company using the app. Also, I want to abstract GCP away from the client. I would prefer to authenticate a user based on their login credentials and then they will be given access to their GCP project (via my front-end) accordingly.
I thought about perhaps having separate service keys for each client and then storing the service key info in Firebase, while using the respective keys for API calls but not sure this is best practice. It is however the only strategy I can think of.
If anyone could provide some help or guidance it would be very much appreciated. This is my first GCP project and have not been able to find any answers on GCP, SO, Google Groups, Slack or Medium.
Thanks,
TJ
First of all, welcome to GCP! It's an awesome platform, very powerful and flexible. But not magic.
Indeed, the use case that you describe is specific to your business logic. GCP provides tools for securing access for users and VMs (through service accounts) but not for customers. Here you have to implement your own custom authorisation logic, with a database (I don't recommend BigQuery for a website, the latency is too high) to list the users, the companies where they work, the blobs of each company...
Nothing is magic and your use case is specific.
If you want to discuss more about which components to use and where to start, no problem. Leave a comment.
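As a rough illustration of that custom authorisation layer, here is a sketch in Python. The in-memory dictionaries stand in for the database mentioned above, and all names (the e-mail addresses, project ids, bucket names, `resolve_company`, `can_access`) are hypothetical. It assumes the user identity has already been verified by the existing login flow before these functions are called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    name: str
    gcp_project: str  # hypothetical: one project per client
    bucket: str       # hypothetical: one Cloud Storage bucket per client


# Stand-ins for database tables: companies, and which company each user works for.
COMPANIES = {
    "company_a": Company("Company A", "proj-company-a", "company-a-data"),
    "company_b": Company("Company B", "proj-company-b", "company-b-data"),
}
USER_COMPANY = {
    "jack@a.example": "company_a",
    "jill@a.example": "company_a",
    "bob@b.example": "company_b",
}


def resolve_company(user_email):
    """Map an authenticated user to the company whose resources they may use."""
    company_id = USER_COMPANY.get(user_email)
    if company_id is None:
        raise PermissionError(f"{user_email} is not assigned to a company")
    return COMPANIES[company_id]


def can_access(user_email, bucket):
    """True only when the bucket belongs to the user's own company."""
    try:
        return resolve_company(user_email).bucket == bucket
    except PermissionError:
        return False
```

The backend would call `resolve_company` on every request and use the returned project and bucket for its GCP calls, so users from the same company share data and users from different companies never do, all behind a single URL.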

Salesforce: is it possible to develop a web application on top of Salesforce

Let me start with a bit of background: I'm helping a non-profit organization that would like to have a browser-based application that is backed by Salesforce, but has very specific requirements.
I see Salesforce has a REST API that we can call, so we can develop a standalone application to serve the web pages they want and use the REST API to call Salesforce when needed.
I'm wondering if there is a way to host a web application directly on Salesforce; this way we don't have to have a separate application server. Any recommendations or pointers to documentation/open source products is greatly appreciated.
Yes, you can create services that will allow your app to hit Salesforce
Depending on the type of application, yes, you can host it on Salesforce using the Salesforce Sites feature. You can also develop and host your app on Heroku, which is owned by Salesforce, and sync data to and from Salesforce using Heroku Connect, or you can build and host it on another service like AWS and connect via the REST API. You just need to investigate and choose the option that best fits your use case. One thing to be aware of is that there are API limits (the number of calls you can make to Salesforce in a rolling 24-hour period). Depending on the needs of the app, be sure to check whether those limits will be an issue, because an app that makes constant calls to Salesforce could run into them. But there are things you can do to get around that, like caching.
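To make the caching idea concrete, here is a small time-to-live cache sketch in Python. It is not a Salesforce API; `TTLCache` and `get_or_fetch` are hypothetical names, and `fetch` stands for whatever function actually performs the REST call. The only point is that repeated reads within the TTL do not consume API calls.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds to avoid repeated API calls."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the cache can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when it is stale."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()  # this is the call that would count against the API limit
        self._entries[key] = (now + self.ttl, value)
        return value
```

For example, `cache.get_or_fetch("account:001", lambda: call_salesforce(...))` with a 60 second TTL makes at most one remote call per minute for that key, where `call_salesforce` is a placeholder for your own client code.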
Yes, both Force.com Sites and Site.com features allow you to host webpages on the Force.com Platform. The markup is stored in Visualforce Pages and can use Apex to access records in the Database. I have migrated multiple websites (including our company's www.mkpartners.com) to Force.com using Force.com Sites.
One thing to keep in mind is that you are limited to 500,000 views per month and the rendering of a page with images that are also stored on the platform will incur a single view for the page and a single view for each image. If you already have a very popular website, I wouldn't migrate. If you're a small business or nonprofit, then it should be fine.
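The view accounting described above can be worked through with a few lines of Python. This uses only the figures quoted in this answer (500,000 views per month, one view per page plus one per platform-hosted image); the function names are made up for the illustration.

```python
MONTHLY_VIEW_LIMIT = 500_000  # figure quoted in the answer above


def views_per_render(images_on_platform):
    """One view for the page itself plus one per image stored on the platform."""
    return 1 + images_on_platform


def max_renders_per_month(images_on_platform, limit=MONTHLY_VIEW_LIMIT):
    """How many times a page can be rendered before the monthly limit is hit."""
    return limit // views_per_render(images_on_platform)
```

A page with four platform-hosted images costs five views per render, so the limit is reached after 100,000 renders rather than 500,000.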
Another thing to keep in mind is that dynamic functionality based on records in the database will not work during maintenance windows. There is the ability to upload a static version of your website to be rendered during these windows though.

Best practice for a multiuser CouchDB-based app?

I am creating a CMS from scratch and decided to use CouchDB as my database solution. For my CMS I need various accounts and of course different user roles (admin, author, unregistered user, etc.).
First I thought I would program authorization within my CMS myself, but CouchDB has functionality like this built in, so I want to ask:
What is the best practice creating a multiuser app with CouchDB?
Create only one admin for CouchDB and manage restrictions, roles and accounts by yourself?
Use the built-in functionality of CouchDB for all this? (Say create a CouchDB admin user for every admin of the CMS?)
What if I want to add other 3rd-party authorization later? Say I want users to login via Twitter/Facebook/Google?
Greetings,
Pipo
The critical question is whether you want to expose CouchDB to the public or not.
If you want to build your CMS as a classical 3-tier architecture where CouchDB is exclusively accessed from a privileged scripting layer, e.g. PHP, then I would recommend you to roll your own authorization system. This will give you better control over the authorization logic. Particularly, you can realize document based read access control (not available in the CouchDB security system).
If instead you want to expose CouchDB to the public, things are different. You cannot actually write server side logic (except for separate asynchronous listeners via the changes feed) so you will have to use CouchDB's built in authentication/authorization system. That limits you to read access controlled on a database level (not document level!). Write access can be controlled with validation functions. CouchDB admins should not be equivalent to application admins as a CouchDB admin is rather comparable to a server admin in a traditional setting. A database admin in CouchDB would be a better fit (can change design documents and therefore make modifications to the CMS installation like adding plugins). All other users with write access can be realized as database members.
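For illustration, the database-level roles described above live in the database's `_security` object, which has an `admins` and a `members` section, each with `names` and `roles`. The Python sketch below only builds that JSON body; the user and role names are hypothetical, and actually sending it (a `PUT` to `/{db}/_security` by a server admin) is left out so the example runs without a CouchDB instance.

```python
import json


def build_security_doc(db_admins, member_roles):
    """Build a CouchDB _security object.

    db_admins: user names allowed to change design documents in this database.
    member_roles: roles whose users may read (and, subject to validation
    functions, write) documents in this database.
    """
    return {
        "admins": {"names": sorted(db_admins), "roles": []},
        "members": {"names": [], "roles": sorted(member_roles)},
    }


# Hypothetical CMS setup: one database admin, two member roles.
security_doc = build_security_doc(["cms_admin"], ["author", "editor"])
security_json = json.dumps(security_doc, indent=2)
```

Per-document write rules would still go into a validation function in a design document, which CouchDB expects to be written in JavaScript by default.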
I would prefer the second approach, because this will give you the possibility to leverage all the nice features of CouchDB like replication and the changes feed. However, you will have to do some filtered replication between databases with different members if you need fine grained read access control.
If you want to use other authentication mechanisms than those offered by CouchDB, you will eventually have to modify the installation (which can be an issue if you want to use a hosted CouchDB). For a facebook plugin see e.g. https://github.com/ocastalabs/CouchDB-Facebook-Authentication.

Backend for iOS app

My question is how do I get information from a server to my iPhone app. Let's assume I have completed the current project I'm working on, which only needs data to be uploaded to my application.
I understand there is a database or server I must create but how do I go about creating or modifying one for my needs.
I mainly want to store login information from one user and allow users to search for people who have entered login information (name) to add to a friends lists within the current app.
I think in your case Django-tastypie would be a good choice for the backend. With Django you can develop it quickly, and tastypie provides API services that can easily be used for retrieving and sending data.
You can go through this:
http://django-tastypie.readthedocs.org/en/latest/
Take a look at services like Stackmob or Parse. These types of service could make it really easy for you to get the server side part of your application up and running. These services would act as your database and also provide an easy api for you to access the server side pieces.

Web Analytics & Stats

We want to add tracking statistics to a web application we are building but are pretty unsure of how to go about it. (i.e. clicks, pageviews, unique visits etc)
Does anyone have any articles on the best way to go about incorporating tracking data into an application, i.e. JavaScript tracking or IIS etc.?
We want to add tracking in as an ASP.NET MVC module, but we are unsure as to the best way to actually get the data and essentially 'track' this information.
If anyone could help out - much appreciated.
Edit: just to be clear, we want to do this in-house and present the stats to our users as an additional fee module?
You can turn on logging for IIS and then use the SQL Server Report Server Pack for IIS. It comes with many canned reports for your site's stats, and you could take it from there with your own custom reports.
You could also just use Log Parser to get the stats into a SQL Server DB and then use SQL from there to analyse the data and roll your own app.
Either way, you could modularize this and sell it as an add-on to your customer base.
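If you roll your own, the log-based approach comes down to reading the IIS log files and aggregating them. Below is a minimal Python sketch, assuming the logs are in the W3C extended format, where a `#Fields:` line names the space-separated columns. The sample lines are invented for the example, and counting distinct client IPs is only a crude stand-in for unique visitors.

```python
from collections import Counter


def parse_w3c_log(lines):
    """Yield one dict per request, using the #Fields: directive as column names."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software or #Date
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed lines rather than guessing
        yield dict(zip(fields, values))


def summarize(records):
    """Return (pageviews per URL, number of distinct client IPs)."""
    pageviews = Counter()
    visitors = set()
    for rec in records:
        pageviews[rec["cs-uri-stem"]] += 1
        visitors.add(rec["c-ip"])
    return pageviews, len(visitors)


# Invented sample data, only to show the expected shape of the input.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services
#Fields: date time cs-method cs-uri-stem c-ip sc-status
2012-01-01 00:00:01 GET /home 10.0.0.1 200
2012-01-01 00:00:05 GET /home 10.0.0.2 200
2012-01-01 00:00:09 GET /about 10.0.0.1 200
""".splitlines()
```

`summarize(parse_w3c_log(SAMPLE_LOG))` gives the per-URL counts and the visitor count, which could then be written to a database table and reported on per customer.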
You could use Piwik, you just need PHP version 5.1.3 or greater and MySQL version 4.1 or greater. As they say in their website, "Piwik aims to be an open source alternative to Google Analytics."
They have a demo on the official website so you can see if it's what you're looking for.
Google Analytics is a popular service. You just insert a bit of JavaScript on every page that contains your site's name, and Google tracks the data and provides all the reports on a handy web-based dashboard.
It's not an ASP.NET MVC module like the one you mentioned, but it will certainly track stats for you and will be a lot simpler to set up than trying to code or integrate anything yourselves.
I'd look at analytics to begin with and only branch out to something more complex if it doesn't meet your requirements.
klabranche provided a holistic answer in terms of using web server logs. I think using web server logs is a great way to analyse data about your web application.
That being said, depending on your web application and the scope of your analytics, relying on web server logs alone may not be enough.
As you may know, a web log does not record user behaviour such as clicking certain tabs, which may not trigger a web server request. Your web log has no idea whether users clicked that tab or not, and this may hurt your analysis.
Another thing you need to know about is the browser cache, which may create another black hole in your data.
RECAP
If you want to do holistic analytics, you need to use two approaches: one is a JavaScript tag, the other is the web log. Since both of them have shortcomings, combining them will give you a complete picture.
Hope this helps

Resources