What are some good/cheap Data Obfuscation and Data Masking tools?

I am looking for a Data Masking tool to protect sensitive data/files when I move them to a dev environment (all data sits on S3). Is anybody here already doing data protection using third-party tools? I would appreciate any recommendations.

You could try FileMasker. It supports masking CSV and JSON files on S3, and it can run as an AWS Lambda function or on local computers. A free Community license is available. Disclaimer: I work for the developer.
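If a dedicated tool turns out to be overkill, the basic pattern is simple enough to sketch yourself: a Lambda that copies each CSV from the production bucket to a dev bucket, replacing sensitive columns with a one-way hash. Everything below (bucket names, column indices, the hashing choice) is a hypothetical placeholder, not anything FileMasker-specific:

    import csv
    import hashlib
    import io

    import boto3

    s3 = boto3.client("s3")

    SOURCE_BUCKET = "prod-data"   # assumption: your production bucket
    DEST_BUCKET = "dev-data"      # assumption: your dev bucket
    SENSITIVE_COLUMNS = {1, 3}    # zero-based indices of columns to mask


    def mask(value):
        # Irreversible masking via SHA-256, truncated for readability.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


    def handler(event, context):
        # Triggered by an S3 put event on the source bucket.
        for record in event["Records"]:
            key = record["s3"]["object"]["key"]
            body = s3.get_object(Bucket=SOURCE_BUCKET, Key=key)["Body"]
            reader = csv.reader(io.StringIO(body.read().decode("utf-8")))

            out = io.StringIO()
            writer = csv.writer(out)
            writer.writerow(next(reader))  # keep the header row as-is
            for row in reader:
                writer.writerow(
                    [mask(v) if i in SENSITIVE_COLUMNS else v for i, v in enumerate(row)]
                )
            s3.put_object(Bucket=DEST_BUCKET, Key=key, Body=out.getvalue())

Hashing the same input always yields the same output, so joins across files stay consistent while the original values remain unrecoverable.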

A plea for a basic Notebook example getting data into and out of Google Cloud Datalab

I have started trying to use Google Cloud Datalab. While I understand it is a Beta product, I find the docs very frustrating, to say the least.
The questions here and the lack of responses, as well as the lack of new revisions or docs over the several months the project has been available, make me wonder whether there is any commitment to the product.
A start would be a notebook that shows data ingestion from external sources into both the Datastore system and the BigQuery system; that is a common use case. I'd like to use my own data, and it would be great to have a notebook to ingest it. It seems that should be doable without huge effort, and it would get me (and others) out of this mess of trying to link the various terse docs from various products and workspaces together.
In addition, a better explanation of the GitHub connection process would help (see my prior question).
For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb
For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb
Those are the only two storage options currently supported in Datalab (which should not be used in any event for large-scale data transfers; these are for small-scale transfers that can fit in memory in the Datalab VM).
For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with GitHub, however.
As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.
Datalab isn't running on your local machine; just the presentation part is in your browser. So if you mean the browser client machine, that wouldn't be a good solution: you'd be moving data from the local machine to a VM running the Datalab Python code (and that VM has limited storage space), and then moving it again to the real destination. Instead, you should use the Cloud Console or (preferably) the gcloud command line on your local machine for this.
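To make that path concrete, here is a minimal sketch using today's google-cloud client libraries (not the Datalab-era APIs shown in the notebooks above): push the file to GCS directly from your local machine, then load it into BigQuery from the GCS URI. All bucket, dataset, and table names are placeholders:

    from google.cloud import bigquery, storage

    # Requires the google-cloud-storage and google-cloud-bigquery packages
    # plus application-default credentials
    # (run: gcloud auth application-default login).

    # 1. Local machine -> GCS, with no hop through the Datalab VM.
    storage_client = storage.Client()
    bucket = storage_client.bucket("my-staging-bucket")  # assumption: bucket exists
    bucket.blob("ingest/mydata.csv").upload_from_filename("mydata.csv")

    # 2. GCS -> BigQuery (schema autodetected from the CSV).
    bq_client = bigquery.Client()
    job = bq_client.load_table_from_uri(
        "gs://my-staging-bucket/ingest/mydata.csv",
        "my_project.my_dataset.my_table",  # assumption: dataset already exists
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.CSV,
            skip_leading_rows=1,
            autodetect=True,
        ),
    )
    job.result()  # block until the load job finishes
    print("Loaded", job.output_rows, "rows")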

Semantria Integration with DB

I need to know: has anyone integrated a database with Semantria and sent the output to a database, Excel, or a text file?
I have tried to explore Semantria via Excel and the API, but the integration does not work perfectly.
It depends on what kind of integration you're looking for.
I have already done many integrations with different storage backends, including indexing services and RDBMS solutions.
Unfortunately there are no ready-to-use components available on the market, so you will need to build the integration on your own.
Semantria offers SDKs (https://github.com/Semantria/semantria-sdk) for all modern languages; you will need to build logic that gets the analysis results and saves them to your chosen storage.
Can you please explain what storage you use and what Semantria output you're interested in?
Thanks George.
Well, at the moment we are just focusing on pulling the data from a DB (for instance MySQL or Oracle), and the output should go back to the same DB; I will take care of any transformation needed in the output.
Where I am stuck is setting up the link between the DB and Semantria. How will these SDKs help? I have never worked on something like this.
A brief explanation of this would be of great help.
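For what it's worth, the glue the SDK expects is a short queue-and-poll loop around your own DB access code. Here is a hypothetical sketch against MySQL; the credentials, table, and column names are made up, and the SDK method names follow the examples in the SDK repo, so verify them there before relying on this:

    import time

    import mysql.connector
    import semantria  # from https://github.com/Semantria/semantria-sdk

    # Assumption: the SDK exposes a Session with queueDocument /
    # getProcessedDocuments, as in the repo's examples.
    session = semantria.Session("CONSUMER_KEY", "CONSUMER_SECRET")

    db = mysql.connector.connect(
        host="localhost", user="app", password="secret", database="reviews"
    )
    cur = db.cursor()

    # 1. Queue each row's text, using the primary key as the document id.
    cur.execute("SELECT id, review_text FROM reviews WHERE sentiment IS NULL")
    for row_id, text in cur.fetchall():
        session.queueDocument({"id": str(row_id), "text": text})

    # 2. Poll until the batch has been processed (simplified; production
    #    code should cap retries and handle failed documents).
    processed = []
    while not processed:
        time.sleep(5)
        processed = session.getProcessedDocuments()

    # 3. Write the scores back to the same table.
    for doc in processed:
        cur.execute(
            "UPDATE reviews SET sentiment = %s WHERE id = %s",
            (doc["sentiment_score"], int(doc["id"])),
        )
    db.commit()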

Create a back-end database for iOS & web apps using FileMaker

I need to create an iPad app that has access to a relational database as well as a lot of images (which should be hosted on the web). My boss suggested that I use FileMaker, as he has a license for one of the versions.
I searched the web for information on the best way to do this, but I'm not 100% sure I got it right.
Is it possible to use Filemaker for the following?
Create a relational database
'Host' images
Run custom scripts (optional but not strictly required)
Publish all of the above to the web, to be accessed by an iOS app (and later a web app) through an API.
I understand that the first three points can be done, but I couldn't find much on how to publish everything and then access it securely later.
Are there better alternatives?
Thank you in advance
FileMaker Server has a fairly good XML API that supports all operations, including FileMaker scripts. It can be used as is or through the official PHP client. There's also an unofficial Python client for it.
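As a rough illustration of what "using it as is" looks like, the XML interface is plain HTTP: you request fmresultset.xml with query parameters naming the database, layout, and command. A minimal sketch in Python, where the server address, database, layout, and account are all placeholders:

    import requests

    # The -db/-lay/-findall query grammar is FileMaker's XML publishing
    # interface (fmresultset); check your server version's docs for details.
    BASE = "https://fms.example.com/fmi/xml/fmresultset.xml"

    resp = requests.get(
        BASE,
        params={
            "-db": "ProductDB",      # database name (assumption)
            "-lay": "web_products",  # layout exposed to web publishing (assumption)
            "-findall": "",          # command: return all records on the layout
        },
        auth=("webuser", "secret"),  # FileMaker account with XML privileges
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.text[:500])  # raw fmresultset XML; parse with xml.etree.ElementTree

An iOS app could hit the same endpoint directly, though putting a small PHP or Python proxy in front of it keeps the FileMaker credentials off the device.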

Update a local/client Microsoft Access Database from a server (MS SQL Server 2005)

I've got a website that runs on a shared hosting environment, using ASP.NET 2.0 (C#) and MS SQL Server 2005. I've recently been asked if I can integrate my website with a piece of third-party desktop software that uses the Access runtime as its database (transparent to the end user).
Primarily I want to be able to offer users of my website the option of exporting their data into the Access database on their local machine. The data schemas match sufficiently; the question is how to actually do this, in the simplest way possible for the user.
Simply having a webpage update the local Access database isn't possible due to the obvious security restrictions. I've considered asking them to upload the Access database to the server, so I can migrate the data and then allow them to download it again; however, the competency of the users of this software is such that even locating the Access database, let alone uploading and downloading it from the website, might be too complicated.
I've also considered whether Adobe AIR or Silverlight could help here, but I don't know them well enough to be sure. Similarly, I assume another exe could be written to perform this task that the user could simply download and run; however, my experience is in web development, not program development, so this isn't a certainty for me, nor an ideal development option.
So, can this be done? If so, what technique can achieve it, with the stated aims being ease of use for the end user, followed by ease of development by someone with web development as their main skill? Many thanks!
You may find this answer of interest: Best way to stream files in ASP.NET
It is about transferring a file from the server. You could save an Excel or CSV file and use that to update Access.
Instead of trying to do this in a web page, you might just expose some views from your SQL Server to some client-specific logins.
Then, within the Access application, allow them to link to your SQL Server. You might even provide an Access application for getting the data from your site and stuffing it into their local Access database.
In my work we have done something similar, transparent to the user, by creating an ActiveX control. The problem is that you are limiting the users to Internet Explorer.
I think the best way to achieve what you are trying to do is by installing a service on the client's computer. If creating a service is beyond your experience, you can post a project on a site like oDesk and find somebody who can help you with the development for the money you are willing to pay.
Good luck.

Generate Data Change Scripts from VSTS Database Edition

I'm using the GDR release of VSTS Database Edition to source-control the DB and generate deployment scripts. It works pretty well, but the problem is that it only seems to handle scripting and deploying the schema. It stops short of handling scripting and deployment of the actual data itself (i.e. the lookup and standing data that is also deployed with the DB).
I know it's easy enough to write the deployment scripts by hand, but is that what everyone does? Is there a recommended way of deploying data with the VSTS deployment engine? Is there some tooling that helps with this? I don't mean a full product like SQLCompare, just something that fills the gap with VSTS DB.
Thanks in advance.
Kaneda
The VSTS: DB best practices blog advocates using post-deployment scripts to insert reference data into temporary tables, then update the target tables based on the delta (i.e. UPDATE x INNER JOIN temp WHERE x.something <> temp.something), as sketched below.
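A rough sketch of that pattern, with hypothetical table and column names; in VSTS DB the T-SQL would live in the project's post-deployment script, and it's wrapped in pyodbc here only so the sketch is self-contained and runnable:

    import pyodbc

    # Load the desired reference rows into a temp table, then update the
    # target where values drifted and insert whatever is missing.
    DELTA_SCRIPT = """
    CREATE TABLE #ref_status (id INT PRIMARY KEY, name NVARCHAR(50));

    INSERT INTO #ref_status (id, name) VALUES (1, N'Open');
    INSERT INTO #ref_status (id, name) VALUES (2, N'Closed');

    -- Update rows whose values have drifted from the reference data.
    UPDATE s
    SET    s.name = t.name
    FROM   dbo.Status s
    JOIN   #ref_status t ON t.id = s.id
    WHERE  s.name <> t.name;

    -- Insert reference rows that are missing entirely.
    INSERT INTO dbo.Status (id, name)
    SELECT t.id, t.name
    FROM   #ref_status t
    WHERE  NOT EXISTS (SELECT 1 FROM dbo.Status s WHERE s.id = t.id);

    DROP TABLE #ref_status;
    """

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=localhost;DATABASE=AppDb;Trusted_Connection=yes"
    )
    conn.execute(DELTA_SCRIPT)
    conn.commit()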
There are suggestions floating around that this might become a power tool, and at least one MVP has written a tool to generate those scripts.
(NB: I haven't tried this - I only just found out about it myself)
Personally I would still stick with RedGate if I had any choice in the matter.
GDR comes with a data comparison engine, but as far as I've been able to tell so far, a data comparison can't even be stored in a project (let alone be properly supported by it), so it's pretty ad hoc. Unlike a Schema Compare, there is no File \ Save As.
The comparison engine can be automated via DDE, but that's automation within the Visual Studio IDE and not really suitable for a scripted installation process. As much as anything, there's no way I could see to specify which tables to include in the comparison (since all you get to do via DDE is open the wizard for the user to select).
Alternatively, all the functionality appears to reside in Microsoft.VisualStudio.TeamSystem.DataPackage.dll, but since the API documentation hasn't been written yet (the help documentation that comes with GDR is full of errors as it is), it's going to be a bit of a hit-and-miss adventure to work out where to start.
As someone who's used RedGate's SqlCompare, SqlDataCompare and their respective APIs to do this before, much of the GDR functionality seems a bit half-baked to me.
What I will probably do this time round is sync the data with an SSIS package (export to CSV at build time / import from CSV at install time), but I'd far rather be using the SqlDataCompare API (or SqlPackager) right now.
