Data masking or security using Views?

Environment: SQL Server 2012
I am trying to help build a solution which includes data masking and encryption for our organisation.
Currently, we do not have any data masking in place, hence the need.
We are in the process of identifying which data is sensitive, including combinations of non-sensitive data that could together reveal a person's identity.
One approach would be to use a tool such as Redgate Data Generator or DataVeil to generate fictitious data for the chosen fields in the Dev or UAT environment.
Another would be to use a function that masks some characters as xxxx or **** based on the length.
In the production environment, as I understand it, masking is irreversible, so encryption is needed instead; I will learn more about that in the coming weeks.
The scenario above would work where every user sees the same data in UAT and Dev, whether the data is generated by a tool or masked with T-SQL code, while in the production environment what a user sees depends on access to the key.
Please correct me on anything above you think doesn’t look right.
Next is user-based access using views. There is not much material out there on security using views, so I am asking how we could implement it if we take this route instead of the one above.
I understand that users could be granted access to the data in the underlying tables through views.
What about existing queries, SSRS reports and the cube?
How would those work with views? Do I change every query? I am a little lost here.

The View option can be done by creating a new 'mask' view that includes all the columns from the source tables, and replaces sensitive columns with a dummy fixed value.
For example:
create view vMaskPeople
as
SELECT ID, DateCreated, 'Sample Name' as FullName, 'Sample Telephone' as Phone
FROM People
If you need more distinctive sample data, partially mask the columns instead, like:
SELECT ID, DateCreated,
LEFT(FullName, 3) + 'XXXXXX' AS FullName,
'XXX-XXXX-' + RIGHT(Phone, 4) AS Phone
FROM People
If you cannot rig the Dev environments to use the new mask view, you could rename the source 'People' table to something like 'People1' and then name the mask view 'People' (selecting from 'People1'), so that existing queries pick up the view without being changed.
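The rename approach described above can be tried out end to end with a small script. This is only a sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for SQL Server, so `substr` and `||` replace T-SQL's `LEFT`/`RIGHT` and `+`; the sample row is made up for illustration.

```python
import sqlite3


def build_masked_people(conn):
    """Rename the real table and put a partially masking view in its place."""
    conn.executescript("""
        CREATE TABLE People (ID INTEGER PRIMARY KEY, DateCreated TEXT,
                             FullName TEXT, Phone TEXT);
        -- Made-up sample row, for illustration only.
        INSERT INTO People VALUES (1, '2017-06-01', 'Alice Smith', '555-0100-1234');

        -- Move the real data out of the way ...
        ALTER TABLE People RENAME TO People1;

        -- ... and let the mask view take over the old name.
        CREATE VIEW People AS
        SELECT ID, DateCreated,
               substr(FullName, 1, 3) || 'XXXXXX' AS FullName,
               'XXX-XXXX-' || substr(Phone, -4)   AS Phone
        FROM People1;
    """)


conn = sqlite3.connect(":memory:")
build_masked_people(conn)

# An existing query against 'People' now reads from the view unchanged.
masked_row = conn.execute("SELECT ID, FullName, Phone FROM People").fetchone()
print(masked_row)
```

Note that anyone who can still query 'People1' directly sees the unmasked values, so permissions on the renamed table would have to be restricted separately.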

You mentioned SQL Data Generator, which creates a fresh data set from scratch, but here at Redgate we also have Data Masker, which allows you to take an existing database and specify masking rules, which sounds like it might suit your scenario better.

Can I use Master Data Services to import data via Excel add-in ? Mainly Measures! (Numbers/Values)

Short version:
Looking for the best way to comfortably input data into a SQL Server table with immediate feedback for the user.
Set-up:
We have a Datawarehouse (dwh) based on SQL Server 2012.
Everything is set up with the Tools from MS BI Suite (SSIS, SSAS, SSRS and so on)
The Departments access the BI-Cubes via Excel. They prefer to do everything in Excel if possible.
Most sources for the DWH are databases but one use-case has Excel-files as a source.
Use-Case with Excel files as a source
As-Is:
We have several Excel-files placed in a network folder.
Each Excel file is edited by a different user.
The files are ingested by an SSIS process looping through the files on a daily basis.
The contents of the Excel files look like this (fake data):
Header: Category | Product | Type | ... | Month   | abc_costs | xyz_costs | abc_budget | xyz_budget | ...
Data:   A        | Soup    | Beta | ... | 2017-06 | 16656     | 89233     | 4567       | 34333      | ...
Data Flow:
source.Excel -> 1.-> dwh.Stage -> 2.-> dwh.intermediateLayer -> 3.-> dwh.FactTable
Step 1 to 3 are SSIS ETL-Packages.
Step 3 looks up the surrogate keys from the dimensions and saves them as foreign keys in the fact table, based on the "codes" provided by the Excel file (e.g. the code 'A' for Category).
Problems:
Step 1 "ingesting the Excel files" is very error-prone.
Users can easily mistype the codes, and numbers can be in the wrong format.
Error messages regarding Excel sources are often misleading, and debugging Excel sources in SSIS becomes a pain.
Sometimes users leave an Excel file open and a temporary lock file blocks the whole ingestion process.
Requirements
I want to avoid the problems that come up when ingesting Excel files.
It should be possible to validate data input and give quick feedback to the user.
As BI developers we will try to avoid a solution that would involve web development in the first place.
Excel-like input is preferred by the users.
Idea:
As Master Data Services comes with an Excel add-in that allows data manipulation, we thought it could be used for this data-input scenario as well.
That would give us the opportunity to test MDS at the same time.
But I am not sure whether this use case fits Master Data Services.
While researching, I could not find any MDS example showing how measures are entered via the Excel add-in [the samples are about modelling and managing entities].
Can anybody clarify whether this use case fits MDS?
If it does not fit MDS, what would be a good choice that fits into this BI ecosystem (preferably Excel-based)? [LightSwitch, InfoPath, PowerApps or, if there is no other option, web development; I am a bit confused about the options]

Keep in mind, an Entity in MDS does not represent a table in the database. This means when you load data in MDS, there are underlying tables populated with the data and metadata to keep track of changes, for example.
Using the Excel plugin to import data into MDS, and then expose the data to another system can work, considering the following:
Volume of data. The Excel plugin handles large volumes in batches, so the process can become tedious.
Model setup. You need to configure the model properly with the Entities and Attributes well defined. The MDS architecture is 'pseudo data warehouse' where the entities can be considered 'facts' and the domain based attributes 'dimensions'. This is an oversimplification of the system but once you define a model you will understand what I mean.
A nice functionality is subscription views. Once you have the data in MDS, then you can expose it with subscription views which combines entities with domain based attributes in one view.
Considering your requirements:
I want to avoid the problems coming up when ingesting Excel-files.
This is possible, just keep in mind the Excel plugin has its own rules. So Excel effectively becomes the 'input form' of MDS, where data is input and committed. The user will need to have a connection set up to MDS using the credential manager etc.
It should be possible to validate data input and give a quick feedback
to the user
This can easily be handled with domain-based attributes and business rules.
As BI-Developers we will try to avoid a solution that
would involve webdevelopment in the first place. Excel-like input is
preferred by the users.
Keep in mind, the MDS plugin determines how the Excel sheet looks and feels. No customization is possible, so your entity definitions need to be correct to facilitate a good user experience.
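To make the validation requirement concrete, here is a rough sketch of the kind of checks involved: codes restricted to a known list (roughly what a domain-based attribute enforces) and measures that must parse as numbers (roughly what a business rule could enforce). It is plain Python and does not use any MDS API; `VALID_CODES`, `MEASURES` and `validate_row` are hypothetical names, and the allowed codes are invented.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical reference data; in practice this would come from the dimensions.
VALID_CODES = {"Category": {"A", "B"}, "Type": {"Alpha", "Beta"}}
MEASURES = ("abc_costs", "xyz_costs")


def validate_row(row):
    """Return a list of human-readable problems found in one input row."""
    errors = []
    # Code columns must contain one of the known codes.
    for column, allowed in VALID_CODES.items():
        if row.get(column) not in allowed:
            errors.append(f"{column}: unknown code {row.get(column)!r}")
    # Measure columns must be parseable as plain decimal numbers.
    for column in MEASURES:
        try:
            Decimal(str(row.get(column)))
        except InvalidOperation:
            errors.append(f"{column}: not a number {row.get(column)!r}")
    return errors


good = {"Category": "A", "Type": "Beta", "abc_costs": "16656", "xyz_costs": "89233"}
bad = {"Category": "X", "Type": "Beta", "abc_costs": "16,656", "xyz_costs": "89233"}
print(validate_row(good))
print(validate_row(bad))
```

Whatever tool ends up collecting the input, returning messages like these at entry time is what gives the user the quick feedback asked for.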

I have worked on a DWH project in which an MDS instance was used as a single source of truth for many dimensions. Most of the data was fairly read-only (lists of states, countries, currencies, etc.) and was maintained via the Excel plug-in. There was also some more volatile data, which was imported via MDS import procedures.
In order to expose the MDS data to the warehouse, views were created that pointed directly to the MDS database. I have even written a SQL script that refreshed these views, depending on the MDS metadata and settings stored in the warehouse. Unfortunately, I don't have it around anymore, but it's all quite transparent there.
Everything was very much alive. Can't recall any problems with queries that involved these MDS views.

Managing multiple datasources in CakePHP

I'm planning to develop a web application in CakePHP that shows information in graphics and cards. I chose CakePHP because the information that we need to show is very structured, so the model approach makes it easier to manage data; I also have some experience with MVC from ASP.NET and I like how simple the routing is to use.
My problem is that the multiple organizations that could use the app would each have their own database, with a different schema than the one we need. I can't just set their connection string in the app.php file because their database won't match my model.
The organization's datasource may fail to fit my model for a lot of reasons: the tables don't have the same names, the schema is different, the fields of my entity are spread across separate tables, or they may even keep the info in different databases or in a different DBMS!
I want to know if there's a way to make an interface that achieves this, in such a way that the CakePHP Model/Entity can use data regardless of the source. Do you have any suggestions on how to do that? Does CakePHP have an option to make this possible? Should I use PHP with some kind of data format like JSON or XML? Maybe MySQL has a utility to transform data from different sources into a view, and I can make CakePHP use the view instead of the table?
In case you have an answer be as detailed as you can.
These other options are possible if the interface turns out to be impossible:
- Use another framework that can handle this more easily and has the features I mentioned above.
- Make the organization change their database so it matches my model (I don't like this one, and they probably won't do it).
- Transfer the data into the application's own database.
Additional information:
The data shown in the graphics comes from university students. Every university has its own database with its own structure and its own applications using that database, which is why it isn't easy to change the structure. I just want to make it as easy as possible for any school to configure its own database.
EDIT:
The version is CakePHP 3.2.
An important point is that it doesn't need all CRUD operations, only "reading". I hope that makes the solution easier.

I don't think your "question" can be answered properly; it doesn't contain enough information or detail. I guess there is something that will stay the same for all organizations, while their data and business logic will differ. But I'll try.
And the organization datasource couldn't fit my model for a lot of reasons: the tables don't have the same name, the schema is different, the fields of my entity are in separated tables, maybe they have the info in different databases or also in different DBMS!
The model is a whole layer, so if you have completely different table schemas, your business logic, which is part of that layer, will be different as well. Simply changing the database connection won't help you then. The data needs to be shown in the views too, so the views will have to differ as well.
So what you could try, and what your 2nd image shows, is to implement a layer that contains interfaces and base classes. Then create a Cake plugin for each organization that uses these interfaces and base classes, and write some code that conditionally uses the plugin depending on whatever criterion is checked (I guess the domain or sub-domain). You will have to define the intermediate interfaces in such a way that you can access any organization the same way at the API level.
And one technical thing: you can define the connection of a table object in the model layer. Any entity knows about its origin, but you should not implement business logic inside an entity nor change the connection through an entity.
EDIT: The version is CakePHP 3.2. An important appointment is that it doesn't need all CRUD operations, only "reading". Hope that makes the solution easier.
If that's true, either use the CRUD plugin (yes, you can use only the R part of it) or write some code, such as a class that describes the organization and is used to create your table objects and views on the fly.
Overall it's a pretty interesting problem, but IMHO too broad for a simple answer or solution that can be given here. I think this would require some discussion and analysis to find the best solution. If you're interested in consulting you can contact me; check my profile.

I found a way without coding any interface. In fact, it's using some features already included in the DBMS and CakePHP.
If the schema doesn't fit the model, you can create views that match the table names and column names from the model. Views can be queried like tables, so CakePHP looks for the same table name and columns and the DBMS does the work.
I made a test with views in MySQL and it worked fine. You can also combine the data from different tables.
MySQL views
SQL Server views.
If the user uses another DBMS, you just change the datasource in app.php and create the views if necessary.
If the data is distributed across different DBMSs, CakePHP lets you set a datasource for each table; you just add it to app.php and reference it in the table class where required.
Finally, if you only need the "reading" option, create a database user with access limited to the views and with SELECT privileges only.
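The view-as-adapter idea above can be sketched without CakePHP at all. The example below is an illustration only, using SQLite through Python's sqlite3 module as a stand-in for MySQL or SQL Server; the organization's table and column names (`tbl_stud`, `tbl_marks`, and so on) and the model's expected names (`students`, `id`, `name`, `grade_average`) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The organization's own schema (hypothetical names), split over two tables.
    CREATE TABLE tbl_stud  (stud_no INTEGER PRIMARY KEY, full_nm TEXT);
    CREATE TABLE tbl_marks (stud_no INTEGER, avg_mark REAL);
    INSERT INTO tbl_stud  VALUES (1, 'Ana'), (2, 'Luis');
    INSERT INTO tbl_marks VALUES (1, 8.5), (2, 7.0);

    -- A view exposing the table and column names the application model expects.
    CREATE VIEW students AS
    SELECT s.stud_no  AS id,
           s.full_nm  AS name,
           m.avg_mark AS grade_average
    FROM tbl_stud s
    JOIN tbl_marks m ON m.stud_no = s.stud_no;
""")

# The application only ever queries the names defined by its own model.
students = conn.execute(
    "SELECT id, name, grade_average FROM students ORDER BY id"
).fetchall()
print(students)
```

Each organization would need its own view definition, but the application code and model stay the same.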
USING:
CakePHP 3.2
SQL Server 2016
MySQL 5.7

Standard practice/API for sharing database data without giving direct database access

We would like to give some of our customers the option to read data from our central database. The data is live and new records are being added every few seconds. Our database is MySQL running on Amazon RDS.
I was wondering what is the common practice for doing so.
One option would be to give them SELECT rights on specific tables, but in that case they would be able to access other customers' data as well.
I have tried searching for database, interface, and API key words and some other key words, but I couldn't find a good answer.
Thanks!

Use REST for exposing specific tables to do CRUD operations. You can control the access on it too.
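As a rough illustration of how access could be controlled behind such an API, the sketch below resolves an API key to a customer and only ever returns that customer's rows. It is not a full REST service: it assumes SQLite (Python's sqlite3) in place of MySQL on RDS, is read-only, and the `records` table, the `API_KEYS` mapping and `records_for` are hypothetical names.

```python
import sqlite3

# Hypothetical key-to-customer mapping; a real service would store this securely.
API_KEYS = {"key-acme": 1, "key-globex": 2}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER PRIMARY KEY, customer_id INTEGER, payload TEXT);
    INSERT INTO records VALUES (1, 1, 'acme-a'), (2, 2, 'globex-a'), (3, 1, 'acme-b');
""")


def records_for(conn, api_key):
    """Return only the rows owned by the customer the API key belongs to."""
    customer_id = API_KEYS.get(api_key)
    if customer_id is None:
        raise PermissionError("unknown API key")
    rows = conn.execute(
        "SELECT id, payload FROM records WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return [{"id": row[0], "payload": row[1]} for row in rows]


print(records_for(conn, "key-acme"))
```

An HTTP handler would call something like `records_for` and serialize the result as JSON, so customers never hold database credentials themselves.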

Collect data from 80 users, hiding other users' data

My wife works for a medium sized retail chain. Managers from each of the 80 outlets have to fill in one row of performance info for each of their staff (900 in all), but aren't allowed to see the data of other stores' staff.
My wife currently manages this with lots of spreadsheets, because each month the executives change what they want to collect, and their IT team doesn't have the resources to update their SAS system. She has to manually compile all the data into one spreadsheet for analysis, which is time-consuming and error-prone. She has recently gone from doing this for 20 outlets to 80 outlets and thinks there must be an easier way.
Is there a simple form-based system that can leverage what is already installed (Microsoft Office and Lotus, but not MS Access), or that can be run from a network drive? Cloud apps are banned. Excel's security is all wrong. Can Word form templates append to a shared data source? Any ideas?
TIA

You could have a single table with all the data, then create 'shadow tables' on this table for each individual store.
In MySQL this would probably be either a 'partition table' (I've never used this, so I'm not sure how it works) or temp tables.
You would then need to implement a method whereby, when a user logs in at a given location (IP address), a trigger creates the temp table and populates it with the relevant data for the store at that IP address.
An alternative (probably easier too) would be to have a dedicated table for each store, then grant users specific privileges on each table you create. Again you'll need triggers to populate a single 'master table' with info as it is updated, or you can just combine the tables with a query such as
select * from outlet1 union all select * from outlet2 ... union all select * from outlet80
Again, you may decide to create a temp table from the above select, and implement a custom script to create it only when required.
In fact that is probably how I would do it.
Then in your web interface have a button to create the temp table and display it to the current user (provided they have the required privileges to view all the tables, of course).
I don't know for certain whether Lotus is able to implement this, as I don't know about its 'database' solution. I know that doing something similar in Access isn't that hard; the only downside would be needing to handle user identification (which Access doesn't do natively). Again, I don't know about Lotus.
In my experience the 'flat file database systems' don't generally handle user permissions natively; it is left to the interface development to handle this.
I'm not sure how helpful this answer is, but it may take you a little way towards a solution (even if you end up going for a client/server DBMS).
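The one-table-per-store alternative above can be sketched as follows. This is a minimal illustration, assuming SQLite through Python's sqlite3 module with three outlets instead of 80; the column names and `create_master_view` are hypothetical. SQLite has no per-table GRANT, so the privilege side of the idea is not shown.

```python
import sqlite3


def create_master_view(conn, outlet_count):
    """Combine the per-outlet tables into one 'master' view with UNION ALL."""
    selects = [
        f"SELECT {n} AS outlet, staff, score FROM outlet{n}"
        for n in range(1, outlet_count + 1)
    ]
    conn.execute("CREATE VIEW master AS " + " UNION ALL ".join(selects))


conn = sqlite3.connect(":memory:")
sample = {1: ("Ann", 7), 2: ("Bob", 9), 3: ("Cy", 5)}  # made-up rows
for n, row in sample.items():
    # Each store manager would only be given access to their own table.
    conn.execute(f"CREATE TABLE outlet{n} (staff TEXT, score INTEGER)")
    conn.execute(f"INSERT INTO outlet{n} VALUES (?, ?)", row)

create_master_view(conn, outlet_count=3)

# The analyst reads everything through the combined view.
all_rows = conn.execute(
    "SELECT outlet, staff, score FROM master ORDER BY outlet"
).fetchall()
print(all_rows)
```

Because the view is built from the outlet count, adding a store means creating its table and recreating the view, with no manual compilation step.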

You can use Lotus for this. A simple start for you:
Create a database with one form and one view
On the form add whatever fields you want but also add a computed-when-composed multi-value field of type "Readers" with formula:
"[Admin]" : @Name([CANONICALIZE]; @UserName)
With the exception of those with an Admin role (e.g., your wife), the view will display to each user only the records that the user created. The users will have to create one record per row.
Alternatively you could create an agent in the database that reads the data from an Excel file and builds the documents (records) with the READERS field's value computed as the documents are created.
If that's the route you want to take post a reply here and I'll post some code to (i) prompt a user to select an excel file, (ii) read the excel file data into lotus notes, (iii) implement a READERS field to see that documents are kept confidential between the creator and the Admin role people.
Hope that helps.

Dynamic web form targeting user specified database fields

I have an issue where I'm creating a greenfield web application using ASP.NET MVC to replace a lengthy paper form that manually gets (mostly) entered into an existing SQL Server 2005 database. So the front end is the new part, but I'm working against an existing moderately normalized schema. I can easily add new tables, views, etc. to the schema, but modifying tables is going to be near impossible. There's currently at least 2 existing applications (that I'm aware of) that reference this schema and I've stumbled upon at least a dozen "SELECT * FROM..." statements in each. They exist both in code and in views/triggers/stored procs/etc. That's why modifying existing table schemas is a no-go.
All that being said, the form targets different fields in multiple tables in the database. It also has to be dynamic enough to allow the end users to add new questions targeting fields. The end users have a rough idea of the existing database schema, so they're savvy enough to know how to pick out the tables/fields to be targeted.
I have a really rough idea of how I could tackle this, but it seems like complete overkill and will be difficult to write up. I'm hoping somebody might have a simple(r) way of handling this sort of project that I haven't thought of.

If the users know the DB schema, maybe you should go with a Dynamic Data project and just create a web app front end for that DB. That way you would only build the model they need, plus an application that displays data from those tables with insert/edit capabilities.
But it's a completely different story if they need additional functionality on top of that.