MS SQL Server Data Anonymise - sql-server

I have this requirement: we are going to restore a production database into UAT to test a few use cases. We need to anonymise sensitive data (e.g. SSN, credit card numbers) for all types of users, including sysadmin and db owner. No one should be able to see the real records.
Also, the application/UI layer is not going to implement any changes (to decrypt or unmask) because of the timeline.
I have tried the options below, but none of them meets my use case:
Dynamic Data Masking - admins can still view the real data
Always Encrypted - it encrypts the records (rather than masking them)
Static masking - this overwrites the actual data, so the real records are lost
We have to anonymise the data for all users without losing the actual records. What is the best option to achieve this?

Related

Import Active Directory to SQL Server

I'm working on a Microsoft BI project.
I am currently in the process of connecting my systems to SQL Server. I want to import my Active Directory into a table in SQL Server and sync that table once per hour, so that the Active Directory details are refreshed every hour.
I understand that SSIS is needed for this. I would appreciate help connecting my AD to SQL Server using SSIS.
There are two routes available to you to sync AD user classes to a table. You can use an ADO.NET source in an SSIS Data Flow Task, or you can write custom .NET code as part of a Script Source. The right answer depends on your team's ability to maintain and troubleshoot a particular solution, as well as the size of your AD tree/forest. If you're a small shop (under a thousand users), anything is going to work. If you're a larger shop, then you need to worry about the query mechanism and the total rows returned, as there is an upper limit on how many results can be returned in a single query. In that case, the script approach likely makes more sense, as you can more easily write a query to pull all the accounts that start with A, then B, etc. I've never worked with Hebrew, so I assume one could do a similar filter for aleph, bet, etc.
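The "accounts that start with A, B, etc." idea can be sketched as a set of LDAP filters, one per leading character, so that each query stays under the result limit. This is only a minimal illustration in Python rather than the .NET code a Script Source would use. The function name and the default prefix list are assumptions, and accounts that start with other characters would need extra prefixes.

```python
import string

# sAMAccountType value for user accounts, taken from the query in this answer.
USER_ACCOUNT_TYPE = 805306368


def build_partitioned_filters(prefixes=string.ascii_lowercase + string.digits):
    """Return one LDAP filter per leading character of sAMAccountName.

    Hypothetical helper: each filter is meant to be run as its own query
    so that no single query returns more rows than the directory allows.
    """
    return [
        f"(&(sAMAccountType={USER_ACCOUNT_TYPE})(sAMAccountName={prefix}*))"
        for prefix in prefixes
    ]
```

Calling `build_partitioned_filters("ab")` yields two filters, one for accounts starting with "a" and one for "b". A single letter could still exceed the limit in a very large directory, in which case longer prefixes would be needed.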
General steps
Identify your domain controller, as you need to know which server to request information from. I do not know how to deal with Azure Active Directory requests; I believe it works a bit differently there, but I haven't had client work that needed it.
Create a Connection Manager for ADO.NET. Use the ".Net Providers for OleDb\OLE DB Provider for Microsoft Directory Services" and point that to your DC.
Write a query to pull back the data you need. Based on the comment, it seems you want something like this:
SELECT
distinguishedName
, mail
, samaccountname
, mobile
, telephoneNumber
, objectSid
, userAccountControl
, title
, sn
FROM
'LDAP://DC=domain,DC=net'
WHERE
sAMAccountType = 805306368
ORDER BY
sAMAccountName ASC
Using that query, we'll add a Data Flow Task and within it, add an ADO.NET Source. Configure it to use our ADO.NET Connection Manager and use the above query (adjusting for the LDAP line and any other fields you do/don't need).
Add an OLE DB Connection Manager to your package and point it to the database that will record the data.
Add an OLE DB Destination to the Data Flow and connect the output line from the ADO.NET Source to this destination. Pick the table in the drop-down list and, on the Columns tab, make sure you have all of your columns connected. You might run into issues where the data types don't match, so you'll need to figure out how to handle that: either change your table definition to match the source, or add Data Conversion/Derived Column components to the data flow to mangle the data into the correct shape.
You might be tempted to pull in group membership. Do not. Make that a separate task, as a person might be a member of many groups (at one client, I am in 94 groups). Also, the MemberOf data type is a DistinguishedName (DN), which SSIS cannot handle. So, check your types before you add them to an AD query.
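The userAccountControl column returned by the query above is a bit mask rather than a plain status code, so it usually needs decoding before it is useful in reports. Here is a minimal sketch in Python of checking the "account disabled" bit. The flag values are the ones Microsoft documents for this attribute, while the function name is just an illustration.

```python
# Bit flags of the userAccountControl attribute.
ACCOUNTDISABLE = 0x0002  # account is disabled
NORMAL_ACCOUNT = 0x0200  # default account type for a typical user


def is_disabled(user_account_control: int) -> bool:
    """Return True when the ACCOUNTDISABLE bit is set in the mask."""
    return bool(user_account_control & ACCOUNTDISABLE)
```

For example, 512 is an enabled normal account and 514 is the same account disabled. The same test can be done in a Derived Column or in T-SQL with a bitwise AND.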
References
ldap query to get disabled user records with whenchanged within 30 days
http://billfellows.blogspot.com/2011/04/active-directory-ssis-data-source.html
http://billfellows.blogspot.com/2013/11/biml-active-directory-ssis-data-source.html
Is there a particular part of the AD that you want? In any but the smallest corporations the AD tends to be huge. Making a SQL copy of an entire forest every hour is a very strange thing that may have many adverse effects on your AD, network, security and domain-wide performance.
If you are just looking to back up your AD, I believe that there are other options available that are specific to Windows AD (maybe even built in; I'm not an AD expert).
If you really, truly want to do this here is a link to get you started: https://social.technet.microsoft.com/Forums/ie/en-US/79bb4879-4d82-4a41-81a4-c62afc6c4b1e/copy-all-ad-objects-to-sql-database?forum=winserverDS. You can find many more articles on this just by Googling "Copy AD to Sql".
However, heed the warnings well: the AD is effectively a multi-domain-wide distributed database, and attempting to copy it into a centralized database like SQL Server every hour is contraindicated. You are really fighting against its design.
UPDATE Based on the Comments:
Basically you've got too much in one question here. SQL Server, SSIS and Active Directory (AD) are each huge subjects in and of themselves, and the first time that you attempt to use all of them together you will run into many individual issues depending on your environment, experience and specific project goals. We cannot anticipate all of them in a single answer on this site.
You need to start using the information you have from the following links to begin to implement this yourself, and then ask specific questions as you run into problems along the way.
Here are the links that you can start with:
The link I provided above from MS: https://social.technet.microsoft.com/Forums/ie/en-US/79bb4879-4d82-4a41-81a4-c62afc6c4b1e/copy-all-ad-objects-to-sql-database?forum=winserverDS
The link that you provided in the comments that explains how to setup ADSI as a linked server and how to use T-SQL on it: https://yiengly.wordpress.com/2018/04/08/query-active-directory-in-sql-server-with-linked-server/
This one explains how to use AD from within an SSIS DataFlow task (but is limited to 1000 rows): https://dataqueen.unlimitedviz.com/2012/05/importing-data-from-active-directory-using-ssis/
This related one explains how to use AD within an SSIS Script task to get around the DataFlow task limits: https://dataqueen.unlimitedviz.com/2012/09/get-around-active-directory-paging-on-ssis-import/
As you work your way through this you may run into specific problems, which you can ask about at https://dba.stackexchange.com, which has more specific expertise with SQL Server and SSIS.
Based on your goals, I think that you will want to use a staging table approach. That is, use your AD/SQL query to import all of the AD user records into a new/empty temporary table that has the same column definition as your production table, then use a MERGE statement to find and update the changed user records and insert the new user records (this is a differential update; overwriting changed rows in place like this is usually called a Type 1 update rather than Type 2, which would keep history rows).
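The staging table approach can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module instead of SQL Server, so it uses an UPDATE followed by an INSERT rather than a T-SQL MERGE statement. The table and column names (`ad_users`, `ad_users_staging`, `sam_account_name`, `mail`, `title`) are assumptions made for the example.

```python
import sqlite3


def create_tables(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) production and staging tables."""
    conn.executescript(
        """
        CREATE TABLE ad_users (
            sam_account_name TEXT PRIMARY KEY, mail TEXT, title TEXT);
        CREATE TABLE ad_users_staging (
            sam_account_name TEXT PRIMARY KEY, mail TEXT, title TEXT);
        """
    )


def merge_staging(conn: sqlite3.Connection) -> None:
    """Apply staging to production: update changed rows, insert new ones."""
    with conn:  # commit both statements together, roll back on error
        # Update rows present in both tables whose values differ.
        # "IS NOT" is SQLite's NULL-safe inequality.
        conn.execute(
            """
            UPDATE ad_users
            SET mail = (SELECT s.mail FROM ad_users_staging s
                        WHERE s.sam_account_name = ad_users.sam_account_name),
                title = (SELECT s.title FROM ad_users_staging s
                         WHERE s.sam_account_name = ad_users.sam_account_name)
            WHERE EXISTS (
                SELECT 1 FROM ad_users_staging s
                WHERE s.sam_account_name = ad_users.sam_account_name
                  AND (s.mail IS NOT ad_users.mail
                       OR s.title IS NOT ad_users.title)
            )
            """
        )
        # Insert rows that exist only in the staging table.
        conn.execute(
            """
            INSERT INTO ad_users (sam_account_name, mail, title)
            SELECT s.sam_account_name, s.mail, s.title
            FROM ad_users_staging s
            WHERE NOT EXISTS (
                SELECT 1 FROM ad_users u
                WHERE u.sam_account_name = s.sam_account_name
            )
            """
        )
```

In the hourly job, the staging table would be emptied and reloaded from AD before each call to `merge_staging`. Rows that disappear from AD are left untouched here and would need a separate rule.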

Azure SQL: How to be notified if someone exports the database?

I run a system based around an Azure SQL Database.
A few different team members need to have read access to this database to perform support and management tasks.
However, I am concerned that by having access to the database, one of them may - with the best of intentions - export the database and manage the backup carelessly, resulting in a data breach.
How can I get Azure to notify me if somebody backs up the database (or downloads more than X million rows, maybe?) These people need to have database access, I would just like to know if they use it in a way that could cause a security risk for the platform.
You can use Extended Events for this.
To set it up on Azure you can follow this tutorial.
For your case:
Create a session.
Select the rpc_completed event and click Configure.
In the Global Fields tab you can select the fields you want to keep track of, e.g. username, sql_text, session_id, database_name, client_*.
In the Filter tab you can select a filter condition. In your case row_count would be appropriate.
A smart malicious user who retrieves small numbers of rows and pages through them will go undetected by this filter. So a second filter could target queries without WHERE clauses, or you could take a different approach based on your case.
When Extended Events are set up to write to Blob Storage, you would have a separate process (Azure Function, Runbook, ...) that inspects the results and alerts you.
Extended Events are mostly used for troubleshooting; they replace SQL Profiler. So turning them on for a production server may have a performance impact.
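The separate inspection process mentioned above could apply logic along these lines. This is only a sketch in Python: it assumes the captured events have already been parsed into dictionaries with `username` and `row_count` keys (hypothetical names), and the thresholds are placeholders. Besides flagging a single large query, it keeps a running total per user, which is one way to catch the paging pattern described above.

```python
from collections import defaultdict


def find_suspicious_users(events, per_query_limit=1_000_000,
                          per_user_limit=1_000_000):
    """Return sorted usernames whose single query or running total of
    returned rows exceeds a limit.

    `events` is an iterable of dicts with 'username' and 'row_count'
    keys (an assumed shape for already-parsed event data).
    """
    totals = defaultdict(int)
    flagged = set()
    for event in events:
        user = event["username"]
        rows = event["row_count"]
        totals[user] += rows
        # Flag one huge query, or many small ones that add up.
        if rows > per_query_limit or totals[user] > per_user_limit:
            flagged.add(user)
    return sorted(flagged)
```

The returned list would then be handed to whatever alerting mechanism is in use, such as e-mail or a chat webhook. In practice the totals would probably be computed per time window rather than over the whole file.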

SQL Server 2008 Column Encryption

I've been trying to figure out a good way to encrypt sensitive columns in my DB. I thought the built-in encryption mechanisms of SQL Server would do the trick but either I'm missing something or doing it wrong.
The original plan was to create a table with columns that were encrypted with a symmetric key, and have a view select the data from the table unencrypted. However, I was unable to figure out how to use the DecryptByKey method in the view select statement. Plus it occurred to me that the data would be unencrypted going TO and FROM the view, so unless the connection was secure then it would sorta be pointless.
Then I thought to bring all the encryption/decryption to my app. I figured that
If the DB was completely unable to decrypt its own data, then someone infiltrating the DB wouldn't be able to do much at all.
It would save the server the effort of trying to decrypt/encrypt the info, as encryption/decryption in the DB could affect performance globally instead of just on a single workstation.
So as it sits, my app has "hard-coded" IVs and Keys for each column that needs to be encrypted. It sends the encrypted info to the DB, and receives encrypted info from the DB. This is just for messing around mind you, I know I have to put the IVs and keys somewhere else...they simply aren't safe in the app code.
I was thinking of this crazy idea:
The client app would contain a single Key and IV. The server would contain the Keys/IVs of all of the encrypted columns in a single table. However, the values of the Keys/IVs would be encrypted with the Key/IV that the client app held.
On startup, the client app would load all the Keys/IVs from the DB into memory, and decrypt them as needed to view the data selected from the server.
There could also be a relation which would join users with keys they were allowed to use. So the app would only decrypt columns that the user was authorized to see.
Do you think the idea is a win or a loss? And how have some of you implemented encryption given a client-app/SQL Server scenario?
You lose. Period. There is no chance to use indices etc.
If you want safety, put it on a safe server, install Enterprise Edition, and use database file encryption.
Consider putting in a middle tier to handle the encryption/decryption for you. Assuming you can put it on the server, you can keep control of the bits and not worry about the client app (which may be somewhat out of your control) being decompiled (and exposing keys).

Is there a free GUI tool for data sync between DB in which it is possible to script rules?

What I need to do is sync some data between 2 databases. The source can be anything (comma-separated file, xls file, any database, ...); the destination is MS SQL Server.
I do not need to sync all data, I just need to sync particular tables.
Example:
I need to sync accounting Software (runs on PostgreSQL) CUSTOMERS table with CRM (runs on SQL Server).
Some problems this tool should be able to face:
1) Accounting software customers table has 1 field that is not mapped in crm customers table. (In this way I want to map this extra field to the field CUSTOMERS_CUSTOM_DATA.EXTRA_FIELD)
2) Having some rules (like sync only customers whose code is between 10000 and 99999)
3) Allowing some post-insert tasks to be performed (for example, I am using manually managed sequences for the table IDs, so after inserting 10 records I need to add 10 to the sequence)
4) Having an exception handling mechanism, so that if something goes wrong it can either call a SQL Server stored procedure (which I already have, and which will send an e-mail to me) or simply send a message to notify me that something went wrong in the nightly sync.
5) Be easy to schedule when to perform data sync (hourly, daily, INCLUDING MANUAL)
6) Perform data conversion: if Surname field in source table is varchar(20) and in destination table is varchar(15) I want to explicitly say "perform a truncation".
7) Have different rules for insert and update. For example, the e-mail field is not present in the source, but I want to populate it in the destination on insert only, not on update. (When I insert a new customer I want to populate the e-mail field by concatenating name and surname, but then let the users modify it. This first insertion is just to simplify data entry; afterwards this field will be handled manually. So I want to say: on insert, populate the e-mail field; on update, don't do anything with it.)
8) In case of delete in the source db don't delete on the destination but only change the varchar(10) STATUS to DELETED.
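To make rules 2, 6, 7 and 8 concrete, here is a minimal sketch of the row-level logic in Python, working on in-memory dictionaries rather than real database connections. The field names, the `ACTIVE` status value and the e-mail format are assumptions made for the example; only the code range, the 15-character truncation and the `DELETED` status come from the list above.

```python
def sync_customers(source_rows, destination):
    """Apply the sync rules to `destination`, a dict keyed by customer code.

    `source_rows` is a list of dicts with 'code', 'name' and 'surname'.
    """
    source_codes = {row["code"] for row in source_rows}
    for row in source_rows:
        code = row["code"]
        if not 10000 <= code <= 99999:  # rule 2: only codes in range
            continue
        surname = row["surname"][:15]  # rule 6: explicit truncation
        if code not in destination:
            # rule 7: the e-mail is populated on insert only
            email = f"{row['name']}.{row['surname']}@example.com".lower()
            destination[code] = {
                "name": row["name"],
                "surname": surname,
                "email": email,
                "status": "ACTIVE",
            }
        else:
            # rule 7: on update, leave the e-mail field untouched
            destination[code].update(name=row["name"], surname=surname)
    # rule 8: rows missing from the source are flagged, not deleted
    for code, record in destination.items():
        if code not in source_codes:
            record["status"] = "DELETED"
    return destination
```

A GUI tool would express the same decisions as mappings and per-operation rules. The sketch is only meant to show what the tool has to support.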
Note: I know that Integration Services will be perfect for this, but I must support the Express Edition, so SSIS is not an option.
I created a bunch of scripts and scheduled stored procedures that at present do what I need, but they are very hard to maintain, and the total lack of a GUI makes the work much slower. I remember having seen Talend some time ago; maybe that tool is the answer I need. Anyway, I need to provide a quick answer to management, so I have no time now to investigate all the tools on the market, and I would prefer to have a suggestion from an expert.
I believe SQL Server Integration Services does all that, and I believe SQL Server Management Studio allows you to create and package your SSIS jobs so that they can be deployed elsewhere.
In the end I went with Talend. I never really used SSIS; I just saw a live demo of it at a SQL Server conference. Anyway, Talend is a free (and quite rich) alternative to SSIS, so it will suit the needs of all customers, including the ones (95%) that have SQL Server Express.

Preparing to move to a single database

We have an application that has 1000+ databases and 600+ sprocs. Each database represents a different client.
Problem: We need to move this to a single database while affecting the UI as little as possible, meaning we don't want to change all the sproc signatures at one time.
The connection string currently sets the database attribute. A proposal is to move that to the user attribute. This attribute (read using SYSTEM_USER) could be used to determine the site identifier, which would be used in the WHERE clause.
The above would not be the final solution, but it allows us to change the sproc signatures at a slow, controlled pace. Once all are done we can correct the connection string and get some connection pooling.
Are there any limitations on the number of logins/users that we can have on SQL Server 2005/2008? Or has anyone been down this path who could shed some light on a better option?
See my answer here
Ideas for Combining Thousand Databases into One Database
Sounds like you two are working on the same project. You will need to change every proc before you can move to one database, or each client will see the others' data.
As for the number of logins on SQL Server 2005 / 2008 - I don't think anyone has ever run into a hard limit here. A few thousand will NOT be any problem at all.
What you could consider for this scenario is one schema per customer inside your single DB, e.g. customer "Miller" has a "miller" schema with its objects inside, and customer "Brown" has a "brown" schema.
And contrary to what HLGEM just responded - no, customers won't see each other's data if you specify proper permissions. Restricting each customer (and its users) to its own schema only should work just fine.
Marc
You might also consider setting a distinctive application name in the connection string rather than using a distinctive user, which you can get into your where clause using APP_NAME(). I'm sure that SQL Server won't have a problem with thousands of logins, but you may prefer not to have to create them.
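The idea of deriving the site identifier from the connection (SYSTEM_USER or APP_NAME()) and using it in the WHERE clause can be sketched like this. The example uses Python's sqlite3 module only to stay self-contained, so the application name is passed in as a parameter instead of being read with APP_NAME() inside a stored procedure. The `site_map` and `orders` tables and their columns are hypothetical.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a lookup table (app name -> site) and a shared data table."""
    conn.executescript(
        """
        CREATE TABLE site_map (app_name TEXT PRIMARY KEY, site_id INTEGER);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             site_id INTEGER, amount REAL);
        """
    )


def fetch_orders_for_app(conn: sqlite3.Connection, app_name: str):
    """Return only the rows that belong to the site mapped to `app_name`."""
    return conn.execute(
        """
        SELECT o.order_id, o.amount
        FROM orders o
        JOIN site_map m ON m.site_id = o.site_id
        WHERE m.app_name = ?
        ORDER BY o.order_id
        """,
        (app_name,),
    ).fetchall()
```

Because the filter lives inside the query, the caller's signature does not change, which is the point of the interim approach. An unknown application name simply returns no rows.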

Resources