I was experimenting with Dynamic Data Masking on a table. I have encountered a case wherein on running SQL on a table with DDM applied its failing with error
SQL compilation error: Compilation memory exhausted.
CASE 1: Created DDM using MD5 appending a default string(salt value),
create or replace masking policy hash_masking_str1 AS ( val string) RETURNS string ->
CASE
WHEN is_role_in_session('ACCOUNTADMIN') THEN val
ELSE md5(val||'|'||'TEXT')
END
alter table CUSTOMER_TBL modify column C_MKTSEGMENT set masking policy hash_masking_str1
Able to run SQL on table CUSTOMER_TBL using both ACCOUNTADMIN and other roles having select access on the table. DDM working fine.
Case 2: Created DDM using MD5 again. Now pulling default string(salt value) using another table.(Table has one row/one column)
create or replace masking policy hash_masking_str1 AS ( val string) RETURNS string ->
CASE
WHEN is_role_in_session('ACCOUNTADMIN') THEN val
ELSE md5(val||'|'||(select val from mask_values))
END
On running the SQL with ACCOUNTADMIN role I am able to get result from the table.
On using any other role the SQL is instantly failing with the compilation memory exhausted error.
Question: Is there anything wrong calling a select statement in DDM? Or is there any other way to pull the "salt" value from another table and use in MD5?
On the ETL server I have a DW user table.
On the prod OLTP server I have the sales database. I want to pull the sales only for users that are present in the user table on the ETL server.
Presently I am using an execute SQL task to fetch the DW users into a SSIS System.Object variable. Then using a for each loop to loop through each item (userid) in this variable and via a data flow task fetch the OLTP sales table for each user and dump it into the DW staging table. The for each is taking long time to run.
I want to be able to do an inner join so that the response is quicker, but I cant do this since they are on separate servers. Neither can I use a global temp table to make the inner join, for the same reason.
I tried to collect the DW users into a comma separated string variable and then using it (via string_split) to query into OLTP, but this is also taking more time at the pre-execute phase (not sure why exactly) even for small number of users.
I also am aware of lookup transform but that too will result in all oltp rows to be brought into the dw etl server to test the lookup condition.
Is there any alternate approach to be able to do an inner join by taking the list of users into the source?
Note: I do not have write permissions on the OLTP db.
Based on the comments, I think we can use a temporary table to solve this.
Can you help me understand this restriction? "Neither can I use a global temp table to make the inner join, for the same reason."
The restriction is since oltp server and dw server are separate so can't have global temp table common to both servers. Hope makes sense.
The general pattern we're going to do is
Execute SQL Task to create a temporary table on the OLTP server
A Data Flow task to populate the new temporary table. Source = DW. Destination = OLTP. Ensure Delay Validation = True
Modify existing Data Flow. Modify source to be a query that uses the temporary table i.e. SELECT S.* FROM oltp.sales AS S WHERE EXISTS (SELECT * FROM #SalesPerson AS SP WHERE SP.UserId = S.UserId); Ensure Delay Validation = True
A long form answer on using temporary tables (global to set the metadata, regular thereafter)
I don't use temp table in SSIS
Temporary tables, live in tempdb. Your OLTP and DW connection managers likely do not point to tempdb. To be able to reference a temporary table, local or global, in SSIS you need to either define an additional connection manager for the same server that points explicitly at tempdb so you can use the drop down in the source/destination components (technically accurate but dumb). Or, you use an SSIS Variable to hold the name of the table and use the ~From Variable~ named option in source/destination component (best option, maximum flexibility).
Soup to nuts example
I will use WideWorldImporters as my OLTP system and WideWorldImportersDW as my DW system.
One-time task
Open SQL Server Management Studio, SSMS, and connect to your OLTP system. Define a global temporary table with a unique name and the expected structure. Leave your connection open so the table structure remains intact during initial development.
I used the following statement.
DROP TABLE IF EXISTS #SO_70530036;
CREATE TABLE #SO_70530036(EmployeeId int NOT NULL);
Keep track of your query because we'll use it later on but as I advocate in my SSIS answers, perform the smallest task, test that it works and then go on to the next. It's the only way to debug.
Connection Managers
Define two OLE DB Connection Managers. WWI_DW uses points to the named instance DEV2019UTF8 and WWI_OLTP points to DEV2019EXPRESS. Right click on WWI_OLTP and select Properties. Find the property RetainSameConnection and flip that from the default of False to True. This ensures the same connection is used throughout the package. As temporary tables go out of scope when the connection goes away, closing and reopening a connection in a package will result in a fatal error.
These two databases on different instances so we can't cheat and directly comingle data.
Variables
Define 4 variables in SSIS, all of type String.
TempTableName - I used a value of ##SO_70530036 but use whatever value you specified in the One-time task section.
QuerySourceEmployees - This will be the query you run to generate the candidate set of data to go into the temporary table. I used SELECT TOP (3) E.[WWI Employee ID] AS EmployeeId FROM Dimension.Employee AS E WHERE E.[Is SalesPerson] = CAST(1 AS bit);
QueryDefineTables - Remember the drop/create statements from the on-time task? We're going to use the essence of them but use the expression builder to let us dynamically swap the table name. I clicked the ellipses, ..., on the Expression section and used the following "DROP TABLE IF EXISTS " + #[User::TempTableName] + "; CREATE TABLE " + #[User::TempTableName] + "( EmployeeId int NOT NULL);" You should be able to copy the Value from the row and paste it into SSMS to confirm it works.
QuerySales - This is the actual query you're going to use to pull your filtered set of sales data. Again, we'll use the Expression to allow us to dynamically reference the temporary table name. The prettified version of the expression would look something like
"SELECT
SI.InvoiceID
, SI.SalespersonPersonID
, SO.OrderID
, SOL.StockItemID
, SOL.Quantity
, SOL.OrderLineID
FROM
Sales.Invoices AS SI
INNER JOIN
Sales.Orders AS SO
ON SO.OrderID = SI.OrderID
INNER JOIN
Sales.OrderLines AS SOL
ON SO.OrderID = SOL.OrderID
WHERE
EXISTS (SELECT * FROM " + #[User::TempTableName] + " AS TT WHERE TT.EmployeeID = SI.SalespersonPersonID);"
Again, you should be able to pull the Value from the three queries and run them independently and verify they work.
Execute SQL Task
Add an Execute SQL task to the Control Flow. I named mine SQL Create temporary table My Connection Manager is WWI_OLTP and I changed the SQLSourceType to Variable and the SourceVariable is User::QueryDefineTables
Every time your package runs, the first thing it will do is establish create the temporary table. Which is good because SSIS is a metadata driven ETL engine and the next two steps would fail if the table didn't exist.
Data Flow Task - Prime the pump
This data flow is where we'll transfer DW data back to the OLTP system so can filter in the source system.
Drag a Data Flow Task onto the Control Flow. I named mine DFT Load Temp and before you click into it, right click on the Task and find the DelayValidation property and change this from the default of False to True. Normally, a package validates all metadata before actual execution begins as the idea is you want to know everything is good before any data starts moving. Since we're using temporary tables, we need to tell the execution engine "trust us, it'll be ready"
Double click inside the Data Flow Task.
Add an OLE DB Source. I named mine OLESRC SourceEmployees I use the connection manager WWI_DW. My data access mode changes to SQL command from variable and then I select my variable User::QuerySourceEmployees
Add an OLE DB Destination. I named mine OLEDST TempTableName and double clicked to configure it. The Connection Manager is WWI_OLTP and again, since the table lives in tempdb, we can't select it from the drop down. Change the Data access mode to Table name or view name variable - fast load and then select your variable name User::TempTableName. Click the Mapping tab and ensure source columns map to destination columns.
Data Flow Task - Transfer data
Finally, we will pull our source data, nicely filtered against the data from our target system.
Add an OLE DB Source. I named it OLESRC QuerySales. The Connection Manager is WWI_OLTP. Data access mode again changes to SQL command from variable and the variable name is User::QuerySales
From here, do whatever else you need to do to make the magic happen.
Instead of having 270k rows with an unfiltered query
I have 67k as there are only 3 employees in the temporary table.
Reference package
But wait, there's more!
Close out visual studio, open it back up and try to touch something in the data flows. Suddenly, there are red Xs everywhere! Any time you close a data flow component, it fires a revalidate metadata operation and guess what, it can't do that as the connection to the temporary table is gone.
The package will run fine, it will not throw VS_NEEDSNEWMETADATA but editing/maintenance becomes a pain.
If you switched from global temporary table to local, switch the table name variable's value back to a global and then run the define statement in SSMS. Once that's done, then you can continue editing the package.
I assure you, the local temporary table does work once you have the metadata set and you use queries via variables for source/destination.
No need for the global temporary table hack, or the SET FMTONLY OFF hack (which no longer works).
Just specify the result set metadata in the SQL query with WITH RESULT SETS. eg
EXEC ('
create table #t
(
ID INT,
Name VARCHAR(150),
Number VARCHAR(15)
)
insert into #t (Id, Name, Number)
select object_id, name, 12
from sys.objects
select * from #t
')
WITH RESULT SETS
(
(
ID INT,
Name VARCHAR(150),
Number VARCHAR(15)
)
)
If you need to parameterize the query, there's a bit of a catch because there are some limitations in how SSIS discovers parameters. SSIS runs sp_describe_undeclared_parameters, which doesn't really work with batches that call sp_executesql, because sp_executesql has a very unique way it handles parameters, one which you couldn't replicate with a user stored procedure.
So to parameterize the query you'll either need to pass the parameter values into the query using the "query from variable" and SSIS expressions, or push all this TSQL into a stored procedure.
I have a Microsoft SQL Server with data that needs to be protected (certain sensitive columns of some tables) and an application that queries that database like this:
SELECT BoringColumn, SensitiveColumn FROM Table
I have the following restrictions:
I have multiple users (3-4) each with different columns visible or not.
In this example SensitiveColumn should not be accessible.
I can not directly update the queries that the application sends
What did I try already:
I tried to use SQL Servers Dynamic Data Masking feature. However it's not granular enough, you can just turn it on or off per user but not just for some columns. And its can leak data in queries, the link above explains that as well.
I know I can just deny the user SELECT on Table.SensitiveColumn.
However then any existing query asking for the table just breaks with permission errors.
What other options do I have left?
Ideally I would like something that replaces the query on the serverside and executes something like this:
SELECT BoringColumn, 'N/A' as SensitiveColumn FROM Table
I think I found a possible solution:
Change the table structure - Rename the SensitiveColumn to a different name, and add a computed column with the old name of the SensitiveColumn, that will show results based on current_user.
CREATE TABLE tblTest
(
boringColumn int,
SensitiveBase varchar(10), -- no user should have direct access to this column!
SensitiveColumn as
case current_user
when 'Trusted login' then SensitiveBase
else 'N/A'
end
)
The one thing I'm not sure about is if you can deny access to the SensitiveBase column but grant it to the SensitiveColumn.
I'll leave you to test it yourself.
If that can't be done, you can simply grant select permissions on the SensitiveBase column only to trusted login and deny them for everyone else.
We have a requirement to provide customer access to a staging database so that they can extract their data into their own servers, but every table contains all customers data. All of the tables have a 'CustomerID' column. Customers should only see rows where the customerID is the same as theirs.
I am not looking for suggestions to create separate databases or views for each customer as both suggestions are high maintenance and low efficiency.
My solution has to work with:
100GB database
400 Tables
Updates every 30 minutes from the core transaction database
Quarterly schema changes (Application is in continuous Development).
Can anyone give me a definitive answer as to why the following method is not secure or will not work?:
I've set up a database user for each customer, with their customerID as an extended property.
I've created a view of every table that dynamically selects * from the table where the customerID column is the same as the extended property CustomerID of the logged in user. The code looks like this and appears to work well:
CREATE VIEW [CustomerAccessDatabase].[vw_Sales]
AS SELECT * FROM [CustomerAccessDatabase].[Sales]
WHERE [Sales].[CustomerID]=
(SELECT CONVERT(INT,p.value) AS [Value]
FROM sys.extended_properties
JOIN sys.sysusers ON extended_properties.major_id=sysusers.[uid]
AND extended_properties.name = 'CustomerID'
AND sysusers.[SID]=(SELECT suser_sid())
);
GO
To provide access to the views I've created a generic database role 'Customer_Access_Role'. This role has access granted to all of the table views, but access to the database tables themselves is denied.
To prevent users from changing their own customerID I've denied access to the extended properties like so:
USE [master];
GO
DENY EXEC ON sys.sp_addextendedproperty to [public];
GO
DENY EXEC ON sys.sp_dropextendedproperty to [public];
GO
DENY EXEC ON sys.sp_updateextendedproperty to [public];
GO
The end result is that I only need one database, and one set of permissions.
To add a new customer all I would need to do is create a new user with their customerID as an extended attribute and add them to the Customer_Access_Role. Thats it!
I am going to reiterate what everyone is stating already and sum it up.
You are making your job harder than it has to be.
Create a View, that is just their data and then give them Security access to that View.
Alternatively, extract all their data out of the "Core" database and into their own and give them the necessary access to that data.
I have an Access database in which I drop the table and then create the table afresh. However, I need to be able to test for the table in case the table gets dropped but not created (i.e. when someone stops the DTS package just after it starts -roll-eyes- ). If I were doing this in the SQL database I would just do:
IF (EXISTS (SELECT * FROM sysobjects WHERE name = 'Table-Name-to-look-for'))
BEGIN
drop table 'Table-Name-to-look-for'
END
But how do I do that for an Access database?
Optional answer: is there a way to have the DTS package ignore the error and just go to the next step rather than checking to see if it exists?
SQL Server 2000
I'm not sure whether you can query the system objects table in an Access database from a DTS package.
If that doesn't work, why not just try doing a SELECT * from the Access table in question and then catch the error if it fails?
Try the same T-SQL, but in MS ACCESS the sys objects table is called:
MSysObjects.
Try this:
SELECT * FROM MSysObjects WHERE Name = 'your_table';
and see if it works from there.
You can take a look at these tables if you go to Tools -> Options -> View (a tab) -> and check Hidden Objects, System Objects. So you can see both. If you open the table, you should see your table names, queries, etc. Do not change this manually or the DB could panic :)
Martin.
P.D.: Your If Exists should also check of object type:
IF EXISTS (SELECT * FROM sysobjects WHERE id = object_id(N'[dbo].[Your_Table_Name]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)
Microsoft Access has a system table called MSysObjects that contains a list of all database objects, including tables. Table objects have Type 1, 4 and 6.
It is important to reference the type:
... Where Name='TableName' And Type In (1,4,6)
Otherwise, what is returned could be a some object other than a table.