VS 2012 - Comparing dates between tables to control progression of package steps - sql-server

I've created an SSIS package that pulls data from various sources and aggregates it as needed for the business. The goal of this processing is to create a single table, for example "Data_Tableau". This table is the datasource for connected Tableau dashboards.
The Tableau dashboards need to be available during the processing, so I don't truncate "Data_Tableau" and re-populate with the SSIS package. Instead, the SSIS package steps create "Data_Stage". Then the final step of the package is a drop/rename, wherein I drop "Data_Tableau" and sp_rename "Data_Stage" to "Data_Tableau".
USE dbname;
DROP TABLE Data_Tableau;
EXEC sp_rename 'Data_Stage', 'Data_Tableau';
Before this final step, I expect max(buydate) from "Data_Stage" to be greater than max(buydate) from "Data_Tableau", since "Data_Stage" would have additional records since the last time the process ran.
However, sometimes there are issues with upstream data and I end up with max(buydate) from "Data_Stage" = max(buydate) from "Data_Tableau". In such cases, I would not want the final drop/rename process to run. Instead, I want the job to fail and I'll send an alert to the appropriate upstream data team when I get the failure notification.
That's the long-winded background. My question is: how do I compare the dates and force a failure within the SSIS package? I'm using VS 2012.
I was thinking of creating a precedence constraint before the final drop/rename step, but I haven't created variables or expressions before and am unsure how to achieve this.
I was also considering creating a 2-row table as follows:
SELECT MAX(buydate) AS MaxDate, 'Tableau' AS FieldType FROM dbname.dbo.Data_Tableau
UNION ALL
SELECT MAX(buydate) AS MaxDate, 'Stage' AS FieldType FROM dbname.dbo.Data_Stage
and then using a query against that table as some sort of constraint, but not sure if that makes any sense and/or is better than the option of creating variables/expressions.
Goal: If MAX(buydate) from "Data_Stage" > MAX(buydate) from "Data_Tableau", then I'd want the drop/rename step to run, otherwise it should fail and "Data_Tableau" will contain the same data as before the package ran.
Suggestions? Step-by-step instructions would be greatly appreciated.

I would do this by putting this:
Then the final step of the package is a drop/rename, wherein I drop
"Data_Tableau" and sp_rename "Data_Stage" to "Data_Tableau".
into a stored procedure that gets called by the SSIS package.
Then it's simply a matter of using an IF block before that part of the code:
-- pseudocode
IF (SELECT MAX(buydate) FROM Data_Stage) > (SELECT MAX(buydate) FROM Data_Tableau)
BEGIN
    DROP TABLE Data_Tableau
    EXEC sp_rename 'Data_Stage', 'Data_Tableau'
END
ELSE
    -- do something else (e.g. raise an error, or nothing at all)
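Fleshed out, that procedure might look something like the following. This is an untested sketch; the procedure name dbo.SwapTableauTable is made up, and RAISERROR with severity 16 is one way to make the Execute SQL Task (and hence the package step) fail:
CREATE PROCEDURE dbo.SwapTableauTable
AS
BEGIN
    IF (SELECT MAX(buydate) FROM dbo.Data_Stage) > (SELECT MAX(buydate) FROM dbo.Data_Tableau)
    BEGIN
        DROP TABLE dbo.Data_Tableau;
        EXEC sp_rename 'Data_Stage', 'Data_Tableau';
    END
    ELSE
    BEGIN
        -- No new data arrived upstream; fail so Data_Tableau keeps its current contents
        RAISERROR('Data_Stage has no new buydate records; swap aborted.', 16, 1);
    END
END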

Related

How to create a task in SSIS, in which the user can change the values of variables, every time he runs the package?

I have created a package in SSIS in which I use some date variables inside my SQL statements (i.e. declare @DateIn = '2018-02-22' and declare @DateTo = '2018-03-22') in order to load the corresponding data into the tables of the data warehouse.
What I need to do is create a task or a separate package that lets me define the values of these variables externally, every time I run it, so I can fill the warehouse tables with the data that corresponds to the dates I set each time.
From what I've read, I should maybe use a Script Task, an Execute SQL Task, or parameters.
Could you help me please? Or could you suggest a good tutorial/link?
I have found plenty but can't decide whether they meet the needs of what I'm describing above.
Thank you
1. Create the DTSX package with variables @DateStart and @DateEnd.
2. Create a table containing 3 columns: DateStart, DateEnd, Active.
3. Create a stored procedure that reads DateStart and DateEnd where Active = 1 from your newly created table, then uses sp_update_jobstep to alter the SQL Server Agent job, updating the dtexec command that carries your variable values into the DTSX package (a sketch follows these steps).
Example of the command:
dtexec /f YourPackage.dtsx
/set \package.variables[DateStart].Value;myvalue
/set \package.variables[DateEnd].Value;myvalue
Add sp_start_job inside the stored procedure to start the job with the new variable values.
4. Create a job with 1 step containing the execution of the stored procedure from Step 3.
All you need to do then is update the values in your table created in Step 2 and execute the job: the stored procedure updates the DTSX exec command and starts it. You can trigger this from a website and control the table values from textboxes.
Also, specific permissions are required: the stored procedure that updates the SQL Agent job needs to be run by a sysadmin.
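A rough sketch of that stored procedure from Step 3 (untested; the control-table, job, and package names are all made up for illustration, and the job step is assumed to be an Operating system (CmdExec) step running dtexec):
CREATE PROCEDURE dbo.RunPackageWithDates
AS
BEGIN
    DECLARE @cmd nvarchar(4000), @ds varchar(10), @de varchar(10);

    -- Read the active date range from the control table (hypothetical name)
    SELECT @ds = CONVERT(varchar(10), DateStart, 120),
           @de = CONVERT(varchar(10), DateEnd, 120)
    FROM dbo.PackageDates
    WHERE Active = 1;

    -- Rebuild the dtexec command with the new values
    SET @cmd = 'dtexec /f "C:\Packages\YourPackage.dtsx"'
             + ' /set \package.variables[DateStart].Value;' + @ds
             + ' /set \package.variables[DateEnd].Value;' + @de;

    -- Point the CmdExec job step at the new command, then start the job
    EXEC msdb.dbo.sp_update_jobstep
         @job_name = N'YourJobName', @step_id = 1, @command = @cmd;

    EXEC msdb.dbo.sp_start_job @job_name = N'YourJobName';
END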
Good question by the way for the new learner!
There are many ways to handle this scenario; I have mentioned a few of them below.
1 - Create variables @DateIn and @DateTo in the variables pane for storing the dates; the data type will be Date.
Now put two entries in an Excel, text, or XML file for these two variables, read the file using a Foreach Loop container, and assign the values to the variables.
2 - Create a SQL table in which you store those values, either manually on a daily basis or by loading the table from an Excel, text, XML, or CSV file. Then query the table in an Execute SQL Task, select the result set, and pass the result-set values to the variables (a sketch follows).
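For option 2, the table and query might look like this (a minimal sketch; the table name dbo.LoadDates is made up). In the Execute SQL Task, set ResultSet to Single row and map the two columns to the @DateIn and @DateTo variables:
CREATE TABLE dbo.LoadDates (DateIn date, DateTo date);
INSERT INTO dbo.LoadDates VALUES ('2018-02-22', '2018-03-22');

-- Query used by the Execute SQL Task (single-row result set)
SELECT TOP 1 DateIn, DateTo FROM dbo.LoadDates;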
I hope it will solve your problem.

Replace NULL columns in live database with data from a SQL Server backup

I recently made a horrible blunder.
While attempting to fix an issue we were having with our Exact Synergy system, I meant to replace the data in two columns with NULL for one account; instead, I replaced those two columns with NULL in ALL accounts. Completely restoring from a backup is not an option, so now I am left trying to figure out how to replace the missing data.
I have made a full restore of a recent backup for this database to a test database and have confirmed that the data I need is there. I am trying to figure out how to properly write a query that will replace the data in the two columns.
Since this is a backup of the same database, the tables and columns are all identically named.
The databases are Synergy and Synergy_TESTDB
The owner of the tables is dbo
The table is called Addresses
The columns are called textfield1 and textfield2
What I would like to do is take the data in textfield1 and textfield2 from the backup database and use it to populate the empty, or NULL, columns in the live database.
I am extremely new to SQL, and would appreciate any help.
This is obviously untested, and I take no responsibility for you using this code.
That said, I'd like to try to help you.
The main point is the three-part database.owner.table naming. I'm assuming you restored the backup to the same server, that you have a primary key on the table, and that Synergy_TESTDB is the restored database:
UPDATE target
SET target.textfield1 = source.textfield1
FROM Synergy.dbo.Addresses target
JOIN Synergy_TESTDB.dbo.Addresses source ON target.PrimaryKeyCol = source.PrimaryKeyCol
WHERE target.textfield1 IS NULL;

UPDATE target
SET target.textfield2 = source.textfield2
FROM Synergy.dbo.Addresses target
JOIN Synergy_TESTDB.dbo.Addresses source ON target.PrimaryKeyCol = source.PrimaryKeyCol
WHERE target.textfield2 IS NULL;
(Sure it could be done in a single update, but I'm trying to keep it simple.)
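For reference, the combined version might look like this (same untested caveat; PrimaryKeyCol is still a placeholder for your real key column):
UPDATE target
SET target.textfield1 = COALESCE(target.textfield1, source.textfield1),
    target.textfield2 = COALESCE(target.textfield2, source.textfield2)
FROM Synergy.dbo.Addresses target
JOIN Synergy_TESTDB.dbo.Addresses source ON target.PrimaryKeyCol = source.PrimaryKeyCol
WHERE target.textfield1 IS NULL OR target.textfield2 IS NULL;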
I strongly suggest you try in another test database first.
A good habit to get in to is to use a pattern like this:
BEGIN TRANSACTION
-- Perform updates
-- Examine the results: select * from dbo.Blah ...
-- If results are wrong, we just rollback anyway
ROLLBACK
-- If results are what you want, uncomment the COMMIT and comment out the ROLLBACK
-- COMMIT TRANSACTION

Executing query from SSDT

I'm using Visual Studio 2015 (SSIS) to run a set of SQL statements in an Execute SQL Task and then transfer data between tables by executing the package. When we run a series of SQL statements in SSMS, we get a "rows affected" message for every successful statement. Now I want to automate the process using SSIS to reduce the turnaround time, and I would like to get the rows affected for every SQL statement (select, insert, delete) in the Execute SQL Task. How can that be done in SSIS? I don't have db_owner permission to create stored procedures, so I'm thinking SSIS would be a quick way. It is very important for me to keep a log of rows affected to validate the data, as it is financial data. I have nearly 10 SQL statements in each SQL task, but the output is only one table.
For example my sql task is like below
select * from dbo.table1;
select * from dbo.table2 where city = 'Chicago';
create table dbo.table3 (id int, name varchar(50));
insert into dbo.table3 values (1, 'a');
select * from dbo.table3;
If I execute this in SSMS, I get rows affected for each statement and the table is created. If I execute the same through a package in SSIS, how will I get messages for each of them?
I assume your data lies on SQL Server. For selects, you could use Data Flow Tasks and row counts instead of Execute SQL Tasks.
For inserts and updates there's a few ways to get affected rowcount, like this: https://stackoverflow.com/a/1834264/5605866
or like this: http://microsoft-ssis.blogspot.fi/2011/03/rowcount-for-execute-sql-statement.html
Basically the same thing but with a bit different syntax.
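The core idea in both links is capturing @@ROWCOUNT immediately after each statement. A minimal sketch, assuming a hypothetical log table dbo.RowCountLog:
-- Hypothetical log table for auditing rows affected
CREATE TABLE dbo.RowCountLog (StatementName varchar(100), RowsAffected int, LoggedAt datetime DEFAULT GETDATE());

DECLARE @rc int;

DELETE FROM dbo.table2 WHERE city = 'Chicago';
SET @rc = @@ROWCOUNT;  -- capture immediately, before any other statement resets it
INSERT INTO dbo.RowCountLog (StatementName, RowsAffected)
VALUES ('delete table2 Chicago', @rc);
Alternatively, end the batch with SELECT @@ROWCOUNT and map the value to an SSIS variable through a single-row result set.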
You can use the Row Count transformation after the data source and save the count to a variable. You can refer to this variable to get the number of rows returned from the source that SHOULD be processed.
Hope this helps.

Strange Issue in SSIS with WITH RESULT SETS returning wrong number of columns

So I have a stored procedure in SQL Server. I've simplified its code (for this question) to just this:
CREATE PROCEDURE dbo.DimensionLookup as
BEGIN
select DimensionID, DimensionField from DimensionTable
inner join Reference on Reference.ID = DimensionTable.ReferenceID
END
In SSIS on SQL Server 2012, I have a Lookup component with the following source command:
EXECUTE dbo.DimensionLookup WITH RESULT SETS (
(DimensionID int, DimensionField nvarchar(700) )
)
When I run this procedure in Preview mode in BIDS, it returns the two columns correctly. When I run the package in BIDS, it runs correctly.
But when I deploy it out to the SSIS catalog (the same server the database is on), point it to the same data sources, etc. - it fails with the message:
EXECUTE statement failed because its WITH RESULT SETS clause specified 2 column(s) for result set number 1, but the statement sent
3 column(s) at run time.
Steps Tried So Far:
Adding a third column to the result set - I get a different error, VS_NEEDSNEWMETADATA - which makes sense, kind of proof there's no third column.
SQL Profiler - I see this:
exec sp_prepare @p1 output,NULL,N'EXECUTE dbo.DimensionLookup WITH RESULT SETS ((
DimensionID int, DimensionField nvarchar(700)))',1
SET FMTONLY ON exec sp_execute 1 SET FMTONLY OFF
So it's trying to use FMTONLY to get the result set data ... needless to say, running SET FMTONLY ON and then running the command in SSMS myself yields .. just the two columns.
SET NOCOUNT ON - Nothing changed.
So, two other interesting things:
I deployed it out to my local SQL 2012 install and it worked fine, same connections, etc. So it may be a server or database configuration issue. Not sure what, if anything, it is; I didn't install the dev server, and my own install was pretty much click-through vanilla.
Perhaps the most interesting thing. If I remove the join from the procedure's statement so it just becomes
select DimensionID, DimensionField from DimensionTable
It goes back to just sending 2 columns in the result set! So adding a join, without adding any additional output columns, ups the result set to 3 columns. Even if I add 6 more joins, just 3 columns. So one guess is its some sort of metadata column that only gets activated when there's a join.
Anyway, as you can imagine, it's driving me kind of mad. I have a workaround to load the data into a temp table and just return that, but why won't this work? What extra column is being sent back? Why only when I add a join?
Gah!
So all credit to billinkc: The reason is because of a patch.
In Version 11.0.2100.60, SSIS Lookup SQL command metadata is gathered using the old SET FMTONLY method. Unfortunately, this doesn't work in 2012, as the Books Online entry on SET FMTONLY helpfully notes:
Do not use this feature. This feature has been replaced by sp_describe_first_result_set.
Too bad they didn't follow their own advice!
This has been patched as of version 11.0.2218.0. Metadata is correctly gathered using the sp_describe_first_result_set system stored procedure.
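If you want to see what metadata the newer method reports, you can call sp_describe_first_result_set directly against the procedure; for the scenario above, that would be something like:
EXEC sp_describe_first_result_set N'EXEC dbo.DimensionLookup', NULL, 0;
Its output shows the column list the server derives for the first result set, which you can compare against your WITH RESULT SETS clause.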
This can happen if the WITH RESULT SETS clause specified in SSIS declares a different number of columns than the stored proc actually returns. Check your stored proc and ensure its output columns match the WITH RESULT SETS definition.

SQL Server 2000: search through out database

Somehow, some records in my table are getting updated with a value of xyz in a certain column. Out of hundreds of stored procedures, functions, and triggers, how can I determine which code is doing this? Is there a way to search through each and every script of code in the database?
Please help.
One approach is to check syscomments
Contains entries for each view, rule,
default, trigger, CHECK constraint,
DEFAULT constraint, and stored
procedure within the database. The
text column contains the original SQL
definition statements.
e.g. select text from syscomments
If you are having trouble finding that literal string, the values could be coming from a table, or they could be being concatenated within a routine.
Try this
Select text from syscomments
where CharIndex('x', text) > 0
and CharIndex('y', text) > 0
and CharIndex('z', text) > 0
That might help you either find the right routine, or further indicate that the values are coming from a table.
This is going to be nearly impossible to do in SQL Server 2000 because the update might very well be from a variable that has that value, or a join to another table that has that value, and not hard-coded into the stored proc, trigger, etc. The update could also be coming from a DTS package, a job, a piece of dynamic code run by the app, or even from Query Analyzer, so the code itself may not be recorded in the database anywhere.
Perhaps a better approach might be to create an audit table for the table in question and have it record the user and the code from the spid that generated the change as well as the old and new values. You'll have to wait until it happens again, but then you would know exactly what changed the value and what value to put it back to if need be.
Alternatively you could run profiler on the system until it happens but profiler tends to hurt performance and is not usually a good idea to run on a production system. If it is happening very often, it might be an acceptable alternative.
Here's a hint as to how you might get some of the info you want for the eventual trigger code you write:
create table #temp (eventtype nvarchar(1000), parameters int, eventinfo nvarchar(4000), myspid int)
declare @myspid int
select @myspid = @@spid
-- capture the last command sent on this connection
insert #temp (eventtype, parameters, eventinfo)
exec ('dbcc inputbuffer (@@spid)')
update #temp
set myspid = @myspid
-- join to sysprocesses to pick up who/what issued it
select hostname, program_name, eventinfo
from #temp t
join sysprocesses s on t.myspid = s.spid
where spid = @myspid
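Putting the audit-table idea together, a rough trigger sketch might look like this (all object and column names are hypothetical and untested; SQL Server 2000 syntax):
create table dbo.MyTable_Audit (
    ChangedAt datetime default getdate(),
    LoginName sysname default suser_sname(),
    KeyCol int,
    OldValue varchar(100),
    NewValue varchar(100)
)
go
create trigger trg_MyTable_Audit on dbo.MyTable for update
as
-- record the old and new values for every row whose column actually changed
insert dbo.MyTable_Audit (KeyCol, OldValue, NewValue)
select d.KeyCol, d.SomeColumn, i.SomeColumn
from deleted d
join inserted i on d.KeyCol = i.KeyCol
where isnull(d.SomeColumn, '') <> isnull(i.SomeColumn, '')
go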
You might use SQL Profiler to trace the update of a given table/column.
