ssis package for checking condition by other table's value

ssis package for checking condition by other table's value - sql-server

I have query like-
select work_Id, startdate,work_per from Work a
where work_per is not null
and StartDate = (select max(StartDate) from Work b where a.work_Id=b.work_Id)
i want to create a package in ssis for this query..
i am face problem in giving condition.in
select max(StartDate) from Work b where a.work_Id=b.work_Id.
here i am checking column a.work_Id's values to b.work_Id.
i really don't know how can i check it in ssis. i tried with condition split but it filter only one table's value only.
and i dont want to use any query in package.. plz give some suggestions..

You can use MERGE JOIN transformation to achieve this. It will have two inputs.
1) select work_Id, startdate,work_per from Work a where work_per is not null
2) select work_Id,max(StartDate) as StartDate from Work b group by b.work_Id
These two inputs will be joined in MERGE JOIN on Work_ID and Start_Date.
Hope this will help..!!
Thanks,
Swapnil

Related

snowflake merge statement using golden gate json as source table

while executing target table in snowflake using json data as source table
merge into cust tgt using (
select parse_json(s.$1):application_num as application num
from prd_json s qualify
row_number() over(partition application
order_by application desc)=1) src
on tgt.application =src.application
when not matched and op_type='I' then
insert(application) values (src.application );
qualify commands ignores all the duplicate data present and gives only unique record but while putting joins its show only less records when compare to normal select statement.
for example :
select distinct application
from prd_json where op_type='I';
--15000 rows are there
while putting joins it shows there is not matching records in target . if it is not matched it should insert all 15000rows but 8500 rows only inserting even though it was not an duplicate record . is there any function available without using "qualify" shall we insert the record. if i ignore qualify am getting dml error duplication. pls guide me if anyone knows.

How about using SELECT DISTINCT?

You demo SQL does not compile. and you using the $1 means it's also hard to guess the names of your columns to know how the ROW_NUMBER is working.
So it's hard to nail down the problem.
But with the following SQL you can replace ROW_NUMBER with DISTINCT
CREATE TABLE cust(application INT);
CREATE OR REPLACE table prd_json as
SELECT parse_json(column1) as application, column2 as op_type
FROM VALUES
('{"application_num":1,"other":1}', 'I'),
('{"application_num":1,"other":2}', 'I'),
('{"application_num":2,"other":3}', 'I'),
('{"application_num":1,"other":1}', 'U')
;
MERGE INTO cust AS tgt
USING (
SELECT DISTINCT
parse_json(s.$1):application_num::int as application,
s.op_type
FROM prd_json AS s
) AS src
ON tgt.application = src.application
WHEN NOT MATCHED AND src.op_type = 'I' THEN
INSERT(application) VALUES (src.application );
number of rows inserted
2
SELECT * FROM cust;
APPLICATION
1
2
running the MERGE code a second time gives:
number of rows inserted
0
Now if truncate CUST and I swap to using this SQL for the inner part:
SELECT --DISTINCT
parse_json(s.$1):application_num::int as application,
s.op_type
FROM prd_json AS s
qualify row_number() over (partition by application order by application desc)=1
I get three rows inserted, because the partition by application, is effectively binding to the s.application not the output application, and there are 3 different "applications" because of the other values.
The reason I wrote my code this way is your
select distinct application
from prd_json where op_type='I';
implies there is something called application already, in the table.. and thus it runs the chance of being used in the ROW_NUMBER statement..
Anyways, there is a large possible problem is you also have "update data" I guess U in your transaction block, that you want to ORDER BY the sub-select so you never have a Inser,Update trying action in Update,Inser order. And assuming you want all update operations if there are many of them.. I will stop. But if you do not have Updates, the sub-select should have the op_type='I' to avoid the non-insert ops making it. Out, or possible worse again, in your ROW_NUMBER pattern replacing the Intserts. Which I suspect is the underlying cause of your problem.

Rows 'not in' and rows 'in' does not equal total rows in table

I still have much to learn in database work, so please be kind.
I am attempting to combine two tables that have similar data, but wanted to be sure that I wasn't duplicating any entries. I decided to use the query below to see how many names were already in the target table
select A.Name
From SourceTable A
where Name NOT IN
(
select B.Name
From [Production].[dbo].[DestinationTable] B
)
This returned 0 rows, so I assumed that every Name was already in the target table. But when I changed the query to
select A.Name
From SourceTable A
where Name IN
(
select B.Name
From [Production].[dbo].[DestinationTable] B
)
I got back about half of the total rows in the source table. How can these two totals not add up to the total number of rows in the source table? I assumed duplicate names, but the numbers still don't add up. What could I be missing here?

Kamil's answer is a good explanation of what's going on with IN and NOT IN. But a better way to see if your destination table is missing any names from the source table would be to use a LEFT JOIN and check for NULL.
The query would look like this:
SELECT A.Name
FROM SourceTable A
LEFT JOIN [Production].[dbo].[DestinationTable] B ON A.Name = B.Name
WHERE B.Name IS NULL
This would return all names from your source that aren't in your destination.

The reason you are not getting the total row count from both queries combined is because you have NULL values in your DestinationTable.
Generally you are ommitting checking for null values and this is the reason. You could add OR name is null to see it.
Check it using
select count(*) from destinationtable where name is null
Alternatively you could perform a CROSS JOIN and see for yourself where the data doesn't match and inspect why

sql query to find all the updated column of a table

I need a dynamic sql query that can get a column values of a table on the basis of any case/condition. I want to do that while update any record. And I need updated value of column and for that I am using Inserted and Deleted tables of SQL Server.
I made one query which is working fine with one column but I need a generic query that should work for all of the columns.
SELECT i.name
FROM inserted i
INNER JOIN deleted d ON i.id = d.id
WHERE d.name <> i.name
With the help of above query we can get "Name" column value if it's updated. But as it's specific to one column same thing I want for all the columns of a table in which there should be no need to define any column name it should be generic/dynamic query.
I am trying to achieve that by adding one more inner join with PIVOT of "INFORMATION_SCHEMA.COLUMNS" for columns but I am not sure about it whether we can do that or not by this.
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME LIKE '%table1%'

You can't get that kind of information using just a query. It sounds like you need to be running an update trigger. In there you can code the criteria get your columns, etc. From your question, it sounds like only one column can be updated, you are just not sure which one it will be. If multiple columns are being updated, that complicates things, but not that much
There are several ways that you can get the data you need. Off the top, I'd suggest some simple looping once you get the column names.
What has worked for me in similar scenarios is to do a simple query to INFORMAATION_SCHEMA.Columns and put the results in a table variable. From there, iterate through the table variable to get one column name at a time and run a simple dynamic query to see if the value has changed.
Hope this helps.
:{)

SQL Server : group all data by one column

I need some help in writing a SQL Server stored procedure. All data group by Train_B_N.
my table data
Expected result :
expecting output
with CTE as
(
select Train_B_N, Duration,Date,Trainer,Train_code,Training_Program
from Train_M
group by Train_B_N
)
select
*
from Train_M as m
join CTE as c on c.Train_B_N = m.Train_B_N
whats wrong with my query?

The GROUP BY smashes the table together, so having columns that are not GROUPED combine would cause problems with the data.
select Train_B_N, Duration,Date,Trainer,Train_code,Training_Program
from Train_M
group by Train_B_N
By ANSI standard, the GROUP BY must include all columns that are in the SELECT statement which are not in an aggregate function. No exceptions.
WITH CTE AS (SELECT TRAIN_B_N, MAX(DATE) AS Last_Date
FROM TRAIN_M
GROUP BY TRAIN_B_N)
SELECT A.Train_B_N, Duration, Date,Trainer,Train_code,Training_Program
FROM TRAIN_M AS A
INNER JOIN CTE ON CTE.Train_B_N = A.Train_B_N
AND CTE.Last_Date = A.Date
This example would return the last training program, trainer, train_code used by that ID.
This is accomplished from MAX(DATE) aggregate function, which kept the greatest (latest) DATE in the table. And since the GROUP BY smashed the rows to their distinct groupings, the JOIN only returns a subset of the table's results.
Keep in mind that SQL will return #table_rows X #Matching_rows, and if your #Matching_rows cardinality is greater than one, you will get extra rows.
Look up GROUP BY - MSDN. I suggest you read everything outside the syntax examples initially and obsorb what the purpose of the clause is.
Also, next time, try googling your problem like this: 'GROUP BY, SQL' or insert the error code given by your IDE (SSMS or otherwise). You need to understand why things work...and SO is here to help, not be your google search engine. ;)
Hope you find this begins your interest in learning all about SQL. :D

Select one column multiple time

I want to select one column two time from a table.
E.g
( Select rent as rent1, rent as rent2 From Expense)
But I don't know how I can Select this column multiple time as each has its own Where Clause.
Means I want to select One Column two time On two different condition.

Use a union
select test, 'rent1' from tableA where condA
union
select test, 'rent2' from tableA where condB

If you would want to have both values in one result line (can't think of a reason why you would want this, but anyway...) you could use something like a self-join:
select a.rent as rent1, b.rent as rent2
from Expense a,
Expense b
where a.condition
and b.condition
and a/b-join-condition

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

ssis package for checking condition by other table's value - sql-server

Related

snowflake merge statement using golden gate json as source table

Rows 'not in' and rows 'in' does not equal total rows in table

sql query to find all the updated column of a table

SQL Server : group all data by one column

Select one column multiple time

Categories

Resources