Merge data from two tables horizontally

Merge data from two tables horizontally - sql-server

I have data coming from two source table "T" & "S" as below
I want to merge the data horizontally from two tables and show the result as below
Column name in Result are fixed where
D1,D2,D3 are date column that shows last 3 max(TDATE) after merging data from table T & S for each group "Code"
TID1,TID2,TID3 are ID
from table T for Date D1,D2,D3
SID1,SID2,SID3 are ID from table S for
Date D1,D2,D3
My below query is returning result
WITH L1 AS
(
SELECT NULL AS TID, SID, CODE, TDATE
FROM (VALUES
('S1','A','2001-01-01'),
('S3','A','2001-01-03'),
('S5','B','2001-01-05')
) V(SID,CODE,TDATE)
UNION ALL
SELECT TID, NULL AS SID, CODE, TDATE
FROM (VALUES
('T1','A','2001-01-01'),
('T2','A','2001-01-02'),
('T5','B','2001-01-05')
) V(TID,CODE,TDATE)
)
,L2 AS
(
SELECT SID,TID,CODE,[TDATE],
DENSE_RANK () over (
partition by CODE
order by [TDATE] desc
) RNO
FROM L1
)
,L3 AS
(
SELECT MAX(TID) AS TID,MAX(SID) AS SID,CODE,TDATE,MAX(RNO)AS RNO
FROM L2
GROUP BY CODE,TDATE,RNO
HAVING MAX(RNO)<=3
)
SELECT * FROM L3
Now if I replace the last line of the query "Select * from L3" with the below query that pivot the result horizontally I am getting the correct result.
SELECT
R1.CODE
,R1.TID AS TID1,R2.TID AS TID2,R3.TID AS TID3
,R1.SID AS SID1,R2.SID AS SID2,R3.SID AS SID3
,R1.TDATE AS D1 , R2.TDATE AS D2 , R3.TDATE AS D3
FROM L3 R1
LEFT OUTER JOIN L3 R2 ON R1.RNO+1=R2.RNO AND R1.CODE=R2.CODE
LEFT OUTER JOIN L3 R3 ON R1.RNO+2=R3.RNO AND R1.CODE=R3.CODE
where R1.RNO=1
The problem is the above query is very slow when I have to run it with thousands of record in Table T & S with the last 5 dates in column D1,D2,D3,D4,D5
Is there any other way to optimise this query or a new query that can run fast?
I tried to add indexes on table T & S as below
The primary Key On T(TID) & S(SID)
Index on T(Code,TDATE)
Index on T(TDATE)
Index on S(Code,TDATE)
Index on S(TDATE)
Total rows in Table T & S are approx 50K each and the query is still running after 6 mins
Here is a link for an execution plan for the actual query that is very slow
Execution Plan of actual slow query

I think you are killing performance by doing the joins in your last select.
What about trying a pivot operation instead? Replace the select with:
SELECT
CODE
,Max(IIF(rno = 1, TID, NULL)) Tid1
,Max(IIF(rno = 2, TID, NULL)) Tid2
,Max(IIF(rno = 3, TID, NULL)) Tid3
,Max(IIF(rno = 1, SID, NULL)) Sid1
,Max(IIF(rno = 2, SID, NULL)) Sid2
,Max(IIF(rno = 3, SID, NULL)) Sid3
,Max(IIF(rno = 1, Tdate, NULL)) D1
,Max(IIF(rno = 2, Tdate, NULL)) D2
,Max(IIF(rno = 3, Tdate, NULL)) D3
FROM L3
GROUP BY Code

Related

TSQL Compare tables based on multiple rows same column?

i will try and be as clear as possible on this one, as i have no idea what to do next and would love a kick in the right direction.
Im trying to compare the values within 2 tables. The tables look like this:
Table1:
Table2
INSERT INTO #table1 ([elementName], [elementValue])
VALUES
('t1','Project'),
('p1','test1'),
('n1','value1'),
('t2','Project'),
('p2','test2'),
('n2','value2'),
('t3','Project'),
('p3','test3'),
('n3','value3'),
('t4',''),
('p4',''),
('n4',''),
('t5',''),
('p5',''),
('n5','')
INSERT INTO #table2 ([elementName], [elementValue])
VALUES
('t1','Project'),
('p1',''),
('n1',''),
('t2','Project'),
('p2','test3'),
('n2','value123'),
('t3','Project'),
('p3',''),
('n3',''),
('t4','Package'),
('p4',''),
('n4',''),
('t5','Project'),
('p5','Testtest'),
('n5','valuevalue')
I used this code to fill the testtables. Normally this is an automated process, and the tables are filled from an XML string.
Furthermore, the numbers in the element name are considered "groups" meaning T1 P1 and N1 are together.
I would like to compare T1 and P1 etc from Table1 to any combination of T and P from table2
If they match, i would like to overwrite the value of Table 1 N1 with the value of the matched N on table 2. (in the example, table1 N3 would be replaced with table2 N2
Besides that i also want to keep every group in table 1 that is not in table 2
but also add every group that is in table 2 but not in table 1 on one of the blank spots.
Last but not least, if the T value is filled, but P value is empty, it does not have to overwrite/change anything in table1.
The expected result would be this:
Table1:
i made the changes bold.
I dont really have an idea on where to start on this. Ive tried functions as except and intersect, but did not get even close to what i would like to see.

with t1 as (
select * from (values
('t1','Project'),
('p1','test1'),
('n1','value1'),
('t2','Project'),
('p2','test2'),
('n2','value2'),
('t3','Project'),
('p3','test3'),
('n3','value3'),
('t4',''),
('p4',''),
('n4',''),
('t5',''),
('p5',''),
('n5','')
) v([elementName], [elementValue])
),
t2 as (
select * from (values
('t1','Project'),
('p1',''),
('n1',''),
('t2','Project'),
('p2','test3'),
('n2','value123'),
('t3','Project'),
('p3',''),
('n3',''),
('t4','Package'),
('p4',''),
('n4',''),
('t5','Project'),
('p5','Testtest'),
('n5','valuevalue')
) v([elementName], [elementValue])
),
pivoted_t1 as (
select *
from
(select left([elementName], 1) letter, right([elementName], len([elementName]) - 1) number, [elementValue] as value from t1) t1
pivot(min(value) for letter in ([t], [p], [n])) pvt1
),
pivoted_t2 as (
select *
from
(select left([elementName], 1) letter, right([elementName], len([elementName]) - 1) number, [elementValue] as value from t2) t2
pivot(min(value) for letter in ([t], [p], [n])) pvt2
),
amended_values as (
select
pvt1.number,
coalesce(pvt2.t, pvt1.t) as t,
coalesce(pvt2.p, pvt1.p) as p,
coalesce(pvt2.n, pvt1.n) as n,
count(case when pvt1.t = '' and pvt1.p = '' then 1 end) over(order by pvt1.number rows between unbounded preceding and current row) as empty_row_number
from
pivoted_t1 pvt1
left join pivoted_t2 pvt2 on pvt1.t = pvt2.t and pvt1.p = pvt2.p and pvt1.t <> '' and pvt1.p <> ''
),
added_new_values as (
select
a.number,
coalesce(n.t, a.t) as t,
coalesce(n.p, a.p) as p,
coalesce(n.n, a.n) as n
from
amended_values a
left join (
select number, t, p, n, row_number() over (order by number) as row_number
from pivoted_t2 t2
where
t2.t <> ''
and t2.p <> ''
and not exists (select * from pivoted_t1 t1 where t1.t = t2.t and t1.p = t2.p)
) n on n.row_number = a.empty_row_number
)
select
concat([elementName], number) as [elementName],
[elementValue]
from
added_new_values
unpivot ([elementValue] for [elementName] in ([t], [p], [n])) upvt
;

SQL Query Get Last record Group by multiple fields

Hi I have a table with following fields:
ALERTID POLY_CODE ALERT_DATETIME ALERT_TYPE
I need to query above table for records in the last 24 hour.
Then group by POLY_CODE and ALERT_TYPE and get the latest Alert_Level value ordered by ALERT_DATETIME
I can get up to this, but I need the AlertID of the resulting records.
Any suggestions what would be an efficient way of getting this ?
I have created an SQL in SQL Server. See below
SELECT POLY_CODE, ALERT_TYPE, X.ALERT_LEVEL AS LAST_ALERT_LEVEL
FROM
(SELECT * FROM TableA where ALERT_DATETIME >= GETDATE() -1) T1
OUTER APPLY (SELECT TOP 1 [ALERT_LEVEL]
FROM (SELECT * FROM TableA where ALERT_DATETIME >= GETDATE() -1) T2
WHERE T2.POLY_CODE = T1.POLY_CODE AND
T2.ALERT_TYPE = T1.ALERT_TYPE ORDER BY T2.[ALERT_DATETIME] DESC) X
GROUP BY POLY_CODE, ALERT_TYPE, X.[ALERT_LEVEL]
POLY_CODE ALERT_TYPE ALERT_LEVEL
04575 Elec 2
04737 Gas 3
06239 Elec 2
06552 Elec 2
06578 Elec 2
10320 Elec 2

select top 1 with ties *
from TableA
where ALERT_DATETIME >= GETDATE() -1
order by row_number() over (partition by POLY_CODE,ALERT_TYPE order by [ALERT_DATETIME] DESC)
The way this works is that for each group of POLY_CODE,ALERT_TYPE get their own row_number() starting from the most recent alert_datetime. Then, the with ties clause ensures that all rows(= all groups) with the row_number value of 1 get returned.

One way of doing it is creating a cte with the grouping that calculates the latesdatetime for each and then crosses it with the table to get the results. Just keep in mind that if there are more than one record with the same combination of poly_code, alert_type, alert_level and datetime they will all show.
WITH list AS (
SELECT ta.poly_code,ta.alert_type,MAX(ta.alert_datetime) AS LatestDatetime,
ta.alert_level
FROM dbo.TableA AS ta
WHERE ta.alert_datetime >= DATEADD(DAY,-1,GETDATE())
GROUP BY ta.poly_code, ta.alert_type,ta.alert_level
)
SELECT ta.*
FROM list AS l
INNER JOIN dbo.TableA AS ta ON ta.alert_level = l.alert_level AND ta.alert_type = l.alert_type AND ta.poly_code = l.poly_code AND ta.alert_datetime = l.LatestDatetime

How to paginate Stack Exchange Data Explorer (SEDE) results?

Using data explorer to create queries:
SELECT P.id, creationdate,tags,owneruserid,answercount
--SELECT DISTINCT TAGNAME ,TAGID
FROM TAGS AS T
JOIN POSTTAGS AS PT
ON T.ID = PT.TAGID
JOIN POSTS AS P
ON PT.POSTID = P.ID
--WHERE CAST(P.TAGS AS VARCHAR) IN('JAVA')
WHERE PT.TAGID = 3143
How is it possible to add pagination in the query in order to take not only the first 50,000 results, but then run the query again to take the next remaining results?

There are a few ways to "page" through TSQL results; see:
How to return a page of results from SQL?
and
SQL performance: WHERE vs WHERE(ROW_NUMBER)
Here I will use the CTE method as:
It uses convenient row numbers to page through results, rather than trying to track less predictable factors such as creationdate.
It reportedly performs faster than the OFFSET method.
So, that question's query becomes this SEDE query:
-- StartRow: Starting row for paging
-- EndRow: Ending row for paging (Max 50K rows at a time)
WITH allData AS (
SELECT
ROW_NUMBER() OVER (ORDER BY P.creationdate) AS row
, P.id
, P.creationdate
, P.tags
, P.owneruserid
, P.answercount
FROM Posttags AS PT
JOIN Posts AS P ON PT.postid = P.id
WHERE PT.tagid = 3143 -- tag [scala]
)
SELECT *
FROM allData
WHERE row >= ##StartRow:INT?1##
AND row <= ##EndRow:INT?50000##
ORDER BY row

How to combine fields from 2 columns to create a "matrix"?

I have a logging table in my application that only logs changed data, and leaves the other columns NULL. What I'm wanting to do now is create a view that takes 2 of those columns (Type and Status),
and create a resultset that returns the Type and Status on the entry of that log row, assuming that either one or both columns could be null.
For example, with this data:
Type Status AddDt
A 1 7/8/2013
NULL 2 7/7/2013
NULL 3 7/6/2013
NULL NULL 7/5/2013
B NULL 7/4/2013
C NULL 7/3/2013
C 4 7/2/2013
produce the resultset:
Type Status AddDt
A 1 7/8/2013
A 2 7/7/2013
A 3 7/6/2013
A 3 7/5/2013
B 3 7/4/2013
C 3 7/3/2013
C 4 7/2/2013
From there I'm going to figure out the first time in these results the Type and Status meet certain conditions, such as a Type of B and Status 3 (7/4/2013) and ultimately use that date in a calculation, so performance is a huge issue with this.
Here's what I was thinking so far, but it doesn't get me where I need to be:
SELECT
Type.TypeDesc
, Status.StatusDesc
, *
FROM
jw50_Item c
OUTER APPLY (SELECT TOP 10000 * FROM jw50_ItemLog csh WHERE csh.ItemID = c.ItemID AND csh.StatusCode = 'OPN' ORDER BY csh.AddDt DESC) [Status]
OUTER APPLY (SELECT TOP 10000 * FROM jw50_ItemLog cth WHERE cth.ItemID = c.ItemID AND cth.ItemTypeCode IN ('F','FG','NG','PF','SXA','AB') ORDER BY cth.AddDt DESC) Type
WHERE
c.ItemID = #ItemID
So with the help provided below, I was able to get where I needed. Here is my final solution:
SELECT
OrderID
, CustomerNum
, OrderTitle
, ItemTypeDesc
, ItemTypeCode
, StatusCode
, OrdertatusDesc
FROM
jw50_Order c1
OUTER APPLY (SELECT TOP 1 [DateTime] FROM
(SELECT c.ItemTypeCode, c.OrderStatusCode, c.OrderStatusDt as [DateTime] FROM jw50_Order c WHERE c.OrderID = c1.OrderID
UNION
select (select top 1 c2.ItemTypeCode
from jw50_OrderLog c2
where c2.UpdatedDt >= c.UpdatedDt and c2.ItemTypeCode is not null and c2.OrderID = c.OrderID
order by UpdatedDt DESC
) as type,
(select top 1 c2.StatusCode
from jw50_OrderLog c2
where c2.UpdatedDt >= c.UpdatedDt and c2.StatusCode is not null and c2.OrderID = c.OrderID
order by UpdatedDt DESC
) as status,
UpdatedDt as [DateTime]
from jw50_OrderLog c
where c.OrderID = c1.OrderID AND (c.StatusCode IS NOT NULL OR c.ItemTypeCode IS NOT NULL)
) t
WHERE t.ItemTypeCode IN ('F','FG','NG','PF','SXA','AB') AND t.StatusCode IN ('OPEN')
order by [DateTime]) quart
WHERE quart.DateTime <= #FiscalPeriod2 AND c1.StatusCode = 'OPEN'
Order By c1.OrderID
The union is to bring in the current data in addition to the log table data to create the resultset, since the current data maybe what meets the conditions required. Thanks again for the help guys.

Here is an approach that uses correlated subqueries:
select (select top 1 c2.type
from jw50_Item c2
where c2.AddDt >= c.AddDt and c2.type is not null
order by AddDt
) as type,
(select top 1 c2.status
from jw50_Item c2
where c2.AddDt >= c.AddDt and c2.status is not null
order by AddDt
) as status,
(select AddDt
from jw50_Item c
If you have indexes on jw50_item(AddDt, type) and jw50_item(AddDt, status), then the performance should be pretty good.

I suppose you want to "generate a history": for those dates that has some data missing, the next available data should be set.
Something similar should work:
Select i.AddDt, t.Type, s.Status
from Items i
join Items t on (t.addDt =
(select min(t1.addDt)
from Items t1
where t1.addDt >= i.addDt
and t1.Type is not null))
join Items s on (s.addDt =
(select min(s1.addDt)
from Items s1
where s1.addDt >= i.addDt
and s1.status is not null))
Actually I'm joining the base table to 2 secondary tables and the join condition is that we match the smallest row where the respective column in the secondary table is not null (and of course smaller than the current date).
I'm not absolutely sure that it will work, since I don't have an SQL Server in front of me but give it a try :)

Join the table valued function in the query

I have one table vwuser. I want join this table with the table valued function fnuserrank(userID). So I need to cross apply with table valued function:
SELECT *
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
For each userID it generates multiple records. I only want the last record for each empid that does not have a Rank of Term(inated). How can I do this?
Data:
HistoryID empid Rank MonitorDate
1 A1 E1 2012-8-9
2 A1 E2 2012-9-12
3 A1 Term 2012-10-13
4 A2 E3 2011-10-09
5 A2 TERM 2012-11-9
From this 2nd record and 4th record must be selected.

In SQL Server 2005+ you can use this Common Table Expression (CTE) to determine the latest record by MonitorDate that doesn't have a Rank of 'Term':
WITH EmployeeData AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY empId, ORDER BY MonitorDate DESC) AS RowNumber
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
WHERE Rank != 'Term'
)
SELECT *
FROM EmployeeData AS ed
WHERE ed.RowNumber = 1;
Note: The statement before this CTE will need to end in a semi-colon. Because of this, I have seen many people write them like ;WITH EmployeeData AS...

You'll have to play with this. Having trouble mocking your schema on sqlfiddle.
Select bar.*
from
(
SELECT *
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
where rank != 'TERM'
) foo
left join
(
SELECT *
FROM vwuser AS b
CROSS APPLY fnuserrank(b.userid)
where rank != 'TERM'
) bar
on foo.empId = bar.empId
and foo.MonitorDate > bar.MonitorDate
where bar.empid is null
I always need to test out left outers on dates being higher. The way it works is you do a left outer. Every row EXCEPT one per user has row(s) with a higher monitor date. That one row is the one you want. I usually use an example from my code, but i'm on the wrong laptop. to get it working you can select foo., bar. and look at the results and spot the row you want and make the condition correct.
You could also do this, which is easier to remember
SELECT *
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
) foo
join
(
select empid, max(monitordate) maxdate
FROM vwuser AS b
CROSS APPLY fnuserrank(b.userid)
where rank != 'TERM'
) bar
on foo.empid = bar.empid
and foo.monitordate = bar.maxdate
I usually prefer to use set based logic over aggregate functions, but whatever works. You can tweak it also by caching the results of your TVF join into a table variable.
EDIT:
http://www.sqlfiddle.com/#!3/613e4/17 - I mocked up your TVF here. Apparently sqlfiddle didn't like "go".
select foo.*, bar.*
from
(
SELECT f.*
FROM vwuser AS a
join fnuserrank f
on a.empid = f.empid
where rank != 'TERM'
) foo
left join
(
SELECT f1.empid [barempid], f1.monitordate [barmonitordate]
FROM vwuser AS b
join fnuserrank f1
on b.empid = f1.empid
where rank != 'TERM'
) bar
on foo.empId = bar.barempid
and foo.MonitorDate > bar.barmonitordate
where bar.barempid is null

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Merge data from two tables horizontally - sql-server

Related

TSQL Compare tables based on multiple rows same column?

SQL Query Get Last record Group by multiple fields

How to paginate Stack Exchange Data Explorer (SEDE) results?

How to combine fields from 2 columns to create a "matrix"?

Join the table valued function in the query

Categories

Resources