The suggested answer, in this post, works great for two columns.
I have about 50 different date columns, where I need to be able to report on the most recent interaction, regardless of table.
In this case, I am bringing the columns in to a view, since they are coming from different tables in two different databases...
CREATE VIEW vMyView
AS
SELECT
comp_name AS Customer
, Comp_UpdatedDate AS Last_Change
, CmLi_UpdatedDate AS Last_Communication
, Case_UpdatedDate AS Last_Case
, AdLi_UpdatedDate AS Address_Change
FROM Company
LEFT JOIN Comm_Link on Comp_CompanyId = CmLi_Comm_CompanyId
LEFT JOIN Cases ON Comp_CompanyId = Case_PrimaryCompanyId
LEFT JOIN Address_Link on Comp_CompanyId = AdLi_CompanyID
...
My question is, how I would easily account for the many possibilities of one column being greater than the others?
Using only the two first columns, as per the example above, works great. But considering that one row could have column 3 as the highest value, another row could have column 14 etc...
SELECT Customer, MAX(CASE WHEN (Last_Change IS NULL OR Last_Communication> Last_Change)
THEN Last_Communication ELSE Last_Change
END) AS MaxDate
FROM vMyView
GROUP BY Customer
So, how can I easily grab the highest value for each row in any of the 50(ish) columns?
I am using SQL Server 2008 R2, but I also need this to work in versions 2012 and 2014.
Any help would be greatly appreciated.
EDIT:
I just discovered that the second database is storing the dates in NUMERIC fields, rather than DATETIME. (Stupid! I know!)
So I get the error:
The type of column "ARCUS" conflicts with the type of other columns specified in the UNPIVOT list.
I tried to resolve this with a CAST to make it DATETIME, but that only resulted in more errors.
;WITH X AS
(
SELECT Customer
,Value [Date]
,ColumnName [Entity]
,BusinessEmail
,ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Value DESC) rn
FROM (
SELECT comp_name AS Customer
, Pers_EmailAddress AS BusinessEmail
, Comp_UpdatedDate AS Company
, CmLi_UpdatedDate AS Communication
, Case_UpdatedDate AS [Case]
, AdLi_UpdatedDate AS [Address]
, PLink_UpdatedDate AS Phone
, ELink_UpdatedDate AS Email
, Pers_UpdatedDate AS Person
, oppo_updateddate as Opportunity
, samdat.dbo.ARCUS.AUDTDATE AS ARCUS
FROM vCompanyPE
LEFT JOIN Comm_Link on Comp_CompanyId = CmLi_Comm_CompanyId
LEFT JOIN Cases ON Comp_CompanyId = Case_PrimaryCompanyId
LEFT JOIN Address_Link on Comp_CompanyId = AdLi_CompanyID
LEFT JOIN PhoneLink on Comp_CompanyId = PLink_RecordID
LEFT JOIN EmailLink on Comp_CompanyId = ELink_RecordID
LEFT JOIN vPersonPE on Comp_CompanyId = Pers_CompanyId
LEFT JOIN Opportunity on Comp_CompanyId = Oppo_PrimaryCompanyId
LEFT JOIN Orders on Oppo_OpportunityId = Orde_opportunityid
LEFT JOIN SAMDAT.DBO.ARCUS on IDCUST = Comp_IdCust
COLLATE Latin1_General_CI_AS
WHERE Comp_IdCust IS NOT NULL
AND Comp_deleted IS NULL
) t
UNPIVOT (Value FOR ColumnName IN
(
Company
,Communication
,[Case]
,[Address]
,Phone
,Email
,Person
,Opportunity
,ARCUS
)
)up
)
SELECT Customer
, BusinessEmail
,[Date]
,[Entity]
FROM X
WHERE rn = 1 AND [DATE] >= DATEADD(year,-2,GETDATE()) and BusinessEmail is not null
You could use CROSS APPLY to manually pivot your fields, then use MAX()
SELECT
vMyView.*,
greatest.val
FROM
vMyView
CROSS APPLY
(
SELECT
MAX(val) AS val
FROM
(
SELECT vMyView.field01 AS val
UNION ALL SELECT vMyView.field02 AS val
...
UNION ALL SELECT vMyView.field50 AS val
)
AS manual_pivot
)
AS greatest
The inner most query will pivot each field in to a new row, then the MAX() re-aggregate them back in to a single row. (Also skipping NULLs, so you don't need to explicitly cater for them.)
;WITH X AS
(
SELECT Customer
,Value [Date]
,ColumnName [CommunicationType]
,ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Value DESC) rn
FROM (
SELECT comp_name AS Customer
, Comp_UpdatedDate AS Last_Change
, CmLi_UpdatedDate AS Last_Communication
, Case_UpdatedDate AS Last_Case
, AdLi_UpdatedDate AS Address_Change
FROM Company
LEFT JOIN Comm_Link on Comp_CompanyId = CmLi_Comm_CompanyId
LEFT JOIN Cases ON Comp_CompanyId = Case_PrimaryCompanyId
LEFT JOIN Address_Link on Comp_CompanyId = AdLi_CompanyID
) t
UNPIVOT (Value FOR ColumnName IN (Last_Change,Last_Communication,
Last_Case,Address_Change))up
)
SELECT Customer
,[Date]
,[CommunicationType]
FROM X
WHERE rn = 1
Related
I have following result set,
Now with above results i want to print the records via select query as below attached image
Please note, I will have only two types of columns in output Present Employee & Absent Employees.
I tried using pivot tables, temporary table but cant achieve what I want.
One method would be to ROW_NUMBER each the the "statuses" and then use a FULL OUTER JOIN to get the 2 datasets into the appropriate columns. I use a FULL OUTER JOIN as I assume you could have a different amount of employees who were present/absent.
CREATE TABLE dbo.YourTable (Name varchar(10), --Using a name that doesn't require delimit identification
Status varchar(7), --Using a name that doesn't require delimit identification
Days int);
GO
INSERT INTO dbo.YourTable(Name, Status, Days)
VALUES('Mal','Present',30),
('Jess','Present',20),
('Rick','Absent',30),
('Jerry','Absent',10);
GO
WITH RNs AS(
SELECT Name,
Status,
Days,
ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Days DESC) AS RN
FROM dbo.YourTable)
SELECT P.Name AS PresentName,
P.Days AS PresentDays,
A.Name AS AbsentName,
A.Days AS AbsentDays
FROM (SELECT R.Name,
R.Days,
R.Status,
R.RN
FROM RNs R
WHERE R.Status = 'Present') P
FULL OUTER JOIN (SELECT R.Name,
R.Days,
R.Status,
R.RN
FROM RNs R
WHERE R.Status = 'Absent') A ON P.RN = A.RN;
GO
DROP TABLE dbo.YourTable;
db<>fiddle
2 CTE's is actually far neater:
WITH Absents AS(
SELECT Name,
Status,
Days,
ROW_NUMBER() OVER (ORDER BY Days DESC) AS RN
FROM dbo.YourTable
WHERE Status = 'Absent'),
Presents AS(
SELECT Name,
Status,
Days,
ROW_NUMBER() OVER (ORDER BY Days DESC) AS RN
FROM dbo.YourTable
WHERE Status = 'Present')
SELECT P.Name AS PresentName,
P.Days AS PresentDays,
A.Name AS AbsentName,
A.Days AS AbsentDays
FROM Absents A
FULL OUTER JOIN Presents P ON A.RN = P.RN;
By running the following query I realized that I have duplicates on the column QueryExecutionId.
SELECT DISTINCT qe.QueryExecutionid AS QueryExecutionId,
wfi.workflowdefinitionid AS FlowId,
qe.publishing_date AS [Date],
c.typename AS [Type],
c.name As Name
INTO #Send
FROM
[QueryExecutions] qe
JOIN [Campaign] c ON qe.target_campaign_id = c.campaignid
LEFT JOIN [WorkflowInstanceCampaignActivities] wfica ON wfica.queryexecutionresultid = qe.executionresultid
LEFT JOIN [WorkflowInstances] wfi ON wfica.workflowinstanceid = wfi.workflowinstanceid
WHERE qe.[customer_idhash] IS NOT NULL;
E.g. When I test with one of these QueryExecutionIds, I can two results
select * from ##Send
where QueryExecutionId = 169237
We realized the reason is that these two rows have a different FlowId (second returned value in the first query). After discussing this issue, we decided to take the record with a FlowId that has the latest date. This date is a column called lastexecutiontime that sits in the third joined table [WorkflowInstances] which is also the table where FlowId comes from.
How do I only get unique values of QueryExecutionId with the latest value of WorkflowInstances.lastexecution time and remove the duplicates?
You can use a derived table with first_value partitioned by workflowinstanceid ordered by lastexecutiontime desc:
SELECT DISTINCT qe.QueryExecutionid AS QueryExecutionId,
wfi.FlowId,
qe.publishing_date AS [Date],
c.typename AS [Type],
c.name As Name
INTO #Send
FROM
[QueryExecutions] qe
JOIN [Campaign] c ON qe.target_campaign_id = c.campaignid
LEFT JOIN [WorkflowInstanceCampaignActivities] wfica ON wfica.queryexecutionresultid = qe.executionresultid
LEFT JOIN
(
SELECT DISTINCT workflowinstanceid, FIRST_VALUE(workflowdefinitionid) OVER(PARTITION BY workflowinstanceid ORDER BY lastexecutiontime DESC) As FlowId
FROM [WorkflowInstances]
) wfi ON wfica.workflowinstanceid = wfi.workflowinstanceid
WHERE qe.[customer_idhash] IS NOT NULL;
Please note that your distinct query is pertaining to the selected variables,
eg. Data 1 (QueryExecutionId = 169237 and typename = test 1)
Data 2 (QueryExecutionId = 169237 and typename = test 2)
The above 2 data are considered as distinct
Try partition by and selection the [seq] = 1 (the below code are partition by their date)
SELECT *
into #Send
FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY [QueryExecutionid] ORDER BY [Date] DESC) [Seq]
FROM
(
SELECT qe.QueryExecutionid AS QueryExecutionId,
wfi.FlowId,
qe.publishing_date AS [Date], --should not have any null values
qe.[customer_idhash]
c.typename AS [Type],
c.name As Name
FROM [QueryExecutions] qe
JOIN [Campaign] c
ON qe.target_campaign_id = c.campaignid
LEFT JOIN [WorkflowInstanceCampaignActivities] wfica
ON wfica.queryexecutionresultid = qe.executionresultid
LEFT JOIN
(
SELECT DISTINCT workflowinstanceid, FIRST_VALUE(workflowdefinitionid) OVER(PARTITION BY workflowinstanceid ORDER BY lastexecutiontime DESC) As FlowId
FROM [WorkflowInstances]
) wfi ON wfica.workflowinstanceid = wfi.workflowinstanceid
) a
WHERE [customer_idhash] IS NOT NULL
) b
WHERE [Seq] = 1
ORDER BY [QueryExecutionid]
I have been trying to filter one table by two dates with an order of importance (date2 > date1) as follows:
SELECT
t1.customer, t1.weights, t1.max(t1.date1) as date1, t1.date2
FROM
(SELECT *
FROM table
WHERE CAST(date2 AS smalldatetime) = '10/29/2017') t2
INNER JOIN
table t1 ON t1.customer = t2.customer
AND t1.date2 = t2.date2
GROUP BY
t1.customer, t1.date2
ORDER BY
t1.customer;
It filters the table correctly by date2 first, the max(t1.date1) doesn't what I want it to do though. I get duplicate customers, that share the same (and correct) date2, but show different date1's. These duplicate records have the following in common: The weight row is different. What would I need to do to output just the the customer records connected to the most current date1 without taking other columns into consideration?
I am still a noob, help would be greatly appreciated!
Solution for t-sql (all based on the accepted answer):
SELECT * FROM (
SELECT row_number() over(partition by t1.customer order by t1.date1 desc) as rownum, t1.customer, t1.weights, t1.date1 , t1.date2
FROM
(SELECT *
FROM table
WHERE CAST(date2 AS smalldatetime) = '10/29/2017') t2
INNER JOIN
table t1 ON t1.customer = t2.customer
AND t1.date2 = t2.date2
)t3
where rownum = 1;
If I understood correctly, then instead of a group by logic, I would just use a qualify row statement :)
Try the code below and tell me if it's what you needed - what I'm telling it to do is to bring back only one row per customer ID....but where we select the row based on the dates (by sorting them in ascending order) - however, I'm unclear of what you mean by importance of the 2 dates so I may be completely off base here...can you please give an example of input and desired output?
SELECT t1.customer, t1.weights, t1.date1, t1.date2
FROM
(
Select *
FROM table
WHERE Cast(date2 as smalldatetime)='10/29/2017'
) t2
Inner Join table t1
ON t1.customer = t2.customer
AND t1.date2 = t2.date2
Qualify row_number() over(partition by t1.customer order by date2 , date1)=1
Order By t1.customer;
I have a table with data and I am trying to find max date verified
Create table staging(ID varchar(5) not null, Name varchar(200) not null, dateverified datetime not null,dateinserted datetime not null)
ID,Name,DateVerified,DateInserted
42851775,384,2014-05-24 08:48:20.000,2014-05-28 14:28:10.000
42851775,384,2014-05-28 13:13:07.000,2014-05-28 14:36:12.000
42851775,a1d,2014-05-28 09:17:22.000,2014-05-28 14:36:12.000
42851775,a1d,2014-05-28 09:17:22.000,2014-05-28 14:28:10.000
42851775,a1d,2014-05-28 09:17:22.000,2014-05-28 14:29:08.000
42851775,bc5,2014-05-28 09:17:21.000,2014-05-28 14:29:08.000
42851775,bc5,2014-05-28 09:17:21.000,2014-05-28 14:28:10.000
42851775,bc5,2014-05-28 09:17:21.000,2014-05-28 14:36:12.000
I want to display max dateverified for each keyid i.e.
42851775,384,2014-05-28 13:13:07.000,2014-05-28 14:36:12.000
42851775,a1d,2014-05-28 09:17:22.000,2014-05-28 14:36:12.000
42851775,bc5,2014-05-28 09:17:21.000,2014-05-28 14:29:08.000
SELECT i.[ID],i.name,i.dateinserted,r.maxdate
FROM (select id,name,max(dateverified) as maxdate from
[dbo].[staging] where id=42851775 group by id,name) r
inner join
[dbo].[staging] i
on r.id=i.id and r.jobpostingurl=i.jobpostingurl and r.maxdate=i.dateverified
group by i.id,i.jobpostingurl,r.maxdate
I get an error,dateinserted is invalid as it is not contained in group by clause. But if I add it in group by clause I get all 8 records. How to handle this?
Thanks
R
SELECT
KeyID,
MAX(yourDate)
FROM
Staging
GROUP BY
KeyID
If you want additional information join this to another table for instance:
SELECT
b.KeyID,
a.dateinserted,
b.TheDate
FROM YourTable a
INNER JOIN
(
SELECT
KeyID,
MAX(yourDate) AS TheDate
FROM
Staging
GROUP BY
KeyID
) b
ON
b.KeyID = a.KeyID
If you need to get the dateinserted you can use a cte and join it back to the original table:
WITH cte
AS ( SELECT [ID] ,
name ,
MAX(dateverified) AS dateverified
FROM [dbo].[staging]
GROUP BY ID ,
name
)
SELECT cte.[ID] ,
cte.NAME ,
cte.dateverified ,
s.Dateinserted
FROM cte
INNER JOIN dbo.staging s ON cte.[ID] = s.[ID]
AND cte.NAME = s.NAME
AND cte.dateverified = s.dateverified
If I have the following full text search query:
SELECT *
FROM dbo.Product
INNER JOIN CONTAINSTABLE(Product, (Name, Description, ProductType), 'model') ct
ON ct.[Key] = Product.ProductID
Is it possible to weigh the columns that are being searched?
For example, I care more about the word model appearing in the Name column than I do the
Description or ProductType columns.
Of course if the word is in all 3 columns then I would expect it to rank higher than if it was just in the name column. Is there any way to have a row rank higher if it just appears in Name vs just in Description/ProductType?
You can do something like the following query. Here, WeightedRank is computed by multiplying the rank of the individual matches. NOTE: unfortunately I don't have Northwind installed so I couldn't test this, so look at it more like pseudocode and let me know if it doesn't work.
declare #searchTerm varchar(50) = 'model';
SELECT 100 * coalesce(ct1.RANK, 0) +
10 * coalesce(ct2.RANK, 0) +
1 * coalesce(ct3.RANK, 0) as WeightedRank,
*
FROM dbo.Product
LEFT JOIN
CONTAINSTABLE(Product, Name, #searchTerm) ct1 ON ct1.[Key] = Product.ProductID
LEFT JOIN
CONTAINSTABLE(Product, Description, #searchTerm) ct2 ON ct2.[Key] = Product.ProductID
LEFT JOIN
CONTAINSTABLE(Product, ProductType, #searchTerm) ct3 ON ct3.[Key] = Product.ProductID
order by WeightedRank desc
Listing 3-25. Sample Column Rank-Multiplier Search of Pro Full-Text Search in SQL Server 2008
SELECT *
FROM (
SELECT Commentary_ID
,SUM([Rank]) AS Rank
FROM (
SELECT bc.Commentary_ID
,c.[RANK] * 10 AS [Rank]
FROM FREETEXTTABLE(dbo.Contributor_Birth_Place, *, N'England') c
INNER JOIN dbo.Contributor_Book cb ON c.[KEY] = cb.Contributor_ID
INNER JOIN dbo.Book_Commentary bc ON cb.Book_ID = bc.Book_ID
UNION ALL
SELECT c.[KEY]
,c.[RANK] * 5
FROM FREETEXTTABLE(dbo.Commentary, Commentary, N'England') c
UNION ALL
SELECT ac.[KEY]
,ac.[RANK]
FROM FREETEXTTABLE(dbo.Commentary, Article_Content, N'England') ac
) s
GROUP BY Commentary_ID
) s1
INNER JOIN dbo.Commentary c1 ON c1.Commentary_ID = s1.Commentary_ID
ORDER BY [Rank] DESC;
Similar to Henry's solution but simplified, tested and using the details the question provided.
NB: I ran performance tests on both the union and left join styles and found the below to require far less logical reads on the union style below with my datasets YMMV.
declare #searchTerm varchar(50) = 'model';
declare #nameWeight int = 100;
declare #descriptionWeight int = 10;
declare #productTypeWeight int = 1;
SELECT ranksGroupedByProductID.*, outerProduct.*
FROM (SELECT [key],
Sum([rank]) AS WeightedRank
FROM (
-- Each column that needs to be weighted separately
-- should be added here and unioned with the other queries
SELECT [key],
[rank] * #nameWeight as [rank]
FROM Containstable(dbo.Product, [Name], #searchTerm)
UNION ALL
SELECT [key],
[rank] * #descriptionWeight as [rank]
FROM Containstable(dbo.Product, [Description], #searchTerm)
UNION ALL
SELECT [key],
[rank] * #productTypeWeight as [rank]
FROM Containstable(dbo.Product, [ProductType], #searchTerm)
) innerSearch
-- Grouping by key allows us to sum each ProductID's ranks for all the columns
GROUP BY [key]) ranksGroupedByProductID
-- This join is just to get the full Product table columns
-- and is optional if you only need the ordered ProductIDs
INNER JOIN dbo.Product outerProduct
ON outerProduct.ProductID = ranksGroupedByProductID.[key]
ORDER BY WeightedRank DESC;