Combine two columns in the ON search condition of MERGE in SQL Server

I have two date columns, [START DATE] and [END DATE] and I want to only insert records whose dates are not in either column. This is what my logic looks like:
MERGE TABLE_A AS Target
USING (SELECT DISTINCT * FROM TABLE_B) AS Source
ON (SELECT Target.[START DATE] + Target.[END DATE] AS Target.[COMPAREDATE] =
SELECT Source.[START DATE] + Source.[END DATE] AS Source.[COMPAREDATE])
WHEN NOT MATCHED THEN
INSERT ([DATE ADDED], [END DATE], ...)
VALUES (Source.[DATE ADDED], Source.[END DATE], ...);
I also tried the code below but it continued to insert duplicates:
ON (Target.[START DATE] = Source.[START DATE] AND Target.[END DATE] = Source.[END DATE])
I'd appreciate any help, thanks!

This code was close:
ON (Target.[START DATE] = Source.[START DATE] AND Target.[END DATE] = Source.[END DATE])
It says, "if both the StartDate and the EndDate match, then the rows match".
If you want to say "if either the StartDates match or the EndDates match, then the rows match", then you should do this:
ON (Target.[START DATE] = Source.[START DATE] OR Target.[END DATE] = Source.[END DATE])
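The difference between the two match conditions can be tried out on a toy data set. The following is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, which has no MERGE, so the "WHEN NOT MATCHED THEN INSERT" branch is imitated with INSERT ... WHERE NOT EXISTS. The table names, column names, sample rows and the helper insert_unmatched are all hypothetical.

```python
import sqlite3

def insert_unmatched(match_condition):
    """Insert source rows that have no target row satisfying match_condition."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE target (start_date TEXT, end_date TEXT);
        CREATE TABLE source (start_date TEXT, end_date TEXT);
        INSERT INTO target VALUES ('2020-01-01', '2020-01-31');
        INSERT INTO source VALUES
            ('2020-01-01', '2020-01-31'),  -- both dates already in target
            ('2020-01-01', '2020-02-15'),  -- only the start date is in target
            ('2020-03-01', '2020-03-31');  -- neither date is in target
    """)
    # Stand-in for MERGE ... WHEN NOT MATCHED THEN INSERT (assumption: SQLite has no MERGE).
    conn.execute(f"""
        INSERT INTO target (start_date, end_date)
        SELECT DISTINCT s.start_date, s.end_date
        FROM source AS s
        WHERE NOT EXISTS (
            SELECT 1 FROM target AS t WHERE {match_condition}
        )
    """)
    rows = conn.execute(
        "SELECT start_date, end_date FROM target ORDER BY start_date, end_date"
    ).fetchall()
    conn.close()
    return rows

rows_and = insert_unmatched("t.start_date = s.start_date AND t.end_date = s.end_date")
rows_or = insert_unmatched("t.start_date = s.start_date OR t.end_date = s.end_date")
print(len(rows_and))  # 3: the row sharing only its start date is still inserted
print(len(rows_or))   # 2: only the row sharing neither date is inserted
```

With AND, a source row is skipped only when both dates match the same target row. With OR, sharing either date is enough to skip it.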

Related

How to get the data of specific date when the Datetime is stored in string [duplicate]

I have this query in SQL Server:
SELECT
[Order Date], SUM([Profit])
FROM
[sample].[dbo].[superstore]
WHERE
[Order Date] BETWEEN '2012-06-21 00:00:00.000' AND '2012-09-21 23:59:59.999'
GROUP BY
[Order Date]
ORDER BY
[Order Date];
The result of this query includes a record with an [Order Date] of 2012-06-21 00:00:00.000,
but with this query:
SELECT
[Order Date], SUM([Profit])
FROM
[sample].[dbo].[superstore]
WHERE
[Order Date] BETWEEN '2012-06-21 00:00:00.000' AND '2012-09-21 23:59:59.998'
GROUP BY
[Order Date]
ORDER BY
[Order Date];
I don't get this record.
Is there some SQL Server behavior that I am missing here?
Hint: the type of [Order Date] is datetime.
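SQL Server date/time columns have precisions that vary depending on the exact type. In your first query, the ".999" is being rounded up: datetime is accurate only to roughly 3 milliseconds, so 23:59:59.999 becomes midnight of the following day, while 23:59:59.998 rounds down to 23:59:59.997.
The safest approach is to avoid BETWEEN and the time components altogether, with something like: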
SELECT [Order Date] , sum([Profit])
FROM [sample].[dbo].[superstore]
WHERE [Order Date] >= '2012-06-21' AND
[Order Date] < '2012-09-22'
GROUP BY [Order Date]
ORDER BY [Order Date];
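The rounding can be imitated outside the database to see why the two upper bounds behave differently. This is a rough sketch in Python, not SQL Server's actual implementation: the helper name round_like_datetime is hypothetical, and it assumes datetime values are stored in steps of 1/300 of a second with ordinary round-to-nearest behavior.

```python
from datetime import datetime, timedelta

def round_like_datetime(value):
    """Round to the nearest 1/300 second, approximating SQL Server's datetime type."""
    whole_second = value.replace(microsecond=0)
    # Nearest tick within the second, in the range 0..300 (assumed rounding rule).
    ticks = (value.microsecond * 300 + 500_000) // 1_000_000
    # Convert the tick count back to microseconds, rounded to the nearest microsecond.
    micros = (ticks * 1_000_000 + 150) // 300
    return whole_second + timedelta(microseconds=micros)

upper_999 = round_like_datetime(datetime(2012, 9, 21, 23, 59, 59, 999_000))
upper_998 = round_like_datetime(datetime(2012, 9, 21, 23, 59, 59, 998_000))
print(upper_999)  # 2012-09-22 00:00:00
print(upper_998)  # 2012-09-21 23:59:59.996667
```

Under these assumptions the .999 bound lands on midnight of 2012-09-22, so BETWEEN also admits rows stamped exactly at that midnight, while the .998 bound stays inside 2012-09-21. A half-open range with < '2012-09-22' sidesteps the question entirely.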
Use the query below:
SELECT
CAST([Order Date] AS date),
SUM([Profit])
FROM [sample].[dbo].[superstore]
WHERE CAST([Order Date] AS date) BETWEEN '2012-06-21' AND '2012-09-21'
GROUP BY CAST([Order Date] AS date)
ORDER BY CAST([Order Date] AS date);
Using the BETWEEN operator for date ranges seems appealing at first but, as you're finding out, it's a headache. If you stick to the following format, you'll find that it works no matter the date/time data type.
Note: Whereas BETWEEN is a logical >= and <=, here we shift the "end date" to midnight of the following day and compare with < instead.
SELECT
[Order Date], SUM(Profit)
FROM
sample.dbo.superstore
WHERE
[Order Date] >= '2012-06-21'
AND [Order Date] < '2012-09-22'
GROUP BY
[Order Date]
ORDER BY
[Order Date];

LEFT JOIN Subtotal from the same TABLE

I am new to SQL and I am trying to query a ledger table.
I have two queries that I need to join into one result set.
1st Query:
SELECT *
FROM TABLE 1
WHERE DATEDIFF(MONTH, [Transaction Date], GETDATE()) <= 3
AND [ITEMTYPE] = 'F-C'
AND ([Description] NOT LIKE '%REFILE%'
AND [Description] NOT LIKE '%CROSSOVER%')
AND ([UserNAME] LIKE '%USER1%'
OR [UserNAME] LIKE '%user2%'
OR [UserNAME] LIKE '%user3%')
2nd Query:
SELECT [ENCID], SUM([Trans]) AS Total_Charges
FROM Table 1
WHERE DATEDIFF(MONTH, [Transaction Date], GETDATE()) <= 3
AND [ITEMTYPE] IN ('C')
GROUP BY [ENCID];
I need to LEFT JOIN the 2nd query to the 1st query on ENCID.
Thanks.
Well, you just join it. There is nothing special about it, except that you have to remember to give the subquery an alias, e.g. Totals.
I have also changed your DATEDIFF(MONTH, [Transaction Date], GETDATE()) <= 3 to [Transaction Date] >= DATEADD(MONTH, -3, GETDATE()) so that SQL Server can use an index on the [Transaction Date] column, if you have one. Note that the two conditions are not exactly equivalent: DATEDIFF(MONTH, ...) counts calendar-month boundaries crossed, so the original also matches rows from earlier in the month three months back.
Also, leading-wildcard text searches such as [UserNAME] LIKE '%user2%' are slow, so avoid them if you can: https://sqlperformance.com/2017/02/sql-indexes/seek-leading-wildcard-sql-server
SELECT *
FROM [TABLE 1] AS a
LEFT JOIN
( SELECT [ENCID], SUM([Trans]) AS Total_Charges
FROM [TABLE 1]
WHERE [Transaction Date] >= DATEADD(MONTH, -3, GETDATE())
AND [ITEMTYPE] IN ('C')
GROUP BY [ENCID] ) AS Totals ON Totals.[ENCID] = a.[ENCID]
WHERE
a.[Transaction Date] >= DATEADD(MONTH, -3, GETDATE())
AND a.[ITEMTYPE] = 'F-C'
AND (a.[Description] NOT LIKE '%REFILE%'
AND a.[Description] NOT LIKE '%CROSSOVER%')
AND (a.[UserNAME] LIKE '%USER1%'
OR a.[UserNAME] LIKE '%user2%'
OR a.[UserNAME] LIKE '%user3%')
When you introduce any joins (to tables or subqueries) it is vital that you refer to columns from those sources by aliases.
SELECT t1.*
FROM TABLE_1 t1
LEFT JOIN (
SELECT [ENCID], SUM([Trans]) AS Total_Charges
FROM Table_1
WHERE [Transaction Date] >= DATEADD(month,-3,GETDATE())
AND [ITEMTYPE] IN ('C')
GROUP BY [ENCID]
) sq ON t1.[ENCID] = sq.[ENCID]
WHERE t1.[Transaction Date] >= DATEADD(month,-3,GETDATE())
AND t1.[ITEMTYPE] = 'F-C'
AND (t1.[Description] NOT LIKE '%REFILE%'
AND t1.[Description] NOT LIKE '%CROSSOVER%')
AND (t1.[UserNAME] LIKE '%USER1%'
OR t1.[UserNAME] LIKE '%user2%'
OR t1.[UserNAME] LIKE '%user3%')
NOTE: I have altered the way the dates are considered so that an index on [Transaction Date] could be utilized. Instead of calculating a number of months on every row, it is far more efficient to calculate a date then compare existing data to that value. For best performance try to avoid using functions on data in the where clause.
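The shape of this join can be checked on a few toy rows. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, leaves out the date and text filters, and uses a hypothetical table ledger with made-up rows. It only illustrates how the aliased subtotal subquery attaches to the detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ledger (encid INTEGER, itemtype TEXT, trans REAL);
    INSERT INTO ledger VALUES
        (1, 'F-C', 0),
        (1, 'C', 100),
        (1, 'C', 50),
        (2, 'F-C', 0);   -- encounter 2 has no 'C' rows
""")
rows = conn.execute("""
    SELECT t1.encid, t1.itemtype, sq.total_charges
    FROM ledger AS t1
    LEFT JOIN (
        -- Subtotal of the 'C' rows per encounter, aliased so it can be joined
        SELECT encid, SUM(trans) AS total_charges
        FROM ledger
        WHERE itemtype = 'C'
        GROUP BY encid
    ) AS sq ON sq.encid = t1.encid
    WHERE t1.itemtype = 'F-C'
    ORDER BY t1.encid
""").fetchall()
conn.close()
print(rows)  # [(1, 'F-C', 150.0), (2, 'F-C', None)]
```

Because it is a LEFT JOIN, the detail row for encounter 2 is kept even though it has no matching subtotal; its total simply comes back as NULL.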

Select old records in a table

I want the oldest row by date for each Distinct Number. I created this script but the problem is I keep on getting the newest record.
SELECT*
FROM
[Data].[dbo].[IAPT] t1
WHERE
[Last Contact Date] IN
(SELECT MAX([Last Contact Date])
FROM [Data].[dbo].[IAPT]
WHERE t1.[Number] = [Data].[dbo].[IAPT].[Number]
AND
[Last Contact Date] NOT IN
(SELECT MAX([Last Contact Date])
FROM [Data].[dbo].[IAPT]
WHERE t1.[Pseudo] = [Data].[dbo].[IAPT].[Pseudo]))
The Table:
Pseudo   Number   Last Contact Date
0X1      18       17/06/2013
0X1      18       16/04/2013
0X2      19       25/04/2013
0X2      19       16/07/2013
Desired Result:
Number Last Contact Date
1 16/04/2013
2 25/04/2013
Any help would be appreciated. Thank You
You should use the MIN function instead of MAX, and drop the nested NOT IN filter, which would otherwise exclude the oldest date itself:
SELECT *
FROM
[Data].[dbo].[IAPT] t1
WHERE
t1.[Last Contact Date] =
(SELECT MIN(t2.[Last Contact Date])
FROM [Data].[dbo].[IAPT] t2
WHERE t2.[Number] = t1.[Number])
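To check the MIN-based idea against the sample rows from the question, here is a small sketch that keeps, for each Number, the row whose date equals the minimum date for that Number. It runs on SQLite through Python's sqlite3 module rather than SQL Server, uses a hypothetical table iapt, and assumes the dates are stored as ISO yyyy-mm-dd text so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE iapt (pseudo TEXT, number INTEGER, last_contact_date TEXT);
    INSERT INTO iapt VALUES
        ('0X1', 18, '2013-06-17'),
        ('0X1', 18, '2013-04-16'),
        ('0X2', 19, '2013-04-25'),
        ('0X2', 19, '2013-07-16');
""")
oldest = conn.execute("""
    SELECT t1.pseudo, t1.number, t1.last_contact_date
    FROM iapt AS t1
    WHERE t1.last_contact_date = (
        -- Oldest date among the rows sharing this row's number
        SELECT MIN(t2.last_contact_date)
        FROM iapt AS t2
        WHERE t2.number = t1.number
    )
    ORDER BY t1.number
""").fetchall()
conn.close()
print(oldest)  # [('0X1', 18, '2013-04-16'), ('0X2', 19, '2013-04-25')]
```

If two rows for the same Number share the oldest date, this form returns both of them.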
You can use ROW_NUMBER with a PARTITION BY clause:
SELECT Pseudo, Number, [Last Contact Date]
FROM (
SELECT Pseudo, Number, [Last Contact Date],
ROW_NUMBER() OVER (PARTITION BY Number
ORDER BY [Last Contact Date]) AS rn
FROM [Data].[dbo].[IAPT]) AS t
WHERE t.rn = 1
The first record within each Number partition is the one having the oldest date.
A simpler way:
SELECT PSEUDO, NUMBER , MIN ([LAST CONTACT DATE]) FROM [DATA].[DBO].[IAPT] T1
GROUP BY PSEUDO, NUMBER

Complex IIF to SQL 2008

Hi, I am trying to duplicate a complex IIF expression I used in MS Access, but I can't seem to translate it into SQL Server 2008 using CASE. I am trying to create a WHERE clause something like this:
.....
WHERE
(IIF([Created Date]= NULL, IIF(DATEDIFF(day,[Created Date],[POSTED Date])<=3,1,IIF([Created Date] BETWEEN [Disti Reported Sales Date] AND [Posted Date]),1,NULL)))=1
AND
......
Basically, what it's doing is looking at a date from one column and comparing it to two other columns, but if one of the columns is NULL then it uses a different comparison.
The literal translation should look close to this:
WHERE (CASE
WHEN [Created Date] IS NULL
THEN (CASE
WHEN DATEDIFF(DD, [Created Date], [POSTED Date]) <= 3
THEN 1
ELSE (CASE
WHEN [Created Date] BETWEEN [Disti Reported Sales Date] AND [Posted Date]
THEN 1
ELSE 0
END)
END)
ELSE 0
END) = 1
However, this can be simplified to something like this:
WHERE (CASE
WHEN [Created Date] IS NULL AND DATEDIFF(DD, [Created Date], [POSTED Date]) <= 3 THEN 1
WHEN [Created Date] IS NULL AND [Created Date] BETWEEN [Disti Reported Sales Date] AND [Posted Date] THEN 1
ELSE 0 END) = 1
As noted in my comment, it still seems odd that when [Created Date] IS NULL you are still trying to use it in a calculation: any comparison involving NULL evaluates to unknown, so neither branch can ever return 1.
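The reason "= NULL" had to become "IS NULL" in the translation can be seen in a tiny experiment. The sketch below uses SQLite through Python's sqlite3 module as a stand-in. The assumption is that SQL Server with its default ANSI_NULLS setting treats these comparisons the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
equals_null, is_null, case_result = conn.execute("""
    SELECT NULL = NULL,                              -- unknown, returned as NULL
           NULL IS NULL,                             -- true
           CASE WHEN NULL = NULL THEN 1 ELSE 0 END   -- unknown falls through to ELSE
""").fetchone()
conn.close()
print(equals_null, is_null, case_result)  # None 1 0
```

A WHEN condition that evaluates to unknown is treated as not satisfied, so a CASE built on "= NULL" always ends up in its ELSE branch.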

T-SQL return top one field based on count of another

I've struggled to search successfully for this as I haven't figured out a search string describing what I want to do, apologies if this has been covered already.
I have a table that contains among others a contract number, a start date, a serial number and a datestamp. These are Many:Many.
What I'm trying to achieve is to return the start date for each individual contract number with the largest number of unique serial numbers and the most recent datestamp, where that start date is valid.
As I guess is obvious to T-SQL experts, the query below only returns the one contract number with the largest number of serials. Can anyone tell me what I'm doing wrong?
SELECT TOP (1)
[Contract ID], [Item Begin Date] AS Start_Date,
COUNT([Serial Number]) AS CountSerials,
Datestamp
FROM
SourceTable
GROUP BY
[Contract ID], [Item Begin Date], Datestamp
HAVING
([Item Begin Date] > CONVERT(DATETIME, '1900-01-01 00:00:00', 102))
ORDER BY
CountSerials DESC, Datestamp DESC
Cheers,
Alex
You can put that into a temporary table (without using TOP (1) or ORDER BY).
I changed some column names in the process:
if exists (select * from tempdb.sys.tables where name = '##tmp')
drop table ##tmp
SELECT * into ##tmp
from
(
select
[Contract_ID], [Begin_Date] AS Start_Date,
COUNT([serials]) AS CountSerials,
Datestamp
FROM
SourceTable
GROUP BY
[Contract_ID], [begin_date], Datestamp
HAVING
[begin_date] > CONVERT(DATETIME, '1900-01-01 00:00:00', 102)
) a
select contract_id,Start_Date, max(countserials) as MAXCOUNT, Datestamp from ##tmp
group by contract_id,Start_Date,Datestamp
You can use a subquery with window aggregates and extract your desired results from it:
SELECT distinct *
FROM
(
Select
[Contract ID], [Item Begin Date] AS Start_Date,
COUNT([Serial Number]) OVER(PARTITION BY [Contract ID]) AS CountSerials,
datestamp,
MAX(Datestamp) OVER (PARTITION BY [Contract ID]) maxdatestamp
FROM
SourceTable
WHERE
([Item Begin Date] > CONVERT(DATETIME, '1900-01-01 00:00:00', 102))
) x
WHERE
datestamp=maxdatestamp
