How can I match a row in SQL Server only once?

How can I match a row in SQL Server only once? - sql-server

I have the following problem where I am kindly asking for your help when joining two tables in SQL Server 2016 (v13).
I have 2 tables, Revenues and Cashins.
Revenues:
RevenueID
ProductID
InvoiceNo
Amount
123
456
987
1000
234
456
987
1000
Cashins:
CashinID
ProductID
InoviceNo
Amount
ABC
456
987
1000
CDE
456
987
1000
The goal is to match cashins automatically to revenues (but only once!).
Both tables have their unique-ids but the columns used to join these tables are
ProductID
InvoiceNo
Amount
For entries with only one row in each table with those criteria, everything works fine.
Sometimes though, there are several rows that have the same value within these columns (as above) but with a unique ID (this is no error, but the way it is supposed to be).
The problem with it is, that while joining it results in a cartesian product.
To recreate the tables, here the statements:
DROP TABLE IF EXISTS Revenues
GO
CREATE TABLE Revenues
(
RevenueID [nvarchar](10) NULL,
ProductID [nvarchar](10) NULL,
InvoiceNo [nvarchar](10) NULL,
Amount money NULL
)
GO
DROP TABLE IF EXISTS CashIns
GO
CREATE TABLE CashIns
(
CashinID [nvarchar](10) NULL,
ProductID [nvarchar](10) NULL,
InvoiceNo [nvarchar](10) NULL,
Amount money NULL
)
GO
INSERT INTO [Revenues] VALUES ('123', '456', '987', 1000)
INSERT INTO [Revenues] VALUES ('234', '456', '987', 1000)
INSERT INTO [CashIns] VALUES ('ABC', '456', '987', 1000)
INSERT INTO [CashIns] VALUES ('BCD', '456', '987', 1000)
Desired output:
RevenueID
ProductID
InvoiceNo
Amount
CashinID
123
456
987
1000
ABC
234
456
987
1000
CDE
SELECT
R.RevenueID,
R.ProductID,
R.InvoiceNo,
R.Amount,
C.CashinID,
FROM
[Revenues] R
LEFT JOIN
[CashIns] C ON R.ProductID = C.ProductID
AND R.InvoiceNo = C.InvoiceNo
AND R.Amount = C.Amount
Results:
RevenueID
ProductID
InvoiceNo
Amount
CashinID
123
456
987
1000
ABC
123
456
987
1000
CDE
234
456
987
1000
ABC
234
456
987
1000
CDE
Which in theory makes sense, but I just can't seem to find a solution where each row is just used once.
Two things I found and tried are windowing functions and the OUTER APPLY function with a TOP(1) selection. Both came to the same result:
SELECT
*
FROM
[Revenues] R
OUTER APPLY
(SELECT TOP(1) *
FROM [CashIns] C) C
Which returns the desired columns from the Revenues table, but only matched the first appearance from the Cashins table:
RevenueID
ProductID
InvoiceNo
Amount
CashinID
123
456
987
1000
ABC
234
456
987
1000
ABC
I also thought about something like updating the Revenues table, so that the matched CashinID is next to a line and then check every time that the CashinID is not yet used within that table, but I couldn't make it work...
Many thanks in advance for any help or hint in the right direction!

As I said in my comment, you have a fundamental problem with your data relationships. You need to reference the unique identifier of the other table in one of your tables. If you don't do that, then you can only order your transactions in both tables and join them by the row number. You're using a hope and prayer to join your data instead of unreputable identifier's.
--This example orders the transactions in each transaction table and uses
--the order number to join them.
WITH RevPrelim AS (
SELECT *
, ROW_NUMBER() OVER(PARTITION BY InvoiceNo, ProductID, Amount ORDER BY RevenueID) AS row_num
FROM [Revenues] R
), CashinsPrelim AS (
SELECT *
, ROW_NUMBER() OVER(PARTITION BY InvoiceNo, ProductID, Amount ORDER BY CashinID) AS row_num
FROM [CashIns] AS C
)
SELECT *
FROM RevPrlim AS r
LEFT OUTER JOIN CashinsPrelim AS c
ON c.ProductID = r.ProductID
AND c.InvoiceNo = r.InvoiceNo
AND c.Amount = r.Amount
AND c.row_num = r.row_num

Related

SQL Check to see if any value in a list exists in another table

I have a temp table that looks something like this:
Record DepartmentId PositionId EmployeeId StatusId CustomerId
1 Null Null Null 4
2 7 454 Null Null
3 Null 454 Null 3
3 Null Null Null Null 214
3 Null Null Null Null 100
3 Null Null Null Null 312
4 Null Null Null Null 357
I inserted the above into the temp table from tables that looked like this:
Record Table Record-to-Department Record-to-Position
Record Name Record DepartmentId Record PositionId
1 Red 2 7 2 454
2 blue 3 454
3 Green
4 Purple
Record-To-Status Record-To-Customer
Record StatusId Record CustomerId
1 4 3 214
3 3 3 100
3 312
4 357
I have an employee whose record looks something like this:
EmployeeId DepartmentId PositionId StatusId
342 7 454 4
Employee Customers:
EmployeeId CustomerId
342 357
342 95
342 720
In this scenario, it would return Record 1 (because it matches the StatusId), Record 2 (because it matches both the DepartmentId and the PositionId), but it would not return Record 3 because it only matches the PositionId and not the StatusId, and it would return RecordId 4 because one of the Employee CustomerIds matches the CustomerId on Record 4.
I got part of this answer on another question enter link description here (please forgive me I am new and trying to figure out how to ask everything I need to know), but I can't figure out how to handle the multi-records.
I tried selecting the Employees customer Id's into a table variable and then attempted to use the Coalesce like this:
Declare #Customers table(CustomerId int)
INSERT INTO #Customers(CustomerId)
SELECT DISTINCT S.CustomerId
FROM employee_Customers
Select * from tbl
WHERE
COAlesce(StatusId,#StatusId)=#StatusId AND
COALESCE(DepartmentId,#DepartmentId)=#DepartmentId AND
Coalesce(PositionId,#PositionId)=#PositionId AND
Coalesce(EmployeeCompanyId,#EmployeeCompanyId) = #EmployeeCompanyId AND
COALESCE((Select CustomerId from tbl_Requirement_to_Customer),(Select CustomerId from #Customers)) = (Select CustomerId from #Customers)
But I receive the error "Subquery Returned more than 1 value".

I have a possible solution you can try. I don't think it will be plug-and-play but hopefully you can adapt it to your situation. I am using just the data as presented in your temp table, your employee record and your Employee-customers correlation.
The basic logic is to join your temp table to the employee(s) using or condition, but then to get a count of populated values, and compare this count to a count of the number of matching values, which must be at least the first count and greater than zero.
This returns your desired output:
select t.*
from Temp t
left join emp e on e.DepartmentId=t.DepartmentId or e.PositionId=t.PositionId or e.EmployeeId=t.EmployeeId or e.StatusId=t.StatusId
outer apply (
select case when exists (
select * from EmployeeCustomers ec join emp e on e.EmployeeId=ec.EmployeeId where ec.CustomerID=t.CustomerId
) then 1 else 0 end CustomerIdMatch
)c
outer apply (
values (
Iif(t.departmentId is null,0,1) +
Iif(t.PositionId is null,0,1) +
Iif(t.EmployeeId is null,0,1) +
Iif(t.StatusId is null,0,1) +
c.CustomerIdMatch
))x(Cnt)
outer apply (
values (
Iif(t.departmentId=e.DepartmentId,1,0) +
Iif(t.PositionId=e.PositionId,1,0) +
Iif(t.EmployeeId=e.EmployeeId,1,0) +
Iif(t.StatusId=e.StatusId,1,0) +
c.CustomerIdMatch
))y(Cnt2)
where cnt2>=cnt and cnt2>0
See working DB<>Fiddle

sql server 2012 merge values from two tables

I have two tables
tblA(sn, ID int pk, name varchar(50), amountA decimal(18,2))
and
tblB(ID int fk, amountB decimal(18,2))
here: tblA occures only once and tblB may occure multiple time
I need the query to display data like:
sn ID name AmountA amountB Balance
1 1001 abc 5000.00 5000.00
2 1002 xyz 10000.00
1002 4000.00 6000.00 (AmountA-AmountB)
3 1003 pqr 15000.00
1003 4000.00
1003 3000.00
1003 2000.00 6000.00 (AmountA-sum(AmountB))
Please ask if any confusion
I tried using lag and lead function but I couldnot get the desire result, Please help.

Since you are using SQL Server 2012, you can use a partition with an aggregate function (SUM):
SELECT t.sn,
t.ID,
t.name,
t.credits AS AmountA,
t.debits AS amountB,
SUM(t.credits - t.debits) OVER (PARTITION BY t.ID ORDER BY t.debits, t.credits) AS Balance
FROM
(
SELECT sn,
ID,
name,
AmountA AS credits,
0 AS debits
FROM tblA
UNION ALL
SELECT 0 AS sn,
ID,
NULL AS name,
0 AS credits,
amountB AS debits
FROM tblB
) t
ORDER BY t.ID,
t.debits,
t.credits
Explanation:
Since the records in tables A and B each represent a single transaction (i.e. a credit or debit), using a UNION query to bring both sets of data into a single table works well here. After this, I compute a rolling sum using the difference between credit and debit, for each record, for each ID partition group. The ordering is chosen such that credits appear at the top of each partition while debits appear on the bottom.

SQL Server: how to populate ID to table using an ID from another table?

I have TaxTable with primary key taxid
the structure is as follow
taxid Type noteID
1 A
2 G
3 G
4 G
I also have a noteTable the table looks like this
NoteID SNoteText
456 Hellow joe
457 Non-Taxable
458 Non-Taxable
459 Non-Taxable
Now I need to populate taxType=G with noteID that Snotetext=Non-Taxable
so the end result will be like follow
taxid Type noteID
1 A
2 G 457
3 G 458
4 G 459
The order does not really matter, 457 can be in taxID 2 or 3 or 4. I mean there is no link between noteID 457 and TaxID 2 , it can be anywhere.
The important key to know is that the total count of Snotetext="Non-Taxable" will always be the same as taxType=G. So in this example, there is 3 rows of TaxType=G and there is 3 rows of sNoteText="Non-Taxable" and this is important.
I hope that make sense. Thank you for the help

If you don't have a link between tables then create it. Something like this:
create table #taxTable(taxid int, [type] char(1),noteID int)
insert #taxTable(taxid,[type])
values (1,'A'),(2,'G'),(3,'G'),(4,'G'),
(5,'A'),(6,'G'),(7,'G'),(8,'G'),(9,'G')
create table #noteTable(NoteID int,SNoteText varchar(50))
insert #noteTable
values
(456, 'Hellow joe'),
(457 ,'Non-Taxable'),
(458 ,'Non-Taxable'),
(459 ,'Non-Taxable')
declare #cnt int
select #cnt=COUNT(noteid) from #noteTable where SNoteText='Non-Taxable'
-- It is possible to add 3rd CTE for cnt
;with n as (select row_number() over(order by noteID) rn,
noteId
from #noteTable where SNoteText='Non-Taxable'),
t as (select row_number() over(order by taxid) rn,
taxid,[type], noteid from #taxtable where [type]='G')
update t
set noteID = n.noteid
from n inner join t on n.rn= (t.rn%#cnt)+1
select * from #taxtable

SQL - Query log from Users table

Based on the following example : (it is a "QueryLog" table, this table store interactions between a user and two different products N and R):
Id Date UserID Product
--------------------------------------------------
0 2013-06-09 14:50:24.000 100 N
1 2013-06-09 15:27:23.000 100 N
2 2013-06-09 15:29:23.000 100 N
3 2013-06-17 15:31:23.000 100 N
4 2013-06-17 15:32:23.000 100 N
5 2014-05-19 15:30:23.000 250 N
6 2014-07-19 15:27:23.000 250 N
7 2014-07-19 15:27:23.000 333 R
8 2014-08-19 15:27:23.000 333 R
Expected results :
Count
-----
1
(Only UserID 250 is inside my criteria)
If one user interacts 10 times with the product in only one month, he's not in my criteria.
To resume, I am looking for :
The Number of distinct users that had interactions with product N on at least more than one month (what ever the number of interactions this user may have had during a single month)
This is the code I've tried:
select distinct v.UserID, v.mois , v.annee
from
(select c.UserID , c. mois, c.annee, COUNT(c.UserID) as frequence
from
(
SELECT
datepart(month,[DATE]) as mois,
datepart(YEAR,[DATE]) as annee ,
Username,
UserID,
Product
FROM QueryLog
where Product = 'N'
) c
group by c.UserID, c.annee, c.mois
) v
group by v.UserID, v.mois, v.annee

try this:
DECLARE #YourTable table (Id int, [Date] datetime, UserID int, Product char(1))
INSERT INTO #YourTable VALUES (0,'2013-06-09 14:50:24',100 ,'N')
,(1,'2013-06-09 15:27:23',100 ,'N')
,(2,'2013-06-09 15:29:23',100 ,'N')
,(3,'2013-06-17 15:31:23',100 ,'N')
,(4,'2013-06-17 15:32:23',100 ,'N')
,(5,'2014-05-19 15:30:23',250 ,'N')
,(6,'2014-07-19 15:27:23',250 ,'N')
,(7,'2014-07-19 15:27:23',333 ,'R')
,(8,'2014-08-19 15:27:23',333 ,'R')
;WITH MultiMonthUsers AS
(
select
UserID
FROM (select
UserID
FROM #YourTable
WHERE product='N'
GROUP BY UserID, YEAR([Date]),MONTH([Date])
)dt2
GROUP BY UserID
HAVING COUNT(*)>1
)
SELECT COUNT(*) FROM MultiMonthUsers
Depending on number of rows and indexes, this will run slow. Using YEAR([Date]),MONTH([Date]) will prevent any index usage.

I think this will do it, but I need a better dataset to test with:
SELECT COUNT(*)
FROM (
--roll all month/user records into single row
SELECT UserID, datediff(month 0, [date]) As MonthGroup
FROM QueryLog
WHERE Product='N'
GROUP BY datediff(month 0, [date]), UserId
) t
-- look for users with multiple rows
GROUP BY UserID
HAVING COUNT(UserID) > 1
Seems like there should be a way to roll this up further, to avoid the need for the nested select.

SQL Server - Update Column with Handing Duplicate and Unique Rows Based Upon Timestamp

I'm working with SQL Server 2005 and looking to export some data off of a table I have. However, prior to do that I need to update a status column based upon a field called "VisitNumber", which can contain multiple entries same value entries. I have a table set up in the following manner. There are more columns to it, but I am just putting in what's relevant to my issue
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000
2 Test John Test456 123 2014-01-01 07.00.00.000
3 Test Sue Test123 555 2014-01-02 08.00.00.000
4 Test Ann Test123 888 2014-01-02 09.00.00.000
5 Test Ann Test456 888 2014-01-02 10.00.00.000
6 Test Ann Test789 888 2014-01-02 11.00.00.000
Field Notes
ID column is a unique ID in incremental numbers
MyReport is a text value and can actually be thousands of characters. Shortened for simplicity. In my scenario the text would be completely different
Rest of fields are varchar
My Goal
I need to address putting in a status of "F" for two conditions:
* If there is only one VisitNumber, update the status column of "F"
* If there is more than one visit number, only put "F" for the one based upon the earliest timestamp. For the other ones, put in a status of "A"
So going back to my table, here is the expectation
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000 F
2 Test John Test456 123 2014-01-01 07.00.00.000 A
3 Test Sue Test123 555 2014-01-02 08.00.00.000 F
4 Test Ann Test123 888 2014-01-02 09.00.00.000 F
5 Test Ann Test456 888 2014-01-02 10.00.00.000 A
6 Test Ann Test789 888 2014-01-02 11.00.00.000 A
I was thinking I could handle this by splitting each types of duplicates/triplicates+ (2,3,4,5). Then updating every other (or every 3,4,5 rows). Then delete those from the original table and combine them together to export the data in SSIS. But I am thinking there is a much more efficient way of handling it.
Any thoughts? I can accomplish this by updating the table directly in SQL for this status column and then export normally through SSIS. Or if there is some way I can manipulate the column for the exact conditions I need, I can do it all in SSIS. I am just not sure how to proceed with this.

WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY VisitNumber ORDER BY DateTimeStamp) rn from MyTable
)
UPDATE cte
SET [status] = (CASE WHEN rn = 1 THEN 'F' ELSE 'A' END)

I put together a test script to check the results. For your purposes, use the update statements and replace the temp table with your table name.
create table #temp1 (id int, [name] varchar(50), myreport varchar(50), visitnumber varchar(50), dts datetime, [status] varchar(1))
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (1,'Test John','Test123','123','2014-01-01 05:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (2,'Test John','Test456','123','2014-01-01 07:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (3,'Test Sue','Test123','555','2014-01-01 08:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (4,'Test Ann','Test123','888','2014-01-01 09:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (5,'Test Ann','Test456','888','2014-01-01 10:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (6,'Test Ann','Test789','888','2014-01-01 11:00')
select * from #temp1;
update #temp1 set status = 'F'
where id in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
update #temp1 set status = 'A'
where id not in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
select * from #temp1;
drop table #temp1
Hope this helps

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight