How to merge duplicate data using MergeTable? - sql-server

There are duplicate records in my table. I wrote a query to find them, and the result looks like this:
+-----------+-------------+-------------+
| Row | NationalNo | Client ID |
+-----------+-------------+-------------+
| 1 | 10003 | 34 |
+-----------+-------------+-------------+
| 2 | 10003 | 75 |
+-----------+-------------+-------------+
| 1 | 20023 | 23 |
+-----------+-------------+-------------+
| 2 | 20023 | 55 |
+-----------+-------------+-------------+
| 3 | 20023 | 12 |
+-----------+-------------+-------------+
The result above means that one client with NationalNo 10003 was inserted twice, and another client with NationalNo 20023 was inserted three times, in the Client table.
I do not want to delete the extra records. I want to keep the first record active and make the rest inactive.
The task is to save these actions as history in MergeTable. MergeTable has 3 columns: ClientIDA, ClientIDB, Date.
I want to treat the record with Row = 1 as ClientIDA and the rest of them as ClientIDB.
So the output that needs to be inserted into MergeTable is:
+-----------+-----------+-------------+
| ClientIDA | ClientIDB | Date |
+-----------+-----------+-------------+
| 34 | 75 | 2014-06-10 |
+-----------+-----------+-------------+
| 23 | 55 | 2014-06-10 |
+-----------+-----------+-------------+
| 23 | 12 | 2014-06-10 |
+-----------+-----------+-------------+

Here is an example of how you can do it.
Split your table into two sets (the Row = 1 record that stays active, and the remaining duplicates),
and then join the two sets on NationalNo.
-- A table variable needs the @ prefix (# is for temp tables made with CREATE TABLE)
DECLARE @duplicates TABLE (Row INT, NationalNo INT, ClientID INT)
INSERT INTO @duplicates (Row, NationalNo, ClientID) SELECT 1, 10003, 34
INSERT INTO @duplicates (Row, NationalNo, ClientID) SELECT 2, 10003, 75
INSERT INTO @duplicates (Row, NationalNo, ClientID) SELECT 1, 20023, 23
INSERT INTO @duplicates (Row, NationalNo, ClientID) SELECT 2, 20023, 55
INSERT INTO @duplicates (Row, NationalNo, ClientID) SELECT 3, 20023, 12
;WITH ClientIDA AS (
-- the record that stays active
SELECT Row, NationalNo, ClientID
FROM @duplicates
WHERE Row = 1
), ClientIDB AS (
-- the duplicates that become inactive
SELECT Row, NationalNo, ClientID
FROM @duplicates
WHERE Row != 1
)
SELECT A.ClientID AS ClientIDA, B.ClientID AS ClientIDB, GETDATE() AS [Date]
FROM ClientIDB AS B
INNER JOIN ClientIDA AS A
ON A.NationalNo = B.NationalNo
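
The query above only selects the pairs. As a rough sketch of the remaining step (actually writing them into MergeTable), here is the same join wrapped in an INSERT ... SELECT. It runs against an in-memory SQLite database through Python only so that it can be executed anywhere. The table name duplicates, the column name RowNo (standing in for Row) and the use of DATE('now') instead of GETDATE() are assumptions made for this sketch, not part of the original schema.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; names below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE duplicates (RowNo INTEGER, NationalNo INTEGER, ClientID INTEGER);
    CREATE TABLE MergeTable (ClientIDA INTEGER, ClientIDB INTEGER, Date TEXT);
    INSERT INTO duplicates VALUES
        (1, 10003, 34), (2, 10003, 75),
        (1, 20023, 23), (2, 20023, 55), (3, 20023, 12);
""")

# Self-join: RowNo = 1 is the surviving client (A), every other row is a merged client (B).
conn.execute("""
    INSERT INTO MergeTable (ClientIDA, ClientIDB, Date)
    SELECT a.ClientID, b.ClientID, DATE('now')   -- GETDATE() in SQL Server
    FROM duplicates AS a
    JOIN duplicates AS b
      ON b.NationalNo = a.NationalNo AND b.RowNo <> 1
    WHERE a.RowNo = 1
""")

merged_pairs = conn.execute(
    "SELECT ClientIDA, ClientIDB FROM MergeTable ORDER BY ClientIDA DESC, ClientIDB DESC"
).fetchall()
print(merged_pairs)
```

In SQL Server the same shape should work by putting INSERT INTO MergeTable (ClientIDA, ClientIDB, [Date]) in front of the final SELECT of the CTE query.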

Related

How to use last_value with group by with count in SQL Server?

I have a table like this:
name | timeStamp | previousValue | newValue
--------+---------------+-------------------+------------
Mark | 13.12.2020 | 123 | 155
Mark | 12.12.2020 | 123 | 12
Tom | 14.12.2020 | 123 | 534
Mark | 12.12.2020 | 123 | 31
Tom | 11.12.2020 | 123 | 84
Mark | 19.12.2020 | 123 | 33
Mark | 17.12.2020 | 123 | 96
John | 22.12.2020 | 123 | 69
John | 19.12.2020 | 123 | 33
I'd like to mix last_value, count (*) and group to get this result:
name | count | lastValue
--------+-----------+-------------
Mark | 5 | 33
Tom | 2 | 534
John | 2 | 69
This part:
select name, count(*)
from table
group by name
returns table:
name | count
--------+---------
Mark | 5
Tom | 2
John | 2
but I also have to add the last value for each name.
How can I do that?
Best regards!

LAST_VALUE is a windowed function, so you'll need to get that value first, and then aggregate:
WITH CTE AS(
SELECT [name],
[timeStamp], --This is a poor choice for a column's name. timestamp is a (deprecated) synonym of rowversion, and a rowversion is not a date and time value
previousValue,
newValue,
LAST_VALUE(newValue) OVER (PARTITION BY [name] ORDER BY [timeStamp] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lastValue
FROM dbo.YourTable)
SELECT [Name],
COUNT(*) AS [count],
lastValue
FROM CTE
GROUP BY [Name],
lastValue;

I got a solution that works, but here's another one:
SELECT
[name], COUNT([name]), [lastValue]
FROM (
SELECT
[name], FIRST_VALUE([newValue]) OVER (PARTITION BY [name] ORDER BY TimeStamp DESC ROWS UNBOUNDED PRECEDING) AS [lastValue]
FROM [table]
) xyz GROUP BY [name], [lastValue]
Keep well!
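
Both answers spell out the window frame explicitly, and that part is easy to miss. The sketch below is an illustration under assumptions: it uses SQLite (3.25 or later, for window functions) through Python as a stand-in for SQL Server, a made-up table name changes, and ISO date strings in a column called changed_on so that the values sort chronologically. It runs the LAST_VALUE query once with the full frame and once with the default frame, which ends at the current row.

```python
import sqlite3

# SQLite stand-in for SQL Server; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE changes (name TEXT, changed_on TEXT, newValue INTEGER);
    INSERT INTO changes VALUES
        ('Mark', '2020-12-13', 155), ('Mark', '2020-12-12', 12),
        ('Tom',  '2020-12-14', 534), ('Mark', '2020-12-12', 31),
        ('Tom',  '2020-12-11', 84),  ('Mark', '2020-12-19', 33),
        ('Mark', '2020-12-17', 96),  ('John', '2020-12-22', 69),
        ('John', '2020-12-19', 33);
""")

QUERY = """
    SELECT name, COUNT(*) AS cnt, lastValue
    FROM (
        SELECT name,
               LAST_VALUE(newValue) OVER (
                   PARTITION BY name ORDER BY changed_on {frame}
               ) AS lastValue
        FROM changes
    ) AS windowed
    GROUP BY name, lastValue
    ORDER BY name, lastValue
"""

# Full frame: every row of a name sees the whole partition, so one group per name.
full_frame = "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"
with_frame = conn.execute(QUERY.format(frame=full_frame)).fetchall()

# Default frame: the window stops at the current row, so lastValue varies per row
# and the GROUP BY splits each name into several groups.
default_frame = conn.execute(QUERY.format(frame="")).fetchall()

print(with_frame)
print(len(default_frame))
```

With the full frame there is one row per name, matching the expected result in the question. Without it the outer GROUP BY produces extra rows, which is why the frame clause (or FIRST_VALUE with a descending order) is needed.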

How to select all entries from one table, and SUM a subset of another table

I have a large database with times that employees entered. They enter an activity, when it happened, how long they spent on it, and the customer.
I'm now trying to return a table with all employees that sums their times, but only the time booked for a subset of customers. I get either a table with the correct times, in which employees who didn't enter any time are omitted, or all employees with the summed time across all customers.
The tables I have are:
EMPLOYEE for the employees
ACTIVITY for all activities
CUSTOMER for the customers
Some example data:
| EMPLOYEE | | ACTIVITY |
+------------+---------+ +------------+------------+------------+
| I_EMPLOYEE | S_NAME1 | | I_EMPLOYEE | I_CUSTOMER | N_DURETIME |
+------------+---------+ +------------+------------+------------+
| 1 | A | | 1 | 1 | 5 |
| 2 | B | | 2 | 3 | 10 |
| 3 | C | | 1 | 3 | 15 |
+------------+---------+ | 3 | 2 | 10 |
| 1 | 2 | 10 |
+------------+------------+------------+
What I'd expect to get when I want all times except Customer 2:
+----------+----------+
| EMPLOYEE | DURETIME |
+----------+----------+
| 1 | 20 |
| 2 | 10 |
| 3 | - |
+----------+----------+
Instead I get one of these two results:
+----------+----------+ +----------+----------+
| EMPLOYEE | DURETIME | | EMPLOYEE | DURETIME |
+----------+----------+ +----------+----------+
| 1 | 20 | | 1 | 30 |
| 2 | 10 | | 2 | 10 |
+----------+----------+ | 3 | 10 |
+----------+----------+
To get the correct times I use the following:
SELECT emp.S_NAME1 AS Mitarbeiter, SUM(act.N_DURETIME)/60 as Zeit
FROM EMPLOYEE AS emp
LEFT JOIN ACTIVITY AS act on act.I_EMPLOYEE = emp.I_EMPLOYEE
LEFT JOIN CUSTOMER AS cust on cust.I_CUSTOMER = act.I_CUSTOMER
WHERE cust.CUSTNO <> '2'
GROUP BY emp.S_NAME1
To get the full list of employees I used:
SELECT emp.S_NAME1 AS Mitarbeiter, SUM(act.N_DURETIME)/60 as Zeit
FROM EMPLOYEE AS emp
LEFT JOIN ACTIVITY AS act on act.I_EMPLOYEE = emp.I_EMPLOYEE
LEFT JOIN CUSTOMER AS cust on cust.I_CUSTOMER = act.I_CUSTOMER AND cust.CUSTNO <> '2'
GROUP BY emp.S_NAME1
So, depending on whether I put my "customer filter" in the JOIN or in the WHERE clause, I get half of the correct table. How can I combine those to get the correct output?

Create Table #emp
(
i_emp Int,
s_name1 Char(1)
)
Insert Into #emp Values
(1,'A'),
(2,'B'),
(3,'C')
Create Table #Activity
(
i_emp Int,
i_cust Int,
n_duretime Int
)
Insert Into #Activity Values
(1,1,5),
(2,3,10),
(1,3,15),
(3,2,10),
(1,2,10)
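Query (SUM skips NULLs, so the rows for customer 2 are left out while the LEFT JOIN keeps every employee):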
Select
e.i_emp,
Sum(Case When a.i_cust = 2 Then Null Else a.n_duretime End) As durationTot
From
#emp e Left Join
#Activity a On e.i_emp = a.i_emp
Group By
e.i_emp
Result:
i_emp durationTot
1 20
2 10
3 NULL

You can try the following query:
create table Employee(I_EMPLOYEE int, S_NAME1 char(1))
insert into Employee Values (1, 'A'),(2, 'B'),(3, 'C')
create table ACTIVITY (I_EMPLOYEE int, I_CUSTOMER int, N_DURETIME int)
insert into ACTIVITY Values(1, 1, 5 ),( 2, 3, 10), (1, 3, 15), ( 3, 2, 10), ( 1 , 2 , 10 )
-- LEFT JOIN from Employee keeps employees without any activity; time for customer 2 counts as 0
select e.I_EMPLOYEE, e.S_NAME1 as EMPLOYEE,
sum(isnull(case a.I_CUSTOMER when 2 then 0 else a.N_DURETIME end, 0)) as DURETIME
from Employee e
left join ACTIVITY a on a.I_EMPLOYEE = e.I_EMPLOYEE
group by e.I_EMPLOYEE, e.S_NAME1
order by e.I_EMPLOYEE
Below is the output:
I_EMPLOYEE EMPLOYEE DURETIME
--------------------------------
1 A 20
2 B 10
3 C 0
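
The question's two attempts differ only in where the customer filter sits, so it may help to see both placements side by side. The following is a minimal sketch, assuming SQLite through Python as a stand-in for SQL Server and filtering directly on ACTIVITY.I_CUSTOMER (the sample data has no CUSTOMER rows, so that join is left out here).

```python
import sqlite3

# SQLite stand-in for SQL Server, loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (I_EMPLOYEE INTEGER, S_NAME1 TEXT);
    CREATE TABLE ACTIVITY (I_EMPLOYEE INTEGER, I_CUSTOMER INTEGER, N_DURETIME INTEGER);
    INSERT INTO EMPLOYEE VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO ACTIVITY VALUES (1, 1, 5), (2, 3, 10), (1, 3, 15), (3, 2, 10), (1, 2, 10);
""")

# Filter in the ON clause of the ACTIVITY join: an employee with no qualifying
# activity still appears, with a NULL sum.
filter_in_join = conn.execute("""
    SELECT emp.I_EMPLOYEE, SUM(act.N_DURETIME)
    FROM EMPLOYEE AS emp
    LEFT JOIN ACTIVITY AS act
      ON act.I_EMPLOYEE = emp.I_EMPLOYEE AND act.I_CUSTOMER <> 2
    GROUP BY emp.I_EMPLOYEE
    ORDER BY emp.I_EMPLOYEE
""").fetchall()

# Same filter in WHERE: it runs after the join and discards the rows it rejects,
# so an employee whose only activity is for customer 2 disappears.
filter_in_where = conn.execute("""
    SELECT emp.I_EMPLOYEE, SUM(act.N_DURETIME)
    FROM EMPLOYEE AS emp
    LEFT JOIN ACTIVITY AS act
      ON act.I_EMPLOYEE = emp.I_EMPLOYEE
    WHERE act.I_CUSTOMER <> 2
    GROUP BY emp.I_EMPLOYEE
    ORDER BY emp.I_EMPLOYEE
""").fetchall()

print(filter_in_join)
print(filter_in_where)
```

The first query returns all three employees with a NULL total for employee 3, and the second drops employee 3. If the filter really has to go through CUSTOMER.CUSTNO, one option would be to join ACTIVITY to CUSTOMER in a derived table first and LEFT JOIN that to EMPLOYEE.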

Delete duplicate rows from temp table in SQL

I have a table with the below columns
+-------+------------+------------+
| AssID | QuestionID | AnswerText |
+-------+------------+------------+
| 12 | 34 | NULL |
| 12 | 34 | Sample |
| 13 | 35 | NULL |
| 13 | 35 | test1 |
+-------+------------+------------+
I need to remove the rows where AnswerText is NULL when another row with the same AssID and QuestionID exists.
The final output needs to be in this format:
+-------+------------+------------+
| AssId | QuestionID | AnswerText |
+-------+------------+------------+
| 12 | 34 | Sample |
| 13 | 35 | test1 |
+-------+------------+------------+
Please help me with the delete query
Thanks in advance
Sree

You can use EXISTS to check whether the row with a NULL AnswerText also has a matching row with a non-NULL AnswerText:
DELETE t
FROM MyTABLE t
WHERE t.AnswerText IS NULL
AND EXISTS
(
SELECT *
FROM MyTable m
WHERE m.AssID = t.AssID
AND m.QuestionID = t.QuestionID
AND m.AnswerText IS NOT NULL
)

You can use a CTE and ROW_NUMBER to delete:
;with cte as (
select *, RowN = Row_number() over (partition by assid, questionid order by answertext desc) from yourtable
)-- DESC matters: SQL Server sorts NULL lowest, so the non-NULL answer gets RowN = 1 and is kept
-- (or order by your id column, because you have not said which row to keep when several are non-NULL)
delete from cte where RowN > 1
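
One property of the EXISTS approach worth checking is that it leaves a NULL row alone when there is nothing to replace it with. The sketch below tries that out on SQLite through Python (a stand-in for SQL Server). The table name answers and the extra row (14, 36, NULL) are assumptions added for illustration, and the DELETE avoids a table alias only because older SQLite versions do not accept one there.

```python
import sqlite3

# SQLite stand-in for SQL Server; "answers" is an illustrative table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE answers (AssID INTEGER, QuestionID INTEGER, AnswerText TEXT);
    INSERT INTO answers VALUES
        (12, 34, NULL), (12, 34, 'Sample'),
        (13, 35, NULL), (13, 35, 'test1'),
        (14, 36, NULL);  -- hypothetical extra row: a NULL answer with no counterpart
""")

# Delete a NULL answer only when a non-NULL answer exists for the same key.
deleted = conn.execute("""
    DELETE FROM answers
    WHERE AnswerText IS NULL
      AND EXISTS (
          SELECT 1 FROM answers AS m
          WHERE m.AssID = answers.AssID
            AND m.QuestionID = answers.QuestionID
            AND m.AnswerText IS NOT NULL
      )
""").rowcount

remaining = conn.execute(
    "SELECT AssID, QuestionID, AnswerText FROM answers ORDER BY AssID"
).fetchall()
print(deleted, remaining)
```

Two rows are deleted and the unmatched NULL row for (14, 36) survives. The ROW_NUMBER approach behaves the same way for that row, since a single row always gets RowN = 1.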

Update All other Records Based on a single record

I have a table with a million records. I need to fill some NULL columns from the existing non-NULL record that has the same id. The query I tried seems to work, but I'm not confident that it will update all 1 million records exactly the way I need. Below is some sample data showing what my table looks like. Any help will be appreciated.
SELECT * INTO #TEST FROM (
SELECT 1 AS EMP_ID,10 AS DEPT_ID,15 AS ITEM_NBR ,NULL AS AMOUNT,NULL AS ITEM_NME
UNION ALL
SELECT 1,20,16,500,'ABCD'
UNION ALL
SELECT 1,30,17,NULL,NULL
UNION ALL
SELECT 2,10,15,1000,'XYZ'
UNION ALL
SELECT 2,30,16,NULL,NULL
UNION ALL
SELECT 2,40,17,NULL,NULL
) AS A
Sample data:
+--------+---------+----------+--------+----------+
| EMP_ID | DEPT_ID | ITEM_NBR | AMOUNT | ITEM_NME |
+--------+---------+----------+--------+----------+
| 1 | 10 | 15 | NULL | NULL |
| 1 | 20 | 16 | 500 | ABCD |
| 1 | 30 | 17 | NULL | NULL |
| 2 | 10 | 15 | 1000 | XYZ |
| 2 | 30 | 16 | NULL | NULL |
| 2 | 40 | 17 | NULL | NULL |
+--------+---------+----------+--------+----------+
Expected result:
+--------+---------+----------+--------+----------+
| EMP_ID | DEPT_ID | ITEM_NBR | AMOUNT | ITEM_NME |
+--------+---------+----------+--------+----------+
| 1 | 10 | 15 | 500 | ABCD |
| 1 | 20 | 16 | 500 | ABCD |
| 1 | 30 | 17 | 500 | ABCD |
| 2 | 10 | 15 | 1000 | XYZ |
| 2 | 30 | 16 | 1000 | XYZ |
| 2 | 40 | 17 | 1000 | XYZ |
+--------+---------+----------+--------+----------+
I tried this, but I can't tell whether it updates all 1 million records properly.
SELECT * FROM #TEST T
inner JOIN #TEST T1 ON T1.EMP_ID=T.EMP_ID
WHERE T1.AMOUNT IS NOT NULL
UPDATE T SET AMOUNT=T1.AMOUNT
FROM #TEST T
inner JOIN #TEST T1 ON T1.EMP_ID=T.EMP_ID
WHERE T1.AMOUNT IS not NULL

I used UPDATE with an inner join on an aggregated subquery:
UPDATE T
SET T.AMOUNT = X.AMT,T.ITEM_NME=X.I_N
FROM #TEST T
JOIN
(SELECT EMP_ID,MAX(AMOUNT) AS AMT,MAX(ITEM_NME) AS I_N
FROM #TEST
GROUP BY EMP_ID) X ON X.EMP_ID = T.EMP_ID

First copy the non-NULL records into a second temp table:
SELECT * INTO #TEST1
FROM #TEST
WHERE AMOUNT IS NOT NULL
To validate the records, run this query first:
SELECT T.AMOUNT, T1.AMOUNT, T.EMP_ID, T1.EMP_ID
FROM #TEST T
INNER JOIN #TEST1 T1 ON T1.EMP_ID = T.EMP_ID
WHERE T.AMOUNT IS NULL
Then run the update inside a transaction (replace ROLLBACK with COMMIT once the result looks right):
BEGIN TRAN
UPDATE T
SET T.AMOUNT = T1.AMOUNT, T.ITEM_NME = T1.ITEM_NME
FROM #TEST T
INNER JOIN #TEST1 T1 ON T1.EMP_ID = T.EMP_ID
WHERE T.AMOUNT IS NULL
ROLLBACK

SELECT EMP_ID, MAX(AMOUNT) AS AMOUNT, MAX(ITEM_NME) AS ITEM_NME
INTO #t
FROM #TEST
GROUP BY EMP_ID
UPDATE t SET t.AMOUNT = t1.AMOUNT, t.ITEM_NME = t1.ITEM_NME
FROM #TEST t INNER JOIN #t t1
ON t.EMP_ID = t1.EMP_ID
WHERE t.AMOUNT IS NULL AND t.ITEM_NME IS NULL
Use the MAX aggregate function to get the amount and item name for each employee, and then replace the NULL values of amount and item name with those values. For validation, use the COUNT function to count the rows in which amount or item name is still NULL. If that count is zero, the table was updated correctly.
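
The COUNT-based validation described above can be sketched as follows. This is an illustration on SQLite through Python rather than SQL Server: the table name emp_items is made up, and the update is written with correlated subqueries instead of T-SQL's UPDATE ... FROM so that it also runs on older SQLite versions. The extra "ambiguous" check is an assumption about what is worth verifying on a million rows: MAX only gives a meaningful answer when each employee has a single distinct non-NULL value.

```python
import sqlite3

# SQLite stand-in for SQL Server; "emp_items" is an illustrative table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp_items (EMP_ID INTEGER, DEPT_ID INTEGER, ITEM_NBR INTEGER,
                            AMOUNT INTEGER, ITEM_NME TEXT);
    INSERT INTO emp_items VALUES
        (1, 10, 15, NULL, NULL), (1, 20, 16, 500, 'ABCD'), (1, 30, 17, NULL, NULL),
        (2, 10, 15, 1000, 'XYZ'), (2, 30, 16, NULL, NULL), (2, 40, 17, NULL, NULL);
""")

NULL_COUNT = "SELECT COUNT(*) FROM emp_items WHERE AMOUNT IS NULL OR ITEM_NME IS NULL"

# Pre-check: employees with more than one distinct non-NULL value would make
# MAX pick one of them arbitrarily, so they should be reviewed by hand.
ambiguous = conn.execute("""
    SELECT EMP_ID FROM emp_items
    GROUP BY EMP_ID
    HAVING COUNT(DISTINCT AMOUNT) > 1 OR COUNT(DISTINCT ITEM_NME) > 1
""").fetchall()

nulls_before = conn.execute(NULL_COUNT).fetchone()[0]

# Fill the gaps from the per-employee MAX (aggregates ignore NULLs).
conn.execute("""
    UPDATE emp_items
    SET AMOUNT   = (SELECT MAX(s.AMOUNT)   FROM emp_items AS s WHERE s.EMP_ID = emp_items.EMP_ID),
        ITEM_NME = (SELECT MAX(s.ITEM_NME) FROM emp_items AS s WHERE s.EMP_ID = emp_items.EMP_ID)
    WHERE AMOUNT IS NULL OR ITEM_NME IS NULL
""")

nulls_after = conn.execute(NULL_COUNT).fetchone()[0]
print(ambiguous, nulls_before, nulls_after)
```

On the sample data no employee is ambiguous, four rows start out with NULLs, and none remain afterwards. A non-zero count after the update would point to employees that have no non-NULL record at all.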

skip records based on columns condition

I have a question about SQL Server.
Table name: Emp
Id | Pid | Firstname | LastName | Level
1  | 101 | Ram       | Kumar    | 3
1  | 100 | Ravi      | Kumar    | 2
2  | 101 | Jaid      | Balu     | 10
1  | 100 | Hari      | Babu     | 5
1  | 103 | nani      | Jai      | 44
1  | 103 | Nani      | Balu     | 10
3  | 103 | bani      | lalu     | 20
I need to retrieve only the records whose (Id, Pid) combination is unique. Any combination that occurs more than once must be skipped entirely.
The output I want is:
Id | Pid | Firstname | LastName | Level
1  | 101 | Ram       | Kumar    | 3
2  | 101 | Jaid      | Balu     | 10
3  | 103 | bani      | lalu     | 20
I found the duplicated combinations with the query below:
select id, pid, count(*) from emp group by id, pid having count(*) >= 2
This query returns the 2 duplicated combinations. Those records need to be skipped to get the output above.
Please tell me how to write a query that achieves this in SQL Server.

Since your output should contain only the (Id, Pid) combinations that occur exactly once, you can use COUNT with PARTITION BY to get the desired result.
Sample Data
CREATE TABLE Emp
([Id] int, [Pid] int, [Firstname] varchar(4), [LastName] varchar(5), [Level] int);
INSERT INTO Emp
([Id], [Pid], [Firstname], [LastName], [Level])
VALUES
(1, 101, 'Ram', 'Kumar', 3),
(1, 100, 'Ravi', 'Kumar', 2),
(2, 101, 'Jaid', 'Balu', 10),
(1, 100, 'Hari', 'Babu', 5),
(1, 103, 'nani', 'Jai', 44),
(1, 103, 'Nani', 'Balu', 10),
(3, 103, 'bani', 'lalu', 20);
Query
SELECT *
FROM
(
-- cnt = number of rows sharing this (Id, Pid) combination
SELECT *,cnt = COUNT(*) OVER(PARTITION BY ID,PID)
FROM Emp
) Emp
WHERE cnt = 1
Output
| Id | Pid | Firstname | LastName | Level |
|----|-----|-----------|----------|-------|
| 1 | 101 | Ram | Kumar | 3 |
| 2 | 101 | Jaid | Balu | 10 |
| 3 | 103 | bani | lalu | 20 |
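
As an alternative sketch that builds on the GROUP BY query from the question, the condition can be flipped to HAVING COUNT(*) = 1 and the result joined back to Emp. This is not the method from the answer above, just another way to express the same filter. It is shown on SQLite through Python as a stand-in for SQL Server, and the alias names e and u are arbitrary.

```python
import sqlite3

# SQLite stand-in for SQL Server, loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp (Id INTEGER, Pid INTEGER, Firstname TEXT, LastName TEXT, Level INTEGER);
    INSERT INTO Emp VALUES
        (1, 101, 'Ram',  'Kumar', 3),
        (1, 100, 'Ravi', 'Kumar', 2),
        (2, 101, 'Jaid', 'Balu',  10),
        (1, 100, 'Hari', 'Babu',  5),
        (1, 103, 'nani', 'Jai',   44),
        (1, 103, 'Nani', 'Balu',  10),
        (3, 103, 'bani', 'lalu',  20);
""")

# Keep only the (Id, Pid) combinations that occur exactly once, then join back
# to fetch the full rows for those combinations.
unique_rows = conn.execute("""
    SELECT e.Id, e.Pid, e.Firstname, e.LastName, e.Level
    FROM Emp AS e
    JOIN (
        SELECT Id, Pid
        FROM Emp
        GROUP BY Id, Pid
        HAVING COUNT(*) = 1
    ) AS u
      ON u.Id = e.Id AND u.Pid = e.Pid
    ORDER BY e.Id, e.Pid
""").fetchall()
print(unique_rows)
```

The window-function version reads the table once, while this version makes the link to the original HAVING query more obvious. Which one performs better would depend on the indexes available.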
