Count the match values and display in matrix form - sql-server

I have the following tables with some sample data:
Table: TblTestDB
CREATE TABLE TblTestDB (id int,name varchar(100));
INSERT INTO TblTestDB VALUES(1,'Sam'),(2,'Jack'),
(3,'Rock'),(4,'Don'),(5,'Tam');
Table: TblDB1
CREATE TABLE TblDB1 (id int,name varchar(100));
INSERT INTO TblDB1 VALUES(1,'Sam'),(2,'Jack'),
(3,'Rock');
Table: TblDB2
CREATE TABLE TblDB2 (id int,name varchar(100));
INSERT INTO TblDB2 VALUES(1,'Jazz'),(2,'Dsouze'),
(3,'Rock'),(4,'Jack');
Table: TblDB3
CREATE TABLE TblDB3 (id int,name varchar(100));
INSERT INTO TblDB3 VALUES(1,'Sam'),(2,'Jazz'),
(3,'Rock');
I want to show the result in the form of:
TableName Name ID
------------------------------
TblDB1 3 3
TblDB2 2 4
TblDB3 2 3
Explanation of the result set: I want to show the count of column values that match between table TblTestDB and each of the others (TblDB1, TblDB2, TblDB3).
Tried with the following query:
Query:
SELECT DB,MAX(Name) AS Name, MAX(ID) AS ID
FROM
(
SELECT 'TblDB1' AS DB,COUNT(a.Name) AS Name,0 AS ID
FROM TblTestDB a
INNER JOIN TblDB1 b ON a.Name = b.Name
UNION
SELECT 'TblDB2' AS DB,COUNT(a.Name) AS Name,0 AS ID
FROM TblTestDB a
INNER JOIN TblDB2 b ON a.Name = b.Name
UNION
SELECT 'TblDB3' AS DB,COUNT(a.Name) AS Name,0 AS ID
FROM TblTestDB a
INNER JOIN TblDB3 b ON a.Name = b.Name
UNION
SELECT 'TblDB1' AS DB,0 AS Name,COUNT(a.ID) AS ID
FROM TblTestDB a
INNER JOIN TblDB1 b ON a.ID = b.ID
UNION
SELECT 'TblDB2' AS DB,0 AS Name,COUNT(a.ID) AS ID
FROM TblTestDB a
INNER JOIN TblDB2 b ON a.ID = b.ID
UNION
SELECT 'TblDB3' AS DB,0 AS Name,COUNT(a.ID) AS ID
FROM TblTestDB a
INNER JOIN TblDB3 b ON a.ID = b.ID
) a
GROUP BY DB
Issue: I may have more than 10 columns to show in this matrix/pivot style, and the above query grows with the column list.
Is there a better way to keep it short and simple?

SELECT DISTINCT 'TblDB1' AS DB
,COUNT(CASE WHEN a.Name = b.Name THEN 1 ELSE NULL END) AS Name
,COUNT(CASE WHEN a.ID = b.ID THEN 1 ELSE NULL END) AS ID
FROM TblTestDB a
CROSS JOIN TblDB1 b
UNION ALL
SELECT DISTINCT 'TblDB2' AS DB
,COUNT(CASE WHEN a.Name = b.Name THEN 1 ELSE NULL END) AS Name
,COUNT(CASE WHEN a.ID = b.ID THEN 1 ELSE NULL END) AS ID
FROM TblTestDB a
CROSS JOIN TblDB2 b
UNION ALL
SELECT DISTINCT 'TblDB3' AS DB
,COUNT(CASE WHEN a.Name = b.Name THEN 1 ELSE NULL END) AS Name
,COUNT(CASE WHEN a.ID = b.ID THEN 1 ELSE NULL END) AS ID
FROM TblTestDB a
CROSS JOIN TblDB3 b

Is this simple enough?
SELECT
s.tid as TableName, sum(s.[name]) as ncnt, max(s.ID) as ID
FROM TblTestDB t
cross apply (
-- note: ID is the highest matching id per table, not a count of matching ids
SELECT i.tid, count(i.name) as [name], max(i.id) as ID
FROM (
SELECT 'TblDB1' as tid, * FROM TblDB1
union
SELECT 'TblDB2' as tid, * FROM TblDB2
union
SELECT 'TblDB3' as tid, * FROM TblDB3
) i
where i.name = t.name
group by i.tid
) s
group by s.tid
I'm not sure what you mean by "I may get more than 10 columns to show like matrix/pivot".
E.g., if you're going to add a Surname column, should that be matched independently, or in combination with the Name column? (The former might get a bit complicated...)
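If the goal is to stop the query from growing with every comparison table, one option is to stack the comparison tables once and count matches with CASE expressions. This is only a sketch under assumptions: it copies the sample data from the question into temp tables (the # names are mine, not the asker's), and it treats every column as matched independently, so each extra column needs one more COUNT(CASE ...) line.

```sql
-- Sample data from the question, loaded into temp tables (hypothetical names).
CREATE TABLE #TblTestDB (id int, name varchar(100));
INSERT INTO #TblTestDB VALUES (1,'Sam'),(2,'Jack'),(3,'Rock'),(4,'Don'),(5,'Tam');
CREATE TABLE #TblDB1 (id int, name varchar(100));
INSERT INTO #TblDB1 VALUES (1,'Sam'),(2,'Jack'),(3,'Rock');
CREATE TABLE #TblDB2 (id int, name varchar(100));
INSERT INTO #TblDB2 VALUES (1,'Jazz'),(2,'Dsouze'),(3,'Rock'),(4,'Jack');
CREATE TABLE #TblDB3 (id int, name varchar(100));
INSERT INTO #TblDB3 VALUES (1,'Sam'),(2,'Jazz'),(3,'Rock');

-- Tag each comparison table once, then count matches per column.
SELECT b.DB AS TableName,
       COUNT(CASE WHEN a.name = b.name THEN 1 END) AS Name,
       COUNT(CASE WHEN a.id = b.id THEN 1 END) AS ID
FROM #TblTestDB a
CROSS JOIN (
    SELECT 'TblDB1' AS DB, id, name FROM #TblDB1
    UNION ALL
    SELECT 'TblDB2', id, name FROM #TblDB2
    UNION ALL
    SELECT 'TblDB3', id, name FROM #TblDB3
) b
GROUP BY b.DB
ORDER BY b.DB;
-- TblDB1 3 3
-- TblDB2 2 4
-- TblDB3 2 3

DROP TABLE #TblTestDB, #TblDB1, #TblDB2, #TblDB3;
```

Adding a fourth comparison table is then one more UNION ALL branch rather than one more block per column. The cross join compares every pair of rows, so this is probably only reasonable for small tables.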

Related

Query in getting multiple duplicate rows in SQL Server

I have 2 tables Table1 and Table2 in which I want to get the total count of duplicate rows:
Expected output:
Query tested:
SELECT
t1.name,
t1.duplicates,
ISNULL(t2.active, 0) AS active,
ISNULL(t3.inactive, 0) AS inactive
FROM
(SELECT
t1.name, COUNT(*) AS duplicates
FROM
(SELECT c.name
FROM table1 c
INNER JOIN table2 as cd on cd.id = c.id) t1
GROUP BY
name
HAVING
COUNT(*) > 1) t1
LEFT JOIN
(SELECT c.name, COUNT(*) AS active
FROM table1 c
WHERE name IN (SELECT c.name FROM table1 c GROUP BY c.name)
AND status = 'Active'
GROUP BY name) t2 ON t1.name = t2.name
LEFT JOIN
(SELECT c.name, COUNT(*) AS inactive
FROM table1 c
WHERE name IN (SELECT c.name FROM table1 c GROUP BY c.name)
AND status = 'InActive'
GROUP BY name) t3 ON t1.name = t3.name
ORDER BY
name
It is still returning duplicate rows, and I'm unable to get the id and creator columns.
If you don't mind a subquery and a left join, I'd suggest the following query:
select b.*,
count(creator) as creator_count
from
(select a.mainid,
a.name,
sum(case when a.status = 'active'
then 1 else 0 end) as active_count,
sum(case when a.status = 'inactive'
then 1 else 0 end) as inactive_count,
count(a.name) as duplicate_count
from table1 as a
group by a.name
having count(a.name) > 1) as b
left join table2 as c
on b.mainid = c.mainid
group by c.mainid
having count(c.creator) > 1
Rather than forcing a direct join of the two tables, first derive the information we can get from Table1, then join that result with Table2 to get the creator count.
SQL Fiddle: http://sqlfiddle.com/#!9/4daa19e/28

Unexpected result using CTE to perform a random join on two tables for all rows one-to-many

I am attempting to randomly join the rows of two tables (TableA and TableB) such that each row in TableA is joined to only one row in TableB and every row in TableB is joined to at least one row in TableA.
For example, a random join of TableA with 5 distinct rows and TableB with 3 distinct rows should result in something like this:
TableA TableB
1 3
2 1
3 1
4 2
5 1
However, sometimes not all the rows from TableB are included in the final result. In the example above, row 2 from TableB might be missing because row 1 or row 3 is joined to row 4 of TableA in its place. You can see this by executing the script several times and checking the result. For some reason an interim table (Q in the script below) seems to be necessary to ensure a correct result that includes all rows from both TableA and TableB.
Can someone please explain why this is happening?
Also, can someone please advise on what would be a better way to get the desired result?
I understand that sometimes no result is returned at all, due to some failure in the cross apply and ordering that I have yet to identify, which makes me think there must be a better way to perform this operation. I hope that makes sense. Thanks in advance!
declare @TableA table (
ID int
);
declare @TableB table (
ID int
);
declare @Q table (
RN int,
TableAID int,
TableBID int
);
with cte as (
select
1 as ID
union all
select
ID + 1
from cte
where ID < 5
)
insert @TableA (ID)
select ID from cte;
with cte as (
select
1 as ID
union all
select
ID + 1
from cte
where ID < 3
)
insert @TableB (ID)
select ID from cte;
select * from @TableA;
select * from @TableB;
with cte as (
select
row_number() over (partition by TableAID order by newid()) as RN,
TableAID,
TableBID
from (
select
a.ID as TableAID,
b.ID as TableBID
from @TableA as a
cross apply @TableB as b
) as M
)
select --All rows from TableB not always included
TableAID,
TableBID
from cte
where RN in (
select
top 1
iCTE.RN
from cte as iCTE
group by iCTE.RN
having count(distinct iCTE.TableBID) = (
select count(1) from @TableB
)
)
order by TableAID;
with cte as (
select
row_number() over (partition by TableAID order by newid()) as RN,
TableAID,
TableBID
from (
select
a.ID as TableAID,
b.ID as TableBID
from @TableA as a
cross apply @TableB as b
) as M
)
insert @Q
select
RN,
TableAID,
TableBID
from cte;
select * from @Q;
select --All rows from both TableA and TableB included
TableAID,
TableBID
from @Q
where RN in (
select
top 1
iQ.RN
from @Q as iQ
group by iQ.RN
having count(distinct iQ.TableBID) = (
select count(1) from @TableB
)
)
order by TableAID;
See if this gives you what you're looking for...
DECLARE
@CountA INT = (SELECT COUNT(*) FROM @TableA ta),
@CountB INT = (SELECT COUNT(*) FROM @TableB tb),
@MinCount INT;
SELECT @MinCount = CASE WHEN @CountA < @CountB THEN @CountA ELSE @CountB END;
WITH
cte_A1 AS (
SELECT
*,
rn = ROW_NUMBER() OVER (ORDER BY NEWID())
FROM
@TableA ta
),
cte_B1 AS (
SELECT
*,
rn = ROW_NUMBER() OVER (ORDER BY NEWID())
FROM
@TableB tb
),
cte_A2 AS (
SELECT
a1.ID,
rn = CASE WHEN a1.rn > @MinCount THEN a1.rn - @MinCount ELSE a1.rn end
FROM
cte_A1 a1
),
cte_B2 AS (
SELECT
b1.ID,
rn = CASE WHEN b1.rn > @MinCount THEN b1.rn - @MinCount ELSE b1.rn end
FROM
cte_B1 b1
)
SELECT
A = a.ID,
B = b.ID
FROM
cte_A2 a
JOIN cte_B2 b
ON a.rn = b.rn;
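On the "why": SQL Server does not materialize a CTE, so each reference to it is expanded and evaluated on its own. A CTE that orders by NEWID() and is referenced twice can therefore produce two different random numberings within one statement, which would explain why the interim table makes the result stable. The following is a hedged sketch along those lines, not a definitive fix: it stores the random row numbers once in table variables (@A and @B are names I made up) and then pairs rows round-robin with a modulo. It assumes TableB is not empty and TableA has at least as many rows as TableB.

```sql
DECLARE @TableA TABLE (ID int);
DECLARE @TableB TABLE (ID int);
INSERT @TableA (ID) VALUES (1),(2),(3),(4),(5),(6),(7);
INSERT @TableB (ID) VALUES (1),(2),(3);

-- Materialize each random ordering once so NEWID() is not re-evaluated later.
DECLARE @A TABLE (ID int, rn int);
DECLARE @B TABLE (ID int, rn int);
INSERT @A (ID, rn) SELECT ID, ROW_NUMBER() OVER (ORDER BY NEWID()) FROM @TableA;
INSERT @B (ID, rn) SELECT ID, ROW_NUMBER() OVER (ORDER BY NEWID()) FROM @TableB;

DECLARE @CountB int = (SELECT COUNT(*) FROM @B);

-- Row numbers 1..N in A wrap around 1..@CountB, so every A row gets exactly
-- one B row, and every B row is used once N reaches @CountB.
SELECT a.ID AS TableAID, b.ID AS TableBID
FROM @A AS a
JOIN @B AS b
  ON b.rn = ((a.rn - 1) % @CountB) + 1
ORDER BY a.ID;
```

The pairing is random in which rows meet, but the spread is always as even as possible, so it does not sample from every valid one-to-many assignment. Whether that matters depends on what the random join is for.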

How to get running total by ranges?

How to get a running total based on range:
name  Level  value  runningtotal
A            5      5
B     1      3      10
B     2      2      10
C            1      11
I can't tell what column you are doing your running total on. But in SQL Server 2008 the best way of doing a running total is a self join like this:
CREATE TABLE #Test(
ID INT NOT NULL PRIMARY KEY,
AValue INT NOT NULL)
INSERT INTO #Test VALUES (1,4), (2,2), (3,18)
SELECT T1.ID, T1.AValue, SUM(T2.AValue) RunningTotal
FROM #Test T1
JOIN #Test T2 ON T1.ID >= T2.ID
GROUP BY T1.ID, T1.AValue
DROP TABLE #Test
with cte(name,value) as
(
select name,sum(value) as val from A group by name
)
,
cte2 as
(
SELECT c.name,SUM(b.value) as val
FROM cte c inner join
cte b
on b.name <= c.name
GROUP BY c.name
)
select c.name,c.value,b.val from A c inner join cte2 b on c.name=b.name
order by c.name
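On SQL Server 2012 or later, a windowed SUM may be simpler than the self joins above. This is a sketch that assumes the sample rows shown in the question, loaded into a temp table I named #A. The RANGE frame treats rows with the same name as peers, so both B rows get the same total.

```sql
CREATE TABLE #A (name varchar(10) NOT NULL, [Level] int NULL, value int NOT NULL);
INSERT INTO #A VALUES ('A', NULL, 5), ('B', 1, 3), ('B', 2, 2), ('C', NULL, 1);

-- RANGE (not ROWS) includes all rows that tie on name in the current total.
SELECT name, [Level], value,
       SUM(value) OVER (ORDER BY name
                        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningtotal
FROM #A
ORDER BY name, [Level];
-- A NULL 5 5
-- B 1    3 10
-- B 2    2 10
-- C NULL 1 11

DROP TABLE #A;
```

Switching the frame to ROWS would instead give a strict row-by-row total (8 and then 10 for the two B rows), so the choice depends on which "range" the question has in mind.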

SQL query for displaying count if same name comes in adjacent row it should show the count else 1

I have a table tb1 with columns id and name.
If the same name appears in adjacent rows, the query should display the count for that run; otherwise 1.
For eg:
id name
1 sam
2 jose
3 sam
4 sam
5 dev
6 jose
The desired result is:
name counts
sam 1
jose 1
sam 2
dev 1
jose 1
Please help.
Check out this one (self join):
create table #sampele(id int,name varchar(50))
insert into #sampele values(1,'sam')
insert into #sampele values(2,'jose')
insert into #sampele values(3,'sam')
insert into #sampele values(4,'sam')
insert into #sampele values(5,'dev')
insert into #sampele values(6,'jose')
select a.id,a.name,case when a.name = b.name then 2 else 1 end as cnt from
#sampele a
left outer join
#sampele b
on a.id = b.id+1
Try a combination with a sub query, "COUNT(*) OVER (PARTITION", and row_number():
--DROP TABLE #Test;
SELECT id = IDENTITY(INT,1,1), name INTO #Test FROM
(
SELECT name = 'sam' UNION ALL
SELECT 'jose' UNION ALL
SELECT 'sam ' UNION ALL
SELECT 'sam ' UNION ALL
SELECT 'sam ' UNION ALL
SELECT 'dev ' UNION ALL
SELECT 'dev ' UNION ALL
SELECT 'jose' UNION ALL
SELECT 'sam ' UNION ALL
SELECT 'sam ' UNION ALL
SELECT 'jose'
) a;
GO
WITH GetEndID AS (
SELECT *
, EndID =(SELECT MIN(id) FROM #Test b WHERE b.name != a.name AND b.id > a.id)
FROM #Test a
), GetCount AS
(
SELECT
*
, NameCount = COUNT(*) OVER (PARTITION BY EndID)
, OrderPrio = ROW_NUMBER() OVER (PARTITION BY EndID ORDER BY id)
FROM GetEndID
)
SELECT id, name, NameCount FROM GetCount WHERE OrderPrio = 1 ORDER BY id;
select distinct a.name,case when a.name = b.name then 2 else 1 end as cnt from
tb1 a
left outer join
tb1 b
on a.id = b.id+1
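Another way to count runs of adjacent equal names is the usual "gaps and islands" trick with two ROW_NUMBER() calls. This is a sketch that assumes id defines the row order and uses the question's sample data in a temp table (#tb1 is my name for it). The difference between the overall row number and the per-name row number stays constant within a run, so it can serve as a group key.

```sql
CREATE TABLE #tb1 (id int NOT NULL PRIMARY KEY, name varchar(50) NOT NULL);
INSERT INTO #tb1 VALUES (1,'sam'),(2,'jose'),(3,'sam'),(4,'sam'),(5,'dev'),(6,'jose');

WITH grp AS (
    SELECT id, name,
           -- constant for consecutive rows that share the same name
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS island
    FROM #tb1
)
SELECT name, COUNT(*) AS counts
FROM grp
GROUP BY name, island
ORDER BY MIN(id);
-- sam  1
-- jose 1
-- sam  2
-- dev  1
-- jose 1

DROP TABLE #tb1;
```

Unlike the self join on id + 1, this does not depend on the ids being consecutive, and it handles runs longer than two rows.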

Need Unique Value from table

I have two tables:
Table A
ID Name
1 abc
2 xyz
Table B
ID Name
1 abc
2 xyz
3 mno
I need the distinct values from the above two tables. That is, I want only ID 3, Name mno from Table B (as it is the only row that is unique across the two tables).
Please let me know how I can get this value.
Thanks,
Ajay
This query will get you the rows from B that don't exist in A:
SELECT b.* FROM TableB b
LEFT OUTER JOIN TableA a ON a.ID = b.ID AND a.Name = b.Name
WHERE a.ID IS NULL
You could then do the reverse and use a UNION ALL to get it both ways:
SELECT a.* FROM TableA a
LEFT OUTER JOIN TableB b ON b.ID = a.ID AND b.Name = a.Name
WHERE b.ID IS NULL
UNION ALL
SELECT b.* FROM TableB b
LEFT OUTER JOIN TableA a ON a.ID = b.ID AND a.Name = b.Name
WHERE a.ID IS NULL
Another way of achieving it would be:
;WITH MatchingRows AS (
SELECT a.ID FROM TableA a
JOIN TableB b ON b.ID = a.ID AND b.Name = a.Name
)
SELECT * FROM TableA
WHERE ID NOT IN (SELECT m.ID FROM MatchingRows m)
UNION ALL
SELECT * FROM TableB
WHERE ID NOT IN (SELECT m.ID FROM MatchingRows m)
I'm not sure if that performs better or not - it's just something I thought of. If I'm not mistaken this will actually run the WITH query twice (see the answer to this question) because it's being used twice - so there may be some performance implications with this approach.
The EXCEPT operator may work for you. Here is an example using your data.
CREATE TABLE TableA (id int, name varchar(50))
INSERT INTO TableA VALUES (1, 'abc'),(2,'xyz')
CREATE TABLE TableB (id int, name varchar(50))
INSERT INTO TableB VALUES (1, 'abc'),(2,'xyz'),(3,'mno')
SELECT * FROM TableB
EXCEPT
SELECT * FROM TableA
Be warned, though: like UNION, it compares whole rows. It's only going to exclude rows where there is an exact match on all columns.
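If rows unique to either table are wanted, EXCEPT can be run in both directions and the two results combined. This is a sketch using the same sample data in temp tables (the # names are mine). The parentheses are there so each EXCEPT is evaluated before the UNION ALL.

```sql
CREATE TABLE #TableA (id int, name varchar(50));
INSERT INTO #TableA VALUES (1, 'abc'), (2, 'xyz');
CREATE TABLE #TableB (id int, name varchar(50));
INSERT INTO #TableB VALUES (1, 'abc'), (2, 'xyz'), (3, 'mno');

-- Rows only in B, followed by rows only in A.
(SELECT id, name FROM #TableB
 EXCEPT
 SELECT id, name FROM #TableA)
UNION ALL
(SELECT id, name FROM #TableA
 EXCEPT
 SELECT id, name FROM #TableB);
-- 3 mno

DROP TABLE #TableA, #TableB;
```

With this data only the (3, 'mno') row comes back, since #TableA has nothing that #TableB lacks.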
