Two tables,
TableA and TableB with column "filename" which has same value in both table.
only the number of occurance of data is different.
e.g
|###TableA#########|
|id|filename_TableA|
|01|file1 |
|02|file1 |
|03|file2 |
|04|file2 |
|05|file3 |
|06|file4 |
|## TableB ########|
|id|filename_TableB|
|01|file1 |
|02|file1 |
|03|file1 |
|04|file2 |
|05|file2 |
|06|file3 |
|07|file3 |
|08|file4 |
|09|file4 |
I need to generate a SQL query which shows the distinct filename with there
number of count and totalcount of the distinct filename.
like this:
using select count(distinct filename_TableA) as totalCount from TableA
gives the totalCount of filename but I am not able to generate the sql query for above result output.
Tried for single table:
select
filename_TableA,
count(filename_TableA)as filecount_TableA,
totalCount = (
select count(distinct filename_TableA) from TableA
)
from TableA
group by filename_TableA
Always try to break your problem into smaller parts!
Your question consists of two parts:
Get distinct files and counts from tableA
Get distinct files and counts from tableB
We write queries:
1.
SELECT filename_TableA, COUNT( * ) AS filecount_TableA
FROM TableA
GROUP BY filename_TableA
2.
SELECT filename_TableB, COUNT( * ) AS filecount_TableB
FROM TableB
GROUP BY filename_TableB
Check that the results of each individual query are correct.
Then we combine the queries:
SELECT filename_TableA, filename_TableB, filecount_TableA, filecount_TableB, ISNULL( filecount_TableA, 0 ) + ISNULL( filecount_TableB, 0 ) AS totalCount,
COUNT(*) OVER() AS UniqueFileCount
FROM
( SELECT filename_TableA, COUNT( * ) AS filecount_TableA
FROM TableA
GROUP BY filename_TableA ) AS A
FULL OUTER JOIN
( SELECT filename_TableB, COUNT( * ) AS filecount_TableB
FROM TableB
GROUP BY filename_TableB ) AS B
ON A.filename_TableA = filename_TableB
Note: To cover scenarios where a file name may appear in one table but not the other I have used FULL OUTER JOIN.
If you do not have such a scenario i.e. each file name will appear at least once in every table, then you should use INNER JOIN as it will be faster.
Related
I've been searching for this and it seems like it should be something simple, but apparently not so much. I want to return a resultSet within PostgreSQL 9.4.x using an array parameter so:
| id | count |
--------------
| 1 | 22 |
--------------
| 2 | 14 |
--------------
| 14 | 3 |
where I'm submitting a parameter of {'1','2','14'}.
Using something (clearly not) like:
SELECT id, count(a.*)
FROM tablename a
WHERE a.id::int IN array('{1,2,14}'::int);
I want to test it first of course, and then write it as a storedProc (function) to make this simple.
Forget it, here is the answer:
SELECT a.id,
COUNT(a.id)
FROM tableName a
WHERE a.id IN
(SELECT b.id
FROM tableName b
WHERE b.id = ANY('{1,2,14}'::int[])
)
GROUP BY a.id;
You can simplify to:
SELECT id, count(*) AS ct
FROM tbl
WHERE id = ANY('{1,2,14}'::int[])
GROUP BY 1;
More:
Check if value exists in Postgres array
To include IDs from the input array that are not found I suggest unnest() followed by a LEFT JOIN:
SELECT id, count(t.id) AS ct
FROM unnest('{1,2,14}'::int[]) id
LEFT JOIN tbl t USING (id)
GROUP BY 1;
Related:
Preserve all elements of an array while (left) joining to a table
If there can be NULL values in the array parameter as well as in the id column (which would be an odd design), you'd need (slower!) NULL-safe comparison:
SELECT id, count(t.id) AS ct
FROM unnest('{1,2,14}'::int[]) id
LEFT JOIN tbl t ON t.id IS NOT DISTINCT FROM id.id
GROUP BY 1;
I feel like I'm missing something really obvious here.
Using T-SQL/SQL-Server:
I have unique values in more than one column but want to select the max version based on one particular column.
Dataset:
Example
ID | Name| Version | Code
------------------------
1 | Car | 3 | NULL
1 | Car | 2 | 1000
1 | Car | 1 | 2000
Target status: I want my query to only select the row with the highest version value. Running a MAX on the version column pulls all three because of the distinct values in the 'Code' column:
SELECT ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
The net result is that I get all three entries as per the data set due to the unique values in the Code column, but I only want the top row (Version 3).
Any help would be appreciated.
You need to identify the row with the highest version as 1 query and use another outer query to pull out all the fields for that row. Like so:
SELECT t.ID, t.Name, GRP.Version, t.Code
FROM (
SELECT ID
,Name
,MAX(Version) as Version
FROM Table
GROUP BY ID, Name
) GRP
INNER JOIN Table t on GRP.ID = t.ID and GRP.Name = t.Name and GRP.Version = t.Version
You can also use row_number() to do this kind of logic, for example like this:
select ID, Name, Version, Code
from (
select *, row_number() over (order by Version desc) as RN
from Table1
) X where RN = 1
Example in SQL Fiddle
add the top statment to force the return of a single row. Also add the order by notation
SELECT top 1 ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
order by max(version) desc
I have a record table that is recording changes within a table. I can pull the data from the first table fine, however when i try to join in another table to add some of its column information it stops displaying the information.
PartNumber | PartDesc | value | date
1 | test | 1 | 3/4/2015
I wanted to include the Aisle tag's from the location table
PartNumber| AisleTag | AisleTagTwo
1 | A1 | N/A
here is what i have as my sql statement so far
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
where Date = (select max(Date) from InvRecord where t1.PartNumber = InvRecord.PartNumber)
order by t1.PartNumber
it is coming up blank, my original sql statement doesn't include anything from t2. I am not sure what approach to go with in terms of getting the data combined any help is much appreciated thank you !
this should be the end result
PartNumber | PartDesc | value | date | AisleTag | AisleTagTwo
1 | test | 1 | 3/4/2015 | A1 | N/A
Pull the most recent row (based on Date) for each PartNumber in Table A and append data from Table B (joined on PartNumber):
SELECT *
FROM (
SELECT A.PartNumber
, A.PartDesc
, A.NewValue
, A.Date
, B.AisleTag
, B.AisleTagTwo
, DateSeq = ROW_NUMBER() OVER(PARTITION BY A.PartNumber ORDER BY A.Date DESC)
FROM InvRecord A
LEFT JOIN PartAisleListTbl B
ON A.PartNumber = B.PartNumber
) A
WHERE A.DateSeq = 1
ORDER BY A.PartNumber
Are you returning no records at all, or only records with AisleTag and AisleTagTwo as null?
Your sentence "it is coming up blank, my original sql statement doesn't include anything from t2." makes it sound like you're getting records with nulls for the t2 fields.
If you are, then you probably have a record in t2 that has nulls for those fields.
For troubleshooting purposes, try running the query without the WHERE clause:
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
order by t1.PartNumber
If you do get records, your problem is with the WHERE clause. If you don't, your problem is with the PartNumber fields in InvRecord and PartAisleListTbl not matching.
Not sure why your's isn't working... is date in both t1 and t2 by any chance?
Here's it re factored to use a inline view instead of a correlated query wonder if it makes a difference.
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2
ON t1.PartNumber = t2.PartNumber
JOIN (select max(Date) mdate, PartNumber from InvRecord GROUP BY PartNumber) t3
on t3.partNumber= T1.PartNumber
and T3.mdate = T1.Date
order by t1.PartNumber
So I have joined two tables to identify claims and their corresponding reversals if there are any.
The following is a simplified explanation as to what I have done: Join where MbrNo is the same in both tables, and where Amount=-Amount. So now I have an output table contians duplicate column names:
MbrNo | ClaimType | Amount | MbrNo | ClaimType | Amount
xyz | Medicine | R 300 | xyz | Reversal | - R300
I can not input this in a table as column names are not unique.
But I would like to
1. Format this table to look as follows
MbrNo | ClaimType | Amount
xyz | Medicine | R 300
xyz | Reversal | - R300
with t as
(
select *,
count(*) over(partition by [MbrNo], [DepNo], [PracticeNo], [DisciplineCd], [ServiceDt],[PayAmt]) as rownum
from Claims
)
Select * from
(Select * from t where PayAmt<0) a
left outer join
(Select * from t where PayAmt>0) b
on a.[MbrNo]=b.[MbrNo]
and a.[DepNo]=b.[DepNo]
and a.[PracticeNo]=b.[PracticeNo]
and a.[DisciplineCd]=b.[DisciplineCd]
and a.[ServiceDt]=b.[ServiceDt]
and a.[PayAmt]=-b.[PayAmt]
Basically I want to put the 2nd table in the joined table underneath the first table.
Please help:(
If I've understood your requirements correctly then I think you want the UNION operator. See if this gets you going in the right direction.
with t as
(
select *,
count(*) over(partition by [MbrNo], [DepNo], [PracticeNo], [DisciplineCd], [ServiceDt],[PayAmt]) as rownum
from Claims
)
Select t.* from t where PayAmt < 0
union all
select b.* from
(Select * from t where PayAmt < 0) a
inner join
(Select * from t where PayAmt > 0) b
on a.[MbrNo] = b.[MbrNo]
and a.[DepNo] = b.[DepNo]
and a.[PracticeNo] = b.[PracticeNo]
and a.[DisciplineCd] = b.[DisciplineCd]
and a.[ServiceDt] = b.[ServiceDt]
and a.[PayAmt] = -b.[PayAmt]
I want to delete the duplicate(same from and same to values) tuples from a table but need to keep the tuple with minimum object id(object id is pk).
So here are columns:
from | to | time | distance | object_id
I can see the correct number of tuples that will be deleted by executing
select [from],[to],count(*)
FROM table
where [object_id] NOT IN(
SELECT min([object_id])
FROM table
group by [from],[to]
having count(*) > 1)
group by [from],[to]
having count(*) > 1
but I want to first see the object_id's which are counted on the SQL above.
You could try with this (untested)...
;WITH temp AS (
SELECT [from_id], [to_id], [object_id] = min([object_id])
FROM table
group by [from_id],[to_id]
having count(*) > 1)
SELECT
t2.[from_id],
t2.[to_id],
t.[object_id]
FROM
table t
join temp t2
on t2.[from_id] = t.[from_id]
AND t2.[to_id] = t.[to_id]
AND t2.[object_id] != t.[object_id]
EDIT:
CTE temp will yield all distinct from/to groupings with min object_id, one you would like to keep.
SELECT [from_id], [to_id], [object_id] = min([object_id])
FROM table
group by [from_id],[to_id]
having count(*) > 1
There are other pairs you would like to remove, and these are same from/to pairs, but with different object_id. Last select should output those records exactly.