T-SQL getting all unique groups with their usage count


How do I find the unique groups that are present in my table, and display how often that type of group is used?
For example (SQL Server 2008R2)
So, I would like to find out how many times the combination of
PMI 100
RT 100
VT 100
is present in my table, that is, for how many itemids it is used.
These three form a group because together they are assigned to a single itemid. The same combination is assigned to ids 2527 and 2529, so this group is used at least twice (usagecount = 2).
I want to know that for every kind of group that appears.
The entire dataset is quite large, about 5.000.000 records, so I'd like to avoid using a cursor.
The number of code/pct combinations per itemid varies between 1 and 6.
The values in the "code" field are not known up front, there are more than a dozen values on average
I tried using pivot, but I got stuck eventually and I also tried various combinations of GROUP-BY and counts.
Any bright ideas?
Example output:
code  pct  groupid  usagecount
PMI   100  1        234
RT    100  1        234
VT    100  1        234
CD    5    2        567
PMI   100  2        567
VT    100  2        567
PMI   100  3        123
PT    100  3        123
VT    100  3        123
RT    100  4        39
VT    100  4        39
etc

Just using a simple group:
SELECT
    code
    , pct
    , COUNT(*)
FROM myTable
GROUP BY
    code
    , pct
Not too sure if that's more like what you're looking for:
select
    uniqueGrp
    , count(*)
from (
    select distinct
        itemid
    from myTable
) as I
cross apply (
    select
        cast(code as varchar(max)) + cast(pct as varchar(max)) + '_'
    from myTable
    where myTable.itemid = I.itemid
    order by code, pct
    for xml path('')
) as x(uniqueGrp)
group by uniqueGrp
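
To get from the signature counts above to the output shape the question asks for (code, pct, groupid, usagecount), one possible sketch follows. It assumes the question's column names (itemid, code, pct). The table variable @myTable, the sample rows, the ':' separator between code and pct, and the CTE names sig and grp are made up for illustration. The group ids are arbitrary numbers, one per distinct signature.

```sql
-- Hypothetical sample data; only the column names come from the question.
DECLARE @myTable TABLE (itemid int, code varchar(10), pct int);
INSERT INTO @myTable (itemid, code, pct) VALUES
    (2527, 'PMI', 100), (2527, 'RT', 100), (2527, 'VT', 100),
    (2529, 'PMI', 100), (2529, 'RT', 100), (2529, 'VT', 100),
    (2600, 'RT', 100),  (2600, 'VT', 100);

WITH sig AS (
    -- One row per itemid with its ordered code/pct signature.
    SELECT i.itemid, x.uniqueGrp
    FROM (SELECT DISTINCT itemid FROM @myTable) AS i
    CROSS APPLY (
        SELECT CAST(t.code AS varchar(max)) + ':' + CAST(t.pct AS varchar(max)) + '_'
        FROM @myTable AS t
        WHERE t.itemid = i.itemid
        ORDER BY t.code, t.pct
        FOR XML PATH('')
    ) AS x(uniqueGrp)
),
grp AS (
    -- One row per distinct signature: how many itemids use it,
    -- plus one representative itemid to read the member rows from.
    SELECT uniqueGrp,
           COUNT(*)     AS usagecount,
           MIN(itemid)  AS sampleitem,
           ROW_NUMBER() OVER (ORDER BY uniqueGrp) AS groupid
    FROM sig
    GROUP BY uniqueGrp
)
SELECT t.code, t.pct, g.groupid, g.usagecount
FROM grp AS g
INNER JOIN @myTable AS t ON t.itemid = g.sampleitem
ORDER BY g.groupid, t.code, t.pct;
```

The explicit ':' separator is an addition: without it, a code ending in a digit could in principle produce the same string as a different code/pct pair.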

Either of these should return each combination of code and percentage, with a group id for the code and the total number of rows for that code. You can extend them to also return the number of rows for each specific code/pct combination, for example to work out its percentage contribution.
select distinct
    t.code, t.pct, v.groupcol, v.vol
from [tablename] t
inner join (
    select code, rank() over(order by count(*)) as groupcol,
        count(*) as vol
    from [tablename] s
    group by code
) v on v.code = t.code
or
select
    t.code, t.pct, v.groupcol, v.vol
from (select code, pct from [tablename] group by code, pct) t
inner join (
    select code, rank() over(order by count(*)) as groupcol,
        count(*) as vol
    from [tablename] s
    group by code
) v on v.code = t.code

Grouping by code and pct should be enough, I think. See the following:
select code, pct, count(*)
from [table] as p
group by code, pct

Related

Choose row that equal to the max value from a query

I want to know who has the most friends in the app I own, based on transactions: a friend is anyone the user was paid by or paid money to.
I can't get the query to show only the users with the maximum number of friends (there may be one or several, and that can change, so I can't use a fixed limit).
;with relationships as
(
    select
        paid as 'auser',
        Member_No as 'afriend'
    from Payments$
    union all
    select
        member_no as 'auser',
        paid as 'afriend'
    from Payments$
),
DistinctRelationships AS (
    SELECT DISTINCT *
    FROM relationships
)
select
    afriend,
    count(*) cnt
from DistinctRelationShips
GROUP BY
    afriend
order by
    count(*) desc
I just can't figure it out, I've tried count, max(count), where = max, nothing worked.
It's a two-column table, "Member_No" and "Paid": the member pays the money, and paid is the one who receives it.
Member_No  Paid
14         18
17         1
12         20
12         11
20         8
6          3
2          4
9          20
8          10
5          20
14         16
5          2
12         1
14         10
It's from Excel, but I loaded it into sql-server.
It's just a sample, there are 1000 more rows

It seems like you are massively over-complicating this. There is no need for self-joining.
Just unpivot each row so you have both sides of the relationship, then group it up by one side and count distinct of the other side
SELECT
    -- for just the first then SELECT TOP (1)
    -- for all that tie for the top place use SELECT TOP (1) WITH TIES
    v.Id,
    Relationships = COUNT(DISTINCT v.Other),
    TotalTransactions = COUNT(*)
FROM Payments$ p
CROSS APPLY (VALUES
    (p.Member_No, p.Paid),
    (p.Paid, p.Member_No)
) v(Id, Other)
GROUP BY
    v.Id
ORDER BY
    COUNT(DISTINCT v.Other) DESC;
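
As a self-contained sketch of the TOP (1) WITH TIES variant mentioned in the comments, the query below runs against the 14 sample rows from the question. The table variable @Payments is a stand-in for the Payments$ table and is only there so the example can run on its own.

```sql
-- Stand-in for Payments$, loaded with the sample rows from the question.
DECLARE @Payments TABLE (Member_No int, Paid int);
INSERT INTO @Payments (Member_No, Paid) VALUES
    (14, 18), (17, 1), (12, 20), (12, 11), (20, 8), (6, 3), (2, 4),
    (9, 20), (8, 10), (5, 20), (14, 16), (5, 2), (12, 1), (14, 10);

-- WITH TIES keeps every row that ties with the first row on the ORDER BY
-- expression, so all users sharing the maximum friend count are returned.
SELECT TOP (1) WITH TIES
    v.Id,
    Relationships = COUNT(DISTINCT v.Other)
FROM @Payments AS p
CROSS APPLY (VALUES
    (p.Member_No, p.Paid),
    (p.Paid, p.Member_No)
) AS v(Id, Other)
GROUP BY
    v.Id
ORDER BY
    COUNT(DISTINCT v.Other) DESC;
```

On these sample rows only member 20 is returned, with 4 distinct counterparts (12, 8, 9 and 5). If several members shared that maximum, all of them would appear.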

How to repeat a record n times - SQL Server

I'm querying webdata that returns a list of items and the quantity owned. I need to translate that into multiple records - one for each item owned. For example, I might see this result: {"part_id": 118,"quantity": 3}. But in my database I need to be able to interact with each item individually, to assign them locations, properties, etc.
It would look like this:
Part_ID CopyNum
-------------------
118 1
118 2
118 3
In the past, I've kept a table I called [Count] that was just a list of integers from 1 to 100 and I did a cross join with the condition that Count.Num <= Qty
I'd like to do this without the Count table, which seems like a hack. How can I do this on the fly?

If you don't have a tally/numbers table (highly recommended), you can use an ad-hoc tally table in concert with a CROSS APPLY
Example
Declare @YourTable Table ([Part_ID] int,[Quantity] int)
Insert Into @YourTable Values
 (118,3)
,(125,2)
Select A.Part_ID
      ,CopyNum = B.N
From  @YourTable A
Cross Apply ( Select Top (Quantity) N=Row_Number() Over (Order By (Select NULL))
              From master..spt_values n1, master..spt_values n2
            ) B
Returns
Part_ID CopyNum
118 1
118 2
118 3
125 1
125 2
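
If you would rather not depend on master..spt_values, a similar ad-hoc tally can be built from a VALUES list. This is only a sketch: the CTE names digits and tally are made up, and three cross joins give 1,000 numbers, so it assumes Quantity never exceeds 1,000.

```sql
DECLARE @YourTable TABLE (Part_ID int, Quantity int);
INSERT INTO @YourTable (Part_ID, Quantity) VALUES (118, 3), (125, 2);

WITH digits AS (
    -- Ten rows; the values themselves are irrelevant, only the row count matters.
    SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS d(n)
),
tally AS (
    -- 10 x 10 x 10 rows, numbered 1..1000.
    SELECT N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM digits AS a
    CROSS JOIN digits AS b
    CROSS JOIN digits AS c
)
SELECT t.Part_ID,
       CopyNum = y.N
FROM @YourTable AS t
INNER JOIN tally AS y ON y.N <= t.Quantity
ORDER BY t.Part_ID, y.N;
```

Add another CROSS JOIN to digits if quantities can be larger.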

Results from query values in one column

I am just curious about something I've never come across in SQL Server before.
This query:
SELECT N FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) T(N)
gives me result:
+---+
| N |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
What is the rule here? Obviously this is aligning all values into one column. Is it SQL Server's grammar that defines this with T(N)?
On the other side, this query gives results by separate columns:
select 0,1,2,3,4,5,6,7,8,9
I just don't understand why results from the first query aligned all into one column?

The VALUES clause is similar to what you can use in an INSERT statement, and it's called a Table Value Constructor. Your example has only one column and several rows, but you can also have multiple columns separated by commas. T(N) defines the alias for the table (T) and the name of the column (N).
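
As a small illustration with more than one column (the alias T and the column names Id and Name are arbitrary), each parenthesised tuple becomes one row, and each comma-separated value inside a tuple becomes one column:

```sql
-- Three tuples of two values each: three rows, two columns.
SELECT T.Id, T.Name
FROM (VALUES
    (1, 'one'),
    (2, 'two'),
    (3, 'three')
) AS T(Id, Name);
```

By contrast, select 0,1,2 is a single row whose select list has three expressions, which is why it comes back as three columns.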

James Z is right on the money, but to expand on what it does in the answer you were referencing:
In the code that is pulled from, that section is used to start a numbers table for a stacked cte. The numbers themselves don't matter, but I like them like that. They could all be 1, or 0, and it would not change how it is used in this instance.
Basically we have 10 rows, and then we cross join that set to itself N times to increase the row count until we have as many rows as we need, or more. In the cross join I alias n with the resulting number of rows: deka is 10, hecto is 100, kilo is 1,000, et cetera.
Here is a similar query outside of the function that you were referencing:
declare @fromdate date = '20000101';
declare @years int = 30;
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
    select top (datediff(day, @fromdate, dateadd(year, @years, @fromdate)))
        [Date] = convert(date, dateadd(day, row_number() over(order by (select 1)) - 1, @fromdate))
    from n as deka cross join n as hecto cross join n as kilo
        cross join n as tenK cross join n as hundredK
    order by [Date]
)
select [Date]
from dates;
The stacked cte is very efficient for generating or simulating a numbers or dates table, though using an actual numbers or calendar table will perform better as the scale increases.
Check these out for related benchmarks:
Generate a set or sequence without loops - 1 - Aaron Bertrand
Generate a set or sequence without loops - 2 - Aaron Bertrand
Generate a set or sequence without loops - 3 - Aaron Bertrand
In his articles, Aaron Bertrand creates a stacked cte using
;WITH e1(n) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b),
....

Turning string into rows

I have an old vintage system with a table looking like this.
OptionsTable
id   options
===  ========================
101  Apple,Banana
102  Audi,Mercedes,Volkswagen
In the application that consumes the data, a function will break down the options column into manageable lists and populate dropdowns etc.
The problem is that this kind of data isn't very SQL friendly, making it difficult to make ad-hoc queries and reports.
To that end, I'd like to transform the data into a friendlier view, looking like this:
OptionsView
id   name        value
===  ==========  =====
101  Apple       1
101  Banana      2
102  Audi        1
102  Mercedes    2
102  Volkswagen  3
Now, there have been some topics on splitting string into rows in t-sql (Turning a Comma Separated string into individual rows comes to mind), but apart from splitting the strings into rows, I also need to generate values based on the position in the string.
The plan is to make a view that hides the ugliness of the original table.
It will be used in a join with the table housing the answers in order to make ad-hoc statistical queries.
Is there a good way of doing this without having to use cursors etc?

Perhaps adding a udf is overkill for your needs, but I created a split function a long time ago that returns the value, the startposition within the string and the index. With it, the usage in this scenario would be:
select id, String as [Name], ItemIndex as value from OptionsTable
outer apply dbo.Split(options, ',')
Results:
id Name value
101 Apple 1
101 Banana 2
102 Audi 1
102 Mercedes 2
102 Volkswagen 3
And the split function (unrevised since then):
ALTER function [dbo].[Split] (
    @StringToSplit varchar(2048),
    @Separator varchar(128))
returns table as return
with indices as
(
    select 0 S, 1 E, 0 I
    union all
    select E, charindex(@Separator, @StringToSplit, E) + len(@Separator), I + 1
    from indices
    where E > S
)
select substring(@StringToSplit, S,
        case when E > len(@Separator) then E - S - len(@Separator) else len(@StringToSplit) - S + 1 end) String
    , S StartIndex, I ItemIndex
from indices where S > 0

This should work for you:
DECLARE @OptionsTable TABLE
(
    id INT
    , options VARCHAR(100)
);
INSERT INTO @OptionsTable (id, options)
VALUES (101, 'Apple,Banana')
    , (102, 'Audi,Mercedes,Volkswagen');
SELECT OT.id, T.name, T.value
FROM @OptionsTable AS OT
CROSS APPLY (
    SELECT T.column1, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM dbo.GetTableFromList(OT.options, ',') AS T
) AS T(name, value);
Here dbo.GetTableFromList is a split string function.
CROSS APPLY executes this function for each row, so the options are split into names in separate rows. I used ROW_NUMBER() to add the value column. If you want to number the rows by name instead, use ROW_NUMBER() OVER (ORDER BY T.column1), which should make the results consistent every time.
Result:
id name value
-----------------
101 Apple 1
101 Banana 2
102 Audi 1
102 Mercedes 2
102 Volkswagen 3

You could convert your string to XML and then parse the string to transpose it to rows something like this:
SELECT A.[id]
    ,Split.a.value('.', 'VARCHAR(100)') AS Name
    ,ROW_NUMBER() OVER (PARTITION BY [id] ORDER BY (SELECT NULL)) as Value
FROM (
    SELECT [id]
        ,CAST('<M>' + REPLACE([options], ',', '</M><M>') + '</M>' AS XML) AS Name
    FROM optionstable
) AS A
CROSS APPLY Name.nodes('/M') AS Split(a);
Credits: @SRIRAM

Removing Duplicates of two columns in a query

I have a select * query which returns many rows and many columns. My problem is duplicates in one column A for the same value of another column B: I would like to include only one of them.
Basically I have a column that tells me the "name" of object and another that tells me the "number". Sometimes I have an object "name" with more than one entry for a given object "number". I only want distinct "numbers" within a "name" but I want the query to give the entire table when this is true and not just these two columns.
Name Number ColumnC ColumnD
Bob 1 93 12
Bob 2 432 546
Bob 3 443 76
This example above is fine
Name Number ColumnC ColumnD
Bob 1 93 12
Bob 2 432 546
Bill 1 443 76
Bill 2 54 1856
This example above is fine
Name Number ColumnC ColumnD
Bob 1 93 12
Bob 2 432 546
Bob 2 209 17
This example above is not fine, I only want one of the Bob 2's.

Try this if you are using SQL Server 2005 or later:
With ranked_records AS
(
    select *,
        ROW_NUMBER() OVER(Partition By name, number Order By name) [ranked]
    from MyTable
)
select * from ranked_records
where ranked = 1

If you just want the Name and number, then
SELECT DISTINCT Name, Number FROM Table1
If you want to know how many of each there are, then
SELECT Name, Number, COUNT(*) FROM Table1 GROUP BY Name, Number

By using a Common Table Expression (CTE) and the ROW_NUMBER() OVER (PARTITION BY ...) syntax as follows:
WITH
CTE AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY Name, Number ORDER BY Name, Number) AS R
    FROM
        dbo.ATable
)
SELECT
    *
FROM
    CTE
WHERE
    R = 1
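
Ordering by the same columns used in PARTITION BY leaves the choice between duplicates arbitrary. If it matters which duplicate survives, order by some other column inside the window. The sketch below uses the sample rows from the question; the table variable @ATable and the choice of ColumnC as tie-breaker are assumptions for illustration.

```sql
-- Sample rows from the question, including the duplicate (Bob, 2).
DECLARE @ATable TABLE (Name varchar(20), Number int, ColumnC int, ColumnD int);
INSERT INTO @ATable (Name, Number, ColumnC, ColumnD) VALUES
    ('Bob', 1, 93, 12),
    ('Bob', 2, 432, 546),
    ('Bob', 2, 209, 17);

WITH CTE AS
(
    SELECT
        *,
        -- Within each (Name, Number) pair, the row with the lowest ColumnC gets R = 1.
        ROW_NUMBER() OVER (PARTITION BY Name, Number ORDER BY ColumnC) AS R
    FROM @ATable
)
SELECT Name, Number, ColumnC, ColumnD
FROM CTE
WHERE R = 1
ORDER BY Name, Number;
```

Here the (Bob, 2) row with ColumnC = 209 is kept and the one with ColumnC = 432 is dropped. Use ORDER BY ColumnC DESC to keep the other one.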

WITH
CTE AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY Plant, BatchNumber ORDER BY Plant, BatchNumber) AS R
    FROM dbo.StatisticalReports
    WHERE FermBatchStartTime >= DATEADD(d, -90, GETDATE())
)
SELECT
    *
FROM
    CTE
WHERE
    R = 1
ORDER BY Plant, FermBatchStartTime
