How to calculate distinct & continuous range - sql-server

This is the table structure which records each query on a cache.
SequenceId  CacheInstance  QueryCondition
-----------------------------------------
1           100            'x=1'
2           100            'x=1'
3           100            'y=a'
4           100            'x=1'
5           200            'x=1'
5           200            'x=1'
Is there a simple statement to get the following "distinct count"?
CacheInstance  QueryCondition  distinctcount
---------------------------------------------
100            'x=1'           2
100            'y=a'           1
200            'x=1'           1
If 'x=1' occurs in consecutive rows, it is counted as one. But if it occurs again after a different query condition, the distinct count increases by 1.
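For reference, this "count consecutive runs as one" requirement is commonly handled with the gaps-and-islands pattern. A minimal sketch, assuming the table is named YourTable and that SequenceId defines the row order:
WITH numbered AS (
    SELECT CacheInstance, QueryCondition,
           -- rows in the same consecutive run share the same grp value
           ROW_NUMBER() OVER (PARTITION BY CacheInstance ORDER BY SequenceId)
         - ROW_NUMBER() OVER (PARTITION BY CacheInstance, QueryCondition ORDER BY SequenceId) AS grp
    FROM YourTable
)
SELECT CacheInstance, QueryCondition, COUNT(DISTINCT grp) AS distinctcount
FROM numbered
GROUP BY CacheInstance, QueryCondition;
Counting distinct grp values per (CacheInstance, QueryCondition) counts runs rather than rows, which yields the 2/1/1 result above.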

Try this... using GROUP BY:
SELECT CacheInstance, QueryCondition, COUNT(QueryCondition) AS distinctcount
FROM YourtableName
GROUP BY CacheInstance, QueryCondition

Use group by for those two columns
CREATE TABLE Cache
(
    SequenceId int,
    CacheInstance int,
    QueryCondition nvarchar(20)
)
SELECT CacheInstance, QueryCondition, COUNT(QueryCondition)
FROM Cache
GROUP BY CacheInstance, QueryCondition

Related

Convert Frequency Table Back to Non-Frequency Table (ungroup-ing)

In SQL Server, I have the following table (snippet) which is the source data I receive (I cannot get the raw table it was generated from).
Gradelevel | YoS | Inventory
-----------|-----|----------
4          | 0   | 4000
4          | 1   | 3500
4          | 2   | 2000
The first row of the table is saying for grade level 4, there are 4,000 people with 0 years of service (YoS).
I need to find the median YoS for each Grade level. This would be easy if the table wasn't given to me aggregated up to the Gradelevel/YoS level with a sum in the Inventory column, but sadly I'm not so lucky.
What I need is to ungroup this table so that I have a new table where the first record appears 4,000 times, the next record 3,500 times, the next 2,000, etc. (the Inventory column would not be in this new table). Then I could take PERCENTILE_DISC() of the YoS column by grade level and get the median. I could also then use other statistical functions on YoS to glean other insights from the data.
So far I've looked at UNPIVOT (which doesn't appear to be a candidate for my use case), CTEs (I can't find an example close to what I'm trying to do), and a function that iterates through the above table, inserting the number of rows indicated by the value in Inventory into a new table, which becomes my 'ungrouped' table I can run statistical analyses on. I believe the last approach is the best option available to me, but the examples I've seen all iterate over and focus on a single column from a table. I need to iterate through each row, then use the Gradelevel and YoS values to insert a row [Inventory] number of times before moving on to the next row.
Is anyone aware of:
A better way to do this other than the iteration/cursor method?
How to iterate through a table to accomplish my goal? I've been reading Is there a way to loop through a table variable in TSQL without using a cursor? but am having a hard time figuring out how to apply that iteration to my use case.
Edit 10/3: here is the looping code I got working, which produces the same result as John's CROSS APPLY. The pro is that any statistical function can then be run on it; the con is that it is slow.
--this table will hold our row (non-frequency) based inventory data
DROP TABLE IF EXISTS #tempinv
CREATE TABLE #tempinv(
    amcosversionid INT NULL,   -- nullable: the INSERT below does not supply it
    pp NVARCHAR(3) NOT NULL,
    gl INT NOT NULL,
    yos INT NOT NULL
)
-- to transform the inventory frequency table to a row based inventory we need to iterate through it
DECLARE @MyCursor CURSOR, @pp AS NVARCHAR(3), @gl AS INT, @yos AS INT, @inv AS INT
BEGIN
    SET @MyCursor = CURSOR FOR
        SELECT payplan, gradelevel, step_yos, SUM(inventory) AS inventory
        FROM mytable
        GROUP BY payplan, gradelevel, step_yos

    OPEN @MyCursor
    FETCH NEXT FROM @MyCursor
    INTO @pp, @gl, @yos, @inv

    WHILE @@FETCH_STATUS = 0
    BEGIN
        DECLARE @i INT
        SET @i = 1
        --insert into our new table for each number of people in inventory
        WHILE @i <= @inv
        BEGIN
            INSERT INTO #tempinv (pp, gl, yos) VALUES (@pp, @gl, @yos)
            SET @i = @i + 1
        END
        FETCH NEXT FROM @MyCursor
        INTO @pp, @gl, @yos, @inv
    END

    CLOSE @MyCursor
    DEALLOCATE @MyCursor
END;
One option is to use a CROSS APPLY in concert with an ad-hoc tally table. This will "expand" your data into N rows. Then you can perform any analysis you want.
Example
Select *
From YourTable A
Cross Apply (
    Select Top ([Inventory]) N=Row_Number() Over (Order By (Select NULL))
    From master..spt_values n1, master..spt_values n2
) B
Returns
Grd  Yos  Inven  N
4    0    4000   1
4    0    4000   2
4    0    4000   3
4    0    4000   4
4    0    4000   5
...
4    0    4000   3998
4    0    4000   3999
4    0    4000   4000
4    1    3500   1
4    1    3500   2
4    1    3500   3
4    1    3500   4
...
4    1    3500   3499
4    1    3500   3500
4    2    2000   1
4    2    2000   2
4    2    2000   3
...
4    2    2000   1999
4    2    2000   2000
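Once the rows are expanded this way, the median the question asks about can be read off directly. A minimal sketch, assuming the source table is named YourTable with the Gradelevel, YoS, and Inventory columns shown in the question (SQL Server 2012+):
SELECT DISTINCT
       A.Gradelevel,
       -- median years of service per grade level over the expanded rows
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY A.YoS)
           OVER (PARTITION BY A.Gradelevel) AS MedianYoS
FROM YourTable A
CROSS APPLY (
    SELECT TOP ([Inventory]) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM master..spt_values n1, master..spt_values n2
) B;
PERCENTILE_DISC is an analytic (windowed) function in SQL Server, so DISTINCT collapses the repeated per-row results down to one row per grade level.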

How to fix Aggregation in Group By, missing aggregation values

I have a table of sales info and am interested in grouping by customer and returning the sum, count, and max of a few columns. Any ideas, please?
I checked that all the SELECT columns are included in the GROUP BY statement, but a detail row is returned for each record rather than the groupings and aggregate values.
I tried some explicit naming but that didn't help.
SELECT
    customerID AS CUST,
    COUNT([InvoiceID]) AS Count_Invoice,
    SUM([Income]) AS Total_Income,
    SUM([inc2015]) AS Tot_2015_Income,
    SUM([inc2016]) AS Tot_2016_Income,
    MAX([prodA]) AS prod_A
FROM [table_a]
GROUP BY
    customerID, InvoiceID, Income, inc2015, inc2016, prodA
There are multiple rows per CUST, but there should be one row for CUST 1, 2, etc. It should say this:
CUST  Count_Invoice  Total_Income  Tot_2015_Income  Tot_2016_Income  prod_A
----  -------------  ------------  ---------------  ---------------  ------
1     2              600           300              300              2
BUT IT IS RETURNING THIS
CUST  Count_Invoice  Total_Income  Tot_2015_Income  Tot_2016_Income  prod_A
----  -------------  ------------  ---------------  ---------------  ------
1     1              300           300              0                1
1     1              300           0                300              1
2     1              300           0                300              1
2     1              500           0                500              0
3     2              800           0                800              0
3     1              300           0                300              1
You don't need to group by the other columns, since they are already being aggregated with COUNT, MIN, MAX, or SUM.
So you may try this:
SELECT customerID as CUST
,count([InvoiceID]) as Count_Invoice
,sum([Income]) as Total_Income
,sum([inc2015]) as Tot_2015_Income
,sum([inc2016]) as Tot_2016_Income
,max([prodA]) as prod_A --- here you are taking Max but in output it seems like sum
FROM [table_a]
Group By customerID
Note: for the prod_A column you are using MAX, which gives 1, but the expected result shows 2, which is actually a SUM or COUNT. Please check.
For more info, see the documentation on GROUP BY.
From the description of your expected output, you should be aggregating by customer alone:
SELECT
customerID AS CUST,
COUNT([InvoiceID]) AS Count_Invoice,
SUM([Income]) AS Total_Income,
SUM([inc2015]) AS Tot_2015_Income,
SUM([inc2016]) AS Tot_2016_Income,
MAX([prodA]) AS prod_A
FROM [table_a]
GROUP BY
customerID;

Transitive Group Query on 2 Columns in SQL Server

I need help with a transitive query in SQL Server.
I have a table with [ID] and [GRPID].
I would like to update a third column [NEWGRPID] based on the following logic:
For each [ID], get its GRPID;
Get all of the IDs associated with the GRPID from (1);
Set [NEWGRPID] equal to an integer (variable that is incremented by 1), for all of the rows from step (2)
The idea is that several of these IDs are "transitively" linked across different [GRPID]s and should all end up with the same [NEWGRPID].
The below table is the expected result, with [NEWGRPID] populated.
ID  GRPID  NEWGRPID
--  -----  --------
1   345    1
1   777    1
2   777    1
3   345    1
3   777    1
4   345    1
4   999    1
5   345    1
5   877    1
6   999    1
7   877    1
8   555    2
9   555    2
Try this code:
IF OBJECT_ID('tempdb..#tmp') IS NOT NULL
BEGIN
DROP TABLE #tmp;
END;
SELECT GRPID, COUNT(*) AS GRPCNT
INTO #tmp
FROM yourtable
GROUP BY GRPID

UPDATE TGT
SET TGT.NEWGRPID = SRC.GRPCNT
FROM yourtable TGT
JOIN #tmp SRC ON SRC.GRPID = TGT.GRPID
If the values are likely to change over time you should think about a computed column or a trigger.
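For the transitive ("connected components") grouping the question describes, one option is a label-propagation loop. A minimal sketch, assuming the data lives in yourtable with the ID and GRPID columns from the question, and that reading NEWGRPID from a final SELECT is acceptable:
-- seed every row with its own ID as a provisional component label
SELECT ID, GRPID, CAST(ID AS INT) AS comp
INTO #work
FROM yourtable;

WHILE 1 = 1
BEGIN
    -- pull the smallest label across rows that share a GRPID
    UPDATE w
    SET comp = m.min_comp
    FROM #work w
    JOIN (SELECT GRPID, MIN(comp) AS min_comp FROM #work GROUP BY GRPID) m
        ON m.GRPID = w.GRPID
    WHERE w.comp > m.min_comp;
    IF @@ROWCOUNT = 0 BREAK;

    -- pull the smallest label across rows that share an ID
    UPDATE w
    SET comp = m.min_comp
    FROM #work w
    JOIN (SELECT ID, MIN(comp) AS min_comp FROM #work GROUP BY ID) m
        ON m.ID = w.ID
    WHERE w.comp > m.min_comp;
    IF @@ROWCOUNT = 0 BREAK;
END;

-- number the finished components 1, 2, ... as NEWGRPID
SELECT ID, GRPID, DENSE_RANK() OVER (ORDER BY comp) AS NEWGRPID
FROM #work
ORDER BY ID, GRPID;
Against the sample data this yields NEWGRPID 1 for IDs 1-7 and 2 for IDs 8 and 9, matching the expected table.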

How to create range table based on table values

I have this table
TableA
ID  Value
--  -----
1   125
2   400
3   99
4   130
5   300
6   350
7   399
..
..
I want the table below as output, where the range offset (100) is predefined. A range value of 100 means TableA values between 0-100, 200 means 101-200.
ResultTable
Range  Count
-----  -----
100    1
200    2
300    1
400    3
..
..
What is the best way to do this? Any ideas or suggestions?
Depending on which RDBMS you are using, the syntax will be a bit different (the example is for Oracle), but the general idea is:
CREATE TABLE new_table AS
SELECT CAST(value/100 AS INT)*100 AS range, COUNT(*) AS cnt
FROM old_table
GROUP BY CAST(value/100 AS INT)*100;
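If the target is SQL Server (as elsewhere on this page) and the buckets should match the expected output, which puts 125 in the 200 bucket, a T-SQL sketch along these lines may fit; TableA and its Value column are taken from the question:
SELECT CEILING([Value] / 100.0) * 100 AS [Range],
       COUNT(*)                       AS [Count]
FROM TableA
GROUP BY CEILING([Value] / 100.0) * 100
ORDER BY [Range];
Dividing by 100.0 forces decimal division, so CEILING rounds 99 up to bucket 100 and 125 up to bucket 200.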

Need To select Data From One Table After Minus With One Value

I have a table and a single value
Table 1
SNo  Amount
---  ------
1    100
2    500
3    400
4    100

Value: 800
Now I want the result from the table after subtracting the value.
I would like a rolling subtraction, that is, subtract the value 800 from Table 1's first Amount and then carry the remainder into the subsequent rows. E.g.:
for type-1
800 - 100 (Record1) = 700
700 - 500 (record2) = 200
200 - 400 (record3) = -200
The output starts from record 3, with the remaining balance of 200.
Table-Output
SNo  Amount
---  ------
1    200
2    100
That means that if 800 is subtracted from the first table, the first 2 records are removed, and in the third record 200 is the remaining balance.
The easiest way to do this would be to use a running aggregate. In your original example you had two tables; if that is the case, simply run a SUM on that table, as I am doing in the subselect, and store the value in the variable I created, @Sum.
The first CTE calculates the running total for each record; the second subtracts @Sum from it and keeps the rows where the result is positive.
I believe that this will fit your need.
DECLARE @Sum INT;
SET @Sum = 800;

WITH RunningTotals
AS (
    SELECT [SNo]
         , [Amount]
         , [Amount] + (
               SELECT ISNULL(SUM([Amount]), 0)
               FROM [Table1] t2
               WHERE t2.[SNo] < t.[SNo]
           ) [sums]
    FROM [Table1] t
),
option_sums
AS (
    SELECT ROW_NUMBER() OVER ( ORDER BY [SNo] ) [SNo]
         , CASE WHEN ( [sums] - @Sum ) > 0 THEN [sums] - @Sum
                ELSE [Amount]
           END AS [Amount]
         , [sums]
         , [Amount] [OriginalAmount]
         , [OriginalID] = [SNo]
    FROM [RunningTotals] rt
    WHERE ( [sums] - @Sum ) > 0
)
SELECT [SNo]
     , CASE [SNo]
           WHEN 1 THEN [Amount]
           ELSE [OriginalAmount]
       END AS [Amount]
     , [OriginalID]
FROM option_sums
SNo  Amount  OriginalID
---  ------  ----------
1    200     3
2    100     4
3    100     5
4    500     6
5    400     7
6    100     8
7    200     9
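On SQL Server 2012 and later, the correlated subquery that builds the running total can also be written with a windowed SUM. A minimal sketch, assuming the same Table1 and the 800 value from the question:
DECLARE @Sum INT = 800;

WITH RunningTotals AS (
    SELECT [SNo], [Amount],
           -- running total of Amount in SNo order
           SUM([Amount]) OVER (ORDER BY [SNo] ROWS UNBOUNDED PRECEDING) AS [Sums]
    FROM [Table1]
)
SELECT ROW_NUMBER() OVER (ORDER BY [SNo])        AS [SNo],
       CASE WHEN [Sums] - @Sum < [Amount]
            THEN [Sums] - @Sum                   -- partially consumed row keeps the remainder
            ELSE [Amount] END                    AS [Amount],
       [SNo]                                     AS [OriginalID]
FROM RunningTotals
WHERE [Sums] - @Sum > 0;
With the four sample rows from the question this returns (1, 200, 3) and (2, 100, 4), matching the Table-Output above.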
