Convert Frequency Table Back to Non-Frequency Table (ungroup-ing) - sql-server

In SQL Server, I have the following table (snippet) which is the source data I receive (I cannot get the raw table it was generated from).
Gradelevel | YoS | Inventory
4 | 0 | 4000
4 | 1 | 3500
4 | 2 | 2000
The first row of the table is saying for grade level 4, there are 4,000 people with 0 years of service (YoS).
I need to find the median YoS for each Grade level. This would be easy if the table wasn't given to me aggregated up to the Gradelevel/YoS level with a sum in the Inventory column, but sadly I'm not so lucky.
What I need is to ungroup this table so that the first record appears 4,000 times, the next 3,500 times, the next 2,000 times, and so on (the Inventory column would not be in the new table). Then I could take PERCENTILE_DISC() of the YoS column by grade level and get the median. I could also use other statistical functions on YoS to glean further insights from the data.
So far I've looked at UNPIVOT (doesn't appear to be a candidate for my use case), CTEs (can't find an example close to what I'm trying to do), and a function which iterates through the above table, inserting the number of rows indicated by the value in Inventory into a new table, which becomes the 'ungrouped' table I can run statistical analyses on. I believe the last approach is the best option available to me, but all the examples I've seen iterate over a single column of a table. I need to iterate through each row, then use the gradelevel and yos values to insert [inventory] rows before moving on to the next row.
Is anyone aware of:
A better way to do this other than the iteration/cursor method?
How to iterate through a table to accomplish my goal? I've been reading Is there a way to loop through a table variable in TSQL without using a cursor? but am having a hard time figuring out how to apply that iteration to my use case.
Edit 10/3, here is the looping code I got working which produces the same as John's cross apply. Pro is any statistical function can then be run on it, con is it is slow.
--this table will hold our row (non-frequency) based inventory data
DROP TABLE IF EXISTS #tempinv;
CREATE TABLE #tempinv(
    amcosversionid INT NULL, -- nullable: the loop below never populates this column
    pp NVARCHAR(3) NOT NULL,
    gl INT NOT NULL,
    yos INT NOT NULL
);
-- to transform the inventory frequency table to a row based inventory we need to iterate through it
DECLARE @MyCursor CURSOR, @pp AS NVARCHAR(3), @gl AS INT, @yos AS INT, @inv AS INT;
BEGIN
    SET @MyCursor = CURSOR FOR
        SELECT payplan, gradelevel, step_yos, SUM(inventory) AS inventory
        FROM mytable
        GROUP BY payplan, gradelevel, step_yos;
    OPEN @MyCursor;
    FETCH NEXT FROM @MyCursor INTO @pp, @gl, @yos, @inv;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        DECLARE @i INT;
        SET @i = 1;
        --insert into our new table for each number of people in inventory
        WHILE @i <= @inv
        BEGIN
            INSERT INTO #tempinv (pp, gl, yos) VALUES (@pp, @gl, @yos);
            SET @i = @i + 1;
        END;
        FETCH NEXT FROM @MyCursor INTO @pp, @gl, @yos, @inv;
    END;
    -- release the cursor once every source row has been expanded
    CLOSE @MyCursor;
    DEALLOCATE @MyCursor;
END;

One option is to use a CROSS APPLY in concert with an ad-hoc tally table. This will "expand" your data into N rows. Then you can perform any analysis you want.
Example
Select *
From YourTable A
Cross Apply (
Select Top ([Inventory]) N=Row_Number() Over (Order By (Select NULL))
From master..spt_values n1, master..spt_values n2
) B
Returns
Grd Yos Inven N
4 0 4000 1
4 0 4000 2
4 0 4000 3
4 0 4000 4
4 0 4000 5
...
4 0 4000 3998
4 0 4000 3999
4 0 4000 4000
4 1 3500 1
4 1 3500 2
4 1 3500 3
4 1 3500 4
...
4 1 3500 3499
4 1 3500 3500
4 2 2000 1
4 2 2000 2
4 2 2000 3
...
4 2 2000 1999
4 2 2000 2000
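To get from the expanded rows to the median the question asks for, one possible sketch follows. It assumes SQL Server 2012 or later (for PERCENTILE_DISC and ordered window sums). The table variable @freq and its three rows are stand-ins copied from the question's snippet, not the real source table. The second query is an alternative that is not part of the answer above: it reads the discrete median straight off the frequency table with a running total, so no expansion is needed.

```sql
-- Stand-in for the source table (hypothetical name); rows copied from the question.
DECLARE @freq TABLE (Gradelevel INT, YoS INT, Inventory INT);
INSERT INTO @freq (Gradelevel, YoS, Inventory)
VALUES (4, 0, 4000), (4, 1, 3500), (4, 2, 2000);

-- 1) Expand with the CROSS APPLY tally, then take the discrete median per grade level.
--    PERCENTILE_DISC is a window function in T-SQL, hence the DISTINCT.
SELECT DISTINCT
       A.Gradelevel,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY A.YoS)
           OVER (PARTITION BY A.Gradelevel) AS MedianYoS
FROM @freq AS A
CROSS APPLY (
    SELECT TOP (A.Inventory) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM master..spt_values n1, master..spt_values n2
) AS B;

-- 2) Alternative without expansion: the discrete median is the smallest YoS
--    whose cumulative inventory reaches at least half of the group total.
WITH running AS (
    SELECT Gradelevel,
           YoS,
           SUM(CAST(Inventory AS BIGINT))
               OVER (PARTITION BY Gradelevel ORDER BY YoS
                     ROWS UNBOUNDED PRECEDING) AS CumInventory,
           SUM(CAST(Inventory AS BIGINT))
               OVER (PARTITION BY Gradelevel) AS TotalInventory
    FROM @freq
)
SELECT Gradelevel, MIN(YoS) AS MedianYoS
FROM running
WHERE 2 * CumInventory >= TotalInventory
GROUP BY Gradelevel;
```

For the three sample rows, both queries return a median YoS of 1 for grade level 4: 7,500 of the 9,500 people have a YoS of 1 or less, while only 4,000 have a YoS of 0. The second form only answers percentile-style questions, so the expanded rows are still the more general route if other statistical functions are needed.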

Related

Move a specific column for every rows of a group from a result set into a single row

I tried using pivot in SQL Server, but I'm just going in circles with no good results.
I have this result set that could vary in number of records:
ForeignID | Name | Value
1 | A | 1
1 | B | 2
1 | C | 3
2 | D | 4
How can I do a SELECT to get this for all rows with ForeignID of 1:
ForeignID | A | B | C
1 | 1 | 2 | 3
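One way this could be approached is with PIVOT. The sketch below is only an illustration: the table variable @results, its column types, and the sample rows are assumptions based on the result set shown in the question, and the Name values A, B and C are hard-coded because PIVOT needs its output columns listed in advance.

```sql
-- Hypothetical stand-in for the result set in the question.
DECLARE @results TABLE (ForeignID INT, Name VARCHAR(10), Value INT);
INSERT INTO @results (ForeignID, Name, Value)
VALUES (1, 'A', 1), (1, 'B', 2), (1, 'C', 3), (2, 'D', 4);

-- Turn each Name into its own column for the rows with ForeignID = 1.
SELECT p.ForeignID, p.[A], p.[B], p.[C]
FROM (
    SELECT ForeignID, Name, Value
    FROM @results
    WHERE ForeignID = 1
) AS src
PIVOT (
    MAX(Value) FOR Name IN ([A], [B], [C])
) AS p;
```

With the sample rows this returns a single row: ForeignID 1 with A = 1, B = 2, C = 3. If the set of names varies between runs, the column list would have to be built with dynamic SQL, which this sketch does not attempt.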

A WHILE LOOP in SQL that I want to have loop once with 5 updates through all rows

I have a WHILE LOOP in an SQL query.
I have a table with 5 rows matching the counter.
I'm randomizing 2048 rows and want to insert the values 1 - 5 randomly into a single column across those rows. What I'm getting instead is that the query loops once over all 2048 rows and inserts "1", then loops a second time and inserts "5", then "3", then "4", and finally "2".
What I want is to loop through the 2048 rows a single time and insert a random value from 1 - 5 into that column for each row.
Here's the SQL, which runs but gives the wrong result.
declare @counter int
SET @counter = 1
BEGIN TRAN
WHILE (@counter <= 6)
BEGIN
    SELECT id, city, wage_level
    FROM myFirstTable
    ORDER BY NEWID()
    UPDATE myFirstTable
    SET wage_level = @counter
    SET @counter = @counter + 1
    CONTINUE
END
COMMIT TRAN
The values in the table that contain 5 rows are irrelevant but fact that the "IDs" in that table are from 1 - 5 "ARE."
I'm close, but no cigar...
The result should be something like this:
id-----city------wage_level
---------------------
1 Denver 2
2 Chicago 3
3 Seattle 5
4 Los Angeles 1
5 Boise 4
---
2047 Charleston 2
2048 Rochester 1
And so on...
Thanks, everyone
No need for a loop. SQL works best on a set based approach.
Here is one way to do it:
Create and populate sample table (Please save us this step in your future questions)
CREATE TABLE myFirstTable
(
id int identity(1,1),
city varchar(20),
wage_level int
)
INSERT INTO myFirstTable (city) VALUES
('Denver'),
('Chicago'),
('Seattle'),
('Los Angeles'),
('Boise')
The update statement:
UPDATE myFirstTable
SET wage_level = (ABS(CHECKSUM(NEWID())) % 5) + 1
Check the update:
SELECT *
FROM myFirstTable
Results:
id city wage_level
1 Denver 3
2 Chicago 3
3 Seattle 2
4 Los Angeles 4
5 Boise 3
Explanation: use NEWID() to generate a GUID, CHECKSUM() to get a number based on that GUID, ABS() to get only non-negative values, % 5 to get only values between 0 and 4, and finally + 1 to get only values between 1 and 5.
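The chain of functions in that explanation can be looked at one step at a time. The following is a minimal sketch, assuming SQL Server 2008 or later for the inline DECLARE initialisation; the variable @g and the column aliases are made up for illustration. The values differ on every run, so no output is shown.

```sql
-- Capture one GUID so every column below is derived from the same value.
DECLARE @g UNIQUEIDENTIFIER = NEWID();

SELECT @g                           AS guid_value,     -- random GUID
       CHECKSUM(@g)                 AS checksum_value, -- signed INT, may be negative
       ABS(CHECKSUM(@g))            AS non_negative,   -- sign removed
       ABS(CHECKSUM(@g)) % 5        AS zero_to_four,   -- remainder 0..4
       (ABS(CHECKSUM(@g)) % 5) + 1  AS one_to_five;    -- shifted to 1..5

-- Variant that takes the remainder first. ABS() raises an arithmetic overflow
-- if CHECKSUM() happens to return -2147483648; applying % 5 before ABS()
-- keeps the intermediate value within -4..4 and avoids that rare case.
SELECT ABS(CHECKSUM(NEWID()) % 5) + 1 AS wage_level;
```

In the UPDATE statement, NEWID() is evaluated once per row, which is why each row gets its own random wage_level rather than one value for the whole table.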

SQL Server : how to loop through a table applying multi-record updates to another?

I am looking to update records in a table X, based on a table Y containing details of the updates. The difficulties are that
One row of Y represents an update to a specified number of records in X.
There may be multiple records in Y that specify different updates to records in X that are alike in the field by which they are matched; in this case, the updates should be applied to disjoint subsets of X.
Suppose X = materials(id, type_id, status, data); Y = material_updates(run_id, type_id, quantity, data)
(id is just an internal primary key field)
Then what I'd like to do is (the equivalent of) to loop through a simple query like
SELECT *
FROM material_updates
WHERE run_id = :run;
and for each row of the result set, apply something like
UPDATE TOP(row.quantity) materials
SET data = row.data, status = 1
WHERE status = 0 AND type_id = row.type_id;
(the change to status happens to be constant in the problem I am trying to solve)
Sample data
material_updates table:
run_id type_id quantity data
1 1 3 42
1 2 2 69
1 2 1 105
materials table before the update:
type_id status data
1 1 17
1 1 17
1 0 0
1 0 0
1 0 0
1 0 0
2 0 0
2 0 0
2 0 0
2 0 0
materials table after the update:
type_id status data
1 1 17
1 1 17
1 1 42
1 1 42
1 1 42
1 0 0
2 1 69
2 1 69
2 1 105
2 0 0
I think it can be done using a cursor, but is this the best solution, or is there a more efficient way?
This is a good fit for a CURSOR, which allows you to iterate over the results of a query row by row and perform operations for each one.
This one here is a good tutorial about it.
Your need would be solved by this piece of code:
-- the best fit for this code would be a Stored Procedure with one parameter
-- which is the run_id value you want.
-- error checking omitted for brevity
DECLARE theCursor CURSOR
FOR SELECT type_id, quantity, data FROM material_updates WHERE run_id = @run_id;
DECLARE @type_id int; -- types should match your material_updates fields
DECLARE @quantity int;
DECLARE @data int;
OPEN theCursor;
FETCH NEXT FROM theCursor INTO @type_id, @quantity, @data;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE TOP(@quantity) materials
    SET data = @data, status = 1
    WHERE status = 0 AND type_id = @type_id;
    -- advance to the next update row; without this the loop never ends
    FETCH NEXT FROM theCursor INTO @type_id, @quantity, @data;
END;
CLOSE theCursor;
DEALLOCATE theCursor;
Another solution would be using UPDATE FROM (SO already has info about it) but I'm not aware of a way to make it update a specific quantity of rows. It most likely can't do this.
Beware though that the data you're going to end up with makes no sense, because there is no ORDER: you'll never know which rows will be/have been updated.
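For comparison, a set-based version appears to be possible by numbering the open rows and matching them against running totals of quantity. This is only a sketch under stated assumptions: it needs SQL Server 2012 or later for the ordered window SUM, the table variables stand in for the real tables with the sample data from the question, the explicit id values are invented, and ordering the updates by data is an arbitrary choice made only so that each update row gets its own disjoint slice.

```sql
-- Stand-ins for the real tables, loaded with the question's sample data.
DECLARE @materials TABLE (id INT PRIMARY KEY, type_id INT, status INT, data INT);
DECLARE @material_updates TABLE (run_id INT, type_id INT, quantity INT, data INT);

INSERT INTO @materials (id, type_id, status, data) VALUES
    (1, 1, 1, 17), (2, 1, 1, 17), (3, 1, 0, 0), (4, 1, 0, 0), (5, 1, 0, 0),
    (6, 1, 0, 0), (7, 2, 0, 0), (8, 2, 0, 0), (9, 2, 0, 0), (10, 2, 0, 0);
INSERT INTO @material_updates (run_id, type_id, quantity, data) VALUES
    (1, 1, 3, 42), (1, 2, 2, 69), (1, 2, 1, 105);

DECLARE @run_id INT = 1;

WITH slots AS (
    -- Number the still-open rows within each type.
    SELECT id, type_id,
           ROW_NUMBER() OVER (PARTITION BY type_id ORDER BY id) AS rn
    FROM @materials
    WHERE status = 0
),
ranges AS (
    -- Each update row claims slots (last_rn - quantity, last_rn] within its type.
    SELECT type_id, data, quantity,
           SUM(quantity) OVER (PARTITION BY type_id ORDER BY data
                               ROWS UNBOUNDED PRECEDING) AS last_rn
    FROM @material_updates
    WHERE run_id = @run_id
)
UPDATE m
SET data = r.data, status = 1
FROM @materials AS m
JOIN slots AS s ON s.id = m.id
JOIN ranges AS r ON r.type_id = s.type_id
                AND s.rn >  r.last_rn - r.quantity
                AND s.rn <= r.last_rn;

SELECT id, type_id, status, data FROM @materials ORDER BY id;
```

With the sample data, ids 3 to 5 receive 42, ids 7 and 8 receive 69, id 9 receives 105, and ids 6 and 10 stay untouched, which matches the "after" table in the question. Unlike UPDATE TOP, the ORDER BY id in the numbering step makes it predictable which rows are changed.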

How to calculate distinct & continuous range

This is the table structure which records each query on a cache.
SequenceId CacheInstance QueryCondition
------------------------------------------
1 100 'x=1 '
2 100 'x=1'
3 100 'y=a'
4 100 'x=1'
5 200 'x=1'
5 200 'x=1'
Is there a simple statement to get the following "distinct count"?
CacheInstance QueryCondition distinctcount
-------------------------------------------
100 'x=1' 2
100 'y=a' 1
200 'x=1' 1
If 'x=1' occurs several times in a row, it is counted once. But if it occurs again after a different query condition, the distinct count increases by 1.
try this...using group by
select CacheInstance,QueryCondition ,COUNT(QueryCondition) as distinctcount from YourtableName group by CacheInstance,QueryCondition
Use group by for those two columns
CREATE TABLE Cache
(
SequenceId int,
CacheInstance int,
QueryCondition nvarchar(20)
)
SELECT CacheInstance, QueryCondition, COUNT(QueryCondition)
FROM Cache
GROUP BY CacheInstance, QueryCondition
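Note that a plain GROUP BY counts every occurrence, so for the sample data it would report 3 for 'x=1' on instance 100 rather than the 2 the question asks for. A possible sketch that counts runs instead is shown below. It assumes SQL Server 2012 or later for LAG, assumes that "continuous" is judged per CacheInstance in SequenceId order, and assumes the last sample row was meant to have SequenceId 6; the table variable @Cache is a stand-in for the real table.

```sql
-- Stand-in table with the question's sample rows (last SequenceId assumed to be 6).
DECLARE @Cache TABLE (SequenceId INT, CacheInstance INT, QueryCondition NVARCHAR(20));
INSERT INTO @Cache (SequenceId, CacheInstance, QueryCondition) VALUES
    (1, 100, N'x=1'), (2, 100, N'x=1'), (3, 100, N'y=a'),
    (4, 100, N'x=1'), (5, 200, N'x=1'), (6, 200, N'x=1');

WITH marked AS (
    -- Flag a row as the start of a run when the previous row in the same
    -- cache instance has a different condition (or there is no previous row).
    SELECT CacheInstance,
           QueryCondition,
           CASE WHEN LAG(QueryCondition)
                         OVER (PARTITION BY CacheInstance ORDER BY SequenceId)
                     = QueryCondition
                THEN 0 ELSE 1 END AS StartsRun
    FROM @Cache
)
SELECT CacheInstance, QueryCondition, SUM(StartsRun) AS distinctcount
FROM marked
GROUP BY CacheInstance, QueryCondition
ORDER BY CacheInstance, QueryCondition;
```

For these rows the query returns 2 for 'x=1' and 1 for 'y=a' on instance 100, and 1 for 'x=1' on instance 200, which is the result requested in the question.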

Find out the values between a range in SQL Server 2005(SET BASED APPROACH)?

I have a table like
Id Value
1 Start
2 Normal
3 End
4 Normal
5 Start
6 Normal
7 Normal
8 End
9 Normal
I have to bring the output like
id Value
1 Start
2 Normal
3 End
5 Start
6 Normal
7 Normal
8 End
i.e. the records between Start and End. The records with ids 4 and 9 fall outside a Start/End pair and hence are not in the output.
How to do this in set based manner (SQLServer 2005)?
Load a table @t:
-- note: the multi-row VALUES list needs SQL Server 2008+; on 2005 use one INSERT per row
declare @t table(Id int,Value nvarchar(100));
insert into @t values (1,'Start'),(2,'Normal'),(3,'End'),(4,'Normal'),(5,'Start'),(6,'Normal'),(7,'Normal'),(8,'End'),(9,'Normal');
Query:
With RangesT as (
    -- for each Start row, find the Id of the nearest following End row
    select Id, (select top 1 Id from @t where Id>p.Id and Value='End' order by Id asc) Id_to
    from @t p
    where Value='Start'
)
select crossT.*
from RangesT p
cross apply (
    select * from @t where Id>=p.Id and Id<=Id_to
) crossT
order by Id
Note that I'm assuming no overlaps. The result:
Id Value
----------- ------
1 Start
2 Normal
3 End
5 Start
6 Normal
7 Normal
8 End
