How to generate permutations in Oracle?

In Oracle, I have a table of object types.
I would like to generate all the permutations on ITEM_PURPOSE_CODE.
The table looks something like this:
ITEM_PURPOSE_CODE ITEM_CATEGORY_ID ITEM_ID
==========================================
1 101 50
2 202 94
2 202 95
What I would like then, is to generate a bunch of table types representing the permutations, for example:
ITEM_PURPOSE_CODE ITEM_CATEGORY_ID ITEM_ID
==========================================
1 101 50
2 202 94
and
ITEM_PURPOSE_CODE ITEM_CATEGORY_ID ITEM_ID
==========================================
1 101 50
2 202 95
Obviously this is a very simple case. There could be any number of item purpose codes (1 to n) and these codes could be repeated any number of times for differing item category IDs/item IDs.
Thanks for any advice.

Please find the solution to generating combinations here. It was a nice variant on a previous problem we've had in our software for real estate development.
Create and fill datamodel
First set up:
create table contents
( item_purpose_code number
, item_category_id number
, item_id number
)
/
begin
insert into contents values (1, 101, 50);
insert into contents values (2, 202, 94);
insert into contents values (2, 202, 95);
commit;
end;
/
Assisting views
First I create some views, but of course you can also inline them or use a WITH clause.
--
-- Add to each row the consecutive number of the driver columns
-- (here only item_purpose_code) and for each different value
-- for the driver columns a consecutive number that restarts
-- when a new driver column value starts.
--
create or replace force view sequencedrows
as
select item_purpose_code
, item_category_id
, item_id
, dense_rank()
over
( order
by item_purpose_code
) driver_seq
, row_number()
over
( partition
by item_purpose_code
order
by item_category_id
, item_id
)
values_per_driver_seq
from contents
/
--
-- Generate list of combinations.
--
create or replace force view combinations
as
select sys_connect_by_path (driver_seq || '-' || values_per_driver_seq, '#') || '#' combination
from sequencedrows
where level = ( select max(driver_seq) from sequencedrows )
start
with driver_seq = 1
connect
by
nocycle driver_seq = prior driver_seq + 1
/
With these, it becomes really simple since the combination is already contained in the field combination and the rows have been numbered:
select c.combination
, s.item_purpose_code
, s.item_category_id
, s.item_id
from combinations c
join sequencedrows s
on c.combination like '%#' || to_char(s.driver_seq) || '-' || to_char(s.values_per_driver_seq) || '#%'
order
by c.combination
, s.driver_seq
, s.values_per_driver_seq
/
The results are:
#1-1#2-1# 1 101 50
#1-1#2-1# 2 202 94
#1-1#2-2# 1 101 50
#1-1#2-2# 2 202 95
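For readers who want to check the expected output outside the database, the same "pick one row per item_purpose_code" idea can be sketched in Python. This is only an illustration of what the views compute, not a replacement for the Oracle solution; the function name `combinations_per_code` is made up for this example, and the rows are the three sample rows from the question.

```python
from itertools import groupby, product

# Sample rows from the question: (item_purpose_code, item_category_id, item_id)
rows = [(1, 101, 50), (2, 202, 94), (2, 202, 95)]


def combinations_per_code(rows):
    """Return every way of picking exactly one row per item_purpose_code."""
    ordered = sorted(rows)  # groupby only groups adjacent rows, so sort first
    groups = [list(g) for _, g in groupby(ordered, key=lambda r: r[0])]
    # The Cartesian product of the groups yields one row from each group.
    return [list(combo) for combo in product(*groups)]


result = combinations_per_code(rows)
for combo in result:
    print(combo)
```

The number of combinations is the product of the group sizes, so it grows quickly when many purpose codes each have many rows.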
Performance
Depending on the data volume and indexes, performance may be insufficient for interactive use. In our real estate development package, however, we have found performance acceptable on Oracle 11g and later, even with 50K rows generated. Oracle 10g did a less optimal job of optimizing this.
When performance is unacceptable at your site, please list some key statistics or add a reproduction scenario.

Related

How to repeat a record n times - SQL Server

I'm querying webdata that returns a list of items and the quantity owned. I need to translate that into multiple records - one for each item owned. For example, I might see this result: {"part_id": 118,"quantity": 3}. But in my database I need to be able to interact with each item individually, to assign them locations, properties, etc.
It would look like this:
Part_ID CopyNum
-------------------
118 1
118 2
118 3
In the past, I've kept a table I called [Count] that was just a list of integers from 1 to 100, and I did a cross join with the condition that Count.Num <= Qty.
I'd like to do this without the Count table, which seems like a hack. How can I do this on the fly?
If you don't have a tally/numbers table (highly recommended), you can use an ad-hoc tally table in concert with a CROSS APPLY
Example
Declare @YourTable Table ([Part_ID] int,[Quantity] int)
Insert Into @YourTable Values
(118,3)
,(125,2)
Select A.Part_ID
,CopyNum = B.N
From @YourTable A
Cross Apply ( Select Top (Quantity) N=Row_Number() Over (Order By (Select NULL))
From master..spt_values n1, master..spt_values n2
) B
Returns
Part_ID CopyNum
118 1
118 2
118 3
125 1
125 2
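If the web data is already being handled in application code before it reaches the database, the same expansion can be done there. The following is a minimal Python sketch under that assumption; `expand_copies` is a hypothetical helper name, and the input pairs mirror the (Part_ID, Quantity) sample above.

```python
def expand_copies(parts):
    """Turn (part_id, quantity) pairs into one (part_id, copy_num) row per copy."""
    return [
        (part_id, copy_num)
        for part_id, quantity in parts
        for copy_num in range(1, quantity + 1)  # copy numbers start at 1
    ]


copies = expand_copies([(118, 3), (125, 2)])
for part_id, copy_num in copies:
    print(part_id, copy_num)
```

A quantity of zero simply produces no rows for that part.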

Convert Frequency Table Back to Non-Frequency Table (ungroup-ing)

In SQL Server, I have the following table (snippet) which is the source data I receive (I cannot get the raw table it was generated from).
Gradelevel | YoS | Inventory
4 | 0 | 4000
4 | 1 | 3500
4 | 2 | 2000
The first row of the table is saying for grade level 4, there are 4,000 people with 0 years of service (YoS).
I need to find the median YoS for each Grade level. This would be easy if the table wasn't given to me aggregated up to the Gradelevel/YoS level with a sum in the Inventory column, but sadly I'm not so lucky.
What I need is to ungroup this table such that I have a new table where the first record appears 4,000 times, the next record 3,500 times, the next 2,000, etc. (the Inventory column would not be in this new table). Then I could take PERCENTILE_DISC() of the YoS column by grade level and get the median. I could also then use other statistical functions on YoS to glean other insights from the data.
So far I've looked at unpivot (doesn't appear to be a candidate for my use case), CTEs (can't find an example close to what I'm trying to do), and a function which iterates through the above table, inserting the number of rows indicated by the value in Inventory into a new table, which becomes my 'ungrouped' table I can run statistical analyses on. I believe the last approach is the best option available to me, but all the examples I've seen iterate over and focus on a single column from a table. I need to iterate through each row, then use the gradelevel and yos values to insert [inventory] number of times before moving on to the next row.
Is anyone aware of:
A better way to do this other than the iteration/cursor method?
How to iterate through a table to accomplish my goal? I've been reading Is there a way to loop through a table variable in TSQL without using a cursor? but am having a hard time figuring out how to apply that iteration to my use case.
Edit 10/3, here is the looping code I got working which produces the same as John's cross apply. Pro is any statistical function can then be run on it, con is it is slow.
--this table will hold our row (non-frequency) based inventory data
DROP TABLE IF EXISTS #tempinv
CREATE TABLE #tempinv(
    amcosversionid INT NULL, -- nullable here because the loop below does not populate it
    pp NVARCHAR(3) NOT NULL,
    gl INT NOT NULL,
    yos INT NOT NULL
)
-- to transform the inventory frequency table to a row based inventory we need to iterate through it
DECLARE @MyCursor CURSOR, @pp AS NVARCHAR(3), @gl AS INT, @yos AS INT, @inv AS INT
BEGIN
    SET @MyCursor = CURSOR FOR
        SELECT payplan, gradelevel, step_yos, SUM(inventory) AS inventory
        FROM mytable
        GROUP BY payplan, gradelevel, step_yos
    OPEN @MyCursor
    FETCH NEXT FROM @MyCursor
    INTO @pp, @gl, @yos, @inv
    WHILE @@FETCH_STATUS = 0
    BEGIN
        DECLARE @i INT
        SET @i = 1
        --insert into our new table for each number of people in inventory
        WHILE @i <= @inv
        BEGIN
            INSERT INTO #tempinv (pp, gl, yos) VALUES (@pp, @gl, @yos)
            SET @i = @i + 1
        END
        FETCH NEXT FROM @MyCursor
        INTO @pp, @gl, @yos, @inv
    END
    -- release the cursor once every row has been processed
    CLOSE @MyCursor
    DEALLOCATE @MyCursor
END;
One option is to use a CROSS APPLY in concert with an ad-hoc tally table. This will "expand" your data into N rows. Then you can perform any analysis you want.
Example
Select *
From YourTable A
Cross Apply (
Select Top ([Inventory]) N=Row_Number() Over (Order By (Select NULL))
From master..spt_values n1, master..spt_values n2
) B
Returns
Grd Yos Inven N
4 0 4000 1
4 0 4000 2
4 0 4000 3
4 0 4000 4
4 0 4000 5
...
4 0 4000 3998
4 0 4000 3999
4 0 4000 4000
4 1 3500 1
4 1 3500 2
4 1 3500 3
4 1 3500 4
...
4 1 3500 3499
4 1 3500 3500
4 2 2000 1
4 2 2000 2
4 2 2000 3
...
4 2 2000 1999
4 2 2000 2000
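Since the stated goal is the median YoS per grade level, it may be worth noting that the discrete median can also be read straight off the frequency table with a running total, without materialising the rows. The Python sketch below is an illustration of that idea and not part of the answer above; `discrete_median` is a hypothetical helper, and it returns the lower median (the same value Python's `statistics.median_low` gives on the expanded data), which I would expect to line up with PERCENTILE_DISC(0.5), though that is an assumption worth verifying against your own data.

```python
from statistics import median_low


def discrete_median(freq_rows):
    """Lower median of a frequency table given as (value, count) pairs."""
    rows = sorted(freq_rows)
    total = sum(count for _, count in rows)
    running = 0
    for value, count in rows:
        running += count
        # The first value whose cumulative count reaches half the total
        # is the lower median.
        if 2 * running >= total:
            return value
    raise ValueError("empty frequency table")


# Grade level 4 from the question: (YoS, Inventory)
grade4 = [(0, 4000), (1, 3500), (2, 2000)]

# Cross-check against the fully expanded data.
expanded = [yos for yos, inventory in grade4 for _ in range(inventory)]
print(discrete_median(grade4), median_low(expanded))
```

For other statistics that genuinely need one row per person, the expansion shown above remains the more general route.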

SQL Server query to display all columns but with distinct values in one of the columns (not grouping anything)

I have a table with 106 columns. One of those columns is a "Type" column with 16 types.
I want 16 rows, where the Type is distinct. So, row 1 has a type of "Construction", row 2 has a type of "Elevator PVT", etc.
Using Navicat.
From what I've found (and understood) so far, I can't use Distinct (because that looks across all rows), I can't use Group By (because that's for aggregating data, which I'm not looking to do), so I'm stuck.
Please be gentle- I'm really really new at this.
Below is a part of the table (how can I share this normally?)- it's really big so I didn't share the whole thing. Below is a partial result I'm looking for, where the Violation_Type is unique and the rest of the columns display.
Got it.. Sheesh... (took me forever, but got it...)
D_ID B_ID V_ID V_Type S_ID c_f d_y l_u p_s du_p
------ ------ ------- -------------- ------ ----- ------ ------ ----- ------
184 117 V 032 Elevator PVT 2 8 0 0
4 140 V 100 Construction 1 8 0 0
10 116 V 122 Electric 1 8 2005 0 0
11 117 V 033 Boiler Local 1 0 2005 0 0
You can use ROW_NUMBER for this:
SELECT *
FROM(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY V_Type ORDER BY (SELECT NULL))
FROM tbl
)t
WHERE rn = 1
Modify the ORDER BY depending on what row you want to prioritize.
From the documentation:
Returns the sequential number of a row within a partition of a result
set, starting at 1 for the first row in each partition.
This means that for every row within a partition (specified by the PARTITION BY clause), SQL Server assigns a number starting from 1, in the order specified by the ORDER BY clause.
ROW_NUMBER requires an ORDER BY clause. SELECT NULL tells SQL Server that we do not want to enforce a particular order; we just want the rows numbered within each partition.
The WHERE rn = 1 keeps only the rows that have a ROW_NUMBER of 1. This gives you one row for every V_Type available.
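To make the "number the rows per partition, keep number 1" logic concrete, here is a small Python sketch of the same idea. It is only an analogy: `first_row_per_type` is a made-up helper, the rows are (D_ID, V_Type) pairs taken from the sample output, and the last row is a hypothetical duplicate added to show that only the first row of each type survives.

```python
def first_row_per_type(rows):
    """Keep the first row seen for each V_Type, like WHERE rn = 1."""
    first_seen = {}
    for row in rows:
        v_type = row[1]
        # setdefault stores the row only if this type has not been seen yet.
        first_seen.setdefault(v_type, row)
    return list(first_seen.values())


rows = [
    (184, "Elevator PVT"),
    (4, "Construction"),
    (10, "Electric"),
    (11, "Boiler Local"),
    (99, "Construction"),  # hypothetical duplicate type, not in the original data
]

unique_rows = first_row_per_type(rows)
for row in unique_rows:
    print(row)
```

Here "first" means first in input order; in the SQL version the ORDER BY inside the OVER clause plays that role.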

Turning string into rows

I have an old vintage system with a table looking like this.
OptionsTable
id options
=== ========================
101 Apple,Banana
102 Audi,Mercedes,Volkswagen
In the application that consumes the data, a function will break down the options column into manageable lists and populate dropdowns etc.
The problem is that this kind of data isn't very SQL friendly, making it difficult to make ad-hoc queries and reports.
To that end, I'd like to transform the data into a friendlier view, looking like this:
OptionsView
id name value
=== ========== =====
101 Apple 1
101 Banana 2
102 Audi 1
102 Mercedes 2
102 Volkswagen 3
Now, there have been some topics on splitting string into rows in t-sql (Turning a Comma Separated string into individual rows comes to mind), but apart from splitting the strings into rows, I also need to generate values based on the position in the string.
The plan is to make a view that hides the ugliness of the original table.
It will be used in a join with the table housing the answers in order to make ad-hoc statistical queries.
Is there a good way of doing this without having to use cursors etc?
Perhaps adding a UDF is overkill for your needs, but I created a split function a long time ago that returns the value, the start position within the string, and the index. With it, the usage in this scenario would be:
select id, String as [Name], ItemIndex as value from OptionsTable
outer apply dbo.Split(options, ',')
Results:
id Name value
101 Apple 1
101 Banana 2
102 Audi 1
102 Mercedes 2
102 Volkswagen 3
And the split function (unrevised since then):
ALTER function [dbo].[Split] (
@StringToSplit varchar(2048),
@Separator varchar(128))
returns table as return
with indices as
(
select 0 S, 1 E, 0 I
union all
select E, charindex(@Separator, @StringToSplit, E) + len(@Separator) , I + 1
from indices
where E > S
)
select substring(@StringToSplit,S,
case when E > len(@Separator) then e-s-len(@Separator) else len(@StringToSplit) - s + 1 end) String
,S StartIndex, I ItemIndex
from indices where S >0
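As a quick way to check what the view should contain, the "split and number by position" step can be sketched in Python. This is an illustration only, assuming the two sample rows from the question; `split_options` is a hypothetical helper and has nothing to do with the dbo.Split implementation.

```python
def split_options(option_rows, separator=","):
    """Turn (id, 'a,b,c') rows into (id, name, position) rows, positions from 1."""
    return [
        (row_id, name, position)
        for row_id, options in option_rows
        for position, name in enumerate(options.split(separator), start=1)
    ]


options_table = [(101, "Apple,Banana"), (102, "Audi,Mercedes,Volkswagen")]

options_view = split_options(options_table)
for row in options_view:
    print(row)
```

The position comes from the order of the items in the string itself, which is the property the view needs to preserve.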
This should work for you:
DECLARE @OptionsTable TABLE
(
id INT
, options VARCHAR(100)
);
INSERT INTO @OptionsTable (id, options)
VALUES (101, 'Apple,Banana')
, (102, 'Audi,Mercedes,Volkswagen');
SELECT OT.id, T.name, T.value
FROM @OptionsTable AS OT
CROSS APPLY (
SELECT T.column1, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM dbo.GetTableFromList(OT.options, ',') AS T
) AS T(name, value);
Here dbo.GetTableFromList is a split string function.
CROSS APPLY executes this function for each row, splitting options into names on separate rows. I used ROW_NUMBER() to add the value column. If you want the numbering to follow the names, use ROW_NUMBER() OVER (ORDER BY T.column1) instead, which should make the results consistent from run to run.
Result:
id name value
-----------------
101 Apple 1
101 Banana 2
102 Audi 1
102 Mercedes 2
102 Volkswagen 3
You could convert your string to XML and then parse the string to transpose it to rows something like this:
SELECT A.[id]
,Split.a.value('.', 'VARCHAR(100)') AS Name
,ROW_NUMBER() OVER (PARTITION BY [id] ORDER BY (SELECT NULL)) as Value
FROM (
SELECT [id]
,CAST('<M>' + REPLACE([options], ',', '</M><M>') + '</M>' AS XML) AS Name
FROM optionstable
) AS A
CROSS APPLY Name.nodes('/M') AS Split(a);
Credits: @SRIRAM

Partition a dataset by multiple conditions TSQL

I have an interesting requirement to partition a dataset using different conditions.
To be clear up front, it is not a simple GROUP BY or ORDER BY.
Is it a ranking? That is a little closer, but the challenge here is to write a single query for it.
I'm still looking for a straightforward option. Let me introduce the problem.
Name ----- Age ----- MarksForMaths ---- AvgByTotal
Above is a simple sample schema: the marks a few students got for maths, and their overall average marks.
I need to order this set based on the following criteria.
people who got 75 > Mathsmarks > 50 should be on top
people who got Mathsmarks > 90 should be a next set
people who average > 65 should take place thereafter
Older people Age > 55 should be a last set
Yeah, obviously rank and filter is an option, but can we do it in an optimized query?
Tip - what I did basically is create an additional column named RANK and update the column with an index based on the conditions.
Then it's just a matter of filtering the data and ordering by RANK. Piece of cake!
But the question here is: can we do it in a one-shot query? Tips appreciated.
Thanks
Does the query below fit your requirement?
DECLARE @BaseTable TABLE (Name VARCHAR(50), Age INT, MarksForMaths INT, AvgByTotal INT)
INSERT INTO @BaseTable (Name, Age, MarksForMaths, AvgByTotal)
SELECT 'A', 1, 65, 12 UNION ALL
SELECT 'B', 1, 5, 75 UNION ALL
SELECT 'C', 1, 95, 12 UNION ALL
SELECT 'D', 65, 65, 12 UNION ALL
SELECT 'E', 65, 5, 12
SELECT tmp.Name, tmp.TmpRank
FROM
(
SELECT
Name,
CASE
WHEN (MarksForMaths > 50 AND MarksForMaths < 75) THEN 1
WHEN (MarksForMaths > 90) THEN 2
WHEN (AvgByTotal > 65) THEN 3
WHEN (Age > 55) THEN 4
ELSE 5
END AS TmpRank
FROM @BaseTable
) tmp
ORDER BY tmp.TmpRank
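The key property of the CASE expression is that the first matching branch wins, so a student who satisfies several conditions lands in the earliest group. A short Python sketch of the same rule, using the five sample students from the query, may help to see which rank each row gets; `tmp_rank` is a name made up for this illustration.

```python
def tmp_rank(student):
    """Mirror the CASE expression: the first matching condition decides the rank."""
    _name, age, marks_for_maths, avg_by_total = student
    if 50 < marks_for_maths < 75:
        return 1
    if marks_for_maths > 90:
        return 2
    if avg_by_total > 65:
        return 3
    if age > 55:
        return 4
    return 5


# (Name, Age, MarksForMaths, AvgByTotal), same sample rows as the query
students = [
    ("A", 1, 65, 12),
    ("B", 1, 5, 75),
    ("C", 1, 95, 12),
    ("D", 65, 65, 12),
    ("E", 65, 5, 12),
]

# sorted() is stable, so students with the same rank keep their input order.
ranked = sorted(students, key=tmp_rank)
for student in ranked:
    print(student[0], tmp_rank(student))
```

Student D is older than 55 but still ranks 1, because the maths condition is checked first. Note that the SQL query does not specify an order within a rank, so ties there may come back in any order.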

Resources