Pivoting a table with multiple columns in SQL - sql-server

My goal here is to take a list of two corresponding store numbers and provide an output similar to:
Ultimate goal: produce a list of closest stores by travel time and distance based on source data of 2 rows per zip9 where each row is the travel time in distance, and in time, to a store in question.
The result is that each zip code has 2 stores to choose from, and the requirement is being able to return one row with both options.
+-----------+---------------+---------------------+-------------------+-------------------------+
| zip | Shortest_time | Shortest_time_store | Shortest_distance | Shortest_distance_store |
+-----------+---------------+---------------------+-------------------+-------------------------+
| 70011134 | 38.7035 | 75 | 21.3124 | 115 |
| 70011186 | 38.4841 | 75 | 21.4144 | 115 |
| 70011207 | 39.1567 | 75 | 21.1826 | 115 |
| 100013232 | 22.976 | 145 | 9.5031 | 115 |
| 112075140 | 21.888 | 145 | 7.3705 | 115 |
+-----------+---------------+---------------------+-------------------+-------------------------+
Original dataset
+---------------+--------------------------+-----------------------+------------------+
| CORRECTED_ZIP | SourceOrganizationNumber | Travel Time (Minutes) | Distance (Miles) |
+---------------+--------------------------+-----------------------+------------------+
| 70011134 | 75 | 38.7035 | 26.8628 |
| 70011134 | 115 | 39.3969 | 21.3124 |
| 70011186 | 75 | 38.4841 | 26.7609 |
| 70011186 | 115 | 39.6389 | 21.4144 |
| 70011207 | 75 | 39.1567 | 31.2771 |
| 70011207 | 115 | 39.188 | 21.1826 |
| 100013232 | 115 | 28.6561 | 9.50311 |
| 100013232 | 145 | 22.976 | 10.0307 |
| 112075140 | 115 | 36.1803 | 7.37053 |
| 112075140 | 145 | 21.888 | 9.50123 |
+---------------+--------------------------+-----------------------+------------------+
Dataset after I've modified it with this query:
SELECT TOP 1000 [corrected_zip]
, TRY_CONVERT( DECIMAL(18, 4), ROUND([Travel Time (Minutes)], 4)) AS [Unit of Measurement]
, [SourceOrganizationNumber]
, 'Time' AS [Type]
FROM [db].[dbo].[my_table_A] [tt]
WHERE [tt].[CORRECTED_ZIP] IN('070011134', '070011186', '070011207', '112075140', '100013232')
AND [Travel Time (Minutes)] IN
(
SELECT MIN([Travel Time (Minutes)])
FROM [db].[dbo].[my_table_A]
WHERE [CORRECTED_ZIP] = [tt].[CORRECTED_ZIP]
GROUP BY [CORRECTED_ZIP]
)
UNION ALL
SELECT TOP 1000 [corrected_zip]
, TRY_CONVERT( DECIMAL(18, 4), ROUND([Distance (Miles)], 4))
, [SourceOrganizationNumber]
, 'Distance'
FROM [db].[dbo].[my_table_A] [tt]
WHERE [tt].[CORRECTED_ZIP] IN('070011134', '070011186', '070011207', '112075140', '100013232')
AND [Distance (Miles)] IN
(
SELECT MIN([Distance (Miles)])
FROM [db].[dbo].[my_table_A]
WHERE [CORRECTED_ZIP] = [tt].[CORRECTED_ZIP]
GROUP BY [CORRECTED_ZIP]
)
ORDER BY [CORRECTED_ZIP];
+---------------+---------------------+--------------------------+----------+
| corrected_zip | Unit of Measurement | SourceOrganizationNumber | Type |
+---------------+---------------------+--------------------------+----------+
| 70011134 | 38.7035 | 75 | Time |
| 70011134 | 21.3124 | 115 | Distance |
| 70011186 | 21.4144 | 115 | Distance |
| 70011186 | 38.4841 | 75 | Time |
| 70011207 | 39.1567 | 75 | Time |
| 70011207 | 21.1826 | 115 | Distance |
| 100013232 | 9.5031 | 115 | Distance |
| 100013232 | 22.976 | 145 | Time |
| 112075140 | 21.888 | 145 | Time |
| 112075140 | 7.3705 | 115 | Distance |
+---------------+---------------------+--------------------------+----------+
Data after I attempted to pivot it
+---------------+--------------------------+----------+---------+
| corrected_zip | SourceOrganizationNumber | Distance | Time |
+---------------+--------------------------+----------+---------+
| 070011134 | 115 | 21.3124 | NULL |
| 070011134 | 75 | NULL | 38.7035 |
| 070011186 | 115 | 21.4144 | NULL |
| 070011186 | 75 | NULL | 38.4841 |
| 070011207 | 115 | 21.1826 | NULL |
| 070011207 | 75 | NULL | 39.1567 |
| 100013232 | 115 | 9.5031 | NULL |
| 100013232 | 145 | NULL | 22.9760 |
| 112075140 | 115 | 7.3705 | NULL |
| 112075140 | 145 | NULL | 21.8880 |
+---------------+--------------------------+----------+---------+
It seems like my issue is picking the correct store ID as opposed to grouping by store ID?

You can use row_number() twice in a subquery(once to rank by time, another by distance), and then do conditional aggregation in the outer query:
select
corrected_zip,
min(travel_time) shortest_time,
min(case when rnt = 1 then source_organization_number end) shortest_time_store,
min(distance) shortest_distance,
min(case when rnd = 1 then source_organization_number end) shortest_distance_store
from (
select
t.*,
row_number() over(partition by corrected_zip order by travel_time) rnt,
row_number() over(partition by corrected_zip order by distance) rnd
from mytable t
) t
group by corrected_zip

Related

How can I improve the response time of this query in Oracle

this query takes 24 seconds and returns 1891 results:
SELECT p.STATE, p.REFNUM, p.CODE, p.TYPE, i.STATE, pj.NAME, pj.DOCUMENT
TABLE p
inner join TABLE2 i on i.REFNUM = p.REFNUM
inner join TABLE3 pj on pj.NUMBER = i.NUMBER and p.OFIC_ID = pj.OFIC_ID and p.PUB_ID = pj.PUB_ID
inner join OFICE o on t.OFIC_ID = p.OFIC_ID and o.PUB_ID = p.PUB_ID
inner join GROUP glad on glad.GROUP_CODE=p.GROUP_CODE
WHERE glad.GROUP_TYPE ='3' AND i.STATE = '1'
AND p.PUB_ID IN ('05','11','12','09','08','13','04','02','01','06','10','03','07','14')
AND pj.NAME LIKE 'BANK%'
ORDER BY o.NAME,p.ID;
I have these indexes:
CREATE INDEX IND_TABLE1_REFNUM_ZONE ON TABLE1 (PUB_ID, OFIC_ID, REFNUM, ZONE_ID)
CREATE INDEX IND_TABLE1_REFNUMPUB ON TABLE1 (REFNUM, PUB_ID, OFIC_ID, GROUP_CODE );
CREATE INDEX IND_TABLE1_GROUP ON TABLE1 (PUB_ID, GROUP_CODE, REFNUM, OFIC_ID)
CREATE INDEX IND_TABLE2_REF ON TABLE2 (REFNUM, NUMBER, STATE);
CREATE INDEX IND_TABLE2_QUERY ON TABLE2 (NUMBER, TYPE, STATE, REFNUM, NUM, CODE);
CREATE INDEX IND_TABLE2_REFNUM ON TABLE2 (REFNUM)
CREATE INDEX IND_TABLE2_NUMBER ON TABLE2 (NUMBER)
CREATE INDEX IND_TABLE3_NUM ON TABLE3 (NUMBER, PUB_ID, OFIC_ID, NAME );
CREATE INDEX IND_TABLE3_NAME ON TABLE3 ( NAME );
CREATE INDEX IND_GROUP_COD ON GROUP (GROUP_CODE, GROUP_TYPE)
I made the following queries to see how many records are in each table:
SELECT count(*) FROM TABLE1 --> 18298458 results
SELECT count(*) FROM TABLE2 --> 60627924 results
SELECT count(*) FROM TABLE3 --> 18425913 results
SELECT count(*) FROM OFICE --> 65 results
SELECT count(*) FROM TABLE1 p INNER JOIN GROUP glad on glad.GROUP_CODE=p.GROUP_CODE where glad.GROUP_TYPE ='3' AND p.PUB_ID IN ('05','11','12','09','08','13','04','02','01','06','10','03','07','14') --> 1314077 results
SELECT count(*) FROM TABLE1 p INNER JOIN GROUP glad on glad.GROUP_CODE=p.GROUP_CODE where glad.GROUP_TYPE ='3' AND p.PUB_ID IN ('05') --> 53754 results
SELECT count(*) FROM TABLE3 WHERE NAME LIKE 'BANK%' --> 1922081 results
this is the plan generated by oracle:
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 291K| 38M| 384K| | |
| 1 | SORT ORDER BY | | 291K| 38M| 384K| | |
| 2 | HASH JOIN | | 291K| 38M| 375K| | |
| 3 | TABLE ACCESS FULL | OFICE | 64 | 960 | 3 | | |
| 4 | HASH JOIN | | 291K| 34M| 375K| | |
| 5 | INDEX SKIP SCAN | IND_GROUP_COD | 47 | 329 | 1 | | |
| 6 | HASH JOIN | | 452K| 50M| 375K| | |
| 7 | PART JOIN FILTER CREATE | :BF0000 | 452K| 50M| 375K| | |
| 8 | NESTED LOOPS | | 452K| 50M| 375K| | |
| 9 | NESTED LOOPS | | | | | | |
| 10 | STATISTICS COLLECTOR | | | | | | |
| 11 | HASH JOIN | | 2100K| 166M| 252K| | |
| 12 | NESTED LOOPS | | 2100K| 166M| 252K| | |
| 13 | STATISTICS COLLECTOR | | | | | | |
| 14 | PARTITION RANGE ALL | | 1681K| 89M| 82582 | 1 | 19 |
| 15 | PARTITION HASH ALL | | 1681K| 89M| 82582 | 1 | 32 |
| 16 | TABLE ACCESS FULL | TABLE3 | 1681K| 89M| 82582 | 1 | 608 |
| 17 | INDEX RANGE SCAN | IND_TABLE2_QUERY | 1 | 27 | 103K| | |
| 18 | INDEX FAST FULL SCAN | IND_TABLE2_QUERY | 32M| 845M| 103K| | |
| 19 | INDEX RANGE SCAN | IND_TABLE1_REFNUM_ZONE| | | | | |
| 20 | TABLE ACCESS BY GLOBAL INDEX ROWID| TABLE1 | 1 | 35 | 70380 | ROWID | ROWID |
| 21 | PARTITION RANGE ALL | | 19M| 650M| 70380 | 1 | 19 |
| 22 | PARTITION HASH JOIN-FILTER | | 19M| 650M| 70380 |:BF0000|:BF0000|
| 23 | TABLE ACCESS FULL | TABLE1 | 19M| 650M| 70380 | 1 | 608 |
-----------------------------------------------------------------------------------------------------------------
I think it takes time because this one is using TABLE ACCESS FULL for TABLE1 and TABLE3
if I perform the query filtering only PUB_ID='05' instead of all the numbers in the above query, the query returns 181 results and takes 8 seconds and in that case oracle generates this plan:
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 797 | 108K| 312K| | |
| 1 | SORT ORDER BY | | 797 | 108K| 312K| | |
| 2 | NESTED LOOPS | | 797 | 108K| 312K| | |
| 3 | HASH JOIN | | 1238 | 160K| 312K| | |
| 4 | TABLE ACCESS BY INDEX ROWID BATCHED| OFICE | 3 | 45 | 2 | | |
| 5 | INDEX RANGE SCAN | SYS_C0034405 | 3 | | 1 | | |
| 6 | HASH JOIN | | 2091 | 240K| 312K| | |
| 7 | PART JOIN FILTER CREATE | :BF0000 | 66316 | 5375K| 241K| | |
| 8 | NESTED LOOPS | | 66316 | 5375K| 241K| | |
| 9 | PARTITION RANGE ALL | | 53085 | 2903K| 82490 | 1 | 19 |
| 10 | PARTITION HASH ALL | | 53085 | 2903K| 82490 | 1 | 32 |
| 11 | TABLE ACCESS FULL | TABLE3 | 53085 | 2903K| 82490 | 1 | 608 |
| 12 | INDEX RANGE SCAN | IND_TABLE2_QUERY | 1 | 27 | 3 | | |
| 13 | PARTITION RANGE ALL | | 762K| 25M| 68657 | 1 | 19 |
| 14 | PARTITION HASH JOIN-FILTER | | 762K| 25M| 68657 |:BF0000|:BF0000|
| 15 | TABLE ACCESS FULL | TABLE1 | 762K| 25M| 68657 | 1 | 608 |
| 16 | INDEX RANGE SCAN | IND_GROUP_COD | 1 | 7 | 0 | | |
--------------------------------------------------------------------------------------------------------------
SYS_C0034405 is the primary key of OFFICE which contains these fields: (PUB_ID, REG_ID)
if in addition to filtering only PUB_ID='05' I remove the "order by", the query takes only 3.5 seconds but I definitely have to return the ordered data and I would prefer to be able to filter several PUB_IDs
I thought the query could be improved if I removed the "inner join" from GROUP and changed the filter "glad.GROUP_TYPE ='3'" to "p.GROUP_CODE in ('01','07','10','21 ')" (these are all type 3 codes), because now it should use the IND_TABLE1_GROUP index but instead of improving, it gets worse, it takes 13 seconds even filtering only PUB_ID='05'; This is the plan that oracle generates:
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 47 | 6251 | 172K| | |
| 1 | SORT ORDER BY | | 47 | 6251 | 172K| | |
| 2 | HASH JOIN | | 47 | 6251 | 172K| | |
| 3 | PARTITION RANGE ALL | | 47786 | 2613K| 82490 | 1 | 19 |
| 4 | PARTITION HASH ALL | | 47786 | 2613K| 82490 | 1 | 32 |
| 5 | TABLE ACCESS FULL | TABLE3 | 47786 | 2613K| 82490 | 1 | 608 |
| 6 | HASH JOIN | | 41945 | 3154K| 89633 | | |
| 7 | NESTED LOOPS | | 41945 | 3154K| 89633 | | |
| 8 | NESTED LOOPS | | 75740 | 3154K| 89633 | | |
| 9 | STATISTICS COLLECTOR | | | | | | |
| 10 | NESTED LOOPS | | 18935 | 924K| 15985 | | |
| 11 | TABLE ACCESS BY INDEX ROWID BATCHED | OFICE | 3 | 45 | 2 | | |
| 12 | INDEX RANGE SCAN | SYS_C0034405 | 3 | | 1 | | |
| 13 | INLIST ITERATOR | | | | | | |
| 14 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TABLE1 | 6312 | 215K| 7039 | ROWID | ROWID |
| 15 | INDEX RANGE SCAN | IND_TABLE1_GROUP | 6828 | | 284 | | |
| 16 | INDEX RANGE SCAN | IND_TABLE2_REFNUM | 4 | | 2 | | |
| 17 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 2 | 54 | 4 | | |
| 18 | INDEX FAST FULL SCAN | IND_TABLE2_QUERY | 2 | 54 | 2 | | |
------------------------------------------------------------------------------------------------------------------------------
And if I put all the PUB_IDs, Oracle generates this plan (it doesn't even use the IND_TABLE1_GROUP index anymore):
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 35661 | 4631K| 345K| | |
| 1 | SORT ORDER BY | | 35661 | 4631K| 345K| | |
| 2 | HASH JOIN | | 35661 | 4631K| 344K| | |
| 3 | TABLE ACCESS FULL | OFICE | 64 | 960 | 3 | | |
| 4 | HASH JOIN | | 35661 | 4109K| 344K| | |
| 5 | PARTITION RANGE ALL | | 1360K| 45M| 79580 | 1 | 19 |
| 6 | PARTITION HASH ALL | | 1360K| 45M| 79580 | 1 | 32 |
| 7 | TABLE ACCESS FULL | TABLE1 | 1360K| 45M| 79580 | 1 | 608 |
| 8 | HASH JOIN | | 2100K| 166M| 252K| | |
| 9 | NESTED LOOPS | | 2100K| 166M| 252K| | |
| 10 | STATISTICS COLLECTOR | | | | | | |
| 11 | PARTITION RANGE ALL | | 1681K| 89M| 82582 | 1 | 19 |
| 12 | PARTITION HASH ALL | | 1681K| 89M| 82582 | 1 | 32 |
| 13 | TABLE ACCESS FULL | TABLE3 | 1681K| 89M| 82582 | 1 | 608 |
| 14 | INDEX RANGE SCAN | IND_TABLE2_QUERY | 1 | 27 | 103K| | |
| 15 | INDEX FAST FULL SCAN | IND_TABLE2_QUERY | 32M| 845M| 103K| | |
-------------------------------------------------------------------------------------------------

How to find max(sortnumber) on item code in SQL Server?

I have following SQL Server table ITEM:
+------------+-----------+------+--------+-----------+------------+
| Date | item_code | name | in/out | total_qty | SortNumber |
+------------+-----------+------+--------+-----------+------------+
| 08/07/2019 | 001 | A | -50 | 100 | 8 |
| 07/07/2019 | 001 | A | 50 | 100 | 7 |
| 06/07/2019 | 003 | C | 25 | 25 | 6 |
| 05/07/2019 | 001 | A | 50 | 50 | 5 |
| 04/07/2019 | 002 | B | 100 | 200 | 4 |
| 03/07/2019 | 003 | C | -25 | 0 | 3 |
| 02/07/2019 | 003 | C | 25 | 25 | 2 |
| 01/07/2019 | 002 | B | 100 | 100 | 1 |
+------------+-----------+------+--------+-----------+------------+
I've tried:
select itemcode, max(Sort_Number)
from ITEM
group by item_code
order by item_code asc
but I want result:
+---------------------+-----------+------------------+
| Distinct(item_code) | Total_qty | Max(Sort_Number) |
+---------------------+-----------+------------------+
| 001 | 100 | 8 |
| 002 | 200 | 4 |
| 003 | 25 | 6 |
+---------------------+-----------+------------------+
Can anyone help me?
The below query gives you the desired result -
With cteItem as
(
select item_code, total_qty, SortNumber,
Row_Number() over (partition by item_code order by SortNumber desc) maxSortNumber
from ITEM
)
select item_code, total_qty, SortNumber from cteItem where maxSortNumber = 1
just need to add max(sort_number) to your query
select item_code ,max(total_qty), max(sort_number)
from ITEM
group by item_code
order by item_code asc

Sum, Group by and Null

I'm dipping my toes into SQL. I have the following table
+------+----+------+------+-------+
| Type | ID | QTY | Rate | Name |
+------+----+------+------+-------+
| B | 1 | 1000 | 21 | Jack |
| B | 2 | 2000 | 12 | Kevin |
| B | 1 | 3000 | 24 | Jack |
| B | 1 | 1000 | 23 | Jack |
| B | 3 | 200 | 13 | Mary |
| B | 2 | 3000 | 12 | Kevin |
| B | 4 | 4000 | 44 | Chris |
| B | 4 | 5000 | 43 | Chris |
| B | 3 | 1000 | 26 | Mary |
+------+----+------+------+-------+
I don't know how I would leverage Sum and Group by to achieve the following result.
+------+----+------+------+-------+------------+
| Type | ID | QTY | Rate | Name | Sum of QTY |
+------+----+------+------+-------+------------+
| B | 1 | 1000 | 21 | Jack | 5000 |
| B | 1 | 3000 | 24 | Jack | Null |
| B | 1 | 1000 | 23 | Jack | Null |
| B | 2 | 3000 | 12 | Kevin | 5000 |
| B | 2 | 3000 | 12 | Kevin | Null |
| B | 3 | 200 | 13 | Mary | 1200 |
| B | 3 | 1000 | 26 | Mary | Null |
| B | 4 | 4000 | 44 | Chris | 9000 |
| B | 4 | 5000 | 43 | Chris | Null |
+------+----+------+------+-------+------------+
Any help is appreciated!
You can use window function :
select t.*,
(case when row_number() over (partition by type, id order by name) = 1
then sum(qty) over (partition by type, id order by name)
end) as Sum_of_QTY
from table t;

Need to update "orderby" column

I have a table test
+----+--+------+--+--+----------+--+--------------+
| ID | | Name | | | orderby | | processgroup |
+----+--+------+--+--+----------+--+--------------+
| 1 | | ABC | | | 10 | | 1 |
| 10 | | DEF | | | 12 | | 1 |
| 15 | | LMN | | | 1 | | 1 |
| 44 | | JKL | | | 4 | | 1 |
| 42 | | XYZ | | | 3 | | 2 |
+----+--+------+--+--+----------+--+--------------+
I want to update the orderby column in the sequence, I am expecting output like
+----+--+------+--+--+----------+--+--------------+
| ID | | Name | | | orderby | | processgroup |
+----+--+------+--+--+----------+--+--------------+
| 1 | | ABC | | | 1 | | 1 |
| 10 | | DEF | | | 2 | | 1 |
| 15 | | LMN | | | 3 | | 1 |
| 44 | | JKL | | | 4 | | 1 |
| 42 | | XYZ | | | 5 | | 1 |
+----+--+------+--+--+----------+--+--------------+
Logic behind this is when we have procesgroup as 1, orderby column should update as 1,2,3,4 and when procesgroup is 2 then update orderby as 5.
This might help you
;WITH CTE AS (
SELECT ROW_NUMBER() OVER (ORDER BY processgroup, ID ) AS SNO, ID FROM TABLE1
)
UPDATE TABLE1 SET TABLE1.orderby= CTE.SNO FROM CTE WHERE TABLE1.ID = CTE.ID

Table data into Pivot

An Accounting Table has the sample data shown below:
(There could be more AcctgHeads than those shown here)
+-------+-----------+--------+-----+
| Loan | AcctgHead | Amount | D_C |
+-------+-----------+--------+-----+
| 1 | Principal | 10000 | D |
| 1 | Principal | 500 | C |
| 1 | Cash | 10000 | C |
| 1 | Cash | 500 | D |
| 2 | Principal | 5000 | D |
| 2 | Cash | 5000 | C |
| 2 | Cash | 300 | D |
| 2 | Principal | 300 | C |
| 1 | IntDue | 50 | D |
| 1 | IntIncome | 50 | C |
+-------+-----------+--------+-----+
The desired ouput is:
+------+-------------+-------------+--------+--------+----------+----------+-------------+-------------+
| Loan | Principal_D | Principal_C | Cash_D | Cash_C | IntDue_D | IntDue_C | IntIncome_D | IntIncome_C |
+------+-------------+-------------+--------+--------+----------+----------+-------------+-------------+
| 1 | 10000 | 500 | 500 | 10000 | 50 | 0 | 0 | 50 |
| 2 | 5000 | 300 | 300 | 5000 | 0 | 0 | 0 | 0 |
+------+-------------+-------------+--------+--------+----------+----------+-------------+-------------+
What would be the query to accomplish this?
Thanks in advance for the help.
Try this: (assuming your acctghead has fixed number of values as shown)
select loan,
isnull(Principal_D,0) Principal_D,
isnull(Principal_C,0) Principal_C,
isnull(Cash_D,0) Cash_D,
isnull(Cash_C,0) Cash_C,
isnull(IntDue_D,0) IntDue_D,
isnull(IntDue_C,0) IntDue_C,
isnull(IntIncome_D,0) IntIncome_D,
isnull(IntIncome_C,0) IntIncome_C
from
(select loan, amount, AcctgHead + '_' + D_C As AcctgHeadDC from t) t
pivot
(
max(amount) for AcctgHeadDC in
(Principal_D,Principal_C,Cash_D,Cash_C,
IntDue_D,IntDue_C,IntIncome_D,IntIncome_C)
) p
SQL DEMO

Resources