How to sort comma delimited column in SQL - sql-server

I have a database table with a Value column that contains comma delimited strings and I need to order on the Value column with a SQL query:
Column ID Value
1 ff, yy, bb, ii
2 kk, aa, ee
3 dd
4 cc, zz
If I use a simple query and order by Value ASC, it would result in an order of Column ID: 4, 3, 1, 2, but the desired result is Column ID: 2, 1, 4, 3, since Column ID '2' contains aa, '1' contains bb, and etc.
By the same token, order by Value DESC would result in an order of Column ID: 2, 1, 3, 4, but the desired result is Column ID: 4, 1, 2, 3.
My initial thought is to have two additional columns, 'Lowest Value' and 'Highest Value', and within the query I can order on either 'Lowest Value' and 'Highest value' depending on the sort order. But I am not quite sure how to sort the highest and lowest in each row and insert them into the appropriate columns. Or is there another solution without the use of those two additional columns within the sql statement? I'm not that proficient in sql query, so thanks for your assistance.

Best solution is not to store a single comma separated value at all. Instead have a detail table Values which can have multiple rows per ID, with one value each.
If you have the possibility to alter the data structure (and by your own suggestion of adding columns, it seems you have), I would choose that solution.
To sort by lowest value, you could then write a query similar to the one below.
select
t.ID
from
YourTable t
left join ValueTable v on v.ID = t.ID
group by
t.ID
order by
min(v.Value)
But this structure also allows you to write other, more advanced queries. For instance, this structure makes it easier and more efficient to check if a row matches a specific value, because you don't have to parse the list of values every time, and separate values can be indexed better.

string / array splitting (and creation, for that matter) is covered quite extensively. You might want to have a read of this, one of the best articles out there covering a comparison of the popular methods. Once you have the values the rest is easy.
http://sqlperformance.com/2012/07/t-sql-queries/split-strings
Funnily enough I did something like this just the other week in a cross-applied table function to do some data cleansing, improving performance 8 fold over the looped version in place.

Related

Snowflake query pruning by Column

in the Snowflake Docs it says:
First, prune micro-partitions that are not needed for the query.
Then, prune by column within the remaining micro-partitions.
What is meant with the second step?
Let's take the example table t1 shown in the link. In this example table I use the following query:
SELECT * FROM t1
WHERE
Date = ‚11/3‘ AND
Name = ‚C‘
Because of the Date = ‚11/3‘ it would only scan micro partitions 2, 3 and 4. Because of the Name = 'C' it can prune even more and only scan micro-partions 2 and 4.
So in the end only micro-partitions 2 and 4 would be scanned.
But where does the second step come into play? What is meant with prune by column within the remaining micro partitions?
Does it mean, that only rows 4, 5 and 6 on micro-partition 2 and row 1 on micro-partition 4 are scanned, because date is my clustering key and is sorted so you can prune even further with the date?
So in the end only 4 rows would be scanned?
But where does the second step come into play? What is meant with prune by column within the remaining micro partitions?
Benefits of Micro-partitioning:
Columns are stored independently within micro-partitions, often referred to as columnar storage.
This enables efficient scanning of individual columns; only the columns referenced by a query are scanned.
It is recommended to avoid SELECT * and specify required columns explicitly.
It simply means to only select the columns that are required for the query. So in your example it would be:
SELECT col_1, col_2 FROM t1
WHERE
Date = ‚11/3‘ AND
Name = ‚C‘

Navision tables and value mapping

i m looking into Navision tables i have notice that in many cases there are numeric values which represent words.
For example:
In table Sales Line there is a column named Document Type which represent Quote, Order, Invoice, Credit Memo, Blanket Order, Return Order. In database table those Document Types are represented by numeric values, 1,2 etc.
Is there any website which provide the mapping among the words and the numeric values for each table?
Thanks a lot for your help.
No, there isn’t.
You can only get that mapping from inside Nav.
Yes. You will find that info at https://dynamicsdocs.com/. Just be careful in how you read the entries. The values are zero-based arrays.
So for example, [Value Entry].[Source Type] is [ ,Customer,Vendor,Item]
Which means that 0 is skipped and the values are.
1 = Customer
2 = Vendor
3 = Item
And [Value Entry].[Type] is [Work Center,Machine Center, ,Resource] - notice that the values are 0, 1, 3, while 2 is skipped.

How do you use ArrayFormula with arrays (after aggregation)?

In my example:
https://docs.google.com/spreadsheets/d/1QQNTw_r9-q-FqVNwUoYklup73niZCFyO0VDUYImP5fo/edit?usp=sharing
I'm using Google Forms as an eBay clone to sell rare items. Each bid is outputted from the form to the "Data" worksheet and then I have ArrayFormulas set up inside the "Processed" worksheet. The idea is that I want to process the bids so that we filter everything except the items with the highest bids. All data should be automatically updated, hence why I want to use ArrayFormulas.
My strategy is that in colum A, I first filter all unique items (=unique(filter(Data!A2:A,Data!A2:A<>""))) and end up with:
Jurassic Park 6-Pog Hologram Set
Princess the Bear TY Beanie Baby
Holographic 1st Ed Charizard
However, then in column B, we have to find the highest bid that corresponds to that unique item, e.g.:
=IF(ISBLANK(A2),,ArrayFormula(MAX(IF(Data!A2:A=A2,Data!B2:B))))
However, I don't want to have A2 be a single cell (A2) but an array (A2:A) so that it doesn't have to be manually copied down the rows. Similarly, I also want columns D and E to be automatic as well. Is there any way to achieve this?
Not sure if it would be considered easier than the previously posted answer, but in case this thread is found in the future, I think that this is a slightly simpler way to solve these kinds of problems:
Try this on a fresh tab in cell A1:
=FILTER(Data!A:D,COUNTIFS(Data!A:A,Data!A:A,Data!B:B,">"&Data!B:B)=0)
I did some research and found an answer very similar to what you were looking for. After rearranging the formula slightly to match your sheet, I was able to get this to work:
=ArrayFormula(vlookup(query({row(Data!A2:A),sort(Data!A2:C)},"select max(Col1) where Col2 <> '' group by Col2 label max(Col1)''",0),{row(Data!A2:A),sort(Data!A2:D)},{2,3,4,5},0))
This formula automatically populates product name, highest bid, username, and timestamp. I ran some tests, adding my own random names and values into the data sheet, and the formula worked great.
Reference: https://webapps.stackexchange.com/a/75637
use:
={A1:D1; SORTN(SORT(A2:D, 1, , 2, ), 9^9, 2, 1, )}
translated:
{A1:D1} - headers
SORT(A2:D, 1, , 2, ) - sort 1st column then 2nd column descending
9^9 - output all possible rows
2 - use 2nd mode of sortn which will group selected column
1 - selected column to be marged based on unique values

Is there a way to sum an entire quantity in SQL with unique values

I am trying to get a total summation of both the ItemDetail.Quantity column and ItemDetail.NetPrice column. For sake of example, let's say the quantity that is listed is for each individual item is 5, 2, and 4 respectively. I am wondering if there is a way to display quantity as 11 for one single ItemGroup.ItemGroupName
The query I am using is listed below
select Location.LocationName, ItemDetail.DOB, SUM (ItemDetail.Quantity) as "Quantity",
ItemGroup.ItemGroupName, SUM (ItemDetail.NetPrice)
from ItemDetail
Join ItemGroupMember
on ItemDetail.ItemID = ItemGroupMember.ItemID
Join ItemGroup
on ItemGroupMember.ItemGroupID = ItemGroup.ItemGroupID
Join Location
on ItemDetail.LocationID = Location.LocationID
Inner Join Item
on ItemDetail.ItemID = Item.ItemID
where ItemGroup.ItemGroupID = '78' and DOB = '11/20/2019'
GROUP BY Location.LocationName, ItemDetail.DOB, Item.ItemName,
ItemDetail.NetPrice, ItemGroup.ItemGroupName
If you are using SQL Server 2012 , you can use the summation on partition to display the
details and aggregates in the same query.
SUM(SalesYTD) OVER (ORDER BY DATEPART(yy,ModifiedDate)),1)
Link :
https://learn.microsoft.com/en-us/sql/t-sql/functions/sum-transact-sql?view=sql-server-ver15
We can't be certain without seeing sample data. But I suspect you need to remove some fields from you GROUP BY clause -- probably Item.ItemName and ItemDetail.NetPrice.
Generally, you won't GROUP BY a column that you are applying an aggregate function to in the SELECT -- as in SUM(ItemDetail.NetPrice). And it is not very common, in my experience, to GROUP BY columns that aren't included in the SELECT list - as you are doing with Item.ItemName.
I think you need to go back to basics and read about what GROUP BY does.
First of all welcome to the overflow...
Second: The answer is going to be "It depends"
Any time you aggregate data you will need to Group by the other fields in the query, and you have that in the query. The gotcha is what happens when data is spread across multiple locations.
My suggestion is to rethink your problem and see if you really need these other fields in the query. This will depend on what the person using the data really wants to know.
Do they need to know how many of item X there are, or do they really need to know that item X is spread out over three sites?
You might find you are better off with two smaller queries.

Using a sort order column in a database table

Let's say I have a Product table in a shopping site's database to keep description, price, etc of store's products. What is the most efficient way to make my client able to re-order these products?
I create an Order column (integer) to use for sorting records but that gives me some headaches regarding performance due to the primitive methods I use to change the order of every record after the one I actually need to change. An example:
Id Order
5 3
8 1
26 2
32 5
120 4
Now what can I do to change the order of the record with ID=26 to 3?
What I did was creating a procedure which checks whether there is a record in the target order (3) and updates the order of the row (ID=26) if not. If there is a record in target order the procedure executes itself sending that row's ID with target order + 1 as parameters.
That causes to update every single record after the one I want to change to make room:
Id Order
5 4
8 1
26 3
32 6
120 5
So what would a smarter person do?
I use SQL Server 2008 R2.
Edit:
I need the order column of an item to be enough for sorting with no secondary keys involved. Order column alone must specify a unique place for its record.
In addition to all, I wonder if I can implement something like of a linked list: A 'Next' column instead of an 'Order' column to keep the next items ID. But I have no idea how to write the query that retrieves the records with correct order. If anyone has an idea about this approach as well, please share.
Update product set order = order+1 where order >= #value changed
Though over time you'll get larger and larger "spaces" in your order but it will still "sort"
This will add 1 to the value being changed and every value after it in one statement, but the above statement is still true. larger and larger "spaces" will form in your order possibly getting to the point of exceeding an INT value.
Alternate solution given desire for no spaces:
Imagine a procedure for: UpdateSortOrder with parameters of #NewOrderVal, #IDToChange,#OriginalOrderVal
Two step process depending if new/old order is moving up or down the sort.
If #NewOrderVal < #OriginalOrderVal --Moving down chain
--Create space for the movement; no point in changing the original
Update product set order = order+1
where order BETWEEN #NewOrderVal and #OriginalOrderVal-1;
end if
If #NewOrderVal > #OriginalOrderVal --Moving up chain
--Create space for the momvement; no point in changing the original
Update product set order = order-1
where order between #OriginalOrderVal+1 and #NewOrderVal
end if
--Finally update the one we moved to correct value
update product set order = #newOrderVal where ID=#IDToChange;
Regarding best practice; most environments I've been in typically want something grouped by category and sorted alphabetically or based on "popularity on sale" thus negating the need to provide a user defined sort.
Use the old trick that BASIC programs (amongst other places) used: jump the numbers in the order column by 10 or some other convenient increment. You can then insert a single row (indeed, up to 9 rows, if you're lucky) between two existing numbers (that are 10 apart). Or you can move row 370 to 565 without having to change any of the rows from 570 upwards.
Here is an alternative approach using a common table expression (CTE).
This approach respects a unique index on the SortOrder column, and will close any gaps in the sort order sequence that may have been left over from earlier DELETE operations.
/* For example, move Product with id = 26 into position 3 */
DECLARE #id int = 26
DECLARE #sortOrder int = 3
;WITH Sorted AS (
SELECT Id,
ROW_NUMBER() OVER (ORDER BY SortOrder) AS RowNumber
FROM Product
WHERE Id <> #id
)
UPDATE p
SET p.SortOrder =
(CASE
WHEN p.Id = #id THEN #sortOrder
WHEN s.RowNumber >= #sortOrder THEN s.RowNumber + 1
ELSE s.RowNumber
END)
FROM Product p
LEFT JOIN Sorted s ON p.Id = s.Id
It is very simple. You need to have "cardinality hole".
Structure: you need to have 2 columns:
pk = 32bit int
order = 64bit bigint (BIGINT, NOT DOUBLE!!!)
Insert/UpdateL
When you insert first new record you must set order = round(max_bigint / 2).
If you insert at the beginning of the table, you must set order = round("order of first record" / 2)
If you insert at the end of the table, you must set order = round("max_bigint - order of last record" / 2)
If you insert in the middle, you must set order = round("order of record before - order of record after" / 2)
This method has a very big cardinality. If you have constraint error or if you think what you have small cardinality you can rebuild order column (normalize).
In maximality situation with normalization (with this structure) you can have "cardinality hole" in 32 bit.
It is very simple and fast!
Remember NO DOUBLE!!! Only INT - order is precision value!
One solution I have used in the past, with some success, is to use a 'weight' instead of 'order'. Weight being the obvious, the heavier an item (ie: the lower the number) sinks to the bottom, the lighter (higher the number) rises to the top.
In the event I have multiple items with the same weight, I assume they are of the same importance and I order them alphabetically.
This means your SQL will look something like this:
ORDER BY 'weight', 'itemName'
hope that helps.
I am currently developing a database with a tree structure that needs to be ordered. I use a link-list kind of method that will be ordered on the client (not the database). Ordering could also be done in the database via a recursive query, but that is not necessary for this project.
I made this document that describes how we are going to implement storage of the sort order, including an example in postgresql. Please feel free to comment!
https://docs.google.com/document/d/14WuVyGk6ffYyrTzuypY38aIXZIs8H-HbA81st-syFFI/edit?usp=sharing

Resources