snowflake - Order by not sorting correctly - snowflake-cloud-data-platform

I am running a query in Snowflake with a group by and order by clause, and I notice that it is not sorting the first column in ascending order:
select distinct columnA from table order by columnA
ColumnA
------------ +
AMP 1
AMP 2
Aluminum
Apple
In the example, Aluminum should be the first row; however, it comes third. It seems to me that uppercase letters are prioritized over lowercase letters when sorting. How can I make row 3 the first row?

Uppercase letters sort before lowercase letters.
This will sort them irrespective of capitalization:
with data as (select * from table(split_to_table('AMP 1
AMP 2
Aluminum
Apple', '\n')))
select distinct value
from data
order by lower(value);
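The effect is easy to reproduce outside Snowflake. The following is only a sketch using Python's built-in sqlite3 module as a stand-in (the table name `data` and the duplicate 'Apple' row are made up for the illustration); SQLite's default comparison is also binary, so 'M' sorts before 'l' there too:

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (an assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (value TEXT)")
conn.executemany(
    "INSERT INTO data (value) VALUES (?)",
    [("AMP 1",), ("AMP 2",), ("Aluminum",), ("Apple",), ("Apple",)],
)

# Binary comparison: 'M' (code point 77) sorts before 'l' (code point 108).
binary_order = [row[0] for row in conn.execute(
    "SELECT DISTINCT value FROM data ORDER BY value"
)]

# Sorting on lower(value) ignores capitalization; the original spelling is returned.
folded_order = [row[0] for row in conn.execute(
    "SELECT value FROM (SELECT DISTINCT value FROM data) ORDER BY lower(value)"
)]

print(binary_order)
print(folded_order)
```

Only the sort key is lowercased, so the values themselves come back unchanged.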

Related

how to merge-union sorted data sets without another sort

I'm trying to get an efficient data feed for a downstream process, but the generated query plan tries to cache the entire output before passing me anything.
My input data is a table that has:
ID,Attribute1,Attribute2,Attribute3,otherID
- Clustered index on ID (not unique)
- OtherID is unique
There are about 10M rows; the output query has up to 50 rows per ID, but 7 is typical.
There are secondary tables which contain 0-5 attributes per otherID, with a structure like
Attribute4Table:
id4(PK),OtherID, Attribute4
Attribute5Table:
id5(PK),OtherID, Attribute5
and the desired output is:
ID Dimension Value
4 'Attribute1' w
4 'Attribute4' x
4 'Attribute4' y
4 'Attribute3' z
5 'Attribute2' a
5 'Attribute3' b
5 'Attribute1' c
My current query looks like this (there is a clustered index on ##quoteHistory(id)):
select * from (
select ID, 'Attribute1' Dimension, cast(thing1 as varchar(400)) Value from ##quoteHistory
UNION ALL
select ID, 'Attribute2' Dimension, cast(thing2 as varchar(400)) Value from ##quoteHistory
UNION ALL
... couple of other similar clauses, 'Dimension' is unique
) x where Value is not null order by ID
Output requirements: all rows for a given ID are output together (the downstream application consumes the data one ID group at a time, and the cost of processing this data is substantially more than the query).
Problem: when it (correctly) detects that it can use a merge-union, SQL Server unnecessarily pre-sorts the data.
By "unnecessary" I mean that if you remove "Value" from the query, you get the query plan I'm expecting, which is a streamed output with no blocking components:
select * from (
select ID, 'Attribute1' Dimension from ##quoteHistory
UNION ALL
select ID, 'Attribute2' Dimension from ##quoteHistory
UNION ALL
... couple of other similar clauses
) x order by ID
Question:
How do I coerce the first query to produce the second plan, given that the second plan does produce the output ordering I'm interested in?
edit:
In the full data set, other tables are sometimes joined onto ##quoteHistory in a 1:many relationship to pick up multiple values for that dimension.
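To make the expected behaviour concrete: a merge of inputs that are already sorted by ID can emit rows as soon as they are read, without buffering or re-sorting the whole set. The sketch below only illustrates that idea in Python with `heapq.merge`; it says nothing about how to steer SQL Server's optimizer, and the sample rows and the helper `attribute_stream` are invented for the illustration.

```python
import heapq
from itertools import islice

def attribute_stream(rows, dimension, column):
    """Lazily yield (ID, Dimension, Value) from rows that are already ordered by ID."""
    for row in rows:
        value = row[column]
        if value is not None:  # mirrors "where Value is not null"
            yield (row["ID"], dimension, str(value))

# Hypothetical sample rows, already in clustered-index (ID) order.
quote_history = [
    {"ID": 4, "thing1": "w", "thing2": None},
    {"ID": 5, "thing1": "c", "thing2": "a"},
    {"ID": 7, "thing1": None, "thing2": "q"},
]

streams = [
    attribute_stream(quote_history, "Attribute1", "thing1"),
    attribute_stream(quote_history, "Attribute2", "thing2"),
]

# heapq.merge pulls from each sorted input on demand; it never sorts the full set.
merged = heapq.merge(*streams, key=lambda r: r[0])

first_two = list(islice(merged, 2))  # available before the inputs are exhausted
rest = list(merged)
print(first_two + rest)
```

All rows for one ID come out together because every input stream is ordered by ID and the merge only compares on ID.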

Postgresql inner select with distinct

I'm using PostgreSQL 9.2 and have a simple students table as follows:
id | proj_id | mark | name | test_date
I have two queries, described below:
select * from (
    select distinct on (proj_id) proj_id, mark, name, test_date
    from students
) t
where t.mark <= 1000
VS
select distinct on (proj_id) proj_id, mark, name, test_date
from students
where mark <= 1000
When I run them against more than 10000 records, the two queries return different results, in particular different row counts, although for fewer than 3000 records the results are the same.
Is this a PostgreSQL 9.2 bug, or am I missing something?

Your queries are producing two different sets of results because they are applying the logic differently.
The first query is getting a distinct set of results, and then applying the 'mark' filter.
The second query is applying the 'mark' filter, and then getting a distinct set of results.
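As you don't have any ordering applied, DISTINCT ON keeps an arbitrary row for each proj_id, so the first query could potentially return a different number of rows each time it is run: the mark of the row that is kept could be any of the values that relate to that proj_id, and it may or may not pass the filter.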
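The difference between the two evaluation orders can be seen on three rows. This is a plain-Python simulation rather than PostgreSQL itself; the helper `distinct_on_proj_id` is a made-up stand-in that keeps the first row it sees per proj_id, which is one of the arbitrary choices DISTINCT ON may make without an ORDER BY.

```python
def distinct_on_proj_id(rows):
    """Keep the first row seen for each proj_id (an arbitrary pick, as with
    DISTINCT ON (proj_id) when no ORDER BY decides which row wins)."""
    kept = {}
    for row in rows:
        kept.setdefault(row["proj_id"], row)
    return list(kept.values())

# Hypothetical sample data: proj_id 1 has one mark above and one below 1000.
students = [
    {"proj_id": 1, "mark": 1500, "name": "a"},
    {"proj_id": 1, "mark": 900, "name": "b"},
    {"proj_id": 2, "mark": 800, "name": "c"},
]

# Query 1: DISTINCT ON first, then filter on mark.
query1 = [r for r in distinct_on_proj_id(students) if r["mark"] <= 1000]

# Query 2: filter on mark first, then DISTINCT ON.
query2 = distinct_on_proj_id([r for r in students if r["mark"] <= 1000])

print(len(query1), len(query2))
```

In the first query the row kept for proj_id 1 happens to have mark 1500, so the whole project is filtered out afterwards; in the second query the 1500 row is removed before the distinct step, so proj_id 1 survives.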

MSAccess/SQL lookup table for match field based on sum of current table.field

I've been battling this for the last week with many attempted solutions. I want to return the unique names in a table with the sum of their points and their current dance level based on that sum. Ultimately I want to compare the returned dance level with what is stored against the customer in the Customer table, and show only the records where the two dance levels are different (the stored dance level and the calculated dance level based on the current sum of the points).
The final solution will be a web page using ADODB connection to MSAccess DB (2013). But for starters just want it to work in MSAccess.
I have a MSAccess DB (2013) with the following tables.
PointsAllocation
CustomerID Points
100 2
101 1
102 1
100 1
101 4
DanceLevel
DLevel Threshold
Beginner 2
Intermediate 4
Advanced 6
Customer
CID Firstname Dancelevel1
100 Bob Beginner
101 Mary Beginner
102 Jacqui Beginner
I want to find the current DLevel for each customer by using the SUM of their Points in the first table. I have this first...
SELECT SUM(Points), CustomerID FROM PointsAllocation GROUP BY CustomerID
Works well and gives me total points per customer. I can then INNER JOIN this to the customer table to get the persons name. Perfect.
Now I want to add the DLevel from the DanceLevel table to the results where the SUM total is used to lookup the Threshold and not exceed the value so I get the following:
(1) (2) (3) (4)
Bob 3 Beginner Intermediate
Mary 5 Beginner Advanced
Where...
(1) Customer.Firstname
(2) SUM(PointsAllocation.Points)
(3) Customer.Dancelevel1
(4) Dancelevel.DLevel
Jacqui is not shown, as her SUM of Points is less than or equal to 2, giving her a calculated dance level of Beginner, which already matches her Dancelevel1 in the Customer table.
Any ideas anyone?

You can start from the Customer table because you want to list every customer. Then left join it with a subquery that calculates the point totals and dance levels. The innermost subquery totals the points; the level above it joins each total to every dance level whose threshold the total does not exceed and selects the minimum such threshold. Then left join on the DanceLevel table again on the threshold value to get the level's description.
Select Customer.Firstname,
       CustomerDanceLevels.Points,
       Customer.Dancelevel1,
       Dancelevel.DLevel
from Customer
left join
    (select CustomerID, Points, Min(Threshold) Threshold
     from
         (select CustomerID, sum(Points) Points
          from PointsAllocation
          group by CustomerID
         ) PointsTotal
     left join DanceLevel
         on PointsTotal.Points <= DanceLevel.Threshold
     group by CustomerID, Points
    ) CustomerDanceLevels
    on Customer.CID = CustomerDanceLevels.CustomerID
left join DanceLevel
    on CustomerDanceLevels.Threshold = DanceLevel.Threshold
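To check the logic against the sample data from the question, here is a sketch that runs the same idea in SQLite through Python's sqlite3 module. It is not Access SQL (Access may need the nested joins parenthesized), the alias `TotalPoints` is introduced only for readability, and the final WHERE clause is an addition that keeps just the customers whose stored and calculated levels differ, as the question asks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PointsAllocation (CustomerID INTEGER, Points INTEGER);
INSERT INTO PointsAllocation VALUES (100, 2), (101, 1), (102, 1), (100, 1), (101, 4);
CREATE TABLE DanceLevel (DLevel TEXT, Threshold INTEGER);
INSERT INTO DanceLevel VALUES ('Beginner', 2), ('Intermediate', 4), ('Advanced', 6);
CREATE TABLE Customer (CID INTEGER, Firstname TEXT, Dancelevel1 TEXT);
INSERT INTO Customer VALUES
    (100, 'Bob', 'Beginner'), (101, 'Mary', 'Beginner'), (102, 'Jacqui', 'Beginner');
""")

query = """
SELECT c.Firstname, t.TotalPoints, c.Dancelevel1, d.DLevel
FROM Customer AS c
JOIN (
    -- smallest threshold that the customer's total does not exceed
    SELECT p.CustomerID, p.TotalPoints, MIN(l.Threshold) AS Threshold
    FROM (
        SELECT CustomerID, SUM(Points) AS TotalPoints
        FROM PointsAllocation
        GROUP BY CustomerID
    ) AS p
    JOIN DanceLevel AS l ON p.TotalPoints <= l.Threshold
    GROUP BY p.CustomerID, p.TotalPoints
) AS t ON c.CID = t.CustomerID
JOIN DanceLevel AS d ON d.Threshold = t.Threshold
WHERE c.Dancelevel1 <> d.DLevel   -- only customers whose level has changed
ORDER BY c.CID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Inner joins are used here because the WHERE filter would discard the unmatched rows of a left join anyway.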

How can we add a column on the fly in a dynamic table in SQL SERVER?

My question needs a little explanation, so I'd like to explain it this way:
I've got a table (let's call it RootTable) with one million records, not in any particular order. What I'm trying to do is get a number of rows (@ParamCount) from RootTable; at the same time these records must be sorted and have an additional column (with unique data) added on the fly to serve as a key for row identification, which will be used later in the program. The query can take any number of parameters, but my basic parameters are the two mentioned below.
It's needed for a SQL Server environment.
e.g.
RootTable
ColumnA ColumnB ColumnC
ABC city cellnumber
ZZC city1 cellnumber
BCD city2 cellnumber
BCC city3 cellnumber
Passing the number of rows to return (@ParamCount) and the prefix that ColumnA starts with (@ParamNameStartsWith):
@ParamCount: 2
@ParamNameStartsWith: BC
desired result:
Id(added on the fly) ColumnA ColumnB ColumnC
101 BCC city3 cellnumber
102 BCD city2 cellnumber
Here's another point about the Id column. Id must maintain its order: in the above result it starts from 101 because 100 is already assigned to the first row when the table is sorted and the column is added on the fly, and because that row starts with "ABC" it obviously won't be in the result set.
Any kind of help would be appreciated.
NOTE: My question title might not reflect my requirement, but I couldn't come up with a better one.

So first you need your on-the-fly ID. This one is created by the ROW_NUMBER() function, which is available from SQL Server 2005 onwards. What ROW_NUMBER() does is pretty self-explanatory, I think. However, it works only within a partition, which is specified in the OVER clause. If you include PARTITION BY within the OVER clause, you will have multiple partitions. In your case there is only one partition, the whole table, so PARTITION BY is not necessary. However, an ORDER BY is required so that the system knows which record should get which row number in the partition. The query you get is:
SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) ID, ColumnA,ColumnB,ColumnC
FROM RootTable
Now you have a row number for your whole table. You cannot include any condition like your @ParamNameStartsWith parameter here, because you wanted the row numbers assigned across the whole table. The query above has to be a subquery that provides the set on which the condition can be applied. I use a CTE here; I think that is better for readability:
;WITH OrderedList AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) ID, ColumnA, ColumnB, ColumnC
    FROM RootTable
)
SELECT *
FROM OrderedList
WHERE ColumnA LIKE @ParamNameStartsWith + '%'
Please note that I added the wildcard % after the parameter, so that the condition is basically "starts with" @ParamNameStartsWith.
Finally, if I got you right, you wanted only @ParamCount rows. You can use your parameter directly with the TOP keyword, which is also only possible with SQL Server 2005 or later. The ORDER BY ID on the outer query makes sure TOP keeps the lowest-numbered matches rather than an arbitrary subset.
;WITH OrderedList AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) ID, ColumnA, ColumnB, ColumnC
    FROM RootTable
)
SELECT TOP (@ParamCount) *
FROM OrderedList
WHERE ColumnA LIKE @ParamNameStartsWith + '%'
ORDER BY ID
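The order of the steps (number every row first, filter afterwards, then cut to the requested count) is what keeps the IDs stable. Here is a small Python sketch of that pipeline on the question's sample rows; the function `numbered_matches` and its `first_id` parameter are made up for the illustration. ROW_NUMBER() itself starts at 1, so getting IDs from 100 as in the question would need an offset added in the SQL.

```python
def numbered_matches(rows, prefix, count, first_id=1):
    """Number all rows in ColumnA order, then filter by prefix, then keep `count` rows."""
    ordered = sorted(rows, key=lambda r: r["ColumnA"])
    # Numbering happens before filtering, so non-matching rows still use up an Id.
    numbered = [dict(r, Id=i) for i, r in enumerate(ordered, start=first_id)]
    matches = [r for r in numbered if r["ColumnA"].startswith(prefix)]
    return matches[:count]

# Sample rows from the question (only two columns kept for brevity).
root_table = [
    {"ColumnA": "ABC", "ColumnB": "city"},
    {"ColumnA": "ZZC", "ColumnB": "city1"},
    {"ColumnA": "BCD", "ColumnB": "city2"},
    {"ColumnA": "BCC", "ColumnB": "city3"},
]

result = numbered_matches(root_table, "BC", 2, first_id=100)
for row in result:
    print(row["Id"], row["ColumnA"], row["ColumnB"])
```

"ABC" takes Id 100 even though it is filtered out, which is why the first returned row has Id 101.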

Oracle IN condition without sort

Is there any way to get the result without ordering in Oracle? When I execute the following query, it seems to automatically sort the result by the ID field.
SELECT ID FROM USER WHERE ID IN (5004, 5003, 5005, 5002, 5008);
Actual results    Expected results
5002              5004
5003              5003
5004              5005
5005              5002
5008              5008
Many thanks if you have a solution for this.

SELECT statements return the rows of their result sets in an unpredictable order unless you give an ORDER BY clause.
Certain DBMS products give the illusion that their result sets are in a predictable order. But if you rely on that you're bound to be disappointed.

This is one way I've seen in the past using INSTR:
SELECT *
FROM YourTable
WHERE ID IN (5004, 5003, 5005, 5002, 5008)
ORDER BY INSTR ('5004,5003,5005,5002,5008', id)
I've also seen use of CASE like this:
ORDER BY
CASE ID
WHEN 5004 THEN 1
WHEN 5003 THEN 2
WHEN 5005 THEN 3
WHEN 5002 THEN 4
WHEN 5008 THEN 5
END
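One caveat with the INSTR variant: it matches substrings, so it is only safe when no ID is contained in another ID's text (true for the five four-digit IDs above). The sketch below shows the pitfall and a delimiter-based workaround. It uses SQLite's instr() through Python's sqlite3 module as a stand-in for Oracle, and the IDs 150, 50 and 15 are invented to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER)")
conn.executemany("INSERT INTO YourTable VALUES (?)", [(15,), (50,), (150,)])

wanted = [150, 50, 15]
csv = ",".join(str(i) for i in wanted)  # '150,50,15'

# Naive version: '15' and '50' are both found inside '150', so positions collide.
naive = [r[0] for r in conn.execute(
    "SELECT id FROM YourTable ORDER BY instr(?, CAST(id AS TEXT))", (csv,)
)]

# Wrapping both the list and each id in delimiters forces whole-element matches.
delimited = [r[0] for r in conn.execute(
    "SELECT id FROM YourTable ORDER BY instr(?, ',' || id || ',')",
    ("," + csv + ",",),
)]

print(naive)
print(delimited)
```

The CASE form does not have this problem because it compares whole values.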

If you want to keep the order of your IN list, you can do something like this:
SQL> create type user_va as varray(1000) of number;
2 /
Type created.
SQL> with users as (select /*+ cardinality(a, 10) */ rownum r, a.column_value user_id
2 from table(user_va(11, 0, 19, 5)) a)
3 select d.user_id, d.username
4 from dba_users d
5 inner join users u
6 on u.user_id = d.user_id
7 order by u.r
8 /
USER_ID USERNAME
---------- ------------------------------
11 OUTLN
0 SYS
19 DIP
5 SYSTEM
That is, we put the elements into a varray and assign a rownum prior to merging the set. We can then order by that r to maintain the order of our IN list. The cardinality hint just tells the optimizer how many rows are in the array (it doesn't have to be dead on, just in the ballpark; without it, the optimizer will assume 8k rows and may prefer a full scan over an index approach).
If you don't have privileges to create a type and this is just an ad hoc task, there are a few public ones:
select owner, type_name, upper_bound max_elements, length max_size, elem_type_name
from all_Coll_types
where coll_type = 'VARYING ARRAY'
and elem_type_name in ('INTEGER', 'NUMBER');
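The underlying pattern (give each list element its position, join on the value, order by the position) is not specific to varrays. As a rough sketch, here it is with SQLite through Python's sqlite3 module, using a temporary table in place of the varray; the table names `users` and `wanted` are assumptions for the demo, not Oracle objects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO users VALUES (?)", [(i,) for i in range(5000, 5010)])

in_list = [5004, 5003, 5005, 5002, 5008]

# Store each element with its position in the list (the role rownum plays above).
conn.execute("CREATE TEMP TABLE wanted (pos INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO wanted VALUES (?, ?)", list(enumerate(in_list, start=1))
)

# The join replaces the IN condition; ORDER BY pos restores the list order.
ordered_ids = [r[0] for r in conn.execute(
    "SELECT u.id FROM users AS u JOIN wanted AS w ON w.id = u.id ORDER BY w.pos"
)]
print(ordered_ids)
```

Unlike a hand-written CASE or DECODE, nothing in the query text has to change when the list does.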

There is no guarantee of sort order without an ORDER BY clause.

If your question is about why the ordering occurs then the answer is: Do you have an index or primary key defined on the column ID? If yes the database responds to your query with an index scan. That is: it looks up the IDs in the IN clause not in the table itself but in the index defined on your ID-column. Within the index the values are ordered.
To get more information about the execution of your query try Oracle's explain plan feature.
To get the values in a certain order you have to add an ORDER BY clause. One way of doing this would be
select ID
from USER
where ID in (5004, 5003, 5005, 5002, 5008)
order by
case ID
when 5004 then 1
when 5003 then 2
...
end;
A more general way would be to add an ORDERING column to your table:
select ID
from USER
where ID in (5004, 5003, 5005, 5002, 5008)
order by
ORDERING;

Another solution that I found:
select ID
from USER
where ID in (5004, 5003, 5005, 5002, 5008)
order by decode(ID, 5004, 1, 5003, 2, 5005, 3, 5002, 4, 5008, 5);
The general form is: order by decode(COLUMN NAME, VALUE, POSITION)
Note: only the VALUE and POSITION pair needs to be repeated for each element.
And thanks for all the responses! I really appreciate it.
