Keeping it simple and how to do multiple CTE in a query

Keeping it simple and how to do multiple CTE in a query - sql-server

I have this simple T-SQL query, it emits a bunch of columns from a table and also joins information from other related tables.
My data model is simple. I have a scheduled event, with participants. I need to know how many participants participate in each event.
My solution to this is to add a CTE that groups scheduled events and counts the number of participants.
This will allow me to join in that information per scheduled event. Keeping the query simple.
I like to keep my queries simple, however, If I ever in the future need to have additonal temporary results accessible during my simple query, what do I do?
I would really like it, if I could have multiple CTEs but I can't, right? What are my options here?
I've ruled out views and doing things at the application data layer. I prefer to isolated my SQL queries.

You can have multiple CTEs in one query, as well as reuse a CTE:
WITH cte1 AS
(
SELECT 1 AS id
),
cte2 AS
(
SELECT 2 AS id
)
SELECT *
FROM cte1
UNION ALL
SELECT *
FROM cte2
UNION ALL
SELECT *
FROM cte1
Note, however, that SQL Server may reevaluate the CTE each time it is accessed, so if you are using values like RAND(), NEWID() etc., they may change between the CTE calls.

You certainly are able to have multiple CTEs in a single query expression. You just need to separate them with a comma. Here is an example. In the example below, there are two CTEs. One is named CategoryAndNumberOfProducts and the second is named ProductsOverTenDollars.
WITH CategoryAndNumberOfProducts (CategoryID, CategoryName, NumberOfProducts) AS
(
SELECT
CategoryID,
CategoryName,
(SELECT COUNT(1) FROM Products p
WHERE p.CategoryID = c.CategoryID) as NumberOfProducts
FROM Categories c
),
ProductsOverTenDollars (ProductID, CategoryID, ProductName, UnitPrice) AS
(
SELECT
ProductID,
CategoryID,
ProductName,
UnitPrice
FROM Products p
WHERE UnitPrice > 10.0
)
SELECT c.CategoryName, c.NumberOfProducts,
p.ProductName, p.UnitPrice
FROM ProductsOverTenDollars p
INNER JOIN CategoryAndNumberOfProducts c ON
p.CategoryID = c.CategoryID
ORDER BY ProductName

Related

Order SQL statement by count of sub-records efficiently (MS-SQL)

I have a table with two columns, an ID and a serial number. Let's call this table vehicle. The serial numbers are used for an auto-complete textbox. Right now, I simply narrow the list with a contains/like statement and select the top 5 hits. Obviously, this returns the 5 oldest records.
These vehicle entries are referenced by another table via foreign key. Let's call that one appearence. To improve the user experience, I want to present the five vehicles, which appeared most often during the last three months.
A simple approach would be, to simply join the two tables together, group by the serial number, count the number of appearences in the specified time range, and finally order it by that count descending.
The problem is, that this auto-complete textbox is accessed relatively often, so I wonder, if there is a built in optimization for this. My first idea was, to create a daily job, which updated a new probability column and order by that column. I've also seen the statistics option in SQL Server Management Studio, and I wonder, if that could help me. Or would there be a way to use views for that?
Updating the order every day would be abundant.
Here's the select statement (without the time component yet):
SELECT RegistrationNumber, count(lbt.ID) as cnt
FROM [Trailer] t
INNER JOIN [LoadingBay_Trailer] lbt ON t.ID = lbt.ID_Trailer
group by t.RegistrationNumber
order by cnt desc

For now, I've solved the problem with a simple view. I don't like this approach, because it creates a lot of load on the server, as the view is called after every keystroke the user performs:
CREATE VIEW [dbo].[TrailerAutocompleteHelper] AS
SELECT
t.ID,
t.RegistrationNumber,
t.InUse,
ROW_NUMBER() OVER(ORDER BY COUNT_BIG(lbt.ID_Trailer) DESC, RegistrationNumber ASC) AS Probability
FROM [dbo].[AvailableTrailers] t
LEFT JOIN [dbo].[LoadingBay_Trailer] lbt ON
t.ID = lbt.ID_Trailer AND
lbt.Arrival > DATEADD(month, -3, GETDATE())
GROUP BY t.ID, t.RegistrationNumber, t.InUse
GO
And C#
public async Task<List<TrailerAutocomplete>> GetTrailers(string registrationNumberPart) {
return await mapi.db.TrailerAutocompleteHelpers
.Where(t => t.RegistrationNumber.Contains(registrationNumberPart))
.OrderBy(t => t.Probability)
.Select(t => new TrailerAutocomplete() {
ID = t.ID,
RegistrationNumber = t.RegistrationNumber,
InUse = t.InUse
})
.Take(10)
.ToListAsync();
}
And here's what Entity framework makes out of my command:
exec sp_executesql N'SELECT TOP (10)
[Project1].[ID] AS [ID],
[Project1].[RegistrationNumber] AS [RegistrationNumber],
[Project1].[InUse] AS [InUse]
FROM ( SELECT
[Extent1].[ID] AS [ID],
[Extent1].[RegistrationNumber] AS [RegistrationNumber],
[Extent1].[InUse] AS [InUse],
[Extent1].[Probability] AS [Probability]
FROM (SELECT
[TrailerAutocompleteHelper].[ID] AS [ID],
[TrailerAutocompleteHelper].[RegistrationNumber] AS [RegistrationNumber],
[TrailerAutocompleteHelper].[InUse] AS [InUse],
[TrailerAutocompleteHelper].[Probability] AS [Probability]
FROM [dbo].[TrailerAutocompleteHelper] AS [TrailerAutocompleteHelper]) AS [Extent1]
WHERE [Extent1].[RegistrationNumber] LIKE #p__linq__0 ESCAPE N''~''
) AS [Project1]
ORDER BY [Project1].[Probability] ASC',N'#p__linq__0 nvarchar(4000)',#p__linq__0=N'%ABC%'

How to create multiple return subquery? [duplicate]

Want to improve this post? Provide detailed answers to this question, including citations and an explanation of why your answer is correct. Answers without enough detail may be edited or deleted.
This question already has answers here:
Retrieving the last record in each group - MySQL
(33 answers)
Closed 3 years ago.
I have this table for documents (simplified version here):
id
rev
content
1
1
...
2
1
...
1
2
...
1
3
...
How do I select one row per id and only the greatest rev?
With the above data, the result should contain two rows: [1, 3, ...] and [2, 1, ..]. I'm using MySQL.
Currently I use checks in the while loop to detect and over-write old revs from the resultset. But is this the only method to achieve the result? Isn't there a SQL solution?

At first glance...
All you need is a GROUP BY clause with the MAX aggregate function:
SELECT id, MAX(rev)
FROM YourTable
GROUP BY id
It's never that simple, is it?
I just noticed you need the content column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one the questions I answered in my current job's technical interview.
It is, actually, so common that Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
Joining with simple group-identifier, max-value-in-group Sub-query
In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:
SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
Left Joining with self, tweaking join conditions and filters
In this approach, you left join the table with itself. Equality goes in the group-identifier. Then, 2 smart moves:
The second join condition is having left side value less than right value
When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
Conclusion
Both approaches bring the exact same result.
If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its "flavor".
Both approaches are also performance friendly, however your mileage may vary (RDBMS, DB Structure, Indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which make most of sense to you.

My preference is to use as little code as possible...
You can do it using IN
try this:
SELECT *
FROM t1 WHERE (id,rev) IN
( SELECT id, MAX(rev)
FROM t1
GROUP BY id
)
to my mind it is less complicated... easier to read and maintain.

I am flabbergasted that no answer offered SQL window function solution:
SELECT a.id, a.rev, a.contents
FROM (SELECT id, rev, contents,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) ranked_order
FROM YourTable) a
WHERE a.ranked_order = 1
Added in SQL standard ANSI/ISO Standard SQL:2003 and later extended with ANSI/ISO Standard SQL:2008, window (or windowing) functions are available with all major vendors now. There are more types of rank functions available to deal with a tie issue: RANK, DENSE_RANK, PERSENT_RANK.

Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) renders the subquery almost as a simple lookup...
Following are comparisons to the solutions in #AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements with InnoDB table of ~1million records, group size being: 1-3.
While for full table scans subquery/leftjoin/correlated timings relate to each other as 6/8/9, when it comes to direct lookups or batch (id in (1,2,3)), subquery is much slower then the others (Due to rerunning the subquery). However I couldnt differentiate between leftjoin and correlated solutions in speed.
One final note, as leftjoin creates n*(n+1)/2 joins in groups, its performance can be heavily affected by the size of groups...

I can't vouch for the performance, but here's a trick inspired by the limitations of Microsoft Excel. It has some good features
GOOD STUFF
It should force return of only one "max record" even if there is a tie (sometimes useful)
It doesn't require a join
APPROACH
It is a little bit ugly and requires that you know something about the range of valid values of the rev column. Let us assume that we know the rev column is a number between 0.00 and 999 including decimals but that there will only ever be two digits to the right of the decimal point (e.g. 34.17 would be a valid value).
The gist of the thing is that you create a single synthetic column by string concatenating/packing the primary comparison field along with the data you want. In this way, you can force SQL's MAX() aggregate function to return all of the data (because it has been packed into a single column). Then you have to unpack the data.
Here's how it looks with the above example, written in SQL
SELECT id,
CAST(SUBSTRING(max(packed_col) FROM 2 FOR 6) AS float) as max_rev,
SUBSTRING(max(packed_col) FROM 11) AS content_for_max_rev
FROM (SELECT id,
CAST(1000 + rev + .001 as CHAR) || '---' || CAST(content AS char) AS packed_col
FROM yourtable
)
GROUP BY id
The packing begins by forcing the rev column to be a number of known character length regardless of the value of rev so that for example
3.2 becomes 1003.201
57 becomes 1057.001
923.88 becomes 1923.881
If you do it right, string comparison of two numbers should yield the same "max" as numeric comparison of the two numbers and it's easy to convert back to the original number using the substring function (which is available in one form or another pretty much everywhere).

Unique Identifiers? Yes! Unique identifiers!
One of the best ways to develop a MySQL DB is to have each id AUTOINCREMENT (Source MySQL.com). This allows a variety of advantages, too many to cover here. The problem with the question is that its example has duplicate ids. This disregards these tremendous advantages of unique identifiers, and at the same time, is confusing to those familiar with this already.
Cleanest Solution
DB Fiddle
Newer versions of MySQL come with ONLY_FULL_GROUP_BY enabled by default, and many of the solutions here will fail in testing with this condition.
Even so, we can simply select DISTINCT someuniquefield, MAX( whateverotherfieldtoselect ), ( *somethirdfield ), etc., and have no worries understanding the result or how the query works :
SELECT DISTINCT t1.id, MAX(t1.rev), MAX(t2.content)
FROM Table1 AS t1
JOIN Table1 AS t2 ON t2.id = t1.id AND t2.rev = (
SELECT MAX(rev) FROM Table1 t3 WHERE t3.id = t1.id
)
GROUP BY t1.id;
SELECT DISTINCT Table1.id, max(Table1.rev), max(Table2.content) : Return DISTINCT somefield, MAX() some otherfield, the last MAX() is redundant, because I know it's just one row, but it's required by the query.
FROM Employee : Table searched on.
JOIN Table1 AS Table2 ON Table2.rev = Table1.rev : Join the second table on the first, because, we need to get the max(table1.rev)'s comment.
GROUP BY Table1.id: Force the top-sorted, Salary row of each employee to be the returned result.
Note that since "content" was "..." in OP's question, there's no way to test that this works. So, I changed that to "..a", "..b", so, we can actually now see that the results are correct:
id max(Table1.rev) max(Table2.content)
1 3 ..d
2 1 ..b
Why is it clean? DISTINCT(), MAX(), etc., all make wonderful use of MySQL indices. This will be faster. Or, it will be much faster, if you have indexing, and you compare it to a query that looks at all rows.
Original Solution
With ONLY_FULL_GROUP_BY disabled, we can use still use GROUP BY, but then we are only using it on the Salary, and not the id:
SELECT *
FROM
(SELECT *
FROM Employee
ORDER BY Salary DESC)
AS employeesub
GROUP BY employeesub.Salary;
SELECT * : Return all fields.
FROM Employee : Table searched on.
(SELECT *...) subquery : Return all people, sorted by Salary.
GROUP BY employeesub.Salary: Force the top-sorted, Salary row of each employee to be the returned result.
Unique-Row Solution
Note the Definition of a Relational Database: "Each row in a table has its own unique key." This would mean that, in the question's example, id would have to be unique, and in that case, we can just do :
SELECT *
FROM Employee
WHERE Employee.id = 12345
ORDER BY Employee.Salary DESC
LIMIT 1
Hopefully this is a solution that solves the problem and helps everyone better understand what's happening in the DB.

Another manner to do the job is using MAX() analytic function in OVER PARTITION clause
SELECT t.*
FROM
(
SELECT id
,rev
,contents
,MAX(rev) OVER (PARTITION BY id) as max_rev
FROM YourTable
) t
WHERE t.rev = t.max_rev
The other ROW_NUMBER() OVER PARTITION solution already documented in this post is
SELECT t.*
FROM
(
SELECT id
,rev
,contents
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rank
FROM YourTable
) t
WHERE t.rank = 1
This 2 SELECT work well on Oracle 10g.
MAX() solution runs certainly FASTER that ROW_NUMBER() solution because MAX() complexity is O(n) while ROW_NUMBER() complexity is at minimum O(n.log(n)) where n represent the number of records in table !

Something like this?
SELECT yourtable.id, rev, content
FROM yourtable
INNER JOIN (
SELECT id, max(rev) as maxrev
FROM yourtable
GROUP BY id
) AS child ON (yourtable.id = child.id) AND (yourtable.rev = maxrev)

I like to use a NOT EXIST-based solution for this problem:
SELECT
id,
rev
-- you can select other columns here
FROM YourTable t
WHERE NOT EXISTS (
SELECT * FROM YourTable t WHERE t.id = id AND rev > t.rev
)
This will select all records with max value within the group and allows you to select other columns.

SELECT *
FROM Employee
where Employee.Salary in (select max(salary) from Employee group by Employe_id)
ORDER BY Employee.Salary

Note: I probably wouldn't recommend this anymore in MySQL 8+ days. Haven't used it in years.
A third solution I hardly ever see mentioned is MySQL specific and looks like this:
SELECT id, MAX(rev) AS rev
, 0+SUBSTRING_INDEX(GROUP_CONCAT(numeric_content ORDER BY rev DESC), ',', 1) AS numeric_content
FROM t1
GROUP BY id
Yes it looks awful (converting to string and back etc.) but in my experience it's usually faster than the other solutions. Maybe that's just for my use cases, but I have used it on tables with millions of records and many unique ids. Maybe it's because MySQL is pretty bad at optimizing the other solutions (at least in the 5.0 days when I came up with this solution).
One important thing is that GROUP_CONCAT has a maximum length for the string it can build up. You probably want to raise this limit by setting the group_concat_max_len variable. And keep in mind that this will be a limit on scaling if you have a large number of rows.
Anyway, the above doesn't directly work if your content field is already text. In that case you probably want to use a different separator, like \0 maybe. You'll also run into the group_concat_max_len limit quicker.

I think, You want this?
select * from docs where (id, rev) IN (select id, max(rev) as rev from docs group by id order by id)
SQL Fiddle :
Check here

NOT mySQL, but for other people finding this question and using SQL, another way to resolve the greatest-n-per-group problem is using Cross Apply in MS SQL
WITH DocIds AS (SELECT DISTINCT id FROM docs)
SELECT d2.id, d2.rev, d2.content
FROM DocIds d1
CROSS APPLY (
SELECT Top 1 * FROM docs d
WHERE d.id = d1.id
ORDER BY rev DESC
) d2
Here's an example in SqlFiddle

I would use this:
select t.*
from test as t
join
(select max(rev) as rev
from test
group by id) as o
on o.rev = t.rev
Subquery SELECT is not too eficient maybe, but in JOIN clause seems to be usable. I'm not an expert in optimizing queries, but I've tried at MySQL, PostgreSQL, FireBird and it does work very good.
You can use this schema in multiple joins and with WHERE clause. It is my working example (solving identical to yours problem with table "firmy"):
select *
from platnosci as p
join firmy as f
on p.id_rel_firmy = f.id_rel
join (select max(id_obj) as id_obj
from firmy
group by id_rel) as o
on o.id_obj = f.id_obj and p.od > '2014-03-01'
It is asked on tables having teens thusands of records, and it takes less then 0,01 second on really not too strong machine.
I wouldn't use IN clause (as it is mentioned somewhere above). IN is given to use with short lists of constans, and not as to be the query filter built on subquery. It is because subquery in IN is performed for every scanned record which can made query taking very loooong time.

Since this is most popular question with regard to this problem, I'll re-post another answer to it here as well:
It looks like there is simpler way to do this (but only in MySQL):
select *
from (select * from mytable order by id, rev desc ) x
group by id
Please credit answer of user Bohemian in this question for providing such a concise and elegant answer to this problem.
Edit: though this solution works for many people it may not be stable in the long run, since MySQL doesn't guarantee that GROUP BY statement will return meaningful values for columns not in GROUP BY list. So use this solution at your own risk!

If you have many fields in select statement and you want latest value for all of those fields through optimized code:
select * from
(select * from table_name
order by id,rev desc) temp
group by id

How about this:
SELECT all_fields.*
FROM (SELECT id, MAX(rev) FROM yourtable GROUP BY id) AS max_recs
LEFT OUTER JOIN yourtable AS all_fields
ON max_recs.id = all_fields.id

This solution makes only one selection from YourTable, therefore it's faster. It works only for MySQL and SQLite(for SQLite remove DESC) according to test on sqlfiddle.com. Maybe it can be tweaked to work on other languages which I am not familiar with.
SELECT *
FROM ( SELECT *
FROM ( SELECT 1 as id, 1 as rev, 'content1' as content
UNION
SELECT 2, 1, 'content2'
UNION
SELECT 1, 2, 'content3'
UNION
SELECT 1, 3, 'content4'
) as YourTable
ORDER BY id, rev DESC
) as YourTable
GROUP BY id

Here is a nice way of doing that
Use following code :
with temp as (
select count(field1) as summ , field1
from table_name
group by field1 )
select * from temp where summ = (select max(summ) from temp)

I like to do this by ranking the records by some column. In this case, rank rev values grouped by id. Those with higher rev will have lower rankings. So highest rev will have ranking of 1.
select id, rev, content
from
(select
#rowNum := if(#prevValue = id, #rowNum+1, 1) as row_num,
id, rev, content,
#prevValue := id
from
(select id, rev, content from YOURTABLE order by id asc, rev desc) TEMP,
(select #rowNum := 1 from DUAL) X,
(select #prevValue := -1 from DUAL) Y) TEMP
where row_num = 1;
Not sure if introducing variables makes the whole thing slower. But at least I'm not querying YOURTABLE twice.

here is another solution hope it will help someone
Select a.id , a.rev, a.content from Table1 a
inner join
(SELECT id, max(rev) rev FROM Table1 GROUP BY id) x on x.id =a.id and x.rev =a.rev

None of these answers have worked for me.
This is what worked for me.
with score as (select max(score_up) from history)
select history.* from score, history where history.score_up = score.max

Here's another solution to retrieving the records only with a field that has the maximum value for that field. This works for SQL400 which is the platform I work on. In this example, the records with the maximum value in field FIELD5 will be retrieved by the following SQL statement.
SELECT A.KEYFIELD1, A.KEYFIELD2, A.FIELD3, A.FIELD4, A.FIELD5
FROM MYFILE A
WHERE RRN(A) IN
(SELECT RRN(B)
FROM MYFILE B
WHERE B.KEYFIELD1 = A.KEYFIELD1 AND B.KEYFIELD2 = A.KEYFIELD2
ORDER BY B.FIELD5 DESC
FETCH FIRST ROW ONLY)

Sorted the rev field in reverse order and then grouped by id which gave the first row of each grouping which is the one with the highest rev value.
SELECT * FROM (SELECT * FROM table1 ORDER BY id, rev DESC) X GROUP BY X.id;
Tested in http://sqlfiddle.com/ with the following data
CREATE TABLE table1
(`id` int, `rev` int, `content` varchar(11));
INSERT INTO table1
(`id`, `rev`, `content`)
VALUES
(1, 1, 'One-One'),
(1, 2, 'One-Two'),
(2, 1, 'Two-One'),
(2, 2, 'Two-Two'),
(3, 2, 'Three-Two'),
(3, 1, 'Three-One'),
(3, 3, 'Three-Three')
;
This gave the following result in MySql 5.5 and 5.6
id rev content
1 2 One-Two
2 2 Two-Two
3 3 Three-Two

You can make the select without a join when you combine the rev and id into one maxRevId value for MAX() and then split it back to original values:
SELECT maxRevId & ((1 << 32) - 1) as id, maxRevId >> 32 AS rev
FROM (SELECT MAX(((rev << 32) | id)) AS maxRevId
FROM YourTable
GROUP BY id) x;
This is especially fast when there is a complex join instead of a single table. With the traditional approaches the complex join would be done twice.
The above combination is simple with bit functions when rev and id are INT UNSIGNED (32 bit) and combined value fits to BIGINT UNSIGNED (64 bit). When the id & rev are larger than 32-bit values or made of multiple columns, you need combine the value into e.g. a binary value with suitable padding for MAX().

Explanation
This is not pure SQL. This will use the SQLAlchemy ORM.
I came here looking for SQLAlchemy help, so I will duplicate Adrian Carneiro's answer with the python/SQLAlchemy version, specifically the outer join part.
This query answers the question of:
"Can you return me the records in this group of records (based on same id) that have the highest version number".
This allows me to duplicate the record, update it, increment its version number, and have the copy of the old version in such a way that I can show change over time.
Code
MyTableAlias = aliased(MyTable)
newest_records = appdb.session.query(MyTable).select_from(join(
MyTable,
MyTableAlias,
onclause=and_(
MyTable.id == MyTableAlias.id,
MyTable.version_int < MyTableAlias.version_int
),
isouter=True
)
).filter(
MyTableAlias.id == None,
).all()
Tested on a PostgreSQL database.

I used the below to solve a problem of my own. I first created a temp table and inserted the max rev value per unique id.
CREATE TABLE #temp1
(
id varchar(20)
, rev int
)
INSERT INTO #temp1
SELECT a.id, MAX(a.rev) as rev
FROM
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as a
GROUP BY a.id
ORDER BY a.id
I then joined these max values (#temp1) to all of the possible id/content combinations. By doing this, I naturally filter out the non-maximum id/content combinations, and am left with the only max rev values for each.
SELECT a.id, a.rev, content
FROM #temp1 as a
LEFT JOIN
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as b on a.id = b.id and a.rev = b.rev
GROUP BY a.id, a.rev, b.content
ORDER BY a.id

How to order sql server query with one to many relationship and pagination

I have an sql server database full of products and each product has 1 style record and 1 or more sku records. I'm able to successfully paginate my queries on these tables using dense_rank(), but I can't for the life of me figure out how to order the styles based on a column other than the style_id and still be able to add a an incremental rank to each style.
I'm currently using this query:
SELECT *
FROM
(
SELECT
styles.style_id
styles.department,
styles.style,
styles.brand,
styles.description,
skus.sku_id,
skus.color,
skus.size,
skus.price,
skus.creation_date,
DENSE_RANK() OVER (ORDER BY styles.style_id) AS style_rank
FROM skus
LEFT OUTER JOIN styles ON styles.style_id=skus.style_id
WHERE skus.color='blue'
AND styles.brand='acme'
) AS t1
WHERE
t1.style_rank > 0 AND t1.style_rank <= 25
The above will allow me to paginate at the style level, but I need to be able to sort these styles based on price ASC/DESC or creation_date ASC/DESC. It would be nice to be able to control the sorting with an aggregate function like sort styles by MAX(skus.price), but that is not necessary.
To clarify, one style may have three skus and each of these three skus could be a different price.
This seems to be a pretty basic need for any retail website using RDBMS, but I cant find any documentation on how to do this.
Any suggestions on how this could be done would be much appreciated.
EDIT:
I solved this using two queries (shown in my answer below).

I think you need to add PARTITION BY styles.style_id instead:
SELECT *
FROM
(
SELECT
styles.style_id
styles.department,
styles.style,
styles.brand,
styles.description,
skus.sku_id,
skus.color,
skus.size,
skus.price,
skus.creation_date,
DENSE_RANK() OVER (PARTITION BY styles.style_id
ORDER BY skus.createion_date DESC, skus.price) AS style_rank
FROM skus
LEFT OUTER JOIN styles ON styles.style_id=skus.style_id
WHERE skus.color='blue'
AND styles.brand='acme'
) AS t1
WHERE
t1.style_rank > 0 AND t1.style_rank <= 25

I figured out a solution using two queries. The first query is to get the styles. The second is a "multi-get" query using the style_id's from the first query to get all of the skus.
First query:
SELECT *
FROM
(
SELECT DISTINCT
styles.style_id
styles.department,
styles.style,
styles.brand,
styles.description,
MAX(skus.price) AS max_price,
ROW_NUMBER() OVER (ORDER BY MAX(skus.price) DESC) AS price_rank
FROM skus
LEFT OUTER JOIN styles ON styles.style_id=skus.style_id
WHERE skus.color='blue'
AND styles.brand='acme'
GROUP BY styles.style_id, styles.department, styles.style, styles.brand, styles.description
) AS t1
WHERE
t1.price_rank > 0 AND t1.price_rank <= 25
Second query:
SELECT
skus.style_id,
skus.sku_id,
skus.color,
skus.size,
skus.price,
skus.creation_date
FROM skus
WHERE skus.style_id ='styleid1' OR skus.style_id='styleid2' OR skus.style_id='styleid3' /* etc. */
Now populate style objects with the appropriate skus from the second query.

Multiple Select against one CTE

I have a CTE query filtering a table Student
Student
(
StudentId PK,
FirstName ,
LastName,
GenderId,
ExperienceId,
NationalityId,
CityId
)
Based on a lot filters (multiple cities, gender, multiple experiences (1, 2, 3), multiple nationalites), I create a CTE by using dynamic sql and joining the student table with a user defined tables (CityTable, NationalityTable,...)
After that I have to retrieve the count of student by each filter like
CityId City Count
NationalityId Nationality Count
Same thing the other filter.
Can I do something like
;With CTE(
Select
FROM Student
Inner JOIN ...
INNER JOIN ....)
SELECT CityId,City,Count(studentId)
FROm CTE
GROUP BY CityId,City
SELECT GenderId,Gender,Count
FROM CTE
GROUP BY GenderId,Gender
I want to something like what LinkedIn is doing with search(people search,job search)
http://www.linkedin.com/search/fpsearch?type=people&keywords=sales+manager&pplSearchOrigin=GLHD&pageKey=member-home
It's so fast and do the same thing.

You can not use multiple select but you can use more than one CTE like this.
WITH CTEA
AS
(
SELECT 'Coulmn1' A,'Coulmn2' B
),
CETB
AS
(
SELECT 'CoulmnX' X,'CoulmnY' Y
)
SELECT * FROM CTEA, CETB
For getting count use RowNumber and CTE some think like this.
ROW_NUMBER() OVER ( ORDER BY COLUMN NAME )AS RowNumber,
Count(1) OVER() AS TotalRecordsFound
Please let me know if you need more information on this.
Sample for your reference.
With CTE AS (
Select StudentId, S.CityId, S.GenderId
FROM Student S
Inner JOIN CITY C
ON S.CityId = C.CityId
INNER JOIN GENDER G
ON S.GenderId = G.GenderId)
,
GENDER
AS
(
SELECT GenderId
FROM CTE
GROUP BY GenderId
)
SELECT * FROM GENDER, CTE

It is not possible to get multiple result sets from a single CTE.
You can however use a table variable to cache some of the information and use it later instead of issuing the same complex query multiple times:
declare #relevantStudent table (StudentID int);
insert into #relevantStudent
select s.StudentID from Students s
join ...
where ...
-- now issue the multiple queries
select s.GenderID, count(*)
from student s
join #relevantStudent r on r.StudentID = s.StudentID
group by s.GenderID
select s.CityID, count(*)
from student s
join #relevantStudent r on r.StudentID = s.StudentID
group by s.CityID
The trick is to store only the minimum required information in the table variable.
As with any query whether this will actually improve performance vs. issuing the queries independently depends on many things (how big the table variable data set is, how complex is the query used to populate it and how complex are the subsequent joins/subselects against the table variable, etc.).

Do a UNION ALL to do multiple SELECT and concatenate the results together into one table.
;WITH CTE AS(
SELECT
FROM Student
INNER JOIN ...
INNER JOIN ....)
SELECT CityId,City,Count(studentId),NULL,NULL
FROM CTE
GROUP BY CityId,City
UNION ALL
SELECT NULL,NULL,NULL,GenderId,Gender,Count
FROM CTE
GROUP BY GenderId,Gender
Note: The NULL values above just allow the two results to have matching columns, so the results can be concatenated.

I know this is a very old question, but here's a solution I just used. I have a stored procedure that returns a PAGE of search results, and I also need it to return the total count matching the query parameters.
WITH results AS (...complicated foo here...)
SELECT results.*,
CASE
WHEN #page=0 THEN (SELECT COUNT(*) FROM results)
ELSE -1
END AS totalCount
FROM results
ORDER BY bar
OFFSET #page * #pageSize ROWS FETCH NEXT #pageSize ROWS ONLY;
With this approach, there's a small "hit" on the first results page to get the count, and for the remaining pages, I pass back "-1" to avoid the hit (I assume the number of results won't change during the user session). Even though totalCount is returned for every row of the first page of results, it's only computed once.
My CTE is doing a bunch of filtering based on stored procedure arguments, so I couldn't just move it to a view and query it twice. This approach allows avoid having to duplicate the CTE's logic just to get a count.

SQL Error with Order By in Subquery

I'm working with SQL Server 2005.
My query is:
SELECT (
SELECT COUNT(1) FROM Seanslar WHERE MONTH(tarihi) = 4
GROUP BY refKlinik_id
ORDER BY refKlinik_id
) as dorduncuay
And the error:
The ORDER BY clause is invalid in views, inline functions, derived
tables, subqueries, and common table expressions, unless TOP or FOR
XML is also specified.
How can I use ORDER BY in a sub query?

This is the error you get (emphasis mine):
The ORDER BY clause is invalid in
views, inline functions, derived
tables, subqueries, and common table
expressions, unless TOP or FOR XML is
also specified.
So, how can you avoid the error? By specifying TOP, would be one possibility, I guess.
SELECT (
SELECT TOP 100 PERCENT
COUNT(1) FROM Seanslar WHERE MONTH(tarihi) = 4
GROUP BY refKlinik_id
ORDER BY refKlinik_id
) as dorduncuay

If you're working with SQL Server 2012 or later, this is now easy to fix. Add an offset 0 rows:
SELECT (
SELECT
COUNT(1) FROM Seanslar WHERE MONTH(tarihi) = 4
GROUP BY refKlinik_id
ORDER BY refKlinik_id OFFSET 0 ROWS
) as dorduncuay

Besides the fact that order by doesn't seem to make sense in your query....
To use order by in a sub select you will need to use TOP 2147483647.
SELECT (
SELECT TOP 2147483647
COUNT(1) FROM Seanslar WHERE MONTH(tarihi) = 4
GROUP BY refKlinik_id
ORDER BY refKlinik_id
) as dorduncuay
My understanding is that "TOP 100 PERCENT" doesn't gurantee ordering anymore starting with SQL 2005:
In SQL Server 2005, the ORDER BY
clause in a view definition is used
only to determine the rows that are
returned by the TOP clause. The ORDER
BY clause does not guarantee ordered
results when the view is queried,
unless ORDER BY is also specified in
the query itself.
See SQL Server 2005 breaking changes
Hope this helps,
Patrick

If building a temp table, move the ORDER BY clause from inside the temp table code block to the outside.
Not allowed:
SELECT * FROM (
SELECT A FROM Y
ORDER BY Y.A
) X;
Allowed:
SELECT * FROM (
SELECT A FROM Y
) X
ORDER BY X.A;

You don't need order by in your sub query. Move it out into the main query, and include the column you want to order by in the subquery.
however, your query is just returning a count, so I don't see the point of the order by.

A subquery (nested view) as you have it returns a dataset that you can then order in your calling query. Ordering the subquery itself will make no (reliable) difference to the order of the results in your calling query.
As for your SQL itself:
a) I seen no reason for an order by as you are returning a single value.
b) I see no reason for the sub query anyway as you are only returning a single value.
I'm guessing there is a lot more information here that you might want to tell us in order to fix the problem you have.

Add the Top command to your sub query...
SELECT
(
SELECT TOP 100 PERCENT
COUNT(1)
FROM
Seanslar
WHERE
MONTH(tarihi) = 4
GROUP BY
refKlinik_id
ORDER BY
refKlinik_id
) as dorduncuay
:)

maybe this trick will help somebody
SELECT
[id],
[code],
[created_at]
FROM
( SELECT
[id],
[code],
[created_at],
(ROW_NUMBER() OVER (
ORDER BY
created_at DESC)) AS Row
FROM
[Code_tbl]
WHERE
[created_at] BETWEEN '2009-11-17 00:00:01' AND '2010-11-17 23:59:59'
) Rows
WHERE
Row BETWEEN 10 AND 20;
here inner subquery ordered by field created_at (could be any from your table)

In this example ordering adds no information - the COUNT of a set is the same whatever order it is in!
If you were selecting something that did depend on order, you would need to do one of the things the error message tells you - use TOP or FOR XML

Try moving the order by clause outside sub select and add the order by field in sub select
SELECT * FROM
(SELECT COUNT(1) ,refKlinik_id FROM Seanslar WHERE MONTH(tarihi) = 4 GROUP BY refKlinik_id)
as dorduncuay
ORDER BY refKlinik_id

For me this solution works fine as well:
SELECT tbl.a, tbl.b
FROM (SELECT TOP (select count(1) FROM yourtable) a,b FROM yourtable order by a) tbl

Good day
for some guys the order by in the sub-query is questionable.
the order by in sub-query is a must to use if you need to delete some records based on some sorting.
like
delete from someTable Where ID in (select top(1) from sometable where condition order by insertionstamp desc)
so that you can delete the last insertion form table.
there are three way to do this deletion actually.
however, the order by in the sub-query can be used in many cases.
for the deletion methods that uses order by in sub-query review below link
http://web.archive.org/web/20100212155407/http://blogs.msdn.com/sqlcat/archive/2009/05/21/fast-ordered-delete.aspx
i hope it helps. thanks you all

For a simple count like the OP is showing, the Order by isn't strictly needed. If they are using the result of the subquery, it may be. I am working on a similiar issue and got the same error in the following query:
-- I want the rows from the cost table with an updateddate equal to the max updateddate:
SELECT * FROM #Costs Cost
INNER JOIN
(
SELECT Entityname, costtype, MAX(updatedtime) MaxUpdatedTime
FROM #HoldCosts cost
GROUP BY Entityname, costtype
ORDER BY Entityname, costtype -- *** This causes an error***
) CostsMax
ON Costs.Entityname = CostsMax.entityname
AND Costs.Costtype = CostsMax.Costtype
AND Costs.UpdatedTime = CostsMax.MaxUpdatedtime
ORDER BY Costs.Entityname, Costs.costtype
-- *** To accomplish this, there are a few options:
-- Add an extraneous TOP clause, This seems like a bit of a hack:
SELECT * FROM #Costs Cost
INNER JOIN
(
SELECT TOP 99.999999 PERCENT Entityname, costtype, MAX(updatedtime) MaxUpdatedTime
FROM #HoldCosts cost
GROUP BY Entityname, costtype
ORDER BY Entityname, costtype
) CostsMax
ON Costs.Entityname = CostsMax.entityname
AND Costs.Costtype = CostsMax.Costtype
AND Costs.UpdatedTime = CostsMax.MaxUpdatedtime
ORDER BY Costs.Entityname, Costs.costtype
-- **** Create a temp table to order the maxCost
SELECT Entityname, costtype, MAX(updatedtime) MaxUpdatedTime
INTO #MaxCost
FROM #HoldCosts cost
GROUP BY Entityname, costtype
ORDER BY Entityname, costtype
SELECT * FROM #Costs Cost
INNER JOIN #MaxCost CostsMax
ON Costs.Entityname = CostsMax.entityname
AND Costs.Costtype = CostsMax.Costtype
AND Costs.UpdatedTime = CostsMax.MaxUpdatedtime
ORDER BY Costs.Entityname, costs.costtype
Other possible workarounds could be CTE's or table variables. But each situation requires you to determine what works best for you. I tend to look first towards a temp table. To me, it is clear and straightforward. YMMV.

On possible needs to order a subquery is when you have a UNION :
You generate a call book of all teachers and students.
SELECT name, phone FROM teachers
UNION
SELECT name, phone FROM students
You want to display it with all teachers first, followed by all students, both ordered by. So you cant apply a global order by.
One solution is to include a key to force a first order by, and then order the names :
SELECT name, phone, 1 AS orderkey FROM teachers
UNION
SELECT name, phone, 2 AS orderkey FROM students
ORDER BY orderkey, name
I think its way more clear than fake offsetting subquery result.

I Use This Code To Get Top Second Salary
I am Also Get Error Like
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
TOP 100 I Used To Avoid The Error
select * from (
select tbl.Coloumn1 ,CONVERT(varchar, ROW_NUMBER() OVER (ORDER BY (SELECT 1))) AS Rowno from (
select top 100 * from Table1
order by Coloumn1 desc) as tbl) as tbl where tbl.Rowno=2

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Keeping it simple and how to do multiple CTE in a query - sql-server

Related

Order SQL statement by count of sub-records efficiently (MS-SQL)

How to create multiple return subquery? [duplicate]

How to order sql server query with one to many relationship and pagination

Multiple Select against one CTE

SQL Error with Order By in Subquery

Categories

Resources