SQL Error on Order by and group by - sql-server

This SQL works in MySQL, but it fails in SQL Server.
SELECT COUNT(*)
FROM (
SELECT
COUNT(postnID) AS Total,
postnID,
Unit_DBM,
job_type,
level,
internal_plantilla,
INCID,
ITEM_NO_2005,
position_type,
position_status
FROM paf_plantilla
GROUP BY
internal_plantilla,
level,
INCID,
postnID,
position_status
ORDER BY
internal_plantilla,
postnID
) AS num
Error:
The ORDER BY clause is invalid in views, inline functions, derived
tables, subqueries, and common table expressions, unless TOP, OFFSET
or FOR XML is also specified.

This won't work in most non-MySQL implementations of SQL, for two reasons:
1. Non-aggregated columns in an aggregate query must appear in the GROUP BY clause.
2. ORDER BY without TOP in a subquery is most likely not going to do what you think it will (some engines accept it without an error, but the order still isn't guaranteed).
Because of the first point, it's very hard to work out what the correct query should be. It looks like some kind of count of combinations of other columns. If you explain what you are trying to count, I might be able to update the answer.

You can't select columns that are not in the GROUP BY clause unless they are wrapped in an aggregate function. Since you are only doing a count, you really don't need the ORDER BY clause. Try the below:
SELECT COUNT(*)
FROM (
SELECT
COUNT(postnID) AS Total,
postnID,
Unit_DBM,
job_type,
level,
internal_plantilla,
INCID,
ITEM_NO_2005,
position_type,
position_status
FROM paf_plantilla
GROUP BY
postnID,
Unit_DBM,
job_type,
level,
internal_plantilla,
INCID,
ITEM_NO_2005,
position_type,
position_status
) AS num
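Since the outer query only counts the groups and never reads Total, the same number can probably be obtained by counting distinct combinations directly. This is a sketch that assumes the table and column names from the question and has not been run against the real data:

```sql
-- Count the distinct combinations of the nine columns.
-- DISTINCT and GROUP BY both treat NULLs as equal to each other,
-- so this returns the same count as the grouped subquery.
SELECT COUNT(*)
FROM (
    SELECT DISTINCT
        postnID, Unit_DBM, job_type, level, internal_plantilla,
        INCID, ITEM_NO_2005, position_type, position_status
    FROM paf_plantilla
) AS num;
```

If the result rows themselves need to be ordered, the ORDER BY belongs on the outermost query rather than inside the derived table.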

Related

Why do I need to use "as" keyword in this sql query?

I have this SQL query:
select top(1)
salary
from
(select top(2) salary
from employee
order by salary desc) as b
order by
salary asc
If I don't utilize as b it will give me an error:
Incorrect syntax near ...
Why is it mandatory to use as in this query?
You don't need the as keyword. In fact, I advise using as for column aliases but not for table aliases. So, I would write this as:
select top(1) salary
from (select top(2) salary
from employee
order by salary desc
) b
order by salary asc;
You do need the table alias for the subquery, because SQL Server requires that all subqueries in the from clause be named.
This is T-SQL syntax. A subquery in FROM must have an alias even if it's never used. Oracle, for example, considers this alias optional.
This is because you have a sub-query that, according to the Transact-SQL documentation on FROM, makes the use of an alias mandatory:
When a derived table, rowset or table-valued function, or operator clause (such as PIVOT or UNPIVOT) is used, the required table_alias at the end of the clause is the associated table name for all columns, including grouping columns, returned.
Note that "derived table" here means the kind of subquery you use in your SQL statement:
derived_table
Is a subquery that retrieves rows from the database. derived_table is used as input to the outer query.
Because you are using 'salary' twice. Without an alias the interpreter won't know what 'salary' to order the results by. By using an alias it can discern between employee.salary and b.salary.
A different approach to get the 2nd highest salary, since your approach would get much more challenging if you needed the 3rd or 4th:
SELECT *
FROM (SELECT salary, row_number() over (order by salary desc) rn
FROM employee) E
WHERE rn = 2
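To illustrate how this generalizes, here is a sketch that takes the rank as a variable. The variable @n is a hypothetical name, and DENSE_RANK is swapped in for row_number so that tied salaries share one rank; whether that is the wanted behavior depends on how "Nth highest" is defined:

```sql
-- @n is a hypothetical parameter: 2 gives the 2nd highest salary, 3 the 3rd, and so on.
DECLARE @n int = 3;

SELECT DISTINCT salary
FROM (
    -- DENSE_RANK gives equal salaries the same rank and leaves no gaps,
    -- so rank @n is the @n-th highest distinct salary value.
    SELECT salary,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
    FROM employee
) AS ranked
WHERE rnk = @n;
```

With row_number instead, two employees on the same top salary would occupy positions 1 and 2, so "rn = 2" would return the top salary again.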
You are creating two queries. The first one selects the top 2 salaries from employee. You are calling this list "b". Then you are selecting the top salary from "b".

SQL Server : group all data by one column

I need some help writing a SQL Server stored procedure. All data should be grouped by Train_B_N.
[image: my table data]
Expected result:
[image: expected output]
with CTE as
(
select Train_B_N, Duration,Date,Trainer,Train_code,Training_Program
from Train_M
group by Train_B_N
)
select
*
from Train_M as m
join CTE as c on c.Train_B_N = m.Train_B_N
What's wrong with my query?
The GROUP BY collapses the table into one row per group, so selecting columns that are neither grouped nor aggregated leaves SQL Server with no single value to return for them. That is the problem with this part:
select Train_B_N, Duration,Date,Trainer,Train_code,Training_Program
from Train_M
group by Train_B_N
Under the SQL-92 rule, which SQL Server enforces, the GROUP BY must include every column in the SELECT list that is not inside an aggregate function. No exceptions.
WITH CTE AS (SELECT TRAIN_B_N, MAX(DATE) AS Last_Date
FROM TRAIN_M
GROUP BY TRAIN_B_N)
SELECT A.Train_B_N, Duration, Date,Trainer,Train_code,Training_Program
FROM TRAIN_M AS A
INNER JOIN CTE ON CTE.Train_B_N = A.Train_B_N
AND CTE.Last_Date = A.Date
This example would return the last training program, trainer, train_code used by that ID.
This is accomplished with the MAX(DATE) aggregate function, which keeps the greatest (latest) DATE for each Train_B_N. And since the GROUP BY reduces the rows to their distinct groupings, the JOIN only returns a subset of the table's rows.
Keep in mind that a join returns one row for every matching pair of rows, so if more than one row in TRAIN_M shares the latest date for a Train_B_N, you will get all of them back.
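If exactly one row per Train_B_N is required even when several rows share the latest date, a window function is one option. This is only a sketch using the column names from the question; the alias rn and the CTE name ranked are made up for illustration, and ties are broken arbitrarily unless more columns are added to the ORDER BY:

```sql
WITH ranked AS (
    SELECT Train_B_N, Duration, [Date], Trainer, Train_code, Training_Program,
           -- Number the rows within each Train_B_N, latest date first.
           ROW_NUMBER() OVER (PARTITION BY Train_B_N ORDER BY [Date] DESC) AS rn
    FROM Train_M
)
SELECT Train_B_N, Duration, [Date], Trainer, Train_code, Training_Program
FROM ranked
WHERE rn = 1;  -- keep only the latest row of each group
```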
Look up GROUP BY on MSDN. I suggest you initially read everything outside the syntax examples and absorb what the purpose of the clause is.
Also, next time, try googling your problem like this: 'GROUP BY, SQL' or insert the error code given by your IDE (SSMS or otherwise). You need to understand why things work...and SO is here to help, not be your google search engine. ;)
Hope you find this begins your interest in learning all about SQL. :D

Querying aggregate columns in a SQL Server SELECT statement

I have a SQL Server query which runs just fine -- until I add a computed column to the SELECT statement. Then I get an odd SQL Server error.
Here's the SQL:
SELECT
outmail_.MessageID_,
CONVERT(VARCHAR(10),outmail_.Created_,120) AS 'Issue',
lyrReportSummaryData.mailed,
lyrReportSummaryData.successes,
COUNT(*) AS 'opens',
COUNT(DISTINCT clicktracking_.MemberID_) AS 'unique_opens',
convert(decimal(3,1),((convert(float,[unique_opens]))/[successes]) * 100) AS 'Rate'
FROM
outmail_
RIGHT JOIN
clicktracking_ ON clicktracking_.MessageID_ = outmail_.MessageID_
RIGHT JOIN
lyrReportSummaryData ON lyrReportSummaryData.id = clicktracking_.MessageID_
GROUP BY
outmail_.MessageID_, CONVERT(VARCHAR(10), outmail_.Created_,120),
lyrReportSummaryData.mailed, lyrReportSummaryData.successes
The problem is the line beginning with the convert(decimal ... When it is included, I get the following error:
Error 8120: Column 'lyrReportSummaryData.unique_opens' is invalid in
the select list because it is not contained in either an aggregate
function or the GROUP BY clause.
I'm not sure how to resolve the error since I don't know how to use it in a GROUP BY clause (and it doesn't seem that I should need to do so).
Any suggestions for how to proceed? Thanks.
I'm sure someone with better DBA skills than me can point out a more efficient way of doing this, but...
If you perform the bulk of your query as a sub-query, you can then do the calculations on the result of your sub-query:
SELECT
MessageID_,
Issue,
mailed,
successes,
opens,
unique_opens,
convert(decimal(3,1),((convert(float,[unique_opens]))/[successes]) * 100) AS 'Rate'
FROM
(SELECT
outmail_.MessageID_,
CONVERT(VARCHAR(10),outmail_.Created_,120) AS 'Issue',
lyrReportSummaryData.mailed,
lyrReportSummaryData.successes,
COUNT(*) AS 'opens',
COUNT(DISTINCT clicktracking_.MemberID_) AS 'unique_opens'
FROM outmail_
RIGHT JOIN clicktracking_ ON clicktracking_.MessageID_ = outmail_.MessageID_
RIGHT JOIN lyrReportSummaryData ON lyrReportSummaryData.id = clicktracking_.MessageID_
GROUP BY outmail_.MessageID_, CONVERT(VARCHAR(10), outmail_.Created_,120), lyrReportSummaryData.mailed, lyrReportSummaryData.successes
) subquery /* was 'g' */
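Effectively, this runs the grouping first and then does the calculation on the grouped result. A column alias such as unique_opens cannot be referenced by another expression in the same SELECT list, which is why the original query failed.
Subqueries must be given an alias (in this instance 'subquery'), even if you don't use that alias name.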
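Another way to avoid the error, sketched here with the table and column names from the question, is to repeat the aggregate expression inside the calculation instead of referring to its alias. Two changes in this sketch are assumptions rather than part of the original: decimal(5,1) is used because decimal(3,1) cannot hold 100.0, and NULLIF guards against a successes value of zero:

```sql
SELECT
    outmail_.MessageID_,
    CONVERT(VARCHAR(10), outmail_.Created_, 120) AS Issue,
    lyrReportSummaryData.mailed,
    lyrReportSummaryData.successes,
    COUNT(*) AS opens,
    COUNT(DISTINCT clicktracking_.MemberID_) AS unique_opens,
    -- Repeat the aggregate rather than the alias; 100.0 forces decimal division.
    -- NULLIF turns a zero divisor into NULL, so Rate is NULL instead of an error.
    CONVERT(DECIMAL(5,1),
            100.0 * COUNT(DISTINCT clicktracking_.MemberID_)
                  / NULLIF(lyrReportSummaryData.successes, 0)) AS Rate
FROM outmail_
RIGHT JOIN clicktracking_
    ON clicktracking_.MessageID_ = outmail_.MessageID_
RIGHT JOIN lyrReportSummaryData
    ON lyrReportSummaryData.id = clicktracking_.MessageID_
GROUP BY
    outmail_.MessageID_,
    CONVERT(VARCHAR(10), outmail_.Created_, 120),
    lyrReportSummaryData.mailed,
    lyrReportSummaryData.successes;
```

This is valid because successes is already in the GROUP BY and the other operand is an aggregate. The subquery form is easier to read when the same aggregate feeds several calculations.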

Count of Distinct Rows Without Using Subquery

Say I have Table1 which has duplicate rows (forget the fact that it has no primary key...) Is it possible to rewrite the following without using a JOIN, subquery or CTE and also without having to spell out the columns in something like a GROUP BY?
SELECT COUNT(*)
FROM (
SELECT DISTINCT * FROM Table1
) T1
You can do something like this.
SELECT Count(DISTINCT ProductName) FROM Products
but if you want a count of completely distinct records then you will have to use one of the other options you mentioned.
If you wanted to do something like you suggested in the question, then that would imply you have duplicate records in your table.
If you didn't have duplicate records SELECT DISTINCT * from table would be the same without the distinct.
No, it's not possible.
If you are limited by your framework/query tool/whatever, can't use a subquery, and can't spell out each column name in the GROUP BY, you are SOL.
If you are not limited by your framework/query tool/whatever, there's no reason not to use a subquery.
If you really, really want to do that, you can just run "SELECT COUNT(*) FROM table1 GROUP BY all,columns,here" and take the size of the result set as your count.
But it would be dailywtf worthy code ;)
I just wanted to refine the answer by saying that you need to check that the datatype of the columns is comparable - otherwise you will get an error trying to make them DISTINCT:
e.g.
com.microsoft.sqlserver.jdbc.SQLServerException: The ntext data type cannot be selected as DISTINCT because it is not comparable.
This is true for large binary columns, xml columns and others, depending on your RDBMS, so check its manual. In SQL Server 2005 onwards, for example, the solution is to cast the column from ntext to nvarchar(MAX).
If you stick to the PK columns then you should be OK (I haven't verified this myself but I'd have thought logically that PK columns would have to be comparable)
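As a rough illustration of the cast, assume Table1 has a hypothetical int column Id and a hypothetical ntext column Notes (neither name comes from the question). Spelling out the columns is unavoidable here, because the cast has to be applied per column:

```sql
-- Id and Notes are made-up column names for this sketch.
SELECT COUNT(*)
FROM (
    SELECT DISTINCT
        Id,
        CAST(Notes AS nvarchar(max)) AS Notes  -- nvarchar(max) is comparable, ntext is not
    FROM Table1
) AS T1;
```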

How to elegantly write a SQL ORDER BY (which is invalid in inline query) but required for aggregate GROUP BY?

I have a simple query that runs in SQL Server 2008 and uses a custom CLR aggregate function, dbo.string_concat, which aggregates a collection of strings.
I require the comments ordered sequentially hence the ORDER BY requirement.
The query I have has an awful TOP statement in it to allow ORDER BY to work for the aggregate function otherwise the comments will be in no particular order when they are concatenated by the function.
Here's the current query:
SELECT ID, dbo.string_concat(Comment)
FROM (
SELECT TOP 10000000000000 ID, Comment, CommentDate
FROM Comments
ORDER BY ID, CommentDate DESC
) x
GROUP BY ID
Is there a more elegant way to rewrite this statement?
So... what you want is comments concatenated in order of ID then CommentDate of the most recent comment?
Couldn't you just do
SELECT ID, dbo.string_concat(Comment)
FROM Comments
GROUP BY ID
ORDER BY ID, MAX(CommentDate) DESC
Edit: I misunderstood your objective. The best I can come up with is to clean up your query a fair bit by making it SELECT TOP 100 PERCENT. It's still using a TOP, but at least it gets around having an arbitrary number as the limit.
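Since you're using SQL Server 2008, you can move the inner query into a Common Table Expression. A CTE is subject to the same ORDER BY restriction as a derived table, so TOP is still required for the statement to compile, and the order in which rows reach the aggregate is still not guaranteed:
WITH cte_ordered (ID, Comment, CommentDate)
AS
(
SELECT TOP 100 PERCENT ID, Comment, CommentDate
FROM Comments
ORDER BY ID, CommentDate DESC
)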
SELECT ID, dbo.string_concat(Comment)
FROM cte_ordered
GROUP BY ID
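If replacing the CLR aggregate is acceptable, a commonly used pattern on SQL Server 2008 is a correlated subquery with FOR XML PATH, where ORDER BY is explicitly allowed. This is a different technique from dbo.string_concat, sketched under the assumptions that Comment is a character column such as nvarchar and that ', ' is the wanted separator:

```sql
SELECT c.ID,
       -- STUFF removes the leading ', ' from the concatenated string.
       STUFF((SELECT ', ' + c2.Comment
              FROM Comments AS c2
              WHERE c2.ID = c.ID
              ORDER BY c2.CommentDate DESC   -- allowed here because FOR XML is specified
              FOR XML PATH(''), TYPE         -- TYPE + .value() avoids XML entity escaping
             ).value('.', 'nvarchar(max)'), 1, 2, '') AS Comments
FROM Comments AS c
GROUP BY c.ID;
```

On SQL Server 2017 and later, STRING_AGG with WITHIN GROUP (ORDER BY ...) expresses the same thing more directly.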
