Group by multiple formatted and casted columns in SQL Server - sql-server

I'm having trouble grouping by multiple columns in SQL Server. I'm aware that GROUP BY with ordinals does not work, and that I cannot use alias names in GROUP BY.
It's painful to list the columns, especially when I've formatted and cast them in the SELECT statement.
e.g.
SELECT
CONCAT('ABC',column1) AS col1,
cast('XYZ' AS VARCHAR) AS col2,
column3 AS col3,
cast(ISNULL(column4,0) AS MONEY) / 100 AS col4,
....
....
count(DISTINCT table2.ID) AS col15
GROUP BY <>
Above is a sample of the kind of query I use. It gets complex when I select different columns from several tables JOINed together. What's the solution?

If you really want to group by the aliases, compute the expressions in a Sub-Query or CTE and do the grouping and aggregation in the outer query, where the aliases are visible:
Sub-Query:
SELECT col1, col2, col3, col4,
....
count(DISTINCT ID) AS col15
FROM (
SELECT
CONCAT('ABC',column1) AS col1,
cast('XYZ' AS VARCHAR) AS col2,
column3 AS col3,
cast(ISNULL(column4,0) AS MONEY) / 100 AS col4,
....
....
table2.ID
FROM <>
) AS data
GROUP BY col1, col2, col3, col4, ....
Common Table Expression (CTE):
;WITH data AS (
SELECT
CONCAT('ABC',column1) AS col1,
cast('XYZ' AS VARCHAR) AS col2,
column3 AS col3,
cast(ISNULL(column4,0) AS MONEY) / 100 AS col4,
....
....
table2.ID
FROM <>
)
SELECT col1, col2, col3, col4,
....
count(DISTINCT ID) AS col15
FROM data
GROUP BY col1, col2, col3, col4, ....
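As a concrete, runnable sketch of the CTE form: the table variables @table1 and @table2, their columns, the join condition and the sample rows below are all made up for illustration, since the question does not show its FROM clause. The expressions are aliased once inside the CTE, and the outer query groups by those aliases and applies the aggregate.

```sql
-- Hypothetical sample data (not from the question)
DECLARE @table1 TABLE (column1 VARCHAR(10), column3 INT, column4 INT);
DECLARE @table2 TABLE (ID INT, column1 VARCHAR(10));
INSERT INTO @table1 VALUES ('x', 10, 250), ('y', 20, NULL);
INSERT INTO @table2 VALUES (1, 'x'), (2, 'x'), (3, 'y');

WITH data AS (
    -- Format and cast once, giving each expression an alias
    SELECT
        CONCAT('ABC', t1.column1)                    AS col1,
        t1.column3                                   AS col3,
        CAST(ISNULL(t1.column4, 0) AS MONEY) / 100   AS col4,
        t2.ID                                        AS ID
    FROM @table1 AS t1
    JOIN @table2 AS t2 ON t2.column1 = t1.column1    -- assumed join key
)
-- The aliases are ordinary columns here, so GROUP BY can name them
SELECT col1, col3, col4, COUNT(DISTINCT ID) AS col15
FROM data
GROUP BY col1, col3, col4;
```

With this sample data the query returns two rows, with col15 = 2 for 'ABCx' and col15 = 1 for 'ABCy'.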

Related

Ignore duplicates and not deleting them

I already know how to delete duplicate rows on an Id column. But I am not allowed to delete anything on the server. What I need is a WITH statement that ignores duplicate rows (takes only one of them). Is there a way to do this without modifying data in a table?
P.S.1 All the duplicates rows are identical. So there is no need to decide which one to keep.
P.S.2 I'd rather not create an extra temp table (SELECT * INTO ...)
You can use a CTE which does a select DISTINCT on all columns:
WITH cte AS (
SELECT DISTINCT Id, col1, col2, ..., colN
FROM yourTable
)
You could also achieve this using GROUP BY on all columns:
WITH cte AS (
SELECT Id, col1, col2, ..., colN
FROM yourTable
GROUP BY Id, col1, col2, ..., colN
)
If the Id values are not duplicated, but all other columns are, then you can try:
WITH cte AS (
SELECT MIN(Id) AS Id, col1, col2, ..., colN
FROM yourTable
GROUP BY col1, col2, ..., colN
)
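A CTE only exists for the single statement that immediately follows it, so the snippets above need a SELECT right after the closing parenthesis. A minimal sketch, assuming a made-up table variable @yourTable with two columns and three sample rows:

```sql
-- Hypothetical sample data: two identical rows and one unique row
DECLARE @yourTable TABLE (Id INT, col1 VARCHAR(10));
INSERT INTO @yourTable VALUES (1, 'a'), (1, 'a'), (2, 'b');

WITH cte AS (
    SELECT DISTINCT Id, col1
    FROM @yourTable
)
SELECT Id, col1
FROM cte;   -- two rows: (1, 'a') and (2, 'b'); @yourTable itself is not modified
```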

SQL Pivot query without aggregation or max

I am trying to get a pivot result with no aggregation. I tried max and it didn't help; maybe I am doing something wrong.
When I run the query below
create table #t
(
col1 int,
col2 varchar(100),
col3 varchar(100),
col4 varchar(100),
col5 int
)
insert into #t values(1,'test1','p1','v1',1)
insert into #t values(1,'test1','p2','v2',2)
insert into #t values(1, 'test1','p3','v3',3)
insert into #t values(1,'test1','p1','v11',1)
insert into #t values(1,'test1','p1','v12',1)
insert into #t values(1,'test1','p2','v21',2)
insert into #t values(1,'test1','p2','v22',2)
--select * from #t
select col1,
col2,
[p1],
[p2],
[p3]
from
(
select * from #t
) x
pivot
(
Max(col4 )
for col3 in ([p1],[p2],[p3])
) pvt
I get this below result
I am trying to get this below result
It would be great if you could show me a path to achieve this.
You'll still need to use an aggregate function with the PIVOT, but you need some sort of value to return multiple rows based on the combination of col1, col2, and col3. This is where you'd want to use a windowing function like row_number().
If you use the following query you should be able to get the result:
select col1, col2, p1, p2, p3
from
(
select col1, col2, col3, col4,
rn = row_number() over(partition by col1, col2, col3 order by col5, col4)
from #t
) d
pivot
(
max(col4)
for col3 in (p1, p2, p3)
) p;
See SQL Fiddle with Demo
The row_number() function creates a unique sequence that is partitioned by the col1, col2 and col3 values - I then ordered the results by your col5 and col4 values to create the sequence in a specific order. This new value is used when the pivot groups the data which results in multiple rows being returned instead of the single row.
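To see why this works, it can help to run the inner query on its own. The sketch below uses the same seven sample rows as the question, loaded into a table variable named @t (my choice, so that the snippet runs standalone).

```sql
DECLARE @t TABLE (col1 INT, col2 VARCHAR(100), col3 VARCHAR(100),
                  col4 VARCHAR(100), col5 INT);
INSERT INTO @t VALUES
    (1,'test1','p1','v1',1),  (1,'test1','p2','v2',2),  (1,'test1','p3','v3',3),
    (1,'test1','p1','v11',1), (1,'test1','p1','v12',1),
    (1,'test1','p2','v21',2), (1,'test1','p2','v22',2);

-- rn restarts at 1 for every (col1, col2, col3) group:
--   p1: v1 -> 1, v11 -> 2, v12 -> 3
--   p2: v2 -> 1, v21 -> 2, v22 -> 3
--   p3: v3 -> 1
SELECT col1, col2, col3, col4,
       rn = ROW_NUMBER() OVER (PARTITION BY col1, col2, col3
                               ORDER BY col5, col4)
FROM @t;
```

Because rn is not named in the PIVOT clause, it acts as an implicit grouping column alongside col1 and col2, so the pivot produces one row per rn value (three rows for this data) instead of a single row.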

Delete duplicates from SQL Server table

I need to remove some duplicate entries from an intersection table.
The table is incredibly badly set up, without primary keys, so I'm having some trouble removing entries which are duplicates.
Here's just a rough overview of the table:
col1 col2
------------
1 70
1 70
1 71
Both columns hold IDs, and the duplicates break things.
You can use ranking functions:
with cte as
(
select row_number() over(partition by col1,col2 order by col1,col2 )as rowNum
from tableName
)
delete from cte where rowNum>1
SQL FIDDLE DEMO
with t1dups (col1, coldups)
AS (
select col2, ROW_NUMBER() Over (Partition by col1, col2 order by col2) as dups from t1 )
delete from t1dups where coldups > 1
drop table #t
create table #t(col1 int,col2 int)
insert into #t values(1,70),(1,70),(2,71)
;with cte as
(
select [col1],[col2],rn=row_number() over(partition by col1, col2 order by col2) from #t
)
delete from cte where rn>1
select * from #t
DEMO

Filter on Output clause sql

I am trying to use a filter on an OUTPUT clause in t-sql.
What I want to do is something like this:
Insert into tbl_1(col1,col2)
Output Inserted.col1 into #tbl_temp
**where col1 > 0**
select col3, col4
from tbl_2
For performance reasons I don't want to use two insert statements.
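You can wrap the INSERT ... OUTPUT in a derived table and filter its output in an outer INSERT ... SELECT:
insert into #tbl_temp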
select col1
from
(
insert into tbl_1(col1,col2)
output Inserted.col1
select col3, col4
from tbl_2
) as T
where T.col1 > 0

How to filter rows by values of one column?

I need to get several columns from a SQL query. Then I have to filter the result by the "distinct" values of one column, but the output must contain all columns, not only the one whose values have to be distinct. Can anybody help me? An ORDER BY clause is not an answer for me.
A,B,C,D
E,F,G,H
I,J,C,L
M,N,Z,H
Above is a simple sample of the rows. Please look at the 3rd column, and assume we don't know how many rows there are. I need to select only rows that have a distinct value in the 3rd column (C, G, Z), so one of the two "C" rows has to be filtered out; it doesn't matter which.
I've arbitrarily chosen to use col1 to break ties on col3. You can adjust the order by portion of the partition to suit your needs.
/* Set up test data */
create table #test (
col1 char(1),
col2 char(1),
col3 char(1),
col4 char(1)
)
insert into #test
(col1, col2, col3, col4)
select 'A','B','C','D' union all
select 'E','F','G','H' union all
select 'I','J','C','L' union all
select 'M','N','Z','H'
/* Here's the query */
;with cteRowNumber as (
select col1, col2, col3, col4,
row_number() over (partition by col3 order by col1) as RowNumber
from #test
)
select col1, col2, col3, col4
from cteRowNumber
where RowNumber = 1
Returns
col1 col2 col3 col4
----------------------------
A B C D
E F G H
M N Z H
ROLLUP or CUBE could be helpful for your problem, since they can aggregate (i.e. subtotal) data based on the GROUP BY and still return the individual rows.
