Example ID in aggregate queries SQL Server [closed] - sql-server

I have a query that aggregates a large amount of transaction data. The raw data has unique IDs for every transaction and I need to have an example ID in each aggregated row. It doesn't matter which ID is chosen as long as the ID is intact so that we can go back and look up individual examples for a grouping from the raw transaction data if need be. I do not have control over the raw data.
For example, this:
ID Group
6457982468798542364879 Group 1
FR65487985412354 Group 1
1564879541356897 Group 2
6548941236584269 Group 2
Into this:
ExampleID Group Volume
6457982468798542364879 Group 1 2
1564879541356897 Group 2 2
I was trying to use MAX to do this, but that doesn't work when there are IDs that have letters or more than 20 characters. I also tried using STRING_AGG but kept hitting its character limit, and I only want a single ID for each group anyway.
The data sets are large so efficiency is a consideration. I'm using SQL Server version 2017.

If the example ID doesn't matter, pick an aggregate function like MIN() or MAX() and use that to show one ID from the group.

Assuming data as follows
CREATE TABLE #Test (ID nvarchar(30), Grp nvarchar(10))
INSERT INTO #Test (ID, Grp) VALUES
(N'6457982468798542364879' ,N'Group 1'),
(N'FR65487985412354' ,N'Group 1'),
(N'1564879541356897' ,N'Group 2'),
(N'6548941236584269' ,N'Group 2')
Update:
Based on comments regarding MAX etc, you can just use it in one easy command (don't try to CAST/CONVERT it to anything, just use MAX on its own).
SELECT Grp,
MAX(ID) AS ExampleID,
COUNT(*) AS Volume
FROM #Test
GROUP BY Grp
My original version (which was using FIRST_VALUE to get an example row) needed a window function as a sub-query/CTE. But use the one above as it's a lot clearer and easier to maintain, and probably uses less processing time. Here is the original one just for history's sake:
; WITH Src AS
(SELECT Grp,
FIRST_VALUE(ID) OVER (PARTITION BY Grp ORDER BY ID) AS exampleID
FROM #Test
)
SELECT Grp,
ExampleID,
COUNT(*) AS Volume
FROM Src
GROUP BY Grp,
ExampleID
Here's an updated db<>fiddle with both examples.

It seems an efficient way would be to use 2 windowing functions: ROW_NUMBER() to eliminate duplicates, and COUNT() to get the Volume. Something like this.
Data
drop table if exists #Test;
go
create table #Test (
ExampleID nvarchar(30),
Grp nvarchar(10));
go
INSERT INTO #Test (ExampleID, Grp) VALUES
(N'6457982468798542364879', N'Group 1'),
(N'FR65487985412354', N'Group 1'),
(N'1564879541356897', N'Group 2'),
(N'6548941236584269', N'Group 2');
Query
with grp_cte as (
select *, row_number() over (partition by Grp order by (select null)) rn,
count(*) over (partition by Grp) cn
from #Test)
select ExampleID, Grp as [Group], cn Volume
from grp_cte
where rn=1;
Output
ExampleID Group Volume
6457982468798542364879 Group 1 2
1564879541356897 Group 2 2

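Just use ID inside an aggregate such as MAX, since every selected column must either be aggregated or appear in the GROUP BY. This should do the job.
SELECT
MAX(ID) AS ExampleID,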
[GROUP],
COUNT(ID) as VOLUME
FROM MyTable
GROUP BY [GROUP]

Related

How to add sum of a column as a new row after a group in SQL select query

I have a select query which returns a couple of rows grouped by ParentId. How can I add a new row with sum of a column after each parentId group?
For now I have kept the data in a temp table and the result is as below.
And I want to add a new row at the end of each ParentId group as below with the sum of column LoanAmount.
Any help will be appreciated. Many thanks.
You can use a common table expression to achieve this. Here I've created a cte with rank column for getting it sorted in order.
;WITH cte AS
(SELECT ParentId,
sum(LoanAmount) LoanAmount,
max(rank) + 1 AS rank
FROM test
GROUP BY ParentId)
SELECT *
FROM test
UNION ALL
SELECT *
FROM cte
ORDER BY ParentId, rank
I think you want:
SELECT ParentID, SUM(VALUE1), SUM(VALUE2)
FROM tableName
GROUP BY ParentID
You can't add it after each group or at the bottom like in Excel, but you effectively create a 'new table' in your query.
Having seen your updated comment, your main issue is that you're thinking of it like Excel. SQL is not a spreadsheet tool, it's a relational database. I'd suggest going through a SQL intro; you'll pick up the concepts quite fast.
The query I gave you could be created as a stored procedure.
If you feel I've answered your question, I'd appreciate an upvote :)
You can compute the sum for each group in a subquery and then combine it with the detail rows using UNION ALL.
; with cte as
( select 9999 as Slno, Level, ParentId, Ent_id, relation, sum(colname) as colname from table group by Level, ParentId, Ent_id, relation)
, ct as ( select row_number() over (partition by ParentId order by level) as Slno, Level, ParentId, Ent_id, Name, --- rest of your column names
colname from table
union all
select Slno, Level, ParentId, Ent_id, '' as Name, ---rest of '' for each column with column name as alias
colname from cte )
select Slno, Level, ParentId, Ent_Id, Name, ---- your columns of table
colname from ct order by Slno
This is just a rough idea. Feel free to ask if anything is unclear.
Post your exact schema for accurate details.
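As an alternative to the UNION ALL approaches above, the subtotal row can also be produced by GROUP BY GROUPING SETS. The following is only a minimal sketch: the temp table #Loans and its sample rows are hypothetical, and it assumes each (ParentId, Ent_id) pair identifies one detail row, which the question does not confirm.

```sql
-- Hypothetical sample data; table name, columns, and values are assumptions.
CREATE TABLE #Loans (ParentId int, Ent_id int, LoanAmount decimal(12,2));
INSERT INTO #Loans (ParentId, Ent_id, LoanAmount) VALUES
(1, 10, 100.00),
(1, 11, 250.00),
(2, 20, 75.00),
(2, 21, 25.00);

-- One grouping set returns the detail rows, the other returns one total per ParentId.
SELECT ParentId,
       Ent_id,                        -- NULL on the subtotal rows
       SUM(LoanAmount) AS LoanAmount
FROM #Loans
GROUP BY GROUPING SETS ((ParentId, Ent_id), (ParentId))
ORDER BY ParentId,
         GROUPING(Ent_id),            -- 0 for detail rows, 1 for the subtotal, so it sorts last
         Ent_id;

DROP TABLE #Loans;
```

With this sample data each ParentId group ends with a row whose Ent_id is NULL and whose LoanAmount is the group total. If the detail rows carry more columns, they would need to be added to the first grouping set.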

Reverse of each value in a column

Suppose I have a table with an even number of rows. For example, a table Employee with two columns, Name and EmpCode. The table looks like
Name EmpCode
Ajay 7
Vikash 5
Shalu 4
Hari 8
Anu 1
Puja 9
Now, I want my output in reverse of EmpCode like:
Name EmpCode
Ajay 9
Vikash 1
Shalu 8
Hari 4
Anu 5
Puja 7
I need to run this query in SQL Server.
As the OP hasn't replied, I'll post a little explanation for them instead. As everyone has alluded to, tables in SQL Server have no built-in ordering. Your data is stored in what is known as a HEAP. This means that when you run a query without an ORDER BY, your data can be returned in any order the server feels like. With small datasets this might be the order you inserted it in, but that's just it (it might).
When you get to larger datasets, and when you have multiple cores running the operation, the order of a SELECT * FROM [Table]; is less likely to be the order of insertion, and more likely to vary with each run of the query. I have several tables where a SELECT TOP 1 *... will return a different row every time I run the query, even with the CLUSTERED INDEX.
The only, yes only, way to guarantee the order is by using ORDER BY. Now, you might have another column which you haven't shared that you can order by, but if not, perhaps this (very) simple example will at least assist you, if nothing else:
CREATE TABLE #Employee ([Name] varchar(10), EmpCode tinyint);
INSERT INTO #Employee
VALUES ('Ajay',7),
('Vikash',5),
('Shalu',4),
('Hari',8),
('Anu',1),
('Puja',9);
GO
--Just SELECT *. ORDER is NOT guaranteed, but, due to the low volume of data, will probably be in the order by insertion
SELECT *
FROM #Employee;
--But, we want to reverse the order, so, let's add an ORDER BY
SELECT *
FROM #Employee
ORDER BY [Name];
--Oh! That didn't work (duh). Let's try again
SELECT *
FROM #Employee
ORDER BY Empcode;
--Nope, this isn't working. That's because your data has nothing related to its insertion order. So, let's give it one:
GO
DROP TABLE #Employee;
CREATE TABLE #Employee (ID int IDENTITY(1,1), --Oooo, what is this?
[Name] varchar(10),
EmpCode tinyint);
INSERT INTO #Employee
VALUES ('Ajay',7),
('Vikash',5),
('Shalu',4),
('Hari',8),
('Anu',1),
('Puja',9);
GO
--Now look
SELECT *
FROM #Employee;
--So, we can use an ORDER BY, and get the correct order too
SELECT [Name],
Empcode
FROM #Employee
ORDER BY ID;
--So, we got the right ORDER using an ORDER BY. Now we can do something about the ordering:
--We'll need a CTE for this:
WITH RNs AS(
SELECT *,
ROW_NUMBER() OVER (ORDER BY ID ASC) AS RN1,
ROW_NUMBER() OVER (ORDER BY ID DESC) AS RN2
FROM #Employee)
SELECT R1.[Name],
R2.EmpCode
FROM RNs R1
JOIN RNs R2 ON R1.RN1 = R2.RN2;
GO
DROP TABLE #Employee;

SELECT INTO query

I have to write a SELECT INTO T-SQL script for a table which has columns acc_number, history_number and note.
How do I generate an incrementing value of history_number for each record being inserted via SELECT INTO?
Note, that the value for history_number comes off as a different value for each account from a different table.
SELECT history_number = IDENTITY(INT,1,1),
... etc...
INTO NewTable
FROM ExistingTable
WHERE ...
You could use ROW_NUMBER instead of identity i.e. ROW_NUMBER() OVER (ORDER BY )
SELECT acc_number
,o.historynumber
,note
,o.historynumber+DENSE_RANK() OVER (Partition By acc_number ORDER BY Note) AS NewHistoryNumber
--Or some other order by probably a timestamp...
FROM Table t
INNER JOIN OtherTable o
ON ....
Working Fiddle
This will give you an incremented count starting from the history number for each acc_number. I suggest you use a better ORDER BY in the rank, but there was not enough info in the question.
This answer to this question may help you as well
Question
Suppose your SELECT statement is like this
SELECT acc_number,
history_number,
note
FROM [Table]
Try this Query as below.
SELECT ROW_NUMBER() OVER (ORDER BY acc_number) ID,
acc_number,
history_number,
note
INTO [NewTable]
FROM [Table]
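Putting the two ideas above together, a per-account incrementing history_number can be written straight into the new table. This is a minimal sketch under stated assumptions: the temp tables #Notes and #LastHistory, their columns, and the sample values are hypothetical, and ROW_NUMBER is used here instead of DENSE_RANK so that rows with identical notes still receive distinct numbers.

```sql
-- Hypothetical tables: #Notes holds the rows to copy, #LastHistory holds the
-- current history_number for each account (the "different table" in the question).
CREATE TABLE #Notes (acc_number int, note varchar(50));
CREATE TABLE #LastHistory (acc_number int, history_number int);
INSERT INTO #Notes (acc_number, note) VALUES (1, 'a'), (1, 'b'), (2, 'c');
INSERT INTO #LastHistory (acc_number, history_number) VALUES (1, 5), (2, 40);

-- Number the rows within each account and add the account's starting value.
SELECT n.acc_number,
       h.history_number
         + ROW_NUMBER() OVER (PARTITION BY n.acc_number ORDER BY n.note) AS history_number,
       n.note
INTO #NewTable
FROM #Notes AS n
JOIN #LastHistory AS h ON h.acc_number = n.acc_number;

SELECT * FROM #NewTable ORDER BY acc_number, history_number;

DROP TABLE #Notes, #LastHistory, #NewTable;
```

Ordering by note is only a stand-in; a timestamp or another column that reflects the intended sequence would be a better choice where one exists.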

MSSQL 2008 R2 Selecting rows within certain range - Paging - What is the best way

Currently this SQL query is able to select the range of rows I have specified. But is there a better approach for this?
select * from (select *, ROW_NUMBER() over (order by Id desc) as RowId
from tblUsersMessages ) dt
where RowId between 10 and 25
Depends on your indexes.
Sometimes this can be better
SELECT *
FROM tblUsersMessages
WHERE Id IN (SELECT Id
FROM (select Id,
ROW_NUMBER() over (order by Id desc) as RowId
from tblUsersMessages) dt
WHERE RowId between 10 and 25)
If a narrower index exists that can be used to quickly find the Id values within the range. See my answer here for an example that demonstrates the type of issue that can arise.
You need to check the execution plans and output of SET STATISTICS IO ON for your specific case.
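For comparison, on SQL Server 2012 and later (so not on the 2008 R2 instance in the question) the same page can be requested with OFFSET ... FETCH. The sketch below uses a hypothetical temp table #Messages filled with 30 generated rows in place of tblUsersMessages; whether it performs better than the ROW_NUMBER form still depends on the indexes and should be checked in the execution plan.

```sql
-- Requires SQL Server 2012 or later. #Messages and its contents are hypothetical.
CREATE TABLE #Messages (Id int PRIMARY KEY, Body varchar(20));
INSERT INTO #Messages (Id, Body)
SELECT TOP (30) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'message'
FROM sys.all_objects;

-- Rows 10 through 25 in descending Id order, i.e. 16 rows (Ids 21 down to 6 here).
SELECT *
FROM #Messages
ORDER BY Id DESC
OFFSET 9 ROWS               -- skip rows 1 through 9
FETCH NEXT 16 ROWS ONLY;    -- return rows 10 through 25

DROP TABLE #Messages;
```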

Generate Row Serial Numbers in SQL Query

I have a customer transaction table. I need to create a query that includes a serial number pseudo column. The serial number should be automatically reset and start over from 1 upon change in customer ID.
Now, I am familiar with the row_number() function in SQL. This doesn't exactly solve my problem because, to the best of my knowledge, the serial number will not be reset when the order of the rows changes.
I want to do this in a single query (SQL Server) and without having to go through any temporary table usage etc. How can this be done?
Sometimes we might not want to apply any ordering to our result set just to add a serial number. But ROW_NUMBER() requires an ORDER BY clause, so we can use a simple trick to avoid imposing an order on the result set.
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS ItemNo, ItemName FROM ItemMastetr
This way we don't need to apply an ORDER BY to our result set; we just add ItemNo to it, with the numbers assigned in an arbitrary order.
select
ROW_NUMBER() Over (Order by CustomerID) As [S.N.],
CustomerID ,
CustomerName,
Address,
City,
State,
ZipCode
from Customers;
I'm not certain, based on your question if you want numbered rows that will remember their numbers even if the underlying data changes (and gives a different ordering), but if you just want numbered rows - that reset on a change in customer ID, then try using the Partition by clause of row_number()
row_number() over(partition by CustomerID order by CustomerID)
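To make the reset visible, here is a minimal sketch of the PARTITION BY approach on sample data. The temp table #Transactions, the TransactionDate column used for ordering within a customer, and the values are all hypothetical, since the question does not show its schema.

```sql
-- Hypothetical transaction table; column names and values are assumptions.
CREATE TABLE #Transactions (CustomerID int, TransactionDate date, Amount decimal(10,2));
INSERT INTO #Transactions (CustomerID, TransactionDate, Amount) VALUES
(101, '2020-01-05', 20.00),
(101, '2020-01-09', 35.50),
(101, '2020-02-01', 12.25),
(102, '2020-01-07', 99.00),
(102, '2020-03-15', 40.00);

-- SerialNo restarts at 1 for every CustomerID and follows TransactionDate within it.
SELECT ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY TransactionDate) AS SerialNo,
       CustomerID,
       TransactionDate,
       Amount
FROM #Transactions
ORDER BY CustomerID, SerialNo;

DROP TABLE #Transactions;
```

Ordering inside the partition by a column other than CustomerID makes the numbering repeatable; ordering by the partition column itself leaves the order within each customer up to the server.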
Implementing Serial Numbers Without Ordering Any of the Columns
Demo SQL Script-
IF OBJECT_ID('Tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Names VARCHAR(75), Random_No INT);
INSERT INTO #TestTable (Names,Random_No) VALUES
('Animal', 363)
,('Bat', 847)
,('Cat', 655)
,('Duet', 356)
,('Eagle', 136)
,('Frog', 784)
,('Ginger', 690);
SELECT * FROM #TestTable;
There are many ways to implement serial numbers in SQL Server. Here we use the simple ROW_NUMBER function to generate them.
ROW_NUMBER() is a window function that numbers rows sequentially (for example 1, 2, 3, …). It is a temporary value that is calculated when the query is run. It must have an OVER clause with ORDER BY, so we cannot simply omit the ORDER BY clause. But we can write it as below:
SQL Script
IF OBJECT_ID('Tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Names VARCHAR(75), Random_No INT);
INSERT INTO #TestTable (Names,Random_No) VALUES
('Animal', 363)
,('Bat', 847)
,('Cat', 655)
,('Duet', 356)
,('Eagle', 136)
,('Frog', 784)
,('Ginger', 690);
SELECT Names,Random_No,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS SERIAL_NO FROM #TestTable;
In the above query, we can also use SELECT 1, SELECT 'ABC', or SELECT '' instead of SELECT NULL. The result would be the same.
SELECT ROW_NUMBER() OVER (ORDER BY ColumnName1) As SrNo, ColumnName1, ColumnName2 FROM TableName
select ROW_NUMBER() over (order by pk_field ) as srno
from TableName
Using Common Table Expression (CTE)
WITH CTE AS(
SELECT ROW_NUMBER() OVER(ORDER BY CustomerId) AS RowNumber,
Customers.*
FROM Customers
)
SELECT * FROM CTE
I found one solution for MySQL: it's easy to add a new SrNo column, a kind of temporary auto-increment column, with the following query:
SELECT @ab:=@ab+1 as SrNo, tablename.* FROM tablename, (SELECT @ab:= 0)
AS ab
ALTER function dbo.FN_ReturnNumberRows(@Start int, @End int) returns @Numbers table (Number int) as
begin
insert into @Numbers
select n = ROW_NUMBER() OVER (ORDER BY n)+@Start-1 from (
select top (@End-@Start+1) 1 as n from information_schema.columns as A
cross join information_schema.columns as B
cross join information_schema.columns as C
cross join information_schema.columns as D
cross join information_schema.columns as E) X
return
end
GO
select * from dbo.FN_ReturnNumberRows(10,9999)
