Select a random database row based on another query - sql-server

For internal control we would like to select a single random invoice for each of multiple invoice types and regions.
Here's the SQL to get a set of distinct Invoice Types and Regions
select InvoiceType,RegionID
from Invoices
group by InvoiceType, RegionID
For each row this returns I need to fetch a random row with that InvoiceType and RegionID. This is how I'm fetching random rows:
SELECT top 1
Invoices.CustomerID
,InvoiceNum
,Name
FROM Invoices
JOIN Customers on Customers.CustomerID=Invoices.CustomerID
where InvoiceType=X and RegionID=Y
ORDER BY NEWID()
But I don't know how to run this select statement foreach() row the first statement returns. I could do it programmatically but I would prefer an option using only a stored procedure as this query isn't supposed to need a program.

WITH cteInvoices AS (
SELECT CustomerID, InvoiceNum, Name,
ROW_NUMBER() OVER(PARTITION BY InvoiceType, RegionID ORDER BY NEWID()) AS RowNum
FROM Invoices
)
SELECT c.CustomerID, c.InvoiceNum, c.Name
FROM cteInvoices c
WHERE c.RowNum = 1;
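To try the partition-and-pick idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the original environment: SQLite's RANDOM() replaces NEWID(), and the sample rows and the helper name random_invoice_per_group are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoices (InvoiceNum INTEGER PRIMARY KEY,
                           InvoiceType TEXT, RegionID INTEGER);
    -- Hypothetical sample data: three (InvoiceType, RegionID) groups.
    INSERT INTO Invoices (InvoiceNum, InvoiceType, RegionID) VALUES
        (1, 'A', 10), (2, 'A', 10), (3, 'A', 20),
        (4, 'B', 10), (5, 'B', 10), (6, 'B', 10);
""")

def random_invoice_per_group(conn):
    """Return one randomly chosen invoice for each (InvoiceType, RegionID)."""
    sql = """
        WITH numbered AS (
            SELECT InvoiceNum, InvoiceType, RegionID,
                   ROW_NUMBER() OVER (
                       PARTITION BY InvoiceType, RegionID
                       ORDER BY RANDOM()   -- SQLite stand-in for NEWID()
                   ) AS RowNum
            FROM Invoices
        )
        SELECT InvoiceType, RegionID, InvoiceNum
        FROM numbered
        WHERE RowNum = 1
        ORDER BY InvoiceType, RegionID
    """
    return conn.execute(sql).fetchall()

picks = random_invoice_per_group(conn)
print(picks)  # one row per group; which invoice is picked varies per run
```

Each group is shuffled independently inside its partition, so no loop over the distinct (InvoiceType, RegionID) pairs is needed.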

Related

Column 'ACCOUNT.ACCOUNT_ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I am trying to get the available balance as of the last (max) activity date. I wrote the query below, but it raises an error.
select ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE,MAX(LAST_ACTIVITY_DATE)
from ACCOUNT
group by CUST_ID;
Column 'ACCOUNT.ACCOUNT_ID' is invalid in the select list because it
is not contained in either an aggregate function or the GROUP BY
clause.
I am new to SQL. Can anyone tell me what is wrong with this query?
Any column without an aggregate function applied to it must appear in the GROUP BY clause.
select ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE,MAX(LAST_ACTIVITY_DATE)
from ACCOUNT
group by ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE;
If you're wanting the most recent row for each customer, think ROW_NUMBER(), not GROUP BY:
;With Numbered as (
select *,ROW_NUMBER() OVER (
PARTITION BY CUST_ID
ORDER BY LAST_ACTIVITY_DATE desc) rn
from Account
)
select ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE,LAST_ACTIVITY_DATE
from Numbered
where rn=1
I think you want to select the one record with max(LAST_ACTIVITY_DATE) for each CUST_ID.
For this you can use TOP 1 WITH TIES as follows.
SELECT TOP 1 WITH TIES account_id,
avail_balance,
open_date,
last_activity_date
FROM account
ORDER BY Row_number()
OVER (
partition BY cust_id
ORDER BY last_activity_date DESC)
The issue with your query is that you can't select a non-aggregated column unless it is also listed in the GROUP BY clause.
If you want the max activity date for each customer, your query should be as below:
select CUST_ID, MAX(LAST_ACTIVITY_DATE)
from ACCOUNT
group by CUST_ID;
You can't select any other column that is not in the GROUP BY clause. The error message says the same thing.
with query(CUST_ID, LAST_ACTIVITY_DATE) as
(
select
CUST_ID,
MAX(LAST_ACTIVITY_DATE) as LAST_ACTIVITY_DATE
from ACCOUNT
group by CUST_ID
)
select
a.ACCOUNT_ID,
a.AVAIL_BALANCE,
a.OPEN_DATE,
a.LAST_ACTIVITY_DATE
from ACCOUNT as a
inner join query as q
on a.CUST_ID = q.CUST_ID
and a.LAST_ACTIVITY_DATE = q.LAST_ACTIVITY_DATE
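The ROW_NUMBER() and join-back approaches above differ when two accounts of the same customer share the latest date. The following is a small sketch of that difference, run against SQLite through Python's sqlite3 module rather than SQL Server. The sample rows are invented, and the extra ACCOUNT_ID tie-breaker in the ORDER BY is my own addition, not part of the original answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ACCOUNT (ACCOUNT_ID INTEGER PRIMARY KEY, CUST_ID INTEGER,
                          AVAIL_BALANCE REAL, LAST_ACTIVITY_DATE TEXT);
    -- Hypothetical data: customer 1 has two accounts tied on the latest date.
    INSERT INTO ACCOUNT VALUES
        (1, 1, 100.0, '2024-01-05'),
        (2, 1, 250.0, '2024-03-01'),
        (3, 1,  75.0, '2024-03-01'),
        (4, 2, 500.0, '2024-02-10');
""")

# Join back to the per-customer maximum: every tied row is returned.
join_rows = conn.execute("""
    SELECT a.ACCOUNT_ID
    FROM ACCOUNT AS a
    JOIN (SELECT CUST_ID, MAX(LAST_ACTIVITY_DATE) AS LAST_ACTIVITY_DATE
          FROM ACCOUNT
          GROUP BY CUST_ID) AS q
      ON a.CUST_ID = q.CUST_ID
     AND a.LAST_ACTIVITY_DATE = q.LAST_ACTIVITY_DATE
    ORDER BY a.ACCOUNT_ID
""").fetchall()

# ROW_NUMBER(): exactly one row per customer; ACCOUNT_ID breaks ties (assumption).
rn_rows = conn.execute("""
    WITH Numbered AS (
        SELECT ACCOUNT_ID,
               ROW_NUMBER() OVER (
                   PARTITION BY CUST_ID
                   ORDER BY LAST_ACTIVITY_DATE DESC, ACCOUNT_ID
               ) AS rn
        FROM ACCOUNT
    )
    SELECT ACCOUNT_ID FROM Numbered WHERE rn = 1 ORDER BY ACCOUNT_ID
""").fetchall()

print(join_rows)  # [(2,), (3,), (4,)]
print(rn_rows)    # [(2,), (4,)]
```

Which behaviour is right depends on whether tied accounts should all be reported or only one of them.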

SSIS - Filter duplicate rows

I have a table (Id, ArticleCode, StoreCode, Adress, Number) that contains rows that are duplicates on the columns [ArticleCode, StoreCode] only.
Currently I can filter duplicate rows using an Aggregate transformation, but the problem is that the output rows have only two columns [ArticleCode, StoreCode] and I need the other columns as well.
In the OLE DB Source component, set the data access mode to SQL command instead of a table name and use the following query as the source:
SELECT [ID]
,[ArticleCode]
,[StoreCode]
,[Address]
,[Number] FROM (
SELECT [ID]
,[ArticleCode]
,[StoreCode]
,[Address]
,[Number]
,ROW_NUMBER() OVER(PARTITION BY [ArticleCode]
,[StoreCode] ORDER BY [ArticleCode]
,[StoreCode]) AS ROWNUM
FROM [dbo].[Table_1]) AS T1
WHERE T1.ROWNUM = 1
To get rid of duplicates and select unique records by [ArticleCode, StoreCode]:
select top 1 with ties
Id ,
ArticleCode ,
StoreCode ,
Adress ,
Number
from
YourTable
order by
row_number() over(partition by ArticleCode, StoreCode order by Id)
But which of the two records should be selected when [ArticleCode, StoreCode] are equal and [Adress, Number] differ?
If Id is auto-increment, then order by Id picks the first-entered record and order by Id desc picks the last.
You have to define somehow which [Adress, Number] pair among the duplicates is the correct one to select.
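To make the "first entered vs. last entered" choice concrete, here is a rough sketch in Python with sqlite3 (a stand-in for the SQL Server source query, not SSIS itself). The sample rows and the helper keep_one are hypothetical; only the ROW_NUMBER() pattern comes from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (Id INTEGER PRIMARY KEY, ArticleCode TEXT,
                            StoreCode TEXT, Adress TEXT, Number INTEGER);
    -- Hypothetical data: Ids 1 and 2 are duplicates on (ArticleCode, StoreCode).
    INSERT INTO YourTable VALUES
        (1, 'A1', 'S1', 'Old Street', 5),
        (2, 'A1', 'S1', 'New Street', 7),
        (3, 'A2', 'S1', 'Main Road', 1);
""")

def keep_one(conn, newest=False):
    """Keep one row per (ArticleCode, StoreCode): lowest Id, or highest if newest."""
    direction = "DESC" if newest else "ASC"  # fixed choices only, never user input
    sql = f"""
        SELECT Id, ArticleCode, StoreCode, Adress, Number
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY ArticleCode, StoreCode
                       ORDER BY Id {direction}
                   ) AS rn
            FROM YourTable
        )
        WHERE rn = 1
        ORDER BY Id
    """
    return conn.execute(sql).fetchall()

print(keep_one(conn))               # keeps Id 1 and Id 3
print(keep_one(conn, newest=True))  # keeps Id 2 and Id 3
```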

Subtracting two columns within the sql query

I have been trying to subtract two columns in sql server to form a third one. Below is my query
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
Here is what I tried, but it does not work:
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate , (TotalDue-AllocatedToDate) as NewColumn
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
Finally I tried it using a CTE, which worked fine. But I want to do it without creating a CTE. Is there any other way to achieve the same result? I do not want to use a CTE because it is forecast that other calculated columns may be added in the future.
with CTE as(select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id) select * , (CTE.TotalDue-CTE.AllocatedToDate)As Newcolumn from CTE
You can do it without a CTE by repeating the entire formula that makes up AllocatedToDate.
You cannot reference a column alias elsewhere in the same SELECT list, so you cannot do this:
SELECT {some calculation} AS ColumnA, (ColumnA - ColumnB) AS ColumnC
If you don't want to use a CTE or derived table, you have to do this:
SELECT {some calculation} AS ColumnA, ({some calculation} - ColumnB) AS ColumnC
And by the way, I can't imagine why the possibility of future columns being added is a reason not to use a CTE. To me, it sounds like a reason TO use a CTE, as you will only have to make changes in one place in the code, and not duplicate the same code in different places in the same query.
You can just use nested queries:
select Id, TotalDue, AllocatedToDate, (TotalDue-AllocatedToDate) as NewColumn
from (
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
) x
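The derived-table pattern can be checked on a toy schema. This is only a sketch using Python's sqlite3 module: the Payments and Allocations tables are simplified, hypothetical stand-ins for the tables in the question, and COALESCE is used in place of ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the question's tables.
    CREATE TABLE Payments (AgreementId INTEGER, Amount REAL);
    CREATE TABLE Allocations (AgreementId INTEGER, Amount REAL);
    INSERT INTO Payments VALUES (1, 100.0), (1, 50.0), (2, 80.0);
    INSERT INTO Allocations VALUES (1, 30.0);
""")

rows = conn.execute("""
    SELECT Id, TotalDue, AllocatedToDate,
           TotalDue - AllocatedToDate AS NewColumn  -- aliases are visible here
    FROM (
        SELECT p.AgreementId AS Id,
               SUM(p.Amount) AS TotalDue,
               COALESCE((SELECT SUM(a.Amount)
                         FROM Allocations a
                         WHERE a.AgreementId = p.AgreementId), 0) AS AllocatedToDate
        FROM Payments p
        GROUP BY p.AgreementId
    ) AS x
    ORDER BY Id
""").fetchall()

for row in rows:
    print(row)
```

The inner query names the two totals once; the outer query can then combine those names freely, so further calculated columns only touch the outer SELECT.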

Multiple Select against one CTE

I have a CTE query filtering a table Student
Student
(
StudentId PK,
FirstName ,
LastName,
GenderId,
ExperienceId,
NationalityId,
CityId
)
Based on a lot of filters (multiple cities, gender, multiple experiences (1, 2, 3), multiple nationalities), I create a CTE using dynamic SQL, joining the Student table with user-defined tables (CityTable, NationalityTable, ...).
After that I have to retrieve the count of students for each filter, like
CityId City Count
NationalityId Nationality Count
and the same for the other filters.
Can I do something like
;With CTE AS (
Select
FROM Student
Inner JOIN ...
INNER JOIN ....)
SELECT CityId,City,Count(studentId)
FROm CTE
GROUP BY CityId,City
SELECT GenderId,Gender,Count(studentId)
FROM CTE
GROUP BY GenderId,Gender
I want to do something like what LinkedIn does with search (people search, job search):
http://www.linkedin.com/search/fpsearch?type=people&keywords=sales+manager&pplSearchOrigin=GLHD&pageKey=member-home
It is very fast and does the same thing.
You cannot run multiple SELECT statements against one CTE, but you can define more than one CTE like this:
WITH CTEA
AS
(
SELECT 'Coulmn1' A,'Coulmn2' B
),
CETB
AS
(
SELECT 'CoulmnX' X,'CoulmnY' Y
)
SELECT * FROM CTEA, CETB
To get the count, use ROW_NUMBER() and COUNT(...) OVER() in the CTE, something like this:
ROW_NUMBER() OVER ( ORDER BY COLUMN NAME )AS RowNumber,
Count(1) OVER() AS TotalRecordsFound
Please let me know if you need more information on this.
Sample for your reference.
With CTE AS (
Select StudentId, S.CityId, S.GenderId
FROM Student S
Inner JOIN CITY C
ON S.CityId = C.CityId
INNER JOIN GENDER G
ON S.GenderId = G.GenderId)
,
GENDER
AS
(
SELECT GenderId
FROM CTE
GROUP BY GenderId
)
SELECT * FROM GENDER, CTE
It is not possible to get multiple result sets from a single CTE.
You can however use a table variable to cache some of the information and use it later instead of issuing the same complex query multiple times:
declare @relevantStudent table (StudentID int);
insert into @relevantStudent
select s.StudentID from Student s
join ...
where ...
-- now issue the multiple queries
select s.GenderID, count(*)
from student s
join @relevantStudent r on r.StudentID = s.StudentID
group by s.GenderID
select s.CityID, count(*)
from student s
join @relevantStudent r on r.StudentID = s.StudentID
group by s.CityID
The trick is to store only the minimum required information in the table variable.
As with any query whether this will actually improve performance vs. issuing the queries independently depends on many things (how big the table variable data set is, how complex is the query used to populate it and how complex are the subsequent joins/subselects against the table variable, etc.).
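The "filter once, count many times" idea can be sketched outside SQL Server as well. The example below uses Python's sqlite3 module, with a TEMP table standing in for the table variable (SQLite has no table variables). The sample data, the ExperienceId filter, and the helper count_by are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, GenderId INTEGER,
                          CityId INTEGER, ExperienceId INTEGER);
    -- Hypothetical sample data.
    INSERT INTO Student VALUES
        (1, 1, 10, 3), (2, 2, 10, 3), (3, 1, 20, 1),
        (4, 2, 20, 3), (5, 2, 10, 2);

    -- Run the (potentially expensive) filter once and keep only the keys.
    CREATE TEMP TABLE relevantStudent (StudentId INTEGER PRIMARY KEY);
    INSERT INTO relevantStudent
        SELECT StudentId FROM Student WHERE ExperienceId IN (2, 3);
""")

def count_by(conn, column):
    """Count the cached students grouped by one whitelisted column."""
    if column not in ("GenderId", "CityId"):
        raise ValueError(f"unsupported column: {column}")
    sql = f"""
        SELECT s.{column}, COUNT(*)
        FROM Student s
        JOIN relevantStudent r ON r.StudentId = s.StudentId
        GROUP BY s.{column}
        ORDER BY s.{column}
    """
    return conn.execute(sql).fetchall()

print(count_by(conn, "GenderId"))  # [(1, 1), (2, 3)]
print(count_by(conn, "CityId"))    # [(10, 3), (20, 1)]
```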
Do a UNION ALL to do multiple SELECT and concatenate the results together into one table.
;WITH CTE AS(
SELECT
FROM Student
INNER JOIN ...
INNER JOIN ....)
SELECT CityId,City,Count(studentId),NULL,NULL,NULL
FROM CTE
GROUP BY CityId,City
UNION ALL
SELECT NULL,NULL,NULL,GenderId,Gender,Count(studentId)
FROM CTE
GROUP BY GenderId,Gender
Note: The NULL values above just allow the two results to have matching columns, so the results can be concatenated.
I know this is a very old question, but here's a solution I just used. I have a stored procedure that returns a PAGE of search results, and I also need it to return the total count matching the query parameters.
WITH results AS (...complicated foo here...)
SELECT results.*,
CASE
WHEN @page=0 THEN (SELECT COUNT(*) FROM results)
ELSE -1
END AS totalCount
FROM results
ORDER BY bar
OFFSET @page * @pageSize ROWS FETCH NEXT @pageSize ROWS ONLY;
With this approach, there's a small "hit" on the first results page to get the count, and for the remaining pages, I pass back "-1" to avoid the hit (I assume the number of results won't change during the user session). Even though totalCount is returned for every row of the first page of results, it's only computed once.
My CTE is doing a bunch of filtering based on stored procedure arguments, so I couldn't just move it to a view and query it twice. This approach avoids having to duplicate the CTE's logic just to get a count.

Generate Row Serial Numbers in SQL Query

I have a customer transaction table. I need to create a query that includes a serial number pseudo column. The serial number should be automatically reset and start over from 1 upon change in customer ID.
Now, I am familiar with the row_number() function in SQL. This doesn't exactly solve my problem because, to the best of my knowledge, the serial number will not be reset if the order of the rows changes.
I want to do this in a single query (SQL Server) and without having to go through any temporary table usage etc. How can this be done?
Sometimes we don't want to impose an ordering on the result set just to add a serial number. But ROW_NUMBER() requires an ORDER BY clause, so we can use a simple trick to avoid any real ordering:
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS ItemNo, ItemName FROM ItemMastetr
This way we don't have to order the result set; we just add ItemNo to it. The numbers are then assigned in whatever order the engine returns the rows, which is not guaranteed to be stable.
select
ROW_NUMBER() Over (Order by CustomerID) As [S.N.],
CustomerID ,
CustomerName,
Address,
City,
State,
ZipCode
from Customers;
I'm not certain, based on your question, whether you want numbered rows that remember their numbers even if the underlying data changes (and gives a different ordering). But if you just want numbered rows that reset on a change in customer ID, try the PARTITION BY clause of row_number():
row_number() over(partition by CustomerID order by CustomerID)
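Here is a quick way to see the reset behaviour, sketched with Python's sqlite3 module instead of SQL Server. The Transactions table and its rows are hypothetical. Ordering by TxnId inside each partition is my assumption, chosen so the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Transactions (TxnId INTEGER PRIMARY KEY,
                               CustomerID INTEGER, Amount REAL);
    -- Hypothetical sample data, interleaving two customers.
    INSERT INTO Transactions VALUES
        (1, 100, 9.5), (2, 100, 20.0), (3, 200, 5.0),
        (4, 100, 1.0), (5, 200, 7.25);
""")

rows = conn.execute("""
    SELECT CustomerID, TxnId,
           ROW_NUMBER() OVER (
               PARTITION BY CustomerID   -- numbering restarts per customer
               ORDER BY TxnId            -- deterministic order inside a customer
           ) AS SerialNo
    FROM Transactions
    ORDER BY CustomerID, TxnId
""").fetchall()

for row in rows:
    print(row)
# (100, 1, 1)
# (100, 2, 2)
# (100, 4, 3)
# (200, 3, 1)
# (200, 5, 2)
```

The serial number is computed at query time, so it always starts from 1 for each customer regardless of how the rows are stored.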
Implementing Serial Numbers Without Ordering by Any of the Columns
Demo SQL script:
IF OBJECT_ID('Tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Names VARCHAR(75), Random_No INT);
INSERT INTO #TestTable (Names,Random_No) VALUES
('Animal', 363)
,('Bat', 847)
,('Cat', 655)
,('Duet', 356)
,('Eagle', 136)
,('Frog', 784)
,('Ginger', 690);
SELECT * FROM #TestTable;
There are many ways to implement serial numbers in SQL Server. Here we use the simple ROW_NUMBER() function to generate them.
ROW_NUMBER() is a window function that numbers rows sequentially (for example 1, 2, 3, …). It is a temporary value calculated when the query runs. It must have an OVER clause with ORDER BY, so we cannot simply omit the ORDER BY clause. But we can write it as below:
SQL script:
IF OBJECT_ID('Tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Names VARCHAR(75), Random_No INT);
INSERT INTO #TestTable (Names,Random_No) VALUES
('Animal', 363)
,('Bat', 847)
,('Cat', 655)
,('Duet', 356)
,('Eagle', 136)
,('Frog', 784)
,('Ginger', 690);
SELECT Names,Random_No,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS SERIAL_NO FROM #TestTable;
In the above query, we can also use SELECT 1, SELECT 'ABC', or SELECT '' instead of SELECT NULL. The result is the same.
SELECT ROW_NUMBER() OVER (ORDER BY ColumnName1) As SrNo, ColumnName1, ColumnName2 FROM TableName
select ROW_NUMBER() over (order by pk_field ) as srno
from TableName
Using Common Table Expression (CTE)
WITH CTE AS(
SELECT ROW_NUMBER() OVER(ORDER BY CustomerId) AS RowNumber,
Customers.*
FROM Customers
)
SELECT * FROM CTE
I found a solution for MySQL: it is easy to add a SrNo column, a kind of temporary auto-increment column, with this query:
SELECT @ab:=@ab+1 as SrNo, tablename.* FROM tablename, (SELECT @ab:= 0)
AS ab
ALTER function dbo.FN_ReturnNumberRows(@Start int, @End int) returns @Numbers table (Number int) as
begin
insert into @Numbers
select n = ROW_NUMBER() OVER (ORDER BY n)+@Start-1 from (
select top (@End-@Start+1) 1 as n from information_schema.columns as A
cross join information_schema.columns as B
cross join information_schema.columns as C
cross join information_schema.columns as D
cross join information_schema.columns as E) X
return
end
GO
select * from dbo.FN_ReturnNumberRows(10,9999)
