SQL duplicates with percentage result - sql-server

I have a table in my database with these columns:
Program_ID, Vehicle_VIN, Vehicle_Type
I need to create a report with these columns:
Program_ID, AmountOfAllPoliciesInProgram, PercentageDuplicatesInProgram
restricted to rows where Vehicle_Type = 10.
A row counts as a duplicate when its Vehicle_VIN appears more than once in the table for the same Program_ID.
I'm using Microsoft SQL Server (via SQL Server Management Studio).
So far I can compute AmountOfAllPoliciesInProgram:
SELECT
PROGRAM_ID, COUNT(*) AS AmountOfAllPoliciesInProgram
FROM
dbo.table
WHERE
Vehicle_type = 10
GROUP BY
PROGRAM_ID

I'm not 100% sure I understand you correctly, but you can count the distinct Vehicle_VIN values and use a HAVING clause to test whether that count is larger than 1, like so:
SELECT PROGRAM_ID, COUNT(*) as AmountOfAllPoliciesInProgram, count(distinct Vehicle_VIN) as VIN_COUNT
FROM dbo.table
where Vehicle_type = 10
group by PROGRAM_ID
having count(distinct Vehicle_VIN)>1
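The query above filters programs but does not yet produce the requested percentage. A minimal sketch of one way to compute it follows, assuming "duplicate" means a policy whose Vehicle_VIN occurs more than once within the same Program_ID, and assuming the table is called dbo.Policies (a hypothetical name used only for illustration):

```sql
-- Sketch only: dbo.Policies is a hypothetical table name.
-- Step 1: tag every policy with how often its VIN occurs in its program.
WITH Counted AS (
    SELECT Program_ID,
           Vehicle_VIN,
           COUNT(*) OVER (PARTITION BY Program_ID, Vehicle_VIN) AS VinCount
    FROM dbo.Policies
    WHERE Vehicle_Type = 10
)
-- Step 2: per program, count all policies and the share with a repeated VIN.
SELECT Program_ID,
       COUNT(*) AS AmountOfAllPoliciesInProgram,
       100.0 * SUM(CASE WHEN VinCount > 1 THEN 1 ELSE 0 END) / COUNT(*)
           AS PercentageDuplicatesInProgram
FROM Counted
GROUP BY Program_ID;
```

Multiplying by 100.0 rather than 100 keeps the division from being done in integer arithmetic. If only the extra copies should count as duplicates (not the first occurrence), the numerator would instead be COUNT(*) - COUNT(DISTINCT Vehicle_VIN).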

Related

SQL Server : Update WHERE xxx AND most recent record

Similar to this question, but I require this in Microsoft SQL Server.
I'm looking to update values in my database, but only the most recently added value for each date.
For example if I have 4 rows where column_day = '2023-02-01',
and they have updated datetimes in a column [Time_Stamp] of
2023-01-01, 2023-01-27, 2023-01-10, and 2023-01-05,
I would like to update only the 2023-01-27 row.
The answer provided there, which does not work in Microsoft SQL Server, is:
UPDATE your_table
SET some_column = 1
ORDER BY date_time_column DESC
LIMIT 1
The code I'm trying is slightly different, as it includes a WHERE. It fails with the error
Incorrect syntax near the keyword 'order'
UPDATE your_table
SET some_column = 1
WHERE column_tag = 'test' AND column_day = '2023-02-01'
ORDER BY Time_Stamp DESC
LIMIT 1
How can I achieve this in Microsoft SQL Server?
You can use TOP in an UPDATE; however, you can't provide an ORDER BY with it. As such, it's really meant for batching, where rows that were already UPDATEd are filtered out by the WHERE clause in each iteration.
For what you want, one method would be to use an UPDATEable CTE with ROW_NUMBER:
WITH CTE AS(
    SELECT SomeColumn,
           -- Number the matching rows, newest Time_Stamp first
           ROW_NUMBER() OVER (ORDER BY Time_Stamp DESC) AS RN
    FROM dbo.YourTable
    WHERE ColumnTag = 'Test'
      AND ColumnDay = '20230201')
-- Updating the CTE updates the underlying table row
UPDATE CTE
SET SomeColumn = 1
WHERE RN = 1;
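The question also mentions wanting the most recent row "for each date". A possible variation, sketched under the assumption that the timestamp column is Time_Stamp as in the question and that the other names match the answer's placeholders, drops the single-day filter and restarts the numbering per day with PARTITION BY:

```sql
-- Sketch only: table and column names are the placeholders used above.
WITH CTE AS (
    SELECT SomeColumn,
           -- Numbering restarts at 1 for every ColumnDay, newest first
           ROW_NUMBER() OVER (PARTITION BY ColumnDay
                              ORDER BY Time_Stamp DESC) AS RN
    FROM dbo.YourTable
    WHERE ColumnTag = 'Test'
)
UPDATE CTE
SET SomeColumn = 1
WHERE RN = 1;  -- the latest row of each day
```

If two rows in the same day share the same Time_Stamp, which one gets RN = 1 is not deterministic; adding a unique column to the ORDER BY would settle that.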

SQL Server : return id column using max on different column

In my table I have the columns id, userId and points. As a result of my query I would like to have the id of the record that contains the highest points, per user.
In my experience (more with MySQL than SQL Server) I would use the following query to get this result:
SELECT id, userId, max(points)
FROM table
GROUP BY userId
But SQL Server does not allow this, because all columns in the select should also be in the GROUP BY or be an aggregate function.
This is a hypothetical situation. My actual situation is a lot more complicated!
Use the ROW_NUMBER window function in SQL Server:
SELECT *
FROM
(
    SELECT ROW_NUMBER() OVER (PARTITION BY userId ORDER BY points DESC) AS Rn, *
    FROM yourtable
) A
WHERE Rn = 1
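When several rows of one user share the highest points, ROW_NUMBER picks one of them arbitrarily. A hedged variation, assuming that the lowest id should win such ties (an assumption, not something the question states), lists the columns explicitly and adds id as a tie-breaker:

```sql
-- Sketch only: the tie-breaking rule (lowest id wins) is an assumption.
WITH Ranked AS (
    SELECT id, userId, points,
           ROW_NUMBER() OVER (PARTITION BY userId
                              ORDER BY points DESC, id ASC) AS Rn
    FROM yourtable
)
SELECT id, userId, points
FROM Ranked
WHERE Rn = 1;
```

Replacing ROW_NUMBER with RANK would instead return every row tied for the top score.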

SQL Server : error which works well in Oracle

I'm migrating from an Oracle database to SQL Server 2012. Some SQL which works well in Oracle doesn't work with SQL Server.
The following is my SQL and the error.
SELECT
SUM(COUNT(DISTINCT dfc.rentalNumber))
FROM
DueFromClient dfc
WHERE
dfc.facilityId=:facilityId
AND dfc.isRentalComponent = 1
GROUP BY
dfc.rentalNumber
and the error is
Cannot perform an aggregate function on an expression containing an aggregate or a subquery
Remove the SUM from the query: SQL Server does not allow one aggregate function nested inside another. Note that, with the GROUP BY kept, this returns one row per rentalNumber rather than a single total.
SELECT
COUNT(DISTINCT dfc.rentalNumber)
FROM DueFromClient dfc
WHERE dfc.facilityId = :facilityId
AND dfc.isRentalComponent = 1
GROUP BY dfc.rentalNumber
On its own this output doesn't tell you much, so I suggest also adding rentalNumber to the SELECT, which makes sense of the data and makes full use of the GROUP BY.
SELECT
COUNT(DISTINCT dfc.rentalNumber)
, dfc.rentalNumber
FROM DueFromClient dfc
WHERE dfc.facilityId = :facilityId
AND dfc.isRentalComponent = 1
GROUP BY dfc.rentalNumber
You don't need both SUM and GROUP BY here. Try this:
SELECT
COUNT(DISTINCT dfc.rentalNumber)
FROM
DueFromClient dfc
WHERE
dfc.facilityId=:facilityId
AND dfc.isRentalComponent = 1
sqlfiddle for this query
The options below are suboptimal and return the same result as the query above, but they may suit @Sachi Pj because they keep the SUM(COUNT(DISTINCT ...)) construction of the original query. Two more options:
SELECT SUM(dfc2.rentalNumber)
FROM
(
SELECT
COUNT(DISTINCT dfc.rentalNumber) rentalNumber
FROM
DueFromClient dfc
WHERE
dfc.facilityId=:facilityId
AND dfc.isRentalComponent = 1
) AS dfc2
GROUP BY dfc2.rentalNumber
sqlfiddle for this query
And without GROUP BY, since duplicates were already eliminated by DISTINCT:
SELECT SUM(dfc2.rentalNumber)
FROM
(
SELECT
COUNT(DISTINCT dfc.rentalNumber) rentalNumber
FROM
DueFromClient dfc
WHERE
dfc.facilityId=:facilityId
AND dfc.isRentalComponent = 1
) AS dfc2
sqlfiddle for this query
You can compare against the sqlfiddle for your original query to confirm that the results are the same.
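For readers who want the most literal translation of Oracle's nested aggregate, a sketch follows. It computes the per-group counts in a derived table and sums them outside. The :facilityId bind parameter is replaced by a T-SQL variable here; its INT type and the value 1 are assumptions made only so the snippet is self-contained.

```sql
-- Sketch only: the variable's type and value are hypothetical.
DECLARE @facilityId INT = 1;

SELECT SUM(per_rental.distinctCount) AS total
FROM (
    -- Inner query: one row per rentalNumber, as the original GROUP BY did
    SELECT COUNT(DISTINCT dfc.rentalNumber) AS distinctCount
    FROM DueFromClient dfc
    WHERE dfc.facilityId = @facilityId
      AND dfc.isRentalComponent = 1
    GROUP BY dfc.rentalNumber
) AS per_rental;
```

Each group contributes 1 (or 0 for a NULL rentalNumber), so the sum matches a plain COUNT(DISTINCT dfc.rentalNumber) without GROUP BY whenever at least one row qualifies; with no qualifying rows, SUM yields NULL where COUNT would yield 0.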

Function to generate incrementing numbers in SQL Server

Is there any way to select from a function and have it return incrementing numbers?
For example, do this:
SELECT SomeColumn, IncrementingNumbersFunction() FROM SomeTable
And have it return:
SomeColumn | IncrementingNumbers
--------------------------------
some text | 0
something | 1
foo | 2
On SQL Server 2005 and up you can use ROW_NUMBER():
SELECT SomeColumn,
ROW_NUMBER() OVER(Order by SomeColumn) as IncrementingNumbers
FROM SomeTable
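ROW_NUMBER() starts at 1, while the expected output in the question starts at 0. A small sketch of the adjustment, using the same placeholder names:

```sql
-- Subtract 1 so that the numbering starts at 0 instead of 1.
SELECT SomeColumn,
       ROW_NUMBER() OVER (ORDER BY SomeColumn) - 1 AS IncrementingNumbers
FROM SomeTable;
```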
On SQL Server 2000 you can use an identity column, but if rows have been deleted the numbers will have gaps.
To avoid gaps on SQL Server 2000, insert into a temp table with a fresh identity column and then select out of it:
SELECT SomeColumn,
IDENTITY( INT,1,1) AS IncrementingNumbers
INTO #temp
FROM SomeTable
ORDER BY SomeColumn
SELECT * FROM #temp
ORDER BY IncrementingNumbers
I think you're looking for ROW_NUMBER, added in SQL Server 2005. It "returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition."
From MSDN (where there's plenty more) the following example returns the ROW_NUMBER for the salespeople in AdventureWorks2008R2 based on the year-to-date sales.
SELECT FirstName, LastName, ROW_NUMBER() OVER(ORDER BY SalesYTD DESC) AS 'Row Number', SalesYTD, PostalCode
FROM Sales.vSalesPerson
WHERE TerritoryName IS NOT NULL AND SalesYTD <> 0;
You could use an auto-increment identity column, or do I misunderstand the question?
http://msdn.microsoft.com/en-us/library/Aa933196
Starting with SQL Server 2012, there is a built-in sequence feature.
Example:
-- sql to create the Sequence object
CREATE SEQUENCE dbo.MySequence
AS int
START WITH 1
INCREMENT BY 1 ;
GO
-- use the sequence
SELECT NEXT VALUE FOR dbo.MySequence;
SELECT NEXT VALUE FOR dbo.MySequence;
SELECT NEXT VALUE FOR dbo.MySequence;
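To number the rows of a query with the sequence, as the question asks, NEXT VALUE FOR accepts an OVER (ORDER BY ...) clause. A sketch, assuming the dbo.MySequence object created above and the SomeTable and SomeColumn placeholders from the question:

```sql
-- Sketch only: assumes dbo.MySequence already exists.
-- Each row draws the next sequence value, assigned in SomeColumn order.
SELECT SomeColumn,
       NEXT VALUE FOR dbo.MySequence OVER (ORDER BY SomeColumn)
           AS IncrementingNumbers
FROM SomeTable;
```

Unlike ROW_NUMBER(), the sequence does not restart for each query: a second run continues from where the first one stopped unless the sequence is reset with ALTER SEQUENCE ... RESTART.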
No, there are no sequence-generation functions in SQL Server (this holds for versions before SQL Server 2012). You might find identity columns handy for resolving your current issue.

How to create RowNum column in SQL Server?

In Oracle we have "rownum".
What can I do in SQL Server?
In SQL Server 2005 (and 2008) you can use the ROW_NUMBER function, coupled with the OVER clause to determine the order in which the rows should be counted.
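A minimal sketch of that approach, assuming a table named mytable with an id column to order by (both names are placeholders, not taken from the question):

```sql
-- Sketch only: mytable and id are hypothetical names.
-- The OVER clause decides the order in which rows are numbered.
SELECT ROW_NUMBER() OVER (ORDER BY id) AS rownum,
       t.*
FROM mytable AS t
ORDER BY id;
```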
Update
Hmm. I don't actually know what the Oracle version does. If it's giving you a unique number per row (across the entire table), then I'm not sure there's a way to do that in SQL Server. SQL Server's ROW_NUMBER() only works for the rows returned in the current query.
If you have an id column, you can do this:
select a.*,
(select count(*) from mytable b where b.id <= a.id) as rownum
from mytable a
order by id;
Of course, this only works where you're able to order rownums in the same (or opposite) order as the order of the ids.
If you're selecting a proper subset of rows, of course you need to apply the same predicate to the whole select and to the subquery:
select a.*,
(select count(*) from mytable b where b.id <= a.id and b.foo = 'X') as rownum
from mytable a where a.foo = 'X'
order by id;
Obviously, this is not particularly efficient.
Based on my understanding, you'd need to use ranking functions and/or the TOP clause. The SQL Server features are separate, while the Oracle one combines the two concepts.
The ranking function is simple; here is where you'd use TOP instead.
Note: you can't filter on ROW_NUMBER() directly in a WHERE clause...
--Oracle:
select
column_1, column_2
from
table_1, table_2
where
field_3 = 'some value'
and rownum < 5
--MSSQL:
select top 4
column_1, column_2
from
table_1, table_2
where
field_3 = 'some value'
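As noted, ROW_NUMBER() cannot appear directly in a WHERE clause; the usual workaround is to compute it in a derived table (or CTE) and filter in the outer query. A sketch, simplified to a single table and assuming that field_3 lives in table_1 and that column_1 defines the desired order (both assumptions):

```sql
-- Sketch only: the ordering column and single-table FROM are assumptions.
SELECT column_1, column_2
FROM (
    SELECT column_1, column_2,
           ROW_NUMBER() OVER (ORDER BY column_1) AS rn
    FROM table_1
    WHERE field_3 = 'some value'
) AS numbered
WHERE rn < 5;  -- the filter is allowed here, in the outer query
```

The same shape also covers ranges that TOP alone cannot express, for example WHERE rn BETWEEN 5 AND 8.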

Resources