SQL Server: problems with max and IN clause - sql-server

I have this table in SQL Server 2008:
create table [hFUNDASSET]
(
[hFundAssetID] numeric(19,0) identity not null,
[fundAssetID] numeric(19,0) not null,
[modified] datetime not null,
primary key ([hFundAssetID])
);
It's a history table. The goal is to get, for every distinct fundAssetID, the hFundAssetID closest to a given maximum modified timestamp.
The following query gets the correct latest modified for each fundAssetID:
select max(modified), fundAssetID
from hFundAsset
where fundAssetID IN
(
select distinct (fundAssetID)
from hFundAsset where modified < 'April 20, 2010 11:13:00'
)
group by fundAssetID;
I can't put hFundAssetID in the select clause without also adding it to the group by, which would return extra rows. Somehow, I need the hFundAssetID matching each of the (modified, fundAssetID) pairs returned by the above query, or an equivalent. But SQL Server doesn't allow multiple columns in the IN clause, according to the docs:
"subquery - Is a subquery that has a result set of one column.
This column must have the same data type as test_expression."
Googling shows that 'exists' and joins are typically used with mssql in these cases, but I've tried that using 'max' and 'group by' and I'm having problems getting it to work. Any help appreciated.

Try this:
WITH qry AS
(
SELECT a.*,
RANK() OVER(PARTITION BY fundAssetID ORDER BY modified DESC) rnk
FROM hFundAsset a
WHERE modified < 'April 20, 2010 11:13:00'
)
SELECT *
FROM qry
WHERE rnk = 1
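For comparison, the join-based form the question was reaching for can be sketched as follows. This is only a sketch against the hFundAsset table from the question; the aliases h, latest and maxModified are names introduced here for illustration. It follows the answer above in considering only rows before the cutoff, and, like RANK(), it returns more than one row for a fundAssetID if two history rows share the same modified value.

```sql
-- Sketch: join each history row to the latest qualifying timestamp of its fundAssetID.
-- "latest" and "maxModified" are illustrative names, not part of the original schema.
SELECT h.hFundAssetID, h.fundAssetID, h.modified
FROM hFundAsset AS h
INNER JOIN
(
    SELECT fundAssetID, MAX(modified) AS maxModified
    FROM hFundAsset
    WHERE modified < '2010-04-20T11:13:00'   -- ISO 8601 literal, independent of language settings
    GROUP BY fundAssetID
) AS latest
    ON  latest.fundAssetID = h.fundAssetID
    AND latest.maxModified = h.modified;
```

If exactly one row per fundAssetID is required, ROW_NUMBER() with a tie-breaker such as ORDER BY modified DESC, hFundAssetID DESC is probably the simpler route.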

Related

Efficient limit result set in SQL window function

My question would be better served as a comment on "Limit result set in sql window function", but I don't have the necessary reputation to comment.
Given a table of moving vehicle locations, for each vehicle I wish to find the most recent recorded position (and other data about the vehicle at that time). Based on answers in the other question, I can run a query like:
Table definition:
CREATE TABLE VehiclePositions
(
Id BIGINT NOT NULL,
VehicleID NVARCHAR(12) NULL,
Timestamp DATETIME NULL,
PositionX FLOAT NULL,
PositionY FLOAT NULL,
PositionZ SMALLINT NULL,
Speed SMALLINT NULL,
Heading SMALLINT NULL
)
Query:
select *
from
(select
*,
row_number() over (partition by VehicleID order by Timestamp desc) as ranking
from VehiclePositions) as x
where
ranking = 1
Now, the problem is that this does a full table scan. I thought that by creating an appropriate index, I could avoid this:
CREATE INDEX idx_VehicPosition ON VehiclePositions(VehicleID, Timestamp);
However, SQL Server will happily ignore this index in the query and still perform the table scan.
Note: I can get SQL Server to use the index, but the code is rather ugly:
-- Collect the distinct vehicle IDs in a table variable
DECLARE @ids TABLE (id NVARCHAR(12) UNIQUE)
INSERT INTO @ids
SELECT DISTINCT VehicleID
FROM VehiclePositions
-- Latest row per vehicle, restricted to the collected IDs
SELECT vp.*
FROM VehiclePositions vp
WHERE Timestamp = (SELECT Max(Timestamp) FROM VehiclePositions vp2
WHERE vp2.VehicleID = vp.VehicleID)
AND VehicleID IN (SELECT DISTINCT id FROM @ids)
(The VehicleID IN... is because it seems SQL Server doesn't implement seek-skip optimisations. It still comes up with a pretty non-optimal query plan that visits the index twice, but at least it doesn't execute in linear time).
Is there a way to make SQL Server run the window function query intelligently?
I'm using SQL Server 2014...
Help will be appreciated
What I would do:
SELECT *
FROM
(SELECT MAX(Timestamp) as maxtime,
VehicleID
FROM VehiclePositions
GROUP BY VehicleID ) as maxed INNER JOIN
(SELECT Id ,
VehicleID ,
Timestamp ,
PositionX ,
PositionY,
PositionZ,
Speed ,
Heading
FROM VehiclePositions) as vals
ON maxed.maxtime = vals.Timestamp
AND maxed.VehicleID = vals.VehicleID
To my knowledge, you can't get around your index being scanned twice.
As long as you are selecting all vehicles from the table and all columns (or at least columns that are not in your index), I would expect the table scan to keep popping up.
In many cases, that will actually be the most efficient query plan. Only if you have many rows per vehicle (say, several pages' worth) might a seek strategy be faster.
If you do have a lot of rows per vehicle, you might consider partitioning your table on Timestamp...
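If a seek-per-vehicle plan is what you are after, a commonly used pattern is CROSS APPLY with TOP (1). The sketch below assumes the idx_VehicPosition index from the question exists. It derives the vehicle list with SELECT DISTINCT because the question does not mention a separate vehicles table (if you have one, use that instead). The aliases v and latest are introduced here. Whether the optimizer actually chooses seeks depends on statistics and row counts, so check the execution plan.

```sql
-- Sketch: one TOP (1) lookup per vehicle, which can be satisfied by a seek on
-- idx_VehicPosition (VehicleID, Timestamp). Aliases v and latest are illustrative.
-- Rows with a NULL VehicleID are not returned, because NULL = NULL is not true.
SELECT latest.*
FROM
(
    SELECT DISTINCT VehicleID
    FROM VehiclePositions
) AS v
CROSS APPLY
(
    SELECT TOP (1) vp.*
    FROM VehiclePositions AS vp
    WHERE vp.VehicleID = v.VehicleID
    ORDER BY vp.[Timestamp] DESC
) AS latest;
```

Unlike the Timestamp = MAX(Timestamp) form, this returns exactly one row per vehicle even when two positions share the same latest timestamp; which of the tied rows comes back is arbitrary.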
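In databases that support the QUALIFY clause (Teradata and Snowflake, for example), you can filter on a window function directly. SQL Server does not support QUALIFY, so this will not run there: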
select *
from VehiclePositions
qualify row_number() over (partition by VehicleID order by Timestamp desc) = 1

MS SQL Server cannot call methods on ntext

I am trying to create a view in MS SQL server from a table. The table name is Account_Plan and I am trying to create a view as Account_Plan_vw. While executing the DDL to create the view, I am getting the error as shown below.
Msg 258, Level 15, State 1, Procedure Account_Plan_vw, Line 56
Cannot call methods on ntext
Msg 207, Level 16, State 1, Procedure Account_Plan_vw, Line 22
Invalid column name 'How_the_CU_will_achieve_these_objective2__c'.
The error message shows the column 'How_the_CU_will_achieve_these_objective2__c' as invalid. However, this is a valid column in the Account_Plan table of ntext type.
Can someone help? I just removed the extra columns from the Create view statement.
CREATE VIEW [dbo].[Account_Plan_vw]
AS
SELECT
Results_1.Account__c
,Results_1.How_the_CU_will_achieve_these_objectives__c
,Results_1.How_the_CU_will_achieve_these_objective2__c
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY apc1.Account__c ORDER BY apc1.Year__c DESC, apc1.CreatedDate DESC) AS RN_1
,apc1.Account__c
,apc1.How_the_CU_will_achieve_these_objectives__c
,apc1.How_the_CU_will_achieve_these_objective2__c
FROM Account_Plan apc1
INNER JOIN RecordType rtp1
ON apc1.RecordTypeId=rtp1.[Id]
AND rtp1.DeveloperName = 'Account_Plan'
INNER JOIN Account acc1
ON acc1.[Id] = apc1.Account__c
WHERE apc1.Year__c <= YEAR(GETDATE())
) AS Results_1
WHERE RN_1 = 1
NTEXT is deprecated; convert it to NVARCHAR(MAX) instead.
See: ntext, text, and image (Transact-SQL).
You should consider altering the table rather than just casting in the view, but here is the cast version:
CREATE VIEW [dbo].[Account_Plan_vw]
AS
SELECT
results_1.Account__c
, results_1.How_the_CU_will_achieve_these_objectives__c
, results_1.How_the_CU_will_achieve_these_objective2__c
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY apc1.Account__c ORDER BY apc1.Year__c DESC, apc1.CreatedDate DESC) AS rn_1
, apc1.Account__c
, apc1.How_the_CU_will_achieve_these_objectives__c
, cast(apc1.How_the_CU_will_achieve_these_objective2__c as nvarchar(max)) as How_the_CU_will_achieve_these_objective2__c
FROM Account_Plan apc1
INNER JOIN RecordType rtp1 ON apc1.RecordTypeId = rtp1.[Id]
AND rtp1.DeveloperName = 'Account_Plan'
INNER JOIN Account acc1 ON acc1.[Id] = apc1.Account__c
WHERE apc1.Year__c <= YEAR(GETDATE())
) AS results_1
WHERE rn_1 = 1
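The table-level alternative mentioned above can be sketched as follows. It assumes the table lives in the dbo schema and that the column should remain nullable; neither is stated in the question. If the table is regenerated by a replication tool, the change may be overwritten on the next run, so treat this as an illustration rather than a drop-in fix.

```sql
-- Sketch: change the column type on the base table instead of casting in the view.
-- Assumptions: dbo schema, column stays nullable. Try it on a copy of the table first.
ALTER TABLE dbo.Account_Plan
    ALTER COLUMN How_the_CU_will_achieve_these_objective2__c nvarchar(max) NULL;
```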
I found the issue, and it was somewhat cryptic to find. The Salesforce object had a field, last_peer_review_date__c, for which no permissions had been given to anybody. As a result, the DBAMP user could not see the field, so the field was not created in SQL Server when I ran the SF_Replicate command. I wrote the CREATE VIEW SQL a couple of weeks ago, and it worked at that time. Now, when I ran the same SQL, it failed because the SQL references the last_peer_review_date field, but the Account_Plan table does not have it.
Balaji Pooruli

SQL Server : return id column using max on different column

In my table I have the columns id, userId and points. As a result of my query I would like to have the id of the record that contains the highest points, per user.
In my experience (more with MySQL than SQL Server) I would use the following query to get this result:
SELECT id, userId, max(points)
FROM table
GROUP BY userId
But SQL Server does not allow this, because all columns in the select should also be in the GROUP BY or be an aggregate function.
This is a hypothetical situation. My actual situation is a lot more complicated!
Use the ROW_NUMBER window function in SQL Server:
Select * from
(
select Row_Number() over(partition by userId Order by points desc) Rn,*
From yourtable
) A
Where Rn = 1
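To make the behaviour concrete, here is a small self-contained sketch. The table variable @scores and its rows are made-up sample data, and the extra id ASC in the ORDER BY is an assumption about how ties should be broken (the question does not say).

```sql
-- Sketch with made-up sample data; @scores and its rows are illustrative only.
DECLARE @scores TABLE (id int, userId int, points int);

INSERT INTO @scores (id, userId, points)
VALUES (1, 10, 5), (2, 10, 9), (3, 20, 7), (4, 20, 7);

SELECT id, userId, points
FROM
(
    SELECT id, userId, points,
           ROW_NUMBER() OVER (PARTITION BY userId
                              ORDER BY points DESC, id ASC) AS rn  -- id breaks ties
    FROM @scores
) AS ranked
WHERE rn = 1;
-- Returns id 2 for user 10 and id 3 for user 20 (the tie on 7 points goes to the lower id).
```

Without a tie-breaker, ROW_NUMBER still returns one row per user, but which of the tied rows wins is not deterministic.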

TSQL - extract data to table/view to speed up query

I use this statement to create a list for Excel:
SELECT DISTINCT Year, Version
FROM myView
WHERE id <> 'old'
ORDER BY Year DESC, Version DESC
The problem is that the execution time is over 30s because of the almost 2 million rows.
The result has only around 1000 rows.
What are my options to extract only those two columns in order to speed up the execution time? I also need to make sure that inserts to the underlying table are recognized.
Do I need a new table to copy the values from the view? And a trigger to manage the updates?
Thank you
So, presumably there's a table with Year and id underlying your view. Given this (trivial) example:
CREATE TABLE myTable ([id] varchar(10), [Year] int, [Version] int);
Just create an index on that table that matches the way you're querying your data. Given your query of:
SELECT DISTINCT Year, Version
FROM myView
WHERE id <> 'old'
ORDER BY Year DESC, Version DESC
This index matches the WHERE and ORDER BY clauses and should give you all the performance you need:
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[myTable]') AND name = N'IX_YearVersion_Filtered')
DROP INDEX [IX_YearVersion_Filtered] ON [dbo].[myTable] WITH ( ONLINE = OFF )
GO
CREATE NONCLUSTERED INDEX [IX_YearVersion_Filtered] ON [dbo].[myTable]
(
[Year] DESC,
[Version] DESC
)
WHERE ([id]<>'old')
GO
with cte_x
as
(SELECT Year, Version
FROM myView
WHERE id not in ('old')
group by Year, Version)
SELECT DISTINCT Year, Version
FROM cte_x
ORDER BY Year DESC, Version DESC
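On the question of materializing the two columns: an indexed view is one option that SQL Server keeps in sync with inserts automatically, without a trigger. The sketch below assumes myView reads from a single table dbo.myTable, as in the trivial example above; the view name, index name and RowCnt column are introduced here. Indexed views come with restrictions (schema binding, required SET options, deterministic expressions) and add some cost to every write on the base table, so this may or may not fit the real view.

```sql
-- Sketch: materialize the distinct (Year, Version) pairs as an indexed view.
-- dbo.myTable is the assumed base table; the view and index names are illustrative.
CREATE VIEW dbo.YearVersion_vw
WITH SCHEMABINDING
AS
SELECT [Year], [Version], COUNT_BIG(*) AS RowCnt   -- COUNT_BIG(*) is required with GROUP BY
FROM dbo.myTable
WHERE [id] <> 'old'
GROUP BY [Year], [Version];
GO

CREATE UNIQUE CLUSTERED INDEX IX_YearVersion_vw
    ON dbo.YearVersion_vw ([Year], [Version]);
GO

-- NOEXPAND asks SQL Server to read the view's own index instead of expanding the view.
SELECT [Year], [Version]
FROM dbo.YearVersion_vw WITH (NOEXPAND)
ORDER BY [Year] DESC, [Version] DESC;
```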

Function to generate incrementing numbers in SQL Server

Is there any way to select from a function and have it return incrementing numbers?
For example, do this:
SELECT SomeColumn, IncrementingNumbersFunction() FROM SomeTable
And have it return:
SomeColumn | IncrementingNumbers
--------------------------------
some text | 0
something | 1
foo | 2
On SQL Server 2005 and up you can use ROW_NUMBER():
SELECT SomeColumn,
ROW_NUMBER() OVER(Order by SomeColumn) as IncrementingNumbers
FROM SomeTable
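ROW_NUMBER() starts at 1, while the output in the question starts at 0, so subtracting 1 gets closer to what was asked. A minimal sketch, where @SomeTable and its rows are made-up stand-ins for the question's SomeTable (the multi-row VALUES syntax needs SQL Server 2008 or later):

```sql
-- Sketch with made-up sample rows; @SomeTable stands in for the question's SomeTable.
DECLARE @SomeTable TABLE (SomeColumn varchar(20));

INSERT INTO @SomeTable (SomeColumn)
VALUES ('some text'), ('something'), ('foo');

SELECT SomeColumn,
       ROW_NUMBER() OVER (ORDER BY SomeColumn) - 1 AS IncrementingNumbers  -- start at 0
FROM @SomeTable
ORDER BY IncrementingNumbers;
```

Note that the numbers follow the sort order of SomeColumn, not the order in which the rows were inserted.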
On SQL Server 2000, you can use an identity column, but if you have deletes you will have gaps.
SQL Server 2000 code: in case you have gaps in your regular table's identity column, insert into a temp table with IDENTITY and then select out of it:
SELECT SomeColumn,
IDENTITY( INT,1,1) AS IncrementingNumbers
INTO #temp
FROM SomeTable
ORDER BY SomeColumn
SELECT * FROM #temp
ORDER BY IncrementingNumbers
I think you're looking for ROW_NUMBER, added in SQL Server 2005. It "returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition."
From MSDN (where there's plenty more) the following example returns the ROW_NUMBER for the salespeople in AdventureWorks2008R2 based on the year-to-date sales.
SELECT FirstName, LastName, ROW_NUMBER() OVER(ORDER BY SalesYTD DESC) AS 'Row Number', SalesYTD, PostalCode
FROM Sales.vSalesPerson
WHERE TerritoryName IS NOT NULL AND SalesYTD <> 0;
You could use an auto-increment identity column, or do I misunderstand the question?
http://msdn.microsoft.com/en-us/library/Aa933196
Starting with SQL Server 2012, there is a built-in sequence feature.
Example:
-- sql to create the Sequence object
CREATE SEQUENCE dbo.MySequence
AS int
START WITH 1
INCREMENT BY 1 ;
GO
-- use the sequence
SELECT NEXT VALUE FOR dbo.MySequence;
SELECT NEXT VALUE FOR dbo.MySequence;
SELECT NEXT VALUE FOR dbo.MySequence;
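The sequence can also number the rows of a query, which is closer to what the question asks. A sketch, assuming dbo.MySequence has been created as above; @SomeTable and its rows are made up for illustration:

```sql
-- Sketch: number rows with dbo.MySequence; @SomeTable and its rows are made up.
DECLARE @SomeTable TABLE (SomeColumn varchar(20));

INSERT INTO @SomeTable (SomeColumn)
VALUES ('some text'), ('something'), ('foo');

-- The OVER clause fixes the order in which sequence values are handed out.
SELECT SomeColumn,
       NEXT VALUE FOR dbo.MySequence OVER (ORDER BY SomeColumn) AS IncrementingNumbers
FROM @SomeTable;
```

The numbers continue from wherever the sequence currently stands rather than restarting for each query, so ROW_NUMBER() is usually the better fit when every result set should start from the same value.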
Before SQL Server 2012, there was no built-in sequence generation in SQL Server. You might find an identity column handy to resolve your current issue.
