How can I convert the following SQL query to LINQ?
Select F1, F2
From [Table]
Where Convert(Varchar(10), [OnDate], 103) = '12/08/2019'
The correct way to filter datetime values by date only is to use cast( ... as date) :
declare @myDate date='20190812'
Select F1, F2
From [Table]
Where cast([OnDate] as date)=@myDate
This avoids any parsing and localization errors and can still take advantage of indexes that cover the OnDate column, thanks to the dynamic seek optimizations introduced at least as far back as SQL Server 2008 R2. Converting the column to a string, as the original query does, forces the server to scan the entire table and convert every date before comparing.
LINQ by itself doesn't query databases. It's a query language that gets translated by an ORM into actual SQL statements. Writing the equivalent of cast( ... as date) depends on the ORM.
In LINQ to SQL, calling DateTime.Date generates a cast( as date) :
var data = context.MyTable.Where(row => row.OnDate.Date == someDate.Date);
or
var data = from row in context.MyTable
where row.OnDate.Date == someDate.Date
select row;
EF doesn't recognize this and requires the DbFunctions.TruncateTime call :
var data = context.MyTable.Where(row => DbFunctions.TruncateTime(row.OnDate) == someDate.Date);
EF Core once again recognizes DateTime.Date :
var data = context.MyTable.Where(row => row.OnDate.Date == someDate.Date);
Here is the SQL
SELECT tal.TrustAccountValue
FROM TrustAccountLog AS tal
INNER JOIN TrustAccount ta ON ta.TrustAccountID = tal.TrustAccountID
INNER JOIN Users usr ON usr.UserID = ta.UserID
WHERE usr.UserID = 70402 AND
ta.TrustAccountID = 117249 AND
tal.trustaccountlogid =
(
SELECT MAX (tal.trustaccountlogid)
FROM TrustAccountLog AS tal
INNER JOIN TrustAccount ta ON ta.TrustAccountID = tal.TrustAccountID
INNER JOIN Users usr ON usr.UserID = ta.UserID
WHERE usr.UserID = 70402 AND
ta.TrustAccountID = 117249 AND
tal.TrustAccountLogDate < '3/1/2010 12:00:00 AM'
)
Basically, there is a Users table, a TrustAccount table and a TrustAccountLog table.
Users: Contains users and their details
TrustAccount: A User can have multiple TrustAccounts.
TrustAccountLog: Contains an audit of all TrustAccount "movements". A TrustAccount is associated with multiple TrustAccountLog entries.
Now this query executes in milliseconds inside SQL Server Management Studio, but for some strange reason it takes forever in my C# app and sometimes even times out (120s).
Here is the code in a nutshell. It gets called multiple times in a loop and the statement gets prepared.
cmd.CommandTimeout = Configuration.DBTimeout;
cmd.CommandText = @"SELECT tal.TrustAccountValue FROM TrustAccountLog AS tal
INNER JOIN TrustAccount ta ON ta.TrustAccountID = tal.TrustAccountID
INNER JOIN Users usr ON usr.UserID = ta.UserID
WHERE usr.UserID = @UserID1 AND
ta.TrustAccountID = @TrustAccountID1 AND
tal.trustaccountlogid =
(
SELECT MAX (tal.trustaccountlogid) FROM TrustAccountLog AS tal
INNER JOIN TrustAccount ta ON ta.TrustAccountID = tal.TrustAccountID
INNER JOIN Users usr ON usr.UserID = ta.UserID
WHERE usr.UserID = @UserID2 AND
ta.TrustAccountID = @TrustAccountID2 AND
tal.TrustAccountLogDate < @TrustAccountLogDate2
)";
cmd.Parameters.Add("@TrustAccountID1", SqlDbType.Int).Value = trustAccountId;
cmd.Parameters.Add("@UserID1", SqlDbType.Int).Value = userId;
cmd.Parameters.Add("@TrustAccountID2", SqlDbType.Int).Value = trustAccountId;
cmd.Parameters.Add("@UserID2", SqlDbType.Int).Value = userId;
cmd.Parameters.Add("@TrustAccountLogDate2", SqlDbType.DateTime).Value = TrustAccountLogDate;
// And then...
reader = cmd.ExecuteReader();
if (reader.Read())
{
double value = (double)reader.GetValue(0);
if (System.Double.IsNaN(value))
return 0;
else
return value;
}
else
return 0;
In my experience the usual reason why a query runs fast in SSMS but slow from .NET is due to differences in the connection's SET-tings. When a connection is opened by either SSMS or SqlConnection, a bunch of SET commands are automatically issued to set up the execution environment. Unfortunately SSMS and SqlConnection have different SET defaults.
One common difference is SET ARITHABORT. Try issuing SET ARITHABORT ON as the first command from your .NET code.
SQL Profiler can be used to monitor which SET commands are issued by both SSMS and .NET so you can find other differences.
The following code demonstrates how to issue a SET command but note that this code has not been tested.
using (SqlConnection conn = new SqlConnection("<CONNECTION_STRING>")) {
conn.Open();
using (SqlCommand comm = new SqlCommand("SET ARITHABORT ON", conn)) {
comm.ExecuteNonQuery();
}
// Do your own stuff here but you must use the same connection object
// The SET command applies to the connection. Any other connections will not
// be affected, nor will any new connections opened. If you want this applied
// to every connection, you must do it every time one is opened.
}
If this is parameter sniffing, try to add option(recompile) to the end of your query.
I would recommend creating a stored procedure to encapsulate logic in a more manageable way. Also agreed - why do you pass 5 parameters if you need only three, judging by the example?
Can you use this query instead?
select TrustAccountValue from
(
SELECT MAX (tal.trustaccountlogid), tal.TrustAccountValue
FROM TrustAccountLog AS tal
INNER JOIN TrustAccount ta ON ta.TrustAccountID = tal.TrustAccountID
INNER JOIN Users usr ON usr.UserID = ta.UserID
WHERE usr.UserID = 70402 AND
ta.TrustAccountID = 117249 AND
tal.TrustAccountLogDate < '3/1/2010 12:00:00 AM'
group by tal.TrustAccountValue
) q
And, for what it's worth, you are using an ambiguous date format whose meaning depends on the language settings of the user executing the query. For me, for example, this is the 3rd of January, not the 1st of March. Check this out:
set language us_english
go
select @@language --us_english
select convert(datetime, '3/1/2010 12:00:00 AM')
go
set language british
go
select @@language --british
select convert(datetime, '3/1/2010 12:00:00 AM')
The recommended approach is to use 'ISO' format yyyymmdd hh:mm:ss
select convert(datetime, '20100301 00:00:00') --midnight 00, noon 12
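If the date text is built on the .NET side, the same unambiguous literal can be produced with a fixed format string. The following is only a minimal sketch: the class SqlDateLiteral and its methods are hypothetical names invented for illustration, and the two parse helpers exist only to show how the same text yields two different dates. Passing the value as a typed DateTime parameter, as the question's code already does, avoids the formatting question altogether.

```csharp
using System;
using System.Globalization;

public static class SqlDateLiteral
{
    // Hypothetical helper: formats a DateTime as 'yyyyMMdd HH:mm:ss',
    // the format recommended above because it does not depend on language settings.
    public static string ToIsoLiteral(DateTime value)
    {
        return value.ToString("yyyyMMdd HH:mm:ss", CultureInfo.InvariantCulture);
    }

    // Reads the text month-first, as a us_english session would.
    public static DateTime ParseMonthFirst(string text)
    {
        return DateTime.ParseExact(text, "M/d/yyyy", CultureInfo.InvariantCulture);
    }

    // Reads the same text day-first, as a british session would.
    public static DateTime ParseDayFirst(string text)
    {
        return DateTime.ParseExact(text, "d/M/yyyy", CultureInfo.InvariantCulture);
    }
}
```

Calling ParseMonthFirst("3/1/2010") and ParseDayFirst("3/1/2010") gives the 1st of March and the 3rd of January respectively, which is the ambiguity described above.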
Had the same issue in a test environment, although the live system (on the same SQL server) was running fine. Adding OPTION (RECOMPILE) and also OPTION (OPTIMIZE FOR (@p1 UNKNOWN)) did not help.
I used SQL Profiler to catch the exact query that the .NET client was sending and found that this was wrapped with exec sp_executesql N'select ... and that the parameters had been declared as nvarchar - even though the columns being compared are simple varchar.
Putting the captured query text into SSMS confirmed it runs just as slowly as it does from the .NET client.
I found that changing the type of the parameters to DbType.AnsiString cleared up the problem:
p = cm.CreateParameter();
p.ParameterName = "@company";
p.Value = company;
p.DbType = DbType.AnsiString;
cm.Parameters.Add(p);
I could never explain why the test and live environments had such marked difference in performance.
Hope your specific issue is resolved by now since it is an old post.
The following SET options have the potential to affect plan reuse (complete list at the end):
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
SET ARITHABORT ON
GO
The following two statements are from MSDN - SET ARITHABORT:
Setting ARITHABORT to OFF can negatively impact query optimization leading to performance issues.
The default ARITHABORT setting for SQL Server Management Studio is ON. Client applications setting ARITHABORT to OFF can receive different query plans making it difficult to troubleshoot poorly performing queries. That is, the same query can execute fast in management studio but slow in the application.
Another interesting topic to understand is Parameter Sniffing as outlined in Slow in the Application, Fast in SSMS? Understanding Performance Mysteries - by Erland Sommarskog
Still another possibility is with conversion (internally) of VARCHAR columns into NVARCHAR while using Unicode input parameter as outlined in Troubleshooting SQL index performance on varchar columns - by Jimmy Bogard
OPTIMIZE FOR UNKNOWN
In SQL Server 2008 and above, consider OPTIMIZE FOR UNKNOWN . UNKNOWN: Specifies that the query optimizer use statistical data instead of the initial value to determine the value for a local variable during query optimization.
OPTION (RECOMPILE)
Use "OPTION (RECOMPILE)" instead of "WITH RECOMPILE" if recompiling is the only solution. It enables the Parameter Embedding Optimization. Read Parameter Sniffing, Embedding, and the RECOMPILE Options - by Paul White
SET Options
The following SET options can affect plan reuse, based on MSDN - Plan Caching in SQL Server 2008:
1. ANSI_NULL_DFLT_OFF
2. ANSI_NULL_DFLT_ON
3. ANSI_NULLS
4. ANSI_PADDING
5. ANSI_WARNINGS
6. ARITHABORT
7. CONCAT_NULL_YIELDS_NULL
8. DATEFIRST
9. DATEFORMAT
10. FORCEPLAN
11. LANGUAGE
12. NO_BROWSETABLE
13. NUMERIC_ROUNDABORT
14. QUOTED_IDENTIFIER
Most likely the problem lies in the criterion
tal.TrustAccountLogDate < @TrustAccountLogDate2
The optimal execution plan will be highly dependent on the value of the parameter, passing 1910-01-01 (which returns no rows) will most certainly cause a different plan than 2100-12-31 (which returns all rows).
When the value is specified as a literal in the query, SQL server knows which value to use during plan generation. When a parameter is used, SQL server will generate the plan only once and then reuse it, and if the value in a subsequent execution differs too much from the original one, the plan will not be optimal.
To remedy the situation, you can specify OPTION(RECOMPILE) in the query. Adding the query to a stored procedure won't help you with this particular issue, unless you create the procedure WITH RECOMPILE.
Others have already mentioned this ("parameter sniffing"), but I thought a simple explanation of the concept won't hurt.
It might be type conversion issues. Are all the IDs really SqlDbType.Int on the data tier?
Also, why have 4 parameters where 2 will do?
cmd.Parameters.Add("@TrustAccountID1", SqlDbType.Int).Value = trustAccountId;
cmd.Parameters.Add("@UserID1", SqlDbType.Int).Value = userId;
cmd.Parameters.Add("@TrustAccountID2", SqlDbType.Int).Value = trustAccountId;
cmd.Parameters.Add("@UserID2", SqlDbType.Int).Value = userId;
Could be
cmd.Parameters.Add("@TrustAccountID", SqlDbType.Int).Value = trustAccountId;
cmd.Parameters.Add("@UserID", SqlDbType.Int).Value = userId;
Since they are both assigned the same variable.
(This might be causing the server to make a different plan, since it expects four different variables as opposed to four constants. Reducing it to two variables could make a difference for the server's optimization.)
Sounds possibly related to parameter sniffing? Have you tried capturing exactly what the client code sends to SQL Server (Use profiler to catch the exact statement) then run that in Management Studio?
Parameter sniffing: SQL poor stored procedure execution plan performance - parameter sniffing
I haven't seen this in code before, only in procedures, but it's worth a look.
In my case the problem was that my Entity Framework was generating queries that use exec sp_executesql.
When the parameters don't exactly match in type the execution plan does not use indexes because it decides to put the conversion into the query itself.
As you can imagine this results in a much slower performance.
In my case the column was defined as CHAR(3) and Entity Framework was passing N'str' in the query, which causes an implicit conversion between nchar and char. So for a query that looks like this:
ctx.Events.Where(e => e.Status == "Snt")
It was generating an SQL query that looks something like this:
FROM [ExtEvents] AS [Extent1] ...
WHERE (N''Snt'' = [Extent1].[Status]) ...
The easiest solution in my case was to change the column type, alternatively you can wrestle with your code to make it pass the right type in the first place.
Since you appear to only ever be returning the value from one row from one column then you can use ExecuteScalar() on the command object instead, which should be more efficient:
object value = cmd.ExecuteScalar();
if (value == null)
return 0;
else
return (double)value;
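One caveat about the snippet above: ExecuteScalar() returns null when the result set is empty, but DBNull.Value when the first column of the first row is NULL, and the direct (double) cast throws if the column is not actually a float. A minimal sketch of a more defensive conversion follows; the class ScalarResult and the method ToDoubleOrZero are hypothetical names, and treating NaN as 0 simply mirrors the question's original code.

```csharp
using System;
using System.Globalization;

public static class ScalarResult
{
    // Hypothetical helper: maps the object returned by ExecuteScalar() to a double.
    public static double ToDoubleOrZero(object value)
    {
        // null means no rows; DBNull.Value means the column itself was NULL.
        if (value == null || value == DBNull.Value)
            return 0;

        // Convert.ToDouble also accepts decimal, float and int values,
        // where unboxing with a direct (double) cast would throw.
        double result = Convert.ToDouble(value, CultureInfo.InvariantCulture);

        // Same NaN handling as the question's original reader code.
        return double.IsNaN(result) ? 0 : result;
    }
}
```

With this in place the call site would reduce to something like return ScalarResult.ToDoubleOrZero(cmd.ExecuteScalar());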
I had this problem today and this solved it for me:
https://www.mssqltips.com/sqlservertip/4318/sql-server-stored-procedure-runs-fast-in-ssms-and-slow-in-application/
I put this at the beginning of my SP: SET ARITHABORT ON
Hope this helps!
You don't seem to be closing your data reader - this might start to add up over a number of iterations...
I had a problem with a different root cause that exactly matched the title of this question's symptoms.
In my case the problem was that the result set was held open by the application's .NET code while it looped through every returned record and executed another three queries against the database! Over several thousand rows this misleadingly made the original query look like it had been slow to complete based on timing information from SQL Server.
The fix was therefore to refactor the .NET code making the calls so that it doesn't hold the result set open while processing each row.
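One way to do that refactoring is to copy the values needed from the reader into memory first, dispose the reader, and only then run the per-row work. This is a rough sketch under assumptions, not the code from my application: the class ReaderBuffering and the method ReadAllIds are hypothetical names, and it assumes the first column is an int key.

```csharp
using System.Collections.Generic;
using System.Data;

public static class ReaderBuffering
{
    // Hypothetical helper: copies the first column of every row into a list
    // and disposes the reader, so the result set is released before any
    // per-row work (such as follow-up queries) starts.
    public static List<int> ReadAllIds(IDataReader reader)
    {
        var ids = new List<int>();
        using (reader)
        {
            while (reader.Read())
            {
                ids.Add(reader.GetInt32(0));
            }
        }
        return ids;
    }
}
```

The follow-up queries can then loop over the returned list, so the timing reported for the original query reflects only the time spent fetching its rows.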
I realise the OP doesn't mention the use of stored procedures but there is an alternative solution to parameter sniffing issues when using stored procedures that is less elegant but has worked for me when OPTION(RECOMPILE) doesn't appear to do anything.
Simply copy your parameters to variables declared in the procedure and use those instead.
Example:
ALTER PROCEDURE [ExampleProcedure]
@StartDate DATETIME,
@EndDate DATETIME
AS
BEGIN
--reassign to local variables to avoid parameter sniffing issues
DECLARE @MyStartDate datetime,
@MyEndDate datetime
SELECT
@MyStartDate = @StartDate,
@MyEndDate = @EndDate
--Rest of procedure goes here but refer to @MyStartDate and @MyEndDate
END
I have just had this exact issue. A select running against a view that returned a sub second response in SSMS. But run through sp_executesql it took 5 to 20 seconds. Why? Because when I looked at the query plan when run through sp_executesql it did not use the correct indexes. It was also doing index scans instead of seeks. The solution for me was simply to create a simple sp that executed the query with the passed parameter. When run through sp_executesql it used the correct indexes and did seeks not scans. If you want to improve it even further make sure to use command.CommandType = CommandType.StoredProcedure when you have a sp then it does not use sp_executesql it just uses EXEC but this only shaved ms off the result.
This code ran sub second on a db with millions of records
public DataTable FindSeriesFiles(string StudyUID)
{
DataTable dt = new DataTable();
using (SqlConnection connection = new SqlConnection(connectionString))
{
connection.Open();
using (var command = new SqlCommand("VNA.CFIND_SERIES", connection))
{
command.CommandType = CommandType.StoredProcedure;
command.Parameters.AddWithValue("@StudyUID", StudyUID);
using (SqlDataReader reader = command.ExecuteReader())
{
dt.Load(reader);
}
return dt;
}
}
}
Where the stored procedure simply contained
CREATE PROCEDURE [VNA].[CFIND_SERIES]
@StudyUID NVARCHAR(MAX)
AS BEGIN
SET NOCOUNT ON
SELECT *
FROM CFIND_SERIES_VIEW WITH (NOLOCK)
WHERE [StudyInstanceUID] = @StudyUID
ORDER BY SeriesNumber
END
This took 5 to 20 seconds (but the select is exactly the same as the contents of the VNA.CFIND_SERIES stored procedure)
public DataTable FindSeriesFiles(string StudyUID)
{
DataTable dt = new DataTable();
using (SqlConnection connection = new SqlConnection(connectionString))
{
connection.Open();
using (var command = connection.CreateCommand())
{
command.CommandText = " SELECT * FROM CFIND_SERIES_VIEW WITH (NOLOCK) WHERE StudyUID=@StudyUID ORDER BY SeriesNumber";
command.Parameters.AddWithValue("@StudyUID", StudyUID);
using (SqlDataReader reader = command.ExecuteReader())
{
dt.Load(reader);
}
return dt;
}
}
}
I suggest you try and create a stored procedure - which can be compiled and cached by Sql Server and thus improve performance
I'm having a really tough time figuring how to use an xml data column in SQL Server, specifically for use with Entity Framework.
Basically, one of our tables stores "custom metadata" provided by users in the form of XML, so it seemed sensible to store this in an Xml column in the table.
One of the requirements of our application is to support searching of the metadata, however. The users are able to provide an XPath query string, as well as a value to compare the result of the XPath with, to search for elements that contain metadata matching their query.
I identified the SQL Server xml functions as ideal for this (e.g. [xmlcol].exist('/path1/path2[0][text()=''valuetest'']')), but they're not supported by Entity Framework, irritatingly (or specifically, xml columns aren't supported). As an alternative, I tried creating a UDF that passes the user-provided XPath to the xml functions, but then discovered that the xml functions only allow string literals, so I can't provide variables...
At this point, I was running out of options.
I created a small bit of code that performs a regular expression replace on the result of a IQueryable.ToString(), to inject my XPath filter in, and then send this string to the database manually, but there are problems with this too, such as the result doesn't seem to lazily load the navigational properties, for example.
I kept looking, and stumbled upon the idea of SQLCLR types, and started creating a SQLCLR function that performs the XPath comparison. I thought I was onto a winner at this point, until a colleague pointed out that SQL Server in Azure doesn't support SQLCLR - doh!
What other options do I have? I seem to be running very close to empty...
You could do this in a stored procedure where you build your query dynamically.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table YourTable
(
ID int identity primary key,
Name varchar(10) not null,
XMLCol xml
);
go
insert into YourTable values
('Row 1', '<x>1</x>'),
('Row 2', '<x>2</x>'),
('Row 3', '<x>3</x>');
go
create procedure GetIt
@XPath nvarchar(100)
as
begin
declare @SQL nvarchar(max);
set @SQL = N'
select ID, Name
from YourTable
where XMLCol.exist('+quotename(@XPath, '''')+N') = 1';
exec (@SQL);
end
Query 1:
exec GetIt N'*[text() = "2"]'
Results:
| ID | NAME |
--------------
| 2 | Row 2 |
To remain "customisable", the SqlQuery method on DbSet can be used:
var query = @"SET ARITHABORT ON;
select * from [YourTable] where
[xmlcol].exist('/path1/path2[0][text()=''{0}'']') = 1";
var numOfResults = 5;
var offsetPage = 1;
var results = Context.YourTable.SqlQuery(String.Format(query,"valuetest"))
.OrderBy(x => x.col)
.Skip(offsetPage * numOfResults)
.Take(numOfResults).ToList();
Note, due to its dynamic nature, this method would also most likely expose some degree of sql injection security holes.
I have following query which takes almost 1 minute to execute.
public static Func<Entities, string, IQueryable<string>> compiledInvoiceQuery =
CompiledQuery.Compile((Entities ctx, string orderNumb) =>
(from order in ctx.SOP10100
where order.ORIGNUMB == orderNumb
select order.SOPNUMBE).Union(
from order in ctx.SOP30200
where order.ORIGNUMB == orderNumb
select order.SOPNUMBE)
);
It filters on ORIGNUMB, which is not my primary key, and I cannot even put an index on it. Is there any other way to make it faster? I tested on SQL Server and found that this query alone
from order in ctx.SOP10100
where order.ORIGNUMB == orderNumb
select order.SOPNUMBE
or
select SOPNUMBE
from SOP10100
where ORIGNUMB = @orderNumb
is taking more than 55 seconds. Please suggest.
If it's taking 55 seconds on the server, then it's nothing to do with LINQ.
Why can't you have an index on it, because you need one....
Only other option is to rejig your logic to filter out records (using indexed columns), before you start searching for an ordernumber match.
One of the big problems with LINQ to SQL is that you have very little control over the SQL being generated.
Since you are running a union and not a join, it should be a pretty simple SQL. Something like this:
SELECT *
FROM SOP10100
WHERE ORIGNUMB = 'some number'
UNION
SELECT *
FROM SOP30200
WHERE ORIGNUMB = 'some number'
You can use SQL Server Profiler to see the SQL statements that are being run against the database, to check whether the SQL looks like this or is something more complicated. You can then run the generated SQL in SQL Server Management Studio and turn on Include Client Statistics and Include Actual Execution Plan to see what exactly is causing the performance issue.
I have a view that returns 2 ints from a table using a CTE. If I query the view like this it runs in less than a second
SELECT * FROM view1 WHERE ID = 1
However if I query the view like this it takes 4 seconds.
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id
I've checked the two query plans. The first query performs a clustered index seek on the main table, returning 1 record, and then applies the rest of the view query to that result set. The second query performs an index scan that returns about 3000 records, rather than just the one I'm interested in, and only filters the result set later.
Is there anything obvious that I'm missing to try to get the second query to use the Index Seek rather than an index scan. I'm using SQL 2008 but anything I do needs to also run on SQL 2005. At first I thought it was some sort of parameter sniffing problem but I get the same results even if I clear the cache.
Probably it is because in the parameter case, the optimizer cannot know that the value is not null, so it needs to create a plan that returns correct results even when it is. If you have SQL Server 2008 SP1 you can try adding OPTION(RECOMPILE) to the query.
You could add an OPTIMIZE FOR hint to your query, e.g.
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id OPTION (OPTIMIZE FOR (@id = 1))
In my case the DB table column type was defined as VarChar and the parameter type in the parameterized query was defined as NVarChar. This introduced a CONVERT_IMPLICIT in the actual execution plan to match the data types before comparing, and that was the culprit for the slow performance (2 sec vs 11 sec). Just correcting the parameter type made the parameterized query as fast as the non-parameterized version.
One possible way to do that is to CAST the parameters, as such:
SELECT ...
FROM ...
WHERE name = CAST(:name AS varchar)
Hope this may help someone with similar issue.
I ran into this problem myself with a view that ran < 10ms with a direct assignment (WHERE UtilAcctId=12345), but took over 100 times as long with a variable assignment (WHERE UtilAcctId = @UtilAcctId).
The execution-plan for the latter was no different than if I had run the view on the entire table.
My solution didn't require tons of indexes, optimizer-hints, or a long-statistics-update.
Instead I converted the view into a User-Table-Function where the parameter was the value needed on the WHERE clause. In fact this WHERE clause was nested 3 queries deep and it still worked and it was back to the < 10ms speed.
Eventually I changed the parameter to be a TYPE that is a table of UtilAcctIds (int). Then I can limit the WHERE clause to a list from the table.
WHERE UtilAcctId = [parameter-List].UtilAcctId.
This works even better. I think the user-table-functions are pre-compiled.
When SQL starts to optimize the query plan for the query with the variable it will match the available index against the column. In this case there was an index so SQL figured it would just scan the index looking for the value. When SQL made the plan for the query with the column and a literal value it could look at the statistics and the value to decide if it should scan the index or if a seek would be correct.
Using the optimize hint and a value tells SQL that “this is the value which will be used most of the time so optimize for this value” and a plan is stored as if this literal value was used. Using the optimize hint and the sub-hint of UNKNOWN tells SQL you do not know what the value will be, so SQL looks at the statistics for the column and decides what, seek or scan, will be best and makes the plan accordingly.
I know this is long since answered, but I came across this same issue and have a fairly simple solution that doesn't require hints, statistics-updates, additional indexes, forcing plans etc.
Based on the comment above that "the optimizer cannot know that the value is not null", I decided to move the values from a variable into a table:
Original Code:
declare @StartTime datetime2(0) = '10/23/2020 00:00:00'
declare @EndTime datetime2(0) = '10/23/2020 01:00:00'
SELECT * FROM ...
WHERE
C.CreateDtTm >= @StartTime
AND C.CreateDtTm < @EndTime
New Code:
declare @StartTime datetime2(0) = '10/23/2020 00:00:00'
declare @EndTime datetime2(0) = '10/23/2020 01:00:00'
CREATE TABLE #Times (StartTime datetime2(0) NOT NULL, EndTime datetime2(0) NOT NULL)
INSERT INTO #Times(StartTime, EndTime) VALUES(@StartTime, @EndTime)
SELECT * FROM ...
WHERE
C.CreateDtTm >= (SELECT MAX(StartTime) FROM #Times)
AND C.CreateDtTm < (SELECT MAX(EndTime) FROM #Times)
This performed instantly as opposed to several minutes for the original code (obviously your results may vary) .
I assume if I changed my data type in my main table to be NOT NULL, it would work as well, but I was not able to test this at this time due to system constraints.
Came across this same issue myself and it turned out to be a missing index involving a (left) join on the result of a subquery.
select *
from foo A
left outer join (
select x, count(*)
from bar
group by x
) B on A.x = B.x
Added an index named bar_x for bar.x
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id
Do this
DECLARE @sql varchar(max)
SET @sql = 'SELECT * FROM View1 WHERE ID =' + CAST(@id as varchar)
EXEC (@sql)
Solves your problem