How to best use multicolumn index with value ranges in SQL Server? - sql-server

I'm running SQL Server 2016 Enterprise edition.
I have a table with 24 columns and 10 indexes. Those indexes are vendor-defined, so I cannot change them. I have a hard time understanding how to get the best performance, because whatever I do, SQL Server chooses what is in my opinion a poor execution plan.
The following query:
SELECT event_id
FROM Events e WITH(NOLOCK, index=[Event_By_PU_And_TimeStamp])
WHERE e.timestamp > '2022-05-12 15:00'
AND e.PU_Id BETWEEN 103 AND 186
results in this index seek:
The specified index is the clustered index, and it has two columns: PU_Id and TimeStamp. Even though the seek predicate lists both PU_Id and TimeStamp as the columns used, the "Number of rows read" is too high in my opinion. Without the index hint, SQL Server chooses a different index for the seek, with double the rows-read number.
Unfortunately, the order of the columns in the index is PU_Id, TimeStamp, while TimeStamp is the much more selective column here.
However, if I change the PU_Id condition to list every possible number between the bounds,
PU_ID IN (103,104,105,...186)
then the "Number of rows read" is exactly the number of returned rows, and the statistics output confirms the better performance (validated with a Profiler trace).
Between-condition:
(632 rows affected)
Table 'Events'. Scan count 7, logical reads 139002, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
IN-condition with every number written out:
(632 rows affected)
Table 'Events'. Scan count 84, logical reads 459, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Edit: the IndexSeek for the IN-query
What is the best way to make SQL Server choose the better plan?
Do I really need to write out all possible PU_IDs in every query?
The index used is a simple two-column index; it is also the clustered index:
CREATE UNIQUE CLUSTERED INDEX [Event_By_PU_And_TimeStamp] ON [dbo].[Events]
(
[PU_Id] ASC,
[TimeStamp] ASC
)
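One way to avoid hand-writing every PU_Id, sketched under the assumption that the range is contiguous: generate the list of IDs and join to it, so each value drives its own range seek on (PU_Id, TimeStamp), just like the explicit IN list does. This is a common workaround, not tested against the vendor schema:

```sql
-- Generate the 84 PU_Id values 103..186 with ROW_NUMBER over a system
-- catalog (any table with enough rows works), then join to Events.
-- Each generated PU_Id drives its own seek on (PU_Id, TimeStamp).
SELECT e.event_id
FROM (SELECT TOP (186 - 103 + 1)
             102 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS PU_Id
      FROM sys.all_objects) AS ids
INNER JOIN Events AS e
        ON e.PU_Id = ids.PU_Id
       AND e.[TimeStamp] > '2022-05-12 15:00';
```

The same idea works with a permanent numbers table if you have one; the point is that the optimizer gets a distinct list of equality values rather than an open-ended range on the leading key.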

Related

Understanding Implicit Type Conversion in SQL Server

On the SQL Server documentation page, MS provides the following matrix showing which conversions are supported and which are not:
What would be an example in SQL of an explicit conversion and an implicit conversion?
For example, I would assume that an explicit conversion would be something like CAST('2014-01-01' AS DATE), but then it also allows odd things like converting varchar to image. Or, how could you explicitly cast a datetime to a float?
https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-ver16
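To illustrate the distinction asked about above, here is a small sketch. An explicit conversion is one you write out with CAST or CONVERT; an implicit conversion is one SQL Server inserts on its own to compare mismatched types:

```sql
-- Explicit conversions: the conversion is spelled out in the query.
SELECT CAST('2014-01-01' AS date);   -- string -> date
SELECT CAST(GETDATE() AS float);     -- datetime -> float: days since
                                     -- 1900-01-01, time of day as the fraction

-- Implicit conversion: comparing an nvarchar column to an int literal
-- forces SQL Server to insert a CONVERT_IMPLICIT behind the scenes,
-- as the Employee example below demonstrates.
```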
We have a table Employee with a NationalIDNumber column defined with a NVARCHAR data type. In this query, we will use a WHERE clause to search for a specific ID.
In the query below, we have requested NationalIDNumber equal to the integer value 14417807. For SQL Server to compare these two data types, it must convert that NVARCHAR to INT, which means every value in that column must go through a conversion process, causing a scan.
USE AdventureWorks2016CTP3
GO
SET STATISTICS IO ON
GO
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = 14417807
In the execution plan, you will see an exclamation point warning you that there is a potential issue with the query. Hovering over the SELECT operator, you will see that a CONVERT_IMPLICIT is happening, which may have prevented the optimizer from using a seek.
(1 row affected)
Table 'Employee'. Scan count 1, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Now the question is how we fix it. It's really simple, but it does require a code change. Let's look back at our query.
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = 14417807
Remember, we asked for an integer value. Just by adding single quotes around the value, we can eliminate the issue. It's important to always know your columns' data types when querying them. In this case, since the column is NVARCHAR, all I need to do is supply a character value, which is accomplished by adding single quotes around the value.
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = '14417807'
It’s simple to see the results. Note above the Scan count 1, logical reads 9, physical reads 0. When we rerun it we get the below.
(1 row affected)
Table 'Employee'. Scan count 0, logical reads 4, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
You can also see in the graphical plan that the warning is now gone, and we have a SEEK instead of the SCAN which is much more efficient.
Source

SQL Server: creating indexes on foreign keys

What I'm experimenting with here is how DELETE statements perform in a very simple example. I'm currently using SQL Server 2017 (I also tried with SQL Server 2014 and the results were similar).
I have two tables: Parent and Child. Child has a foreign key to Parent (Parent_ID).
Parent:
Parent_ID Name
-----------------------
1 P1
2 P2
Child:
Child_ID Parent_ID Data
-----------------------------
1 1 P1C1
2 2 P2C1
3 2 PPPPCCCC
4 2 P2C1
5 2 PPPPCCCC
(around 4 million more rows with Parent_ID=2)
I always thought that adding an index on the foreign key (Parent_ID in Child here) was a good idea. But today I tried the behavior of DELETE in a somewhat extreme case, though I'm sure this kind of case could happen in real life: 4 million rows with Parent_ID = 2 in the Child table, and only one row with Parent_ID = 1.
If I try to delete rows with Parent_ID = 1, it looks good: it is fast enough, the index is used, and the number of logical reads seems fine (12 logical reads; I am no expert and don't know if that's really OK for such a small amount of data).
Now here is what I don't understand (and don't like):
I try to delete all records in Child where Parent_ID=2:
BEGIN TRAN
DELETE FROM child
WHERE parent_id = 2
ROLLBACK TRAN
The IO statistics show this (for the DELETE):
Table 'Child'. Scan count 1, logical reads 38486782, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
38486782 logical reads... isn't that huge? I have tried to update statistics to be sure.
UPDATE STATISTICS Child WITH FULLSCAN
Then I ran my query again, with the same results. Maybe the problem is the index delete on IX_Child_Parent_ID?
After disabling the index on the foreign key, things went much better:
Table 'Child'. Scan count 1, logical reads 202233, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Note: SQL Server suggests creating an index for the FK.
202233 logical reads sounds much better... at least for the specific case of Parent_Id=2.
The question is: why does SQL Server use the index rather than a clustered index scan when it knows there are about 4,000,000 rows with Parent_ID = 2? Or maybe it doesn't know? Aren't the statistics supposed to "help" SQL Server know this kind of information?
I'm probably missing something.
(I have double-checked, and the statistics seemed to be OK after the index was created.)
For the purpose of a delete and with the cardinality of the data you have, the index on parent_id is not useful, as you have seen.
Given that you know you need to delete 99% of the rows, doing so with a single DELETE is highly inefficient for many reasons, not least the growth of the transaction log.
Every statement you execute is atomic and runs in its own implicit transaction. If the delete were to fail midway (e.g. a power cut), SQL Server needs to be able to roll the incomplete delete back, for which it uses the transaction log, so every deleted row hits the transaction log.
In cases such as these it's much more performant to insert the rows to keep into a new table, drop the original table and then rename the new table to the original; you can also script the indexes/constraints from the original and apply them to the new table.
When deleting a large proportion of rows, but less than 50% of the table, another recommendation is to split the job into batches and delete fewer than 5,000 rows at a time (5,000 is the rough threshold for lock escalation).
Often it can help to create a View on the table selecting the top n rows to delete, ordered by a specific key, then delete from the View.
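The batching recommendation above can be sketched as follows (a sketch using the table from the question; the batch size of 4,500 is just an illustrative value under the lock-escalation threshold):

```sql
-- Delete in batches of fewer than 5000 rows so each statement stays below
-- the lock-escalation threshold and each implicit transaction (and thus
-- each chunk of transaction-log growth) stays small.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (4500)
    FROM Child
    WHERE parent_id = 2;

    SET @rows = @@ROWCOUNT;  -- 0 once no matching rows remain
END
```

In simple or bulk-logged recovery, a CHECKPOINT (or a log backup in full recovery) between batches also lets log space be reused.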
Maybe I found the answer here: https://stackoverflow.com/a/3650886.
Looks like this is an expected behavior and that the problem is really updating the index (IX_Child_Parent_ID in my case).

No blocking, good execution plan, slow query: why?

I have a query that occasionally takes several minutes to complete. Several processes are running concurrently but there is no blocking (I'm running an extended events session, I can see blocking of other transactions, so the query to inspect the logged events is working).
Looking at the query plan cache, the execution plan is a good one: running it in SSMS, it takes less than 100 IOs, and there are no table or index scans.
There is the possibility that the users are getting a different plan, but if I add hints to use scans on all tables (and some are fairly large), it still returns in around 1 second. So even the worst possible execution plan wouldn't result in a query that takes several minutes.
Having ruled out blocking and a bad execution plan, what else can make a query slow?
One thing worth pointing out is that SQL Server uses an indexed view we have created, although the code doesn't reference it (we're using SQL Server Enterprise). That indexed view has a covering index to support the query and it is being used - again, the execution plan is very good. The original query is using NOLOCK, and I observed that no locks are taken on any rows or pages of the indexed view either (so SQL Server respects our locking hints, even though it's accessing an indexed view instead of the underlying tables - good). This makes sense, otherwise I would have expected to see blocking.
We are using indexed views in some other queries but we reference them in SQL code (and specify NOLOCK, NOEXPAND). I've not seen any problems with those queries, and I'm not aware that there should be any difference between indexed views that we tell the optimizer to use and indexed views that the optimizer itself decides to use, but what I'm seeing suggests that there is.
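For comparison, a query that references the indexed view explicitly, in the style described above, might look like this (a sketch built from the view definition shown further down; the filter values are taken from the question):

```sql
-- NOEXPAND tells the optimizer to use the indexed view's own indexes
-- instead of expanding the view to its base tables; NOLOCK matches the
-- locking behaviour discussed above.
SELECT ar_invoice_id, policy_id
FROM dbo.ar_invoice_cashier_transaction_visit_iview WITH (NOEXPAND, NOLOCK)
WHERE patient_id = '5D61EDF1-7542-11E8-BFCB-D89EF37315A2'
  AND transaction_status_rcd = 'TEMP'
  AND company_code = 'KOC';
```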
Any thoughts? Anything else I should be looking at?
This is the query:
execute sp_executesql
N'SELECT DISTINCT p.policy_id
, p.name_e AS policy_name_e
, p.name_l AS policy_name_l
FROM patient_visit_nl_view AS pv
INNER JOIN swe_cashier_transaction_nl_view AS ct ON ct.patient_visit_id = pv.patient_visit_id
AND ct.split_date_time IS NOT NULL
INNER JOIN ar_invoice_nl_view AS ai ON ai.ar_invoice_id = ct.invoice_id
AND ai.company_code = ''KOC''
AND ai.transaction_status_rcd = ''TEMP''
INNER JOIN policy_nl_view p ON p.policy_id = ai.policy_id
WHERE pv.patient_id = @pv__patient_id'
, N'@pv__patient_id uniqueidentifier'
, @pv__patient_id = '5D61EDF1-7542-11E8-BFCB-D89EF37315A2'
Note: views with suffix _nl_view select from the table with NOLOCK (the idea is we can change this in future without affecting the business tier code).
You can see the query plan here: https://www.brentozar.com/pastetheplan/?id=HJI9Lj_WH
IO stats:
Table 'policy'. Scan count 0, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ar_invoice_cashier_transaction_visit_iview'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Locks taken (IS locks on the objects involved, nothing else):
Below the relevant part of the indexed view:
CREATE VIEW dbo.ar_invoice_cashier_transaction_visit_iview WITH SCHEMABINDING
AS
SELECT ai.ar_invoice_id
, ai.company_code
, ai.policy_id
, ai.transaction_status_rcd
, ct.cashier_transaction_id
, pv.patient_id
-- more columns
FROM dbo.ar_invoice AS ai
INNER JOIN dbo.swe_cashier_transaction AS ct ON ct.invoice_id = ai.ar_invoice_id AND ct.split_date_time IS NOT NULL
INNER JOIN dbo.patient_visit AS pv ON pv.patient_visit_id = ct.patient_visit_id
CREATE UNIQUE CLUSTERED INDEX XPKar_invoice_cashier_transaction_visit_iview ON dbo.ar_invoice_cashier_transaction_visit_iview (ar_invoice_id, cashier_transaction_id)
CREATE INDEX XIE4ar_invoice_cashier_transaction_visit_iview ON dbo.ar_invoice_cashier_transaction_visit_iview (patient_id, transaction_status_rcd, company_code) INCLUDE (policy_id)
So far so good.
But every few days (and not at the same time of day), things go pear-shaped: the query takes minutes and actually times out (the provider's command timeout is set to 10 minutes). When this happens, there is no blocking. I have an extended events session, and this is my query:
DECLARE @event_xml xml;
SELECT @event_xml = CONVERT(xml, target_data)
FROM sys.dm_xe_sessions AS s
INNER JOIN sys.dm_xe_session_targets AS t ON s.address = t.event_session_address
WHERE s.name = 'Blocking over 10 seconds'
SELECT DATEADD(hour, DATEDIFF(hour, GETUTCDATE(), GETDATE()), R.c.value('@timestamp', 'datetime')) AS time_stamp
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process)[1]/@spid', 'int') AS blocked_spid
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process[1]/inputbuf)[1]', 'varchar(max)') AS blocked_inputbuf
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process[1]/@waitresource)[1]', 'varchar(max)') AS wait_resource
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocking-process[1]/process)[1]/@spid', 'int') AS blocking_spid
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocking-process[1]/process[1]/inputbuf)[1]', 'varchar(max)') AS blocking_inputbuf
, R.c.query('.')
FROM @event_xml.nodes('/RingBufferTarget/event') AS R(c)
ORDER BY R.c.value('@timestamp', 'datetime') DESC
This query returns other cases of blocking, so I believe it's correct. At the time the problem (the timeouts) occurred, there were no cases of blocking involving the query above, or any other query.
Since there is no blocking, I'm looking at the possibility of a bad query plan. I didn't find a bad plan in the cache (I had already recommended an sp_recompile on one of the tables before I was given remote access), so I tried to think of the worst possible one: scans on every table. Applying the relevant options, here are the IO stats for that query:
Table 'patient_visit'. Scan count 1, logical reads 4559, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'swe_cashier_transaction'. Scan count 9, logical reads 24840, physical reads 0, read-ahead reads 23660, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ar_invoice'. Scan count 9, logical reads 21247, physical reads 0, read-ahead reads 7074, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'policy'. Scan count 9, logical reads 271, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
And here is the execution plan: https://www.brentozar.com/pastetheplan/?id=rJr29s_br
The customer has a beefy SQL Server 2012 box: plenty of cores (MAXDOP is set to 8) and tons of memory. It eats this bad query for breakfast (around 350 ms).
For completeness, here are the row counts of the tables involved:
ar_invoice: 2363527
swe_cashier_transaction: 2946514
patient_visit: 654976
policy: 1038
ar_invoice_cashier_transaction_visit_iview: 1999609
I also ran the query for a patient_id that returns the most rows, and for a patient_id that didn't exist (i.e. 0 rows). I ran these with the recompile option: in both cases the optimizer selected the same (good) execution plan.
So back to the question: there is no blocking, and the query plan seems to be good (and even if it were bad, it wouldn't be bad to the extent that this query takes 10 minutes), so what can cause this?
The only thing a little unusual here is that, although the SQL doesn't select from the indexed view, the optimizer uses it anyway, and this is (or should be) a good thing. I know the Enterprise edition claims it can do this, but this is the first time I've seen it in the wild (I've seen plenty of the opposite, though: referencing an indexed view in SQL, but the optimizer selecting from the view's underlying tables anyway). I'm tempted to believe this is relevant.
Without knowing anything about your setup, a few other things I would check:
what is overall CPU and memory utilisation like on the box, could there be resource contention
if your storage is on a SAN rather than local storage, is there contention at the storage end (this can happen if you have heavy reads/writes on the same disk arrays from different systems)
There can be several other factors involved in slowing down a query. Personally, I don't really trust SQL Server's optimization, though. Normally I would recommend optimizing your query so that the optimizer does NOT have to do hard work; for example, use EXISTS / IN on the main table instead of joining and doing DISTINCT/grouping, like:
select distinct ia.AttributeCode, ia.AttributeDescription
from ItemsTable as i
inner join ItemAttributesTable as ia on i.AttributeCode = ia.AttributeCode
where i.Manufacturer = @paramMfr
and i.MfrYear between @paramYearStart and @paramYearYend
instead of running a query like above run it like this
select ia.AttributeCode, ia.AttributeDescription
from ItemAttributesTable as ia
where ia.AttributeCode in (
select i.AttributeCode
from ItemsTable as i
where i.Manufacturer = @paramMfr
and i.MfrYear between @paramYearStart and @paramYearYend
)
I am NOT really an expert in indexing, but for the above case I think a single index on ItemsTable should be sufficient.
Another optimization can be done by removing the views and directly using the tables, because views may also be doing joins on other tables that are really not required here.
All in all, the main point is that while the query optimizer is figuring out the best possible plan, it may hit a limit known as the optimizer timeout; in that case it may pick a plan that is NOT really good at that specific time, which is why the plan cache should be used. That's the reason I recommend focusing on optimizing the query rather than looking at the reasons why it's timing out.
Check this out as well https://blogs.msdn.microsoft.com/psssql/2018/10/19/understanding-optimizer-timeout-and-how-complex-queries-can-be-affected-in-sql-server/
Update-1:
Recommendations:
Use EXISTS / IN; even if you see the same execution plan as your current query, this will still help the optimizer almost always use the correct plan
Try eliminating the views and using the tables directly, with fewer selected columns
Make sure you have proper indexes defined for the given parameters
Try breaking the query into smaller parts; for example, put the filtered data into a temporary table and then grab the rest of the details using it
Try googling "Timeout in Application not in SSMS" and see different hacks
Common causes of query timeout:
No indexing defined
Extracting too much data
There is/are lock(s) on one or more tables while you are trying to read data from those table(s)
Parameter type and field type differ; for example, the column is varchar while the parameter type is nvarchar
Parameter sniffing

SQL Server does not choose to use index although everything seems to suggest it

Something is wrong here and I don't understand what. It's worth mentioning that the searched value is not in the table; for an existing value there is no problem. Still, why does the first query require a key lookup on the clustered primary key, which is not even used in the query, while the second can run on the index directly?
Forcing the query to use the index with WITH(INDEX(indexname)) does work, but why does the optimizer not choose it by itself?
The column PIECE_NUM is not in any other index and is also not the primary key.
SET STATISTICS IO ON
DECLARE @vchEventNum VARCHAR(50)
SET @vchEventNum = '54235DDS28KC1F5SJQMWZ'
SELECT TOP 1
fwt.WEIGHT,
fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE @vchEventNum + '%'
ORDER BY fwt.DTTM_INSERT DESC
SELECT TOP 1
fwt.WEIGHT,
fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE '54235DDS28KC1F5SJQMWZ' + '%'
ORDER BY fwt.DTTM_INSERT DESC
SET STATISTICS IO OFF
I let both queries run in one batch:
IO statistics report:
Query 1: logical reads 16244910
Query 2: logical reads 5
Table 'FIN_WEIGHT_TESTS'. Scan count 1, logical reads 16244910, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FIN_WEIGHT_TESTS'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The table has a non-clustered index on PIECE_NUM INCLUDING all three other columns from the query.
Here are the query execution plans(with a little editing to remove the actual names):
I noticed the CONVERT_IMPLICIT, but that is just due to the conversion of the varchar parameter to the nvarchar column. Changing the parameter type did not change the behaviour of the query.
Why does the query with the parameter not use the index while replacing the parameter with its value does?
The first query is going to scan because you are using a local variable. The optimizer sees this as an "anonymous" value and therefore cannot use statistics to build a good query plan.
The second query seeks because it uses a literal value, so SQL Server can look into its statistics and estimate much more accurately how many rows it will find for that value.
If you run your first query as follows I would imagine you will see it use the better plan:
DECLARE @vchEventNum VARCHAR(50)
SET @vchEventNum = '54235DDS28KC1F5SJQMWZ'
SELECT TOP 1
fwt.WEIGHT,
fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE @vchEventNum + '%'
ORDER BY fwt.DTTM_INSERT DESC
OPTION(RECOMPILE)
I would suggest using a parameterized procedure to run this code to ensure it uses a cached plan. The RECOMPILE hint has its own drawbacks, as the optimizer needs to rebuild the plan every time the query runs; so if you run this code very often, I would avoid that hint.
You can read about local variables here: https://www.brentozar.com/archive/2014/06/tuning-stored-procedures-local-variables-problems/
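The parameterized-procedure suggestion might look like this (a sketch using the names from the question; the procedure name is made up for illustration):

```sql
-- Unlike a local variable in an ad-hoc batch, a procedure parameter is
-- sniffed when the plan is compiled, so the optimizer can use the
-- PIECE_NUM statistics for the supplied value and cache the plan.
CREATE PROCEDURE dbo.GetLatestWeightTest
    @vchEventNum varchar(50)
AS
BEGIN
    SELECT TOP 1
        fwt.WEIGHT,
        fwt.TEST_RESULT
    FROM FIN_WEIGHT_TESTS AS fwt
    WHERE fwt.PIECE_NUM LIKE @vchEventNum + '%'
    ORDER BY fwt.DTTM_INSERT DESC;
END
```

Called as `EXEC dbo.GetLatestWeightTest @vchEventNum = '54235DDS28KC1F5SJQMWZ';`, the compiled plan is built for the sniffed value rather than an "unknown" estimate.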
I think the cause of what happened is the combination of the @variable and the ORDER BY in your query. To test my guess, remove the ORDER BY from your query; it may lead to equal plans in both cases (this time with different estimated numbers of rows reported in the SELECT).
As mentioned in the previous answer, local variables cannot be sniffed at compilation time because the batch is compiled as a whole; only the RECOMPILE option lets the server know the value of a variable at compile time, since recompilation begins after the variable has already been assigned. This leads to an "estimate for unknown" in the first case: statistics cannot be used because we don't know the value in the filter, so more output rows are estimated.
But the query has TOP + ORDER BY in it. This means that if we expect many rows, then to return only the first one ordered by DTTM_INSERT DESC we must sort all the filtered rows. In fact, if you look at the second plan, you see that the SORT operator costs the most. But when the constant is used, SQL Server uses the statistics and finds out that only one row will be returned, so it can afford to sort the result.
When many rows are expected, it instead decides to use an index that is already ordered by DTTM_INSERT. This is only my guess, because you did not post the creation scripts for your indexes, but from the plan I can see that the first plan goes to the clustered index to grab the fields missing from the non-clustered index. That means it's not the same non-clustered index used in the second case, and I'm sure the index chosen in the first case has DTTM_INSERT as its leading key column. That way the server eliminates the sort that we see in the second plan.

To get the number of orders placed on the current day

I am using sql-server 2012
The query is
1. select count(*) from table where orderti = getdate()
2. select count(*) from table where orderti >= convert(date, getdate()) and orderti < dateadd(day, convert(date, getdate()))
the table structure is:
sales(orderti datetime)
non clustered index on orderti.
I want to know the difference between the writing styles of the two queries mentioned above.
Which one is more efficient?
Any help?
Thanks,
Chio
Based on your question
Query 1
select count(*) from table where orderti=getdate()
This query will not give you the orders for the current day because:
orderti is datetime and contains the time portion as well
getdate() returns both the current date and the current time
What this query is trying to do is get all orders with orderti equal to the current date and time, which is not what you require.
The query which you are looking for is this
select count(*) from table where CONVERT(DATE,orderti)=CONVERT(DATE,getdate())
Query 2
The query you are looking for is
select count(*) from table where orderti>=convert(date,getdate()) and orderti< dateadd(day,1,convert(date,getdate()))
On your question
which one is efficient ?
Based on the statistics and execution plans, both queries do an index seek and are equally efficient.
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Update: with around 650k records there are variations in the statistics; however, the plan remains the same.
Table 'sales'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sales'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
I would recommend query 2, as it has fewer logical reads and doesn't need a CAST/CONVERT.
I would prefer option 2 for a couple of reasons:
As a standard practice, don't wrap filtered columns in functions. (Not strictly necessary here, of course, because the optimizer is clever in the 2012 version.)
Although the 2012 optimizer uses an index seek, that doesn't mean it saves the CPU cycles spent on the CAST.
