Ensuring index is used on Informix DATETIME column - database

Say I have a table on an Informix DB:
create table password_audit (
username CHAR(20),
old_password CHAR(20),
new_password CHAR(20),
update_date DATETIME YEAR TO FRACTION);
I need the update_date field to be in milliseconds (or seconds maybe - same question applies) because there will be multiple updates of the password on the same day.
Say, I have a nightly batch job that wants to retrieve all records from the password_audit table for today.
To increase performance, I want to put an index on the update_date column. If I do this:
CREATE INDEX pw_idx ON password_audit(update_date);
and run this SQL:
SELECT *
FROM password_audit
WHERE DATE(update_date) = mdy(?,?,?)
(where ?, ?, ? are the month, day and year passed in by my batch job)
then I don't think my index will be used - is that right?
I think I need to create an index something like this:
CREATE INDEX pw_idx ON password_audit(DATE(update_date));
- is that right?

Because you are forcing the server to convert the values to DATE rather than comparing them as DATETIME, it probably won't use the index.
You would do best to generate the SQL as:
SELECT *
FROM password_audit
WHERE update_date
BETWEEN DATETIME(2010-08-02 00:00:00.00000) YEAR TO FRACTION(5)
AND DATETIME(2010-08-02 23:59:59.99999) YEAR TO FRACTION(5)
That's rather verbose. Alternatively, and maybe slightly more easily:
SELECT *
FROM password_audit
WHERE update_date >= DATETIME(2010-08-02 00:00:00.00000) YEAR TO FRACTION(5)
AND update_date < DATETIME(2010-08-03 00:00:00.00000) YEAR TO FRACTION(5)
Both of these should be able to use the index on the update_date column. You can experiment with dropping some of the trailing zeroes from the literals, but I don't think you'll be able to remove them all - but see what the SET EXPLAIN ON output tells you.
Depending on your server version, you might need to run UPDATE STATISTICS after creating the index before the optimizer uses it at all; that is more of a problem on older (say 10.00 and earlier) versions of Informix than on the current (11.10 and later) versions.
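As a rough illustration of the half-open range approach, here is a minimal Python sketch that builds the two bounds for a given day and compares query plans. It uses SQLite (via the standard sqlite3 module) purely as a stand-in, so the plan text is SQLite's and not Informix's; on Informix you would check SET EXPLAIN ON output instead. The helper names day_bounds and plan are hypothetical.

```python
import sqlite3
from datetime import date, datetime, timedelta


def day_bounds(day):
    """Return (start, end) so that start <= ts < end covers exactly `day`."""
    start = datetime(day.year, day.month, day.day)
    return start, start + timedelta(days=1)


def plan(conn, sql, params=()):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row (SQLite only)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# SQLite stand-in for the password_audit table; timestamps stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE password_audit (username TEXT, update_date TEXT)")
conn.execute("CREATE INDEX pw_idx ON password_audit(update_date)")

start, end = day_bounds(date(2010, 8, 2))
bounds = (start.isoformat(sep=" "), end.isoformat(sep=" "))

# Predicate on the bare column: the index can be searched.
range_plan = plan(
    conn,
    "SELECT * FROM password_audit WHERE update_date >= ? AND update_date < ?",
    bounds,
)
# Predicate on a function of the column: the plain index is not usable.
wrapped_plan = plan(
    conn,
    "SELECT * FROM password_audit WHERE date(update_date) = ?",
    ("2010-08-02",),
)
print(range_plan)
print(wrapped_plan)
```

The batch job would pass the two bounds as parameters instead of month, day and year, which keeps the column itself free of function calls.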

I didn't see 'date_to_accounts_ni' defined in your password_audit table.
What datatype/length is it?
Your first index on password_audit.update_date is adequate; why would you want to index
(DATE(update_date))?

Related

I cannot understand why a query filtering on a datetime field is not working on SQL Server 2016

This is a very basic query:
select * from [dbo].[TestTable] where year(start_date)>2021
It returns no records. start_date is a datetime, and the table contains many records with that field populated and dates beyond 2021.
This query returns all the records of the table:
SELECT year(start_date), * FROM [dbo].[TestTable] order by start_date desc
This is the table structure: [image not included]
Another query with a strange result:
SELECT year(start_date), case when year(start_date)>2021 then 1 else 0 end, * FROM [ADS].[dbo].[TestTable] order by start_date desc
What can I check?
You are missing the = part of the operator. All the rows seem to have 2021-01-01, so the YEAR is not greater than 2021, it IS 2021.
You'd need to either do
select * from [dbo].[TestTable] where year(start_date)>=2021
or
select * from [dbo].[TestTable] where year(start_date)=2021
As Dan Guzman points out, as an additional improvement, you should avoid applying functions to columns in your WHERE clauses, because that generally prevents the use of indexes: the server has to execute the function against every single row in the table to determine whether it's a match.
If it's a small table in terms of record count, it's not a big deal, but if you're talking about tens of thousands or more, it will add up.
The alternatives would be to either filter by the original date value, or save the year as a separate field and add an index to it.
It's working fine. None of the results in the picture contain a record where year(start_date)>2021.
So I think you might be looking for the rows where the start date is greater than 2020
select * from [dbo].[TestTable] where year(start_date)>2020
OR
select * from [dbo].[TestTable] where year(start_date)>=2021
Or, as @Dan Guzman suggested, it would be better to use:
select * from [dbo].[TestTable] where start_date>='2021-01-01'
Nothing in that table has a year > 2021. There are dates greater than the start of the year, but when looking at just the years, they are equal to 2021, not greater.
To fix this, I'd remove the year() function completely. Calling a function on a column can be bad for performance, because it impacts the ability to use indexes on the column. This one is probably not awful (year() is deterministic, and so sometimes the index is still okay), but if there is a way to express a query without the function it's usually a good idea. In this case, we can do it like this:
select * from [dbo].[TestTable] where start_date >= '20210101'
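To make the comparison concrete, here is a small Python sketch with made-up rows. It uses SQLite through the standard sqlite3 module as a stand-in for SQL Server, so strftime('%Y', ...) plays the role of year(); the sample data and the count helper are assumptions for illustration only.

```python
import sqlite3

# SQLite stand-in for [dbo].[TestTable]; dates stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestTable (id INTEGER PRIMARY KEY, start_date TEXT)")
conn.executemany(
    "INSERT INTO TestTable (start_date) VALUES (?)",
    [
        ("2021-01-01 00:00:00",),
        ("2021-01-01 00:00:00",),
        ("2020-06-15 00:00:00",),
    ],
)


def count(where, params=()):
    """Count rows matching a WHERE fragment (fragment is trusted, demo only)."""
    sql = f"SELECT COUNT(*) FROM TestTable WHERE {where}"
    return conn.execute(sql, params).fetchone()[0]


# SQLite equivalent of year(start_date)
year_expr = "CAST(strftime('%Y', start_date) AS INTEGER)"

strictly_after_2021 = count(f"{year_expr} > 2021")   # year must exceed 2021
from_2021_onward = count(f"{year_expr} >= 2021")     # year 2021 or later
range_from_2021 = count("start_date >= ?", ("2021-01-01",))  # no function call

print(strictly_after_2021, from_2021_onward, range_from_2021)
```

The last form gives the same rows as the >= 2021 comparison while leaving the column untouched, which is the shape an index on start_date can serve.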

How to reset SQL Server 2008 Column based on years

I'm working on leave-management software, and my problem is that I need to reset the leave days to the default number of days (30 days) after one year. Would you please help me with that?
PS: I'm using VB.NET and SQL Server.
create table Addemployees
(
Fname varchar (500),
Lname varchar (500),
ID int not null identity(1, 1) primary key,
CIN varchar (500),
fromD date,
toD date,
Email varchar(500),
phone varchar(500),
Leave_num int
)
This is the table that contains the column Leave_num, which holds the leave numbers inserted by the user.
update addemployees
set leave_num = 30
As for how you trigger this logic, there are many ways you could go about it. You'll need some sort of scheduler, such as an Agent job or whatever else you have at your disposal, to run this process on a recurring schedule. The key thing is not to keep updating Leave_num if it has already been reset. You could maintain an extra column on each row indicating the last time it was reset. This is probably the simplest option, but if it's truly an all-or-nothing reset and those dates will all be the same, that's something of a waste of space.
Alternatively, you could either create a separate table that just records when these once-a-year jobs run, or use something like an Extended Property (which is a little more involved to set up).
Whatever solution you choose, just save the date (or even just the year). Then, when your process runs, if more than a year has passed since the last update (or if the year of the last update is earlier than the current year), run your update and then record the new date wherever you are storing that information, be it columns, a separate table, or an extended property.

T-SQL Select where Subselect or Default

I have a SELECT that retrieves ROWS comparing a DATETIME field to the highest available value of another TABLE.
The Two Tables have the following structure
DeletedRecords
- Id (Guid)
- RecordId (Guid)
- TableName (varchar)
- DeletionDate (datetime)
And another table, which keeps track of synchronizations, using the following structure
SynchronizationLog
- Id (Guid)
- SynchronizationDate (datetime)
In order to get all the RECORDS that have been deleted since the last synchronization, I run the following SELECT:
SELECT
[Id],[RecordId],[TableName],[DeletionDate]
FROM
[DeletedRecords]
WHERE
[TableName] = '[dbo].[Person]'
AND [DeletionDate] >
(SELECT TOP 1 [SynchronizationDate]
FROM [dbo].[SynchronizationLog]
ORDER BY [SynchronizationDate] DESC)
The problem occurs if I do not have any synchronizations yet: the T-SQL SELECT does not return any rows, while it should return all the rows because there are no synchronization records available.
Is there a T-SQL function like COALESCE that I can use with DateTime?
Your subquery should look something like this:
SELECT COALESCE(MAX([SynchronizationDate]), '0001-01-01')
FROM [dbo].[SynchronizationLog]
It says: get the last date, but if there is no record (or all values are NULL), then use the '0001-01-01' date as the start date.
NOTE: '0001-01-01' is for DATETIME2. If you are using the old DATETIME data type, it should be '1753-01-01'.
Also please note (from https://msdn.microsoft.com/en-us/library/ms187819(v=sql.100).aspx)
Use the time, date, datetime2 and datetimeoffset data types for new work. These types align with the SQL Standard. They are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications.
EDIT
An alternative solution is to use NOT EXISTS (you will have to test whether it performs better or not):
SELECT
[Id],[RecordId],[TableName],[DeletionDate]
FROM
[DeletedRecords] DR
WHERE
[TableName] = '[dbo].[Person]'
AND NOT EXISTS (
SELECT 1
FROM [dbo].[SynchronizationLog] SL
WHERE DR.[DeletionDate] <= SL.[SynchronizationDate]
)
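The behaviour of the three variants on an empty log can be checked with a small Python sketch. SQLite (standard sqlite3 module) is used here as a stand-in for SQL Server, so TOP 1 becomes LIMIT 1 and the tables are cut down to the relevant columns; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DeletedRecords (RecordId TEXT, DeletionDate TEXT)")
conn.execute("CREATE TABLE SynchronizationLog (SynchronizationDate TEXT)")
conn.executemany(
    "INSERT INTO DeletedRecords VALUES (?, ?)",
    [("a", "2015-03-01 10:00:00"), ("b", "2015-03-05 10:00:00")],
)

QUERIES = {
    # Original form: the subquery yields NULL when the log is empty.
    "original": """
        SELECT RecordId FROM DeletedRecords
        WHERE DeletionDate > (SELECT SynchronizationDate
                              FROM SynchronizationLog
                              ORDER BY SynchronizationDate DESC LIMIT 1)
        ORDER BY RecordId""",
    # COALESCE over MAX supplies a default when there is no log row.
    "coalesce": """
        SELECT RecordId FROM DeletedRecords
        WHERE DeletionDate > (SELECT COALESCE(MAX(SynchronizationDate),
                                              '0001-01-01')
                              FROM SynchronizationLog)
        ORDER BY RecordId""",
    # NOT EXISTS: keep rows with no synchronization at or after the deletion.
    "not_exists": """
        SELECT RecordId FROM DeletedRecords AS DR
        WHERE NOT EXISTS (SELECT 1 FROM SynchronizationLog AS SL
                          WHERE DR.DeletionDate <= SL.SynchronizationDate)
        ORDER BY RecordId""",
}


def run_all():
    """Run every variant and return {name: [RecordId, ...]}."""
    return {
        name: [row[0] for row in conn.execute(sql)]
        for name, sql in QUERIES.items()
    }


empty_log = run_all()
conn.execute("INSERT INTO SynchronizationLog VALUES ('2015-03-03 00:00:00')")
after_sync = run_all()
print(empty_log)
print(after_sync)
```

With no log rows, only the original form returns nothing; once a synchronization exists, all three agree.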

Proper way to index date & time columns

I have a table with the following structure:
CREATE TABLE MyTable (
ID int identity,
Whatever varchar(100),
MyTime time(2) NOT NULL,
MyDate date NOT NULL,
MyDateTime AS (DATEADD(DAY, DATEDIFF(DAY, '19000101', [MyDate]),
CAST([MyTime] AS DATETIME2(2))))
)
The computed column adds date and time into a single datetime2 field.
Most queries against the table have one or more of the following clauses:
... WHERE MyDate < @filter1 and MyDate > @filter2
... ORDER BY MyDate, MyTime
... ORDER BY MyDateTime
In a nutshell, date is usually used for filtering, and full datetime is used for sorting.
Now for questions:
What is the best way to set indices on those 3 date-time columns? 2 separate on date and time or maybe 1 on date and 1 on composite datetime, or something else? Quite a lot of inserts and updates occur on this table, and I'd like to avoid over-indexing.
As I wrote this question, I noticed the long and kind of ugly computed column definition. I picked it up from somewhere a while ago and forgot to investigate if there's a simpler way of doing it. Is there any easier way of combining a date and time2 into a datetime2? Simple addition does not work, and I'm not sure if I should avoid casting to varchar, combining and casting back.
Unfortunately, you didn't mention what version of SQL Server you're using ....
But if you're on SQL Server 2008 or newer, you should turn this around:
your table should have
MyDateTime DATETIME
and then define the "only date" column as
MyDate AS CAST(MyDateTime AS DATE) PERSISTED
Since you make it persisted, it's stored alongside the table data (and not calculated every time you query it), and you can easily index it now.
Same applies to the MyTime column.
Having date and time in two separate columns may seem peculiar but if you have queries that use only the date (and/or especially only the time part), I think it's a valid decision. You can create an index on date only or on time or on (date, whatever), etc.
What I don't understand is why you also have the computed datetime column. There's no reason to store this value, too. It can easily be calculated when needed.
And if you need to order by datetime, you can use ORDER BY MyDate, MyTime. With an index on (MyDate, MyTime) this should be ok. Range datetime queries would also be using that index.
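As a rough check of the composite-index idea, the Python sketch below compares query plans with and without an index on (MyDate, MyTime). It runs against SQLite (standard sqlite3 module) as a stand-in, so the plan wording is SQLite's and SQL Server's optimizer may decide differently; the index name IX_MyTable_Date_Time and the plan_details helper are hypothetical.

```python
import sqlite3

QUERY = (
    "SELECT * FROM MyTable "
    "WHERE MyDate > '2012-01-01' AND MyDate < '2012-02-01' "
    "ORDER BY MyDate, MyTime"
)


def plan_details(with_index):
    """Return SQLite's EXPLAIN QUERY PLAN detail strings for QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable ("
        "ID INTEGER PRIMARY KEY, Whatever TEXT, "
        "MyTime TEXT NOT NULL, MyDate TEXT NOT NULL)"
    )
    if with_index:
        # Composite index: filter on the leading column, sort on both.
        conn.execute(
            "CREATE INDEX IX_MyTable_Date_Time ON MyTable (MyDate, MyTime)"
        )
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return [row[3] for row in rows]


without_index = plan_details(with_index=False)
with_index = plan_details(with_index=True)
print(without_index)  # expect a full scan plus a separate sort step
print(with_index)     # expect an index search and no separate sort step
```

In this stand-in, the single index serves both the date range filter and the ORDER BY, which is the reason for leading with MyDate.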
The answer isn't in your indexing, it's in your querying.
A single DateTime field should be used, or even SmallDateTime if that provides the range of dates and time resolution required by your application.
Index that column, then use queries like this:
SELECT * FROM MyTable WHERE
MyDate >= @startfilterdate
AND MyDate < DATEADD(d, 1, @endfilterdate);
By using < on the end filter, it only includes results from before midnight at the start of the following day, which is the day after the user-selected "end date". This is simpler and more accurate than adding 23:59:59, especially since stored times can include fractional seconds between 23:59:59 and 00:00:00.
Using persisted columns and indexes on them is a waste of server resources.

Indexing on DateTime and VARCHAR fields in SQL Server 2000, which one is more efficient?

We have a CallLog table in Microsoft SQL Server 2000. The table contains CallEndTime field whose type is DATETIME, and it's an index column.
We usually delete free-of-charge calls and generate a monthly fee statistics report and a call detail record report. All the SQL statements use CallEndTime as the query condition in the WHERE clause. Because the CallLog table holds a lot of records, the queries are slow, so we want to optimize them, starting with indexing.
Question
Would it be more efficient to query on an extra indexed VARCHAR column, CallEndDate?
Such as
-- DATETIME based query
SELECT COUNT(*) FROM CallLog WHERE CallEndTime BETWEEN '2011-06-01 00:00:00' AND '2011-06-30 23:59:59'
-- VARCHAR based queries
SELECT COUNT(*) FROM CallLog WHERE CallEndDate BETWEEN '2011-06-01' AND '2011-06-30'
SELECT COUNT(*) FROM CallLog WHERE CallEndDate LIKE '2011-06%'
SELECT COUNT(*) FROM CallLog WHERE CallEndMonth = '2011-06'
It has to be the datetime. Dates are essentially stored as a number in the database so it is relatively quick to see if the value is between two numbers.
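One detail worth checking in the DATETIME-based query from the question is the upper bound of '2011-06-30 23:59:59', which leaves out any call that ended in the last second of the month with a fractional part. Here is a minimal Python sketch of the alternative half-open month range; the month_bounds helper and the sample timestamp are hypothetical.

```python
from datetime import datetime


def month_bounds(year, month):
    """Return (start, end) so that start <= ts < end covers the whole month."""
    start = datetime(year, month, 1)
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start, end


# A call that ended half a second before midnight on the last day of June.
late_call = datetime(2011, 6, 30, 23, 59, 59, 500000)

# Equivalent of BETWEEN '2011-06-01 00:00:00' AND '2011-06-30 23:59:59'
in_between = (
    datetime(2011, 6, 1) <= late_call <= datetime(2011, 6, 30, 23, 59, 59)
)

# Equivalent of CallEndTime >= start AND CallEndTime < end
start, end = month_bounds(2011, 6)
in_half_open = start <= late_call < end

print(in_between, in_half_open)
```

The two bounds can be passed straight to a query of the form CallEndTime >= ? AND CallEndTime < ?, which still compares the indexed DATETIME column directly.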
If I were you, I'd consider splitting the data over multiple tables (by month, year or whatever) and creating a view to combine the data from all those tables. That way, any functionality that needs the entire data set can use the view, and anything that only needs a month's worth of data can access the specific table, which will be a lot quicker as it will contain much less data.
I think comparing DateTime is much faster than LIKE operator.
I agree with DoctorMick on splitting your DateTime into persisted columns Year, Month, Day.
For your query that selects COUNT(*), check whether the execution plan contains a Table Lookup node. If so, this might be because your CallEndTime column is nullable, since you said that you have a [nonclustered] index on the CallEndTime column. If you make the column NOT NULL and rebuild that index, the count would be an index scan, which is not so slow, and I think you will get much faster results.
