We have approximately two dozen SQL databases in our Azure portal. I have been asked to evaluate these and see if any are no longer being used and can be dropped. I know how to query their original quotas and current sizes, but I am baffled as to how to query DTUs.
How can I query each DB and see when someone last logged in or initiated any queries?
Thank you!
The following query should give you an idea of whether a database has been used, based on resource consumption over the last 7 days. Note that sys.dm_db_resource_stats only retains about one hour of history; for a multi-day window, query sys.resource_stats from the master database instead, which keeps roughly 14 days and includes a database_name column:
SELECT *
FROM sys.resource_stats -- run while connected to master
WHERE --database_name = 'AdventureWorksLT' AND
    end_time > DATEADD(day, -7, GETDATE())
ORDER BY end_time DESC;
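To turn that into a quick per-database inventory, a minimal sketch along these lines lists the most recent recorded activity for every database, again using sys.resource_stats from master:

-- Sketch: most recent recorded resource activity per database (run in master)
SELECT database_name,
       MAX(end_time) AS last_recorded_activity,
       AVG(avg_cpu_percent) AS avg_cpu_percent
FROM sys.resource_stats
GROUP BY database_name
ORDER BY last_recorded_activity;

Databases whose last_recorded_activity is weeks old are good candidates for a closer look.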
I have a SQL Server Agent job on my system that copies data into a table for later evaluation purposes. The job runs on two schedules: every Friday, and on the last day of each month. The target records should also contain a column indicating which schedule originally triggered the job, but so far I have found no way to receive this information as a parameter or otherwise. I'm using Microsoft SQL Server 2017.
I did a web search, but maybe I searched for the wrong keywords. I also thought about comparing the current time to the expected runtime of each schedule, but that did not seem like a fault-tolerant option to me.
I'd like to fill a column "schedule" with values like "End of week" and "End of month".
The sys tables in msdb are your friend here (see the documentation).
sysjobs has your job information.
sysjobschedules links your job to its schedule.
sysschedules has your schedule info.
SELECT j.*
    , s.*
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobschedules js ON j.job_id = js.job_id
JOIN msdb.dbo.sysschedules s ON js.schedule_id = s.schedule_id
WHERE j.name = 'your job name here'
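Since sysschedules exposes the schedule type, one hedged sketch for labelling rows the way you describe (freq_type 8 = weekly and 32 = monthly-relative per the sysschedules documentation; the labels themselves are illustrative):

SELECT j.name AS job_name
    , s.name AS schedule_name
    , CASE
          WHEN s.freq_type = 8  THEN 'End of week'  -- weekly (e.g. every Friday)
          WHEN s.freq_type = 32 THEN 'End of month' -- monthly, relative (e.g. last day)
          ELSE 'Other'
      END AS schedule_label
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobschedules js ON j.job_id = js.job_id
JOIN msdb.dbo.sysschedules s ON js.schedule_id = s.schedule_id
WHERE j.name = 'your job name here'

Note this tells you which schedules are attached to the job, not which one fired a given run; for that, see the answer below.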
After a long search and much analysis I finally found a solution that at least fits my needs:
The undocumented and unsupported stored procedure below reports the schedule that triggered a job in the column Request Source ID:
EXEC master.dbo.xp_sqlagent_enum_jobs 1, garbage
see also: https://am2.co/2016/02/xp_sqlagent_enum_jobs_alt/
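As a sketch of how to capture that output for use in a job step, based on the result-set shape described in the linked article (since the procedure is undocumented, treat the column list as an assumption):

-- Temp table matching the procedure's result set (shape per the linked article)
CREATE TABLE #xp_results (
    job_id                UNIQUEIDENTIFIER NOT NULL,
    last_run_date         INT,
    last_run_time         INT,
    next_run_date         INT,
    next_run_time         INT,
    next_run_schedule_id  INT,
    requested_to_run      INT,
    request_source        INT,
    request_source_id     sysname NULL,
    running               INT,
    current_step          INT,
    current_retry_attempt INT,
    job_state             INT
);

INSERT INTO #xp_results
EXEC master.dbo.xp_sqlagent_enum_jobs 1, garbage; -- 1 = sysadmin view; 2nd argument is ignored

-- request_source_id identifies the schedule that triggered the currently running job
SELECT job_id, request_source_id
FROM #xp_results;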
I have an OrderProduct table with these columns and some data:
-order_number : ORDER01
-customer_name : Jackie
-order_status : Wait For Payment
-datetime_order_status : 25-01-2020 15:30:00
-datetime_transfer_notify : NULL
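For reference, a minimal sketch of that table; the data types are assumptions inferred from the sample values:

CREATE TABLE OrderProduct (
    order_number             varchar(20),
    customer_name            nvarchar(100),
    order_status             varchar(30),
    datetime_order_status    datetime,
    datetime_transfer_notify datetime NULL
);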
A customer needs to send a transfer notification in my order product system within 24 hours; if they do not, SQL Server should automatically update the 'order_status' column from 'Wait For Payment' to 'Cancel'.
How can I do that?
I believe the easiest way to do this is with a SQL Agent job (MS Docs). This is very dependent on the architecture and size of your databases and tables, but it would definitely get the job done. Depending on how sensitive the business is to being up to date, you could set the job to run every 1 minute, every 5 minutes, or any other time interval you would like. If I were going to do this, I would use a query along the lines of the following:
UPDATE OrderProduct
SET order_status = 'Cancel'
WHERE datetime_order_status < DATEADD(DAY, -1, GETDATE())
  AND order_status = 'Wait For Payment';
Along with this, I would use something like SQL Server Management Studio to create a SQL Agent job on that server that ran at the interval you'd like, similar to this (Stack Overflow). Here (Stack Exchange DBA) is a very similar question to yours for MySQL as added reference.
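For completeness, here is a minimal sketch of creating such a job in T-SQL rather than through the SSMS UI. The job name, step name, schedule name, five-minute interval, and the database name YourDatabase are all placeholder assumptions:

USE msdb;

-- Create the job with a single T-SQL step
EXEC dbo.sp_add_job @job_name = N'CancelUnpaidOrders';

EXEC dbo.sp_add_jobstep
    @job_name = N'CancelUnpaidOrders',
    @step_name = N'Cancel stale orders',
    @subsystem = N'TSQL',
    @database_name = N'YourDatabase', -- placeholder: your database
    @command = N'UPDATE OrderProduct
                 SET order_status = ''Cancel''
                 WHERE datetime_order_status < DATEADD(DAY, -1, GETDATE())
                   AND order_status = ''Wait For Payment'';';

-- Run every 5 minutes, all day, every day
EXEC dbo.sp_add_jobschedule
    @job_name = N'CancelUnpaidOrders',
    @name = N'Every5Minutes',
    @freq_type = 4,            -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,     -- unit: minutes
    @freq_subday_interval = 5;

-- Register the job on the local server
EXEC dbo.sp_add_jobserver @job_name = N'CancelUnpaidOrders';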
I am using the following query to retrieve query history from my Snowflake database.
SELECT *
FROM table(MY_DATABASE.information_schema.query_history(
end_time_range_start => dateadd(HOUR, -4, current_timestamp()),
end_time_range_end => current_timestamp()
));
Oddly, if the warehouse (size: XS) I am using gets suspended after a period of inactivity, then the next time I attempt to retrieve query history, the history from before the warehouse's suspension is gone.
I could not find anything documented to explain this.
Anyone run into this issue or related documentation that could explain this?
Thank you!
I can't explain exactly the limitations of the INFORMATION_SCHEMA table function you are running (some of those functions only return around 10,000 rows, or, as you said, lose history once the warehouse turns off), but it is a limited view into the actual query history. You can use the SNOWFLAKE database for the full query history.
It's a massive table so make sure you put filters on it. Here's an example query to access it:
USE DATABASE snowflake;
USE SCHEMA account_usage;
SELECT *
FROM query_history
WHERE start_time BETWEEN '2020-01-01 00:00' AND '2020-01-03 00:00'
AND DATABASE_NAME = 'DATABASE_NAME'
AND USER_NAME = 'USERNAME'
ORDER BY START_TIME DESC;
1: Your question says "after a period of inactivity" but does not specify how long that period is.
"after a period of inactivity, the next time I attempt to retrieve query history- the history that was there prior to the warehouse's suspension is gone."
If it's beyond 7 days, the data can be found in the ACCOUNT_USAGE schema. Below is a link describing the differences between INFORMATION_SCHEMA and ACCOUNT_USAGE.
https://docs.snowflake.com/en/sql-reference/account-usage.html#differences-between-account-usage-and-information-schema
2: Your query does not specify USER_NAME or WAREHOUSE_NAME, so it could be that the queries you ran before the warehouse suspension have simply aged out of the 4-hour window in your predicate. Try increasing the time period and check whether the behaviour still exists.
3: In general it's not advisable to query INFORMATION_SCHEMA for query history unless your application requires data without any latency. If possible, use ACCOUNT_USAGE to get query history information.
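As a sketch, the same 4-hour window against ACCOUNT_USAGE would look like this (note that ACCOUNT_USAGE.QUERY_HISTORY has ingestion latency, documented at up to about 45 minutes, so the very latest queries may not appear yet):

SELECT query_id, user_name, warehouse_name, start_time, end_time, query_text
FROM snowflake.account_usage.query_history
WHERE end_time > DATEADD(HOUR, -4, CURRENT_TIMESTAMP())
ORDER BY end_time DESC;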
Here is what I did.
1: Created an XS warehouse
2: Set auto_suspend to 5 minutes
3: Ran a few queries
4: Ran your query (which does not specify user_name or warehouse_name), meaning it searches the history of all users.
SELECT *
FROM table(MY_DATABASE.information_schema.query_history(
end_time_range_start => dateadd(HOUR, -4, current_timestamp()),
end_time_range_end => current_timestamp()
));
5: The query returned a few hundred records.
6: Added a WHERE clause to check for the data of my user, which had run a few queries before the warehouse's auto_suspend; it returned a few records.
SELECT *
FROM table(MY_DATABASE.information_schema.query_history(
end_time_range_start => dateadd(HOUR, -4, current_timestamp()),
end_time_range_end => current_timestamp()
))
WHERE USER_NAME = 'ADITYA';
7: Waited 10 minutes so that my warehouse was auto-suspended.
8: Repeated points 5 and 6; again they returned records as expected.
I have a simple SQL query to count the number of telemetry records by client within the last 24 hours.
With an index on TimeStamp, the following query runs in less than 1 second for about 10k rows:
SELECT MachineName, COUNT(Message)
FROM Telemetry
WHERE TimeStamp BETWEEN DATEADD(HOUR, -24, GETUTCDATE()) AND GETUTCDATE()
GROUP BY MachineName;
However, when I tried making the hard-coded -24 configurable by adding a variable, the query took more than 5 minutes to execute.
DECLARE @cutoff int;
SET @cutoff = 24;

SELECT MachineName, COUNT(Message)
FROM Telemetry
WHERE TimeStamp BETWEEN DATEADD(HOUR, -1 * @cutoff, GETUTCDATE()) AND GETUTCDATE()
GROUP BY MachineName;
Is there any specific reason for the significant decrease in performance? What's the best way to add a variable without impacting performance?
My guess is that you also have an index on MachineName - or that SQL is deciding that since it needs to group by MachineName, that would be a better way to access the records.
Updating statistics as suggested by AngularRat is a good start - but SQL often maintains those automatically. (In fact, the good performance when SQL knows the 24 hour interval in advance is evidence that the statistics are good...but when SQL doesn't know the size of the BETWEEN in advance, then it thinks other approaches might be a better idea).
Given:
CREATE TABLE Telemetry (MachineName sysname, Message varchar(88), [TimeStamp] datetime2);
CREATE INDEX Telemetry_TS ON Telemetry([TimeStamp]);
First, try the OPTION (OPTIMIZE FOR (@cutoff = 24)) clause to let SQL know how to approach the query; if that is insufficient, then try WITH (INDEX(Telemetry_TS)). Using the INDEX hint is less desirable.
DECLARE @cutoff int = 24;

SELECT MachineName, COUNT(Message)
FROM Telemetry -- WITH (INDEX(Telemetry_TS))
WHERE TimeStamp BETWEEN DATEADD(HOUR, -1 * @cutoff, GETUTCDATE()) AND GETUTCDATE()
GROUP BY MachineName
OPTION (OPTIMIZE FOR (@cutoff = 24));
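An alternative sketch: OPTION (RECOMPILE) lets the optimizer see the variable's actual runtime value on each execution, at the cost of compiling the statement every time it runs:

DECLARE @cutoff int = 24;

SELECT MachineName, COUNT(Message)
FROM Telemetry
WHERE TimeStamp BETWEEN DATEADD(HOUR, -1 * @cutoff, GETUTCDATE()) AND GETUTCDATE()
GROUP BY MachineName
OPTION (RECOMPILE); -- plan is built using the current value of @cutoff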
Your parameter should actually work, but you MIGHT be seeing an issue where the database is using out-of-date statistics for the query plan. I'd try updating statistics for the table you are querying. Something like:
UPDATE STATISTICS TableName;
Additionally, if your code is running from within a stored procedure, you might want to recompile the procedure. Something like:
EXEC sp_recompile N'ProcedureName';
A lot of times when I have a query that seems like it should run a lot faster but isn't, it's an out-of-date statistics/query-plan issue.
References:
https://msdn.microsoft.com/en-us/library/ms187348.aspx
https://msdn.microsoft.com/en-us/library/ms190439.aspx
I have a situation where a query might be called multiple times by multiple users, but I only want it to run once per week against the database. The environment is SQL Server Express, so scheduling via SQL Server Agent is not an option. It needs to be 2005-compatible. I'd also like to make it as lightweight as possible, so I'm asking for suggestions. Ideally I'd use a database-wide declared variable, but I don't think SQL Server supports such a beast? Thanks
Try something like this:
IF NOT EXISTS ( -- Check if you have the current week content
SELECT *
FROM WeeklyTable
WHERE
DATEPART(YEAR, DateCr) = DATEPART(YEAR, GETDATE())
AND
DATEPART(WEEK, DateCr) = DATEPART(WEEK, GETDATE())
)
BEGIN
-- delete old content
DELETE FROM WeeklyTable
-- insert new content
INSERT INTO WeeklyTable (MyID, MyField1, ... , MyFieldN, DateCr)
SELECT
MyID, MyField1, ... , MyFieldN, GETDATE()
FROM MainTable
END
You can create indexes you need for the WeeklyTable.
One option would be SQL Scheduler as an add-on to SQL Server Express.
The other option would be to create a small command-line utility that does the querying and schedule that using the Windows Scheduler on the machine where SQL Server Express is installed.
With either of the two setups, you could select the values / numbers you need into a result table once a week, and any requests during the week would be satisfied from that one result table. SQL Server doesn't have "server-wide" variables - but you can always define a table for that purpose...
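To make the "table as a server-wide variable" idea concrete, here is a minimal 2005-compatible sketch; the table and setting names are placeholders:

-- One row per named setting; acts as a poor man's server-wide variable
CREATE TABLE dbo.AppSetting (
    SettingName sysname NOT NULL PRIMARY KEY,
    LastRun     datetime NOT NULL
);

-- Wrap the weekly work in a check against the stored timestamp
IF NOT EXISTS (
    SELECT *
    FROM dbo.AppSetting
    WHERE SettingName = 'WeeklyRefresh'
      AND DATEPART(YEAR, LastRun) = DATEPART(YEAR, GETDATE())
      AND DATEPART(WEEK, LastRun) = DATEPART(WEEK, GETDATE())
)
BEGIN
    -- ... do the once-a-week work here ...

    -- record that this week's run has happened
    UPDATE dbo.AppSetting SET LastRun = GETDATE() WHERE SettingName = 'WeeklyRefresh';
    IF @@ROWCOUNT = 0
        INSERT INTO dbo.AppSetting (SettingName, LastRun) VALUES ('WeeklyRefresh', GETDATE());
END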