I'm trying to improve monitoring of task scheduler software by querying a table with an events column that contains the steps from the task scheduler log: task initiated, task stopped, task triggered manually, etc.
I know that if a specific order of events is met, the task was executed fully and successfully.
Say the optimal order of events for a task is:
task started
task processing
job task finished
Now if, for a particular task, I have the following order of events:
task started
start triggered manually
task processing
job task finished
this order of events is not optimal, because the event start triggered manually is not part of the optimal order of events - I want to flag this task.
And if, for a particular task, the following order of events occurs:
task started
job task finished
this order of events is not optimal either, because the event task processing is missing - I want to flag this task too.
I get the optimal order of events using the following query:
select t.events from
(SELECT distinct events FROM [jobmonitoring]) t
ORDER BY (case when t.events = 'task started' then 1
               when t.events = 'task processing' then 2
               when t.events = 'job task finished' then 3
               else 4 end)
I'm stuck on flagging the tasks that do not follow this particular order of events.
The flagging must respect these 3 events in that specific order.
The desired output would look something like this:

task  flag
a     null
b     null
c     flagged
d     null
e     flagged

Tasks c and e do not follow the optimal event ordering.
The table jobmonitoring looks like this:

task  events             timestamp
c     task started       28072022 1205
c     job task finished  28072022 1305
e     task started       28072021 1005
e     job task finished  28072021 1105
e     task processing    28072021 1205
a     task started       21072021 0905
a     task processing    21072021 1005
a     job task finished  21072021 1205
You can add a ROW_NUMBER, then group by task and use conditional aggregation:
SELECT
  t.task,
  -- COUNT(*) <> 3 flags tasks with missing or extra events; without it a missing
  -- rn would compare NULL <> '...', which evaluates to unknown rather than true
  flag = CASE WHEN COUNT(*) <> 3
                OR MAX(CASE WHEN t.rn = 1 THEN t.events END) <> 'task started'
                OR MAX(CASE WHEN t.rn = 2 THEN t.events END) <> 'task processing'
                OR MAX(CASE WHEN t.rn = 3 THEN t.events END) <> 'job task finished'
              THEN 'flagged' END
FROM (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY t.task ORDER BY t.timestamp)
    FROM jobmonitoring t
) t
GROUP BY
  t.task;
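If you prefer to compare the whole sequence at once, here is an equivalent sketch (assuming SQL Server 2017+ for STRING_AGG; misordered, missing, and extra events all break the concatenated string):

SELECT jm.task,
       -- build the per-task event sequence in timestamp order and compare it
       -- against the single optimal sequence
       flag = CASE WHEN STRING_AGG(jm.events, ' > ') WITHIN GROUP (ORDER BY jm.timestamp)
                        <> 'task started > task processing > job task finished'
                   THEN 'flagged' END
FROM jobmonitoring jm
GROUP BY jm.task;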
We've set up a stream on a table that is continuously loaded via Snowpipe.
We're consuming this data with a task that runs every minute and merges into another table. There is a possibility of duplicate keys, so we use a ROW_NUMBER() window function ordered by the file created timestamp descending, keeping row_num = 1. This way we always get the latest insert.
Initially we used a standard task with the merge statement, but we noticed that in some instances we were updating rows with older data, since Snowpipe does not guarantee loading files in the order they were staged. As such, we added a condition to the WHEN MATCHED section so that a row is only updated when the incoming file created timestamp is greater than the existing one.
However, since we did that, reconciliation checks show that some new inserts are missing. I don't know for sure why changing the MATCHED clause would interfere with the NOT MATCHED clause.
My theory was that the extra clause added a bit of time to the task run, so that some runs were skipped or the next run happened almost immediately after the last one completed. The idea is that the missing rows were caught in the middle and the stream offset advanced before they could be consumed.
As such, we changed the task to call a stored procedure which uses an explicit transaction, because the docs seem to suggest that using a transaction will lock the stream. However, even with this, new inserts are still missing. We're talking very small numbers, e.g. 8 out of 100,000s.
Any ideas what might be happening?
Example task code below (not the stored procedure version; the CREATE TASK line uses a placeholder name):
CREATE OR REPLACE TASK my_merge_task -- hypothetical name; omitted in the original snippet
WAREHOUSE = TASK_WH
SCHEDULE = '1 minute'
WHEN SYSTEM$STREAM_HAS_DATA('my_stream')
AS
MERGE INTO processed_data pd USING (
select
ms.*,
CASE WHEN ms.status IS NULL THEN 1/mv.count ELSE NULL END as pending_count,
CASE WHEN ms.status='COMPLETE' THEN 1/mv.count ELSE NULL END as completed_count
from my_stream ms
JOIN my_view mv ON mv.id = ms.id
qualify
row_number() over (
partition by
id
order by
file_created DESC
) = 1
) ms ON ms.id = pd.id
WHEN NOT MATCHED THEN INSERT (col1, col2, col3,... )
VALUES (ms.col1, ms.col2, ms.col3,...)
WHEN MATCHED AND ms.file_created >= pd.file_created THEN UPDATE SET pd.col1 = ms.col1, pd.col2 = ms.col2, pd.col3 = ms.col3, ....
;
I am not fully sure what is going wrong here, but the recommendation about the file created time is given by Snowflake somewhere. It suggests that the file created timestamp is calculated in the cloud services layer and may be a bit different from what you expect. There is another recommendation related to Snowpipe and data ingestion: the queue service takes about a minute to consume the data from the pipe, and if you have a lot of data flowing in within a minute, you may end up with this issue. Look at your implementation, simulate whether pushing data at 1-minute intervals solves the issue, and don't rely on the file create time.
The condition "AND ms.file_created >= pd.file_created" seems to have been added as a mechanism to avoid updating the same row multiple times.
An alternative approach could be to use IS DISTINCT FROM to compare the source against the target columns (except id):
MERGE INTO processed_data pd USING (
select
ms.*,
CASE WHEN ms.status IS NULL THEN 1/mv.count ELSE NULL END as pending_count,
CASE WHEN ms.status='COMPLETE' THEN 1/mv.count ELSE NULL END as completed_count
from my_stream ms
JOIN my_view mv ON mv.id = ms.id
qualify
row_number() over (
partition by
id
order by
file_created DESC
) = 1
) ms ON ms.id = pd.id
WHEN NOT MATCHED THEN INSERT (col1, col2, col3,... )
VALUES (ms.col1, ms.col2, ms.col3,...)
WHEN MATCHED
AND (pd.col1, pd.col2,..., pd.coln) IS DISTINCT FROM (ms.col1, ms.col2,..., ms.coln)
THEN UPDATE SET pd.col1 = ms.col1, pd.col2 = ms.col2, pd.col3 = ms.col3, ....;
This approach will also prevent updating a row when nothing has changed.
I have a task scheduled to run every 15 minutes:
CREATE OR REPLACE TASK mytask
WAREHOUSE = 'SHARED_WH_MEDIUM'
SCHEDULE = '15 MINUTE'
STATEMENT_TIMEOUT_IN_SECONDS = 3600,
QUERY_TAG = 'KLIPFOLIO'
AS
CREATE OR REPLACE TABLE mytable AS
SELECT * from xxx;
alter task mytask resume;
I see from the output of task_history() that the task is SCHEDULED:
select * from table(aftonbladet.information_schema.task_history(task_name => 'MYTASK')) order by scheduled_time;
QUERY_ID | NAME   | DATABASE_NAME | SCHEMA_NAME | QUERY_TEXT | CONDITION_TEXT | STATE     | ERROR_CODE | ERROR_MESSAGE | SCHEDULED_TIME                | COMPLETED_TIME | RETURN_VALUE
***      | MYTASK | ***           | ***         | ***        |                | SCHEDULED |            |               | 2020-01-21 09:58:12.434 +0100 |                |
but I want it to run right now without waiting for the SCHEDULED_TIME , is there any way to accomplish that?
Snowflake now supports running tasks manually. Just use the EXECUTE TASK command:
EXECUTE TASK manually triggers an asynchronous single run of a scheduled task (either a standalone task or the root task in a task tree) independent of the schedule defined for the task. A successful run of a root task triggers a cascading run of child tasks in the tree as their precedent task completes, as though the root task had run on its defined schedule.
Also, there is no need for the task to be in a started state; even tasks in suspended mode can be executed manually.
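For the task in the question, that's simply:

EXECUTE TASK mytask;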
There is no way currently to execute a task manually. You could, however, alter the task schedule to 1 minute, let it run, and then alter it back to 15 minutes, so that you're not waiting the full 15 minutes. I have seen this request multiple times, and there is an Idea on the Snowflake Lodge (https://community.snowflake.com/s/ideas) that you should upvote (search for 'Tasks'; I think it'll be one of the top ideas). Since Tasks are still in public preview, it's likely that these types of ideas will be reviewed and prioritized if they get a lot of votes.
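A sketch of that schedule-swap workaround (assuming the task from the question; a task generally must be suspended before its schedule can be altered):

ALTER TASK mytask SUSPEND;
ALTER TASK mytask SET SCHEDULE = '1 minute';
ALTER TASK mytask RESUME;

-- once it has run, restore the original schedule
ALTER TASK mytask SUSPEND;
ALTER TASK mytask SET SCHEDULE = '15 minute';
ALTER TASK mytask RESUME;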
To build on Mike's answer:
You can have a task that executes every minute, but only if there's data on the stream!
For this you can create a table and a stream whose only purpose is to decide whether the task is triggered each minute.
The root task should delete the data from that table, consuming the stream so the task doesn't run again.
You can then have dependent tasks that execute every time you bring data into the stream, but only when the stream has new data.
This relies on the ability to run a task only when SYSTEM$STREAM_HAS_DATA() returns true:
-- stream so this task executes every minute, but only if there's new data
create table just_timestamps_stream_table(value varchar);
create stream just_timestamps_stream on table just_timestamps_stream_table;
-- https://docs.snowflake.com/en/user-guide/tasks-intro.html
create or replace task mytask_minute
warehouse = test_small
schedule = '1 MINUTE'
when SYSTEM$STREAM_HAS_DATA('just_timestamps_stream')
as
-- consume stream so tasks doesn't execute again
delete from just_timestamps_stream_table;
-- the real task to be executed
create or replace task mytask_minute_child1
warehouse = test_small
after mytask_minute
as
insert into just_timestamps values(current_timestamp, 'child1');
Full example:
https://github.com/fhoffa/snowflake_snippets/blob/main/stream_and_tasks/minimal.sql
I have an SSIS package deployed in SQL Server, and there are 3 different SQL Server Agent jobs that run this package in different steps and on different schedules.
My question is: if the package shows as failed in Integration Services Catalogs -> Reports for one of the executions, is there a way to identify which job ran the execution that caused the package to fail (other than cross-checking the time of failure between the job history and the failed execution time)?
It is not very straightforward. Based on this Stack Exchange answer, you may try:
SELECT
history.*
,ex.*
,ex.status
, CASE ex.status
WHEN 1 THEN 'created'
WHEN 2 THEN 'running'
WHEN 3 then 'canceled'
WHEN 4 then 'failed'
WHEN 5 then 'pending'
WHEN 6 then 'ended unexpectedly'
WHEN 7 then 'succeeded'
WHEN 8 then 'stopping'
WHEN 9 then 'completed'
END as job_status
FROM (
SELECT
h.step_name,
-- h.message,
h.run_status,
h.run_date,
h.run_time,
SUBSTRING(h.message, NULLIF(CHARINDEX('Execution ID: ', h.message),0)+14 ,PATINDEX('%[^0-9]%',SUBSTRING(h.message, NULLIF(CHARINDEX('Execution ID: ', h.message),0)+14 ,20))-1) ExecutionId
FROM MSDB.DBO.SYSJOBHISTORY h) history
LEFT JOIN
SSISDB.CATALOG.EXECUTIONS ex on ex.execution_id = history.ExecutionId
WHERE project_name = '<ssisdb_project_name_here>'
It has many columns, which you can trim by replacing the * selections. The important part is to join MSDB.DBO.SYSJOBHISTORY with SSISDB.CATALOG.EXECUTIONS on the Execution ID parsed out of the job history message.
Also, this works for the project deployment model of SSIS, not the package deployment model.
I have around 40 different SQL Server jobs in one instance. They all have different schedules. Some run once a day, some every two minutes, some every five minutes. If I need to stop SQL Server Agent, how can I find the best time when no jobs are running so I won't interrupt any of my jobs?
how can I find the best time when no jobs are running so I won't interrupt any of my jobs?
You basically want to find a good window to perform some maintenance. @MaxVernon has blogged about it here with a handy script:
/*
Shows gaps between agent jobs
-- http://www.sqlserver.science/tools/gaps-between-sql-server-agent-jobs/
-- requires SQL Server 2012+ since it uses the LAG window function.
Note: On SQL Server 2005, SQL Server 2008, and SQL Server 2008 R2, you could replace the LastEndDateTime column definition with:
LastEndDateTime = (SELECT TOP(1) s1a.EndDateTime FROM s1 s1a WHERE s1a.rn = s1.rn - 1)
*/
DECLARE @EarliestStartDate DATETIME;
DECLARE @LatestStopDate DATETIME;
SET @EarliestStartDate = DATEADD(DAY, -1, GETDATE());
SET @LatestStopDate = GETDATE();
;WITH s AS
(
SELECT StartDateTime = msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time)
, MaxDuration = MAX(sjh.run_duration)
FROM msdb.dbo.sysjobs sj
INNER JOIN msdb.dbo.sysjobhistory sjh ON sj.job_id = sjh.job_id
WHERE sjh.step_id = 0
AND msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) >= @EarliestStartDate
AND msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) <= @LatestStopDate
GROUP BY msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time)
UNION ALL
SELECT StartDateTime = DATEADD(SECOND, -1, @EarliestStartDate)
, MaxDuration = 1
UNION ALL
SELECT StartDateTime = @LatestStopDate
, MaxDuration = 1
)
, s1 AS
(
SELECT s.StartDateTime
, EndDateTime = DATEADD(SECOND, s.MaxDuration - ((s.MaxDuration / 100) * 100)
+ (((s.MaxDuration - ((s.MaxDuration / 10000) * 10000))
- (s.MaxDuration - ((s.MaxDuration / 100) * 100))) / 100) * 60
+ (((s.MaxDuration - ((s.MaxDuration / 1000000) * 1000000))
- (s.MaxDuration - ((s.MaxDuration / 10000) * 10000))) / 10000) * 3600, s.StartDateTime)
FROM s
)
, s2 AS
(
SELECT s1.StartDateTime
, s1.EndDateTime
, LastEndDateTime = LAG(s1.EndDateTime) OVER (ORDER BY s1.StartDateTime)
FROM s1
)
SELECT GapStart = CONVERT(DATETIME2(0), s2.LastEndDateTime)
, GapEnd = CONVERT(DATETIME2(0), s2.StartDateTime)
, GapLength = CONVERT(TIME(0), DATEADD(SECOND, DATEDIFF(SECOND, s2.LastEndDateTime, s2.StartDateTime), 0))
FROM s2
WHERE s2.StartDateTime > s2.LastEndDateTime
ORDER BY s2.StartDateTime;
The question title scared me a bit - I thought you wanted to programmatically shut the SQL Server agent down anytime there were no jobs running. My answer to that question would be "Why?" There is no need to.
But if you are just looking to do a planned restart or shutdown, and you don't have a third-party tool like SentryOne's SQL Sentry Event Manager for visualization, I would just let SQL Server Agent Job History and the Job Activity Monitor help here. The Job Activity Monitor can show you which jobs are running right now in the status column. You can also see the last execute and next execute dates and times.
In the object browser in SSMS, connect to your instance, then expand SQL Server Agent, then you'll see Jobs and under that you'll see "Job Activity Monitor" - this view should show you what you need.
Also - don't worry about shutting down right before a job executes. If you do, the job will simply miss its schedule; you can let it run when it is next due (depending on the job and its purpose), or you can manually right-click and execute the job.
For more on the activity monitor for jobs, see Monitor Job Activity in the product documentation.
I recommend creating a script that will disable your jobs. Disabled jobs still exist but will not be launched automatically by their schedules. Run such a script (based on the procedure sp_update_job in the msdb database) to disable jobs, wait for any currently running jobs to finish, then stop SQL Agent. A similar script to re-enable the disabled jobs would be useful. You might need to plan around jobs that are, and should remain, disabled.
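A minimal sketch of what that disable script might look like (an assumption, not a drop-in solution: it disables every currently enabled job, so capture the list first so you can re-enable the same set afterwards):

DECLARE @sql NVARCHAR(MAX) = N'';

-- build one sp_update_job call per currently enabled job
SELECT @sql += N'EXEC msdb.dbo.sp_update_job @job_name = N'''
             + REPLACE(name, '''', '''''') + N''', @enabled = 0;' + CHAR(10)
FROM msdb.dbo.sysjobs
WHERE enabled = 1;

PRINT @sql;  -- inspect the generated commands before running them
EXEC (@sql);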
A complete “SQL Agent shutdown” process could be fully scripted, but I question the wisdom of doing so. A bit of research implies that there is no 100% reliable way of programmatically telling whether a given job is running, and while there is an undocumented (where "undocumented" means "you really shouldn't be using this") system procedure for stopping and starting services, doing so from within SQL Server itself seems like a pretty bad idea.
You can query the system tables as shown by Dattatrey Sindol in the MSSQLTips.com article Querying SQL Server Agent Job Information:
SELECT
[sJOB].[job_id] AS [JobID]
, [sJOB].[name] AS [JobName]
, [sDBP].[name] AS [JobOwner]
, [sCAT].[name] AS [JobCategory]
, [sJOB].[description] AS [JobDescription]
, CASE [sJOB].[enabled]
WHEN 1 THEN 'Yes'
WHEN 0 THEN 'No'
END AS [IsEnabled]
, [sJOB].[date_created] AS [JobCreatedOn]
, [sJOB].[date_modified] AS [JobLastModifiedOn]
, [sSVR].[name] AS [OriginatingServerName]
, [sJSTP].[step_id] AS [JobStartStepNo]
, [sJSTP].[step_name] AS [JobStartStepName]
, CASE
WHEN [sSCH].[schedule_uid] IS NULL THEN 'No'
ELSE 'Yes'
END AS [IsScheduled]
, [sSCH].[schedule_uid] AS [JobScheduleID]
, [sSCH].[name] AS [JobScheduleName]
, CASE [sJOB].[delete_level]
WHEN 0 THEN 'Never'
WHEN 1 THEN 'On Success'
WHEN 2 THEN 'On Failure'
WHEN 3 THEN 'On Completion'
END AS [JobDeletionCriterion]
FROM
[msdb].[dbo].[sysjobs] AS [sJOB]
LEFT JOIN [msdb].[sys].[servers] AS [sSVR]
ON [sJOB].[originating_server_id] = [sSVR].[server_id]
LEFT JOIN [msdb].[dbo].[syscategories] AS [sCAT]
ON [sJOB].[category_id] = [sCAT].[category_id]
LEFT JOIN [msdb].[dbo].[sysjobsteps] AS [sJSTP]
ON [sJOB].[job_id] = [sJSTP].[job_id]
AND [sJOB].[start_step_id] = [sJSTP].[step_id]
LEFT JOIN [msdb].[sys].[database_principals] AS [sDBP]
ON [sJOB].[owner_sid] = [sDBP].[sid]
LEFT JOIN [msdb].[dbo].[sysjobschedules] AS [sJOBSCH]
ON [sJOB].[job_id] = [sJOBSCH].[job_id]
LEFT JOIN [msdb].[dbo].[sysschedules] AS [sSCH]
ON [sJOBSCH].[schedule_id] = [sSCH].[schedule_id]
ORDER BY [JobName]
We have different SSIS packages that we use in daily tasks (updates, ETL, ...), and we have a somewhat complicated structure where a package calls other packages. There are about 10 principal jobs that call secondary ones. These 10 jobs always report success, even if a step fails, so they don't block other executions. We would like to retrieve the steps (and their statuses) related to these jobs via a SQL query, but we couldn't join the steps to their calling jobs while also retrieving the status (the step status in this case, not the job status).
I searched a lot on the net and I always find a script that joins the steps and calling jobs without the status, or the steps and status without knowing which job is calling (for example this link and this one).
So to sum it all up, we are trying to write a query that joins the jobs, their statuses and their parent job.
Any help in this matter would be really appreciated, and thanks in advance.
EDIT
Thanks to the link in @BaconBits' comment, I was able to create a query joining three tables (msdb.dbo.sysjobsteps, msdb.dbo.sysjobs, msdb.dbo.sysjobhistory) that retrieves something like the following:

Job_name1  Step_name1  Job1_status
Job_name1  Step_name2  Job1_status
Job_name1  Step_name3  Job1_status
Job_name2  Step_name1  Job2_status
Job_name2  Step_name2  Job2_status

But I still couldn't retrieve the step status (which is what I need in this case, since the job outcome is always success even when a step fails).
Query:
select j.name, s.step_name,
CASE WHEN s.last_run_outcome=0 THEN 'Failed'
WHEN s.last_run_outcome=1 THEN 'Success'
WHEN s.last_run_outcome=2 THEN 'Retry'
WHEN s.last_run_outcome=3 THEN 'Canceled'
END
,h.run_date, s.output_file_name
from msdb.dbo.sysjobsteps s
inner join msdb.dbo.sysjobs j on s.job_id=j.job_id
inner join msdb.dbo.sysjobhistory h
on h.job_id=j.job_id or s.step_id=h.step_id
--where j.name like '%Dem%'
order by h.run_date, j.name
Thank you @BaconBits, and thanks to anyone for any further help.
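In case it helps, here is a minimal sketch of the kind of join that returns per-step outcomes (a sketch under the assumption that the per-step run_status in msdb.dbo.sysjobhistory is what's needed; rows with step_id = 0 hold the job-level outcome):

SELECT j.name AS job_name,
       h.step_id,
       h.step_name,
       CASE h.run_status
            WHEN 0 THEN 'Failed'
            WHEN 1 THEN 'Succeeded'
            WHEN 2 THEN 'Retry'
            WHEN 3 THEN 'Canceled'
            WHEN 4 THEN 'In Progress'
       END AS step_status,
       msdb.dbo.agent_datetime(h.run_date, h.run_time) AS run_datetime
FROM msdb.dbo.sysjobs j
INNER JOIN msdb.dbo.sysjobhistory h
        ON h.job_id = j.job_id
WHERE h.step_id > 0  -- step_id = 0 is the job-level outcome row
ORDER BY run_datetime, job_name, h.step_id;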