Every so often I get "ActionFailed. An action failed. No dependent actions succeeded." after a condition block. This seems pretty random and goes away if I resubmit the run, as if there were some sort of race condition.
I know that taking an empty condition branch produces a Skipped result, and I know I can handle that by setting Configure run after to include Skipped (which, for whatever stupid reason, is not a default). But I've tried an alternative solution: putting a "no-op" block (just a useless Compose action) into the empty branch so that the condition always produces a Succeeded outcome, as seen below.
Is this because both branches of the condition are actually executed in parallel, and the branch that is not taken, if it finishes faster, causes the outcome of the condition to be Skipped? It's very counterintuitive behavior (like most of Flows, sadly).
Is this because both branches of the condition are actually executed in parallel, and the branch that is not taken, if it finishes faster, causes the outcome of the condition to be Skipped?
For the question above, I don't think it is caused by both branches of the condition being executed in parallel, because we can see that the expression result of the "Switch" action shows the updated value. The outcome of the condition does not affect the "Switch" action.
Based on some testing, the error may instead be related to the cases under the "Switch": if any case under the "Switch" fails, this error is shown, so please check the cases under the "Switch" action. Since the "Switch" took 5 minutes, also check whether one of its cases takes a long time to do its job and failed with a timeout.
I have a task running on a table 'original_table' every five minutes in Snowflake.
I am taking a backup, for some reason, in another table 'back_up_table'. I need to swap 'original_table' with 'back_up_table'.
Do I have to pause the task on 'original_table' before the swap and then resume it afterwards?
Short answer: no, there's nothing technically requiring that you stop the task.
However, depending on what you are doing, you may get some unexpected results.
For example, if the task is actively running an INSERT when the swap happens, you may have a bit of a race condition, and it will be unpredictable which table gets that new record.
If the task is doing something more complicated, like if it is also creating/swapping tables, you could get an error in the task or some truly weird results that would be hard to troubleshoot.
So if your task is able to recover after a failure on follow-up runs, and you're not worried about transient results/race conditions, then it's probably fine to leave it running.
Edit: as other comments suggested, for backing up tables you might want to look into the Clone feature, if you haven't already, for a possibly cleaner overall solution.
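For illustration only, a minimal sketch of the cautious approach (suspend the task, swap, resume) and of the clone alternative, using the Snowflake Python connector; the connection parameters and the task name my_task are placeholders, not from the question:

import snowflake.connector

# Placeholder credentials/identifiers; adjust for your account.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="my_wh", database="my_db", schema="public",
)
cur = conn.cursor()

# Cautious version: pause the task so no INSERT can race with the swap.
cur.execute("ALTER TASK my_task SUSPEND")
cur.execute("ALTER TABLE original_table SWAP WITH back_up_table")
cur.execute("ALTER TASK my_task RESUME")

# Clone alternative mentioned in the edit: take a zero-copy clone as the
# backup instead of maintaining and swapping a separate backup table.
cur.execute("CREATE OR REPLACE TABLE back_up_table CLONE original_table")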
I have a counter in my app where I expect that 99% of the time there will not be contention issues in updating the counter with transactions.
To handle the 1% of the time when it is busy, I was thinking of updating the counter using transactions within deferred tasks, as follows:
from google.appengine.ext import deferred, ndb

def update_counter(my_key):
    # Enqueue the increment so it runs (and retries) outside the request.
    deferred.defer(update_counter_transaction, my_key)

@ndb.transactional
def update_counter_transaction(my_key):
    x = my_key.get()
    x.n += 1
    x.put()
For the occasional instances when contention causes the transaction to fail, the task will be retried.
I'm familiar with sharded counters but this seems easier and suited to my situation.
Is there anything I am missing that might cause this solution to not work well?
A problem may exist with the automatic task retries, which at least theoretically may happen for reasons other than transaction collisions on the intended counter increments. If such an undesired retry successfully re-executes the counter increment code, the counter value may be thrown off (it will be higher than expected), which might or might not be acceptable for your app, depending on how the counter is used.
Here's an example of an undesired deferred task invocation: GAE deferred task retried due to "instance unavailable" despite having already succeeded
The answer to that question seems in line with this note in the regular task queue documentation (I saw no such note in the deferred task queues article, but I marked it as possible in my brain):
Note that task names do not provide an absolute guarantee of once-only semantics. In extremely rare cases, multiple calls to create a task of the same name may succeed, but in this event, only one of the tasks would be executed. It's also possible in exceptional cases for a task to run more than once.
From this perspective it might actually be better to keep the counter increment together with the rest of the related logical/transactional operations (if any) than to isolate it as a separate transaction on a task queue.
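If you do keep the deferred-task approach, here is a rough sketch (names of my own invention, not from the question) of one way to make the increment idempotent against such retries: generate a token at enqueue time and record applied tokens on the counter entity inside the same transaction:

import uuid

from google.appengine.ext import deferred, ndb

def update_counter(my_key):
    # The token is generated once, at enqueue time, so a retried task
    # carries the same token and can be recognized as a duplicate.
    token = uuid.uuid4().hex
    deferred.defer(update_counter_transaction, my_key, token)

@ndb.transactional
def update_counter_transaction(my_key, token):
    x = my_key.get()
    # applied_tokens is a hypothetical ndb.StringProperty(repeated=True)
    # on the counter entity; in practice you would also prune it.
    if token in x.applied_tokens:
        return  # this increment was already applied by an earlier run
    x.applied_tokens.append(token)
    x.n += 1
    x.put()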
Today while browsing the source I noticed this comment in Pipeline.start method:
Returns:
A taskqueue.Task instance if return_task was True. This task will *not*
have a name, thus to ensure reliable execution of your pipeline you
should add() this task as part of a separate Datastore transaction.
Interesting, I do want reliable execution of my pipeline after all.
I suspect the comment is a bit inaccurate, since if you use the default return_task=False option the task is added inside a transaction anyway (by _PipelineContext.start)... it seems like the only reason you'd want to add the task yourself is if you want the starting of the pipeline to depend on the success of something in your own transaction.
Can anyone confirm my suspicion, or suggest how else following the comment's advice may affect 'reliable execution of your pipeline'?
If you don't include the parameter when you call Pipeline.start(), the task is enqueued in the queue given by the Pipeline's inner variable context (type _PipelineContext). The default name for this queue is "default".
If you do include the parameter when you call Pipeline.start(), the task is not enqueued within these methods. Pipeline.start() returns the result of _PipelineContext.start(), which relies on an inner method txn(). This method is annotated as transactional, since it first does a bit of book-keeping for the Datastore records used to run this pipeline. Then, after this book-keeping is done, it creates a task without a name property (see the Task class definition here).
If return_task was not provided, it will go ahead and add that (un-named) task to the default queue for this pipeline's context. It also sets transactional on that task, so that it will be a "transactional task" which will only be added if the enclosing Datastore transaction is committed successfully (i.e. with all the book-keeping in the txn() method successful, so that this task, when run, will interact properly with the other parts of the pipeline, etc.).
If, on the other hand, return_task was provided, the un-named task is not added to any queue and is returned instead. The txn() book-keeping work will nonetheless have taken place to prepare it to run. _PipelineContext.start() returns to Pipeline.start(), and user code gets the un-named, un-added task.
You're absolutely correct to say that the reason you would want this pattern is if you want pipeline execution to be part of a transaction in your code. Maybe you want to receive and store some data, kick off a pipeline, and store the pipeline id on a user's profile somewhere in Datastore. Of course, this means you want not only the Datastore writes but also the pipeline execution to be grouped together into this atomic transaction. This pattern allows you to do exactly that. If the transaction fails, the transactional task will not execute, and handle_run_exception() will be able to catch the TransactionFailedError and run pipeline.abort() to make sure the Datastore book-keeping data is destroyed for the task that never ran.
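A minimal sketch of that pattern, under the assumption that the returned taskqueue.Task supports the usual add(queue_name=..., transactional=True) call and that the pipeline exposes pipeline_id and abort() as described above; MyPipeline, the profile model and its last_pipeline_id field are placeholders of mine:

from google.appengine.api import datastore_errors
from google.appengine.ext import ndb

def store_data_and_start_pipeline(profile_key, payload):
    # The pipeline's own book-keeping transaction runs inside start(),
    # but with return_task=True the start task comes back to us
    # instead of being enqueued.
    pipe = MyPipeline(payload)
    task = pipe.start(return_task=True)

    @ndb.transactional
    def txn():
        # Datastore work that should be atomic with the pipeline start.
        profile = profile_key.get()
        profile.last_pipeline_id = pipe.pipeline_id
        profile.put()
        # Enqueue the un-named task transactionally: it is only added
        # if this transaction commits.
        task.add(queue_name='default', transactional=True)

    try:
        txn()
    except datastore_errors.TransactionFailedError:
        # The task never ran, so clean up the pipeline's book-keeping
        # records, as described above.
        pipe.abort()
        raise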
The fact that the task is un-named will not cause any disruption, since un-named tasks are automatically given a unique name when they are added to a queue, and in fact not having a name is a requirement for tasks added with transactional=True.
All in all, I think the comment just means that, because the returned task is transactional, in order for it to be reliably executed you should make sure that the task.add(queue_name...) takes place inside a transaction. It's not saying that the returned task is somehow "unreliable" just because you set return_task; it's basically using the word "reliable" superfluously, given that the task is added as part of a transaction.
We sometimes experience CommandTimeouts on relatively simple DELETE queries.
They normally execute in a second or a few, but sometimes hit our 300 sec CommandTimeout.
So far we have seen it happen from two locations where we send a lot of DELETE statements in loops. In one case a TADOStoredProc with parameters is used; in the other case a TADOCommand with a DELETE statement is used (no parameters). It does not happen in a predictable fashion: in two executions of the same code the result can vary. The exception happens rarely; so far we have gotten past it by re-running the program.
What could be the cause of these timeouts?
What would the best solution be? Currently we are considering trying async execution and resending the command on timeout.
I have a very popular site in ASP.NET MVC/SQL Server, and unfortunately a lot of deadlocks occur. While I'm trying to figure out why they occur via the SQL Profiler, I wonder how I can change the default behavior of SQL Server when a deadlock occurs.
Is it possible to re-run the transaction(s) that caused problems instead of showing the error screen?
Remus's answer is fundamentally flawed. According to https://stackoverflow.com/a/112256/14731 a consistent locking order does not prevent deadlocks. The best we can do is reduce their frequency.
He is wrong on two points:
The implication that deadlocks can be prevented. You will find that both Microsoft and IBM post articles about reducing the frequency of deadlocks. Nowhere do they claim you can prevent them altogether.
The implication that all deadlocks require you to re-evaluate the state and come to a new decision. It is perfectly correct to retry some actions at the application level, so long as you go back far enough to the decision point.
Side-note: Remus's main point is that the database cannot automatically retry the operation on your behalf, and he is completely right on that count. But this doesn't mean that re-running operations is the wrong response to a deadlock.
You are barking up the wrong tree. You will never succeed in getting automated deadlock retries from the SQL engine; such a concept is fundamentally wrong. The very definition of a deadlock is that the state you based your decision on has changed, therefore you need to read the state again and make a new decision. If your process was chosen as the deadlock victim, by definition another process has won the deadlock, which means it has changed something you had read.
Your only focus should be on figuring out why the deadlocks occur and eliminating the cause. Invariably, the cause will turn out to be queries that scan more data than they should. While it is true that other types of deadlocks can occur, I bet that is not your case. Many problems will be solved by deploying appropriate indexes. Some problems will send you back to the drawing board and you will have to rethink your requirements.
There are many, many resources out there on how to identify and solve deadlocks:
Detecting and Ending Deadlocks
Minimizing Deadlocks
You may also consider using snapshot isolation, since the lock-free reads involved in snapshot isolation reduce the surface on which deadlocks can occur (i.e. only write-write deadlocks can occur). See Using Row Versioning-based Isolation Levels.
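As a rough sketch of enabling it (shown via Python/pyodbc only to keep the examples in one language; the two ALTER DATABASE statements are the relevant part, the database name and connection string are placeholders, and switching READ_COMMITTED_SNAPSHOT on needs either exclusive access or WITH ROLLBACK IMMEDIATE):

import pyodbc

# Placeholder connection string; ALTER DATABASE cannot run inside a
# user transaction, hence autocommit=True.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"
    "DATABASE=master;Trusted_Connection=yes;",
    autocommit=True,
)
cur = conn.cursor()

# Allow explicit SNAPSHOT transactions in the application database.
cur.execute("ALTER DATABASE MyAppDb SET ALLOW_SNAPSHOT_ISOLATION ON;")

# Make the default READ COMMITTED level use row versioning, so plain
# readers stop taking shared locks; ROLLBACK IMMEDIATE kicks out other
# sessions' open transactions instead of waiting for them.
cur.execute(
    "ALTER DATABASE MyAppDb SET READ_COMMITTED_SNAPSHOT ON "
    "WITH ROLLBACK IMMEDIATE;"
)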
A lot of deadlocks occurring is often an indication that you either do not have the correct indexes and/or that your statistics are out of date. Do you have regular scheduled index rebuilds as part of maintenance?
Your save code should automatically retry saves when error 1205 is returned (deadlock occurred). There is a standard pattern that looks like this:
catch (SqlException ex)
{
if (ex.Number == 1205)
{
// Handle Deadlock by retrying save...
}
else
{
throw;
}
}
The other option is to retry within your stored procedures. There is an example of that here: Using TRY...CATCH in Transact-SQL
One option in addition to those suggested by Mitch and Remus, since your comments suggest you're looking for a fast fix: if you can identify the queries involved in the deadlocks, you can influence which of them are rolled back and which continue by setting DEADLOCK_PRIORITY for each query, batch or stored procedure.
Looking at your example in the comment to Mitch's answer:
Let's say the deadlock occurs on page A, but page B is trying to access the locked data. The error will be displayed on page B, but it doesn't mean that the deadlock occurred on page B. It still occurred on page A.
If you consistently see a deadlock occurring between the queries issued from page A and page B, you can influence which page results in an error and which completes successfully. As the others have said, you cannot automatically force a retry.
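For example, a hedged sketch (again via Python/pyodbc just to keep the examples in one language; SET DEADLOCK_PRIORITY is the relevant statement, and the DSN, table and values are placeholders): running the less important page's query at LOW priority makes that session the preferred deadlock victim, so the other page's query completes.

import pyodbc

conn = pyodbc.connect("DSN=MyAppDb")  # placeholder DSN
cur = conn.cursor()

# This session volunteers to be the deadlock victim in preference to
# sessions running at the default (NORMAL) priority.
cur.execute("SET DEADLOCK_PRIORITY LOW;")
cur.execute("UPDATE dbo.SomeTable SET Flag = 1 WHERE Id = ?", 42)
conn.commit()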
Post a question with the problem queries and/or the deadlock trace output and there's a good chance you'll get an explanation as to why it's occurring and how it could be fixed.
In some cases, you can do the following. Everything between begin tran and commit is all or nothing, so either @errorcount takes the value 0 and the loop ends, or, in case of failure, the counter is decreased by 1 and the work is retried. It may not work if you provide variables to the code from outside the begin tran/commit. Just an idea :)
declare @errorcount int = 4 -- retry number
while @errorcount > 0
begin
    begin try
        begin tran
        <your code here>
        commit
        set @errorcount = 0
    end try
    begin catch
        if @@trancount > 0 rollback
        set @errorcount = @errorcount - 1
    end catch
end