Flink for abandoned workflow - apache-flink

We want to remind users to complete their workflow. These workflow events look like 'Workflow started', 'progressed stage 1', 'progressed stage 2',... 'Workflow ended' and they flow through Kafka. Each event has a unique identifier to identify a workflow attempt by the user.
How do we design a pipeline in Flink to detect workflows that were started but then abandoned midway? Is there an established pattern for this?

I think you can use ProcessFunction timers.

We ended up building with a timeout process function. We process each event of a workflow attempt and set a timer to fire.
Instant timerFireAt = event.getTimestamp().plusSeconds(timeoutDuration);
context.timerService().registerProcessingTimeTimer(timerFireAt.toEpochMilli());
This keeps getting updated with each incoming event of the same workflow attempt. On completion of the attempt, we delete the timer. If it is not deleted, i.e. if no events arrive for a certain time, the timer fires.
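The bookkeeping described above can be sketched in plain Java. This is only a model of the idea, not Flink API: the class AbandonedWorkflowTracker and its methods onEvent and fireTimers are hypothetical names, the map stands in for per-key timer state, and the "Workflow ended" event type is taken from the question's example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a per-workflow inactivity timer (hypothetical, not Flink API). */
class AbandonedWorkflowTracker {
    private final long timeoutMillis;
    // One pending deadline per workflow attempt, standing in for one timer per key.
    private final Map<String, Long> deadlines = new HashMap<>();

    AbandonedWorkflowTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Handles one event of a workflow attempt. */
    void onEvent(String workflowId, String eventType, long timestampMillis) {
        if ("Workflow ended".equals(eventType)) {
            // Completion: delete the pending timer so it never fires.
            deadlines.remove(workflowId);
        } else {
            // Any other event replaces the previous deadline with a later one.
            deadlines.put(workflowId, timestampMillis + timeoutMillis);
        }
    }

    /** Fires every timer whose deadline has passed and returns the abandoned workflow IDs. */
    List<String> fireTimers(long nowMillis) {
        List<String> abandoned = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                abandoned.add(entry.getKey());
                it.remove(); // a timer fires at most once
            }
        }
        return abandoned;
    }
}
```

In an actual Flink job, the same logic would presumably live in a keyed process function, with the current deadline kept in keyed state so that the previous timer can be removed with deleteProcessingTimeTimer before a new one is registered. Registering a timer for a new timestamp does not by itself cancel the old one.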

Related

How to schedule events in Netezza event management

I want to understand whether there is some configuration to schedule the event rules that I have created. I followed the documentation for this.
The process I followed was:
I copied and enabled a template event rule.
I modified the message template to send the event report by email.
The one thing I don't understand is who schedules these events. If I want a rule to run daily or every 7 days, can it be scheduled?
And my last question: can I test my rules?
"Who schedules these events?"
One does not schedule an event. An event happens ... and then the system responds to the event (using the rule that you have set up).
e.g.,
notify me if the system goes offline
notify me if the disks are > 95% full
If you want something to occur at a regularly scheduled time, then Linux's "cron" facility would seem to be the way to go.

Apache Flink: How to make some action after the job is finished?

I'm trying to perform an action after the Flink job has finished (make a change in a DB). I want to do it in the same Flink application, but have had no luck.
I found that there is a JobStatusListener that is notified in ExecutionGraph about state changes, but I cannot find out how to get hold of this ExecutionGraph to register my listener.
I've tried to completely replace ExecutionGraph in my project (yes, a bad approach, but...), but since it is a runtime library, my version is not called at all in distributed mode, only in a local run.
In short, my Flink application looks like this:
DataSource.output(RichOutputFormat.class)
ExecutionEnvironment.getExecutionEnvironment().execute()
Can anybody please help?

Flink failed job notification

Is there a mechanism in Flink to send alerts/notifications when a job has failed?
I was thinking maybe if a restart strategy is applied the job will be aware that it is being restarted and client code can send notification to some sink, but couldn't find any relevant job context info
I'm not aware of a super-easy way to do this. A couple of ideas:
(1) The jobmanager is aware of failed jobs. You could poll /joboverview/completed, for example, looking for newly failed jobs. /jobs/<jobid>/exceptions can be used to get more info (docs).
(2) The CheckpointedFunction interface has an initializeState() method that is passed a context object that responds to an isRestored() method (docs). This is more-or-less the relevant job context you were looking for.

How to auto start an activiti workflow periodically in a certain schedule in Alfresco 5.0.c?

I created a custom review-and-approve Activiti workflow. I need to start this workflow automatically at a fixed interval, say every 30 minutes. For this I used a timer start event as below:
<startEvent id="timerStart" name="Timer start" activiti:formKey="scheduledtask:submitParallelReviewTask">
<timerEventDefinition>
<timeCycle>R5/PT30M</timeCycle>
</timerEventDefinition>
</startEvent>
This created a new process instance every 30 minutes, and the repetition occurred 5 times as required. But in the tasks of the timer-started processes, the initiator and other process variables were null. Also, if I set the process variables as mandatory, the timer executor job failed.
How can I set the initiator and other mandatory process variables in the newly created (i.e. timer-started) process instances and their respective tasks?
Please suggest how to fix these bugs.
Thank you in advance !
Well, I guess the solution is to use an Alfresco cron job.
You may find the link below useful for setting one up.
https://wiki.alfresco.com/wiki/Scheduled_Actions
Use workflowService for setting parameters.

How to recover Go timer from web-server restart (or code refresh/upgrade)?

Consider a web service, for instance, where user can make an API request to start a task at certain scheduled time. Task definition and scheduled time are persisted in a database.
First approach I came up with is to start a Go timer and wait for the timer to expire in a Goroutine (not blocking the request). This goroutine, after time expiration, will also fire another API request to start executing the task.
Now the problem arises when this service is redeployed. For zero-downtime deployment I am using Einhorn with goji. After a code reload, obviously both the timer goroutine and the timer-expiration-handler goroutine die. Is there any way to recover a Go timer after a code reload?
Another problem I am struggling with is allowing the user to interrupt the timer (once it's started). A Go timer has Stop to facilitate this. But since this is a stateless API, when the \interrupt request comes in, the service doesn't have the context of the timer channel. And it seems it's not possible to marshal the channel (returned from NewTimer) to disk/db.
It's also quite possible that I am not looking at the problem from the correct perspective. Any suggestions would be highly appreciated.
One approach that's commonly used is to schedule the task outside your app, for example using crontab or systemd timers.
For example using crontab:
# run every 30 minutes
*/30 * * * * /usr/bin/curl --head http://localhost/cron?key=something-to-verify-local-job >/dev/null 2>&1
Using an external task queue, as #Not_a_Golfer mentioned, is also a valid option, but it is more complicated.
