Why does akka-stream restart supervision not restart but just resume?

Consider this simple stream:
Source(1 to 5)
  .mapAsync(1) { i =>
    if (i % 3 == 0) Future.failed(new Exception("I don't like 3"))
    else Future.successful(i)
  }
  .withAttributes(
    ActorAttributes.supervisionStrategy(Supervision.restartingDecider)
  )
  .runForeach(i => println(s"#$i"))
This actually prints
#1
#2
#4
Which is the same as with the resume strategy.
I would expect the stream to restart after the failed future with the following output
#1
#2
#1
#2
...
Why do the Resume and Restart strategies behave the same way in this case?
How can I restart the stream from the start?

Question 1: the difference between resume and restart is that - with the latter - the failing stage is restarted, losing all accumulated internal state. (See docs for reference).
In your case, you have a mapAsync stage with parallelism 1, so you effectively will never have any accumulated state. This results in resume and restart being equivalent in behaviour.
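To make the difference concrete, here is a toy Python model of a single stage under the two deciders. It is not Akka code: `run_stage`, the `stateful` flag, and the rule "fail on negative elements" are all made up for illustration. It only shows that resume and restart diverge when the stage keeps internal state (a running sum here) and coincide when it does not.

```python
def run_stage(elements, decider, stateful):
    """Toy model of one stream stage under a supervision decider (not Akka API).

    The stage "fails" on negative elements. A stateful stage emits a running
    sum; a stateless stage passes elements through unchanged.
    """
    total = 0          # internal state, only used by the stateful variant
    emitted = []
    for x in elements:
        if x < 0:                  # the stage "throws" on this element
            if decider == "restart":
                total = 0          # restart: drop the element and reset state
            elif decider != "resume":
                raise ValueError("unknown decider: " + decider)
            continue               # both deciders drop the failing element
        if stateful:
            total += x
            emitted.append(total)
        else:
            emitted.append(x)
    return emitted


data = [1, 3, -1, 5, 7]
print(run_stage(data, "resume", stateful=False))   # [1, 3, 5, 7]
print(run_stage(data, "restart", stateful=False))  # [1, 3, 5, 7]
print(run_stage(data, "resume", stateful=True))    # [1, 4, 9, 16]
print(run_stage(data, "restart", stateful=True))   # [1, 4, 5, 12]
```

In neither case are earlier elements replayed, which is why neither decider restarts the stream from the beginning.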
Question 2: The semantics of supervision strategies in Akka Streams are tied to the specific stage that fails. A failed stage simply has no way to replay the elements that flowed through it in the past, as they are already gone, i.e. not held anywhere. No supervision strategy can give you that.
What you are looking for is a restart of the whole stream which should be achievable with the recoverWithRetries combinator (docs). You can feed the same source again (Source(1 to 5)) to the combinator to have it replay those elements.


Apache Flink : Batch Mode failing for DataStream APIs with exception `IllegalStateException: Checkpointing is not allowed with sorted inputs.`

This is a continuation of: Flink : Handling Keyed Streams with data older than application watermark
Based on the suggestion there, I have been trying to add batch support to the same Flink application, which uses the DataStream API.
The logic is something like this:
streamExecutionEnvironment.setRuntimeMode(RuntimeExecutionMode.BATCH);
streamExecutionEnvironment.readTextFile("fileName")
    .process(process function which transforms input)
    .assignTimestampsAndWatermarks(WatermarkStrategy
        .<DetectionEvent>forBoundedOutOfOrderness(orderness)
        .withTimestampAssigner(
            (SerializableTimestampAssigner<Event>) (event, l) -> event.getEventTime()))
    .keyBy(keyFunction)
    .window(TumblingEventTimeWindows.of(Time.days(x)))
    .process(processWindowFunction);
Based on the public docs, my understanding was that I simply needed to change the source to a bounded one. However, the above processing keeps failing at the event trigger after the windowing step with the exception below:
java.lang.IllegalStateException: Checkpointing is not allowed with sorted inputs.
at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.init(OneInputStreamTask.java:99)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:552)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:764)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:571)
at java.base/java.lang.Thread.run(Thread.java:829)
The input file contains the historical events for multiple keys. The data for a given key is sorted, but the overall data is not. I have also added an event at the end of each key with the timestamp = MAX_WATERMARK to indicate end of keyed Stream. I tried it for a single key as well but the processing failed with the same exception.
Note: I have not enabled checkpointing.
I have also tried explicitly disabling checkpointing to no avail.
env.getCheckpointConfig().disableCheckpointing();
EDIT - 1
Adding more details :
I tried switching to FileSource to read the files, but I still get the same exception.
environment.fromSource(FileSource.forRecordStreamFormat(new TextLineFormat(), path).build(),
WatermarkStrategy.noWatermarks(),
"Text File")
The first process step and the key splitting work. However, it fails after that. I tried removing the windowing and adding a simple process step, but it continues to fail.
There is no explicit Sink. The last process function simply updates a database.
Is there something I'm missing?
That exception can only be thrown if checkpointing is enabled. Perhaps you have a checkpointing interval configured in flink-conf.yaml?
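One quick way to check is to scan the configuration file for checkpointing options. The sketch below assumes the flat `key: value` layout of flink-conf.yaml and looks for keys under `execution.checkpointing.` (for example `execution.checkpointing.interval`). The function name and the sample contents are made up for illustration; point it at the text of your own file.

```python
def find_checkpointing_settings(conf_text):
    """Return every 'execution.checkpointing.*' entry found in flink-conf.yaml text."""
    settings = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        key = key.strip()
        if key.startswith("execution.checkpointing."):
            settings[key] = value.strip()
    return settings


# Hypothetical file contents, for illustration only.
sample = """
# hypothetical flink-conf.yaml contents
jobmanager.rpc.address: localhost
execution.checkpointing.interval: 3min
"""
print(find_checkpointing_settings(sample))
```

An empty result suggests the interval is being set somewhere else, for example in the job code or on the command line.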

CANopen SYNC timeout after Enable Operation

I am new to CANopen. I wrote a program that reads the actual position via PDO1 (the default mapping is statusword + actual position).
void canopen_init() {
    // code 1: set up PDO mapping
    nmtPreOperation();
    disablePDO(PDO_TX1_CONFIG_COMM);
    setTransmissionTypePDO(PDO_TX1_CONFIG_COMM, 1);
    setInhibitTimePDO(PDO_TX1_CONFIG_COMM, 0);
    setEventTimePDO(PDO_TX1_CONFIG_COMM, 0);
    enablePDO(PDO_TX1_CONFIG_COMM);
    setCyclePeriod(1000);
    setSyncWindow(100);
    // code 2: enable operation
    readyToSwitchOn();
    switchOn();
    enableOperation();
    motionStart();
    // code 3
    nmtActiveNode();
}
int main(void) {
    canopen_init();
    while (1) {
        delay_ms(1);
        send_sync();
    }
}
If I remove "code 2" (the servo stays in the Switch On Disabled state), I can read the position each time a SYNC is sent. But if I keep "code 2", the drive reports the error "sync frame timeout". I don't know whether the drive or my code is at fault. Is there a problem with my code? Thank you!
I don't know what protocol stack this is or how it works, but these:
setCyclePeriod(1000);
setSyncWindow(100);
likely correspond to these OD entries :
Object 1006h: Communication cycle period (CiA 301 7.5.2.6)
Object 1007h: Synchronous window length (CiA 301 7.5.2.7)
They set the SYNC interval and time window for synchronous PDOs respectively. The latter is described by the standard as:
If the synchronous window length expires all synchronous TPDOs may be discarded and an EMCY message may be transmitted; all synchronous RPDOs may be discarded until the next SYNC message is received. Synchronous RPDO processing is resumed with the next SYNC message.
Now if you set this sync time window to 100us but have a sloppy busy-wait delay delay_ms(1), then that doesn't add up. If you write zero to Object 1007h, you disable the sync window feature. I suppose setSyncWindow(0); might do that. You can try that to see if it is the issue. If so, you have to drop your busy-wait in favour of proper hardware timers, one for the SYNC period and one for the PDO timeout (if you must use that feature).
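To see whether a software delay loop holds the period tightly enough, you can log the time of each SYNC (for example from a bus analyser) and look at the gaps. The sketch below is a generic check, not part of any CANopen stack: the function name, the sample timestamps, and the choice to use 100 µs as the allowed overshoot are assumptions for illustration. How a drive actually evaluates SYNC timing is device specific.

```python
def late_sync_intervals(timestamps_us, period_us, tolerance_us):
    """Return the gaps between consecutive SYNC timestamps that overshoot
    the nominal period by more than the tolerance (all values in microseconds)."""
    gaps = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    return [gap for gap in gaps if gap - period_us > tolerance_us]


# Hypothetical capture: a 1000 us nominal period with one badly delayed SYNC.
stamps = [0, 1000, 2050, 3400, 4400]
print(late_sync_intervals(stamps, period_us=1000, tolerance_us=100))  # [1350]
```

If a capture from the real bus shows gaps like the 1350 µs one above, the delay loop is a likely suspect and a hardware timer is the safer choice.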
Problem fixed. Heavy EMI from the servo kept my controller from working properly. After isolating it, everything worked very well :)!

Can a Timer event experience reentrancy?

I'm examining some dreadful legacy code that has a Timer event with some lengthy code that contains DoEvents calls. A simplified version looks something like this:
Private Sub tmrProcess_Timer()
    'Run some slow processing code here
    DoEvents
    'More slow code here
    DoEvents
    'Lots more slow code and the occasional DoEvents here
    If booComplete Then
        tmrProcess.Enabled = False
    End If
End Sub
The timer has its Interval set to 250, and the slow code could take up to thirty or so seconds to complete. Note that there is a button on the form that sets booComplete = True when it is clicked.
Given that VB6 is single-threaded and that timer messages are low priority, is it at all possible for the Timer event to be re-entered during a DoEvents call, or will the VB6 runtime block execution of a Timer event if the Timer event is currently executing?
This reference has some relevant information. In particular it states that WM_PAINT messages are combined into a single message but there is no mention of whether or not WM_TIMER messages are combined.
I expected it would re-enter, but it seems not to.
Have a look at the following test project:
'1 form with:
' 1 timer control : Name=Timer1
Option Explicit
Private Sub Form_Load()
    WindowState = vbMaximized
    Timer1.Interval = 2000
End Sub
Private Sub Timer1_Timer()
    Static intCount As Integer
    Dim sngTime As Single
    intCount = intCount + 1
    Print CStr(Now) & " Timer event fired " & CStr(intCount)
    sngTime = Timer + 3
    Do While sngTime > Timer
        DoEvents
    Loop
    Print CStr(Now) & " End of timer event " & CStr(intCount)
End Sub
You will see a "start" 2 seconds after the form loads.
You will see an "end" 3 seconds after that.
You will see a "start" 2 seconds after the previous "end" showed.
You will see an "end" 3 seconds after the "start".
...
If the timer were re-entered, I would expect 2 seconds between each "start", but there appear to be 3+2=5 seconds between each "start".
Removing the DoEvents doesn't change the behaviour; it just changes the time at which the texts are printed.
To avoid re-entry, disable the timer until the time-consuming logic has finished, then enable it again.
Private Sub tmrProcess_Timer()
    tmrProcess.Enabled = False
    'your time taking logic goes here ......
    tmrProcess.Enabled = True
End Sub
First, most code containing DoEvents doesn't require it; it's a magical word that people feel compelled to incant (but without knowing why).
DoEvents allows reentrancy of anything, not just timers.
Yours is a TimerProc. If you had chosen a message instead, the WM_TIMER message only comes when the message queue is empty and you ask "are there any messages?". Pending paint, timer, or mouse-move type messages only become available once the queue is otherwise empty.
Despite the obviousness, this is the link it came from:
> mk:#MSITStore:C:\Program%20Files\Microsoft%20Visual%20Studio\MSDN\2001OCT\1033\kbvb.chm::/Source/vbapps/q118468.htm
(of course you have to have same libraries as me installed for this to work.)
Why do you assume the source is the internet?
Definition of DoEvents in Visual Basic for Applications
Q118468
The information in this article applies to:
Microsoft Visual Basic for Applications version 1.0
Microsoft Excel for Windows, versions 5.0, 5.0c
Microsoft Excel for the Macintosh, versions 5.0, 5.0a
Microsoft Excel for Windows 95, versions 7.0, 7.0a
Microsoft Excel 97 for Windows
Microsoft Excel 98 Macintosh Edition
SUMMARY
The DoEvents function surrenders execution of the macro so that the operating system can process other events. The DoEvents function passes control from the application to the operating system. Some instances in which DoEvents may be useful include the following:
Hardware I/O
Delay Loops
Operating System Calls
DDE Deadlocking
This article also discusses potential problems associated with the DoEvents function.
MORE INFORMATION
Hardware I/O
If your code waits for an input from any I/O device, the DoEvents function speeds up the application by multitasking. As a result, the computer does not seem to pause or stop responding (hang) while the code is executing.
Example:
    Open "com1" For Input As #1
    Input #1, x
    Do Until x = Chr(13)
        DoEvents
        '...
        '...
        Input #1, x
    Loop
Delay Loops
In a delay loop, the DoEvents function can allow the CPU operating system to continue with any pending jobs.
Example:
    X = Timer()
    Do While X + 10 > Timer()
        DoEvents
    Loop
Operating System Calls
When Visual Basic calls the operating system, the operating system may return the control even before processing the command completely. Doing so may prevent any macro code that depends on an object generated by the call from running. In the example below, the Shell function starts the Microsoft Word application. If Word is not yet open, any effort to establish a DDE link to it will halt the code. By using DoEvents, your procedure makes sure that an operation, such as Shell, is completely executed before the next macro statement is processed.
Example:
    z% = Shell("WinWord Source.Doc",1)
    DoEvents
    ...
    ...
DDE Deadlocking
Consider a situation in which a Visual Basic macro calls an application that is waiting for a second application to get some data. If the macro does not give control to the second application, the result is a deadlock. In DDE conversations between multiple applications, using DoEvents removes the possibility of this type of deadlocking.
Problems Associated with DoEvents
Using too many nested DoEvents statements may deplete the stack space and therefore generate an "Out of Stack Space" error message. This error is referring to the application stack space allocated to the Microsoft Excel application.
Make sure the procedure that has given up control with DoEvents is not executed again from a different part of your code before the first DoEvents call returns; this can cause unpredictable results.
Once DoEvents relinquishes control to the operating system, it is not possible to determine when Microsoft Excel will resume the control. After the operating system obtains control of the processor, it will process all pending events that are currently in the message queue (such as mouse clicks and keystrokes). This may be unsuitable for some real-time data acquisition applications.
REFERENCES For more information about DoEvents, click the Search
button in Help and type:
doevents
Last Reviewed: January 17, 2001 © 2001 Microsoft Corporation. All
rights reserved. Terms of Use.

Neo4j store is not cleanly shut down; Recovering from inconsistent db state from interrupted batch insertion

I was importing ttl ontologies to dbpedia following the blog post http://michaelbloggs.blogspot.de/2013/05/importing-ttl-turtle-ontologies-in-neo4j.html. The post uses BatchInserters to speed up the task. It mentions
Batch insertion is not transactional. If something goes wrong and you don't shutDown() your database properly, the database becomes inconsistent.
I had to interrupt one of the batch insertion tasks because it was taking much longer than expected, which left my database in an inconsistent state. I get the following message:
db_name store is not cleanly shut down
How can I recover my database from this state? Also, for future purposes, is there a way to commit after importing every file, so that reverting to the last state would be trivial? I thought of git, but I am not sure if it would help for a binary file like index.db.
There are some cases where you cannot recover from unclean shutdowns when using the batch inserter API. Please note that its package name, org.neo4j.unsafe.batchinsert, contains the word unsafe for a reason. The intention of the batch inserter is to operate as fast as possible.
If you want to guarantee a clean shutdown you should use a try finally:
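BatchInserter batch = BatchInserters.inserter(<dir>);
try {
    // do some operations potentially throwing exceptions
} finally {
    // always runs, whether or not the operations above threw
    batch.shutdown();
}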
Another alternative for special cases is registering a JVM shutdown hook. See the following snippet as an example:
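// final so the anonymous Thread subclass below can reference it
final BatchInserter batch = BatchInserters.inserter(<dir>);
// register the hook before doing any work so it is in place if the JVM exits early
Runtime.getRuntime().addShutdownHook(new Thread() {
    @Override
    public void run() {
        batch.shutdown();
    }
});
// do some operations potentially throwing exceptions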

How to manually make a task in a GAE queue run a second time

I have a task that runs in a GAE queue.
Based on my own logic, I want to decide whether the task will run again or not.
I don't want it to be executed normally by the queue and then put into the queue again,
because I want to be able to check "X-AppEngine-TaskRetryCount"
and quit trying after several attempts.
To my understanding, the only case in which a task is re-executed is when an internal GAE error happens (or when my code takes too long, as in the "DeadlineExceededException" cases, and I don't want to hold the code "hostage" for that long :) ).
How can I re-enter a task into the queue in a way that makes GAE increment X-AppEngine-TaskRetryCount?
You can programmatically retry / restart a task by calling self.error() in Python.
From the docs: App Engine retries a task when its handler returns any HTTP status code outside the range 200–299.
At the beginning of the task you can test the number of retries using:
retries = int(self.request.headers['X-Appengine-Taskretrycount'])
if retries < 10:
    self.error(409)
    return
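The decision itself can be isolated from the request handler. The sketch below is one possible shape, not App Engine API: `status_for_task`, `MAX_RETRIES`, and the `work_succeeded` flag are names invented for illustration, and a plain dict stands in for the request headers. It returns a non-2xx status to ask for a retry and falls back to 200 once the retry budget is used up, so the queue stops retrying.

```python
MAX_RETRIES = 10  # assumed limit, pick whatever suits the task


def status_for_task(headers, work_succeeded, max_retries=MAX_RETRIES):
    """Pick the HTTP status a task handler should return.

    A 2xx status tells the queue the task is done; anything else asks for a retry.
    """
    retries = int(headers.get("X-AppEngine-TaskRetryCount", 0))
    if work_succeeded:
        return 200          # done, no retry needed
    if retries >= max_retries:
        return 200          # give up: report success so the queue stops retrying
    return 409              # any non-2xx status makes the queue retry the task


print(status_for_task({"X-AppEngine-TaskRetryCount": "3"}, work_succeeded=False))   # 409
print(status_for_task({"X-AppEngine-TaskRetryCount": "10"}, work_succeeded=False))  # 200
```

In a real handler you would pass the returned value to self.error() whenever it is not 200.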
