Concurrent modification of DefaultExchange - apache-camel

I need to ensure that the thread running a route (and its error handler) sees an exception set on the exchange instance when that instance is shared by several threads and the exception is set by one of them.
I have a route step that massages the data in an input stream (Jetty endpoint) before proxying it to the actual web server. For the stream manipulation I use stream pipelines (PipedInputStream/PipedOutputStream), where each pipeline element needs to run in its own thread. Each pipeline element holds a reference to the exchange, and if an error is encountered the exception is set on the exchange.
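Schematically, one pipeline element looks roughly like this (a simplified sketch with made-up names; the real code does actual massaging instead of a plain copy):

import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutorService;
import org.apache.camel.Exchange;

// Simplified sketch of one pipeline element (hypothetical names, not my actual code):
// it reads from the previous pipe, writes the massaged bytes to the next pipe on its
// own thread, and records any failure on the shared exchange.
PipedInputStream wirePipelineElement(PipedInputStream previous, Exchange exchange,
                                     ExecutorService executor) throws IOException {
    PipedOutputStream out = new PipedOutputStream();
    PipedInputStream next = new PipedInputStream(out);   // consumed by the next element
    executor.submit(() -> {
        try (PipedOutputStream o = out) {
            previous.transferTo(o);                       // stand-in for the real massaging
        } catch (Exception e) {
            exchange.setException(e);                     // whether other threads see this is the question
        }
    });
    return next;
}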
This appears to work all right. But I don't think it is guaranteed to work, as the exception field of DefaultExchange is not volatile. The main thread is therefore not guaranteed to see an exception set on the exchange by a worker thread. The worker threads themselves all use a ReentrantLock that I provide as an exchange property (the property map is a ConcurrentHashMap) to synchronize their access to the exchange, so data visibility between the worker threads should not be an issue.
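For reference, the worker-side locking I just described looks roughly like this (property name and method are invented for the example):

import java.util.concurrent.locks.ReentrantLock;
import org.apache.camel.Exchange;

// Roughly how the workers coordinate today: the route sets the lock once as an
// exchange property, and every worker takes it before touching the exchange.
static final String LOCK_PROP = "MY_PIPELINE_LOCK";   // hypothetical property name

void reportFailure(Exchange exchange, Exception failure) {
    ReentrantLock lock = exchange.getProperty(LOCK_PROP, ReentrantLock.class);
    lock.lock();
    try {
        exchange.setException(failure);   // visible to the other workers that also lock...
    } finally {
        lock.unlock();
    }
    // ...but the route's own thread never acquires this lock, so there is no
    // happens-before edge and hence no visibility guarantee for that write.
}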
But how to do it for the main thread?
I had a look at the Splitter implementation and how it deals with parallel execution and aggregation. As far as I understand, the Splitter enforces visibility by returning an AtomicReference to an exchange that the main thread has never seen before. The exchange referenced by the AtomicReference is treated as read-only after it has been set, so a get() on the AtomicReference in the main thread is guaranteed to see the latest state of the referenced exchange instance.
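If I read it correctly, the pattern boils down to safe publication through the AtomicReference; a reduced sketch of my understanding (not the actual Splitter code):

import java.util.concurrent.atomic.AtomicReference;
import org.apache.camel.Exchange;

// Reduced illustration of the publication pattern: the worker fully populates a
// result exchange and only then publishes it through the AtomicReference; the
// set()/get() pair gives the reading thread a happens-before edge, so every write
// made before set() is visible after get().
AtomicReference<Exchange> result = new AtomicReference<>();

// worker thread, once it is completely done with resultExchange:
//     resultExchange.setException(e);    // ...or any other final state
//     result.set(resultExchange);        // publish; treated as read-only from here on

// main thread, after it knows the worker has finished:
//     Exchange published = result.get(); // guaranteed to see the writes made before set()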
I cannot apply this approach, as it requires the main thread to wait for the workers to finish processing. If I blocked the main thread, my workers would be blocked as well, because the main thread is responsible for consuming the modified stream contents from the last PipedInputStream in the chain and forwarding them to the web server.
I also could not find a factory mechanism that would allow me to tell Camel to instantiate my own implementation of the Exchange interface (with a volatile exception field).
(DefaultExchange was also not written with sub-classing in mind, it appears. E.g. DefaultExchange::isFailed() reads the private exception member instead of calling DefaultExchange::getException(). But that's a separate issue.)
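To spell out what I mean, here is a generic illustration of the problem (not the actual Camel source):

// Generic illustration of the sub-classing problem, not the actual Camel classes.
class BaseExchange {
    private Exception exception;                              // not volatile
    public Exception getException() { return exception; }
    public void setException(Exception e) { this.exception = e; }
    public boolean isFailed() { return exception != null; }   // reads the field directly
}

class VolatileExchange extends BaseExchange {
    private volatile Exception exception;
    @Override public Exception getException() { return exception; }
    @Override public void setException(Exception e) { this.exception = e; }
    // BaseExchange.isFailed() still consults its own private field, which this
    // subclass never sets, so isFailed() would keep answering false.
}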
Does anyone have any other ideas?
(Cross-posted here.)

Related

StreamTask.getCheckpointLock deprecation and custom Flink sources

When writing custom checkpointed sources for Flink, one must make sure that emitting elements downstream, checkpointing, and emitting watermarks all happen in a synchronized fashion. This is done by acquiring StreamContext.getCheckpointLock.
Flink 1.10 deprecated StreamTask.getCheckpointLock and now recommends the use of MailboxExecutor for operations which require such synchronization.
I have a custom source implementation which is split into multiple phases: a SourceFunction[T] for reading file locations and a OneInputStreamOperator for downloading and emitting these elements downstream. Up until now, I have used StreamSourceContexts.getSourceContext to obtain the SourceContext used to emit elements, which looked as follows:
ctx = StreamSourceContexts.getSourceContext(
  getOperatorConfig.getTimeCharacteristic,
  getProcessingTimeService,
  getContainingTask.getCheckpointLock,
  getContainingTask.getStreamStatusMaintainer,
  output,
  getRuntimeContext.getExecutionConfig.getAutoWatermarkInterval,
  -1
)
And this context is being used throughout the code to emit elements and watermarks:
ctx.getCheckpointLock.synchronized(ctx.collect(item))
ctx.getCheckpointLock.synchronized(ctx.emitWatermark(watermark))
Is using the checkpoint lock still the preferred way to emit elements downstream? Or is it now recommended to use MailboxExecutor instead and to perform element collection and watermark emission inside the mailbox execution thread?
The checkpoint lock in the source context is not deprecated, as there is currently no way to implement a source without the lock. These sources are already dubbed legacy sources for exactly that reason: they spawn their own thread and need the lock to emit data (push-based).
There is currently a larger rework for sources (FLIP-27), which will offer a pull-based interface. This interface is called from the main task thread, such that no synchronization is necessary anymore. If some async work needs to be done, then MailboxExecutor is the way to go.
FYI, new operators should (rather, must) use only the MailboxExecutor instead of the checkpoint lock.

Locking dispatcher

Is it necessary to lock a code snippet where multiple threads access the same WPF component via the dispatcher?
Example:
void ladder_OnIndexCompleted(object sender, EventArgs args)
{
    lock (locker)
    {
        pbLadder.Dispatcher.Invoke(new Action(() => { pbLadder.Value++; }));
    }
}
pbLadder is a progress bar and this event can be raised from multiple threads at the same time.
You should not acquire a lock if you're then going to marshal to another thread in a synchronous fashion - otherwise if you try to acquire the same lock in the other thread (the dispatcher thread in this case) you'll end up with a deadlock.
If pbLadder.Value is only used from the UI thread, then you don't need to worry about locking for thread safety - the fact that all the actions occur on the same thread isolates you from a lot of the normal multi-threading problems. The fact that the original action which caused the code using pbLadder.Value to execute occurred on a different thread is irrelevant.
All actions executed on the Dispatcher are queued up and executed in sequence on the UI thread. This means that data races like that increment cannot occur. The Invoke method itself is thread-safe, so adding the action to the queue does not require any locking either.
From MSDN:
Executes the specified delegate with the specified arguments
synchronously on the thread the Dispatcher is associated with.
and:
The operation is added to the event queue of the Dispatcher at the
specified DispatcherPriority.
Even though this one is pretty old, it was at the top of my search results, and I'm pretty new (4 months since I graduated), so after reading other people's comments I went and spoke with my senior coder. What the others are saying above is accurate, but I felt the answers didn't provide a solution, just information. Here's the feedback from my senior coder:
"It's true that the Dispatcher is running on its own thread, but if another thread is accessing an object that the dispatcher wants to access then all UI processing stops while the dispatcher waits for the access. To solve this ideally, you want to make a copy of the object that the dispatcher needs to access and pass that to the dispatcher, then the dispatcher is free to edit the object and won't have to wait on the other thread to release their lock."
Cheers!

Dispose called by Component.Finalize() on non-UI thread - does this mean Dispose methods always have to be thread safe?

I checked which thread my Dispose(bool) methods get called on. When the app is running, it is always the UI thread that calls Dispose, say when clicking on the [x] to close a Form. But when I close the whole app, many Dispose methods get called on a (single) different thread. When I dump the stack trace, I see that they all get called from
System.ComponentModel.Component.Finalize().
Does that mean all my Dispose methods need to be made thread-safe? Or does WinForms guarantee that the UI thread won't touch these objects any more, and does it also establish some kind of "happens-before" relationship between the UI thread and the one that's now finalizing?
Yes, the finalizer runs on a separate thread. Usually this is no problem, because when an object is finalized it is no longer reachable by any user thread (like the UI thread). So you usually do not have to be thread-safe within your finalizer.

Why should I call Control.Invoke from a non-UI thread?

Why should I call Control.Invoke from a non-UI thread? As I understand it, any manipulation of a control is a message to that control. So when I call, for example, TextBox.Text = "text", it will produce a SendMessage(TextBox.Handle...) call. That message will be queued into the UI thread's message queue and dispatched by the UI thread. Why do I have to call Invoke, when it will produce the same message anyway?
There are two reasons MS developers made this restriction:
Some UI functions access thread-local storage (TLS). Calling these functions from another thread gives incorrect results for TLS operations.
Calling all UI-related functions from the same thread automatically serializes them; this is thread-safe and doesn't require synchronization.
From our point of view, we just need to follow these rules.
Because you cannot directly access UI controls from threads other than the thread they were created on. Control.Invoke will marshal your call onto the correct thread - allowing you to make a call from another thread onto the UI thread without needing to know yourself what the UI thread is or how to perform the marshalling.
Update: to answer your question, you don't have to use Control.Invoke - if you have code to marshal your call onto the correct thread and post a message to the message pump, then use that. This, however, is known as re-inventing the wheel, unless you are doing something that changes the behaviour.

Is Socket.SendAsync thread safe effectively?

I was fiddling with Silverlight's TCP communication and I was forced to use the System.Net.Sockets.Socket class which, on the Silverlight runtime, has only asynchronous methods.
I was wondering what happens if two threads call SendAsync on a Socket instance within a very short time of each other?
My single worry is not to have intermixed bytes going through the TCP channel.
Since it is an asynchronous method, I suppose the message gets placed in a queue from which a single thread dequeues, so no such thing (intermixing of the message contents on the wire) will happen.
But I am not sure, and the MSDN does not state anything in the method's description. Is anyone sure of this?
EDIT 1: No, locking on an object before calling SendAsync, such as:
lock (this._syncObj)
{
    this._socket.SendAsync(arguments);
}
will not help, since this serializes the requests to send data, not the data actually sent.
In order to call SendAsync you first need to have called ConnectAsync with an instance of SocketAsyncEventArgs. It's the instance of SocketAsyncEventArgs which represents the connection between the client and server. Calling SendAsync with the same instance of SocketAsyncEventArgs that has just been used for an outstanding call to SendAsync will result in an exception.
It is possible to make multiple outstanding calls to SendAsync on the same Socket object, but only using different instances of SocketAsyncEventArgs. For example (in a parallel universe where this might be necessary) you could be making multiple HTTP posts to the same server at the same time, but on different connections. This is perfectly acceptable and normal; neither client nor server will get confused about which packet is which.
