Does Akka.NET Streams preserve the input order of elements? If so, does this hold true when working with reactive streams over the network (i.e. when using stream refs)?
The version of Akka.Streams currently in use is 1.4.39.
Unfortunately, I was unable to find a definitive answer in the Akka.NET documentation.
After further reading, I found my answer:
https://getakka.net/articles/streams/basics.html#stream-ordering
In Akka Streams almost all computation stages preserve input order of elements. This means that if inputs {IA1,IA2,...,IAn} "cause" outputs {OA1,OA2,...,OAk} and inputs {IB1,IB2,...,IBm} "cause" outputs {OB1,OB2,...,OBl} and all of IAi happened before all IBi then OAi happens before OBi.
This property is even upheld by async operations such as SelectAsync; however, an unordered version called SelectAsyncUnordered exists, which does not preserve this ordering.
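To see the difference in practice, here is a minimal sketch using the JVM Akka Streams Scala API (mapAsync and mapAsyncUnordered are the Scala counterparts of Akka.NET's SelectAsync and SelectAsyncUnordered; slowLookup is a hypothetical async stage):

import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import scala.concurrent.Future

implicit val system: ActorSystem = ActorSystem("ordering-demo")
import system.dispatcher

// hypothetical async stage: each element takes a random amount of time
def slowLookup(i: Int): Future[Int] =
  Future { Thread.sleep(scala.util.Random.nextInt(100)); i }

// mapAsync runs up to 4 futures at once but emits results in input order
Source(1 to 10).mapAsync(4)(slowLookup).runForeach(println)          // always 1, 2, ..., 10

// mapAsyncUnordered emits each result as soon as its future completes
Source(1 to 10).mapAsyncUnordered(4)(slowLookup).runForeach(println) // order may vary

Note that the linked page covers operator ordering only; whether stream refs preserve order across the network is not stated there.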
I am using akka-streams and I hit an exception because I maxed out the HTTP connection pool of akka-http.
There is a Source of list elements, which get split and thus transformed into SubFlows.
The SubFlows issue HTTP requests. Although I put a buffer on the SubFlow, the buffer seems to take effect per SubFlow.
Is there a way to have a buffer based on the Source that takes effect across all SubFlows?
My mistake was that I was merging the substreams without taking the parallelism into consideration, by using
def mergeSubstreams(): Flow[In, Out, Mat]
From the documentation
This is identical in effect to mergeSubstreamsWithParallelism(Integer.MAX_VALUE).
Thus my workaround was to use
def mergeSubstreamsWithParallelism(parallelism: Int): Flow[In, Out, Mat]
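As a rough sketch of the fix (the key function i % 100 and fakeHttpRequest are hypothetical stand-ins for the real splitting logic and HTTP call):

import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import scala.concurrent.Future

implicit val system: ActorSystem = ActorSystem("substreams-demo")
import system.dispatcher

def fakeHttpRequest(i: Int): Future[Int] = Future(i) // stand-in for the real request

Source(1 to 1000)
  .groupBy(100, _ % 100)             // split into substreams by some key
  .mapAsync(1)(fakeHttpRequest)      // one in-flight request per substream
  .mergeSubstreamsWithParallelism(8) // at most 8 substreams active at once
  .runForeach(println)

mergeSubstreams merges with effectively unlimited parallelism, so every substream could hold an in-flight request at once; capping the merge parallelism backpressures the remaining substreams and bounds the total number of outstanding requests.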
Using the Windows API, I want to implement something like the following, i.e. getting the current microphone input level.
I am not allowed to use external audio libraries, but I can use Windows libraries. So I tried using waveIn functions, but I do not know how to process audio input data in real time.
This is the method I am currently using:
Record for 100 milliseconds
Select highest value from the recorded data buffer
Repeat forever
But I think this is way too hacky, and not a recommended way. How can I do this properly?
Having built a tuning wizard for a very dated, but well-known, A/V conferencing application, I can say that what you describe is nearly identical to what I did.
A few considerations:
Enqueue 5 to 10 of those 100 ms buffers into the audio device via waveInAddBuffer. IIRC, weird things happen when the waveIn queue goes empty. Then, as the waveInProc callbacks occur, search the completed buffer for the sample with the highest absolute value, as you describe. Then plot that onto your visualization and requeue the completed buffers.
It might seem obvious to map the sample value linearly onto your visualization as follows.
For example, to plot a 16-bit sample:
// map the sample's magnitude (samples are signed) from 0..32768 onto 0..N
length = (abs(sample) * N) / 32768;
DrawLine(length);
But then, when you speak into the microphone, that visualization won't seem as "active" or "vibrant" as you might expect.
A better approach is to give more weight to the lower-energy samples. An easy way to do this is to replot along the μ-law curve (or use a table lookup).
// map the sample's magnitude from 0..32768 onto 0..N linearly
length = (abs(sample) * N) / 32768;
// compress with a log curve so low-energy samples register more strongly,
// then rescale to 0..N and clamp
length = N * log(1 + length) / log(1 + N);
length = min(length, N);
DrawLine(length);
You can tweak the above approach to whatever looks good.
Instead of computing the values yourself, you can rely on values from Windows. These are actually the values displayed in the Windows Settings screenshot you posted.
See the following sample for the IAudioMeterInformation interface:
https://learn.microsoft.com/en-us/windows/win32/coreaudio/peak-meters
It is written for playback, but you can use it for capture as well.
Some remarks: if you open an IAudioMeterInformation for a microphone while no application has an open stream on that microphone, the reported level will be 0.
This means that to display your microphone's peak meter, you will need to keep a microphone stream open, as you already do.
Also, read the documentation on IAudioMeterInformation carefully; it reports the peak value, which may not be what you need. It depends on what you want to do with the value.
I am trying to create a node (a collection of nodes is fine too), that takes in many streams and an index, and outputs one stream specified by the index. Basically, I want to create a mux node, something like:
Node : Stream ... Number -> Stream
FFmpeg's filter graph API seems to have two filters for doing that: streamselect (for video) and astreamselect (for audio). And for the most part, they seem to do what I want:
[in0][in1][in2]streamselect=inputs=3:map=1[out]
This filter will take in three video streams and output the second one, in1.
You can use a similar filter for audio streams:
[in0][in1]astreamselect=inputs=2:map=0[out]
which will take in two streams and output the first one, in0.
The question is, can I create a filter that takes in a list of both audio and video streams and outputs the stream based only on the stream index? So something like:
[v0][v1][a0][a1][a2]avstreamselect=inputs=5:map=3[out]
Which maps a1 to out?
If it helps I am using the libavfilter C API rather than the command line interface.
While it may not be possible with one filter [1], it is certainly possible by combining multiple filters: one select filter for either audio or video (depending on which type you are selecting), and a bunch of nullsink or anullsink filters for the rest.
For example, the would-be filter:
[v0][v1][a0][a1]avstreamselect=inputs=4:map=2[out]
which takes in two video streams and two audio streams, and returns the third stream (the first audio stream), can be written as:
[a0][a1]astreamselect=inputs=2:map=0[out];
[v0]nullsink;[v1]nullsink
Here, we run the select on the audio streams, and all of the remaining streams are mapped to sinks. This idea can be generalized to use only nullsink, anullsink, copy, and acopy; for example, we could also have written it with 4 nodes:
[a0]acopy[out];
[a1]anullsink;
[v0]nullsink;
[v1]nullsink
[1] I still don't know if it is or not. Feel free to remove this if it actually is possible.
I'm trying to evaluate Apache Flink for the use case we're currently running in production using custom code.
So let's say there's a stream of events, each containing a specific attribute X, which is a continuously increasing integer. That is, a batch of contiguous events has this attribute set to N, the next batch has it set to N+1, etc.
I want to break the stream into windows of events with the same value of X and then do some computations on each separately.
So I define a GlobalWindow and a custom Trigger where, in the onElement method, I check the attribute of a given element against the saved value of the current X (kept in a state variable); if they differ, I conclude that we've accumulated all the events with X = CURRENT, and it's time to do the computation and increase the X value in the state.
The problem with this approach is that the first element of the next logical batch (with X = CURRENT + 1) has already been consumed, but it is not part of the previous batch.
Is there a way to somehow put it back into the stream so that it is properly accounted for in the next batch?
Or maybe my approach is entirely wrong and there's an easier way to achieve what I need?
Thank you.
I think you are on the right track.
Trigger specifies when a window can be processed and results for a window can be emitted.
The WindowAssigner is the part that determines which window an element is assigned to. So I would say you also need to provide a custom WindowAssigner implementation that assigns the same window to all elements with an equal value of X.
A more idiomatic way to do this with Flink would be to use stream.keyBy(X).window(...). The keyBy(X) takes care of grouping elements by their particular value of X. You then apply any sort of window you like. In your case a session window may be a good choice: it fires for each key after that key hasn't been seen for some configurable period of time.
This approach will be much more robust with regard to out-of-order data, which you must always assume in a stream processing system.
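A minimal sketch of that suggestion, assuming a hypothetical Event type carrying the X attribute, an arbitrary 10-second gap, and a placeholder reduce function:

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
import org.apache.flink.streaming.api.windowing.time.Time

case class Event(x: Long, payload: String) // hypothetical event with the X attribute

val env = StreamExecutionEnvironment.getExecutionEnvironment
val events: DataStream[Event] = ??? // your source

events
  .keyBy(_.x)                                                     // group by X
  .window(ProcessingTimeSessionWindows.withGap(Time.seconds(10))) // fire once X goes quiet
  .reduce((a, b) => Event(a.x, a.payload + b.payload))            // or any window function
  .print()

Because X only ever increases, each key stops receiving elements once its batch has passed, so its session window closes after the gap and one result is emitted per value of X.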
I have a question which is a variation of this question: Flink: how to store state and use in another stream?
I have two streams:
val ipStream: DataStream[IPAddress] = ???
val routeStream: DataStream[RoutingTable] = ???
I want to find out which route each packet uses. Usually this can be done with:
val ip = IPAddress("10.10.10.10")
val table = RoutingTable(Seq("10.10.10.0/24", "5.5.5.0/24"))
val route = table.lookup(ip) // == "10.10.10.0/24"
The problem here is that I cannot really key the stream, since that requires both the complete table and the IP address (and keys must be computed in isolation).
For every element from the ipStream, I need the latest routeStream element. Right now I'm using a hack where all of this is processed without parallelism:
ipStream
.connect(routeStream)
.keyBy(_ => 0, _ => 0)
.flatMap(new MyRichCoFlatMapFunction) // with ValueState[RoutingTable]
This sounds like a use case for a broadcast strategy. However, the routeStream will be updated over time and is not fixed in a file. The question remains: is there a way to have two streams, one of which contains changing control data for the other stream?
Since I solved the issue, I might as well write an answer here :)
I keyed the two streams like this:
The RoutingTable stream was keyed by the first byte of the network route
The IPAddress stream was also keyed by the first byte of the address
This works under the condition that IP packets are generally routed within the same /8 prefix as their matching route, which can be assumed for most traffic.
Then, with a stateful RichCoFlatMap, one can build up the routing table state per key. When a new IP packet arrives, do a lookup in the routing table. There are two possible scenarios:
No matching route has been found. We could store the packet for later here, but discarding it works as well.
If a route has been found, output the tuple [IPAddress, RoutingTableEntry].
This way, we have two streams where one of them has changing control data for the other stream.
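For illustration, here is a minimal sketch of that setup; the IPAddress and RoutingTable definitions, the firstByte helpers, and the crude lookup are hypothetical stand-ins for the real types:

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

case class IPAddress(value: String) {
  def firstByte: Int = value.takeWhile(_ != '.').toInt
}
case class RoutingTable(prefixes: Seq[String]) {
  def firstByte: Int = prefixes.head.takeWhile(_ != '.').toInt
  // crude /24-only textual match; a real implementation would do longest-prefix matching
  def lookup(ip: IPAddress): Option[String] =
    prefixes.find(p => ip.value.startsWith(p.takeWhile(_ != '/').dropRight(1)))
}

class RouteLookup extends RichCoFlatMapFunction[IPAddress, RoutingTable, (IPAddress, String)] {
  private var table: ValueState[RoutingTable] = _

  override def open(parameters: Configuration): Unit =
    table = getRuntimeContext.getState(
      new ValueStateDescriptor[RoutingTable]("routing-table", classOf[RoutingTable]))

  // an IP packet arrives: look it up in this key's table (if we have one yet)
  override def flatMap1(ip: IPAddress, out: Collector[(IPAddress, String)]): Unit =
    Option(table.value()).flatMap(_.lookup(ip)).foreach(route => out.collect((ip, route)))

  // a new routing table arrives: replace this key's state
  override def flatMap2(rt: RoutingTable, out: Collector[(IPAddress, String)]): Unit =
    table.update(rt)
}

val routed = ipStream
  .connect(routeStream)
  .keyBy(_.firstByte, _.firstByte) // key both sides by the /8 prefix
  .flatMap(new RouteLookup)

Keying both sides by the same function is what lets Flink deliver a packet and its candidate routing table to the same parallel instance, restoring parallelism compared with the keyBy(_ => 0, _ => 0) hack.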