Camel reactive streams not completing when subscribed more than once - apache-camel

import org.apache.camel.CamelContext
import org.apache.camel.builder.endpoint.EndpointRouteBuilder
import org.apache.camel.component.reactive.streams.api.CamelReactiveStreams
import org.apache.camel.component.reactive.streams.api.CamelReactiveStreamsService
import org.reactivestreams.Subscriber
import org.slf4j.LoggerFactory
import org.springframework.stereotype.Component
import reactor.core.publisher.Flux

@Component
class TestRoute(
    context: CamelContext,
) : EndpointRouteBuilder() {

    val streamName: String = "news-ticker-stream"
    val logger = LoggerFactory.getLogger(TestRoute::class.java)
    val camel: CamelReactiveStreamsService = CamelReactiveStreams.get(context)
    var count = 0L
    val subscriber: Subscriber<String> =
        camel.streamSubscriber(streamName, String::class.java)

    override fun configure() {
        // Every 30 seconds, publish three items to the reactive stream via the shared subscriber
        from("timer://foo?fixedRate=true&period=30000")
            .process {
                count++
                logger.info("Start emitting data for the $count time")
                Flux.fromIterable(
                    listOf(
                        "APPLE", "MANGO", "PINEAPPLE"
                    )
                )
                    .doOnComplete {
                        logger.info("All the data are emitted from the flux for the $count time")
                    }
                    .subscribe(
                        subscriber
                    )
            }

        // Consume the reactive stream and write each element to a file
        from(reactiveStreams(streamName))
            .to("file:outbox")
    }
}
2022-07-07 13:01:44.626 INFO 50988 --- [1 - timer://foo] c.e.reactivecameltutorial.TestRoute : Start emitting data for the 1 time
2022-07-07 13:01:44.640 INFO 50988 --- [1 - timer://foo] c.e.reactivecameltutorial.TestRoute : All the data are emitted from the flux for the 1 time
2022-07-07 13:01:44.646 INFO 50988 --- [1 - timer://foo] a.c.c.r.s.ReactiveStreamsCamelSubscriber : Reactive stream 'news-ticker-stream' completed
2022-07-07 13:02:14.616 INFO 50988 --- [1 - timer://foo] c.e.reactivecameltutorial.TestRoute : Start emitting data for the 2 time
2022-07-07 13:02:44.610 INFO 50988 --- [1 - timer://foo] c.e.reactivecameltutorial.TestRoute : Start emitting data for the 3 time
2022-07-07 13:02:44.611 WARN 50988 --- [1 - timer://foo] a.c.c.r.s.ReactiveStreamsCamelSubscriber : There is another active subscription: cancelled
The reactive stream does not complete when the timer fires more than once. As you can see in the logs, the message I added in doOnComplete only appears the first time the timer route is triggered; when the timer route is triggered the second time, there is no completion message. I put a breakpoint in ReactiveStreamsCamelSubscriber and found that on the first run the flow goes into the onNext() and onComplete() methods, but it does not go into these methods when the timer fires the second time. I am not able to understand why this is happening.

Related

Finding the correct traversal path for a packet

Suppose a scenario similar to the above image, where Node-A and Node-B send data to Node-D via Node-C. Node-A and Node-B each sent one data packet to Node-D: Node-A sent one msg with msg-id = 1, which is received at Node-C with msg-id = 1, and Node-B sent one msg with msg-id = 3, which is received at Node-C with msg-id = 3. Now, how will I know whether the msg forwarded from Node-C with msg-id = 2 is the msg from Node-A or Node-B, and whether the msg forwarded from Node-C with msg-id = 4 is the msg from Node-A or Node-B? How do I follow the correct path while traversing the trace.json file of the simulation?
The trace.json file has information about the thread of events and the stimulus related to each event. This should give you the relevant information for your tracing.
To illustrate the idea, let me take the example of how the ranging agent works. Other agents, such as the router, work similarly, so you can use the same idea for tracing routed frames.
See the following entry from a trace.json output of a ranging simulation:
{
"time":1645606042088,
"component":"ranging::org.arl.unet.localization.Ranging/A",
"threadID":"88b89b82-eb37-4ac5-83da-3695acb80e7f",
"stimulus":{"clazz":"org.arl.unet.phy.RxFrameNtf","messageID":"88b89b82-eb37-4ac5-83da-3695acb80e7f","performative":"INFORM","sender":"phy","recipient":"#phy__ntf"},
"response":{"clazz":"org.arl.unet.phy.TxFrameReq","messageID":"ee9f72f5-5753-4d46-a885-9e446bbf9746","performative":"REQUEST","recipient":"phy"}
}
This entry shows that the received frame (RxFrameNtf) with ID 88b89b82... caused the transmission of a new frame (TxFrameReq) with ID ee9f72f5.... The threadID entry can also be helpful, as the same threadID is maintained for the whole chain of events within a single node.
Each of your frames from nodes A and B will have unique IDs, and so will each of the frames relayed by node C. The trace.json entry corresponding to each relay transmission should tell you which stimulus (frame from node A or B) resulted in the transmission.
For your application, here are a few of the JSON entries extracted from your trace.json to illustrate this:
{
"time":10000, "component":"router::org.arl.unet.net.Router/A", "threadID":"2e37d6d6-a3d2-4e0f-aacc-456efdae91bb",
"stimulus":{"clazz":"org.arl.unet.DatagramReq", "messageID":"2e37d6d6-a3d2-4e0f-aacc-456efdae91bb", "performative":"REQUEST", "sender":"simulator", "recipient":"router"},
"response":{"clazz":"org.arl.unet.DatagramReq", "messageID":"54a55a12-4051-4934-88bc-beb4fa28c548", "performative":"REQUEST", "recipient":"uwlink"}
}
{
"time":10491, "component":"uwlink::org.arl.unet.link.ReliableLink/A", "threadID":"54a55a12-4051-4934-88bc-beb4fa28c548",
"stimulus":{"clazz":"org.arl.unet.DatagramReq", "messageID":"54a55a12-4051-4934-88bc-beb4fa28c548", "performative":"REQUEST", "sender":"router", "recipient":"uwlink"},
"response":{"clazz":"org.arl.unet.phy.TxFrameReq", "messageID":"87757d67-7510-4856-a9ee-a27924f9548a", "performative":"REQUEST", "recipient":"phy"}
}
{
"time":10491, "component":"phy::org.arl.unet.sim.HalfDuplexModem/A", "threadID":"87757d67-7510-4856-a9ee-a27924f9548a",
"stimulus":{"clazz":"org.arl.unet.phy.TxFrameReq", "messageID":"87757d67-7510-4856-a9ee-a27924f9548a", "performative":"REQUEST", "sender":"uwlink", "recipient":"phy"},
"response":{"clazz":"org.arl.unet.sim.HalfDuplexModem$TX", "messageID":"31b892fd-1b2c-40b1-a1d6-4dc457f57a2c", "performative":"INFORM", "recipient":"phy"}
}
{
"time":11870, "component":"phy::org.arl.unet.sim.HalfDuplexModem/C", "threadID":"31b892fd-1b2c-40b1-a1d6-4dc457f57a2c",
"stimulus":{"clazz":"org.arl.unet.sim.HalfDuplexModem$TX", "messageID":"31b892fd-1b2c-40b1-a1d6-4dc457f57a2c", "performative":"INFORM", "sender":"phy", "recipient":"phy"},
"response":{"clazz":"org.arl.unet.phy.RxFrameNtf", "messageID":"5cd195ef-74b0-4394-8f9a-9077de08bc56", "performative":"INFORM", "sender":"phy", "recipient":"#phy__ntf"}
}
{
"time":11870, "component":"router::org.arl.unet.net.Router/C", "threadID":"5cd195ef-74b0-4394-8f9a-9077de08bc56",
"stimulus":{"clazz":"org.arl.unet.phy.RxFrameNtf", "messageID":"5cd195ef-74b0-4394-8f9a-9077de08bc56", "performative":"INFORM", "sender":"phy", "recipient":"#phy__ntf"},
"response":{"clazz":"org.arl.unet.DatagramReq", "messageID":"e9ecfc32-ff9a-4750-99d5-b3b462dcd660", "performative":"REQUEST", "recipient":"uwlink"}
}
{
"time":12259, "component":"uwlink::org.arl.unet.link.ReliableLink/C", "threadID":"e9ecfc32-ff9a-4750-99d5-b3b462dcd660",
"stimulus":{"clazz":"org.arl.unet.DatagramReq", "messageID":"e9ecfc32-ff9a-4750-99d5-b3b462dcd660", "performative":"REQUEST", "sender":"router", "recipient":"uwlink"},
"response":{"clazz":"org.arl.unet.phy.TxFrameReq", "messageID":"76200f60-2334-4f20-88a3-9c2d42a769ad", "performative":"REQUEST", "recipient":"phy"}
}
{
"time":12259, "component":"phy::org.arl.unet.sim.HalfDuplexModem/C", "threadID":"76200f60-2334-4f20-88a3-9c2d42a769ad",
"stimulus":{"clazz":"org.arl.unet.phy.TxFrameReq", "messageID":"76200f60-2334-4f20-88a3-9c2d42a769ad", "performative":"REQUEST", "sender":"uwlink", "recipient":"phy"},
"response":{"clazz":"org.arl.unet.sim.HalfDuplexModem$TX", "messageID":"f3161e4e-1007-4540-8192-f0d7bf80e126", "performative":"INFORM", "recipient":"phy"}
}
{
"time":13542, "component":"phy::org.arl.unet.sim.HalfDuplexModem/D", "threadID":"f3161e4e-1007-4540-8192-f0d7bf80e126",
"stimulus":{"clazz":"org.arl.unet.sim.HalfDuplexModem$TX", "messageID":"f3161e4e-1007-4540-8192-f0d7bf80e126", "performative":"INFORM", "sender":"phy", "recipient":"phy"},
"response":{"clazz":"org.arl.unet.phy.RxFrameNtf", "messageID":"ab5e15a9-27e0-4495-b0b8-f199159cb2a3", "performative":"INFORM", "sender":"phy", "recipient":"#phy__ntf"}
}
{
"time":13542, "component":"uwlink::org.arl.unet.link.ReliableLink/D", "threadID":"ab5e15a9-27e0-4495-b0b8-f199159cb2a3",
"stimulus":{"clazz":"org.arl.unet.phy.RxFrameNtf", "messageID":"ab5e15a9-27e0-4495-b0b8-f199159cb2a3", "performative":"INFORM", "sender":"phy", "recipient":"#phy__ntf"},
"response":{"clazz":"org.arl.unet.DatagramNtf", "messageID":"bb534fc0-47ec-4b2d-8124-07af87528d37", "performative":"INFORM", "recipient":"#uwlink__ntf"}
}
{
"time":13542, "component":"router::org.arl.unet.net.Router/D", "threadID":"bb534fc0-47ec-4b2d-8124-07af87528d37",
"stimulus":{"clazz":"org.arl.unet.DatagramNtf", "messageID":"bb534fc0-47ec-4b2d-8124-07af87528d37", "performative":"INFORM", "sender":"uwlink", "recipient":"#uwlink__ntf"},
"response":{"clazz":"org.arl.unet.DatagramNtf", "messageID":"5e660793-9a44-4341-8843-fc7fa91aa450", "performative":"INFORM", "recipient":"#router__ntf"}
}
These entries show the sequence of events that occurred:
time 10000: DatagramReq from router#A to uwlink#A
time 10491: TxFrameReq from uwlink#A to phy#A
time 10491: TX from phy#A to phy#C
time 11870: RxFrameNtf from phy#C (publish on topic)
time 11870: DatagramReq from router#C to uwlink#C
time 12259: TxFrameReq from uwlink#C to phy#C
time 12259: TX from phy#C to phy#D
time 13542: RxFrameNtf from phy#D (publish on topic)
time 13542: DatagramNtf from uwlink#D (publish on topic)
time 13542: DatagramNtf from router#D (publish on topic)
You should find that each JSON entry's response.messageID corresponds to the next JSON entry's stimulus.messageID. This allows you to follow through the sequence of events.
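If you want to do this traversal programmatically, here is a minimal sketch. It is only an illustration: the TraceEvent case class and the followChain helper are my own assumptions, not part of the UnetStack API, and the entries are assumed to have already been parsed out of trace.json (with any JSON library of your choice). It follows a chain by repeatedly matching an entry's response messageID against the stimulus messageID of the next entry, exactly as described above.
// Illustrative helper, not part of UnetStack: assumes trace.json has already been
// parsed into this minimal shape.
case class TraceEvent(time: Long, component: String,
                      stimulusId: String, responseId: Option[String])

// Follow the chain: an entry's responseId becomes the stimulusId of the next entry.
def followChain(events: Seq[TraceEvent], startStimulusId: String): List[TraceEvent] = {
  val byStimulus = events.groupBy(_.stimulusId)      // index entries by stimulus messageID
  def step(id: String): List[TraceEvent] =
    byStimulus.getOrElse(id, Seq.empty).toList.flatMap { e =>
      e :: e.responseId.map(step).getOrElse(Nil)     // append the events it triggered
    }
  step(startStimulusId)
}

// Example: starting from the messageID of node A's original DatagramReq
// ("2e37d6d6-...") yields the relay chain through node C down to node D.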

Why does Gatling stop the simulation when any scenario exits, instead of waiting until the end?

Let's say I have this configuration
val scn = (name: String) => scenario(name)
  .forever {
    exec(request)
  }

setUp(
  scn("scn1").inject(atOnceUsers(1))
    .throttle(
      jumpToRps(1), holdFor(10 seconds)
    ),
  scn("scn2").inject(atOnceUsers(1))
    .throttle(jumpToRps(1), holdFor(20 seconds))
).protocols(http.baseURLs(url))
I would expect the whole simulation to run for 20 seconds, until everything is finished. What actually happens is that the simulation stops after 10 seconds, right after the first scenario finishes.
---- Requests ------------------------------------------------------------------
> Global (OK=20 KO=0 )
> scn1 / HTTP Request (OK=10 KO=0 )
> scn2 / HTTP Request (OK=10 KO=0 )
---- scn1 ----------------------------------------------------------------------
[--------------------------------------------------------------------------] 0%
waiting: 0 / active: 1 / done:0
---- scn2 ----------------------------------------------------------------------
[--------------------------------------------------------------------------] 0%
waiting: 0 / active: 1 / done:0
================================================================================
Simulation foo.Bar completed in 10 seconds
To overcome this in general, I need to configure every scenario that ends earlier than the final one to wait with zero throttle.
setUp(
  scn("scn1").inject(atOnceUsers(1))
    .throttle(
      jumpToRps(1), holdFor(10 seconds),
      jumpToRps(0), holdFor(10 seconds) // <-- added wait
    ),
  scn("scn2").inject(atOnceUsers(1))
    .throttle(jumpToRps(1), holdFor(20 seconds))
).protocols(http.baseURLs(url))
Is this expected behavior? What other options do I have to make my simulation run until all scenarios are finished or until maxDuration?
A possible explanation is that a feeder loops over its data and exits when there is no more data. In that case, call "circular" on your feeder so that it goes back to the top of the sequence once the end is reached.
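For example, a minimal sketch of that suggestion (the file name data.csv and the request definition are only illustrative placeholders, not from the question): with .circular the feeder wraps around to the first record instead of stopping the virtual user when the records run out.
import io.gatling.core.Predef._
import io.gatling.http.Predef._

// Illustrative only: data.csv and the request are placeholders.
val requestFeeder = csv("data.csv").circular // wrap around instead of stopping when exhausted
val request = http("HTTP Request").get("/")

val scn = (name: String) => scenario(name)
  .forever {
    feed(requestFeeder)
      .exec(request)
  }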

Flink Streaming: From one window, lookup state in another window

I have two streams:
Measurements
WhoMeasured (metadata about who took the measurement)
These are the case classes for them:
case class Measurement(var value: Int, var who_measured_id: Int)
case class WhoMeasured(var who_measured_id: Int, var name: String)
The Measurement stream has a lot of data. The WhoMeasured stream has little. In fact, for each who_measured_id in the WhoMeasured stream, only 1 name is relevant, so old elements can be discarded if one with the same who_measured_id arrives. This is essentially a HashTable that gets filled by the WhoMeasured stream.
In my custom window function
class WFunc extends WindowFunction[Measurement, Long, Int, TimeWindow] {
  override def apply(key: Int, window: TimeWindow, input: Iterable[Measurement], out: Collector[Long]): Unit = {
    // Here I need access to the WhoMeasured stream to get the name of the person who took a measurement
    // The following two are equivalent since I keyed by who_measured_id
    val name_who_measured = magic(key)
    val name_who_measured = magic(input.head.who_measured_id)
  }
}
This is my job. As you might see, something is missing: the combination of the two streams.
val who_measured_stream = who_measured_source
  .keyBy(w => w.who_measured_id)
  .countWindow(1)

val measurement_stream = measurements_source
  .keyBy(m => m.who_measured_id)
  .timeWindow(Time.seconds(60), Time.seconds(5))
  .apply(new WFunc)
So in essence this is sort of a lookup table that gets updated when new elements in the WhoMeasured stream arrive.
So the question is: How to achieve such a lookup from one WindowedStream into another?
Follow Up:
After implementing in the way Fabian suggested, the job always fails with some sort of serialization issue:
[info] Loading project definition from /home/jgroeger/Code/MeasurementJob/project
[info] Set current project to MeasurementJob (in build file:/home/jgroeger/Code/MeasurementJob/)
[info] Compiling 8 Scala sources to /home/jgroeger/Code/MeasurementJob/target/scala-2.11/classes...
[info] Running de.company.project.Main dev MeasurementJob
[error] Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The implementation of the RichCoFlatMapFunction is not serializable. The object probably contains or references non serializable fields.
[error] at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:100)
[error] at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1478)
[error] at org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:161)
[error] at org.apache.flink.streaming.api.datastream.ConnectedStreams.flatMap(ConnectedStreams.java:230)
[error] at org.apache.flink.streaming.api.scala.ConnectedStreams.flatMap(ConnectedStreams.scala:127)
[error] at de.company.project.jobs.MeasurementJob.run(MeasurementJob.scala:139)
[error] at de.company.project.Main$.main(Main.scala:55)
[error] at de.company.project.Main.main(Main.scala)
[error] Caused by: java.io.NotSerializableException: de.company.project.jobs.MeasurementJob
[error] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
[error] at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
[error] at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
[error] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
[error] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
[error] at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
[error] at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:301)
[error] at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:81)
[error] ... 7 more
java.lang.RuntimeException: Nonzero exit code returned from runner: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last MeasurementJob/compile:run for the full output.
[error] (MeasurementJob/compile:run) Nonzero exit code returned from runner: 1
[error] Total time: 9 s, completed Nov 15, 2016 2:28:46 PM
Process finished with exit code 1
The error message:
The implementation of the RichCoFlatMapFunction is not serializable. The object probably contains or references non serializable fields.
However, the only field my JoiningCoFlatMap has is the suggested ValueState.
The signature looks like this:
class JoiningCoFlatMap extends RichCoFlatMapFunction[Measurement, WhoMeasured, (Measurement, String)] {
I think what you want to do is a window operation followed by a join.
You can implement the join of a high-volume stream and a low-volume, update-by-key stream using a stateful CoFlatMapFunction, as in the example below:
val measures: DataStream[Measurement] = ???
val who: DataStream[WhoMeasured] = ???

val agg: DataStream[(Int, Long)] = measures
  .keyBy(_.who_measured_id)
  .timeWindow(Time.seconds(60), Time.seconds(5))
  .apply( (id: Int, w: TimeWindow, v: Iterable[Measurement], out: Collector[(Int, Long)]) => {
    // do your aggregation
  })

val joined: DataStream[(Int, Long, String)] = agg
  .keyBy(_._1) // who_measured_id
  .connect(who.keyBy(_.who_measured_id))
  .flatMap(new JoiningCoFlatMap)

// CoFlatMapFunction
class JoiningCoFlatMap extends RichCoFlatMapFunction[(Int, Long), WhoMeasured, (Int, Long, String)] {

  var names: ValueState[String] = null

  override def open(conf: Configuration): Unit = {
    val stateDescrptr = new ValueStateDescriptor[String](
      "whoMeasuredName",
      classOf[String],
      "" // default value
    )
    names = getRuntimeContext.getState(stateDescrptr)
  }

  override def flatMap1(a: (Int, Long), out: Collector[(Int, Long, String)]): Unit = {
    // join with state
    out.collect( (a._1, a._2, names.value()) )
  }

  override def flatMap2(w: WhoMeasured, out: Collector[(Int, Long, String)]): Unit = {
    // update state
    names.update(w.name)
  }
}
A note on the implementation: a CoFlatMapFunction cannot decide which input to process, i.e., the flatMap1 and flatMap2 methods are called depending on which data arrives at the operator; this cannot be controlled by the function. This is a problem when initializing the state: in the beginning, the state might not yet hold the correct name for an arriving Measurement object and will return the default value instead. You can avoid that by buffering the measurements and joining them once the first update for the key arrives from the who stream. You will need another state for that; a sketch of this variant follows below.
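A minimal sketch of that buffering variant (my own illustration, not code from the original answer): it assumes a Flink version whose RuntimeContext provides getListState, reuses the WhoMeasured case class from the question, and uses made-up class and state names. Measurements that arrive before the first WhoMeasured update for their key are parked in a ListState and flushed once the name is known.
import org.apache.flink.api.common.state.{ListState, ListStateDescriptor, ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
import org.apache.flink.util.Collector

// WhoMeasured is the case class from the question; the class and state names are illustrative.
class BufferingJoiningCoFlatMap
    extends RichCoFlatMapFunction[(Int, Long), WhoMeasured, (Int, Long, String)] {

  private var name: ValueState[String] = _
  private var pending: ListState[(Int, Long)] = _

  override def open(conf: Configuration): Unit = {
    name = getRuntimeContext.getState(
      new ValueStateDescriptor[String]("whoMeasuredName", classOf[String]))
    pending = getRuntimeContext.getListState(
      new ListStateDescriptor[(Int, Long)]("pendingMeasurements", classOf[(Int, Long)]))
  }

  override def flatMap1(a: (Int, Long), out: Collector[(Int, Long, String)]): Unit = {
    val n = name.value()
    if (n == null) pending.add(a)          // no name for this key yet: buffer the record
    else out.collect((a._1, a._2, n))      // name known: join immediately
  }

  override def flatMap2(w: WhoMeasured, out: Collector[(Int, Long, String)]): Unit = {
    name.update(w.name)
    val buffered = pending.get()           // flush everything buffered for this key
    if (buffered != null) {
      val it = buffered.iterator()
      while (it.hasNext) {
        val m = it.next()
        out.collect((m._1, m._2, w.name))
      }
    }
    pending.clear()
  }
}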

Use of pace in gatling to control rate

I have the following scenario, which has two requests (RequestOne and RequestTwo). It is set up to run with 3 users and 1 repetition. The simulation should take at least 20 seconds to finish, since I am using a 20-second pace. However, every time I run it, it finishes in less than 20 seconds. I tried different values for the pace as well.
val Workload = scenario("Load Test")
.repeat(1, "repetition") {
pace(20 seconds)
.exitBlockOnFail {
.feed(requestIdFeeder)
.group("Load Test") {
.exec(session => {
session.set("url", spURL)
})
.group("RequestOne") {exec(requestOne)}
.feed(requestIdFeeder)
.group("RequestTwo") {exec(requestTwo)}
}
}
}
setUp(Workload.inject(atOnceUsers(3))).protocols(httpProtocol)
output
Simulation com.performance.LoadTest completed in 11 seconds
Found the problem: I used only 1 repetition, so the scenario didn't need to wait for the 20-second pace to complete and exited early. Setting the repetition count to > 1 helped achieve the desired rate.
val Workload = scenario("Load Test")
  .repeat(10, "repetition") {
    pace(20 seconds)
      .exitBlockOnFail {
So, if you want to achieve a fixed number of transactions in your simulation, use repeat; otherwise, use forever, as mentioned in the Gatling documentation, to achieve a consistent rate.
val Workload = scenario("Load Test")
  .forever (
    pace(20 seconds)
      .exitBlockOnFail {

How do I access the value of a timer once set?

Given the R3-GUI code below, is there a way to access how much time is left on the timer? The timer ID is returned by set-timer, but I am not sure whether there is anything I can do with it.
set-timer [print "done"] 60
In other words, this fake code example shows what I am looking for:
>> get-timer/time-remaining timer-id
== 0:0:21
The answer can be found by looking at the source of set-timer
>> source set-timer
set-timer: make function! [[
{Calls a function after a specified amount of time. Returns timer ID reference.}
code [block!]
timeout [number! time!]
/repeat "Periodically repeat the function."
/local t id
][
t: now/precise
if number? timeout [timeout: to time! timeout]
sort/skip/compare append guie/timers compose/deep/only [(id: guie/timer-id: guie/timer-id + 1) [
timeout (t + timeout)
rate (all [
repeat
max 0:00 timeout
])
callback (function [] code)
]] 2 2
guie/timeout: true
id
]
]
If the timer is still going, it will be in the guie object.
>> guie/timers
== []
>> set-timer [print "done"] 2
== 5
>> guie/timers
== [5 [
timeout 11-Aug-2013/22:41:13.381-5:00
rate none
callback make function! [[
/local
][print "done"]]
]]
And getting the date value will look like this:
second select guie/timers timer-id
>> b: second select guie/timers 5
== 11-Aug-2013/22:41:13.381-5:00
>> c: now/time - b/time
== 0:0:55
If the timer has finished, do-events clears it out. If events are not being run, then the timer will stay even after the time has run out.
