CEP - Pattern not executed after adding Window - apache-flink

// Create a count window of ten items per key
WindowedStream<ObservationEvent, Tuple, GlobalWindow> windowStream = inputStream.keyBy("rackId").countWindow(10);
// Apply a window function that evaluates all the values in the window
DataStream<ObservationEvent> inactivityStream = windowStream.apply(new WindowFunction<ObservationEvent, ObservationEvent, Tuple, GlobalWindow>() {
    @Override
    public void apply(Tuple tuple, GlobalWindow timeWindow, Iterable<ObservationEvent> itr, Collector<ObservationEvent> out) {
        // custom evaluation logic
        out.collect(new ObservationEvent(1, "temperature", "stable"));
    }
});
// Define a simple CEP pattern
Pattern<ObservationEvent, ?> inactivityPattern = Pattern.<ObservationEvent>begin("first")
    .subtype(ObservationEvent.class)
    .where(new FilterFunction<ObservationEvent>() {
        @Override
        public boolean filter(ObservationEvent arg0) throws Exception {
            System.out.println(arg0); // This function is never called
            return false;
        }
    });
PatternStream<ObservationEvent> inactivityCEP = CEP.pattern(inactivityStream.keyBy("rackId"), inactivityPattern);
When I run this code, the filter function inside the where clause is never called.
I printed the stream with inactivityStream.print() and I can see the matching value.
When I plug in inputStream directly, without applying a window, the pattern matches.
I printed inputStream and the windowed stream, and both emit similar data.
What am I missing?

The FilterFunction should get called eventually, but you have to wait for 10 events with the SAME key before it is called for the first time. Could it be that you are just not waiting long enough in your windowing test?
Keep in mind that if you have many unique keys, you will have to wait well over 10 times as long in the window test before you see your filter function called.
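To make the waiting time concrete, here is a minimal plain-Java sketch of a per-key count window. It is not Flink code: the class CountWindowSketch, its methods, and the "rack-N" keys are hypothetical stand-ins for keyBy("rackId").countWindow(10), and it assumes events arrive round-robin across the keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for keyBy(...).countWindow(size); not Flink code.
final class CountWindowSketch {
    private final int size;
    private final Map<String, List<Integer>> buffers = new HashMap<>();
    private int fired = 0;

    CountWindowSketch(int size) {
        this.size = size;
    }

    // Buffers one event under its key. Returns the full window once that key
    // has collected `size` events, otherwise null.
    List<Integer> add(String key, int value) {
        List<Integer> buffer = buffers.computeIfAbsent(key, k -> new ArrayList<>());
        buffer.add(value);
        if (buffer.size() < size) {
            return null;
        }
        buffers.remove(key);
        fired++;
        return buffer;
    }

    int firedCount() {
        return fired;
    }

    // Feeds `events` events round-robin over `keys` distinct keys and
    // reports how many windows fired.
    static int simulate(int keys, int events, int size) {
        CountWindowSketch window = new CountWindowSketch(size);
        for (int i = 0; i < events; i++) {
            window.add("rack-" + (i % keys), i);
        }
        return window.firedCount();
    }

    public static void main(String[] args) {
        System.out.println(simulate(1, 10, 10)); // one key: fires after 10 events
        System.out.println(simulate(5, 45, 10)); // five keys: nothing has fired yet
        System.out.println(simulate(5, 46, 10)); // the 46th event completes the first window
    }
}
```

With five keys, nothing reaches the downstream pattern until the 46th input event, so a short test run can look as if the filter is never called.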

Related

Flink DataStream sort program does not output

I have written a small test case code in Flink to sort a datastream. The code is as follows:
public enum StreamSortTest {
    ;
    public static class MyProcessWindowFunction extends ProcessWindowFunction<Long, Long, Integer, TimeWindow> {
        @Override
        public void process(Integer key, Context ctx, Iterable<Long> input, Collector<Long> out) {
            List<Long> sortedList = new ArrayList<>();
            for (Long i : input) {
                sortedList.add(i);
            }
            Collections.sort(sortedList);
            sortedList.forEach(l -> out.collect(l));
        }
    }

    public static void main(final String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2);
        env.getConfig().setExecutionMode(ExecutionMode.PIPELINED);
        DataStream<Long> probeSource = env.fromSequence(1, 500).setParallelism(2);
        // range partition the stream into two parts based on data value
        DataStream<Long> sortOutput =
            probeSource
                .keyBy(x -> {
                    if (x < 250) {
                        return 1;
                    } else {
                        return 2;
                    }
                })
                .window(TumblingProcessingTimeWindows.of(Time.seconds(20)))
                .process(new MyProcessWindowFunction());
        sortOutput.print();
        System.out.println(env.getExecutionPlan());
        env.executeAsync();
    }
}
However, the code just outputs the execution plan and a few other lines. But it doesn't output the actual sorted numbers. What am I doing wrong?
The main problem I can see is that you are using a processing-time window with very short input, which will certainly be processed in less than 20 seconds. Flink is able to detect the end of input (for a stream from a file or a sequence, as in your case) and emit a Long.MAX_VALUE watermark, which closes all open event-time windows and fires all event-time timers. It does not do the same for processing-time computations, so in your case you need to make sure the job runs long enough for your window to close, or switch to a custom trigger or a different time characteristic.
One other thing I am not sure about, since I have never used it much, is whether you should use executeAsync for local execution; according to the docs, it is meant for situations where you don't want to wait for the result of the job.
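The difference between the two time domains at end of input can be illustrated with a small plain-Java model. This is only a sketch under simplifying assumptions, not Flink's implementation: EndOfInputSketch and its methods are hypothetical, all 500 elements are assumed to arrive within the first 20-second window, and a window is considered closed once the current time (watermark or wall clock) reaches its end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java model of tumbling windows; names are hypothetical, not Flink API.
final class EndOfInputSketch {
    private final long windowSize;
    // window start -> buffered elements
    private final TreeMap<Long, List<Long>> windows = new TreeMap<>();
    private final List<List<Long>> fired = new ArrayList<>();

    EndOfInputSketch(long windowSize) {
        this.windowSize = windowSize;
    }

    void add(long timestamp, long value) {
        long start = timestamp - (timestamp % windowSize);
        windows.computeIfAbsent(start, s -> new ArrayList<>()).add(value);
    }

    // Fires every window that ends at or before `time`.
    void advanceTo(long time) {
        while (!windows.isEmpty()) {
            long start = windows.firstKey();
            // Written as a subtraction so that Long.MAX_VALUE does not overflow.
            if (time - start < windowSize) {
                break;
            }
            fired.add(windows.pollFirstEntry().getValue());
        }
    }

    // eventTime = true models the final Long.MAX_VALUE watermark at end of input.
    // eventTime = false models a wall clock that advanced only by `elapsedMillis`
    // before the job finished.
    static int firedWindows(boolean eventTime, long elapsedMillis) {
        EndOfInputSketch sketch = new EndOfInputSketch(20_000);
        for (long v = 1; v <= 500; v++) {
            sketch.add(0, v);
        }
        sketch.advanceTo(eventTime ? Long.MAX_VALUE : elapsedMillis);
        return sketch.fired.size();
    }

    public static void main(String[] args) {
        System.out.println(firedWindows(true, 100));     // event time: window is flushed
        System.out.println(firedWindows(false, 100));    // processing time, job ran 100 ms: nothing fires
        System.out.println(firedWindows(false, 20_000)); // processing time, job ran 20 s: window fires
    }
}
```

In this model the processing-time window only fires if the job stays alive for the full window length, which matches the behaviour described above.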

In a Flink ProcessFunction, all MapState is empty in the onTimer() function

I want to implement an aggregation with a KeyedProcessFunction, because the default AggregateFunction does not support rich functions.
I also tried AggregateFunction + ProcessWindowFunction (https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html), but that does not satisfy my needs either, so I have to use the basic KeyedProcessFunction to implement the aggregation. The details of my problem are as follows.
In the process function, I define a windowState to hold the aggregated value of the elements:
public void open(Configuration parameters) throws Exception {
    followCacheMap = FollowSet.getInstance();
    windowState = getRuntimeContext().getMapState(windowStateDescriptor);
    currentTimer = getRuntimeContext().getState(new ValueStateDescriptor<Long>(
        "timer",
        Long.class
    ));
}
In processElement(), I use windowState (a MapState initialized in open()) to aggregate the window elements, and register the first timer to clear the current window state:
@Override
public void processElement(FollowData value, Context ctx, Collector<FollowData> out) throws Exception
{
    if ( (currentTimer==null || (currentTimer.value() ==null) || (long)currentTimer.value()==0 ) && value.getClickTime() != null) {
        currentTimer.update(value.getClickTime() + interval);
        ctx.timerService().registerEventTimeTimer((long)currentTimer.value());
    }
    windowState = doMyAggregation(value);
}
In onTimer(), I first register the next timer one minute later, then emit and clear the window state:
@Override
public void onTimer(long timestamp, OnTimerContext ctx, Collector<FollowData> out) throws Exception {
    currentTimer.update(timestamp + interval); // interval is 1 minute
    ctx.timerService().registerEventTimeTimer((long)currentTimer.value());
    out.collect(windowState);
    windowState.clear();
}
But when the program runs, windowState is always empty in onTimer(), although it is not empty in processElement(). I don't know why this happens; maybe the execution logic is different. How can I fix this?
Thanks in advance!
Newly added code for the doMyAggregation() part:
windowState is a MapState; the key is "mykey" and the value is a self-defined object, AggregateFollow:
public class AggregateFollow {
    private String clicked;
    private String unionid;
    private ArrayList allFollows;
    private int enterCnt;
    private Long clickTime;
}
The doMyAggregation(value) function looks roughly like this. Its purpose is to collect all values whose source field is 'follow'; if no value with source 'click' arrives within 1 minute, the 'follow' values should be discarded. In short, it works like a join of 'follow' data and 'click' data:
AggregateFollow acc = windowState.get(windowkey);
String flag = acc.getClicked();
ArrayList<FollowData> followDataList = acc.getAllFollows();
if ("0".equals(flag)) {
    if ("follow".equals(value.getSource())) {
        followDataList.add(value);
        acc.setAllFollows(followDataList);
    }
    if ("click".equals(value.getSource())) {
        String unionid = value.getUnionid();
        clickTime = value.getClickTime();
        if (followDataList.size() > 0) {
            ArrayList listNew = new ArrayList();
            for (FollowData followData : followDataList) {
                followData.setUnionid(unionid);
                followData.setClickTime(clickTime);
                followData.setSource("joined_flag");
            }
            acc.setAllFollows(listNew);
        }
        acc.setClicked("1");
        acc.setUnionid(unionid);
        acc.setClickTime(clickTime);
        windowState.put(windowkey, acc);
    }
} else if ("1".equals(flag)) {
    if ("follow".equals(value.getSource())) {
        value.setUnionid(acc.getUnionid());
        value.setClickTime(acc.getClickTime());
        value.setSource("joined_flag");
        followDataList.add(value);
        acc.setAllFollows(followDataList);
        windowState.put(windowkey, acc);
    }
}
Because of performance problems, the original window API is not a valid choice for me; the only way I can see is to use a process function + onTimer and Guava Cache.
Thanks a lot
If windowState is empty, it would be helpful to see what doMyAggregation(value) is doing.
It's difficult to debug this, or propose good alternatives, without more context, but out.collect(windowState) isn't going to work as intended. What you might want to do instead would be to iterate over this MapState and collect each key/value pair it contains to the output.
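As a rough illustration of that suggestion, the sketch below uses plain Java stand-ins: a Map plays the role of the MapState and a List plays the role of the Collector. The class EmitEntriesSketch and its method are hypothetical; in Flink the loop would presumably run over windowState.entries() and call out.collect(...) for each value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in types: a Map plays the role of MapState, a List plays the role of the Collector.
final class EmitEntriesSketch {
    // Emits each value held in the state, then clears the state,
    // and returns everything that was emitted.
    static <K, V> List<V> emitAndClear(Map<K, V> state) {
        List<V> out = new ArrayList<>();
        for (Map.Entry<K, V> entry : state.entrySet()) {
            out.add(entry.getValue());
        }
        state.clear();
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> state = new LinkedHashMap<>();
        state.put("a", 1);
        state.put("b", 2);
        System.out.println(emitAndClear(state));
        System.out.println(state.isEmpty());
    }
}
```

The point is that the entries are emitted one by one before the state is cleared, instead of passing the state handle itself downstream.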
I changed the type of windowState from MapState to ValueState, and the problem is solved. Maybe it is a bug or something; can anyone explain this?

Why can not update autocomplete suggestions?

I am using two AutoCompleteTextFields as dependent filters. I want the second filter to change its options depending on the suggestion chosen in the first filter.
I have bound an event listener to the first filter so that when it loses focus it triggers a process on the second filter.
The problem is that the second filter never changes its options. I even set up hardcoded values in case something was wrong in my code, but no luck.
The code I use is below:
public CreateSubmission(com.codename1.ui.util.Resources resourceObjectInstance, Map<String, ProjectType> projectTypes) {
    this.projectTypes = projectTypes;
    initGuiBuilderComponents(resourceObjectInstance);
    gui_ac_projecttype.clear();
    gui_ac_projecttype.setCompletion( this.projectTypes.keySet().toArray( new String[0]) );
    gui_ac_projecttype.addFocusListener( new ProjectTypeFocusListener( this ));
    gui_ac_steps.setCompletion( new String[]{"t10", "t20"});
}

public void makeSteps (String selection) {
    ProjectType projectType = this.projectTypes.get( selection );
    if (projectType != null) {
        this.selectedProjectType = selection;
        int length = projectType.projectSteps.length;
        String[] steps = new String[ length ];
        for(int i =0; i < length; i ++) {
            steps[i] = projectType.projectSteps[i].projectStep;
        }
        // String[] s = gui_ac_steps.getCompletion();
        gui_ac_steps.setCompletion( new String[]{"t1", "t2"} );
        gui_ac_steps.repaint();
    }
    else {
    }
}

public class ProjectTypeFocusListener implements FocusListener{
    private CreateSubmission parent;

    public ProjectTypeFocusListener( CreateSubmission parent ) {
        this.parent = parent;
    }

    @Override
    public void focusGained(Component cmp) {
        //throw new UnsupportedOperationException("Not supported yet."); //To change body of generated methods, choose Tools | Templates.
    }

    @Override
    public void focusLost(Component cmp) {
        this.parent.makeSteps (
            ((AutoCompleteTextField)cmp).getText()
        );
        //throw new UnsupportedOperationException("Not supported yet."); //To change body of generated methods, choose Tools | Templates.
    }
}
In the code above, the initialization happens in the CreateSubmission constructor.
"gui_ac_projecttype" is the first AutoCompleteTextField; it triggers the whole process through its FocusListener (class ProjectTypeFocusListener).
"gui_ac_steps" is the second AutoCompleteTextField, the one that must change its values. In the code above I initialize its suggestions to "t10", "t20". Those two values are shown correctly.
Later, from inside the listener method ProjectTypeFocusListener.focusLost, I call makeSteps, which sets the suggestions to "t1", "t2", and then I repaint the component. These last two values are never shown; the field keeps the first values "t10", "t20".
The strange thing is that in the debugger, when I call gui_ac_steps.getCompletion() to see the current options (the code that is commented out in makeSteps), I get the correct values "t1", "t2".
But on the screen it keeps showing "t10", "t20".
Any help is appreciated.
You shouldn't do anything "important" in a focus listener. Especially not with a text field. They are somewhat unreliable because the text field switches to native editing and in effect transfers the focus there. The problem is that some events are delayed due to the back and forth with the native editing so by the time the focus event is received you've moved on to the next field.
Try something like this for this specific use case https://www.codenameone.com/blog/dynamic-autocomplete.html

Flink executes dataflow twice

I'm new to Flink and I work with DataSet API. After a whole bunch of processing as the last stage I need to normalize one of the values by dividing it by its maximum value. So, I have used the .max() operator to take the max and later I'm passing the result as constructor's argument to the MapFunction.
This works, however all the processing is performed twice. One job is executed to find max values, and later another job is executed to create final result (starting execution from the beginning)... Is there any workaround to execute whole dataflow only once?
final List<Tuple6<...>> maxValues = result.max(2).collect();
assert maxValues.size() == 1;
result.map(new NormalizeAttributes(maxValues.get(0))).writeAsCsv(...)

@FunctionAnnotation.ForwardedFields("f0; f1; f3; f4; f5")
@FunctionAnnotation.ReadFields("f2")
private static class NormalizeAttributes implements MapFunction<Tuple6<...>, Tuple6<...>> {
    private final Tuple6<...> maxValues;

    public NormalizeAttributes(Tuple6<...> maxValues) {
        this.maxValues = maxValues;
    }

    @Override
    public Tuple6<...> map(Tuple6<...> value) throws Exception {
        value.f2 /= maxValues.f2;
        return value;
    }
}
collect() immediately triggers an execution of the program up to the dataset requested by collect(). If you later call env.execute() or collect() again, the program is executed a second time.
Besides triggering an execution as a side effect, using collect() to distribute values to a subsequent transformation has another drawback: the data is transferred to the client and later back into the cluster. Flink offers so-called broadcast variables to ship a DataSet as a side input into another transformation.
Using Broadcast variables in your program would look as follows:
DataSet<Tuple6<...>> maxValues = result.max(2);
result
    .map(new NormAttrs()).withBroadcastSet(maxValues, "maxValues")
    .writeAsCsv(...);
The NormAttrs function would look like this:
private static class NormAttrs extends RichMapFunction<Tuple6<...>, Tuple6<...>> {
    private Tuple6<...> maxValues;

    @Override
    public void open(Configuration config) {
        // The broadcast set holds a single element (the max), so read index 0
        maxValues = (Tuple6<...>) getRuntimeContext().getBroadcastVariable("maxValues").get(0);
    }

    @Override
    public Tuple6<...> map(Tuple6<...> value) throws Exception {
        value.f2 /= maxValues.f2;
        return value;
    }
}
You can find more information about Broadcast variables in the documentation.

How to get Json Array to work inside AsyncTask - Android [duplicate]

I have an AsyncTask that calls a JSON function in doInBackground. The function collects an array of comments and returns them under the key KEY_COMMENT. In onPostExecute, a for loop places each comment into a TextView. The problem is that it does not show every comment; it only shows one. If I set the loop to run more than once, the app crashes. Here is my code:
class loadComments extends AsyncTask<JSONObject, String, JSONObject> {
    @Override
    protected void onPreExecute() {
        super.onPreExecute();
    }

    @Override
    protected void onProgressUpdate(String... values) {
        super.onProgressUpdate(values);
    }

    protected JSONObject doInBackground(JSONObject... params) {
        //do your work here
        JSONObject json2 = CollectComments.collectComments(usernameforcomments, offsetNumber);
        return json2;
    }

    @Override
    protected void onPostExecute(JSONObject json2) {
        try {
            if (json2.getString(KEY_SUCCESS) != null) {
                registerErrorMsg.setText("");
                String res2 = json2.getString(KEY_SUCCESS);
                if(Integer.parseInt(res2) == 1){
                    JSONArray array = json2.getJSONArray(KEY_COMMENT);
                    for(int i = 0; i < 2; i++) {
                        LinearLayout.LayoutParams layoutParams = new LinearLayout.LayoutParams(
                            LinearLayout.LayoutParams.FILL_PARENT, LinearLayout.LayoutParams.WRAP_CONTENT);
                        commentBox.setBackgroundResource(R.drawable.comment_box_bg);
                        layoutParams.setMargins(0, 10, 0, 10);
                        commentBox.setPadding(0,0,0,10);
                        commentBox.setOrientation(LinearLayout.VERTICAL);
                        linear.addView(commentBox, layoutParams);
                        commentBoxHeader.setLayoutParams(new LayoutParams(LayoutParams.FILL_PARENT, LayoutParams.WRAP_CONTENT));
                        commentBoxHeader.setBackgroundResource(R.drawable.comment_box_bg);
                        commentBoxHeader.setBackgroundResource(R.drawable.comment_box_header);
                        commentBox.addView(commentBoxHeader);
                        commentView.setText(array.getString(i));
                        LinearLayout.LayoutParams commentViewParams = new LinearLayout.LayoutParams(
                            LinearLayout.LayoutParams.FILL_PARENT, LinearLayout.LayoutParams.WRAP_CONTENT);
                        commentViewParams.setMargins(20, 10, 20, 20);
                        commentView.setBackgroundResource(R.drawable.comment_bg);
                        commentView.setTextColor(getResources().getColor(R.color.black));
                        commentBox.addView(commentView, commentViewParams);
                    }
                }//end if key is == 1
                else{
                    // Error in registration
                    registerErrorMsg.setText(json2.getString(KEY_ERROR_MSG));
                }//end else
            }//end if
        } //end try
        catch (JSONException e) {
            e.printStackTrace();
        }//end catch
    }
}
doInBackground runs on a background thread.
onPostExecute runs on the UI thread.
So put any long-running code in doInBackground.
From the docs: when an asynchronous task is executed, the task goes through 4 steps:
onPreExecute(), invoked on the UI thread before the task is executed. This step is normally used to setup the task, for instance by showing a progress bar in the user interface.
doInBackground(Params...), invoked on the background thread immediately after onPreExecute() finishes executing. This step is used to perform background computation that can take a long time. The parameters of the asynchronous task are passed to this step. The result of the computation must be returned by this step and will be passed back to the last step. This step can also use publishProgress(Progress...) to publish one or more units of progress. These values are published on the UI thread, in the onProgressUpdate(Progress...) step.
onProgressUpdate(Progress...), invoked on the UI thread after a call to publishProgress(Progress...). The timing of the execution is undefined. This method is used to display any form of progress in the user interface while the background computation is still executing. For instance, it can be used to animate a progress bar or show logs in a text field.
onPostExecute(Result), invoked on the UI thread after the background computation finishes. The result of the background computation is passed to this step as a parameter.
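The ordering of these steps can be sketched in plain Java without Android. This is only an analogy, not how AsyncTask is implemented: the class TaskLifecycleSketch is hypothetical, one single-thread executor stands in for the UI thread and a second one for the background thread, and the progress step is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogy of the AsyncTask step order; not Android code.
final class TaskLifecycleSketch {
    // Runs the three main steps in order and returns a log of what ran.
    static List<String> run(String input) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        ExecutorService ui = Executors.newSingleThreadExecutor();         // stand-in for the UI thread
        ExecutorService background = Executors.newSingleThreadExecutor(); // stand-in for the worker thread
        try {
            CompletableFuture
                // Step 1: setup on the "UI thread"
                .runAsync(() -> log.add("onPreExecute"), ui)
                // Step 2: the long-running work happens off the "UI thread"
                .thenApplyAsync(ignored -> {
                    log.add("doInBackground");
                    return input.toUpperCase();
                }, background)
                // Step 4: the result is handed back to the "UI thread"
                .thenAcceptAsync(result -> log.add("onPostExecute:" + result), ui)
                .join();
        } finally {
            ui.shutdown();
            background.shutdown();
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(run("comments"));
    }
}
```

Only the middle step runs off the UI stand-in, which mirrors the advice above: slow work such as fetching the JSON belongs in doInBackground, and view updates belong in onPostExecute.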

Resources