How is a funnel operation process of Amplitude analytics? - analytics

I'm working on mobile app service company.
My job is to handle data of the app.
I'm trying to analyze the app data through 'Amplitude analytics(AT)'
Custom event tracking by AT works pretty well.
I'm satisfied with this function of AT.
However, there is problem in funnel function of AT/
For example, there are 4 custom events(called A,B,C,D) which are I installed.
Event A,B,C is related to each other, like after event A occur, user can take event B or C.
Event D is not related to event A,B,C.
My question is this.
When I try to figure out the event A to B conversing rate by funnel function of AT, there is no problem.
Because event A and B is related.
However even I put the event A as 1st step and D as 2nd step, funnel show the result and the result(conversion rate) is not zero.
Users can not occur event D after event A.
I don't understand the funnel operation process of AT.
If somebody know the answer, please help me.
Thax.

I found the answer by myself.
Even event A and D looks there is no relation, It could happen.
If there is really no relation between A and D,
Funnel conversion can not happen and has shown
because Funnel in Amplitude is closed funnel.
I ask it to Amplitude vendor of the South Korea.
Wish it help you.

Related

Flink drop late records even I specified the side output

I use flink to process dynamoDB stream data.
Watermark strategy: periodic, extract approximate time stamp from stream events and use it under withTimeStampAssigner.
Idleness: 10s(may not be useful at all as we only use parallism of 1.)
The data work flow looks like this:
InputStream.assignTimeStampsAndWatermarks().keyby().window(TumblingEventTimeWindow.of(1min).sideOutputLateData().reduce().map()
Then I getSideOutput(), and process the late events using exactly similar above workflow with small change such as no need to assign time stamp and watermark, no need for late output.
My logs show that all things work perfectly if ddb stream data has right timstamp, the corresponding window can close without issue and I can see the output after window is closed.
However, after I introduced late events, the late records processing logic is never triggered. I am sure that the late record’s timestamp corresponding window has closed. I put a log after I call getSideOutPut(), it never triggered. I used debugger and I am sure the getSideOutput() code is not triggered as well.
Can someone help to check this issue? Thank you.
I tried to use a different watermark strategy for late records logic. This doesn’t work as well. I want to understand why the late records are not collected to the late stream.
Without seeing more details from your implementation is difficult to give an accurate diagnosis, but based on your description, I wouldn't expect this to work:
Then I getSideOutput(), and process the late events using exactly similar above workflow with small change such as no need to assign time stamp and watermark, no need for late output.
If you are trying to apply event time windowing to the stream of late events, that's not going to work unless you adjust the allowed lateness for those windows enough to accommodate them.
As a starting point, have you tried printing the stream of late events?

Detect absence of a certain event

In the documentation of FlinkCEP, I found that I can enforce that a particular event doesn't occur between two other events using notFollowedBy or notNext.
However, I was wondering If I could detect the absence of a certain event after a time X.
For example, if an event A is not followed by another event A within 10 seconds, fire an alert or do something.
Could be possible to define a FlinkCEP pattern to capture that situation?
Thanks in advance,
Humberto
Although Flink CEP does not support notFollowedBy at the end of a Pattern, there is a way to implement this by exploiting the timeout feature.
The Flink training includes an exercise where the objective is to identify taxi rides with a START event that is not followed by an END event within two hours. You'll find a solution to this exercise that uses CEP
here.
The main idea would be to define a Pattern of A followed by A within 10 seconds, and then capture the case where this times out.

Consolidate/discard events in count window

I just started using Flink and have a problem I'm not sure how to solve. I get events from a Kafka Topic, these events represent a "beacon" signal from a mobile device. The device sends an event every 10 seconds.
I have an external customer that is asking for a beacon from our devices but every 60 seconds. Since we are already using Flink to process other events I thought I could solve this using a count window, but I'm struggling to understand how to "discard" the first 5 events and emit only the last one. Any ideas?
There are some ways to do this. As Far as I understand the idea is as follows: You receive beacon signal each 10 sec but You actually only need the most actual one and disard the others since the client asks for the data each 60 sec.
The simplest would be ofc to use ProcessFunction with count/event time window as You said. The type of the window actually depends on Your requirements. Then You sould do something like this:
stream.timeWindow([windowSize]).process(new CustomWindowProcessFunction())
The signature of the process() method of the ProcessWindowFunctionis as follows, depending on the type of the actual function def process(context: Context, elements: Iterable[IN], out: Collector[OUT]). So basically it gives you the acces to all window elements, so You can easily only push further the elements You like.
While this is the simplest idea, you may want also to take a look at the Flink timers, as they seem to be a good solution for Your issue. They are described here.

Extended windows

I have an always one application, listening to a Kafka stream, and processing events. Events are part of a session. And I need to do calculations based off of a sessions data. I am running into a problem trying to correctly run my calculations due to the length of my sessions. 90% of my sessions are done after 5 minutes. 99% are done after 1 hour. Sessions may last more than a day, due to this being a real-time system, there is no determined end. Session are unique, and show never collide.
I am looking for a way where I can process a window multiple times, either with an initial wait period and processing any later events after that, or a pure process per event type structure. I will need to keep all previous events around(ListState), as well as previously processed values(ValueState).
I previously thought allowedLateness would allow me to do this, but it seems the lateness is only considered for when the event should have been processed, it does not extend an actual window. GlobalWindows may also work, but I am unsure if there is a way to process a window multiple times. I believe I can used an evictor with GlobalWindows to purge the Windows after a period of inactivity(although admittedly, I did not research this yet, because I was unsure of how to trigger a GlobalWindow multiple times.
Any suggestions on how to achieve what I am looking to do would be greatly appreciated, I would also be happy to clarify any points needed.
If SessionWindows won't do the job, then you can use GlobalWindows with a custom Trigger and Evictor. The Trigger interface has onElement and timer-based callbacks that can fire whenever and as often as you like. If you go down this route, then yes, you'll also need to implement an Evictor to dispose of elements when they are no longer needed.
The documentation and the source code are helpful when trying to understand how this all fits together.

What is complex Event processing?

I have build an analytic app for a company that shows Key Performance Indicators(KPIs) based on selling/buying of items of their employees.
Now what they want is to add event functionality like:
* alert them if any employee sell more than $10,000 worth items
* or alert them if some employee is tired of life and killed himself/herself
basically these silly events.
I am totally blank about what to do, they recommended using ESPER but I find it very tough to understand how event processing works.
I want to know how Event Processing works in such cases and where can i learn more about it.
See besides programming and DB i know nothing and I am not a PRO too.
Please share your opinions on what am I supposed to do?
Basically, complex event processing is used when you need to analyze "streams" of data, and when you have to react quickly when a seeked pattern emerges from it. For example when three guys flying from different airports are carrying ingredients to potentially build a bomb, and they're heading to the same destination, you need to find that correlation in data flowing from each of the airports, and react to the event quickly.
Employee selling more than 10000 is not something you need to know about in real-time. You can reward him next month, and for that "normal" reporting will work just fine.
Some papers to read:
Oracle CEP
CEP is an excellent way to accomplish your task actually. You can think of CEP as a pattern matching technique. Once it's matched you can be notified or launch another process if your CEP is integrated with other tools.
Here's the example from Wikipedia. We can easily infer that it's a wedding based on these incoming signals:
church bells ringing.
the appearance of a man in a tuxedo with a woman in a flowing white gown.
rice flying through the air.
This is an abstract example, and things like this probably aren't getting put into any system, but you get the idea. It's best to see the code to understand a CEP implimentation. Here's exactly the script you can write on top of NebriOS to build the inference. Other tools like Drools will accomplish the same thing.
class wedding_detect(NebriOS):
def check(self):
if self.church_bells == "ringing" and \
(self.clothes_type == "tuxedo" or \
self.clothes_type == "wedding gown") and \
self.rice_flying == True:
return True
else
return False
def action(self):
# fires if the check() is true
send_email("me#example.com", "A wedding is underway")
For your situation, CEP is easy to program also:
class sales_alert(NebriOS):
def check(self):
return self.sales_total > 10000
def action(self):
send_email("me#example.com", "You got a $10k sale!")
The sales_total can come into the system from your CRM for example.
That's CEP - Complex Event Processing! A start anyways.

Resources