I have produced a report which allows you to drill down into the table you see with ticks and crosses. Each tick indicates that a task is complete. In the second picture you can see the rules which dictate whether the task is classed as complete or incomplete. They are the following:
If Remaining Hours = 0 then the task is complete and therefore a tick appears.
If Remaining Hours is between 0.01 and 1,000,000 then there is still work to be carried out and therefore the task is not complete.
The oversight is the 'Total Effort' column: if this is 0.00 and Remaining Hours is also 0.00, then the task has not begun yet and is therefore incomplete. However, due to my rules, a tick is appearing next to these unstarted tasks.
My question is, how do I add a second rule to take 'Total Effort' into account? As far as I can see, I am only able to select one value from the drop-down box and then populate just for that.
The rules need to be:
If Remaining Hours = 0 and Total Effort >= 0.01 then a green tick
If Remaining Hours is between 0.01 and 1,000,000 then a red cross
If Total Effort = 0.00 and Remaining Hours = 0.00, then red cross
Hope this makes sense.
Use the following as the value expression:
=iif(Sum(Fields!EffortRemaining.Value) =0 and Sum(Fields!TotalEffort.Value) >= 0.01,1,0)
Then set the indicator to show a red cross for 0 and a green tick for 1.
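The logic of that expression can be checked outside the report. The sketch below is plain Python rather than report expression syntax, and `indicator_value` is a hypothetical name used only for illustration; it simply mirrors the three rules listed in the question.

```python
def indicator_value(remaining_hours: float, total_effort: float) -> int:
    """Return 1 (green tick) or 0 (red cross), mirroring the iif expression."""
    if remaining_hours == 0 and total_effort >= 0.01:
        return 1  # finished: no hours remaining and some effort recorded
    return 0      # work remaining, or not started (both values are 0)


# The three cases from the question.
print(indicator_value(0.0, 12.5))  # complete task
print(indicator_value(3.0, 12.5))  # work still remaining
print(indicator_value(0.0, 0.0))   # task not started
```

Only the first case yields 1; both the "work remaining" and "not started" cases fall through to 0, so a single expression is enough to drive the indicator.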
My data returns an average run time that needs to be greater than 9 minutes. If it is less than 9 minutes then the row needs to be highlighted red. Here is one of the various iif statements I have attempted to write to cover this in the background color expression. In the picture the top two rows should be highlighted in red.
=iif(Fields!AvgTime.Value < TimeValue("00:09:00"),"#ffb3be","#c8ffcd")
Given the variable 'points', which increases every time the 'player' collects a point, how do I logically find a way to reward the user for collecting 30 points within a 5-minute limit? There's no countdown timer.
E.g. the player may start with 4 points, but if he has 34 points 5 minutes later, that also counts.
I was thinking about using timestamps but I don't really know how to do that.
What you are talking about is a "sliding window". Your window is time based. Record each point's timestamp and slide your window over these timestamps. You will need to pick a time increment to slide your window.
Upon each "slide", count your points. When you get the amount you need, "reward your user". The "upon each slide" means you need some sort of timer that calls a function each time to evaluate the result and do what you want.
For example, set a window of 5 minutes and a slide of 1 second. Don't keep a single variable called points. Instead, simply create an array of timestamps. On every timer tick (1 second in this case), count the number of timestamps that fall between t - 5 minutes and t (now); if there are 30 or more, you've met your threshold and can reward your super-fast user. If you need the actual value, which may be 34, well, you've just computed it, so you can use it.
There may be ways to optimize this. I've provided the naive approach. Timestamps that have gone out of range can be deleted to save space.
If there are "points going into the window" that count, then just add them to the sum.
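A minimal sketch of the timestamp approach, assuming times are passed in as seconds (for example from `time.monotonic()`). The class and method names are hypothetical, and the 300-second window and 30-point threshold come from the question.

```python
from collections import deque


class PointWindow:
    """Keep point timestamps and count those inside the last `window` seconds."""

    def __init__(self, window: float = 300.0, threshold: int = 30):
        self.window = window
        self.threshold = threshold
        self.stamps = deque()  # timestamps in seconds, oldest first

    def add_point(self, now: float) -> None:
        """Record that the player collected one point at time `now`."""
        self.stamps.append(now)

    def count(self, now: float) -> int:
        """Number of points collected in the interval (now - window, now]."""
        # Timestamps that have slid out of the window are deleted to save space.
        while self.stamps and self.stamps[0] <= now - self.window:
            self.stamps.popleft()
        return len(self.stamps)

    def reward_due(self, now: float) -> bool:
        return self.count(now) >= self.threshold


tracker = PointWindow()
for second in range(30):          # one point per second for 30 seconds
    tracker.add_point(float(second))
print(tracker.reward_due(29.0))   # evaluated on the timer tick at t = 29 s
```

Calling `reward_due` from a once-per-second timer corresponds to the "slide" described above. Because old timestamps are dropped from the front of the deque, each tick costs only as much work as the number of points that expired since the previous tick.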
I would like to predict the reliability of my physical machines using an ANN.
Q1) What is the right metric for measuring the reliability of a repairable machine?
Q2) To calculate the reliability of each machine in each time period (row), should I calculate TBF or MTBF and feed it to my ANN?
Q3) Is an ANN a good machine learning approach for this problem?
Let's take a look.
In my predictor ANN, one of the inputs is the current reliability value of my physical machines, obtained by applying the right distribution function with the right metric (MTBF or MTTF). In the sample data, there are two machines with some logged events.
Each event has a time, a machine ID, and an event_type: event_type = 0 when a machine became available to the cluster, event_type = 1 when a machine failed, and event_type = 2 when a machine available to the cluster had its available resources changed.
For a non-repairable product, MTTF is the preferred measure of reliability; MTBF is for repairable products.
What is the right metric to get the current reliability value for each time period (row): TBF or MTBF? Previously I used MTTF = TOTAL UPTIME / TOTAL NUMBER OF FAILURES. To calculate the uptime, I subtract the time of the most recent preceding event_type = 0 from the time of each event_type = 1, and so on, then divide the total uptime by the number of failures. Or do I need the TBF for each row? The machine events table looks like:
time machine_id event_type R()
0 6640223 0
30382.66466 6640223 1
30399.2805 6640223 0
37315.23415 6640223 1
37321.64514 6640223 0
0 3585557842 0
37067.13354 3585557842 1
37081.0917 3585557842 0
37081.2932 3585557842 2
37321.33633 3585557842 2
37645.77424 3585557842 1
37824.73506 3585557842 0
37824.73506 3585557842 2
41666.42118 3585557842 2
After preprocessing the machine events table above to get input_2 (Reliability), the training data table should look like:
start_time machine_id input_x1 input_2_(Reliability) Predicted_output_Reliability
0 111 0.06 xx.xx
1 111 0.04 xx.xx
2 111 0.06 xx.xx
3 111 0.55 xx.xx
0 222 0.06 xx.xx
1 222 0.06 xx.xx
2 222 0.86 xx.xx
3 222 0.06 xx.xx
mean time TO failure
It is (or should be) a predictor of equipment reliability. The TO in that term indicates its predictive intent.
Mean time to failure (MTTF) is the length of time a device or other
product is expected to last in operation. MTTF is one of many ways to
evaluate the reliability of pieces of hardware or other technology.
https://www.techopedia.com/definition/8281/mean-time-to-failure-mttf
e.g.
take total hours of operation of same equipment items
divided by the number of failures of those items
Say there are 100 items, and all except one operate for 100 hours.
The one failure happens at 50 hours.
MTTF = (( 99 items x 100hrs ) + (1 item x 50 hrs)) / 1 failure = 9950 hours
----
I believe you have been calculating MTBF
mean time BETWEEN failures
This measure is based on recorded events.
Mean time between failure (MTBF) refers to the average amount of time
that a device or product functions before failing. This unit of
measurement includes only operational time between failures and does
not include repair times, assuming the item is repaired and begins
functioning again. MTBF figures are often used to project how likely a
single unit is to fail within a certain period of time.
https://www.techopedia.com/definition/2718/mean-time-between-failures-mtbf
the MTBF of a component is the sum of the lengths of the operational
periods divided by the number of observed failures
https://en.wikipedia.org/wiki/Mean_time_between_failures
In short the data you have in that table is suited to MTBF calculation, in the manner you have been doing it. I'm not sure what the lambda reference would be discussing.
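As a rough sketch of that calculation, the function below sums the operational periods (from each event_type 0 to the next event_type 1) and divides by the number of observed failures, per machine. The function name and the tuple layout are assumptions for illustration; the sample rows are the first machine from the question's table.

```python
from collections import defaultdict


def mtbf_per_machine(events):
    """events: (time, machine_id, event_type) tuples, in time order per machine.

    Returns {machine_id: MTBF}; machines with no observed failure are omitted.
    """
    up_since = {}                # machine_id -> time it last became available
    uptime = defaultdict(float)  # machine_id -> total operational time
    failures = defaultdict(int)  # machine_id -> number of observed failures

    for time, machine, event_type in events:
        if event_type == 0:                            # became available
            up_since[machine] = time
        elif event_type == 1 and machine in up_since:  # failed while up
            uptime[machine] += time - up_since.pop(machine)
            failures[machine] += 1
        # event_type 2 (resources changed) does not affect uptime

    return {m: uptime[m] / failures[m] for m in failures}


events = [
    (0.0, 6640223, 0),
    (30382.66466, 6640223, 1),
    (30399.2805, 6640223, 0),
    (37315.23415, 6640223, 1),
    (37321.64514, 6640223, 0),
]
print(mtbf_per_machine(events))
```

For machine 6640223 the two operational periods are 30382.66466 and 6915.95365 time units, so the MTBF comes out at about 18649.3 time units. Repair time (the gap between a failure and the next event_type 0) is excluded, in line with the definition quoted above.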
I am really unable to understand why the total elapsed time for a Dataflow job is so much higher than the time taken by the individual steps.
For example, the total elapsed time for the Dataflow job in the picture is 2 min 39 sec, while the time spent in the individual steps is just 10 sec. Even if we consider the time spent in the setup and destroy phases, a difference of 149 sec seems too much.
Is there some other way of reading the individual stage timings, or am I missing something else?
Thanks
In my opinion, 2 min 39 sec is fine. Your job reads a file, runs a ParDo, and then writes the result to BigQuery.
There are a lot of factors involved in this time calculation:
How much data you need to process. In your case I don't think you are processing much data.
What computation you are doing. Your ParDo step takes only 3 sec, so apart from the small amount of data, the ParDo does not do much computation either.
Writing to BigQuery. In your case it takes only 5 sec.
The creation and destroy phases of the Dataflow job take a roughly constant time; in your case that is 149 sec. Only the remaining 10 sec of your job depends on the three factors explained above.
Now let's assume that you have to process 2 million records and each record's transform takes 10 sec. In this case the time will be much higher, i.e. 10 sec * 2 million records for a single-node Dataflow job.
In that case the 149 sec is negligible next to the whole job's completion time, because the same 149 sec of overhead is spread over 10 sec * 2 million records of processing.
Hope this information helps you understand the timing.
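To put numbers on that comparison, here is a small arithmetic sketch. It assumes, as the example above does, a fixed 149 sec of setup and teardown and strictly sequential processing on a single worker; `overhead_share` is a hypothetical helper name.

```python
def overhead_share(overhead_s: float, processing_s: float) -> float:
    """Fraction of total elapsed time spent on fixed setup/teardown overhead."""
    return overhead_s / (overhead_s + processing_s)


elapsed_s = 2 * 60 + 39           # 2 min 39 sec total elapsed time
steps_s = 10                      # time reported for the individual steps
overhead_s = elapsed_s - steps_s  # the fixed part: 149 sec

small_job = overhead_share(overhead_s, steps_s)
large_job = overhead_share(overhead_s, 10 * 2_000_000)  # 10 sec x 2 million records

print(f"small job: {small_job:.1%} of elapsed time is overhead")
print(f"large job: {large_job:.6%} of elapsed time is overhead")
```

For the small job almost all of the elapsed time is overhead, while for the hypothetical 2-million-record job the same 149 sec is a tiny fraction of a percent.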
I have a setup with a BeagleBone Black which communicates over I²C with its slaves every second and reads data from them. Sometimes the I²C readout fails though, and I want to get statistics about these failures.
I would like to implement an algorithm which displays the percentage of successful communications over the last 5 minutes (up to 24 hours) and updates that value constantly. If I implemented that 'normally', with an array where I store success/no success for every second, that would mean a lot of wasted RAM/CPU load for a minor feature (especially if I would like to see the statistics for the last 24 hours).
Does someone know a good way to do that, or can anyone point me in the right direction?
Why don't you just implement a low-pass filter? For every successful transfer, you push in a 1, for every failed one a 0; the result is a number between 0 and 1. Assuming that your transfers happen periodically, this works well -- and you just have to adjust the cutoff frequency of that filter to your desired "averaging duration".
However, I can't follow your RAM argument: assuming you store one byte representing success or failure per transfer, which you say happens every second, you end up with 86400B per day -- 85KB/day is really negligible.
EDIT Cutoff frequency is something from signal theory and describes the highest or lowest frequency that passes a low or high pass filter.
Implementing a low-pass filter is trivial; something like (pseudocode):
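new_val = 1.0  # init with no failed transfers
alpha = 0.001  # smoothing factor; smaller values average over a longer time
while True:
    old_val = new_val
    # placeholder for your own I2C read: returns 1 on success, 0 on failure
    success = do_transfer_and_return_1_on_success_or_0_on_failure()
    new_val = alpha * success + (1 - alpha) * old_val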
That's a single-tap IIR (infinite impulse response) filter; single tap because there's only one alpha and thus, only one number that is stored as state.
EDIT2: the value of alpha defines the behaviour of this filter.
EDIT3: you can use a filter design tool to give you the right alpha; just set your low-pass filter's cutoff frequency to something like 0.5/integrationLengthInSamples, select an order of 0 for the IIR and use an elliptic design method (most tools default to Butterworth, but 0-order Butterworths don't do a thing).
I'd use scipy and convert the resulting (b,a) tuple (a will be 1, here) to the correct form for this feedback form.
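As a self-contained sketch of the same filter, the class below wraps the update rule so it can be called once per transfer. Choosing alpha as 1/N for an averaging length of N samples is a common rule of thumb and an assumption here, not the filter-design procedure described above; the class name is hypothetical.

```python
class SuccessRateFilter:
    """Single-pole low-pass filter (exponential moving average) over 0/1 outcomes."""

    def __init__(self, samples_to_average: int, initial: float = 1.0):
        # Assumption: alpha = 1 / N gives a time constant of roughly N samples.
        self.alpha = 1.0 / samples_to_average
        self.value = initial  # start as if no transfer had failed

    def update(self, success: bool) -> float:
        """Feed one transfer outcome and return the filtered success rate."""
        sample = 1.0 if success else 0.0
        self.value = self.alpha * sample + (1.0 - self.alpha) * self.value
        return self.value


rate = SuccessRateFilter(samples_to_average=300)  # about 5 minutes at 1 transfer/s
for ok in [True] * 10 + [False] + [True] * 10:
    current = rate.update(ok)
print(f"estimated success rate: {current:.4f}")
```

The only state kept is one float per monitored device, which is the appeal of this approach when RAM is a concern. The trade-off is that the result is a weighted average over recent transfers rather than an exact percentage over a fixed period.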
UPDATE In light of the comment by the OP 'determine a trend of which devices are failing' I would recommend the geometric average that Marcus Müller put forward.
ACCURATE METHOD
The method below is aimed at obtaining 'well defined' statistics for performance over time that are also useful for 'after the fact' analysis.
Notice that geometric average has a 'look back' over recent messages rather than fixed time period.
Maintain a rolling array of 24*60/5 = 288 'prior success rates' (SR[i] with i=-1, -2,...,-288) each representing a 5 minute interval in the preceding 24 hours.
That will consume about 2.3K (288 * 8 bytes) if the elements are 64-bit doubles.
To 'effect' constant updating use an Estimated 'Current' Success Rate as follows:
ECSR = (t*S/M+(300-t)*SR[-1])/300
Where S and M are the counts of successes and messages in the current (partially complete) period. SR[-1] is the previous (now complete) bucket.
t is the number of seconds expired of the current bucket.
NB: When you start up there is no previous bucket yet, so scale the first term by 300/t, which leaves just S/M.
In essence the approximation assumes the error rate was steady over the preceding 5 - 10 minutes.
To 'effect' a 24 hour look back you can either 'shuffle' the data down (by copy or memcpy()) at the end of each 5 minute interval or implement a 'circular array' by keeping track of the current bucket index.
NB: For many management/diagnostic purposes intervals of 15 minutes are often entirely adequate. You might want to make the 'grain' configurable.
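A minimal sketch of the bucketed method, assuming one transfer per second so that the message count also advances the bucket clock. The class and attribute names are hypothetical, the bucket length is configurable, and at start-up the estimate falls back to plain S/M, which is an assumption of this sketch.

```python
class BucketedSuccessRate:
    """Per-bucket success rates in a circular array, plus the ECSR estimate."""

    def __init__(self, bucket_seconds: int = 300, buckets: int = 288):
        self.bucket_seconds = bucket_seconds
        self.rates = [None] * buckets  # completed buckets; None means no data yet
        self.index = 0                 # slot the next completed bucket goes into
        self.successes = 0             # S for the current bucket
        self.messages = 0              # M for the current bucket
        self.elapsed = 0               # t, seconds into the current bucket

    def record(self, success: bool) -> None:
        """Call once per second with the outcome of that second's transfer."""
        self.successes += int(success)
        self.messages += 1
        self.elapsed += 1
        if self.elapsed == self.bucket_seconds:
            # Close the bucket and advance the circular index.
            self.rates[self.index] = self.successes / self.messages
            self.index = (self.index + 1) % len(self.rates)
            self.successes = self.messages = self.elapsed = 0

    def ecsr(self):
        """Estimated current success rate, or None before any data exists."""
        previous = self.rates[self.index - 1]  # SR[-1]; index -1 wraps around
        current = self.successes / self.messages if self.messages else None
        if previous is None:
            return current                     # start-up: no complete bucket yet
        if current is None:
            return previous                    # a bucket has only just closed
        t, n = self.elapsed, self.bucket_seconds
        return (t * current + (n - t) * previous) / n


stats = BucketedSuccessRate(bucket_seconds=4, buckets=3)  # tiny sizes for the demo
for ok in [True, True, False, True, False]:
    stats.record(ok)
print(stats.ecsr())
```

In the demo the first four outcomes close a bucket with a rate of 0.75, and the fifth (a failure) is blended in with weight t/n = 1/4, following the ECSR formula above. A 24-hour figure can be obtained by averaging the non-empty entries of `rates`.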