Calculate the distance between multiple points ordered by TimeStamp in SQL - sql-server

I want to calculate the distance travelled between multiple coordinates, ordered by time, in SQL. For example, a car passes point A at 1 pm, point B at 3 pm, and point C at 6 pm. The points are linked to each other by their timestamps: first the distance from A to B is calculated, then B to C, and the total is the sum of these legs. I want to get the distance the car covers per day.
The car can pass a different number of points each time, for example A->B->C->A or just A->B.
I have tables for the CarId, CheckpointCordinates and datastamps.
Can you please guide me on how to solve the problem? This question was asked here once before, but no answer was given.
Thank you
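The logic itself is independent of the SQL dialect: order each car's checkpoints by timestamp, pair every checkpoint with the previous one, compute the leg distance, and sum the legs per day. A minimal Python sketch of that logic follows. It assumes the coordinates are latitude/longitude in degrees, uses the haversine great-circle formula, and only counts legs whose two endpoints fall on the same day; the function names and the row layout (car_id, timestamp, lat, lon) are hypothetical. In SQL Server the pairing step would typically be expressed with the LAG window function partitioned by car and ordered by timestamp.

```python
from collections import defaultdict
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def daily_distance(rows):
    """rows: iterable of (car_id, timestamp, lat, lon).

    Returns {(car_id, date): kilometres}, summing consecutive legs per car.
    Assumption: a leg is counted only if both endpoints are on the same day.
    """
    totals = defaultdict(float)
    previous = {}  # last seen (date, lat, lon) per car
    for car_id, ts, lat, lon in sorted(rows, key=lambda r: (r[0], r[1])):
        day = ts.date()
        if car_id in previous and previous[car_id][0] == day:
            _, prev_lat, prev_lon = previous[car_id]
            totals[(car_id, day)] += haversine_km(prev_lat, prev_lon, lat, lon)
        previous[car_id] = (day, lat, lon)
    return dict(totals)
```

Rows may arrive in any order, since they are sorted by car and timestamp before pairing. A car with a single checkpoint on a given day gets no entry for that day.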

Related

SUMIF for first 5 cells meeting criteria, with moving window

I am looking for a formula that calculates, in the last column, the cumulative points a team earned in its last 5 games (so a team's first 5 games would have no value). It should return a number between 0 and 15. In the image below you can see an extract of the dataset. The range also cannot simply keep moving down, because then for the last 4 games only the points of the last 4, 3, 2 and 1 games would be added (I hope I did not confuse you more).
I have tried SUMIF and SUMIFS with relative row numbers in the cell address within a function, but sadly that did not help. I also looked at "SUMIF for first 5 cells meeting criteria", but that did not make me any wiser.
Link to the full dataset: https://www.dropbox.com/scl/fi/thu7f8ajsz9g8wtfo9q2w/Data.xlsx?dl=0&rlkey=aq8d7xi4zyg7hvkophsrhswpi
Does anyone know how to do this?
FTHG = Full-Time Home Goals
FTAG = Full-Time Away Goals
FTR = Full-Time Result
PH = Points Home
PA = Points Away
Not sure I really get what you want, but I have used two VLOOKUPs to collect PH and PA and add them, using a list of unique names from column A.
Here is the formula as text:
=IFERROR(VLOOKUP(J4,$A$4:$H$20,6,0),0)+IFERROR(VLOOKUP(J4,$B$4:$H$20,6,0),0)
Given your original table and assuming that for any particular row, you are interested in the five games above that row (played by the same home team), you can use:
=LET(x,FILTER(G2:$G$2, B2:$B$2=B2),IF(COUNT(x)>=5,SUM(INDEX(x,SEQUENCE(5))),0))
You can see the first NON-ZERO result is in Row 65
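For comparison, the same "sum of the previous five games by the same team" idea can be sketched outside Excel. The Python snippet below is only an illustration under the same assumption as the formula above: each row is counted against the five earlier rows of the same team, and rows with fewer than five earlier games get 0. The function name and the (team, points) row layout are hypothetical.

```python
from collections import defaultdict, deque


def last_five_points(games, window=5):
    """games: chronological list of (team, points) rows.

    Returns one total per row: the points from the team's previous `window`
    games, or 0 if the team has played fewer than `window` games so far.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    totals = []
    for team, points in games:
        past = history[team]
        totals.append(sum(past) if len(past) == window else 0)
        past.append(points)  # the current game only counts for later rows
    return totals
```

The bounded deque drops the oldest game automatically, so the window always holds at most the five most recent earlier games for each team.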

how to compute multiple variables using loop

The dataset has two columns, "start_year" and "end_year", giving the years in which a patient started and ended registration at the GP clinic. I want to know, for each year from 1990 to 2019, whether each patient was registered at the clinic, probably by computing one new variable per year (1 = yes, 0 = no).
I used ifelse (R) to compute the variables one by one:
test$pt_1990<-ifelse(test$start_year<=1990 & 1990<=test$end_year,1,0)
I hope a loop offers a better solution than writing the same line of code for every year. Thank you very much.
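The loop being asked for only needs to iterate over the years and build each variable name from the year. A minimal sketch of that pattern is shown here in Python rather than R, with each patient represented as a dictionary; the function name and the data layout are assumptions made for illustration, while the pt_<year> naming follows the question.

```python
def add_year_flags(patients, first_year=1990, last_year=2019):
    """Add pt_<year> = 1/0 to every patient dict, one flag per year in the range."""
    for patient in patients:
        for year in range(first_year, last_year + 1):
            registered = patient["start_year"] <= year <= patient["end_year"]
            patient[f"pt_{year}"] = 1 if registered else 0
    return patients
```

The condition is the same one used in the ifelse call above, applied once per year instead of being written out by hand.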

Excel function array formula based on sumif to find minimum count

I have some data where I want to find the minimum number of days it takes to reach a total sum based on some criteria.
Essentially the data is like this:
Date Season Recieval
1/01/2006 2006 500
2/01/2006 2006 100
3/01/2006 2006 150
…
10/12/2009 2009 300
etc
What I want to do is find a formula that returns the minimum number of days it takes to reach a receivals total for the season.
The formula below is what I have tried so far, to no avail.
=MIN(COUNT(IF(SUMIFS(C:C,B:B,"2006")>2000,DATA!A:A)))
It doesn't matter what point it starts from, but it must take the minimum number of days to reach the 2000.
Output should be a number eg 39 (39 days consecutive to sum up to receivals of 2000).
Essentially what I want to generate is the minimum number of consecutive days required to reach the total of 2000, regardless of the starting point.
Cheers!
If your Dates are in the range A2:A25 and Recieval in C2:C25, then try this...
=INDEX(A2:A25,MATCH(TRUE,INDEX(SUBTOTAL(9,(OFFSET(C$2:C25,,,ROW(INDIRECT("1:25")),1)))>=2000,),0))-A2
This formula returns the number of days it takes to reach a total receival of 2000 in column C.
Or, if you just need to count the consecutive dates, try:
=MATCH(TRUE,INDEX(SUBTOTAL(9,(OFFSET(C$2:C25,,,ROW(INDIRECT("1:25")),1)))>=2000,),0)
Remember that both formulas are array formulas, which must be entered with Ctrl+Shift+Enter instead of Enter alone.
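The question asks for the shortest run of consecutive days from any starting point. One way to express that outside Excel is a sliding window over the daily values. The Python sketch below assumes one row per day (so consecutive rows are consecutive days), non-negative receivals and a positive target; the function name is hypothetical.

```python
def min_consecutive_days(values, target):
    """Length of the shortest run of consecutive values whose sum is >= target.

    Assumes non-negative values and target > 0. Returns None if no run reaches it.
    """
    best = None
    window_sum = 0
    left = 0
    for right, value in enumerate(values):
        window_sum += value
        # Shrink from the left for as long as the window still reaches the target.
        while window_sum >= target:
            length = right - left + 1
            if best is None or length < best:
                best = length
            window_sum -= values[left]
            left += 1
    return best
```

To apply it per season, the values would first be filtered to the rows of that season.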

Multiple combinations (ex drug-ADR) with the same unique case ID

I am quite new to R, and I hope you can help me. I have tried to find the answer to my question by searching the forum and elsewhere, and I apologize in advance if my question is trivial or stupid.
I have spent the last month collecting my first dataset, and it is now ready to be analyzed. I have spent some time learning the most basic functions of R.
My dataset deals with adverse drug reaction reports. Each report may contain several suspect drugs and several adverse reactions. A case can therefore contain several drugs and adverse reaction (drug-ADR) combinations. Some cases contain just one combination and others contain several.
And now my question is: How do I make calculations that are “case-specific”?
I want to calculate a Completeness Score for the percentage of completed data fields for each drug-ADR combination, and then I would like to calculate the average for the entire case/report.
I want to calculate a Completeness Score (C) for each drug-ADR combination, expressed as:
C = Π(1 - Pi) = (1 - P1) x (1 - P2) x (1 - P3) x ... x (1 - Pn)
where Pi is the penalty deducted if data field i is not complete (e.g. 0.50 for 50%). If the information is not missing, the penalty is 0, so the maximum score is 1. n is the number of parameters/variables.
Ultimately I want to calculate an overall Completeness Score for the whole case/report, as the average of the scores of its drug-ADR combinations:
C = (C1 + C2 + ... + Cm) / m
where Cj is the score of the j-th drug-ADR combination and m is the total number of drug-ADR combinations in the full report.
Can anyone help me?
Thank you for your attention! I will be very grateful for any help that I can get.
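The two formulas translate directly into a per-combination product followed by a per-case mean. The sketch below shows this in Python rather than R, purely to illustrate the grouping; the function names and the (case_id, penalties) row layout are assumptions, with one row per drug-ADR combination.

```python
from collections import defaultdict
from math import prod


def combination_score(penalties):
    """C = product of (1 - P_i) over the penalties of one drug-ADR combination."""
    return prod(1 - p for p in penalties)


def case_scores(rows):
    """rows: iterable of (case_id, penalties), one row per drug-ADR combination.

    Returns {case_id: mean of the combination scores within that case}.
    """
    per_case = defaultdict(list)
    for case_id, penalties in rows:
        per_case[case_id].append(combination_score(penalties))
    return {case_id: sum(scores) / len(scores) for case_id, scores in per_case.items()}
```

For instance, a case with two combinations scoring 0.5 and 1.0 gets an overall score of 0.75.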

Can we solve this using a greedy strategy? If not how do we solve this using dynamic programming?

Problem:
The city of Siruseri is impeccably planned. The city is divided into a rectangular array of cells with M rows and N columns. Each cell has a metro station. There is one train running left to right and back along each row, and one running top to bottom and back along each column. Each train starts at some time T and goes back and forth along its route (a row or a column) forever.
Ordinary trains take two units of time to go from one station to the next. There are some fast trains that take only one unit of time to go from one station to the next. Finally, there are some slow trains that take three units of time to go from one station to the next. You may assume that the halting time at any station is negligible.
Here is a description of a metro system with 3 rows and 4 columns:
     S(1) F(2) O(2) F(4)
F(3)  .    .    .    .
S(2)  .    .    .    .
O(2)  .    .    .    .
The label at the beginning of each row/column indicates the type of train (F for fast, O for ordinary, S for slow) and its starting time. Thus, the train that travels along row 1 is a fast train and it starts at time 3. It starts at station (1,1) and moves right, visiting the stations along this row at times 3, 4, 5 and 6 respectively. It then returns, visiting the stations from right to left at times 6, 7, 8 and 9. It then moves right again, visiting the stations at times 9, 10, 11 and 12, and so on. Similarly, the train along column 3 is an ordinary train starting at time 2. So, starting at the station (1,3), it visits the three stations on column 3 at times 2, 4 and 6, returns to the top of the column visiting them at times 6, 8 and 10, and so on.
Given a starting station, the starting time and a destination station, your task is to determine the earliest time at which one can reach the destination using these trains.
For example suppose we start at station (2,3) at time 8 and our aim is to reach the station (1,1). We may take the slow train of the second row at time 8 and reach (2,4) at time 11. It so happens that at time 11, the fast train on column 4 is at (2,4) travelling upwards, so we can take this fast train and reach (1,4) at time 12. Once again we are lucky and at time 12 the fast train on row 1 is at (1,4), so we can take this fast train and reach (1,1) at time 15. An alternative route would be to take the ordinary train on column 3 from (2,3) at time 8 and reach (1,3) at time 10. We then wait there till time 13 and take the fast train on row 1 going left, reaching (1,1) at time 15. You can verify that there is no way of reaching (1,1) earlier than that.
Test Data: You may assume that M, N ≤ 50.
Time Limit: 3 seconds
As N and M are very small, we can try to solve it by recursion.
At every station, we take the two trains that can bring us nearer to our destination. E.g. if we want to go from (2,3) to (1,1), we take the trains that bring us nearer to (1,1) and get off at the station nearest to our destination, keeping track of the time taken. Whenever we reach the destination, we compare the time taken with the minimum so far and update the minimum if the new time is lower.
We can determine which station a train is at a particular time using this method:
/* S is the starting time of the train, N is the number of stations it
visits, and T is the time at which we want to find the train's station.
T is always greater than S. */
T = T-S+1
Station(T) = T%N, if T%N = 0, then Station(T) = N;
Here is my question:
How do we determine the earliest time when a particular train reaches the station we want in the direction we want?
As my algorithm above uses a greedy strategy, will it give an accurate answer? If not, how do I approach this problem?
P.S : This is not homework, it is an online judge problem.
I believe a greedy solution will fail here, but it will be a bit hard to construct a counter-example.
This problem is meant to be solved using Dijkstra's algorithm. Edges are the connections between adjacent nodes, and their cost depends on the type of train and its starting time. You also don't need to compute the whole graph: only compute the edges for the node you are currently considering. I have solved numerous similar problems and this is the way they are solved. I also tried to use greedy several times before I learnt that it never passes.
Hope this helps.
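A minimal Python sketch of the Dijkstra approach described above is given below. It is an illustration under stated assumptions, not a reference solution: stations are 0-indexed (row, col) pairs, each train is described by a hypothetical (step, start) pair (step is 1, 2 or 3 time units per hop), and the helper next_departure answers the question of when a train next leaves a given station in a given direction, using the fact that a train on a line of n stations repeats every 2 * (n - 1) * step time units.

```python
import heapq


def next_departure(t, start, step, n, pos, forward):
    """Earliest time >= t at which the train leaves station `pos` (0-indexed)
    heading forward (increasing index) or backward on a line of n stations."""
    period = 2 * (n - 1) * step
    offset = pos * step if forward else (2 * (n - 1) - pos) * step
    first = start + offset
    if t <= first:
        return first
    cycles = -(-(t - first) // period)  # ceiling division
    return first + cycles * period


def earliest_arrival(rows, cols, source, target, t0):
    """rows[i] and cols[j] are (step, start) pairs for the train on that line.
    Returns the earliest time at which `target` can be reached from `source`."""
    m, n = len(rows), len(cols)
    best = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if (r, c) == target:
            return t
        if t > best.get((r, c), float("inf")):
            continue  # stale heap entry
        row_step, row_start = rows[r]
        col_step, col_start = cols[c]
        candidates = []  # (neighbour, arrival time) for each adjacent station
        if c + 1 < n:
            candidates.append(((r, c + 1), next_departure(t, row_start, row_step, n, c, True) + row_step))
        if c > 0:
            candidates.append(((r, c - 1), next_departure(t, row_start, row_step, n, c, False) + row_step))
        if r + 1 < m:
            candidates.append(((r + 1, c), next_departure(t, col_start, col_step, m, r, True) + col_step))
        if r > 0:
            candidates.append(((r - 1, c), next_departure(t, col_start, col_step, m, r, False) + col_step))
        for node, arrival in candidates:
            if arrival < best.get(node, float("inf")):
                best[node] = arrival
                heapq.heappush(heap, (arrival, node))
    return None
```

Only hops between adjacent stations are modelled, since staying on a train is equivalent to re-boarding it at the next station at the moment of arrival. With the example system, rows = [(1, 3), (3, 2), (2, 2)] and cols = [(3, 1), (1, 2), (2, 2), (1, 4)], starting at (1, 2) at time 8 and heading for (0, 0) corresponds to the worked example in the problem statement.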
