Cannot match exact number within an array in R - arrays

This must be a trivial one but after many, many trials and errors I need to ask this.
I would like to find a whole number within a numerical array, e.g. 14 in c(1,14,144).
The code I tried reads
dayNo <- 14
which(grepl(dayNo, c(1,14,144))==TRUE)
I get 2 and 3. The result I am looking for is 2.
Another one is
dayNo <- 14
which(grepl("\\bdayNo\\b", c(1,14,144))==TRUE)
but I get as result integer(0).
Any ideas would be much appreciated.

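In this case a simple equality comparison works just fine:
which(dayNo==c(1,14,144))
and gives 2, as expected. grepl() does pattern matching on the text form of each number, so the pattern 14 also matches inside 144. The second attempt returns integer(0) because "\\bdayNo\\b" searches for the literal text dayNo rather than the value of the variable.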
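The difference between pattern matching and exact comparison can be illustrated outside R as well. The following is a minimal Python sketch, not the R solution itself; the function names substring_matches and exact_matches are made up for illustration, and positions are reported 1-based to mirror which().

```python
import re

def substring_matches(needle, values):
    """1-based positions whose text form contains needle (grepl-like behaviour)."""
    return [i for i, v in enumerate(values, start=1)
            if re.search(str(needle), str(v))]

def exact_matches(needle, values):
    """1-based positions whose value equals needle (like which(x == needle))."""
    return [i for i, v in enumerate(values, start=1) if v == needle]

days = [1, 14, 144]
print(substring_matches(14, days))  # [2, 3] because "14" also occurs inside "144"
print(exact_matches(14, days))      # [2]
```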


Excel calculate smallest of X columns within Y columns, ignoring zeros

I'm trying to calculate the sum of best segments in a run. For example, each Km gives a list as such:
5:40 6:00 5:45 5:55 6:21 6:30
I'm trying to gather the best segments of 2km/3km/4km etc and would like a simple code to do it. At the moment, I'm using the formula
=Min(If(B1=0,9:9:9,sum(A1:B1),If(C1=0,9:9:9,sum(B1:C1))
but this goes all the way to 50km, meaning a very long formula that I then have to repeat slightly differently at 3km, then 4km, then 5km etc. Surely there must be a way of generating an array of summed columns of every n columns, then iterating over that to find the min while ignoring values of 0?
I can do it manually for now, but what if I want to go over 50km? I might want to incorporate bike rides/car drives in the future just for some data analysis, so I figured it best to find an ideal formula now.
It's frustrating because I could code it, but ideally I want to avoid VBA and stick to formulas in Excel.
Here is a draft of the case where there aren't any zeroes just for groups of 2Km. I decided the simplest approach initially was to add a couple of helper rows containing the running total of times (and for later use counts) and use a formula like this to subtract them in pairs:
=MIN(INDEX(A2:J2,SEQUENCE(1,9,2))-IF(SEQUENCE(1,9,0)=0,0,INDEX(A2:J2,SEQUENCE(1,9,0))))
but if you have access to recent additions to Excel 365 like Scan you can do it without helper rows.
Here is a more realistic scenario with a couple of zeroes thrown in
=LET(runningSum,Y$4:AP$4,runningCount,Y$5:AP$5,cols,COLUMNS(runningSum),leg,X7,
seqEnd,SEQUENCE(1,cols-leg+1,leg),seqStart,SEQUENCE(1,cols-leg+1,0),
times,INDEX(runningSum,seqEnd)-IF(seqStart=0,0,INDEX(runningSum,seqStart)),
counts,INDEX(runningCount,seqEnd)-IF(seqStart=0,0,INDEX(runningCount,seqStart)),
MIN(IF(counts=leg,times)))
Note that there are no runs of more than seven consecutive legs that don't contain a zero so 8, 9, 10 etc. just work out to 0.
As mentioned you could dispense with the helper rows by using Scan, but not everyone has access to this so I will add it separately:
=LET(data,Y$3:AP$3,runningSum,SCAN(0,data,LAMBDA(a,b,a+b)),
runningCount,SCAN(0,data,LAMBDA(a,b,a+(b>0))),leg,X7,cols,COLUMNS(data),
seqEnd,SEQUENCE(1,cols-leg+1,leg),seqStart,SEQUENCE(1,cols-leg+1,0),
times,INDEX(runningSum,seqEnd)-IF(seqStart=0,0,INDEX(runningSum,seqStart)),
counts,INDEX(runningCount,seqEnd)-IF(seqStart=0,0,INDEX(runningCount,seqStart)),
MIN(IF(counts=leg,times)))
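The idea behind both formulas (running sums plus running counts of non-zero entries, subtracted in pairs) can be sketched in Python as follows. This is only an illustration of the technique under stated assumptions: the function name best_window is hypothetical, times are taken as seconds, and the sketch returns None where the Excel formula works out to 0.

```python
from itertools import accumulate

def best_window(times, leg):
    """Smallest total of `leg` consecutive entries, skipping windows that contain a 0.

    Returns None when no window of that length is free of zeros.
    """
    # A leading 0 lets a window starting at the first entry be handled uniformly.
    running_sum = [0] + list(accumulate(times))
    running_count = [0] + list(accumulate(1 if t > 0 else 0 for t in times))
    best = None
    for end in range(leg, len(times) + 1):
        start = end - leg
        if running_count[end] - running_count[start] != leg:
            continue  # at least one zero inside this window
        total = running_sum[end] - running_sum[start]
        if best is None or total < best:
            best = total
    return best

# Per-km times from the question, converted to seconds.
km_times = [340, 360, 345, 355, 381, 390]
print(best_window(km_times, 2))  # 700
print(best_window(km_times, 3))  # 1045
```

Each window costs two subtractions regardless of its length, which is why the same approach extends to 3 km, 4 km and so on without a longer formula.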
Tom, that worked! I learnt a few things on the way too, and using the indexing method alongside SEQUENCE and COLUMNS is something I had not thought of. I'd never heard of the LET function before and I can already see that it is going to really help with some of the bigger calculations in the future.
Thank you so much. I'd like to show you how it now looks. Row 3087 is my old formula and row 3088 is a copy of the same data using the new formula. As you can see, I get exactly the same results, so it clearly works and it can be easily duplicated.

(excel) How to return an array from a sum of ranges?

I'm setting up a morphological table that will have to go through potentially a couple hundred items, so it's desirable for this process to not be done by hand.
Here's a small summary of the situation:
     fin  eng  op   fli
A    2    4    6    8
B    1    3    5    4
C    1    2    3    5
D    1    4    7    2
The first column holds named ranges A through D which have associated values from the 4 categories in row 1.
In a second table we create configurations based on which features are selected, something like this:
Config 1  Config 2
A         B
C         D
What I'm looking for is a formula that would read for each configuration which named range is selected, add the score for each category and return it in a simple array. Something like
Config 1 {3,6,9,13}, Config 2 {2,7,12,6}
So far I've found that the INDIRECT function works exactly the way I want, but I have to manually input each range. Something like:
=INDIRECT(A1)+INDIRECT(A2)
I've played around with different permutations of sum functions but instead of returning the arrays it returns the sum of the first values.
=SUM(INDIRECT(A1:A2))
Any suggestion would be welcome.
I know this would probably be much simpler with code, but this study needs to be done in Excel.
I'm not sure if this answers your question as it doesn't use named ranges, but you could try something like this:
=MMULT(SEQUENCE(1,4,1,0),$B$2:$E$5*COUNTIF(INDEX($H$2:$I$3,0,ROW()-ROW($A$7)+1),$A$2:$A$5))
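What the formula computes is a column-wise sum over the rows whose name appears in a configuration. A minimal Python sketch of that calculation, using the data from the question (the dictionary layout and the function name config_totals are assumptions made for illustration):

```python
# Scores per named range, in the category order fin, eng, op, fli.
scores = {
    "A": [2, 4, 6, 8],
    "B": [1, 3, 5, 4],
    "C": [1, 2, 3, 5],
    "D": [1, 4, 7, 2],
}
configs = {"Config 1": ["A", "C"], "Config 2": ["B", "D"]}

def config_totals(selected, table):
    """Add the scores of the selected ranges category by category."""
    rows = [table[name] for name in selected]
    return [sum(column) for column in zip(*rows)]

for name, selected in configs.items():
    print(name, config_totals(selected, scores))
# Config 1 [3, 6, 9, 13]
# Config 2 [2, 7, 12, 6]
```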

Best approach for finding the maximum array element in a given range

Given a non-negative integer array of length n and m queries, each consisting of two integers a and b, find the maximum in the index range [a,b] of the array. Note that a can be greater than b, in which case the desired range runs from a to n and then from 1 to b. An input k is also given, which signifies that the length of the range to be considered is constant.
Example:
INPUT:
6 3 5 ---> n,m,k
7 6 2 6 1 5 ---> integer array
1 5 ---> query 1
2 6 ---> query 2
4 2 ---> query 3
OUTPUT:
7
6
7
I referred to this article but could not work out how to handle the cases where a>b. Is there any other approach for this problem?
Sliding window approach:
To solve the problem with the approach you mention (Sliding Window Maximum), just append the input array to itself, as shown below:
7 6 2 6 1 5 7 6 2 6 1 5
For the a<=b case, work as normal.
For the a>b case, replace b with b+n, so the range becomes [a,b+n] in the doubled array (equivalently [a,a+k-1], since every range contains k elements, as in the example). You can then solve it without any changes to the algorithm.
To optimize the above approach a bit, you only need to append the first k elements rather than the whole array.
If you slide over every time a query arrives, it takes O(n) per query. k being very close or equal to n is the worst case.
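Since the range length k is fixed, one pass over the extended array can precompute the answer for every possible start index, after which each query is a lookup. The sketch below is one way to do that with a monotonic deque; the function name circular_window_maxima is hypothetical, it assumes 1 <= k <= n, and it appends only the first k-1 elements, which is enough for windows of k elements.

```python
from collections import deque

def circular_window_maxima(values, k):
    """maxima[s] is the max of the k elements starting at 0-based index s, wrapping around."""
    extended = values + values[:k - 1]
    window = deque()  # indices into `extended`; their values are strictly decreasing
    maxima = []
    for i, v in enumerate(extended):
        # Drop candidates that can never be the maximum again.
        while window and extended[window[-1]] <= v:
            window.pop()
        window.append(i)
        # Drop the front index once it has slid out of the window.
        if window[0] <= i - k:
            window.popleft()
        if i >= k - 1:
            maxima.append(extended[window[0]])
    return maxima

maxima = circular_window_maxima([7, 6, 2, 6, 1, 5], 5)
# Queries are 1-based; with a fixed length only the start index a matters.
for a, b in [(1, 5), (2, 6), (4, 2)]:
    print(maxima[a - 1])  # 7, 6, 7
```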
Alternative Approach: Use the following approach in case of heavy querying and flexible ranges.
You are looking for range queries and this is what Segment Trees are popular for.
This tutorial finds the minimum in given range. I know you have asked for maximum, which is just a trivial change you have to make in code.
For the a>b case, query twice, once for [1,b] and once for [a,n], and report the maximum of the two.
Preprocessing time: O(n)
Extra Space: O(n)
This approach is very efficient, as it answers every query in O(log n), which is quite helpful when there are many queries.
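A compact way to see how this works is an iterative (bottom-up) segment tree for range maximum. The following is a sketch under stated assumptions rather than the code from the linked tutorial: the class name MaxSegmentTree and the helper circular_max are made up for illustration, and queries use the 1-based inclusive indices from the question.

```python
class MaxSegmentTree:
    """Bottom-up segment tree answering range-maximum queries."""

    def __init__(self, values):
        self.n = len(values)
        # Leaves live at positions n..2n-1; position 0 is unused.
        self.tree = [0] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Maximum of values[lo..hi], 0-based and inclusive."""
        lo += self.n
        hi += self.n + 1  # switch to a half-open interval [lo, hi)
        best = self.tree[lo]
        while lo < hi:
            if lo & 1:
                best = max(best, self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                best = max(best, self.tree[hi])
            lo //= 2
            hi //= 2
        return best

def circular_max(tree, a, b):
    """1-based inclusive query; when a > b the range wraps from a..n and 1..b."""
    if a <= b:
        return tree.query(a - 1, b - 1)
    return max(tree.query(a - 1, tree.n - 1), tree.query(0, b - 1))

tree = MaxSegmentTree([7, 6, 2, 6, 1, 5])
print([circular_max(tree, a, b) for a, b in [(1, 5), (2, 6), (4, 2)]])  # [7, 6, 7]
```

Unlike the sliding window, this does not rely on the range length being fixed, so it also covers the flexible-range case.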
Sliding Window is going to output maximum element in all the ranges, but you need the maximum element only in given range. So instead of going with Sliding Window approach go with Segment Trees or Binary Indexed Trees. You'll feel the fun of truly querying within a range and not sliding over. (Just sliding over every time a query arrives won't scale if the range is flexible.)
I think this could be done using a divide and conquer approach, so let's take a look at the above example.
So for the case a>b
find max for range (1,b), say max_b = max_in_range(1,b).
find max for range (a,n), say max_a = max_in_range(a,n).
Now you can easily take the maximum of the two numbers using the built-in max function available in most languages:
ans = max(max_a, max_b)
Problems like this that involve ranges can also be solved using segment trees; here is a link to start with: https://en.wikipedia.org/wiki/Segment_tree
Hope this helps!

SPSS Identifying Different Lagged Values Through Loops

I have a dataset with 2 variables, week and brand_chosen, where brand_chosen designates which product (e.g. from a supermarket) was chosen, and it looks like this.
Week brand_chosen
2 19
2 15
2 50
2 12
3 19
3 16
3 50
4 77
4 19
What I am trying to do is, for each line, note the week in which the brand purchase was made and check whether the same brand was purchased in the week before. If it was, a variable dummy would take the value 1, otherwise 0.
Because week appears multiple times I cannot take just the lag(week,1), so I probably need to loop through the week variables for each case, until it finds the first different value.
This is what I tried to do:
loop i=1 to 70.
do if (week<>lag(week,i) and brand_chosen=lag(brand_chosen,i)).
compute dummy=1.
end loop.
else.
compute dummy=0.
end if.
end loop.
execute.
Where 70 is just an arbitrary number so that I am sure that it will check all the previous cases.
I get two problems with that. First, the lag function needs to contain a number from what I understand, but "i" is not considered a number here.
The second problem is that I would like to close the loop if the condition is satisfied and move to the next case, but I get an error.
I am new to spss syntax and I am struggling with that one, so any help is greatly appreciated.
I assume that every combination of week--brand_chosen is unique. In this case the solution is quite simple. Just reorder your dataset by brand_chosen and then week, and then run a simple lag command.
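This should do the trick (note that the condition week>LAG(week) flags a purchase of the same brand in any earlier week; to require the immediately preceding week, use week=LAG(week)+1 instead):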
SORT CASES BY brand_chosen week.
COMPUTE dummy=0.
IF (brand_chosen=LAG(brand_chosen) AND week>LAG(week)) dummy = 1.
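The sort-then-lag idea can be traced on the sample data with a short Python sketch. This is an illustration only, not SPSS syntax: the function name lagged_brand_dummy and the consecutive_only flag are assumptions added here, the latter to show the stricter reading where the earlier purchase must be in the immediately preceding week.

```python
def lagged_brand_dummy(rows, consecutive_only=False):
    """rows: (week, brand) pairs. Returns (week, brand, dummy) sorted by brand, then week."""
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    flagged = []
    previous = None  # plays the role of LAG(): the row just before the current one
    for week, brand in ordered:
        dummy = 0
        if previous is not None and previous[1] == brand:
            gap = week - previous[0]
            if gap == 1 or (gap > 1 and not consecutive_only):
                dummy = 1
        flagged.append((week, brand, dummy))
        previous = (week, brand)
    return flagged

purchases = [(2, 19), (2, 15), (2, 50), (2, 12), (3, 19),
             (3, 16), (3, 50), (4, 77), (4, 19)]
for row in lagged_brand_dummy(purchases):
    print(row)
```

On the sample data, brand 19 is flagged in weeks 3 and 4 and brand 50 in week 3; every other row gets 0.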

Does the position of the blank in an n-puzzle solution affect the set of valid puzzles?

I'm having trouble with my n-puzzle solver. Thought it was working, but it turns out it is solving insoluble puzzles. I've tried to trace it, but that's a lot of tracing and so far I see no cheating. I think I understand the algorithm for determining solubility, and my implementation agrees with the odd/even parity of some examples from the web... that is to say, when I count up the number of tiles after a given tile that are smaller than it, for every tile, and then add the row index of the blank tile, I get the same odd or even number as others have gotten.
So a thought that has occurred to me. In my model of, say, the 8-puzzle, my solution state is:
_ 1 2
3 4 5
6 7 8
Rather than
1 2 3
8 _ 4
7 6 5
Or
1 2 3
4 5 6
7 8 _
As it is in some other formulations. Could this be affecting which puzzles are soluble and which are not?
Thanks!
z.
In general, yes. If a configuration can be solved to the standard solution, it cannot be solved to a goal configuration that is itself unreachable from the standard solution.
In particular, it depends on the exact configuration you're using as a solution. You will need to check whether you can get from that configuration to the standard one.
EDIT: Think of it this way:
Let A be the standard solution.
Let B be your preferred solution.
Let C be your starting configuration.
If you can get from A to B, and you can get from C to A, then you can get from C to B.
But if you can't get from A to B, and you can get from C to A, then you can't get from C to B.
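For the 3x3 case, whether you can get from one configuration to another can be checked by comparing inversion parities. The sketch below assumes an odd board width, where the blank's row does not enter the test (on even widths it does); the function names are made up for illustration and 0 stands for the blank.

```python
def inversion_parity(tiles):
    """Parity (0 or 1) of the number of inversions, ignoring the blank (0)."""
    seq = [t for t in tiles if t != 0]
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return inversions % 2

def mutually_reachable_3x3(start, goal):
    """True when the two 3x3 configurations lie in the same parity class."""
    return inversion_parity(start) == inversion_parity(goal)

standard = (1, 2, 3, 4, 5, 6, 7, 8, 0)     # blank in the last cell
blank_first = (0, 1, 2, 3, 4, 5, 6, 7, 8)  # the goal used in the question
spiral = (1, 2, 3, 8, 0, 4, 7, 6, 5)       # the other goal shown in the question

print(mutually_reachable_3x3(standard, blank_first))  # True
print(mutually_reachable_3x3(standard, spiral))       # False
```

Under this check, the blank-first goal accepts the same set of start states as the standard goal, while the spiral goal accepts the complementary set.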
