loop through column and add other row - arrays

EDIT: I've made some progress. So I read up on subsets, and was able to break down my dataframe under a certain condition. Let's say titleCSV[3] consists of file names ("file1", "file2", "file3", etc) and titleCSV[13] contains values (-18, -8, -2, etc). Code below:
titleRMS <- data.frame(titleCSV[3], titleCSV[13])
for(x.RMS in titleRMS[2]){
  # strip letters and spaces, then convert to a positive number
  x.RMS <- gsub("[A-Za-z]","",x.RMS)
  x.RMS <- gsub(" ","",x.RMS)
  x.RMS <- abs(as.numeric(x.RMS))
}
x.titleRMSJudge <- data.frame(titleRMS[1], x.RMS)
x.titleRMSResult <- subset(x.titleRMSJudge, x.RMS < 12)
My question now is, what's the best way to print each row of the first column of x.titleRMSResult with a message saying that it's loud? Thanks, guys!
BTW, here is the dput of my titleRMS:
dput(titleRMS)
structure(list(FILE.NAME = c("00-Introduction.mp3", "01-Chapter_01.mp3",
"02-Chapter_02.mp3", "03-Chapter_03.mp3", "04-Chapter_04.mp3",
"05-Chapter_05.mp3", "06-Chapter_06.mp3", "07-Chapter_07.mp3",
"08-Chapter_08.mp3", "09-Chapter_09.mp3", "10-Chapter_10.mp3",
"11-Chapter_11.mp3", "12-Chapter_12.mp3", "Bonus_content.mp3",
"End.mp3"), AVG.RMS..dB. = c(-14, -10.74, -9.97, -10.53, -10.94,
-12.14, -11, -9.19, -10.42, -11.51, -14, -10.96, -11.71, -11,
-16)), .Names = c("FILE.NAME", "AVG.RMS..dB."), row.names = c(NA,
-15L), class = "data.frame")
ORIGINAL POST BELOW
Newb here! Coding in R. So I am trying to analyze a csv file. One column has 10 rows with different file names, while the other has 10 rows with different values. I want to run the 2nd column through a loop, and if a value is greater/less than a certain threshold, print the associated file name as well as a message. I don't know how to have both columns run in a loop together so that the proper file name prints with the proper value/message. I wrote a loop that ends up checking each value once for every row in the other column. At the moment, all 10 rows meet the criteria for the message I want to print, so I've been getting 100 messages!
titleRMS <- data.frame(titleCSV[3], titleCSV[13])
for(title in titleRMS[1]){
title <- gsub(" ","",title)
}
for(r in titleRMS[2]){
r <- gsub("[A-Za-z]","",r)
r <- gsub(" ","",r)
r = abs(as.numeric(r))
for(t in title){
for(f in r){
if (f < 18 & f > 0) {
message(t, "is Loud!")
}
}
}
}
And this block of code only prints the first file name for each message:
for(r in titleRMS[2]){
r <- gsub("[A-Za-z]","",r)
r <- gsub(" ","",r)
r = abs(as.numeric(r))
for(f in r){
if (f < 18 & f > 0) {
message(t, "is Loud!")
}
}
}
Can someone throw me some tips or even re-write what I wrote to show me how to get what I need? Thanks, guys!

I've figured out my own issue. Here is what I wrote to come to the conclusion I wanted:
titleRMS <- data.frame(titleCSV[3], titleCSV[13])
filesHighRMS <- vector()
x.titleRMSJudge <- data.frame(titleCSV[3], titleCSV[13])
x.titleRMSResult <- subset(x.titleRMSJudge, titleCSV[13] > -12 & titleCSV[15] > -1)
for(i in x.titleRMSResult[,1]){
filesHighRMS <- append(filesHighRMS, i, 999)
}
emailHighRMS <- paste(filesHighRMS, collapse=", ")
blurbHighRMS <- paste("" ,nrow(x.titleRMSResult), " file(s) (" ,emailHighRMS, ") have a high RMS and are too loud.")
Being new to code, I bet there is a simpler way, I'm just glad I was able to work this out on my own. :-)

You're making things hard on yourself. You don't need regex for this, and you probably don't need a loop, at least not through your data frame. Definitely you don't need nested loops.
I think this will do what you say you want...
indicesToMessage <- titleRMS[, 2] > 0 & titleRMS[, 2] < 18
myMessages <- paste(titleRMS[indicesToMessage, 1], "is Loud!")
for (i in seq_along(myMessages)) {
  message(myMessages[i])
}
A more R-like way (read: without an explicit loop) to do the last line is like this:
invisible(lapply(myMessages, message))
The invisible is needed because message() is called only for its side effect of printing to the console and returns NULL; lapply collects those return values into a list of NULLs, which would otherwise be printed at the console. invisible just suppresses that output.
Edits: Negative data
Since your data is negative, I assume you actually want messages when the absolute value abs() is between 0 and 18. This works for that case.
indicesToMessage <- abs(titleRMS[, 2]) > 0 & abs(titleRMS[, 2]) < 18
myMessages <- paste(titleRMS[indicesToMessage, 1], "is Loud!")
invisible(lapply(myMessages, message))

Related

"ValueError: too many values to unpack (expected 2)" should not happen here

I'm currently trying to create a Sudoku without help, but I'm stuck on one issue.
def play():
    global myinput
    global column_rdm
    sudoku_col = [[] for _ in range(9)]
    for i in range(9):
        sudoku_col[i].append(0)
    h = 1
    try:
        while h < 10:
            rdm_list = random.sample(range(1, 10), 9)
            test_var = 0
            for j in range(9):
                if rdm_list[j] not in sudoku_col[j]:
                    test_var += 1
            if test_var == 9:
                for rdm_number, g in rdm_list, range(9):
                    sudoku_col[g].append(rdm_number)
                    # Input the values found in the sudoku
                    column_rdm = f"{rdm_number}"
                    myinput = Input(h, g+1)
                    myinput.value_def(column_rdm) # end
                h += 1
        update()
    # except Exception as e:
    #     print("Erreur dans la création du Sudoku")
    finally:
        print(h)
Here is the function that should create my Sudoku. I create random lists of 9 numbers which will be my sudoku rows, and I check whether each item of those lists is already present in its column using my "sudoku_col". If the test is OK (that is, test_var == 9), then I add this row to my template. If not, I create a new random list and run the test again. I do that until I have 9 rows (h < 10).
However, the code stops at the line "for rdm_number, g in rdm_list, range(9):" due to a ValueError. That should not happen, because rdm_list and range(9) have the same length and each item in both should be iterated over correctly. What am I missing here?
Thank you for your time
It should be
for rdm_number, g in zip(rdm_list, range(9)):
what you are doing is the same as
for rdm_number, g in (rdm_list, range(9)):
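which creates a tuple with two items that you iterate over. The first item is the whole of rdm_list, and its nine values cannot be unpacked into the two names rdm_number and g, hence the ValueError. You can see this happen if you run the following (it prints rdm_list and then range(0, 9)):
for sth in rdm_list, range(9):
    print(sth)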
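To make the difference concrete, here is a small sketch. The fixed rdm_list is a made-up stand-in for random.sample(range(1, 10), 9), and the names error_message and pairs exist only for this illustration.

```python
# Hypothetical fixed row, standing in for random.sample(range(1, 10), 9)
rdm_list = [4, 7, 1, 9, 2, 8, 3, 6, 5]

# Without zip, the loop iterates over the 2-tuple (rdm_list, range(9)).
# Its first item is the whole nine-element list, which cannot be
# unpacked into the two names rdm_number and g.
try:
    for rdm_number, g in rdm_list, range(9):
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# With zip, each iteration yields one (value, index) pair.
pairs = list(zip(rdm_list, range(9)))

print(error_message)
print(pairs[:3])
```

The first print shows the "too many values to unpack" message from the question, and the second shows the first three (value, index) pairs that zip produces.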
Also, keep in mind that for h in range(1, 10): could replace while h < 10 only if h were incremented on every pass. Since you advance h only when a row passes the test, the while loop is the right construct here.
Another improvement would be to do this (instead of using the range and accessing values by index):
for rdm, s_col in zip(rdm_list, sudoku_col):
    if rdm not in s_col:
        test_var += 1
Also this:
sudoku_col = [[] for _ in range(9)]
for i in range(9):
    sudoku_col[i].append(0)
can easily be reduced to
sudoku_col = [[0] for _ in range(9)]
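One caveat on that shortcut, as a sketch: the list comprehension matters, because the even shorter [[0]] * 9 would repeat a single inner list nine times. The names independent and shared below are hypothetical and only for illustration.

```python
# Nine separate inner lists, one per column
independent = [[0] for _ in range(9)]

# Nine references to the SAME inner list
shared = [[0]] * 9

independent[0].append(5)
shared[0].append(5)

print(independent[1])  # the second column is untouched
print(shared[1])       # the second "column" changed too
```

With the comprehension, appending to one column leaves the others alone, which is what the column check in the question relies on.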
Again, avoid using range to access values by index. Iterate over the values directly with for value in iterable: instead of for index in range(len(iterable)):, and if you also need the index, use for index, value in enumerate(iterable):
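As a rough sketch of those three styles side by side, applied to the column check and the append step from the question (rdm_list is again a made-up fixed row, and by_index / by_zip are hypothetical names):

```python
rdm_list = [4, 7, 1, 9, 2, 8, 3, 6, 5]  # hypothetical stand-in for a random row
sudoku_col = [[0] for _ in range(9)]

# Index-based check, as in the question
by_index = sum(1 for j in range(9) if rdm_list[j] not in sudoku_col[j])

# Same check, iterating over the values directly with zip
by_zip = sum(1 for rdm, col in zip(rdm_list, sudoku_col) if rdm not in col)

# enumerate gives the index alongside the value when both are needed
for g, rdm_number in enumerate(rdm_list):
    sudoku_col[g].append(rdm_number)

print(by_index, by_zip)
print(sudoku_col[0])
```

Both counts agree, and after the enumerate loop each column holds its initial 0 followed by the value from the row.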

How to add a continuous sequence (unique identifier) to an array?

I have an array, similar to this:
ar1<- array(rep(1, 91*5*4), dim=c(91, 5, 4))
I want to add an extra column at the end of each component (n = 4) that is sequential across all components (I'm not sure if component is the right word).
In this case it would be a sequence from 1 to 364.
The idea behind this is that if the rows are scrambled when I'm messing around with joining data or anything else I would be able to see it and rectify it.
How do I achieve this please?
Maybe the following is what you want.
It uses sapply to add an extra column to each slice defined by the 2nd dimension of the array, and once that is done, sets the final dimensions correctly.
ar2 <- sapply(1:5, function(i){
new <- seq_len(NROW(ar1[, i, ])) + (i - 1)*NROW(ar1[, i, ])
cbind(ar1[, i, ], new)
})
dim(ar2) <- c(91, 5, 5)
The code above creates a new array; if you prefer, you can overwrite the original one instead.
To get the original back, this will do it:
n <- dim(ar2)[2]
ar1_back <- sapply(1:5, function(i){
ar2[, -n, i]
})
dim(ar1_back) <- c(91, 5, 4)
identical(ar1, ar1_back)
#[1] TRUE

lapply and rbind not properly appending the results

SimNo <- 10
for (i in 1:SimNo){
z1<-rnorm(1000,0,1)
z2<-rnorm(1000,0,1)
z3<-rnorm(1000,0,1)
z4<-rnorm(1000,0,1)
z5<-rnorm(1000,0,1)
z6<-rnorm(1000,0,1)
X<-cbind(z1,z2,z3,z4,z5,z6)
sx<-scale(X)/sqrt(999)
det1<-det(t(sx)%*%sx)
detans<-do.call(rbind,lapply(1:SimNo, function(x) ifelse(det1<1,det1,0)))
}
When I run all the commands inside the loop except the last one, I get different values of the determinant, but when I run the whole loop at once, I get the last value of the determinant repeated for all iterations.
Please help, and advise on how to handle situations like this.
Is there a shorter and more efficient way to write this code, so that each individual variable can still be accessed?
Whenever you are repeating the same operation multiple times, and without inputs, think about using replicate. Here you can use it twice:
SimNo <- 10
det1 <- replicate(SimNo, {
X <- replicate(6, rnorm(1000, 0, 1))
sx <- scale(X) / sqrt(999)
det(t(sx) %*% sx)
})
detans <- ifelse(det1 < 1, det1, 0)
Otherwise, this is what your code should have looked like with your for loop. You needed to create a vector for storing your outputs at each loop iteration:
SimNo <- 10
detans <- numeric(SimNo)
for (i in 1:SimNo) {
z1<-rnorm(1000,0,1)
z2<-rnorm(1000,0,1)
z3<-rnorm(1000,0,1)
z4<-rnorm(1000,0,1)
z5<-rnorm(1000,0,1)
z6<-rnorm(1000,0,1)
X<-cbind(z1,z2,z3,z4,z5,z6)
sx<-scale(X)/sqrt(999)
det1<-det(t(sx)%*%sx)
detans[i] <- ifelse(det1<1,det1,0)
}
Edit: you asked in the comments how to access X using replicate. You would have to make replicate create and store all your X matrices in a list. Then use the *apply family of functions to loop through that list to finish the computations:
X <- replicate(SimNo, replicate(6, rnorm(1000, 0, 1)), simplify = FALSE)
det1 <- sapply(X, function(x) {
sx <- scale(x) / sqrt(999)
det(t(sx) %*% sx)
})
detans <- ifelse(det1 < 1, det1, 0)
Here, X is now a list of matrices, so you can get e.g. the matrix for the second simulation by doing X[[2]].
SimNo <- 10
matdet <- matrix(data=NA, nrow=SimNo, ncol=1, byrow=TRUE)
for (i in 1:SimNo){
z1<-rnorm(1000,0,1)
z2<-rnorm(1000,0,1)
z3<-rnorm(1000,0,1)
z4<-rnorm(1000,0,1)
z5<-rnorm(1000,0,1)
z6<-rnorm(1000,0,1)
X<-cbind(z1,z2,z3,z4,z5,z6)
sx<-scale(X)/sqrt(999)
det1<-det(t(sx)%*%sx)
matdet[i] <- ifelse(det1<1,det1,0)
}
matdet

Vector search Algorithm

I have the following problem. Say I have a vector:
v = [1,2,3,4,5,1,2,3,4,...]
I want to sequentially sample points from the vector that have an absolute magnitude difference of at least a threshold from the previously sampled point. So say my threshold is 2.
I start at the index 1, and sample the first point 1. Then my condition is met at v[3], and I sample 3 (since 3-1 >= 2). Then 3, the new sampled point becomes the reference, that I check against. The next sampled point is 5 which is v[5] (5-3 >= 2). Then the next point is 1 which is v[6] (abs(1-5) >= 2).
Unfortunately, my code in R is taking too long. Basically I am scanning the array repeatedly and looking for matches. I think that this approach is naive, though. I have a feeling that I can accomplish this task in a single pass through the array, but I don't know how. Any help is appreciated. I guess the problem I am running into is that the location of the next sample point can be anywhere in the array, and I need to scan the array from the current point to the end to find it.
Thanks.
I don't see a way this can be done without a loop, so here is one:
my.sample <- function(x, thresh) {
  out <- x
  i <- 1
  for (j in seq_along(x)[-1]) {
    if (abs(x[i]-x[j]) >= thresh) {
      i <- j
    } else {
      out[j] <- NA
    }
  }
  out[!is.na(out)]
}
my.sample(x = c(1:5,1:4), thresh = 2)
# [1] 1 3 5 1 3
You can do this without a loop using a bit of recursion:
vsearch = function(v, x, fun=NULL) {
  # v: input vector
  # x: threshold level
  if (!length(v) > 0) return(NULL)
  y = v-rep(v[1], times=length(v))
  if (!is.null(fun)) y = fun(y)
  i = which(y >= x)
  if (!length(i) > 0) return(NULL)
  i = i[1]
  return(c(v[i], vsearch(v[-(1:(i-1))], x, fun=fun)))
}
With your vector above (note that the starting point, 1, is not included in the result):
> vsearch(c(1,2,3,4,5,1,2,3,4), 2, abs)
[1] 3 5 1 3

R - Vector/ Array Addition

I am having a little trouble with vector or array operations.
I have three 3D arrays and I want to find their element-wise average. How can I do that? We can't use mean() directly, as it only returns a single value.
More importantly, some of the cells in the arrays are NA, which means that if I just add them like
A = (B + C + D)/3
the result will show NA as well.
How can I make it recognise that a cell is NA and just skip it?
Like
A = c(NA, 10, 15, 15, NA)
B = c(10, 15, NA, 22, NA)
C = c(NA, NA, 20, 26, NA)
I want the output of averaging these vectors to be
(10, (10+15)/2, (15+20)/2, (15+22+26)/3, NA)
We also can't use na.omit, because it would shift the positions of the remaining elements.
This is the corresponding code. I hope it is helpful.
for (yr in 1950:2011) {
temp_JFM <- sst5_sst2[,,year5_sst2==yr & (month5_sst2>=1 & month5_sst2<=3)]
k = 0
jfm=4*k+1
for (i in 1:72) {
for (j in 1:36) {
iposst5_sst2[i,j,jfm] <- (temp_JFM[i,j,1]+temp_JFM[i,j,2]+temp_JFM[i,j,3])/3
}
}
}
Thank you.
It has already been solved.
The easiest fix is shown below.
iposst5_sst2[i,j,jfm] <- mean(temp_JFM[i,j,],na.rm=TRUE)
I'm not entirely sure what your desired output is, but I'm guessing that what you really want to build is not three 3D arrays, but one 4D array that you can then use apply on.
Something like this:
#Three 3D arrays...
A <- array(runif(27),dim = c(3,3,3))
B <- array(runif(27),dim = c(3,3,3))
C <- array(runif(27),dim = c(3,3,3))
#Become one 4D array
D <- array(c(A,B,C),dim = c(3,3,3,3))
#Now we can simply apply the function mean
# and use its na.rm = TRUE argument.
apply(D,1:3,mean,na.rm = TRUE)
Here's an example which makes a vector of the three values, which makes na.omit usable:
vectorAverage <- function(A,B,C) {
  Z <- rep(NA, length(A))
  for (i in 1:length(A)) {
    x <- na.omit(c(A[i],B[i],C[i]))
    if (length(x) > 0) Z[i] = mean(x)
  }
  Z
}
Resulting in:
vectorAverage(A,B,C)
[1] 10.0 12.5 17.5 21.0 NA
Edited: Missed the NaN in the output of the first version.
