I'm new to Haskell.
I'm trying to parse a text file containing two matrices. The contents of the text file:
m n
a11 a12 ...
a21 a22 ...
...
b11 b12 ...
b21 b22 ...
...
where m is the number of rows of the 1st matrix and n is the number of rows of the 2nd matrix.
For instance:
3 2
1 2 3
4 5 6
7 8 9
1 2
3 4
I know it looks stupid, but my task is to parse a text file with 2 matrices, and this format is all I came up with.
Here is the code:
readLine :: Read a => Handle -> IO [a]
readLine = fmap (map read . words) . hGetLine
parse :: Handle -> IO (Matrix a, Matrix a)
parse = do
    [m, n] <- readLine
    xss1 <- replicateM m readLine
    xss2 <- replicateM n readLine
    return (fromLists xss1, fromLists xss2)
main = do
    [input, output] <- getArgs
    h <- openFile input ReadMode
    (m1, m2) <- parse h
    print $ mult m1 m2
Here is the console log:
Prelude> :r
[1 of 1] Compiling Matrix ( lab.matrix.hs, interpreted )
lab.matrix.hs:156:5:
Couldn't match expected type `IO [a0]' with actual type `[t0]'
In the pattern: [m, n]
In a stmt of a 'do' block: [m, n] <- readLine
In the expression:
do { [m, n] <- readLine;
xss1 <- replicateM m readLine;
xss2 <- replicateM n readLine;
return (fromLists xss1, fromLists xss2) }
Failed, modules loaded: none.
Most likely, there are still a few bugs.
Help me please, I'm exhausted already...
You need to supply a Handle as an argument to every call of readLine, so parse could look like this:
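parse :: Read a => Handle -> IO (Matrix a, Matrix a)
parse h = do
    [m, n] <- readLine h               -- m, n: row counts of the 1st and 2nd matrix
    xss1 <- replicateM m $ readLine h  -- the first m lines form the 1st matrix
    xss2 <- replicateM n $ readLine h  -- the next n lines form the 2nd matrix
    return (fromLists xss1, fromLists xss2)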
Another note - it's probably safer to check the number of arguments returned by getArgs, rather than just assuming there will be two. For example:
main = do
    args <- getArgs
    case args of
        [input, output] -> do
            h <- openFile input ReadMode
            (m1, m2) <- parse h
            hClose h
            print $ mult m1 m2
        _ -> putStrLn "expected two arguments"
Related
I'm trying to find a more efficient way to compute the following matrix in R:
Let A and C be two 3D arrays of dimension (n, n, m) and B a matrix of dimension (m, m); then M is an (n, n) matrix such that:
M_ij = SUM_kl A_ijk * B_kl * C_ijl
for (i in seq(n)) {
for (j in seq(n)) {
M[i, j] <- A[i,j,] %*% B %*% C[i,j,]
}
}
It is possible to write this with the tensorA package, using i and j as parallel dimensions, but I'd rather stay with base R objects.
einstein.tensor(A %e% log(B), C, by = c("i", "j"))
Thanks!
I don't know if this would be faster, but it would avoid one level of looping:
for (i in seq(n))
M[i,] <- diag(A[i,,] %*% B %*% t(C[i,,]))
It gives the same answer as yours in this example:
n <- 2
m <- 3
A <- array(1:(n^2*m), c(n, n, m))
C <- A + 1
B <- matrix(1:(m^2), m, m)
M <- matrix(NA, n, n)
for (i in seq(n))
M[i,] <- diag(A[i,,] %*% B %*% t(C[i,,]))
M
# [,1] [,2]
# [1,] 1854 3216
# [2,] 2490 4032
Edited to add: Based on https://stackoverflow.com/a/42569902/2554330, here's a slightly faster version:
for (i in seq(n))
M[i,] <- rowSums((A[i,,] %*% B) * C[i,,])
I did some timing with n <- 200 and m <- 300, and this was the fastest at 3.1 sec, versus my original solution at 4.7 sec, and the one in the question at 17.4 sec.
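The remaining loop over i can also be avoided in base R by flattening the first two dimensions of A and C. This is only a sketch on the small example above; the names A2, C2 and M_vec are introduced here for illustration, and it has not been timed against the other versions.

```r
n <- 2
m <- 3
A <- array(1:(n^2*m), c(n, n, m))
C <- A + 1
B <- matrix(1:(m^2), m, m)

# Flatten the first two dimensions: row i + (j-1)*n holds A[i, j, ]
A2 <- matrix(A, n * n, m)
C2 <- matrix(C, n * n, m)

# Each row of (A2 %*% B) * C2 sums to SUM_kl A_ijk * B_kl * C_ijl
M_vec <- matrix(rowSums((A2 %*% B) * C2), n, n)
M_vec
```

Because R stores arrays in column-major order, reshaping the row sums back to (n, n) puts each value at its original [i, j] position.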
Let A and B be arrays, of dimension [2,3,4] and [100,2], respectively. Note that 2 is the common dimension.
My desired answer is an array C of dimension [100,2,3,4] such that
C[h,i,j,k] = A[i,j,k] - B[h,i]
for all h,i,j,k.
Or
C[h,i,j,k] = A[i,j,k] + B[h,i]
for all h,i,j,k.
The latter case makes the answer easier to check, using the following example arrays.
For example:
A <- array(NA,c(2,3,4))
for (i in 1:2) {for(j in 1:3){for(k in 1:4){
A[i,j,k] <- i*1000000+j*100000+k*10000
}}}
B <- array(NA,c(100,2))
for (h in 1:100) {for(i in 1:2){B[h,i] <- h*10+i }}
How about this:
C <- array(NA, c(dim(B)[1], dim(A)))
# Approach 1
for (h in 1 : dim(B)[1])
for(i in 1 : dim(A)[1])
C[h, i,, ] <- A[i,, ] - B[h, i]
# Approach 2
for (h in 1 : dim(B)[1])
C[h,,,] <- sweep(A, 1, B[h, ], "-")
To check that the answer is correct, pick some values for h, i, j, k:
i <- 1; j <- 2; k <- 3; h <- 50
C[h, i, j, k]
#[1] 1229499
A[i,j,k] - B[h,i]
#[1] 1229499
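For completeness, the same array can be built without any loop by repeating A along a new dimension and relying on R's vector recycling. This is a sketch under the example data from the question; A_rep and C_vec are names introduced here for illustration.

```r
A <- array(NA, c(2, 3, 4))
for (i in 1:2) {
  for (j in 1:3) {
    for (k in 1:4) {
      A[i, j, k] <- i * 1000000 + j * 100000 + k * 10000
    }
  }
}
B <- array(NA, c(100, 2))
for (h in 1:100) {
  for (i in 1:2) {
    B[h, i] <- h * 10 + i
  }
}

# Repeat A along a new 4th dimension of length nrow(B), then move it to the front
A_rep <- aperm(array(A, c(dim(A), nrow(B))), c(4, 1, 2, 3))

# as.vector(B) is ordered by (h, i), which matches the first two dimensions
# of A_rep, so it is recycled correctly over j and k
C_vec <- A_rep - as.vector(B)
dim(C_vec)
```

Replacing "-" with "+" in the last assignment gives the additive variant.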
I have an array / named vector that looks like this:
d f g
1 2 3
I want to fill up the empty slots, meaning I want this:
a b c d e f g
0 0 0 1 0 2 3
Is there an elegant way of doing this, without having to write loops and conditionals? In my actual problem, the array names are numbers rather than letters like a, b, c, d. I'm not sure if that makes a difference; I figured the alphabet is easier to understand in a reproducible example.
1) Create a vector of the final names, nms, then create a named vector of zeros from it using sapply, and replace the elements corresponding to the input names with the input values.
v <- c(d = 1, f = 2, g = 3) # input
nms <- letters[letters <= max(names(v))] # names on output vector, i.e. letters[1:7]
replace(sapply(nms, function(x) 0), names(v), v) ##
giving:
a b c d e f g
0 0 0 1 0 2 3
If in your actual vector the names are not letters then just set nms yourself. For example, nms <- c("dogs", "cats", "d", "elephants", "f", "g") would work with the same line marked ## above.
2) An alternative is to replace the line marked ## above with:
unlist(modifyList(as.list(setNames(numeric(length(nms)), nms)), as.list(v)))
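Since the question mentions that the real names are numbers, here is a small sketch of the same idea for that case, using plain indexing by name. The input v and the range 1:7 are made-up illustration values, not data from the question.

```r
v <- c("4" = 1, "6" = 2, "7" = 3)           # hypothetical input with numeric names
nms <- as.character(1:7)                     # full set of names wanted in the output

out <- setNames(numeric(length(nms)), nms)   # named vector of zeros
out[names(v)] <- v                           # overwrite the slots present in v
out
```

The names must be compared as character strings, hence as.character(); indexing with the numbers themselves would select by position rather than by name.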
Data
x <- c(d=1L,f=2L,g=3L);
x;
## d f g
## 1 2 3
Solution 1: First match new names into x and extract values, then replace NAs with zero.
x <- setNames(x[match(letters[1:7],names(x))],letters[1:7]);
x[is.na(x)] <- 0L;
x;
## a b c d e f g
## 0 0 0 1 0 2 3
Solution 2: One-liner, using nomatch argument of match().
setNames(c(x,0L)[match(letters[1:7],names(x),nomatch=length(x)+1L)],letters[1:7]);
## a b c d e f g
## 0 0 0 1 0 2 3
I already asked a similar question; however, the input data now has a different dimension, and I can't get the bigger array filled with the smaller matrix or array. Here is some basic example data showing my structure:
dfList <- list(data.frame(CNTRY = c("B", "C", "D"), Value=c(3,1,4)),
data.frame(CNTRY = c("A", "B", "E"),Value=c(3,5,15)))
names(dfList) <- c("111.2000", "112.2000")
The input data is a list of more than 1000 data frames, which I turned into a list of matrices with the first column as row names:
dfMATRIX <- lapply(dfList, function(x) {
m <- as.matrix(x[,-1])
rownames(m) <- x[,1]
colnames(m) <- "Value"
m
})
I then tried to fill an array with this list of matrices, as shown in my former question:
library(abind)  # provides afill()
CNTRY <- c("A", "B", "C", "D", "E")
full_dflist <- array(dim=c(length(CNTRY),1,length(dfMATRIX)))
dimnames(full_dflist) <- list(CNTRY, "Value", names(dfMATRIX))
for(i in seq_along(dfMATRIX)){
afill(full_dflist[, , i], local= TRUE ) <- dfMATRIX[[i]]
}
which gives the error message:
Error in `afill<-.default`(`*tmp*`, local = TRUE, value = c(3, 1, 4)) :
does not make sense to have more dims in value than x
Any ideas?
I also tried, as in my former question, to use acast, and also array() instead of the dfMATRIX <- lapply... command. I assume that the 2nd dimension of my full_dflist array (sorry for the naming :)) is wrong, but I don't know how to specify the input. I appreciate your ideas very much.
Edit2: Sorry, I posted the wrong output :) Here is my new expected output:
$`111.2000`
Value
A NA
B 3
C 1
D 4
E NA
$`112.2000`
Value
A 3
B 5
C NA
D NA
E 15
This could be one solution using data.table:
library(data.table)
#create a big data.table with all the elements
biglist <- rbindlist(dfList)
#use lapply to operate on individual dfs
lapply(dfList, function(x) {
#use the big data table to merge to each one of the element dfs
temp <- merge(biglist[, list(CNTRY)], x, by='CNTRY', all.x=TRUE)
#remove the duplicate values
temp <- temp[!duplicated(temp), ]
#convert CNTRY to character and set the order on it
temp[, CNTRY := as.character(CNTRY)]
setorder(temp, 'CNTRY')
temp
})
Output:
$`111.2000`
CNTRY Value
1: A NA
2: B 3
3: C 1
4: D 4
5: E NA
$`112.2000`
CNTRY Value
1: A 3
2: B 5
3: C NA
4: D NA
5: E 15
EDIT
For your updated output you could do:
lapply(dfList, function(x) {
temp <- merge(biglist[, list(CNTRY)], x, by='CNTRY', all.x=TRUE)
temp <- temp[!duplicated(temp), ]
temp[, CNTRY := as.character(CNTRY)]
setorder(temp, 'CNTRY')
data.frame(Value=temp$Value, row.names=temp$CNTRY)
})
$`111.2000`
Value
A NA
B 3
C 1
D 4
E NA
$`112.2000`
Value
A 3
B 5
C NA
D NA
E 15
But I would really suggest keeping the list with data.table elements rather than converting them to data.frames just to get row.names.
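If an extra package is not wanted, a similar result can be sketched in base R with match(). This assumes the full country vector CNTRY from the question is known in advance; filled and full_mat are names introduced here for illustration.

```r
dfList <- list(data.frame(CNTRY = c("B", "C", "D"), Value = c(3, 1, 4)),
               data.frame(CNTRY = c("A", "B", "E"), Value = c(3, 5, 15)))
names(dfList) <- c("111.2000", "112.2000")
CNTRY <- c("A", "B", "C", "D", "E")

# match() gives, for each country, its row in x (NA if the country is absent)
filled <- lapply(dfList, function(x) {
  data.frame(Value = x$Value[match(CNTRY, x$CNTRY)], row.names = CNTRY)
})

# The same lookup as one matrix: countries in rows, one column per list element
full_mat <- sapply(dfList, function(x) x$Value[match(CNTRY, x$CNTRY)])
rownames(full_mat) <- CNTRY
```

Indexing Value with an NA position yields NA, so countries missing from a data frame are filled automatically.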
I am reading in data with the JSON package.
Basically, the data has the following format:
{"a":1,"b":2,"c":3}
{"a": null,"b":2,"c":3}
I am storing the data as follows in R:
DAT<-data.table(read.csv("D:/file.csv"))
OUT <- list()    # parsed JSON, one element per row
vnames <- NULL   # collected variable names
i<-1
#create unified variable names
while (i<=nrow(DAT)) {
OUT[[i]]<-fromJSON(as.character(DAT[i]$results))
vnames<-c(vnames,names(OUT[[i]]))
i<-i+1
}
#create the corresponding content
content <- NULL
Applicant <- NULL
i<-1
while (i<=nrow(DAT)) {
temp<-fromJSON(as.character(DAT[i]$results))
laenge <- length(fromJSON(as.character(DAT[i]$results)))
for(j in 1:laenge)
{
content_new <- as.character(temp[[j]])
content <- c(content, content_new)
}
i <- i+1
}
Then I want to join the lists via (in order to have the data in the typical format):
assets_mren = data.frame(asset_class=vnames, value=content)
Yet I receive an error message stating that vnames and content have a different number of rows. I believe that the problem is the "null" in the data being read in. Do you have an idea how to read in the "null" above, or a better way to read in the data?
Yes, the problem is the null. You get a different structure for each row.
ll <- '{"a":1,"b":2,"c":3}
{"a": null,"b":2,"c":3}'
res <- lapply(readLines(textConnection(ll)), function(x) str(fromJSON(x)))  # parse each line separately
Named num [1:3] 1 2 3 ## named vector for the first line
- attr(*, "names")= chr [1:3] "a" "b" "c"
List of 3
$ a: NULL ## list for the second line
$ b: num 2
$ c: num 3
So you have to homogenise the output of each line. Here are two options:
1- Replace null with a dummy value (0 or -1, for example):
ll <- readLines(textConnection(gsub("null",-1,ll)))
do.call(rbind,lapply(ll,function(x)
fromJSON(x)))
a b c
[1,] 1 2 3
[2,] -1 2 3 ## res[res==-1] <- NA to replace dummy value
2- Keep the null, but then use rbind.fill to get a data.frame:
ll <- '{"a":1,"b":2,"c":3}
{"a": null,"b":2,"c":3}'
ll <- readLines(textConnection(ll))
res <- lapply(ll,function(x)
as.data.frame(t(as.matrix(unlist(fromJSON(x))))))
library(plyr)
rbind.fill(res)
a b c
1 1 2 3
2 NA 2 3