Applying Julia function to nested array of arrays - arrays

Is there a simpler way to apply a function in Julia to a nested array than defining a new function? For example:
a = collect(1:10)
b = [ a*i for i in 100:100:400]
arraylog(x) = log.(x) ## Need to define an extra function to do the inner array?
arraylog.(b)

I would use a comprehension just like you used it to define b: [log.(x) for x in b].
The benefit of this approach is that such code should be easy to read later.
EDIT
Referring to the answer by Tasos: a comprehension actually implicitly defines an anonymous function that is passed to Base.Generator. In this use case, a comprehension and map should be largely equivalent.
I assumed that MR_MPI-BGC wanted to avoid defining an anonymous function.
If it were allowed one could also use a double broadcast like this:
(x->log.(x)).(b)
which is even shorter, but I thought it would not be as readable as a comprehension.

You could define it as a lambda instead.
Obviously the distinction may be moot, depending on how you're using this later in your code, but if all you want is to not waste a line in your code for the sake of conciseness, you could easily dump this inside a map statement, for instance:
map( x->log.(x), b )
or, if you prefer do syntax:
map(b) do x
    log.(x)
end
PS. I'm not familiar with a syntax which allows the broadcasted version of a function to be plugged directly into map, but if one exists it would be even cleaner than a lambda here ... but alas 'map( log., b )' is not valid syntax.
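As a rough side-by-side of the options discussed in these answers, the sketch below applies log to the nested array from the question in four ways and checks that they agree. The variable names are made up for illustration. The last form, broadcast.(log, b), is one possible answer to the PS above: it assumes that broadcasting treats a function such as log as a scalar, which to my knowledge holds in Julia 1.x.

```julia
a = collect(1:10)
b = [a * i for i in 100:100:400]   # a Vector of four Vectors

# Comprehension: loop over the outer array, broadcast over each inner array.
via_comprehension = [log.(x) for x in b]

# map with an anonymous function.
via_map = map(x -> log.(x), b)

# Broadcasting an anonymous function over the outer array.
via_lambda_broadcast = (x -> log.(x)).(b)

# Broadcasting `broadcast` itself: calls broadcast(log, x) for each x in b.
via_nested_broadcast = broadcast.(log, b)
```

All four return a vector of vectors with the same shape as b, so the choice is mostly about readability.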

Related

Julia, use of map to run a function multiple times

I have some code that runs fine and does what I want (there may be a simpler, more elegant solution, but this works):
round(Int16, floor(rand(TruncatedNormal(150,20,50,250))))
However when I try to execute it multiple times, using map, it throws an error saying it doesn't like the Int16 specification, so this:
map(round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))), 1:2)
throws this error
ERROR: MethodError: objects of type Int16 are not callable
I just want to run it twice (in this case) and sum the results. Why is it unhappy? Thx. J
The first argument to map is a function. map calls that argument once for each element of the collection, so with your code Julia is trying to make a function call like:
round(Int16, floor(rand(TruncatedNormal(150,20,50,250))))(1)
But the output of round(Int16, ...) isn't a function, it's a number, so you cannot call it. That's why the error says "objects of type Int16 are not callable." You could fix this by using an anonymous function that ignores its argument:
map(_ -> round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))), 1:2)
But the "Julian" way to do this is to use a comprehension:
[round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))) for _ in 1:2]
EDIT:
If you are going to sum the results, then you can use something that looks like a comprehension but is called a generator expression. This is basically everything above without the [ ] around the expression. A generator expression can be used directly in functions like sum or mean, etc.
sum(round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))) for _ in 1:2)
The advantage to generator expressions is that they don't allocate the memory for the full array. So, if you did this 100 times and used the sum approach above, you wouldn't need to allocate space for 100 numbers.
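To make the three forms concrete without depending on the Distributions package, the sketch below swaps the TruncatedNormal draw for a hypothetical stand-in, draw(), that returns a random integer between 50 and 250. Only the shape of the calls is meant to carry over to the original expression.

```julia
# Hypothetical stand-in for round(Int16, floor(rand(TruncatedNormal(...)))).
draw() = rand(50:250)

# map calls its function with each element of 1:2, so the anonymous
# function must accept (and here ignore) one argument.
via_map = map(_ -> draw(), 1:2)

# Comprehension: builds a 2-element array.
via_comprehension = [draw() for _ in 1:2]

# Generator expression: same syntax without the brackets; sum consumes
# the values one at a time instead of storing all 100 in an array first.
total = sum(draw() for _ in 1:100)
```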
This goes beyond the original question, but OP wanted to use the sum expression where the 2 in 1:2 is a 1-element vector. Of course, if the input is always a 1-element vector, then I recommend first(x), as suggested in the comments. But this is a nice opportunity to show the importance of breaking things down into functions frequently in Julia. For example, you could take the entire sum expression and define a function
generatenumbers(n::Integer) = sum(... for _ in 1:n)
where n is a scalar. Then if you have some odd array expression for n (1-element vector, many such ns in a multi-dim array, etc.), you can just do:
generatenumbers.(ns)
# will apply to each element and return same shape as ns
If the de-sugaring logic is more complex than applying element-wise, you can even define:
generatenumbers(ns::AbstractArray) = # ... something more complex
The point is to define an "atomic" function that expresses the statement or task you want clearly, then use dispatch to apply it to more complicated data-structures that appear in practical code. This is a common design pattern in Julia (not the only option, but an effective one).
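A minimal sketch of that pattern follows. The body of generatenumbers is elided in the answer, so a hypothetical stand-in, draw(), is used here in place of the real random draw, and the array method returning a grand total is just one made-up example of "something more complex".

```julia
# Hypothetical stand-in for the random draw in the question.
draw() = rand(50:250)

# The "atomic" function: works on a single scalar count.
generatenumbers(n::Integer) = sum(draw() for _ in 1:n)

# Broadcasting applies it element-wise and keeps the shape of ns.
ns = [1 2; 3 4]
totals = generatenumbers.(ns)

# An extra method for arrays, selected by dispatch. This rule (return
# the grand total over all elements) is only an illustrative assumption.
generatenumbers(ns::AbstractArray) = sum(generatenumbers.(ns))

grand_total = generatenumbers([2])   # a 1-element vector now works too
```

Calling generatenumbers.(ns) still uses the scalar method for each element, so adding the array method does not change the broadcasted behavior.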
Adding to the answer from @darsnack.
If you want to run it multiple times in order to keep the results (it wasn't clear from the question), then you could also ask rand to produce a vector directly, and do the type conversion in the floor call.
Moving from:
map(round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))), 1:2)
to:
floor.(Int16, rand(TruncatedNormal(150,20,50,250), 2))
The documentation is here.

Matlab equivalent to Apply in Mathematica

I'm looking for an equivalent in Matlab to do the same as Apply in Mathematica. There it works like this
fct=#1+#2^2+#3^3+#4^4&;
Apply[fct,{a,b,c,d}] = fct[a,b,c,d]
In Matlab, it's easy with a cell like this
fct=@(x1,x2,x3,x4) x1+x2^2+x3^3+x4^4;
mycell={a,b,c,d};
fct(mycell{:})
because mycell{:}=a,b,c,d is a valid comma-separated list of arguments for fct (see here).
Now, I would like to do the same with an array, e.g. like this:
myarray=[a,b,c,d];
fct(myarray(:))
but this doesn't work. Sadly, things like fct(num2cell(myarray){:}) don't work either.
The problem here is that I would like to use the function as a one-liner (the array already exists, it can be called by its name). The reason for this is that the function should be an element of a struct.
Of course, in reality, my function looks different. Note that I'm not looking for arrayfun, which maps a function over an array.
(Side note: in Mathematica, I can even write Apply[#1+#2^2+#3^3+#4^4&,{a,b,c,d}], and there is even an infix notation for Apply that makes it more concise.)

eval in function scope (accessing function args)

Given:
abstract ABSGene
type NuGene <: Genetic.ABSGene
    fqnn::ANN
    dcqnn::ANN
    score::Float32
end
function mutate_copy{T<:ABSGene}(gene::T)
    all_fields_except_score = filter(x->x != :score, names(T))
    all_fields_except_score = map(x->("mutate_copy(gene.$x)"),all_fields_except_score)
    eval(parse("$(T)("*join(all_fields_except_score,",")*")"))
end
ng = NuGene()
mutated_ng = mutate_copy(ng)
results in:
ERROR: gene not defined
in mutate_copy at none:4
If I just look at it as a string (prior to running parse and eval) it looks fine:
"NuGene(mutate_copy(gene.fqnn),mutate_copy(gene.dcqnn))"
However, eval doesn't seem to know about gene that has been passed into the mutate_copy function.
How do I access the gene argument that's been passed into the mutate copy?
I tried this:
function mutate_copy{T<:ABSGene}(gene::T)
    all_fields_except_score = filter(x->x != :score, names(T))
    all_fields_except_score = map(x-> ("mutate_copy($gene.$x)"),all_fields_except_score)
    eval(parse("$(T)("*join(all_fields_except_score,",")*")"))
end
But that expands the gene in the string which is not what I want.
Don't use eval! In almost all cases, unless you really know what you're doing, you shouldn't be using eval. And in this case, eval simply won't work because it operates in the global (or module) scope and doesn't have access to the variables local to the function (like the argument gene).
While the code you posted isn't quite enough for a minimal working example, I can take a few guesses as to what you want to do here.
Instead of map(x->("mutate_copy(gene.$x)"),all_fields_except_score), you can dynamically look up the field name:
map(x->mutate_copy(gene.(x)), all_fields_except_score)
This is a special syntax that may eventually be replaced by getfield(gene, x). Either one will work right now, though.
And then instead of eval(parse("$(T)("*join(all_fields_except_score,",")*")")), call T directly and "splat" the field values:
T(all_fields_except_score...)
I think the field order should be stable through all those transforms, but it looks pretty fragile (you're depending on score being the last field, and on all constructors taking their arguments in the same order as their fields). It looks like you're trying to perform a deepcopy sort of operation, but leaving the score field uninitialized. You could alternatively use Base's deepcopy and then recursively set the scores to zero.
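For readers on current Julia, here is a hedged sketch of the same idea without eval. The question and answer use older syntax; in Julia 1.x, gene.(x) is broadcast-call syntax, so getfield(gene, x) is the form used for dynamic field lookup, and fieldnames replaces names. The Net type, its doubling "mutation", and the choice to reset score to zero are all assumptions made up for illustration, since the original ANN type and constructors are not shown.

```julia
abstract type ABSGene end

# Hypothetical stand-in for the ANN type from the question.
struct Net
    weights::Vector{Float64}
end

struct NuGene <: ABSGene
    fqnn::Net
    dcqnn::Net
    score::Float32
end

# Leaf case: a placeholder "mutation" that doubles the weights.
mutate_copy(n::Net) = Net(n.weights .* 2)

function mutate_copy(gene::T) where {T<:ABSGene}
    # Every field name except :score, in declaration order.
    fields = [name for name in fieldnames(T) if name != :score]
    # Look each field up dynamically and copy it recursively.
    copies = [mutate_copy(getfield(gene, name)) for name in fields]
    # Splat the copies into the constructor; score is assumed to be the
    # last field and is reset to zero here.
    return T(copies..., 0f0)
end

ng = NuGene(Net([1.0, 2.0]), Net([3.0]), 5f0)
mutated_ng = mutate_copy(ng)
```

This keeps the fragility noted above: it relies on score being the last field and on the default constructor taking arguments in field order.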

Explanation of splat

While reading about Julia on http://learnxinyminutes.com/docs/julia/ I came across this:
# You can define functions that take a variable number of
# positional arguments
function varargs(args...)
    return args
    # use the keyword return to return anywhere in the function
end
# => varargs (generic function with 1 method)
varargs(1,2,3) # => (1,2,3)
# The ... is called a splat.
# We just used it in a function definition.
# It can also be used in a function call,
# where it will splat an Array or Tuple's contents into the argument list.
Set([1,2,3]) # => Set{Array{Int64,1}}([1,2,3]) # produces a Set of Arrays
Set([1,2,3]...) # => Set{Int64}(1,2,3) # this is equivalent to Set(1,2,3)
x = (1,2,3) # => (1,2,3)
Set(x) # => Set{(Int64,Int64,Int64)}((1,2,3)) # a Set of Tuples
Set(x...) # => Set{Int64}(2,3,1)
Which I'm sure is a perfectly good explanation, however I fail to grasp the main idea/benefits.
From what I understand so far:
Using a splat in a function definition allows us to specify that we have no clue how many input arguments the function will be given, could be 1, could be 1000. Don't really see the benefit of this, but at least I understand (I hope) the concept of this.
Using a splat as an input argument to a function does... What exactly? And why would I use it? If I had to input an array's contents into the argument list, I would use this syntax instead: some_array(:,:) (for 3D arrays i would use some_array(:,:,:) etc.).
I think part of the reason why I don't understand this is that I'm struggling with the definition of tuples and arrays, are tuples and arrays data types (like Int64 is a data type) in Julia? Or are they data structures, and what is a data structure? When I hear array I typically think about a 2D matrix, perhaps not the best way to imagine arrays in a programming context?
I realize that you could probably write entire books about what a data structure is, and I could certainly Google it. However, I find that people with a profound understanding of a subject are able to explain it in a much more succinct (and perhaps simplified) way than, say, the Wikipedia article could, which is why I'm asking you guys (and girls).
You seem like you get the mechanism and how/what they do but are struggling with what you would use it for. I get that.
I find them useful for things where I need to pass an unknown number of arguments and don't want to have to bother constructing an array first before passing it in when working with the function interactively.
for instance:
function geturls(urls::Vector)
    # some code to retrieve URLs from the network
end
geturls(urls...) = geturls([urls...])
# slightly nicer to type than building up an array first then passing it in.
geturls("http://google.com", "http://facebook.com")
# when we already have a vector we can pass that in as well since julia has method dispatch
geturls(urlvector)
So a few things to note. Splats allow you to turn an iterable into an array and vice versa. See the [urls...] bit above? Julia turns that into a Vector with the urls tuple expanded, which turns out to be much more useful than the argument splatting itself in my experience.
This is just one example of where they've proved useful to me. As you use Julia you'll run across more.
It's mostly there to aid in designing APIs that feel natural to use.
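A small sketch of both uses of ... may help. The names count_args and urlvector are made up, and the body of geturls is a placeholder (it just uppercases each string) rather than real network code.

```julia
# In a definition, args... collects all positional arguments into a tuple.
count_args(args...) = length(args)

urlvector = ["http://google.com", "http://facebook.com"]

n_without_splat = count_args(urlvector)      # one argument: the vector itself
n_with_splat    = count_args(urlvector...)   # two arguments: one per element

# In a call, xs... spreads a collection over the argument list.
largest = max([3, 1, 2]...)                  # same as max(3, 1, 2)

# The pattern from the answer: a Vector method does the work, and a
# varargs method forwards to it by rebuilding a Vector with [urls...].
geturls(urls::Vector) = map(uppercase, urls) # placeholder body
geturls(urls...) = geturls([urls...])

from_args   = geturls("http://google.com", "http://facebook.com")
from_vector = geturls(urlvector)
```

Both calls to geturls end up in the Vector method, so the caller can pass either separate arguments or an existing array.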

Array.isDefinedAt for n-dimensional arrays in scala

Is there an elegant way to express
val a = Array.fill(2,10) {1}
def do_to_elt(i:Int,j:Int) {
if (a.isDefinedAt(i) && a(i).isDefinedAt(j)) f(a(i)(j))
}
in scala?
I recommend that you not use arrays of arrays for 2D arrays, for three main reasons. First, it allows inconsistency: not all columns (or rows, take your pick) need to be the same size. Second, it is inefficient--you have to follow two pointers instead of one. Third, very few library functions exist that work transparently and usefully on arrays of arrays as 2D arrays.
Given these things, you should either use a library that supports 2D arrays, like scalala, or you should write your own. If you do the latter, among other things, this problem magically goes away.
So in terms of elegance: no, there isn't a way. But beyond that, the path you're starting on contains lots of inelegance; you would probably do best to step off of it quickly.
You just need to check the array at index i with isDefinedAt if it exists:
def do_to_elt(i:Int, j:Int): Unit =
if (a.isDefinedAt(i) && a(i).isDefinedAt(j)) f(a(i)(j))
EDIT: Missed that part about the elegant solution as I focused on the error in the code before your edit.
Concerning elegance: no, per se there is no way to express it in a more elegant way. Some might tell you to use the pimp-my-library-Pattern to make it look more elegant but in fact it does not in this case.
If your only use case is to execute a function with an element of a multidimensional array when the indices are valid, then this code does that and you should use it. You could generalize the method by changing the signature to take the function to apply to the element, and maybe a fallback value for when the indices are invalid, like this:
def do_to_elt[A](i: Int, j: Int)(f: Int => A, g: => A = ()) =
if (a.isDefinedAt(i) && a(i).isDefinedAt(j)) f(a(i)(j)) else g
but I would not change anything beyond this. This also does not look more elegant but widens your use case.
(Also: If you are working with arrays you mostly do that for performance reasons and in that case it might even be better to not use isDefinedAt but perform validity checks based on the length of the arrays.)
