Datomic aggregates usage - datomic

I want to find the persons with the minimum age using the following query:
(d/q '[:find ?name (min ?age)
:in [[?name ?age]]]
[["John" 20]
["Bill" 25]
["Jack" 20]
["Steve" 28]
["Andrew" 30]])
But result is
[["Andrew" 30] ["Bill" 25] ["Jack" 20] ["John" 20] ["Steve" 28]]
How can I get only the persons with the minimum age?

This would be a pure Datalog solution
(let [db [["John" 20]
["Bill" 25]
["Jack" 20]
["Steve" 28]
["Andrew" 30]]]
(d/q '[:find ?name ?min-age
:in $ ?min-age
:where [?name ?min-age]]
db
(ffirst (d/q '[:find (min ?age)
:in [[?name ?age]]]
db))))
A HAVING clause like in SQL is not part of the query language, but since all queries are executed in the peer, there is no overhead in doing nested queries.

Instead of chaining the queries together, you could use a subquery (query call from within the query instead of outside of it):
(d/q '[:find ?name ?mage
:in $
:where [(datomic.api/q '[:find (min ?age)
:where [_ :age ?age]]
$) [[?mage]]]
[?name :age ?mage]]
[["John" :age 20]
["Bill" :age 25]
["Jack" :age 20]
["Steve" :age 28]
["Andrew" :age 30]])
Returns:
#{["John" 20] ["Jack" 20]}

You don't need Datomic in this case, since your sequence already holds all the data needed. Use Clojure's sort-by instead (note that taking the first element returns only one of the persons sharing the minimum age):
(first (sort-by second [...]))


Is there a better way of writing nested loops in clojure?

Is there a better way of implementing nested loops in clojure?
As a beginner I have written this code of nested loop for comparing difference between dates in days.
Comparing this with nested loops in java using for or while.
(def my-vector [{:course-type "clojure"
:start-date "2021-01-25"
:end-date "2021-02-06"}
{:course-type "r"
:start-date "2021-01-15"
:end-date "2021-02-06"}
{:course-type "python"
:start-date "2020-12-05"
:end-date "2021-01-05"}
{:course-type "java"
:start-date "2020-09-15"
:end-date "2020-10-20"}
])
(defn find-gap-in-course [mycourses]
(println "Finding gap between dates....")
(loop [[course1 & mycourses] mycourses]
(loop [[course2 & mycourses] mycourses]
(when (and
(and (not-empty course1) (not-empty course2))
(> (-> java.time.temporal.ChronoUnit/DAYS
(.between
(LocalDate/parse (course2 :end-date))
(LocalDate/parse (course1 :start-date)))) 30))
(println "Dates evaluated are =" (course2 :end-date) (course1 :start-date))
(println "Gap of > 30 days between dates ="
(-> java.time.temporal.ChronoUnit/DAYS
(.between
(LocalDate/parse (course2 :end-date))
(LocalDate/parse (course1 :start-date)))))
(do true)))
(do false)
(if course1 (recur mycourses))))
(find-gap-in-course my-vector)
Learning to program in Clojure requires that one learn to think a bit differently because the tricks and techniques which people become accustomed to using in imperative programming may not serve as well in Clojure. For example in a nested loop, such as you've shown above, what are you trying to do? You're trying to match all of the elements of mycourses against one another and do some processing. So let's define a function which gives us back all the combinations of elements in a collection 1:
(defn combos[c] ; produce all combinations of elements in a collection
(for [x c y c] (vector x y)))
This is a very simple function which matches all the elements of a collection against one another and returns the accumulated pairings. For example, if you invoke
(combos [1 2 3])
you'll get back
([1 1] [1 2] [1 3] [2 1] [2 2] [2 3] [3 1] [3 2] [3 3])
This will work with any collection. If you invoke combos as
(combos '("abc" 1 [0 9]))
you'll get back
(["abc" "abc"] ["abc" 1] ["abc" [0 9]] [1 "abc"] [1 1] [1 [0 9]] [[0 9] "abc"] [[0 9] 1] [[0 9] [0 9]])
So I think you can see where we're going here. Rather than running a nested loop against a collection, you can just create a collection of combinations of elements and run a simple loop over those combinations:
(defn find-gap-in-course [mycourses]
  (loop [course-combos (combos mycourses)]
    (when (seq course-combos) ; stop once every pairing has been processed
      (let [combi (first course-combos)
            [course1 course2] combi]
        ; ...processing of course1 and course2 here...
        (recur (rest course-combos))))))
But what if we don't want to consider the cases where a course is matched against itself? In that case another function to only return the desired cases is useful:
(defn different-combos [c] ; all combinations where [1] <> [2]
(filter #(not= (% 0) (% 1)) (combos c)))
Use whatever works best for you.
1 About here the Clojure cognoscenti are probably screaming "NO! NO! Use clojure.math.combinatorics!". When teaching I like to give useful examples which the student can see, read, and learn from. YMMV.
Here is how I would write the above code, starting from my favorite template project. I have included some unit tests to illustrate what is occurring in the code:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:import
[java.time LocalDate]))
(defn days-between
"Find the (signed) interval in days between two LocalDate strings."
[localdate-1 localdate-2]
(.between java.time.temporal.ChronoUnit/DAYS
(LocalDate/parse localdate-1)
(LocalDate/parse localdate-2)))
(dotest ; from tupelo.test
(is= -5 (days-between "2021-01-25" "2021-01-20"))
(is= 5 (days-between "2021-01-25" "2021-01-30")))
(defn course-pairs-with-30-day-gap
"Return a list of course pairs where the start date of the first course
is more than 30 days after the end of the second."
[courses]
(for [c1 courses
c2 courses
:let [start-1 (:start-date c1)
end-2 (:end-date c2)
gap-days (days-between end-2 start-1)]
:when (< 30 gap-days)]
[(:course-type c1) (:course-type c2) gap-days]))
with result
(dotest
(let [all-courses [{:course-type "clojure"
:start-date "2021-01-25"
:end-date "2021-02-06"}
{:course-type "r"
:start-date "2021-01-15"
:end-date "2021-02-06"}
{:course-type "python"
:start-date "2020-12-05"
:end-date "2021-01-05"}
{:course-type "java"
:start-date "2020-09-15"
:end-date "2020-10-20"}]]
(is= (course-pairs-with-30-day-gap all-courses)
[["clojure" "java" 97]
["r" "java" 87]
["python" "java" 46]])))
In the output, I left the names of course-1, course-2, and the gap in days to verify the calculation is the intended one. This could be modified or extended for production use, of course.
In Clojure we normally use pre-existing functions like for (technically a macro) instead of low-level tools like loop/recur. The modifiers :let and :when make it extra-powerful for analyzing & transforming data structures.
Please see this list of documentation sources,
especially books like Getting Clojure and the Clojure CheatSheet.

Ruby group array of hashes by numeric key

I got an array of hashes like this one:
[{1=>6}, {1=>5}, {4=>1}]
I try to group by the keys.
So the solution with named keys was like: group_by { |h| h['keyName'] }.
How can I get the following array with short Lambda expressions or with group_by:
[{1=>[5, 6], 4=>[1]}]
EDIT - To explain what I am trying to achieve:
I got a database to allocate pupils to courses.
Each pupil is able to vote each year for a course.
The votes look like this:
Vote(id: integer, first: integer, second: integer, active: boolean,
student_id: integer, course_id: integer, year_id: integer,
created_at: datetime, updated_at: datetime)
Now I would like to allocate the pupils automatically to a course, if the course is not overstaffed. To find out how many pupils voted for each course I first tried this:
Year.get_active_year.votes.order(:first).map(&:first).group_by(&:itself)
the result looks like this:
{1=>[1, 1], 4=>[4]}
Now I am able to use the .each function:
Year.get_active_year.votes.order(:first).map(&:first).group_by(&:itself).each do |_key, value|
if Year.get_active_year.courses.where(number: _key).first.max_visitor >= value.count
end
end
each course got an explicit number and the pupils just use the course number to vote.
But if I do all this, I lose the information which pupil voted for that course, so I tried to keep the information like this:
Year.get_active_year.votes.order(:first).map{|c| {c.first=> c.student_id}}
Injecting into a default hash:
arr = [{1=>6}, {1=>5}, {4=>1}]
arr.inject(Hash.new{|h,k| h[k]=[]}){|h, e| h[e.first[0]] << e.first[1]; h}
# => {1=>[6, 5], 4=>[1]}
Or, as suggested in the comments:
arr.each.with_object(Hash.new{|h, k| h[k] = []}){|e, h| h[e.first[0]] << e.first[1]}
# => {1=>[6, 5], 4=>[1]}
def group_values(arr)
arr.reduce(Hash.new {|h,k| h[k]=[]}) do |memo, h|
h.each { |k, v| memo[k] << v }
memo
end
end
xs = [{1=>6}, {1=>5}, {4=>1}]
group_values(xs) # => {1=>[6, 5], 4=>[1]}
Note that this solution also works when the hashes contain multiple entries:
ys = [{1=>6, 4=>2}, {1=>5}, {4=>1}]
group_values(ys) # => {1=>[6, 5], 4=>[2, 1]}
arr = [{1=>6}, {1=>5}, {4=>1}]
arr.flat_map(&:to_a).
group_by(&:first).
transform_values { |arr| arr.transpose.last }
#=> {1=>[6, 5], 4=>[1]}
The steps are as follows.
a = arr.flat_map(&:to_a)
#=> [[1, 6], [1, 5], [4, 1]]
b = a.group_by(&:first)
#=> {1=>[[1, 6], [1, 5]], 4=>[[4, 1]]}
b.transform_values { |arr| arr.transpose.last }
#=> {1=>[6, 5], 4=>[1]}
Note that
b.transform_values { |arr| arr.transpose }
#=> {1=>[[1, 1], [6, 5]], 4=>[[4], [1]]}
and, as long as every hash holds a single key-value pair, arr.flat_map(&:to_a) can be replaced with arr.map(&:flatten).
Another way:
arr.each_with_object({}) do |g,h|
k,v = g.flatten
h.update(k=>[v]) { |_,o,n| o+n }
end
#=> {1=>[6, 5], 4=>[1]}
This uses the form of Hash#update (aka merge!) that employs the block { |_,o,n| o+n } to determine the values of keys that are present in both hashes being merged. The block variable _ is the common key (represented by an underscore to signal that it is not used in the block calculations). The variables o and n are respectively the values of the common key in the two hashes being merged.
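To make the role of that block concrete, here is a minimal sketch using Hash#merge, the non-mutating counterpart of Hash#update. The hashes and variable names are made up for illustration; the point is only that the block runs for keys present in both hashes, while all other keys are copied over as they are.

```ruby
# Illustrative data: course number => list of pupil ids
votes_so_far = { 1 => [6], 4 => [1] }
new_votes    = { 1 => [5], 7 => [9] }

# The block is called only for key 1, the one key present in both hashes.
merged = votes_so_far.merge(new_votes) do |_key, old_value, new_value|
  old_value + new_value
end

# merged       == { 1 => [6, 5], 4 => [1], 7 => [9] }
# votes_so_far is unchanged, because merge returns a new hash;
# update (merge!) would have modified votes_so_far in place.
```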
One way to achieve this using #group_by could be to group by the first key of each hash, then #map over the result to return the corresponding values:
arr = [{1=>6}, {1=>5}, {4=>1}]
arr.group_by {|h| h.keys.first}.map {|k, v| {k => v.map {|h| h.values.first}}}
# => [{1=>[6, 5]}, {4=>[1]}]
Hope this helps!
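Coming back to the original problem of keeping track of which pupil voted for which course, the intermediate array of single-pair hashes may not be needed at all. The sketch below is only an illustration under stated assumptions: vote_class is a hypothetical Struct standing in for the Vote model, course_number stands in for the column holding the vote, and the capacity hash is invented for the example.

```ruby
# Hypothetical stand-in for the Vote model (not the real ActiveRecord class)
vote_class = Struct.new(:course_number, :student_id)
votes = [vote_class.new(1, 6), vote_class.new(1, 5), vote_class.new(4, 1)]

# Group the votes by course, then keep only the pupil ids per course.
students_by_course = votes
  .group_by(&:course_number)
  .transform_values { |course_votes| course_votes.map(&:student_id) }
# students_by_course == { 1 => [6, 5], 4 => [1] }

# Invented capacities: course number => maximum number of pupils
capacity = { 1 => 2, 4 => 0 }

# Courses whose votes fit within the capacity
within_capacity = students_by_course.select do |course, students|
  students.size <= capacity.fetch(course)
end
# within_capacity == { 1 => [6, 5] }
```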

Operator Array#<< failed in shorthand form of reduce

There is a well-known shorthand form for passing a block to any method, based on the Symbol#to_proc implementation.
Instead of:
[1,2,3].reduce(0) { |memo, e| memo + e }
# or
[1,2,3].reduce { |memo, e| memo.+(e) }
one might write:
[1,2,3].reduce &:+
The above is an exact “synonym” of the latter “standard notation.”
Let us now have two arrays:
a = [[1,"a"],[2,"b"]]
b = [[3,"c"],[4,"d"]]
While both
b.reduce(a) { |memo, e| memo << e }
# and
b.reduce(a) { |memo, e| memo.<<(e) }
will correctly update the array a in place, exactly as a.concat(b) would do:
#⇒ [[1,"a"], [2,"b"], [3,"c"], [4,"d"]]
the short notation all of a sudden raises an exception:
b.reduce(a) &:<<
#⇒ TypeError: [[1, "a"], [2, "b"]] is not a symbol
What am I missing? Ruby 2.1.
P.S. Prompted by this question.
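b.reduce(a) &:<<
won't work because Ruby parses it as (b.reduce(a)) & :<<. The call b.reduce(a) then has a single argument and no block, so reduce tries to use a as the name of a method, hence the TypeError. Instead, pass the symbol as the last argument, inside the parentheses: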
b.reduce(a, &:<<)
# => [[1, "a"], [2, "b"], [3, "c"], [4, "d"]]
When you call:
[1,2,3].reduce &:+
&:+ is an argument to the method. It's actually equivalent to:
[1,2,3].reduce(&:+)
If the last argument to a method is preceded by &, Ruby treats it as a Proc object (calling to_proc on it if necessary, which is the Symbol to Proc trick). It is then removed from the argument list and converted into a block, which the method receives like any other block.
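The difference between the working and the failing call can be checked directly. This is a small sketch, not an authoritative description of the parser; the variable names are illustrative, and a.dup is used only so that the original array a stays untouched between the calls.

```ruby
a = [[1, "a"], [2, "b"]]
b = [[3, "c"], [4, "d"]]

# &:<< inside the argument list is turned into a block via Symbol#to_proc.
with_symbol = b.reduce(a.dup, &:<<)
with_block  = b.reduce(a.dup) { |memo, e| memo << e }
# Both give [[1, "a"], [2, "b"], [3, "c"], [4, "d"]]

# Symbol#to_proc builds a proc that sends the method to its first argument.
append = :<<.to_proc
manual = append.call([1, 2], 3)
# manual == [1, 2, 3]

# With the symbol outside the parentheses, & is the binary operator, and
# reduce(a) runs first with a as its only argument and no block.
error_class =
  begin
    (b.reduce(a)) & :<<
    nil
  rescue TypeError => e
    e.class
  end
# error_class == TypeError
```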

Flattening an inner array

I'm trying to build an array whose elements have the form [@month, @monthly_count], so that the complete output looks like this: [["January", 0], ["February", 0], ["March", 0], ["April", 2], ["May", 3], ["June", 19], ["July", 0], ["August", 0], ["September", 0], ["October", 0], ["November", 0], ["December", 0]]
My code is:
@months = [["January"],["February"],["March"],["April"],["May"],["June"],["July"],["August"],["September"],["October"],["November"],["December"]]
@monthly_count = [[0], [0], [0], [2], [3], [19], [0], [0], [0], [0], [0], [0]]
@monthly_activity_count = Array.new(12){Array.new}
i = 0
12.times do |i|
@monthly_activity_count[i] << @months[i]
@monthly_activity_count[i] << @monthly_count[i]
@monthly_activity_count[i].flatten
i += 1
end
But it outputs:
[[["January"], [0]], [["February"], [0]], [["March"], [0]], [["April"], [2]], [["May"], [3]], [["June"], [19]], [["July"], [0]], [["August"], [0]], [["September"], [0]], [["October"], [0]], [["November"], [0]], [["December"], [0]]]
I tried to use array.flatten within the iterator to flatten each individual array while keeping the array bounds around each month, but this didn't work. How can I make the array correctly?
Try using flatten! in your code:
12.times do |i|
@monthly_activity_count[i] << @months[i]
@monthly_activity_count[i] << @monthly_count[i]
@monthly_activity_count[i].flatten!
i += 1
end
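flatten(level) flattens only the given number of levels. Called with 2 on the whole result it would also remove the per-month arrays, so apply flatten(1) to each pair instead:
[[["January"], [0]], [["February"], [0]], [["March"], [0]], [["April"], [2]], [["May"], [3]], [["June"], [19]], [["July"], [0]], [["August"], [0]], [["September"], [0]], [["October"], [0]], [["November"], [0]], [["December"], [0]]].map { |pair| pair.flatten(1) }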
For more information http://apidock.com/ruby/Array/flatten
@monthly_activity_count = @months.flatten.zip(@monthly_count.flatten)
If I understand what you're attempting to get at, you can probably start with a slightly simpler setup:
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
From there, you can use either of the following lines to produce the desired array (they're exactly equivalent):
months.map.with_index { |month, index| [month, index] }
months.collect.with_index { |month, index| [month, index] }
Both of those lines will iterate over your months array, passing the month name and its index into the block. The code block (the portion surrounded by { and }) returns an array containing just the month name and index, and all of those little arrays are grouped into a containing array because you're using map (or collect).
However, I can't quite imagine why you'd need such a structure. If you know the index of the month you want, you can get its name like so:
months[index]
If you know the month name and want to know its index, you can find out like so:
month_name = "March"
index = months.index { |month| month == month_name }
The index in months.index will iterate over all the months, passing each one into the code block. It expects you to supply a code block that will return true when you've located the object you want the index of. So the code block (month == month_name) is set up to match the name of the month passed in with the name stored in your month_name variable. See this article for more info on Ruby code blocks.
You can do it like so:
# only using some of the array to make the example easier to read
@months = [["January"],["February"],["March"],["April"],["May"]]
@monthly_count = [[0], [0], [0], [2], [3]]
## First Method
zipped = @months.zip(@monthly_count)
=> [[["January"], [0]], [["February"], [0]], [["March"], [0]], [["April"], [2]], [["May"], [3]]]
@monthly_activity_count = zipped.each { |pair| pair.flatten! }
=> [["January", 0], ["February", 0], ["March", 0], ["April", 2], ["May", 3]]
# could also do it as a one liner
@months.zip(@monthly_count).each { |pair| pair.flatten! }
# the flatten! in the line above is not bad, just a warning to help you understand
# the result. The difference between `flatten` and `flatten!` is that `flatten`
# will create a new array to hold the result, whereas `flatten!` will
# modify the array you call it on.
## Second Method
@months.flatten.zip(@monthly_count.flatten)
=> [["January", 0], ["February", 0], ["March", 0], ["April", 2], ["May", 3]]
## Ideal Method
# if you can get your months and monthly_count data as simple arrays, eg ["January",
# "February", ...] then you can remove the flattens in the previous line, giving:
@months.zip(@monthly_count)
See http://ruby-doc.org/core-1.9.3/Array.html#method-i-zip for the docs of the zip method.
See http://dablog.rubypal.com/2007/8/15/bang-methods-or-danger-will-rubyist for an explanation of ! (bang) methods in ruby.
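The difference between flatten and flatten! described in the comments above can be seen in a few lines. This is a minimal sketch with made-up variable names; it also shows that flatten! returns nil when there was nothing left to flatten, which is worth knowing before chaining further calls onto it.

```ruby
pair = [["January"], [0]]

copy = pair.flatten            # returns a new array
snapshot = pair.dup            # pair itself is still nested at this point

first_bang  = pair.flatten!    # modifies pair in place and returns it
second_bang = pair.flatten!    # nothing left to flatten, so this returns nil

# copy        == ["January", 0]
# snapshot    == [["January"], [0]]
# pair        == ["January", 0]
# second_bang == nil

# With plain (non-nested) input arrays, zip alone builds the pairs.
months = ["January", "February", "March"]
counts = [0, 0, 2]
monthly_activity = months.zip(counts)
# monthly_activity == [["January", 0], ["February", 0], ["March", 2]]
```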

Rules on Parenthesis for Block Variables

I ran across the following piece of code while reading The Ruby Way:
class Array
def invert
each_with_object({}).with_index { |(elem, hash), index| hash[elem] = index }
end
end
I want to make sure that I understand what the parentheses are doing in (elem, hash).
The first method (each_with_object({})) will yield two objects to the block. The first object will be the element in the array; the second object will be the hash. The parentheses make sure that those two objects are assigned to different block variables. If I had instead used { |elem, index| #code }, then elem would be an array consisting of the element and the hash. I think that is clear.
My confusion lies with the fact that if I didn't chain these two methods, I would not have to use the parentheses, and instead could use: each_with_object({}) { |elem, obj| #code }.
What are the rules about when parentheses are necessary in block variables? Why do they differ between the two examples here? My simplistic explanation is that, when the methods are not chained, then the yield code looks like yield(elem, obj), but when the methods are chained, the code looks like yield([elem, obj], index). (We can surmise that a second array would be passed in if we chained a third method). Is this correct? Is the object(s) passed in from the last chained method not an array?
I guess instead of all this conjecture, the question boils down to: "What does the yield statement look like when chaining methods that accept blocks?"
Your question is only tangentially concerned with blocks and block variables. Rather, it concerns the rules for "disambiguating" arrays.
Let's consider your example:
[1,2,3].each_with_object({}).with_index {|(elem, hash), index| hash[elem] = index}
We have:
enum0 = [1,2,3].each_with_object({})
#=> #<Enumerator: [1, 2, 3]:each_with_object({})>
We can see this enumerator's elements by converting it to an array:
enum0.to_a
#=> [[1, {}], [2, {}], [3, {}]]
We next have:
enum1 = enum0.with_index
#=> #<Enumerator: #<Enumerator: [1, 2, 3]:each_with_object({})>:with_index>
enum1.to_a
#=> [[[1, {}], 0], [[2, {}], 1], [[3, {}], 2]]
You might want to think of enum1 as a "compound enumerator", but it's just an enumerator.
You see that enum1 has three elements. These elements are passed to the block by Enumerator#each. The first is:
enum1.first
#=> [[1, {}], 0]
If we had a single block variable, say a, then
a #=> [[1, {}], 0]
We could instead break this down in different ways using "disambiguation". For example, we could write:
a,b = [[1, {}], 0]
a #=> [1, {}]
b #=> 0
Now let's stab out all the elements:
a,b,c = [[1, {}], 0]
a #=> [1, {}]
b #=> 0
c #=> nil
Whoops! That's not what we wanted. We've just experienced the "ambiguous" in "disambiguate". We need to write this so that our intentions are unambiguous. We do that by adding parentheses. By doing so, you are telling Ruby, "decompose the array in this position to its constituent elements". We have:
(a,b),c = [[1, {}], 0]
a #=> 1
b #=> {}
c #=> 0
Disambiguation can be extremely useful. Suppose, for example, a method returned the array:
[[1,[2,3],[[4,5],{a: 6}]],7]
and we wish to pull out all the individual values. We could do that as follows:
(a,(b,c),((d,e),f)),g = [[1,[2,3],[[4,5],{a: 6}]],7]
a #=> 1
b #=> 2
c #=> 3
d #=> 4
e #=> 5
f #=> {:a=>6}
g #=> 7
Again, you just have to remember that the parentheses simply mean "decompose the array in this position to its constituent elements".
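The same rule applies to block variables, since block parameters are bound much like the left-hand side of a multiple assignment. The following sketch uses an invented stock hash to show the effect of the parentheses, and adds a splat inside a nested position as a further illustration.

```ruby
stock = { apples: 3, pears: 5 }

# Without parentheses the first block variable receives the whole [key, value] pair.
whole_pairs = stock.each_with_index.map { |pair, _position| pair }
# whole_pairs == [[:apples, 3], [:pears, 5]]

# With parentheses the pair in that position is decomposed into its elements.
lines = stock.each_with_index.map do |(name, count), position|
  "#{position}: #{name}=#{count}"
end
# lines == ["0: apples=3", "1: pears=5"]

# A splat may be used inside a decomposed position as well.
(head, *tail), last = [[1, 2, 3], 4]
# head == 1, tail == [2, 3], last == 4
```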
The rule is basic: every enumerator has a “signature.” For example, if it yields two values, then the block passed to it should expect two parameters:
[1,2,3].each_with_index { |o, i| ...}
When one of the yielded values is itself an array (such as a hash item, which is a [key, value] pair), it can be expanded in place using parentheses. In other words, wherever the iterator yields an array, a [*arr]-like expansion is permitted.
The following example might shed a light on this:
[1,2,3].each_with_object('first') # yielding |e, obj|
.with_index # yielding |elem, idx|
# but wait! elem might be expanded here ⇑⇑⇑⇑
# |(e, obj), idx|
.each_with_object('second') do |((elem, fst_obj), snd_idx), trd_obj|
puts "e: #{elem}, 1o: #{fst_obj}, 2i: #{snd_idx}, 3o: #{trd_obj}"
end
#⇒ e: 1, 1o: first, 2i: 0, 3o: second
#⇒ e: 2, 1o: first, 2i: 1, 3o: second
#⇒ e: 3, 1o: first, 2i: 2, 3o: second
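One way to see what each form actually hands to the block is to collect the arguments with a splat parameter. This is a small experiment rather than a statement about the internals; :memo is just a placeholder object and the variable names are illustrative.

```ruby
# Unchained: the block receives two separate values, the element and the memo.
unchained = []
[1, 2].each_with_object(:memo) { |*args| unchained << args }
# unchained == [[1, :memo], [2, :memo]]

# Chained: with_index packs those two values into one array and adds the index,
# which is why the parentheses are needed to take the first value apart again.
chained = []
[1, 2].each_with_object(:memo).with_index { |*args| chained << args }
# chained == [[[1, :memo], 0], [[2, :memo], 1]]
```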
