How to sum multiple elements of nested arrays on unique keys [duplicate]

This question already has answers here:
How to condense summable metrics to a unique identifier in a ruby table
(3 answers)
Closed 6 years ago.
I have the following defined table. The index for each element in each row corresponds to the same field.
[[123.0, 23,"id1",34, "abc"],
[234.1,43, "id2", 24,"jsk"],
[423.5,53, "id1",1,"xyz"],
[1.4, 5, "id2",0,"klm"]]
In the above example, I need to group the rows by the unique identifier in the 3rd column and sum each of the summable elements at the same index. The result should look like this:
[[546.5,76, "id1",35],
[235.5,48, "id2",24]]
What's the best way to do this?

This is essentially the same as the solution to your previous question.
data = [ [ 123.0, 23, "id1", 34, "abc" ],
         [ 234.1, 43, "id2", 24, "jsk" ],
         [ 423.5, 53, "id1", 1, "xyz" ],
         [ 1.4, 5, "id2", 0, "klm" ] ]
sums = Hash.new {|h,k| h[k] = [0, 0, 0] }
data.each_with_object(sums) do |(val0, val1, id, val2, _), sums|
  sums[id][0] += val0
  sums[id][1] += val1
  sums[id][2] += val2
end
# => { "id1" => [ 546.5, 76, 35 ],
#      "id2" => [ 235.5, 48, 24 ] }
The main difference is that instead of giving the Hash a default value of 0, we're giving it a default proc that initializes missing keys with [0, 0, 0]. (We can't just do Hash.new([0, 0, 0]) because then every value would be a reference to a single Array instance, rather than each value having its own Array.) Then, inside the block, we add each value (val0 et al) to the corresponding elements of sums[id].
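The difference between the two forms of default can be seen in a small sketch (the variable names shared and fresh are only for illustration):

```ruby
# A default *value*: every missing key returns the very same Array object,
# and nothing is ever stored in the Hash.
shared = Hash.new([0, 0, 0])
shared["id1"][0] += 1
shared["id2"][0]   # => 1 (the one shared default Array was mutated)
shared.keys        # => []

# A default *proc*: each missing key gets its own Array, stored under that key.
fresh = Hash.new { |h, k| h[k] = [0, 0, 0] }
fresh["id1"][0] += 1
fresh["id2"][0]    # => 0
fresh.keys         # => ["id1", "id2"]
```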
If you wanted an Array of Arrays with the id at index 2 instead of a Hash, you would have to add something like this at the end:
.map {|id, vals| vals.insert(2, id) }
However, a Hash with the ids as keys makes more sense as a data structure.
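Putting the pieces together, here is a minimal sketch that returns the Array of Arrays asked for in the question. The method name sum_by_id is hypothetical, and it assumes every row has the same layout as the example table (two summable values, the id, a third summable value, then a label that is ignored):

```ruby
# Hypothetical helper; assumes each row is [val0, val1, id, val2, label].
def sum_by_id(rows)
  # Each new id gets its own [0, 0, 0] accumulator.
  sums = Hash.new { |h, k| h[k] = [0, 0, 0] }
  rows.each do |val0, val1, id, val2, _label|
    sums[id][0] += val0
    sums[id][1] += val1
    sums[id][2] += val2
  end
  # Re-insert the id at index 2; dup keeps the accumulators untouched.
  sums.map { |id, vals| vals.dup.insert(2, id) }
end

data = [[123.0, 23, "id1", 34, "abc"],
        [234.1, 43, "id2", 24, "jsk"],
        [423.5, 53, "id1", 1, "xyz"],
        [1.4, 5, "id2", 0, "klm"]]
p sum_by_id(data)
```

The rows come back in the order in which each id first appears, because Ruby Hashes preserve insertion order.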

Related

Find overlapping ranges between two int arrays and split/insert them

There are two arrays, each of which will always contain an even (though not necessarily equal) number of integers, so that each pair forms a range, e.g. 1..5, 8..12, etc.
var defaultArray: [Int] = [1, 5, 8, 12]
var priorityArray: [Int] = [1, 3, 5, 10, 13, 20]
What I'm looking for is a generic algorithm that will find each occurrence of where a range from priorityArray overlaps a range from defaultArray and will insert the priorityRange into the defaultArray while splitting the defaultRange apart if necessary.
The goal is to have a combined array of ranges while maintaining their original "types" like so:
var result: [Int] = [
1, 3, // priority
3, 5, // default
5, 10, // priority
10, 12, // default
13, 20 // priority
]
I'll use a simple struct to illustrate the final desired result:
var result: [Range] = [
Range(from: 1, until: 3, key: "priority"),
Range(from: 3, until: 5, key: "default"),
Range(from: 5, until: 10, key: "priority"),
Range(from: 10, until: 12, key: "default"),
Range(from: 13, until: 20, key: "priority")
]

We start with the arrays def and prio and first check that the intervals themselves are sorted with respect to their start/end points, so that each array has its smallest number in the first position. Also ensure that each array is well formed (no overlapping intervals within it). If that is not the case, simplify/sanitise the arrays first.
We then initialise
array index d=0 to index the def array
array index p=0 to index the prio array
a new array result to hold all your newly created intervals.
a variable s=none to hold the current status
We now determine the relation between def[d] and prio[p].
If def[d]<prio[p], we set t=def[d], increment d and set s=def.
If def[d]> prio[p], we set t=prio[p], increment p and set s=prio.
If they are equal, we set t=prio[p], increment p and d and set s=both.
We can now initialise a new entry for the result array with start=t. The priority is either def (if s==def) or prio (if s was prio or both). To determine the end, you can again compare def[d] with prio[p]. At this point, you should adjust s again, but make sure that you keep track of the proper state you are in (going from both to def, prio or none depending on the relation between def[d] and prio[p]). As mentioned in the comments on the question, the different possibilities might require more clarification, but you should be able to incorporate them into a state.
Going from there, you can keep iterating and adjusting your variables until both arrays are done (with d=len(def) and p=len(prio)). You should end up with an array containing all the desired consolidated intervals.
This is basically a stateful sweep through the 2 arrays, keeping track of the current position in the integer range and advancing 1 (maybe 2) position(s) at a time.
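As a rough illustration, here is a Ruby sketch that produces the desired result for the example input. It is not the exact state machine described above but a simpler variant: it keeps every priority range as is and cuts the overlapping parts out of each default range. The names pairs and merge_ranges are hypothetical, and the sketch assumes both arrays are sorted and free of internal overlaps, as required above.

```ruby
# Hypothetical helper: turns a flat [a, b, c, d] array into [[a, b], [c, d]].
def pairs(flat)
  flat.each_slice(2).to_a
end

# Returns [from, until, key] triples. Assumes both inputs are sorted and
# that the ranges within each input do not overlap one another.
def merge_ranges(default_flat, priority_flat)
  priority = pairs(priority_flat)
  result = priority.map { |from, to| [from, to, "priority"] }

  pairs(default_flat).each do |from, to|
    cursor = from # start of the part of this default range not yet handled
    priority.each do |p_from, p_to|
      next if p_to <= cursor # priority range lies entirely before the cursor
      break if p_from >= to  # priority range starts after this default range
      # Keep the uncovered piece of the default range before the priority range.
      result << [cursor, p_from, "default"] if p_from > cursor
      cursor = [cursor, p_to].max
    end
    # Whatever is left after the last overlapping priority range.
    result << [cursor, to, "default"] if cursor < to
  end

  result.sort_by { |from, to, _key| [from, to] }
end

merge_ranges([1, 5, 8, 12], [1, 3, 5, 10, 13, 20])
# => [[1, 3, "priority"], [3, 5, "default"], [5, 10, "priority"],
#     [10, 12, "default"], [13, 20, "priority"]]
```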

Pair and add corresponding elements in multiple arrays

I have JSON data like this:
[
[
"2020-05-07T16:30:00.000+0530",
1,
29,
693,
0,
7,
3663,
7413
],
[
"2020-05-07T15:30:00.000+0530",
0,
16,
996,
3,
13,
4452,
10106
]
]
Using jq, I want to add the corresponding elements of both arrays and produce a new array. For the date string, the value from either array will be fine. The expected output is:
[
"2020-05-07T16:30:00.000+0530",
1,
45,
1689,
3,
20,
8115,
17519
]
Can you please suggest a solution?

Pair corresponding elements using transpose, and create a new array with sums of them.
transpose | [.[0][0]] + map(add)[1:]
demo at jqplay.org
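For comparison, the same transpose-then-sum idea can be sketched in Ruby. Ruby is not part of the original answer, and add_rows is a hypothetical name; the sketch assumes every row has the date string first, followed by numbers.

```ruby
# Hypothetical helper: keeps the first row's date and sums the other columns.
def add_rows(rows)
  dates, *number_columns = rows.transpose
  [dates.first] + number_columns.map(&:sum)
end

rows = [
  ["2020-05-07T16:30:00.000+0530", 1, 29, 693, 0, 7, 3663, 7413],
  ["2020-05-07T15:30:00.000+0530", 0, 16, 996, 3, 13, 4452, 10106]
]
add_rows(rows)
# => ["2020-05-07T16:30:00.000+0530", 1, 45, 1689, 3, 20, 8115, 17519]
```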

Elasticsearch sorting by array column

How to sort records by column with array of numbers?
For example:
[1, 32, 26, 16]
[1, 32, 10, 1500]
[1, 32, 1, 16]
[1, 32, 2, 17]
The result that is to be expected:
[1, 32, 1, 16]
[1, 32, 2, 17]
[1, 32, 10, 1500]
[1, 32, 26, 16]
Elasticsearch has a sort mode option: https://www.elastic.co/guide/en/elasticsearch/reference/1.4/search-request-sort.html#_sort_mode_option. But none of its variants is appropriate here.
Ruby can sort arrays of number arrays: it has the method Array#<=>, whose description says "Each object in each array is compared".
How to do the same with elasticsearch?
P.S. Sorry for my English

In ElasticSearch arrays of objects do not work as you would expect:
Arrays of objects do not work as you would expect: you cannot query
each object independently of the other objects in the array. If you
need to be able to do this then you should use the nested datatype
instead of the object datatype.
This is explained in more detail in Nested datatype.
It is not possible to access array elements at sort time by their indices since they are stored in a Lucene index, which allows basically only set operations ("give docs that have array element = x" or "give docs that do not have array element = x").
However, by default the initial JSON document inserted into the index is stored on the disk and is available for scripting access in the field _source.
You have two options:
use script based sorting
store value for sorting explicitly as string
Let's discuss these options in a bit more detail.
1. Script based sorting
The first option is more like a hack. Let's assume you have a mapping like this:
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "my_array": {
          "type": "integer"
        }
      }
    }
  }
}
Then you can achieve intended behavior with a scripted sort:
POST my_index/my_type/_search
{
  "sort" : {
    "_script" : {
      "script" : "String s = ''; for(int i = 0; i < params._source.my_array.length; ++i) {s += params._source.my_array[i] + ','} s",
      "type" : "string",
      "order" : "asc"
    }
  }
}
(I tested the code on ElasticSearch 5.4, I believe there should be something equivalent for the earlier versions. Please consult relevant documentation in the case you need info for earlier versions, like for 1.4.)
The output will be:
"hits": {
"total": 2,
"max_score": null,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "2",
"_score": null,
"_source": {
"my_array": [
1,
32,
1,
16
]
},
"sort": [
"1,32,1,16,"
]
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_score": null,
"_source": {
"my_array": [
1,
32,
10,
1500
]
},
"sort": [
"1,32,10,1500,"
]
}
] }
Note that this solution will be slow and memory consuming since it will have to read _source for all documents under sort from disk and to load them into memory.
2. Denormalization
Storing the value for sorting explicitly as a string is more in line with the ElasticSearch approach, which favors denormalization. Here the idea would be to do the concatenation before inserting the document into the index and then sort by that string field.
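One caveat when building such a string: strings are compared character by character, so with plain concatenation "1,32,10,1500," sorts before "1,32,2,17,". A minimal sketch of a key that avoids this, written in Ruby only for illustration, pads every number to a fixed width. The helper name sort_key and the width of 10 digits are assumptions, and the sketch only covers non-negative integers.

```ruby
# Hypothetical helper: builds a string whose lexicographic order matches the
# element-by-element numeric order of the array.
# Assumes non-negative integers with at most `width` digits.
def sort_key(numbers, width: 10)
  numbers.map { |n| n.to_s.rjust(width, "0") }.join(",")
end

rows = [
  [1, 32, 26, 16],
  [1, 32, 10, 1500],
  [1, 32, 1, 16],
  [1, 32, 2, 17]
]

# Plain concatenation: 10 ends up before 2.
naive = rows.sort_by { |r| r.join(",") + "," }
# => [[1, 32, 1, 16], [1, 32, 10, 1500], [1, 32, 2, 17], [1, 32, 26, 16]]

# Zero-padded key: same order as Ruby's Array#<=>.
padded = rows.sort_by { |r| sort_key(r) }
# => [[1, 32, 1, 16], [1, 32, 2, 17], [1, 32, 10, 1500], [1, 32, 26, 16]]
```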
Please select the solution more appropriate for your needs.
Hope that helps!

How can I convert an array of integers into an array of digits for further processing?

I want to break given numbers into digits and sort. I expect to get:
unused_digits(2015, 8, 26) # => [0,1,2,2,5,6,8]
I tried:
def unused_digits(*x)
  x # => [2015, 8, 26]
  x = x.join.split "" # => [2, 0, 1, 5, 8, 2, 6]
  x = x.to_a # => [2, 0, 1, 5, 8, 2, 6]
  # other stuff here
  return x
end
If you are confused by the name "unused_digits", please ignore it and just treat it as "find_out_used_digits".
Originally I was going to find the unused digits, but I got stuck at the first stage, finding the used digits, so I only copied the first part of the code and left out the rest that finds the unused ones. My bad, apologies.

For the problem described in comments, here is the solution:
def unused_digits(*x)
  x.join.chars.sort.map(&:to_i)
end
unused_digits(2015,8,26)
#=> [0, 1, 2, 2, 5, 6, 8]
x is an array of arguments - [2015, 8, 26]
.join will join the arguments into a string and give us "2015826"
.chars will split the string into chars.
.sort will sort that character array
.map(&:to_i) will take each char and convert to number

TL;DR
Your question appears to be an X/Y problem, in large part because the name of your method (e.g. "unused_digits") doesn't actually seem to have anything to do with your expected return values. As originally posted, your method returns an array of used digits rather than unused digits.
If you truly want the return value to be [0,1,2,2,5,6,8] per your comment, then others have already posted useful answers. However, in the event that you actually want to return the digits that have not been used in any of your arguments (as suggested by your method name), then you may want to try the alternative described below.
Find Unused Digits with Array Difference
You can use various String functions to flatten an array of integers, and then use the Array difference method to return a de-duplicated list of unused digits. For example:
def unused_digits *integer_array
  Array(0..9) - integer_array.flatten.join.scan(/\d/).sort.map(&:to_i)
end
unused_digits 2015, 8, 26
#=> [3, 4, 7, 9]
unused_digits 2345678
#=> [0, 1, 9]
This will correctly return an array of digits that are not included in any passed arguments. This seems to be what is intended by your method name, but your mileage may certainly vary.

Beginning your function, you already have an array: [2015, 8, 26]. If that's what you want, then you don't have to do anything else.
By then calling split("") directly after join, you are converting your initial array into a string, then back into an array.
By way of an example, this is executing what is essentially the same code in irb, the interactive ruby shell:
>> digits = 2015,8,26
=> [2015, 8, 26]
>> joined = digits.join
=> "2015826"
>> split = joined.split("")
=> ["2", "0", "1", "5", "8", "2", "6"]
>> split.to_a
=> ["2", "0", "1", "5", "8", "2", "6"]
>> split.class
=> Array
As you can see, when you call join, your 2015,8,26 turns into "2015826", which is a string. After you call split(""), it becomes an array with each character as a separate element.
Calling to_a on what is already an array has no effect.
Hopefully that's helpful!

def unused_digits(*x)
  x.flat_map { |n| n.to_s.each_char.map(&:to_i) }.sort
end
unused_digits(2015,8,26)
#=> [0,1,2,2,5,6,8]

How to concatenate/flatten an object's VALUES to an array?

I have the following object:
languages:
  english: [ 1, 2, 3 ]
  german: [ 4, 5, 6 ]
My goal is to get an array of all values of languages so that the result looks like [ 1, 2, 3, 4, 5, 6 ].
This is what I have tried:
(word for word in value for key, value of languages)
or
(word for word in languages[lang] for lang in Object.keys languages)
Both methods return a two-dimensional array, with the arrays as the first dimension and the values as the second.
Is there a way to get the desired result using a one-liner?

Use the concat() function:
[1, 2, 3].concat [4, 5, 6]

Yes, you can:
[].concat (val for key, val of languages)...
or
Array::concat (val for key, val of languages)...
which are the same.
(val for key, val of languages) here is the array of all languages arrays to concatenate with one another.
The ... (splat) operator is just a shortcut for JavaScript's apply function.

I am not sure why it has to be in one line ... but here you have it in 2 LOC
result = []
result.splice(result.length, 0, languages[key]...) for key of languages