Remove elements that are too close in time from a JSON array

I have an array of JSON objects, each holding a timestamp and some data.
Each element contains a timestamp (ts), an id and a userid, as shown below:
[
{
"id": "abc",
"ts": "2017-08-17T20:42:12.557229",
"userid": "seb"
},
{
"id": "def",
"ts": "2017-08-17T20:42:52.724773",
"userid": "seb"
},
{
"id": "ghi",
"ts": "2017-08-17T20:42:53.724773",
"userid": "matt"
},
{
"id": "jkl",
"ts": "2017-08-17T20:44:50.557229",
"userid": "seb"
},
{
"id": "mno",
"ts": "2017-08-17T20:44:51.724773",
"userid": "seb"
},
{
"id": "pqr",
"ts": "2017-08-17T20:50:52.724773",
"userid": "seb"
}
]
My goal is to remove objects that are too close to each other in time when the userid is the same: if the time difference is below 2 seconds, the element is removed.
From the list above, I should get this list:
[
{
"id": "abc",
"ts": "2017-08-17T20:42:12.557229",
"userid": "seb"
},
{
"id": "def",
"ts": "2017-08-17T20:42:52.724773",
"userid": "seb"
},
{
"id": "ghi",
"ts": "2017-08-17T20:42:53.724773",
"userid": "matt"
},
{
"id": "pqr",
"ts": "2017-08-17T20:50:52.724773",
"userid": "seb"
}
]
Even though the two objects for users seb and matt are less than 2 seconds apart, both must be kept because they do not belong to the same user:
"ts": "2017-08-17T20:42:52.724773" for seb
and
"ts": "2017-08-17T20:42:53.724773" for matt
Any idea how to code this in Ruby? So far I have always compared element n to element n-1 and deleted n-1 if needed.

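require 'time'
# `data` is the parsed array with string keys, e.g. data = JSON.parse(json)
result = []
timestamps = {} # userid => timestamp of the last element kept for that user
data.each do |item|
  ts = timestamps[item['userid']]
  # Keep the item if the user is new or the gap to the last kept item is at least 2 seconds
  if ts.nil? || Time.parse(item['ts']) - Time.parse(ts) >= 2
    result.push(item)
    timestamps[item['userid']] = item['ts']
  end
end
puts result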

What about the code below?
It changes the order of the records, but you could re-sort them if needed.
require 'date'
# DateTime subtraction returns a Rational number of days; convert it to whole seconds.
def time_elapsed_in_seconds(start_time, end_time)
  ((end_time - start_time) * 24 * 60 * 60).to_i
end
def too_close?(first_time, second_time, threshold = 2)
  time_elapsed_in_seconds(first_time, second_time) < threshold
end
def datetimes(a, b)
  [DateTime.parse(a), DateTime.parse(b)]
end
# A record is rejected when it belongs to the same user as the previous record
# and follows it by less than the threshold.
def should_reject_record?(record, next_record)
  record[:userid] == next_record[:userid] && too_close?(*datetimes(record[:ts], next_record[:ts]))
end
def filter_records(records)
  # Sort by user, then by timestamp, so each user's records are adjacent and in time order.
  sorted = records.sort_by { |record| [record[:userid], record[:ts]] }
  sorted.select.with_index do |record, index|
    # The first record has no predecessor, so it is always kept.
    index.zero? || !should_reject_record?(sorted[index - 1], record)
  end
end
records = [
{
"id": "abc",
"ts": "2017-08-17T20:42:12.557229",
"userid": "seb"
},
{
"id": "def",
"ts": "2017-08-17T20:42:52.724773",
"userid": "seb"
},
{
"id": "ghi",
"ts": "2017-08-17T20:42:53.724773",
"userid": "matt"
},
{
"id": "jkl",
"ts": "2017-08-17T20:44:50.557229",
"userid": "seb"
},
{
"id": "mno",
"ts": "2017-08-17T20:44:51.724773",
"userid": "seb"
},
{
"id": "pqr",
"ts": "2017-08-17T20:50:52.724773",
"userid": "seb"
}
]
puts filter_records(records)
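If the original order matters, one option is a single pass that remembers the last kept timestamp per user instead of sorting. This is only a minimal sketch: it assumes symbol keys and input that is already in chronological order, and `filter_preserving_order` is a hypothetical name, not something from the answers above.

```ruby
require 'time'

# Hypothetical helper: drops a record when the same user already has a kept
# record less than `threshold` seconds earlier. Input order is preserved.
def filter_preserving_order(records, threshold = 2)
  last_kept = {} # userid => Time of the last record kept for that user
  records.select do |record|
    time = Time.parse(record[:ts])
    previous = last_kept[record[:userid]]
    keep = previous.nil? || (time - previous) >= threshold
    last_kept[record[:userid]] = time if keep
    keep
  end
end

sample = [
  { id: 'abc', ts: '2017-08-17T20:42:12.557229', userid: 'seb' },
  { id: 'ghi', ts: '2017-08-17T20:42:53.724773', userid: 'matt' },
  { id: 'jkl', ts: '2017-08-17T20:44:50.557229', userid: 'seb' },
  { id: 'mno', ts: '2017-08-17T20:44:51.724773', userid: 'seb' }
]
kept_ids = filter_preserving_order(sample).map { |r| r[:id] }
puts kept_ids.inspect
```

Because each record is compared with the last kept record of the same user, a long run of closely spaced records collapses to its first element.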

Related

Grouping a collection in Laravel

I have an array called $customerRecords. I want to group the data in this array by the customer's email.
This is the array:
$customerRecords = [
{
"id": 1,
"note": "This is note 1",
"customer": [
{
"id": 1,
"user_id": 34,
"email": "doe#mailnator.com",
"phone": "9829484857"
}
]
},
{
"id": 2,
"note": "This is note 2",
"customer": [
{
"id": 2,
"user_id": 34,
"email": "john#mailnator.com",
"phone": "9829484857"
}
]
},
{
"id": 3,
"note": "This is a note 3",
"customer": [
{
"id": 2,
"user_id": 34,
"email": "john#mailnator.com",
"phone": "9829484857"
}
]
},
]
This is the expected result, so that I can see which records belong to each email:
{
"doe#mailnator.com": [
{
"id": 1,
"note": "This is note 1",
"customer": [
{
"id": 1,
"user_id": 34,
"email": "doe#mailnator.com",
"phone": "9829484857"
}
]
}
],
"john#mailnator.com": [
{
"id": 2,
"note": "This is note 2",
"customer": [
{
"id": 2,
"user_id": 34,
"email": "john#mailnator.com",
"phone": "9829484857"
}
]
},
{
"id": 3,
"note": "This is a note 3",
"customer": [
{
"id": 2,
"user_id": 34,
"email": "john#mailnator.com",
"phone": "9829484857"
}
]
}
]
}
This is what I have tried, but it's not working:
return collect($customerRecords)->groupBy('customer.email');
You are almost there. Because customer is an array, reference its first item (index 0) before email:
return collect($customerRecords)->groupBy('customer.0.email');
This is how I was able to solve it.
$grouped = [];
foreach ($customerRecords as $value) {
    foreach ($value['customer'] as $cust) {
        $grouped[$cust['email']][] = $value;
    }
}

Elasticsearch query: combine nested array of objects into one array

Using Elasticsearch I am trying to combine a nested array of objects into one array.
This is what my data looks like:
GET invoices/_search
{
"hits": [
{
"_index": "invoices",
"_id": "1234",
"_score": 1.0,
"_source": {
"id": 1234,
"status": "unpaid",
"total": 15.35,
"payments": [
{
"id": 1981,
"amount": 10,
"date": "2022-02-09T13:00:00+01:00"
},
{
"id": 1982,
"amount": 5.35,
"date": "2022-02-09T13:35:00+01:00"
}
]
}
},
# ... More hits
]
}
I want to only get the payments array of each hit combined into one array, so that it returns something like this:
{
"payments": [
{
"id": 1981,
"amount": 10,
"date": "2022-02-09T13:00:00+01:00"
},
{
"id": 1982,
"amount": 5.35,
"date": "2022-02-09T13:35:00+01:00"
},
{
"id": 5658,
"amount": 3,
"date": "2021-12-19T13:00:00+01:00"
}
]
}
I tried to get this result using a nested query but could not figure it out. The query I used is:
# Query I used:
GET invoices/_search
{
"_source": ["payments"],
"query": {
"nested": {
"path": "payments",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "payments.id"
}
}
]
}
}
}
}
}
# Result:
{
"hits": [
{
"_index": "invoices",
"_id": "545960",
"_score": 1.0,
"_source": {
"payments": [
{
"date": "2022-01-22T15:38:15+01:00",
"amount": 374.5,
"id": 320320
},
{
"date": "2022-01-22T15:30:03+01:00",
"amount": 160.5,
"id": 320316
}
]
}
},
{
"_index": "invoices",
"_id": "545961",
"_score": 1.0,
"_source": {
"payments": [
{
"date": "2022-01-22T15:38:15+01:00",
"amount": 12,
"id": 320350
},
{
"date": "2022-01-22T15:30:03+01:00",
"amount": 60.65,
"id": 320379
}
]
}
}
]
}
The result returns only the payments array but divided over multiple hits. How can I combine those arrays?
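One option is to combine the arrays on the client after the search returns, rather than in the query itself. The following is a minimal sketch in Ruby, assuming the response has been parsed into a hash shaped like the abbreviated result above (a full Elasticsearch response nests the list under hits.hits); the `response` and `combined` names are hypothetical.

```ruby
# Hypothetical parsed response, shaped like the abbreviated result above.
response = {
  'hits' => [
    { '_id' => '545960', '_source' => { 'payments' => [
      { 'id' => 320320, 'amount' => 374.5 },
      { 'id' => 320316, 'amount' => 160.5 }
    ] } },
    { '_id' => '545961', '_source' => { 'payments' => [
      { 'id' => 320350, 'amount' => 12 },
      { 'id' => 320379, 'amount' => 60.65 }
    ] } }
  ]
}

# Collect every hit's payments into one flat array; hits without a
# payments field contribute nothing.
combined = {
  'payments' => response['hits'].flat_map { |hit| hit.dig('_source', 'payments') || [] }
}
puts combined['payments'].map { |payment| payment['id'] }.inspect
```

This keeps the query unchanged and only reshapes the hits that come back, so it covers just the documents returned by that search request.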

Performance issue running mongodb aggregation

I need to run a query that joins documents from two collections. I wrote an aggregation query, but it takes too long when run against the production database, which has many documents. Is there a more efficient way to write this query?
Query in Mongo playground: https://mongoplayground.net/p/dLb3hsJHNYt
There are two collections: users and activities. I need to run a query that gets some users (from the users collection) along with their last activity (from the activities collection).
Database:
db={
"users": [
{
"_id": 1,
"email": "user1#gmail.com",
"username": "user1",
"country": "BR",
"creation_date": 1646873628
},
{
"_id": 2,
"email": "user2#gmail.com",
"username": "user2",
"country": "US",
"creation_date": 1646006402
}
],
"activities": [
{
"_id": 1,
"email": "user1#gmail.com",
"activity": "like",
"timestamp": 1647564787
},
{
"_id": 2,
"email": "user1#gmail.com",
"activity": "comment",
"timestamp": 1647564834
},
{
"_id": 3,
"email": "user2#gmail.com",
"activity": "like",
"timestamp": 1647564831
}
]
}
Inefficient Query:
db.users.aggregate([
{
// Get users using some filters
"$match": {
"$expr": {
"$and": [
{ "$not": { "$in": [ "$country", [ "AR", "CA" ] ] } },
{ "$gte": [ "$creation_date", 1646006400 ] },
{ "$lte": [ "$creation_date", 1648684800 ] }
]
}
}
},
{
// Get the last activity within the time range
"$lookup": {
"from": "activities",
"as": "last_activity",
"let": { "cur_email": "$email" },
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{ "$eq": [ "$email", "$$cur_email" ] },
{ "$gte": [ "$timestamp", 1647564787 ] },
{ "$lte": [ "$timestamp", 1647564834 ] }
]
}
}
},
{ "$sort": { "timestamp": -1 } },
{ "$limit": 1 }
]
}
},
{
// Remove users with no activity
"$match": {
"$expr": {
"$gt": [ { "$size": "$last_activity" }, 0 ] }
}
}
])
Result:
[
{
"_id": 1,
"country": "BR",
"creation_date": 1.646873628e+09,
"email": "user1#gmail.com",
"last_activity": [
{
"_id": 2,
"activity": "comment",
"email": "user1#gmail.com",
"timestamp": 1.647564788e+09
}
],
"username": "user1"
},
{
"_id": 2,
"country": "US",
"creation_date": 1.646006402e+09,
"email": "user2#gmail.com",
"last_activity": [
{
"_id": 3,
"activity": "like",
"email": "user2#gmail.com",
"timestamp": 1.647564831e+09
}
],
"username": "user2"
}
]
I'm more familiar with relational databases, so I'm struggling a little to run this query efficiently.
Thanks!

MongoDB : Update array in array if all objects of the array match multiple conditions

I want to update the grades array for a specific user. I want to push an object into grades only if the array has no object that matches both the semester and subject values.
Input :
{
"users": [
{
"userID": "id_1",
"grades": [
{
"semester": 1,
"subject": "math",
"value": 15
},
{
"semester": 1,
"subject": "french",
"value": 15
}
]
},
{
"userID": "id_2",
"grades": [
{
"semester": 1,
"subject": "math",
"value": 18
}
]
}
]
}
For example if I want to push :
{
"semester": 2,
"subject": "french",
"value": 16
}
for userID = id_1.
The result is :
{
"users": [
{
"userID": "id_1",
"grades": [
{
"semester": 1,
"subject": "math",
"value": 15
},
{
"semester": 1,
"subject": "french",
"value": 15
},
{
"semester": 2,
"subject": "french",
"value": 16
}
]
},
{
"userID": "id_2",
"grades": [
{
"semester": 1,
"subject": "math",
"value": 18
}
]
}
]
}
But also if I try to push
{
"semester": 1,
"subject": "french",
"value": 10
}
for userID = id_1.
It should not update, because there is already an object that matches "semester" : 1 and "subject" : "french".
I tried using arrayFilters with an array identifier to filter on userID first, but then I could not apply the push condition to the grades array.
{"$push":{ "users.$[user].grades": { "semester": 1, "subject": "math", "value" : 10 } }}
arrayFilters = [{"user.userID" : "id_1"}]
Thank you in advance for the help.
You could combine $not with $elemMatch to achieve this. The filter translates to: if no element of grades matches the semester and subject of the new grade, then add it.
Here's how you can do it:
db.users.update({
"userID": "id_1",
"grades": {
"$not": {
"$elemMatch": {
"semester": 2,
"subject": "french"
}
}
}
}, {
"$push": {
"grades": {
"semester": 2,
"subject": "french",
"value": 16
}
}
})
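The same "add only if no element matches" condition can be sketched on in-memory data to make the logic explicit. This is plain Ruby, not a MongoDB call, and `push_grade_unless_present` is a hypothetical helper used only for illustration.

```ruby
# Hypothetical in-memory helper (plain Ruby, not a MongoDB call): appends
# `grade` only when no existing grade has the same semester and subject.
def push_grade_unless_present(user, grade)
  exists = user[:grades].any? do |g|
    g[:semester] == grade[:semester] && g[:subject] == grade[:subject]
  end
  user[:grades] << grade unless exists
  !exists # true when the grade was added
end

user = {
  userID: 'id_1',
  grades: [
    { semester: 1, subject: 'math', value: 15 },
    { semester: 1, subject: 'french', value: 15 }
  ]
}
added_new = push_grade_unless_present(user, { semester: 2, subject: 'french', value: 16 })
added_dup = push_grade_unless_present(user, { semester: 1, subject: 'french', value: 10 })
puts added_new, added_dup, user[:grades].size
```

The `any?` check plays the role of $elemMatch, and the `unless exists` guard plays the role of $not in the update filter.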

Merge array of hash with same key

I have an array of hashes as shown here. I want to merge the values of some fields using custom separators. I show only two hashes in the array here, but there can be more. They always appear in the same sequence as shown.
{
"details": [
{
"place": "abc",
"group": 3,
"year": 2006,
"id": 1304,
"street": "xyz 14",
"lf_number": "0118",
"code": 4433,
"name": "abc coorperation",
"group2": 3817,
"group1": 32,
"postal_code": "22926",
"status": 2
},
{
"place": "cbc",
"group": 2,
"year": 2007,
"id": 4983,
"street": "mnc 14",
"lf_number": "0145",
"code": 4433,
"name": "abc coorperation",
"group2": 3817,
"group1": 32,
"postal_code": "22926",
"status": 2
}
],
"@timestamp": "2017-09-04",
"parent": {
"child": [
{
"w_2": 0.5,
"w_1": 0.1,
"id": 14226,
"name": "air"
},
{
"w_2": null,
"w_1": 91,
"id": 25002,
"name": "Water"
}]
},
"p_name": "anacin",
"@version": "1",
"id": 28841
}
I want to replace details with two new fields built from its values.
Field 1) coorperations: (details.name | details.postal_code details.street ; details.name | details.postal_code details.street)
Output:
Coorperations: (abc coorperation |22926 xyz 14; abc coorperation | 22926 mnc 14)
Field 2) access_code: (details.status-details.id-details.group1-details.group2-details.group(always two digit)/details.year(only last two digits); details.status-details.id-details.group1-details.group2-details.group(always two digit)/details.year(only last two digits))
Output: access_code (2-32-3817-03-06; 2-32-3817-02-07)
How can I achieve this for all the values in details? Here is how the final result should look:
{
"@timestamp": "2017-09-04",
"parent": {
"child": [
{
"w_2": 0.5,
"w_1": 0.1,
"id": 14226,
"name": "air"
},
{
"w_2": null,
"w_1": 91,
"id": 25002,
"name": "Water"
}]
},
"p_name": "anacin",
"@version": "1",
"id": 28841,
"Coorperations" : "abc coorperation |22926 xyz 14; abc coorperation | 22926 mnc 14",
"access_code" : "2-32-3817-03-06; 2-32-3817-02-07"
}
You can try running this code in the Rails console, where hash is your JSON parsed into a hash:
new_hash = hash.except(:details)
elements = hash[:details]
# "name | postal_code street" for each entry, joined with "; "
coorperations = elements.map { |element| "#{element[:name]} | #{element[:postal_code]} #{element[:street]}" }.join('; ')
# status-group1-group2-group-year, with group zero-padded to two digits and only the
# last two digits of year, matching the expected output "2-32-3817-03-06"
access_code = elements.map do |element|
  format('%s-%s-%s-%02d-%02d', element[:status], element[:group1], element[:group2], element[:group], element[:year] % 100)
end.join('; ')
new_hash.merge!(Coorperations: coorperations)
new_hash.merge!(access_code: access_code)
new_hash
