I have an array of objects that I'd like to group by field1 and sum by field2. An example would be a class product that has a title field and a price field.
In an array of products, I have multiple gloves with different prices, and multiple hats with different prices. I'd like to have an array with distinct titles, that aggregate all the prices under the same title.
There's an obvious solution with iterating over the array and using a hash, but I was wondering if there was a "ruby way" of doing something like this? I've seen a lot of examples where Ruby has some unique functionality that applies well to certain scenarios and being a Ruby newbie I'm curious about this.
Thanks
There's a method transform_values added in ruby 2.4 or if you require 'active_support/all', with this you can do something like so:
products = [
{type: "hat", price: 1, name: "fedora"},
{type: "hat", price: 2, name: "sombrero"},
{type: "glove", price: 3, name: "mitten"},
{type: "glove", price: 4, name: "wool"}
]
result = products
.group_by { |product| product[:type] }
.transform_values { |vals| vals.sum { |val| val[:price] } }
# => {"hat"=>3, "glove"=>7}
It's a little unclear to me from the question as asked what your data looks like, so I ended up with this:
Product = Struct.new(:title, :price)
products = [
Product.new("big hat", 1),
Product.new("big hat", 2),
Product.new("small hat", 3),
Product.new("small hat", 4),
Product.new("mens glove", 8),
Product.new("mens glove", 9),
Product.new("kids glove", 1),
Product.new("kids glove", 2)
]
Given that data, this is how I'd go about building a data structure which contains the sum of all the prices for a given title:
sum_by_title = products.inject({}) do |sums, product|
if sums[product.title]
sums[product.title] += product.price
else
sums[product.title] = product.price
end
sums
end
This produces:
{"big hat"=>3, "small hat"=>7, "mens glove"=>17, "kids glove"=>3}
To explain:
Ruby inject takes an initial value and passes that to the iteration block as a "memo". Here, {} is the initial value. The return value from the block is passed into the next iteration as the memo.
The product.title is used as a hash key and the running sum is stored in the hash value. An if statement is necessary because the first time a product title is encountered, the stored value for that title is nil and cannot be incremented.
I probably wouldn't ship this code due to the hidden magic of the hash default value constructor but it's possible to write the same code without the if statement:
sum_by_title = products.inject(Hash.new { 0 }) do |sums, product|
sums[product.title] += product.price
sums
end
Hope you enjoy Ruby as much as I do!
Related
I have a Typescript object like this (the properties are made up, but the object is in the form listed below):
shipments = [
{
id: 1,
name: 'test',
age: 2,
orderNumber: null
},
{
id: 2,
name: 'test2',
age: 4,
orderNumber: '1434'
},
]
I need to write a postgresql statement that takes that array and puts it into a table that has columns id, name, age, and orderNumber. I can't do any iteration on the data (that's why I'm trying to stuff an array I already have using one import statement - because it's way faster than iteration). I need to take that array - without adding any kind of Typescript manipulation to it - and use it in an postgresql insert statement. Is this possible? To maybe make more clear what I want to do, I want to take the shipments object from above and insert it similar to what this insert statement would do.
INSERT INTO table (id, name, age, orderNumber) VALUES (1, 'test', 2', null), (2, 'test2', 4, '1434')
But more automated such as:
INSERT INTO table (variable_column_list) VALUES shipments_array_with_only_values_not_keys
I saw an example using json_to_recordset, but it wasn't working for me, so the use case may have been different.
This is what I am currently doing, using adonis and multiInsert; however, that only allows 1000 records at a time. I was hoping for all the records in one postgres statement.
await Database.connection(tenant).table(table).multiInsert(shipments)
Thanks in advance, for the help!
Are you sure you don't wanna use any ORM / libs for that?
You can generate SQL from array like this (not the best solution, just quick one):
const getQuery = shipments => `INSERT INTO table (${Object.keys(shipments[0]).map(key => '`' + key + '`').join(', ')})\nVALUES\n${shipments.map(row => `(${Object.values(row).map(value => value ? typeof value !== 'number' ? '`' + value + '`' : value : 'null').join(', ')})`).join(',\n')}`;
console.log(getQuery(shipments));
Output:
INSERT INTO table (`id`, `name`, `age`, `orderNumber`)
VALUES
(1, 'test', 2, null),
(2, 'test2', 4, '1434')
All records will be merged into one insert query, but:
Large amount of data per one query is unreasonable and causes crashes / freezes. So you still need to chunk data somehow (This question might be useful):
for (let chunk of chunks) {
await Database.rawQuery(getQuery(chunk))
}
No dynamic structure here! Each array element should have same structure with same set of keys in exactly same order:
interface IShipment {
id: number;
name: string;
age: number;
orderNumber?: string;
} // or whatever
const shipments: IShipment[] = [ ... ] // always should be
Interpolation depends of type: in shipments.map I interpolate numbers, and null(~ish) values. All other types I convert to string, which is OK for this case but completely wrong in general. For example or array should be stringified, not converted to string.
You need to deal with possible injections (SQL injection wiki page). For example with promise-mysql package you can use pool.escape() method for your values to prevent injection:
Object.values(row).map(value => pool.escape(value)); // or somehow else
Conclusion:
Push all records into one statement is not the best idea, especially on your own.
I suggest chunking & inserting via adonis you aleady used:
const shipments: IShipment[] = [ ... ] // any amount of records
cosnt chunkSize = 1000; // adonis limit
const chunks: IShipment[][] = shipments.reduce((resultArray, item, index) => {
const chunkIndex = Math.floor(index / chunkSize);
if(!resultArray[chunkIndex]) resultArray[chunkIndex] = [];
resultArray[chunkIndex].push(item);
return resultArray;
}, []); // split them into chunks
for (let chunk of chunks) {
await Database.rawQuery(getQuery(chunk)); // insert chunks one-by-one
}
I am currently building an iOS application that stores user added products using Google Firestore. Each product that is added is concatenated into a single, user specific "products" array (as shown below - despite having separate numbers they are part of the same array but separated in the UI by Google to show each individual sub-array more clearly)
I use the following syntax to return the data from the first sub-array of the "products" field in the database
let group_array = document["product"] as? [String] ?? [""]
if (group_array.count) == 1 {
let productName1 = group_array.first ?? "No data to display :("`
self.tableViewData =
[cellData(opened: false, title: "Item 1", sectionData: [productName1])]
}
It is returned in the following format:
Product Name: 1, Listing Price: 3, A brief description: 4, Product URL: 2, Listing active until: 21/04/2021 10:22:17
However I am trying to query each of the individual sections of this sub array, so for example, I can return "Product Name: 1" instead of the whole sub-array. As let productName1 = group_array.first is used to return the first sub-array, I have tried let productName1 = group_array.first[0] to try and return the first value in this sub-array however I receive the following error:
Cannot infer contextual base in reference to member 'first'
So my question is, referring to the image from my database (at the top of my question), if I wanted to just return "Product Name: 1" from the example sub-array, is this possible and if so, how would I extract it?
I would reconsider storing the products as long strings that need to be parsed out because I suspect there are more efficient, and less error-prone, patterns. However, this pattern is how JSON works so if this is how you want to organize product data, let's go with it and solve your problem.
let productRaw = "Product Name: 1, Listing Price: 3, A brief description: 4, Product URL: 2, Listing active until: 21/04/2021 10:22:17"
First thing you can do is parse the string into an array of components:
let componentsRaw = productRaw.components(separatedBy: ", ")
The result:
["Product Name: 1", "Listing Price: 3", "A brief description: 4", "Product URL: 2", "Listing active until: 21/04/2021 10:22:17"]
Then you can search this array using substrings but for efficiency, let's translate it into a dictionary for easier access:
var product = [String: String]()
for component in componentsRaw {
let keyVal = component.components(separatedBy: ": ")
product[keyVal[0]] = keyVal[1]
}
The result:
["Listing active until": "21/04/2021 10:22:17", "A brief description": "4", "Product Name": "1", "Product URL": "2", "Listing Price": "3"]
And then simply find the product by its key:
if let productName = product["Product Name"] {
print(productName)
} else {
print("not found")
}
There are lots of caveats here. The product string must always be uniform in that commas and colons must always adhere to this strict formatting. If product names have colons and commas, this will not work. You can modify this to handle those cases but it could turn into a bowl of spaghetti pretty quickly, which is also why I suggest going with a different data pattern altogether. You can also explore other methods of translating the array into a dictionary such as with reduce or grouping but there are big-O performance warnings. But this would be a good starting point if this is the road you want to go down.
All that said, if you truly want to use this data pattern, consider adding a delimiter to the product string. For example, a custom delimiter would greatly reduce the need for handling edge cases:
let productRaw = "Product Name: 1**Listing Price: 3**A brief description: 4**Product URL: 2**Listing active until: 21/04/2021 10:22:17"
With a delimiter like **, the values can contain commas without worry. But for complete safety (and efficiency), I would add a second delimiter so that values can contain commas or colons:
let productRaw = "name$$1**price$$3**description$$4**url$$2**expy$$21/04/2021 10:22:17"
With this string, you can much more safely parse the components by ** and the value from the key by $$. And it would look something like this:
let productRaw = "name$$1**price$$3**description$$4**url$$2**expy$$21/04/2021 10:22:17"
let componentsRaw = productRaw.components(separatedBy: "**")
var product = [String: String]()
for component in componentsRaw {
let keyVal = component.components(separatedBy: "$$")
product[keyVal[0]] = keyVal[1]
}
if let productName = product["name"] {
print(productName)
} else {
print("not found")
}
i struggle my had now sine several days with a way to do conditional filter of an object array with another object array.
lack on capabilities to properly abstract here... maybe you ahve some ideas.
I have a given Object Array A but more complex
var ArrA = [{
number: 1,
name: "A"
}, {
number: 2,
name: "C"
}]
And i want to filer for all results matiching id of Object Array B
var ArrB = [{
id: 1,
categorie: "wine"
}, {
id: 3,
categorie: "beer"
}, {
id: 10,
categorie: "juice"
}]
And in the best case moving this directly also together with an if condition.... but i was not able to handle it ... here is where i am now ... which is not working....
let newArray = ArrA.filter{$0.number == ArrB.... }.
if (newArray.count != 0){
// Do something
}
is there a lean way to compare one attribute of every object in an array with one attribute of another every object in an array ?
Lets break this down: You need all arrA objects that matches arrB ids, so first thing first you need to map your arrB to a list of id (because you dont need the other infos)
let arrBid = Set(arrB.map({ $0.id })) // [1, 3, 10]
As commented below, casting it to Set will give you better results for huge arrays but is not mandatory though
Then you just need to filter your first arrA by only keeping object that id is contained into arrBid :
let arrAFilter = arrA.filter({ arrBid.contains($0.number) })
[(number: 1, name: "A")]
and voila
I am creating a project where I have scraped data from a webpage for product info.
My scraping method returns a nested array with a collection of product_names, urls, and prices. I'm trying to use the nested array to create instances of my class Supplies, with attributes of a name, url, and price. The nested array has the same number of elements in each of the 3 array.
I want to ##all to return an array of instances of all products in the collection with their attributes set.
name_url_price = [[],[],[]]
class Catalog::Supplies
attr_accessor :name, :price, :url
##all = []
def initialize(name_url_price)
count = 0
while count <= name_url_price[0].length
self.name = name_url_price[0][count]
self.url = name_url_price[1][count]
self.price = name_url_price[2][count]
##all << self
count += 1
end
end
There's a lot going wrong here but nothing that can't be fixed:
Storing all the instances in a class variable is a little strange. Creating the instances and having the caller track them would be more common and less confusing.
Your while count <= ... loop isn't terminated with an end.
Your loop condition looks wrong, name_url_price[0] is the first element of name_url_price so name_url_price[0].length will, presumably, always be three. If name_url_price looks more like [ [names], [urls], [prices] ] then the condition is right but that's a strange and confusing way to store your data.
while loops for iteration are very rare in Ruby, you'd normally use name_url_place.each do ... end or something from Enumerable.
Your array indexing is possibly backwards, you want to say name_url_price[count][0], name_url_price[count][1], ... But see (3) if I'm misunderstanding how your data is structured.
Your ##all << self is simply appending the same object (self) to ##all over and over again. ##all will end up with multiple references to the same object and that object's attributes will match the last iteration of the while loop.
The initialize method is meant to initialize a single instance, having it create a bunch of instances is very strange and confusing.
It would be more common and generally understandable for your class to look like this:
class Catalog::Supplies
attr_accessor :name, :price, :url
def initialize(name, url, price)
self.name = name
self.url = url
self.price = price
end
end
And then your name_url_price array would look more like this:
name_url_price = [
[ 'name1', 'url1', 1 ],
[ 'name2', 'url2', 2 ],
[ 'name3', 'url3', 3 ],
[ 'name4', 'url4', 4 ]
]
and to get the supplies as objects, whatever wanted the list would say:
supplies = name_url_price.map { |a| Catalog::Supplies.new(*a) }
You could also use hashes in name_url_price:
name_url_price = [
{ name: 'name1', url: 'url1', price: 1 },
{ name: 'name2', url: 'url2', price: 2 },
{ name: 'name3', url: 'url3', price: 3 },
{ name: 'name4', url: 'url4', price: 4 }
]
and then create your instances like this:
supplies = name_url_price.map do |h|
Catalog::Supplies.new(
h[:name],
h[:url],
h[:price]
)
end
or like this:
supplies = name_url_price.map { |h| Catalog::Supplies.new(*h.values_at(:name, :url, :price)) }
I've 18 documents in my collection movie. For each movie for example:
{
title: "Test Movie 2",
date: [20130808, 20130606],
score: [ {"pete": 1, "mary": 1, "simon": 1, "pat": 2, "mike": 0},
{"pete": 5, "mary": 5, "simon": 5, "pat": 0, "mike": 5}]
}
Now, I want to show the date and sum of the second document in the array 'score' on the client, like:
<div class="details">
Test Movie 2: 20 points 20130606
</div>
Have somebody an idea how I can do that?
You could use a transform, it might be better to define each document with a name and explicitly defining the points as name/value pair instead of points being the value for each persons name.
But this should work:
Movies.find({}, {transform: function(doc) {
var total_points = 0;
var people = doc.score[1]; //Second in array
for(point in people) {
total_points += people[point]
}
doc.points = total_points;
return doc;
}});
This should give you:
{
title: "Test Movie 2",
points: 20,
date: [20130808, 20130606],
score: [ {"pete": 1, "mary": 1, "simon": 1, "pat": 2, "mike": 0},
{"pete": 5, "mary": 5, "simon": 5, "pat": 0, "mike": 5}]
}
Mongo can likely do this outright, but you're not going to be able to do this directly by querying a collection due to limitation of the Mongo livedata package as of 0.6.5. Aggregation framework is also off-limits, and they seem to have pulled the 'hidden' method that allowed direct access to Mongo.
So your options are:
Use transform as per Akshat's answer - the best of both worlds
Aggregate manually in the client in a template helper. I recommend using _.js which comes 'free' with Meteor (again this might change, but you could always pull the library in manually later).
var sum = _.reduce(score[1], function(memo, num){ return memo + num; }, 0);
(I didn't test the above, but it should send you on the right track).
Aggregate upstream, during the insert/update/deletes, likely by observing changes on the collection and 'feeding in' the sum() of the elements you are inserting in either the same collection or an aggregate one.
Which method you use depends on where performance matters most to you, usually doing aggregates before you insert tend to avoid issues later on, and lightens load.
Good luck and let us know how you get on.