How to do 'where model.field contains array' in Ecto?

How to do 'where model.field contains array' in Ecto? - arrays

Let's say I have a model with a field sizes that is an array (e.g. sizes: ['S', 'M', 'L']). What I want to accomplish, building an API, is for a user to be able to filter those models based on sizes. So a GET request to this path:
.../products?sizes=['S','M']
Should return all the products for which the given array is a subarray of their sizes field. So I don't want nor need an exact match but I want the users to be able to filter as explained above. How would I go about accomplishing this in my Phoenix API ?
I could only accomplish filtering the ones that contained a specific value (where: this in that), but if I pass in an array and I wanna check if that array is contained in that model field, I'm a bit lost.
Thanks in advance for any help, let me know if you need any additional information.
EDIT
I am trying to use fragment("? #> ?::varchar[]", p.sizes, ^params["sizes'] ) and it works, but it fails if I add any simple filter like [color: "red"] on top of the existing one, this means I can't create a set of filters and then add it to the where clauses like ... and ^filters
filters = Ecto.Changeset.cast(%Product{}, params, [], [:color])
|> Map.fetch!(:changes)
|> Map.to_list
# Do I need to actually do this check like this ? (It fails otherwise)
sizes = if is_nil(params["sizes"]) do [] else params["sizes"] end
products_query = from(
p in Product,
where: fragment("? #> ?::varchar[]", p.sizes, ^sizes) and
^filters
)
This doesn't currently work.

As you've figured out already, you need to use #> with fragment for the "array contains array" operation. For your second question, to chain where expressions with and, you can just add another where:
products_query = from(
p in Product,
where: fragment("? #> ?::varchar[]", p.sizes, ^sizes),
where: ^filters
)

Related

Filter Array For IDs Existing in Another Array with Ruby on Rails/Mongo

I need to compare the 2 arrays declared here to return records that exist only in the filtered_apps array. I am using the contents of previous_apps array to see if an ID in the record exists in filtered_apps array. I will be outputting the results to a CSV and displaying records that exist in both arrays to the console.
My question is this: How do I get the records that only exist in filtered_apps? Easiest for me would be to put those unique records into a new array to work with on the csv.
start_date = Date.parse("2022-02-05")
end_date = Date.parse("2022-05-17")
valid_year = start_date.year
dupe_apps = []
uniq_apps = []
# Finding applications that meet my criteria:
filtered_apps = FinancialAssistance::Application.where(
:is_requesting_info_in_mail => true,
:aasm_state => "determined",
:submitted_at => {
"$exists" => true,
"$gte" => start_date,
"$lte" => end_date })
# Finding applications that I want to compare against filtered_apps
previous_apps = FinancialAssistance::Application.where(
is_requesting_info_in_mail: true,
:submitted_at => {
"$exists" => true,
"$gte" => valid_year })
# I'm using this to pull the ID that I'm using for comparison just to make the comparison lighter by only storing the family_id
previous_apps.each do |y|
previous_apps_array << y.family_id
end
# This is where I'm doing my comparison and it is not working.
filtered_apps.each do |app|
if app.family_id.in?(previous_apps_array) == false
then #non_dupe_apps << app
else "No duplicate found for application #{app.hbx_id}"
end
end
end
So what am I doing wrong in the last code section?

Let's check your original method first (I fixed the indentation to make it clearer). There's quite a few issues with it:
filtered_apps.each do |app|
if app.family_id.in?(previous_apps_array) == false
# Where is "#non_dupe_apps" declared? It isn't anywhere in your example...
# Also, "then" is not necessary unless you want a one-line if-statement
then #non_dupe_apps << app
# This doesn't do anything, it's just a string
# You need to use "p" or "puts" to output something to the console
# Note that the "else" is also only triggered when duplicates WERE found...
else "No duplicate found for application #{app.hbx_id}"
end # Extra "end" here, this will mess things up
end
end
Also, you haven't declared previous_apps_array anywhere in your example, you just start adding to it out of nowhere.
Getting the difference between 2 arrays is dead easy in Ruby: just use -!
uniq_apps = filtered_apps - previous_apps
You can also do this with ActiveRecord results, since they are just arrays of ActiveRecord objects. However, this doesn't help if you specifically need to compare results using the family_id column.
TIP: Getting the values of only a specific column/columns from your database is probably best done with the pluck or select method if you don't need to store any other data about those objects. With pluck, you only get an array of values in the result, not the full objects. select works a bit differently and returns ActiveRecord objects, but filters out everything but the selected columns. select is usually better in nested queries, since it doesn't trigger a separate query when used as a part of another query, while pluck always triggers one.
# Querying straight from the database
# This is what I would recommend, but it doesn't print the values of duplicates
uniq_apps = filtered_apps.where.not(family_id: previous_apps.select(:family_id))
I highly recommend getting really familiar with at least filter/select, and map out of the basic array methods. They make things like this way easier. The Ruby docs are a great place to learn about them and others. A very simple example of doing a similar thing to what you explained in your question with filter/select on 2 arrays would be something like this:
arr = [1, 2, 3]
full_arr = [1, 2, 3, 4, 5]
unique_numbers = full_arr.filter do |num|
if arr.include?(num)
puts "Duplicates were found for #{num}"
false
else
true
end
end
# Duplicates were found for 1
# Duplicates were found for 2
# Duplicates were found for 3
=> [4, 5]
NOTE: The OP is working with ruby 2.5.9, where filter is not yet available as an array method (it was introduced in 2.6.3). However, filter is just an alias for select, which can be found on earlier versions of Ruby, so they can be used interchangeably. Personally, I prefer using filter because, as seen above, select is already used in other methods, and filter is also the more common term in other programming languages I usually work with. Of course when both are available, it doesn't really matter which one you use, as long as you keep it consistent.

EDIT: My last answer did, in fact, not work.
Here is the code all nice and working.
It turns out the issue was that when comparing family_id from the set of records I forgot that the looped record was a part of the set, so it would return it, too. I added a check for the ID of the array to match the looped record and bob's your uncle.
I added the pass and reject arrays so I could check my work instead of downloading a csv every time. Leaving them in mostly because I'm scared to change anything else.
start_date = Date.parse(date_from)
end_date = Date.parse(date_to)
valid_year = start_date.year
date_range = (start_date)..(end_date)
comparison_apps = FinancialAssistance::Application.by_year(start_date.year).where(
aasm_state:'determined',
is_requesting_voter_registration_application_in_mail:true)
apps = FinancialAssistance::Application.where(
:is_requesting_voter_registration_application_in_mail => true,
:submitted_at => date_range).uniq{ |n| n.family_id}
#pass_array = []
#reject_array = []
apps.each do |app|
family = app.family
app_id = app.id
previous_apps = comparison_apps.where(family_id:family.id,:id.ne => app.id)
if previous_apps.count > 0
#reject_array << app
puts "\e[32mApplicant hbx id \e[31m#{app.primary_applicant.person_hbx_id}\e[32m in family ID \e[31m#{family.id}\e[32m has registered to vote in a previous application.\e[0m"
else
<csv fields here>
csv << [csv fields here]
end
end
Basically, I pulled the applications into the app variable array, then filtered them by the family_id field in each record.
I had to do this because the issue at the bottom of everything was that there were records present in app that were themselves duplicates, only submitted a few days apart. Since I went on the assumption that the initial app array would be all unique, I thought the duplicates that were included were due to the rest of the code not filtering correctly.
I then use the uniq_apps array to filter through and look for matches in uniq_apps.each do, and when it finds a duplicate, it adds it to the previous_applications array inside the loop. Since this array resets each go-round, if it ever has more than 0 records in it, the app gets called out as being submitted already. Otherwise, it goes to my csv report.
Thanks for the help on this, it really got my brain thinking in another direction that I needed to. It also helped improve the code even though the issue was at the very beginning.

How to get all vector ids from Milvus2.0?

I used to use Milvus1.0. And I can get all IDs from Milvus1.0 by using get_collection_stats and list_id_in_segment APIs.
These days I am trying Milvus2.0. And I also want to get all IDs from Milvus2.0. But I don't find any ways to do it.

milvus v2.0.x supports queries using boolean expressions.
This can be used to return ids by checking if the field is greater than zero.
Let's assume you are using this schema for your collection.
referencing: https://github.com/milvus-io/pymilvus/blob/master/examples/hello_milvus.py
as of 3/8/2022
fields = [
FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
FieldSchema(name="random", dtype=DataType.DOUBLE),
FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim)
]
schema = CollectionSchema(fields, "hello_milvus is the simplest demo to introduce the APIs")
hello_milvus = Collection("hello_milvus", schema, consistency_level="Strong")
Remember to insert something into your collection first... see the pymilvus example.
Here you want to query out all ids (pk)
You cannot currently list ids specific to a segment, but this would return all ids in a collection.
res = hello_milvus.query(
expr = "pk >= 0",
output_fields = ["pk", "embeddings"]
)
for x in res:
print(x["pk"], x["embeddings"])
I think this is the only way to do it now, since they removed list_id_in_segment

Sitecore Solr Search by Multilist with Values from Another Multilist

I have a set of product items. Each product item has a multilist field that points to a set of product type items. When on a product page, I want to show a paged list of related items. These should be items that share a product type with the currently selected item. I'm running into some trouble because products can have multiple types. I need to split the type list on the current item and check that against the list of products in an expression. For some reason split and contains are throwing runtime exceptions and I can't really figure out why. I saw some things about the predicate builder being used for dynamic queries and I will try to use that with what I currently have but I'd like to know why this can't be done straight in the where clause.
Another issue I ran into is that the list of ids stored in solr are being stripped of their '{', '}', and '-' characters.

If you are already on the product page I assume you already have the product item and that product item should have a "ProductType" multilist field. You can use Sitecore.Data.Fields.MultilistFiled to avoid worrying about have to split the raw values.
You can then use Sitecore's Predicate Builder to build out your search predicate, which I assume you want to find all products that have one similar product type. You should adjust this search logic as needed. I am using the ObjectIndexerKey (see more here -> http://www.sitecore.net/Learn/Blogs/Technical-Blogs/Sitecore-7-Development-Team/Posts/2013/05/Sitecore-7-Predicate-Builder.aspx) to go after a named field, but you should build out a proper search model and actually define ProductTypes as a List< ID> or something similar. You may need to add other conditions to the search predicate as well such as path or templateid to limit your results. After that you can just execute the search and consume the results.
As far as Solr stripping the special characters, this is expected behavior based on the Analyzer used on the field. Sitecore and Solr will apply the proper query time analyzers to match things up so you shouldn't have to worry about formatting as long as the proper types are used.
var pred = PredicateBuilder.True<SearchResultItem>();
Sitecore.Data.Fields.MultilistField multilistField = Sitecore.Context.Item.Fields["ProductTypes"];
if (multilistField != null)
{
foreach (ID id in multilistField.TargetIDs)
{
pred = pred.Or(x => ((ID)x[(ObjectIndexerKey)"ProductType"]).Contains(id);
}
}
ISearchIndex _searchIndex = ContentSearchManager.GetIndex("sitecore_master_index"); // change to proper search index
using (var context = _searchIndex.CreateSearchContext())
{
var relatedProducts = context.GetQueryable<SearchResultItemModel>().Where(pred);
foreach(var relatedProduct in relatedProducts)
{
// do something here with search results
}
}

Just an Improvement to #Matt Gartman code,
The error (ID doesn't contain a definition for Contains) which keeps popping up is because .Contains is not a functionality of Type ID, I recommend you to Cast it in string type as below
foreach (ID id in multilistField.TargetIDs)
{
pred = pred.Or(x => (Convert.ToString((ID)x[(ObjectIndexerKey)"ProductType"]).Contains(id.toString())));
}

Slick: Difficulties working with Column[Int] values

I have a followup to another Slick question I recently asked (Slick table Query: Trouble with recognizing values) here. Please bear with me!! I'm new to databasing and Slick seems especially poor on documentation. Anyway, I have this table:
object Users extends Table[(Int, String)]("Users") {
def userId = column[Int]("UserId", O.PrimaryKey, O.AutoInc)
def userName = column[String]("UserName")
def * = userId ~ userName
}
Part I
I'm attempting to query with this function:
def findByQuery(where: List[(String, String)]) = SlickInit.dbSlave withSession {
val q = for {
x <- Users if foo((x.userId, x.userName), where)
} yield x
q.firstOption.map { case(userId, userName) =>
User(userId, userName)}
}
where "where" is a list of search queries //ex. ("userId", "1"),("userName", "Alex")
"foo" is a helper function that tests equality. I'm running into a type error.
x.userId is of type Column[Int]. How can one manipulate this as an Int? I tried casting, ex:
foo(x.userId.asInstanceOf[Int]...)
but am also experiencing trouble with that. How does one deal with Slick return types?
Part II
Is anyone familiar with the casting function:
def * = userId ~ userName <> (User, User.unapply _)
? I know there have been some excellent answers to this question, most notably here: scala slick method I can not understand so far and a very similar question here: mapped projection with companion object in SLICK. But can anyone explain why the compiler responds with
<> method overloaded
for that simple line of code?

Let's start with the problem:
val q = for {
x <- Users if foo((x.userId, x.userName), where)
} yield x
See, Slick transforms Scala expressions into SQL. To be able to transform conditions, as you want, into SQL statement, Slick requires some special types to be used. The way these types works are actually part of the transformation Slick performs.
For example, when you write List(1,2,3) filter { x => x == 2 } the filter predicate is executed for each element in the list. But Slick can't do that! So Query[ATable] filter { arow => arow.id === 2 } actually means "make a select with the condition id = 2" (I am skipping details here).
I wrote a mock of your foo function and asked Slick to generate the SQL for the query q:
select x2."UserId", x2."UserName" from "Users" x2 where false
See the false? That's because foo is a simple predicate that Scala evaluates into the Boolean false. A similar predicate done in a Query, instead of a list, evaluates into a description of a what needs to be done in the SQL generation. Compare the difference between the filters in List and in Slick:
List[A].filter(A => Boolean):List[A]
Query[E,U].filter[T](f: E => T)(implicit wt: CanBeQueryCondition[T]):Query[E,U]
List filter evaluates to a list of As, while Query.filter evaluates into new Query!
Now, a step towards a solution.
It seems that what you want is actually the in operator of SQL. The in operator returns true if there is an element in a list, eg: 4 in (1,2,3,4) is true. Notice that (1,2,3,4) is a SQL list, not Tuple like in Scala.
For this use case of the in SQL operator Slick uses the operator inSet.
Now comes the second part of the problem. (I renamed the where variable to list, because where is a Slick method)
You could try:
val q = for {
x <- Users if (x.userId,x.userName) inSet list
} yield x
But that won't compile! That's because SQL doesn't have Tuples the way Scala has. In SQL you can't do (1,"Alfred") in ((1,"Alfred"),(2, "Mary")) (remember, the (x,y,z) is the SQL syntax for lists, I am abusing the syntax here only to show that it's invalid -- also there are many dialects of SQL out there, it is possible some of them do support tuples and lists in a similar way.)
One possible solution is to use only the userId field:
val q = for {
x <- Users if x.userId inSet list2
} yield x
This generates select x2."UserId", x2."UserName" from "Users" x2 where x2."UserId" in (1, 2, 3)
But since you are explicitly using user id and user name, it's reasonable to assume that user id doesn't uniquely identify a user. So, to amend that we can concatenate both values. Of course, we need to do the same in the list.
val list2 = list map { t => t._1 + t._2 }
val q2 = for {
x <- Users if (x.userId.asColumnOf[String] ++ x.userName) inSet list2
} yield x
Look the generated SQL:
select x2."UserId", x2."UserName" from "Users" x2
where (cast(x2."UserId" as VARCHAR)||x2."UserName") in ('1a', '3c', '2b')
See the above ||? It's the string concatenation operator used in H2Db. H2Db is the Slick driver I am using to run your example. This query and the others can vary slightly depending on the database you are using.
Hope that it clarifies how slick works and solve your problem. At least the first one. :)

Part I:
Slick uses Column[...]-types instead of ordinary Scala types. You also need to define Slick helper functions using Column types. You could implement foo like this:
def foo( columns: (Column[Int],Column[String]), values: List[(Int,String)] ) : Column[Boolean] = values.map( value => columns._1 === value._1 && columns._2 === value._2 ).reduce( _ || _ )
Also read pedrofurla's answer to better understand how Slick works.
Part II:
Method <> is indeed overloaded and when types don't work out the Scala compiler can easily become uncertain which overload it should use. (We should get rid of the overloading in Slick I think.) Writing
def * = userId ~ userName <> (User.tupled _, User.unapply _)
may slightly improve the error message you get. To solve the problem make sure that the Column types of userId and userName exactly correspond to the member types of your User case class, which should look something like case class User( id:Int, name:String ). Also make sure, that you extend Table[User] (not Table[(Int,String)]) when mapping to User.

Mongoid Syntax Questions

1) Finding by instance object
Assuming I have the instance object called #topic. I want to retrieve the answers for this given topic. I was thinking I should be able to pass in :topics=>#topic, but i had to do the very ugly query below.
#answers = Answers.where(:topic_ids => {"$in" => [#topic.id]})
2) Getting the string representation of the id. I have a custom function (shown below). But shouldn't this be a very common requirement?
def sid
return id.to_s
end

If your associations are set up correctly, you should be able to do:
#topic.answers
It sounds like the above is what you are looking for. Make sure you have set up your associations correctly. Mongoid is very forgiving when defining associations, so it can seem that they are set up right when there is in fact a problem like mismatched names in references_many and referenced_in.
If there's a good reason why the above doesn't work and you have to use a query, you can use this simple query:
#answers = Answer.where(:topic_ids => #topic.id)
This will match any Answer record whose topic_ids include the supplied ID. The syntax is the same for array fields as for single-value fields like Answer.where(:title => 'Foo'). MongoDB will interpret the query differently depending on whether the field is an array (check if supplied value is in the array) or a single value (check if the supplied value is a match).
Here's a little more info on how MongoDB handles array queries:
http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-ValueinanArray