So I understand that Neo4j 3.5 and above implements full-text search in cypher query via createNodeIndex(), e.g.:
CALL db.index.fulltext.createNodeIndex("myIndex", ["PersonNode"], ["name"])
where myIndex is an arbitrary name I choose for the index, PersonNode is my node label, and name is the PersonNode property on which I want the full-text search performed.
And to actually perform the search by name, I can do something like the following:
CALL db.index.fulltext.queryNodes("myIndex", "Charlie")
But now assume that PersonNode has a relationship of type PURCHASED_ITEM, which is connected to another node label ProductNode as follows:
PersonNode-[:PURCHASED_ITEM]->ProductNode
And assume further that ProductNode has an attribute called productTitle indicating the display title name for each product.
My question is, I would like to set up an index for this relationship (using, presumably, createRelationshipIndex()), and perform a full-text search by productTitle and return a list of all PersonNode that purchased the given product. How can I do this?
Addendum: I understand that the above could be done by first getting a list of all ProductNode instances matching the given title, then performing a normal cypher query to extract all related PersonNode instances. I also understand that for the above example, a normal cypher query would be all that I need. But the reason I'm asking this question is that I eventually need to implement a single search bar that would allow the user to input any text, including possible misspellings and all, and have it perform a search through multiple attributes and/or relationships of PersonNode, and the results need to be sorted by some kind of relevance score. And in order to do this, I feel I need to first grasp exactly how the relationship queries work in neo4j.
Here is an example of how to create a full-text index for the productTitle property of PURCHASED_ITEM relationships:
CALL db.index.fulltext.createRelationshipIndex("myRelIndex", ["PURCHASED_ITEM"], ["productTitle"])
And here is a snippet showing the use of that index:
CALL db.index.fulltext.queryRelationships("myRelIndex", "Hula Hoop") YIELD relationship, score
...
productTitle is a property of the ProductNode node, not of the PURCHASED_ITEM relationship.
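Since productTitle lives on ProductNode, one way to get the buyers is to index the node instead of the relationship and then traverse PURCHASED_ITEM from each hit. The following is a minimal sketch, assuming the Neo4j 3.5/4.x procedure names used in the question; the index name "productIndex" and the helper fuzzy_query are hypothetical, and the code only builds the Cypher text and parameters (running them needs a driver session, which is not shown). The trailing ~ is the Lucene fuzzy operator, which tolerates misspellings.

```python
# Sketch only: builds Cypher text and parameters, does not talk to Neo4j.

# Full-text index on the node that actually owns productTitle.
# "productIndex" is a made-up index name.
CREATE_INDEX = (
    'CALL db.index.fulltext.createNodeIndex('
    '"productIndex", ["ProductNode"], ["productTitle"])'
)

# Query the index, then follow PURCHASED_ITEM back to the buyers,
# keeping the Lucene relevance score for sorting.
FIND_BUYERS = (
    'CALL db.index.fulltext.queryNodes("productIndex", $q) '
    'YIELD node, score '
    'MATCH (p:PersonNode)-[:PURCHASED_ITEM]->(node) '
    'RETURN p, node.productTitle AS title, score '
    'ORDER BY score DESC'
)

# Characters with special meaning in Lucene query syntax.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def fuzzy_query(text):
    """Escape each word and append ~ so it matches despite small typos.

    Operator words such as AND/OR/NOT are not handled in this sketch.
    """
    terms = []
    for word in text.split():
        escaped = "".join("\\" + c if c in LUCENE_SPECIAL else c for c in word)
        terms.append(escaped + "~")
    return " ".join(terms)


# Parameters to pass alongside FIND_BUYERS; "Hula Hop" is a deliberate typo.
params = {"q": fuzzy_query("Hula Hop")}
```

Because the score is returned per product hit, the same pattern can be repeated for other indexed labels and the results combined for a single search bar.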
I have this query and it refuses to use an index. I don't know whether that is because of the "Expand" stage in the plan or something else, but I can't get it to use an index in this form, especially for the ORDER BY clause: the planner still shows a "Sort" stage, which I'd like to avoid.
The index is on the createdAt property.
PROFILE
MATCH (u:User {user_id: '61c84762da4e457d55656efa'})-[follows:FOLLOWS]->(following:User)-[relatedTo:POSTED|SHARED]->(everything)
WHERE relatedTo.createdAt > datetime("2000-02-12T15:42:10.866+00:00")
RETURN u, relatedTo, everything
ORDER BY relatedTo.createdAt DESC
Here is a picture of the planner
The only way it does what I want is if I remove everything before the last relationship, which obviously defeats the point of the query, but it was just for testing.
PROFILE
MATCH (following:User)-[relatedTo:POSTED|SHARED]->(everything)
WHERE relatedTo.createdAt > datetime("2000-02-12T15:42:10.866+00:00")
RETURN relatedTo, everything
ORDER BY relatedTo.createdAt DESC
Now it uses the index.
Any ideas how I can get it to use an index for both the query and the sort?
I'm not entirely clear why you want to use an index?
In your first query an index is used to find the :User node and then relationship pointers are followed to find the other nodes of interest. In Neo4j following relationship pointers is always faster than trying to use an index to find nodes (unlike a relational database). Typically, you only want to use an index to find your start nodes in a path, which is what your first query is doing.
If you really want to split the query to start the index search in a different part of the path you could split the query into multiple parts using WITH.
I'm putting together a query to index medicines. A user should be able to enter their search term into a single search box. Their search term might be either a brand name for a drug, a generic name (the underlying compound on which all brands are based) or an indication and they should be returned a list of medicines that correspond to their search. I'd like to have a category facet for the type - either indication, brand or generic.
To have a category facet, my understanding is that I'd have to send my data through as one row per search term where that search term might be a brand, indication or a generic, rather than one row per brand with columns for generic list and indication. Is this correct or is there another way to get at what I'm wanting to do?
I hope I understand your ask here. From the screenshot you provided, I would assume what you would want to do is make the field "MedicineInformationType" a Facetable field in your Azure Search index and make the field "SearchTerm", "Product", "GenericList", and "ActionList" all Searchable fields in your Azure Search index (although I am not sure why you would want the "SearchTerm" field if the term in this field is already in one of the other fields).
If you structure your index this way, you can do a search for say "phosphate" and facet over the "MedicineInformationType" field to get a count of the results that are generic or brands.
For example (as a REST call):
search=phosphate&facet=MedicineInformationType
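As a rough illustration of where those two parameters go, here is a minimal sketch that assembles the full request URL. The service name "my-service", the index name "medicines", and the helper build_search_url are hypothetical, and the api-version value is an assumption; a real request also needs an api-key header, which is not shown.

```python
from urllib.parse import urlencode


def build_search_url(service, index, term, facet_field, api_version="2020-06-30"):
    """Build a GET URL that searches for `term` and facets on `facet_field`."""
    base = f"https://{service}.search.windows.net/indexes/{index}/docs"
    query = urlencode(
        {"api-version": api_version, "search": term, "facet": facet_field}
    )
    return f"{base}?{query}"


# Hypothetical service and index names, for illustration only.
url = build_search_url(
    "my-service", "medicines", "phosphate", "MedicineInformationType"
)
```

The facet counts come back alongside the matching documents, so one request can serve both the result list and the category sidebar.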
Suppose I want to create a recommendation system to suggest people you should connect with based off of certain attributes that I know about you and attributes I have about other people that are stored in a Solr index. Is it possible to query the index with a list of attributes (along with boosts for each attribute) and have Solr return scored results even if some of my fields return no matches? The way that I understand that Solr works is that if one of your fields doesn't contain a match in any documents found in your index, you get zero results for the entire query (even if other fields in the query matched) - is that right? What I would hope is that I could query the index and get a list of results back in order of a score given based on how many (and which) fields matched to something, even if some fields have no matches, for example:
Say that there are 2 people documents stored in the index as follows (figuratively):
Person 1:
Industry: Manufacturing
City: Oakland
Person 2:
Industry: Manufacturing
City: San Jose
And say that I perform a pseudo-Solr query that basically says "Search for everyone whose industry is equal to manufacturing and whose city is equal to Oakland". What I would like is to receive both results back in the result set, even though one of the "Persons" does not reside in Oakland. I just want that person to come back as a result with a lower score than Person1. Is this possible? What might a solr query look like to handle this? Assume that I have many more than 2 attributes for each person (so saying that I can use "And" and "Or" in my solr query isn't really feasible.. or is it?) Thanks in advance for your helpful input! (PS I'm using Solr 3.6)
You mention using the AND operator, which is likely your problem.
The default behavior of Lucene, and Solr, query syntax is exactly what you are asking for. A query like:
industry:manufacturing city:oakland
will match either clause, with a scoring preference for documents that match both. See the Lucene query syntax documentation.
The bq parameter (boost query), which the dismax and edismax query parsers support, does not affect matching; it only affects the scores.
http://localhost:8983/solr/persons/select?defType=edismax&q=industry:manufacturing&bq=City:Oakland^2
Play with the boosting factor at the end to get the right balance between the matching score and the boost.
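With many attributes, the boost clauses can be generated rather than written by hand. This is a minimal sketch, assuming the edismax parser is selected so that bq is honored; the helper boosted_select_url and the extra Title attribute are hypothetical, while the core name "persons" comes from the URL above.

```python
from urllib.parse import urlencode


def boosted_select_url(base, must, boosts):
    """Build a Solr select URL.

    must:   dict of field -> value that forms the main query (decides matching)
    boosts: list of (field, value, weight); each becomes one bq parameter,
            which only raises the score of documents that also match it
    """
    params = [
        ("defType", "edismax"),
        ("q", " ".join(f"{field}:{value}" for field, value in must.items())),
    ]
    for field, value, weight in boosts:
        params.append(("bq", f"{field}:{value}^{weight}"))
    return f"{base}/select?{urlencode(params)}"


url = boosted_select_url(
    "http://localhost:8983/solr/persons",
    {"industry": "manufacturing"},
    [("City", "Oakland", 2), ("Title", "Engineer", 1.5)],  # Title is made up
)
```

Everyone in manufacturing is returned; those in Oakland, or with the hypothetical title, simply rank higher.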
Suppose a user enters a two-word search. Since the default boolean operator is OR, all entries containing either or both words appear.
What I would like to know is whether results that satisfy the AND condition can be boosted.
With multiple words, can certain words be made to imply specific search constraints, or to boost certain parameters when they are present? For example, if the input is "with x and y without z", can I make Solr interpret it as (x AND y) AND (NOT z), or at least boost the entries that partially or fully meet the requirement?
EDIT:
I have tried using boost with edismax as shown here:
$query = $client->createSelect(); // create the select query
// Search for the same user input in each field of interest.
$query->setQuery('memberType:'.$searchQuery.' firstName:'.$searchQuery.' gender:'.$searchQuery);
$edismax = $query->getEDisMax();
$edismax->setQueryFields('firstName memberType^3 gender^2'); // per-field boosts
$query->setStart($start)->setRows($rows); // paging: result offset and number of rows to return
$query->setFields(array('id', 'firstName', 'lastName', 'eid', 'gender', 'memberType')); // fields to return
//$query->addSort('id', $query::SORT_ASC); // optional sorting
$resultSet = $client->select($query);
When I search for a name together with a member type, such as "sanjay candidate", I expect entries matching both sanjay and candidate first, then all users who are candidates, then all users named sanjay. Instead I get entries matching both, then all users named sanjay, and then all candidates.
I can't figure out what the issue is, or whether I can apply more customized boosting.
If you are using eDisMax, you have a whole collection of boosting options: phrase boosts, bigram boosts, a separate boost query, and so on. Read through the wiki page and experiment. You should not need to do any custom coding for this scenario.
I need to implement some further functionality; a picture of it is attached below. I've already built an application based on Solr search.
In short: the drop-down should contain similar search phrases within a specific category, along with the number of items found.
How can I make Solr collect this data, and how do I retrieve it?
Yes, you can do that in Solr using facets, which group results. By default, a facet returns each group name and the number of items found in it. You enable this by adding these two parameters to your query string: facet=true and facet.field=category.
An example query in your case will be
http://localhost:8983/solr/NAME_OF_YOUR_INDEX/select/?wt=json&indent=on&q=ipo&fl=category,name&facet=true&facet.field=category
Take a look at the tutorial for more details.
This is roughly equivalent to doing this in SQL:
SELECT category, COUNT(*) FROM items WHERE text LIKE '%ipo%' GROUP BY category;
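To feed a drop-down, the facet section of the JSON response has to be turned into category/count pairs. Here is a minimal sketch, assuming the default JSON layout in which each facet field is a flat list alternating value and count; the helper names and the sample response are made up for illustration, and NAME_OF_YOUR_INDEX is the same placeholder as in the URL above.

```python
from urllib.parse import urlencode


def facet_url(base, query, field):
    """Build a select URL that asks only for facet counts (rows=0)."""
    params = {
        "q": query,
        "wt": "json",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
    }
    return f"{base}/select?{urlencode(params)}"


def facet_counts(response, field):
    """Convert Solr's flat [value, count, value, count, ...] list to a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


url = facet_url("http://localhost:8983/solr/NAME_OF_YOUR_INDEX", "ipo", "category")

# Made-up response fragment showing the assumed layout.
sample = {"facet_counts": {"facet_fields": {"category": ["finance", 12, "news", 3]}}}
counts = facet_counts(sample, "category")
```

Each key of counts is a category label for the drop-down, and its value is the number of matching items in that category.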