How to match on common visited cities using arrays

The goal is to match each Special with each Normal on the condition that they have a city in common. The Special's name should appear in column E and the Normal's name in column F; they are matched because they are in, or have visited, the same city.
The data dump will be updated on a daily basis. For example, Caroline is type Special and has visited Cambridge. If a type Normal has also visited Cambridge, the two are matched because both visited Cambridge.
Another example is Regina, who is type Special. We want to match her with a type Normal, but in this case there are no matches, so she would not appear in the output.
Only those tagged as Special or Normal can be matched. A match is always between a Special and a Normal, and the only requirement is that their names appear against the same visited city.
I thought the FILTER function would work and automatically detect a Special and a Normal whenever they share a city keyword, but it isn't working.
GoogleSheets

try:
=QUERY(A1:D17, "select max(A),B where D is not null group by B pivot D", 1)
update:
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(FILTER(A2:A17&"×"&B2:B17, D2:D17="special")&"×"&
TRANSPOSE(FILTER(A2:A17&"×"&B2:B17, D2:D17="normal"))), "×"),
"select Col1,Col3,Col4 where Col2=Col4", ))

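For anyone who wants to see the logic of the second formula outside a spreadsheet, here is a minimal Python sketch. It assumes column A holds the name, column B the city and column D the type, which is what the formula implies. Only Caroline, Cambridge and Regina come from the question; Regina's city and the two Normal rows are invented sample data.

```python
from itertools import product

# Hypothetical rows: (name, city, type). Everything except Caroline,
# Cambridge and Regina is invented for illustration.
rows = [
    ("Caroline", "Cambridge", "special"),
    ("Regina", "Oslo", "special"),
    ("Tom", "Cambridge", "normal"),
    ("Anna", "Lisbon", "normal"),
]


def match_by_city(rows):
    """Pair every special with every normal and keep pairs sharing a city."""
    specials = [(name, city) for name, city, kind in rows if kind == "special"]
    normals = [(name, city) for name, city, kind in rows if kind == "normal"]
    return [
        (s_name, n_name, s_city)
        for (s_name, s_city), (n_name, n_city) in product(specials, normals)
        if s_city == n_city
    ]


print(match_by_city(rows))
```

The cross product plays the role of FLATTEN over the special column joined with the transposed normal row, and the final condition corresponds to `where Col2=Col4`. Regina has no Normal in her city, so she drops out.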

Why duplicated results query with cypher neo4j

I am working with the example Movies database in Neo4j. I have already searched for information about duplicated rows, but I still have doubts.
I am using XOR in this query:
MATCH (m:Movie)<-[r]-(p:Person)
WHERE m.title STARTS WITH 'The'
XOR (m.released = 1999 OR m.released = 2003)
RETURN m.title, m.released
So, my result is
As you can see, there are duplicated rows. I don't understand why this happens, or what determines the number of duplicates.
I know that DISTINCT removes duplicates, but I am interested in understanding why the query duplicates the results and what the number of duplicates depends on.
This is because you are matching
MATCH (m:Movie)<-[r]-(p:Person)
The movie title is returned once for each person related to the movie, so if four people are connected to the movie, you get the title back four times. You can avoid the duplicates by matching only the movie:
MATCH (m:Movie)
As Tomaz said, it is returning a row for every :Person that has a relationship to :Movie. If you concluded your query with just RETURN m and viewed the results, you probably would only see non-duplicated nodes appear. Otherwise, you can conclude the query with RETURN DISTINCT m to ensure that non-duplicated results are returned.
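To make the row count concrete, here is a small Python sketch of what the pattern match does. The movies, people and relationship types below are hypothetical stand-ins, not the real Movies dataset; each tuple represents one (person)-[r]->(movie) relationship.

```python
# Hypothetical data: movie title -> release year.
movies = {"The Example": 1990, "Another Film": 1999}

# One tuple per relationship pointing at a movie (illustrative only).
relationships = [
    ("Person A", "ACTED_IN", "The Example"),
    ("Person B", "ACTED_IN", "The Example"),
    ("Person C", "DIRECTED", "The Example"),
    ("Person D", "ACTED_IN", "Another Film"),
]


def match_rows():
    """Emit one row per matching relationship, like MATCH (m)<-[r]-(p)."""
    rows = []
    for _person, _rel_type, title in relationships:
        year = movies[title]
        # Python's != on booleans behaves like Cypher's XOR.
        if title.startswith("The") != (year in (1999, 2003)):
            rows.append((title, year))
    return rows


rows = match_rows()
print(rows)               # one row per relationship
print(sorted(set(rows)))  # what DISTINCT would leave
```

In this sketch "The Example" appears three times because three relationships point at it, so the number of duplicates for a movie equals the number of relationships matched by the pattern.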

Entity Framework complex search function

I'm using Entity Framework with a SQL Express database, and I have to write a search function that finds users based on a value typed into a textbox, where the end user can type in anything he wants (like Google).
What is the best way to create a search function for this? The input should search all columns.
So for example, I have 4 columns. firstname,lastname,address,emailaddress.
When someone types in the searchbox foo, all columns need to be searched for everything that contains foo.
So I thought I just could do something like
context.Users.Where(u =>
    u.Firstname.Contains("foo") ||
    u.Lastname.Contains("foo") ||
    u.Address.Contains("foo") ||
    u.EmailAddress.Contains("foo")
);
But... The end user may also type in foo bar. And then the space in the search value becomes an and requirement. So all columns should be searched and for example firstname might be foo and lastname can be bar.
I think this is too complex for a LINQ query?
Maybe I should create a search index and combine all columns into the search index like:
[userId] [indexedValue] where indexedValue is [firstname + " "+ lastname + " "+ address +" " + emailaddress].
Then first split the search value based on spaces and then search for columns that have all words in the search value. Is that a good approach?
The first step with any project is managing expectations. Find the minimum viable solution for the business's need and develop that, then expand on it as the business value is proven. A really flexible, intelligent-feeling search capability would of course make the business happy, but it often fails to do what they expect or to perform to the standard they need, where a simpler solution would do what they need, be simpler to develop, and execute faster.
If this represents the minimum viable solution and you want to "and" conditions based on spaces:
public IQueryable<User> SearchUser(string criteria)
{
    // Nothing to search for: return an empty result.
    if (string.IsNullOrEmpty(criteria))
        return new List<User>().AsQueryable();

    var criteriaValues = criteria.Split(' ');
    var query = context.Users.AsQueryable();

    // Each chained Where is AND-ed with the previous ones, so every
    // term must match at least one of the four columns.
    foreach (var value in criteriaValues)
    {
        query = query.Where(u =>
            u.Firstname.Contains(value)
            || u.Lastname.Contains(value)
            || u.Address.Contains(value)
            || u.EmailAddress.Contains(value));
    }
    return query;
}
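The same "AND across terms, OR across columns" logic can be sketched in memory, which makes it easy to try out before wiring it to a database. This is a minimal Python illustration, not Entity Framework code; the `User` class and field names are hypothetical, and the lowercasing is an assumption meant to approximate a case-insensitive database collation.

```python
from dataclasses import dataclass


@dataclass
class User:
    firstname: str
    lastname: str
    address: str
    email: str


def search_users(users, criteria):
    """Return users for which every term matches at least one field."""
    terms = criteria.split()  # split() with no argument drops empty terms
    if not terms:
        return []

    def matches(user, term):
        term = term.lower()
        fields = (user.firstname, user.lastname, user.address, user.email)
        return any(term in field.lower() for field in fields)

    return [u for u in users if all(matches(u, t) for t in terms)]


users = [
    User("Foo", "Bar", "1 Main St", "foo@example.com"),
    User("Foo", "Baz", "2 Side St", "baz@example.com"),
]
print([u.lastname for u in search_users(users, "foo bar")])
```

Searching for "foo bar" keeps only the first user, because "bar" appears in none of the second user's fields.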
The trouble with trying to index the combined values is that, for a value like "foo bar", there is no guarantee that "foo" represents a first name and "bar" a last name, or that "foo" represents a complete rather than a partial value. You'd also want to consider stripping out commas and other punctuation, as someone might type "smith, john".
When it comes to searching, it may pay to do a bit of pattern matching to detect what the user is probably searching for. For instance, a single word like "smith" might first be run as an exact match on first name or last name; if there are no matches, fall back to a Contains search. If the input contains two words, try a first and last name match, assuming "first last" or "last, first". If the value contains an "@" symbol, default to an e-mail address search; if it starts with a number, an address search. Each detected search option could have a first pass (expecting more exact values), then a broader second pass if the first comes back empty. There could even be third and fourth passes with broader checks. When results are presented, a "more results..." button could trigger the second, third, or fourth pass if the results weren't what the user was expecting.
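A rough sketch of that detection step might look like the following. The category names are made up for illustration, and the rules are only the ones mentioned in the paragraph, with "@" taken as the marker for an e-mail address.

```python
import re


def classify_search(criteria):
    """Guess which kind of search the user most likely intends."""
    text = criteria.strip()
    if not text:
        return "empty"
    if "@" in text:
        return "email"
    if text[0].isdigit():
        return "address"
    # Split on commas and whitespace so "smith, john" yields two words.
    words = [w for w in re.split(r"[,\s]+", text) if w]
    if len(words) == 1:
        return "single_name"
    if len(words) == 2:
        return "last_first" if "," in text else "first_last"
    return "broad"


for example in ["smith", "smith, john", "john smith", "12 Main St"]:
    print(example, "->", classify_search(example))
```

Each category would then map to its own narrow first-pass query, with the broader Contains search kept as the fallback.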
The idea when it comes to searching: try to perform the most typical, narrow expected search first and allow the user to broaden it if they wish. The goal is to "hit" the most relevant results early, helping mold how users enter their criteria, and then to tune based on user feedback, rather than writing queries that return as many hits as possible. The goal is to help users find what they are looking for on the first page of results. Either way, building a useful search will add complexity or require new third-party libraries, so first determine whether that capability is really required.

Cypher query in neo4j to find specific node with most paths matching pattern

I have a neo4j database with statistical information on water and waste. In this database are data points linked with the facts that are relevant, including mappings to internal definitions. Here in the attached screenshot is an example of a data point and the related metadata. The node in the center is the value, and the immediate nodes linked by "HAS_DIMENSION" are the dimensions that came with the data provider. These are not fixed and change depending on the provider. Each dimension of interest is mapped to an internal definition. Currently this is my query:
MATCH (o:Observation {uq_id:'e__ABS_AGR_AQ__FSW__MIO_M3__BG__1970____9f07c7a629625e5ae00e35838fcd4f824a3593dd'})-[:HAS_DIMENSION]->()
MATCH (o)-[:HAS_DIMENSION]->()-[:HAS_SYNONYM_FROM]->()-[:WITH_TARGET_DEF]->(v:Variable)<-[:HAS_UNIT]-(u:Unit)
MATCH (o)-[vl0:HAS_DIMENSION]->()-[:HAS_SYNONYM_FROM]->()-[:WITH_TARGET_DEF]->(l:Location)
MATCH (o)-[vc0:HAS_DIMENSION]->()-[:HAS_SYNONYM_FROM]->()-[:WITH_TARGET_DEF]->(c:Country)
MATCH (o)-[vy0:HAS_DIMENSION]->()-[:HAS_SYNONYM_FROM]->()-[:WITH_TARGET_DEF]->(y:Year)
MATCH (o)-[:HAS_DIMENSION]->(unk0)
MATCH (o)-[sr0:CAME_FROM_FILE]->(ds0)-[sr1:BELONGS_TO]->(s0)
OPTIONAL MATCH (o)-[dtr0:HAS_DIMENSION]->()-[:HAS_SYNONYM_FROM]->()-[:WITH_TARGET_DEF]->(d:DataType)
RETURN *
The issue I have is exemplified by the pink circles. I want only one pink circle (a node with the label Variable) in the query result; in particular, I want the variable that matches the following pattern:
MATCH (v:Variable)<-[:MAPS_TO]-()<-[:HAS_DIMENSION]-(o:Observation)
By this I mean that the query should identify the single variable that matches the pattern above through the greatest number of intermediate nodes. So the "Fresh surface water abstracted" variable would be selected, since it has two paths that match the pattern, but "Fresh groundwater abstracted" would not, since it has only one. How could I accomplish this?
It sounds like you want to return the Variable node with the most number of paths leading to it. Would something like this roughly return the results you are after? You will need to adapt according to your matching statements.
MATCH p=(o:Observation {uq_id:'<your_id>'})-[:HAS_DIMENSION]->()-[:MAPS_TO]->(v:Variable)
RETURN v.name, COUNT(p) AS pathCount ORDER BY pathCount DESC LIMIT 1
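The counting idea behind that query can be illustrated in a few lines of Python. The dimension identifiers below are hypothetical; the two variable names and their path counts are the ones described in the question.

```python
from collections import Counter

# Each tuple is one matched path: (intermediate dimension node, variable).
# The dimension identifiers are made up for illustration.
paths = [
    ("dim1", "Fresh surface water abstracted"),
    ("dim2", "Fresh surface water abstracted"),
    ("dim3", "Fresh groundwater abstracted"),
]


def variable_with_most_paths(paths):
    """Group paths by variable and return the (name, count) with most paths."""
    counts = Counter(variable for _dimension, variable in paths)
    return counts.most_common(1)[0]


print(variable_with_most_paths(paths))
```

Note that if two variables tie on path count, both this sketch and a query ending in LIMIT 1 return only one of them, so a tie-breaking rule may be needed.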

Neo4j/Cypher Match only if predicate applies for all relationships

I need to match nodes only when every relationship the node has fulfills a WHERE clause:
MATCH (o:Otherthing)
WHERE id(o) = 1
MATCH (unknown:Thing)
WHERE (unknown)-[:DEPENDS_ON]->(:Thing)<-[:DEPENDS_ON*]-(:Thing)<-[:STARTED_WITH]-(o)
RETURN unknown
Every matched "Thing" should only have relationships of type "DEPENDS_ON", and all of them should fulfill the condition.
How can I achieve that?
This may work for you:
MATCH (u:Thing)-[:DEPENDS_ON]->(:Thing)<-[:DEPENDS_ON*]-(:Thing)<-[:STARTED_WITH]-(o)
WHERE ID(o) = 1
WITH u, COUNT(*) AS num
WHERE SIZE((u)--()) = num
RETURN u
The query uses an efficient degree check to get the total number of relationships for u, and compares that with the number of times u satisfied the MATCH. Also, since you are identifying o by its native ID (which I am assuming is always going to be the ID of an Otherthing), it is more efficient not to specify its label (to avoid a label verification operation).
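The comparison of "total degree" against "number of matches" can be sketched in Python as follows. The edge list and the set of qualifying targets are hypothetical, and the sketch assumes each qualifying relationship produces exactly one match; in the Cypher query a variable-length pattern could yield several rows per relationship, in which case the counts would need deduplicating.

```python
from collections import Counter

# Hypothetical edges: (start node, relationship type, end node).
edges = [
    ("a", "DEPENDS_ON", "x"),
    ("a", "DEPENDS_ON", "y"),
    ("b", "DEPENDS_ON", "x"),
    ("b", "DEPENDS_ON", "z"),
]

# Hypothetical: targets that satisfy the rest of the pattern
# (reachable from the starting Otherthing).
qualifying_targets = {"x", "y"}


def nodes_where_all_relationships_match(edges, qualifying_targets):
    """Keep nodes whose total degree equals their number of pattern matches."""
    degree = Counter()
    matched = Counter()
    for start, rel_type, end in edges:
        degree[start] += 1  # like SIZE((u)--()), counts both directions
        degree[end] += 1
        if rel_type == "DEPENDS_ON" and end in qualifying_targets:
            matched[start] += 1
    return sorted(n for n, count in matched.items() if degree[n] == count)


print(nodes_where_all_relationships_match(edges, qualifying_targets))
```

Node "b" is rejected because one of its two relationships points at "z", which does not satisfy the pattern.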

neo4j / cypher - why is starting node excluded?

I have a simple graph:
When I run this simple query in neoeclipse:
START me=node:node_auto_index(name="Me")
MATCH me-[:LIVES_IN]->()<-[:LIVES_IN]-(f)
RETURN f.name;
only my Girlfriend is returned!
Why am I excluded from the result?
Results:
f.name
Girlfriend
Because a path (what you specify in the MATCH) will never contain the same relationship twice. Getting back to yourself would mean traversing your own LIVES_IN relationship a second time, so you are excluded.
To find all the people living in the same location, including yourself, you need to split the query into two steps: one finding your city and the other collecting the people in that city, joined by a WITH clause:
start me=node:node_auto_index(name='Me')
match me-[:LIVES_IN]->homebase
with homebase
match homebase<-[:LIVES_IN]-people
return people
See http://console.neo4j.org/?id=t0wjhg
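The difference between the two queries can be mimicked in Python. This is only a sketch: the city name is a placeholder, and each tuple stands for one LIVES_IN relationship.

```python
# Hypothetical graph: two people living in the same (unnamed) city.
lives_in = [("Me", "City"), ("Girlfriend", "City")]


def single_pattern(start):
    """One pattern: the two relationships in the path must be different."""
    results = []
    for i, (person, city) in enumerate(lives_in):
        if person != start:
            continue
        for j, (other, other_city) in enumerate(lives_in):
            # j != i enforces "no relationship used twice in one path".
            if other_city == city and j != i:
                results.append(other)
    return results


def two_step(start):
    """Two separate matches: the second may reuse the first relationship."""
    results = []
    for person, city in lives_in:
        if person != start:
            continue
        results.extend(other for other, other_city in lives_in
                       if other_city == city)
    return results


print(single_pattern("Me"))
print(two_step("Me"))
```

The first function leaves out "Me" for the same reason the original query does, while the second, like the query split by WITH, includes everyone in the city.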
