Adding scores from separate queries in Neo4j

Say I have two tables as results of two separate Cypher queries:
First table:
login score
abc 10
def 20
And second table:
login score
abc 50
ghi 100
I need a table in which the scores for the logins that exist in both tables are summed, and for other logins, they are listed with the single score available for them.
login score
abc 60
def 20
ghi 100
Can you help with a Cypher query for this? What if I want to apply a custom aggregate function instead of simple summation?

This will not be very efficient on a large dataset, but something like the following will do the trick (it assumes the two result sets come from nodes labelled Choice and Thing):
MATCH (c:Choice)
WITH collect(c.login) AS cset
MATCH (t:Thing)
WHERE NOT t.login IN cset
RETURN t.login AS login, t.score AS score
UNION ALL
MATCH (t:Thing)
WITH collect(t.login) AS tset
MATCH (c:Choice)
WHERE NOT c.login IN tset
RETURN c.login AS login, c.score AS score
UNION ALL
MATCH (c:Choice),(t:Thing {login: c.login})
RETURN c.login AS login, c.score + t.score AS score;
There may be more efficient ways to do this, but here is how it works:
The first part returns the scores for logins that exist only on Thing nodes.
The second part returns the scores for logins that exist only on Choice nodes.
The third part sums the scores for logins present on both node types.
UNION ALL stitches the three result sets together.
Hope this helps,
Tom
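On the custom aggregate part of the question: if the two result tables can be pulled into application code, the merge is easy to express there. The following is only a minimal Python sketch of that idea, not a Cypher solution. The function name merge_scores and its combine parameter are made up for illustration, and the rows are the sample data from the question.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, int]


def merge_scores(
    first: Iterable[Row],
    second: Iterable[Row],
    combine: Callable[[List[int]], int] = sum,
) -> Dict[str, int]:
    """Group scores by login across both tables, then aggregate each group.

    `combine` is a hypothetical hook: any function that takes the list of
    scores for one login and returns a single value (sum, max, ...).
    """
    grouped: Dict[str, List[int]] = {}
    for login, score in list(first) + list(second):
        grouped.setdefault(login, []).append(score)
    return {login: combine(scores) for login, scores in grouped.items()}


# Sample tables from the question.
first_table = [("abc", 10), ("def", 20)]
second_table = [("abc", 50), ("ghi", 100)]

print(merge_scores(first_table, second_table))       # summed scores
print(merge_scores(first_table, second_table, max))  # custom aggregate
```

Logins that appear in only one table end up in a group of one score, so they pass through the aggregate unchanged for sum and max.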

Is it possible to create an SQL query that displays results like this?

Background
I have a database that holds records of all assets in an office. Each asset has a condition, a category name and an age.
A ConditionID can be:
In use
Spare
In Circulation
The CategoryIDs are:
Phone
PC
Laptop
and Age is derived from a field called AquiredDate, which holds values like:
2009-04-24 15:07:51.257
Example
I've created an example of the inputs of the query to explain better what I need if possible.
NB.
Inputs are in Orange in the above example.
I've split the example into two separate queries.
Count would be the output
Question
Is this type of query and result set possible using SQL alone? And if so, where do I start? Would it be easier to use MS Excel as well?
Yes, it is possible. For your orange fields you can just filter, e.g.
where CategoryID ='Phone' and ConditionID in ('In use', 'In Circulation')
For the yellow one, take the DATEDIFF in days between the acquired date and now, divide it by 365 and floor the result to get the age in whole years. To get the last row (the 6+ years category), take the minimum of 5 and the calculated value: you get 0 for everything between 0 and 1 year old, 1 for 1 to 2 years, and so on up to 5, which collects everything older than that.
When you group by that calculated column and also select the count, you get what you want.
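The bucketing arithmetic can be checked outside the database. This is a rough Python sketch of the floor, cap and count steps described above. The helper names and the sample dates (apart from the 2009-04-24 value from the question) are invented for illustration, and a fixed "now" is used so the result is reproducible.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def age_bucket(acquired: datetime, now: datetime, cap: int = 5) -> int:
    """Whole years since acquisition (days // 365), capped at `cap`."""
    years = (now - acquired).days // 365
    return min(cap, years)


def count_by_age(acquired_dates: Iterable[datetime], now: datetime) -> Counter:
    """Equivalent of GROUP BY bucket with COUNT(*)."""
    return Counter(age_bucket(d, now) for d in acquired_dates)


# Hypothetical reference date and sample rows.
now = datetime(2015, 1, 1)
dates = [
    datetime(2009, 4, 24, 15, 7, 51),  # over five years old, bucket 5
    datetime(2014, 6, 1),              # under one year old, bucket 0
    datetime(2012, 3, 15),             # between two and three years, bucket 2
    datetime(2003, 1, 1),              # twelve years old, capped to bucket 5
]

counts = count_by_age(dates, now)
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

Dividing by 365 ignores leap days, which is the same approximation the SQL approach makes.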

Fuzzy name matching algorithm

I have a database containing names of certain blacklisted companies and individuals.
The details of every transaction that is created need to be scanned against these blacklisted names. The transactions may contain names that are not spelled correctly; for example, one can write "Wilson" as "Wilson", "Vilson" or "Veelson". The fuzzy search logic or utility should match these against the name "Wilson" in the blacklist database and, based on the required correctness / accuracy percentage set by the user, show the matching names that fall within that percentage.
The transactions will be sent in batches or in real time to be checked against the blacklisted names.
I would appreciate it if users who have had a similar requirement and have implemented it could share their views and implementation.
T-SQL leaves a lot to be desired in the realm of fuzzy search. Your best options are third-party libraries, but if you don't want to mess with that, your best bet is the DIFFERENCE function built into SQL Server. For example:
SELECT * FROM tblUsers U WHERE DIFFERENCE(U.Name, @nameEntered) >= 3
DIFFERENCE compares the SOUNDEX values of the two strings and returns a value from 0 to 4; a higher value indicates a closer match. A drawback is that the algorithm favors words that sound alike, which may not be the characteristic you want.
This next example shows how to get the best match out of a table:
DECLARE @users TABLE (Name VARCHAR(255))
INSERT INTO @users VALUES ('Dylan'), ('Bob'), ('Tester'), ('Dude')
SELECT *, MAX(DIFFERENCE(Name, 'Dillon')) AS SCORE FROM @users GROUP BY Name ORDER BY SCORE DESC
It returns:
Name   | Score
Dylan  | 4
Dude   | 3
Bob    | 2
Tester | 0
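If a user-settable percentage threshold matters more than sound-alike matching, the scoring can also be done in application code. The sketch below is not DIFFERENCE or SOUNDEX. It swaps in Python's standard difflib.SequenceMatcher, whose ratio() returns a similarity between 0 and 1 based on matching character runs. The function names and the sample blacklist are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] from difflib's ratio()."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def blacklist_matches(
    name: str, blacklist: Iterable[str], threshold: float
) -> List[Tuple[str, float]]:
    """Return blacklist entries scoring at or above `threshold`, best first."""
    scored = [(entry, similarity(name, entry)) for entry in blacklist]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical blacklist; a real one would come from the database.
blacklist = ["Wilson", "Dylan", "Tester"]

for candidate in ("Wilson", "Vilson", "Veelson"):
    print(candidate, blacklist_matches(candidate, blacklist, threshold=0.6))
```

Comparing every transaction name with every blacklist entry is quadratic, so for batches of any size a cheap pre-filter (first letter, length, or a phonetic key) would probably be needed before scoring.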

Solr: Searching a term in multiple, indexed fields and returning top 'N' hits from each search field

I have two indexed fields in my Solr schema
Employee Name
Manager Name
Which are plain strings.
My question is: given a search term, I want to display the top 5 suggested completions from Manager Names and the next 5 from Employee Names.
I can use copy fields, but sometimes I get all of the top 10 results from Employee Names.
I have a hunch that boosting can help me, but I could not figure out how.
Boosting can't help you control the results and distribute them 5 each across the top 10.
You could look at Field Collapsing (result grouping) instead, where you can group per role (Manager and Employee) and limit each group to 5 results.
You would then get 2 groups returned, with 5 results each.
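As a rough illustration of what such a grouped request could look like, here is a Python sketch that only builds the query string. It uses Solr's result grouping parameters group, group.query and group.limit, with one group.query per field instead of a single role field. The field names manager_name and employee_name and the prefix-wildcard form of the query are assumptions, not taken from the schema in the question.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def grouped_suggest_params(term: str, per_group: int = 5) -> List[Tuple[str, str]]:
    """Build request parameters asking for one group per name field.

    Field names are hypothetical. `term` is inserted as-is, so query
    special characters would need escaping in real use.
    """
    return [
        ("q", "*:*"),
        ("group", "true"),
        # One group of documents per query, each capped by group.limit.
        ("group.query", f"manager_name:{term}*"),
        ("group.query", f"employee_name:{term}*"),
        ("group.limit", str(per_group)),
    ]


params = grouped_suggest_params("jo")
print(urlencode(params))
```

The response would then contain two groups, and the client can show the manager group first and the employee group second.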

How to write a query to see who called who the most in an Access database of phone calls?

I have a phone bill in Excel that shows all calls made to and from my phone and I imported it into a table in Access 2007. I want to learn to use Access to do a simple query to determine who I talk to the most.
Say we have Column A (caller) and Column B (person being called), and that my number will always be in either column. How do I make a query in Access to determine which phone number I talk the most with? I've got the Table with the Excel data in it, but I need some step-by-step handholding to learn how to do the query.
In plain English, I want to query all phone calls that contain my number in either column A or column B. Then I want to treat each unique pair as one (mynumber + othernumber and othernumber + mynumber should be counted under the same pair). Finally, I want to count the calls per unique pair to see which pair has the highest count.
E.g. Go to Create ribbon, click Query Wizard, etc...
Thanks!
Let's say you have the following table :-
Column A : Column B
---------:----------
Fred : 1
Bill : 2
Fred : 1
You could do a query for example :-
SELECT A, B, Count(B) AS CountOfB
FROM Table1
GROUP BY A, B
ORDER BY Count(B) DESC
This would give you :-
Column A : Column B : CountOfB
---------:----------:----------
Fred : 1 : 2
Bill : 2 : 1
The first row is the most common combination of column A and column B, and CountOfB is the number of times that combination appears in the table.
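Note that grouping on A, B counts "me calling X" and "X calling me" as two different rows. To count both directions under one pair, each row has to be normalized first so the other party is always in the same position. Here is a small Python sketch of that logic. The phone numbers and function name are made up, and the same idea could be expressed in an Access query with an IIf() expression that picks whichever column is not your own number.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Call = Tuple[str, str]  # (caller, person being called)


def most_frequent_contact(
    calls: Iterable[Call], my_number: str
) -> Optional[Tuple[str, int]]:
    """Return (other_number, call_count) for the most frequent contact.

    Calls in either direction count toward the same contact. Rows that
    do not involve `my_number` are ignored. Returns None if no row matches.
    """
    counts: Counter = Counter()
    for caller, callee in calls:
        if caller == my_number:
            counts[callee] += 1
        elif callee == my_number:
            counts[caller] += 1
    top = counts.most_common(1)
    return top[0] if top else None


# Hypothetical sample data.
me = "555-0100"
calls = [
    ("555-0100", "555-0111"),
    ("555-0111", "555-0100"),
    ("555-0100", "555-0122"),
    ("555-0111", "555-0100"),
]

print(most_frequent_contact(calls, me))
```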

GROUP_CONCAT and DISTINCT are great, but how do I get rid of the duplicates I still have?

I have a MySQL table set up like so:
id uid keywords
-- --- ---
1 20 corporate
2 20 corporate,business,strategy
3 20 corporate,bowser
4 20 flowers
5 20 battleship,corporate,dungeon
What I WANT my output to look like is:
20 corporate,business,strategy,bowser,flowers,battleship,dungeon
but the closest I've gotten is:
SELECT DISTINCT uid, GROUP_CONCAT(DISTINCT keywords ORDER BY keywords DESC) AS keywords
FROM mytable
WHERE uid !=0
GROUP BY uid
which outputs:
20 corporate,corporate,business,strategy,corporate,bowser,flowers,battleship,corporate,dungeon
Does anyone have a solution? Thanks a ton in advance!
What you're doing isn't possible with pure SQL the way you have your data structured.
No SQL implementation is going to look at "Corporate" and "Corporate, Business" and see them as equal strings. Therefore, distinct won't work.
If you can control the database, the first thing I would do is change the data setup to hold ONE value per row in a keyword column (note: keyword, not keywords), rather than a comma-delimited list:
id uid keyword
1 20 corporate
2 20 corporate
2 20 business
2 20 strategy
Better yet would be
id uid keywordId
1 20 1
2 20 1
2 20 2
2 20 3
with a separate table for keywords
KeywordID KeywordText
1 Corporate
2 Business
Otherwise you'll need to massage the data in code.
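For the "massage the data in code" route, a minimal Python sketch could look like this. It assumes the rows have already been fetched as (uid, keywords) tuples, uses the sample data from the question, and the function name is made up. It splits each comma-delimited string and keeps the first occurrence of every keyword per uid.

```python
from typing import Dict, Iterable, Tuple


def merge_keywords(rows: Iterable[Tuple[int, str]]) -> Dict[int, str]:
    """Collect distinct keywords per uid, preserving first-seen order."""
    merged: Dict[int, Dict[str, None]] = {}
    for uid, keywords in rows:
        seen = merged.setdefault(uid, {})
        for keyword in keywords.split(","):
            keyword = keyword.strip()
            if keyword:
                # A dict keeps insertion order, so it works as an ordered set.
                seen.setdefault(keyword, None)
    return {uid: ",".join(seen) for uid, seen in merged.items()}


# Sample rows from the question: (uid, keywords).
rows = [
    (20, "corporate"),
    (20, "corporate,business,strategy"),
    (20, "corporate,bowser"),
    (20, "flowers"),
    (20, "battleship,corporate,dungeon"),
]

print(merge_keywords(rows))
```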
Mmm, your keywords need to be in their own table (one record per keyword). Then you'll be able to do it, because the keywords will then GROUP properly.
Not sure if MySQL has this, but SQL Server has RANK() OVER (PARTITION BY ...), which you can use to assign each result a rank. Doing so would allow you to select only the rows of rank 1 and discard the rest.
You have two options as I see it.
Option 1:
Change the way you store your data (keywords in their own table, joined to the existing table through a many-to-many relationship). This will allow you to use DISTINCT. DISTINCT doesn't work currently because the query sees "corporate" and "corporate,business,strategy" as two different values.
Option 2:
Write some 'interesting' SQL to split up the keyword strings. I don't know what the limits are in MySQL, but SQL in general is not designed for this.
