I'm trying to populate an array (fruit.Price) with properties supplied in the first WITH line of the following Cypher code:
WITH [{Price_1:15,Price_2:20,Price_3:17,strFruit:"apples"},{Price_1:2,Price_2:1,Price_3:1.5,Price_4:3,strFruit:"pears"}] AS props
UNWIND props as p
MATCH (fruit:Fruit) WHERE fruit.strFruit=p.strFruit
FOREACH (price in [p.Price_1,p.Price_2,p.Price_3,p.Price_4] |SET fruit.Price = fruit.Price + price)
RETURN fruit
where at most four p.Price_n properties are supplied, but not all of them are necessarily present (as above, where p.Price_4 is missing in the first row). These properties will always be supplied consecutively, i.e. Price_4 won't be supplied without Price_3.
How do I populate an array with a variable number of elements in this way? For what it's worth, I'm actually using the HTTP REST API, and the WITH line is in reality a parameters: command.
thanks
I would use coalesce(), and default to 0 for the ones that don't exist. Also, it might be easier to do reduce() instead of foreach(). (Updated to use CASE/WHEN instead of coalesce.)
Even easier would be to pass in an array of variable length {prices:[15,20,17], strFruit:"apples"}... or just the total price (if you have control over that).
WITH [{Price_1:15,Price_2:20,Price_3:17,strFruit:"apples"},{Price_1:2,Price_2:1,Price_3:1.5,Price_4:3,strFruit:"pears"}] AS props
UNWIND props as p
MATCH (fruit:Fruit) WHERE fruit.strFruit=p.strFruit
SET fruit.Price = reduce(total = [], price IN [p.Price_1,p.Price_2,p.Price_3,p.Price_4] | CASE WHEN price IS NOT NULL THEN total + price ELSE total END)
RETURN fruit
http://console.neo4j.org/r/o69bii
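The suggestion above to pass in a variable-length array can be prepared on the client before the request is sent. The following is a minimal sketch in Python, assuming the client can reshape the rows itself; the helper name `to_price_list` and the `prices` key are made up for illustration and are not part of any Neo4j API.

```python
def to_price_list(row, max_prices=4):
    """Collect Price_1..Price_n into a list, stopping at the first missing key."""
    prices = []
    for n in range(1, max_prices + 1):
        value = row.get(f"Price_{n}")
        if value is None:
            break  # prices are supplied consecutively, so nothing follows a gap
        prices.append(value)
    return {"strFruit": row["strFruit"], "prices": prices}


props = [
    {"Price_1": 15, "Price_2": 20, "Price_3": 17, "strFruit": "apples"},
    {"Price_1": 2, "Price_2": 1, "Price_3": 1.5, "Price_4": 3, "strFruit": "pears"},
]

# Hypothetical parameter payload: each row now carries one variable-length list.
params = {"props": [to_price_list(row) for row in props]}
print(params)
```

With rows shaped like this, the query no longer has to enumerate Price_1 to Price_4 and can work with p.prices directly.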
I have this function that I need to modify so that it selects from the elements table only the rows whose filters column (an array type) contains the feature_id param:
def get_categories_by_company_and_feature(company_id:, feature_id:)
  DB[:categories]
    .where(company_id: company_id)
    .join(:elements, category_id: :id)
    .order(:category_name, :element_name)
    .select_all(:categories, :elements)
    .select_append(Sequel[:categories][:user_active].as(:category_user_active),
                   Sequel[:categories][:id].as(:category_id),
                   Sequel[:elements][:id].as(:element_id))
end
I've seen documentation with array_op.contains(:a) and many other operators, but I don't know how to combine it with this syntax.
Also, what kind of operators can I apply for a column like Sequel[:categories][:user_active]?
I have docs with following structure:
{
id: 1
type: 1
prop1: "any value"
prop2: "any value"
...
...
}
type can be 1 or 2
Now I would like to create a query which returns all of type 1 and limited (LIMIT = 100) results of type 2 with filtering props and ordering by score.
My attempt so far is as follows. It isn't correct; more precisely, the sorting by score isn't correct.
I combine two queries:
a first query, prepared for use in the main query: type:2 AND commonfilters, size=LIMIT, sort by score, ID -> returns a list of ids
the main query: (type:1 AND commonfilters) OR (id:[ids from first query]), sort by score, ID
The order isn't correct (sort by score), because the sorting was done on two independent sets of data and not over all ids in combination.
What I need is something like the following SQL Query:
select * from data where commonfilters order by score, id MINUS (select * from data where rowcount > LIMIT)
Does anyone know how to achieve correct ordering for this case?
So this might be trivial, but it's kinda hard to ask. I'd like to FILTER a range based on the results of another FILTER.
I'll try to explain from inside out (related to image below):
I use filter to find all names for given id (the results are joined in column B). This works fine and returns an array of values. This is the inner FILTER.
I want to use this array of names to find all values for them using another outer FILTER.
In other words: Find maximum value for all names for given id.
Here is what I've figured:
=MAX(FILTER(J:J, CONTAINS???(FILTER(G:G, F:F = A2), I:I)))
^--- imaginary function returning TRUE for every value in I
that is contained in the array
=MAX(FILTER(J:J, I:I = FILTER(G:G, F:F = A2)))
^--- equal does not work here when filter returns more than 1 value
=MAX(FILTER(J:J, REGEXMATCH(JOIN(",", FILTER(G:G, F:F = A2)), I:I)))
^--- this approach WORKS but is ineffective and slow on 10k+ cells
https://docs.google.com/spreadsheets/d/1k5lOUYMLebkmU7X2SLmzWGiDAVR3u3CSAF3dYZ_VnKE
I hope to find a better CONTAINS function than the REGEXMATCH/JOIN combo, or to do the task using another approach.
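To make the intended lookup unambiguous, here it is written out as a small Python sketch. It only illustrates the logic (collect the names for an id, then take the maximum value over rows whose name is in that collection) and is not a spreadsheet formula; the function name and the sample data are made up for illustration.

```python
def max_value_for_id(id_name_pairs, name_value_pairs, wanted_id):
    """Return the largest value among all names that belong to wanted_id."""
    # Inner FILTER: every name recorded for the given id.
    names = {name for row_id, name in id_name_pairs if row_id == wanted_id}
    # Outer FILTER: keep values whose name is contained in that set.
    values = [value for name, value in name_value_pairs if name in names]
    return max(values) if values else None


# Hypothetical stand-ins for columns F:G (id, name) and I:J (name, value).
id_name_pairs = [(1, "a"), (1, "b"), (2, "c")]
name_value_pairs = [("a", 10), ("b", 30), ("c", 50), ("a", 20)]

print(max_value_for_id(id_name_pairs, name_value_pairs, 1))
```

The set membership test plays the role of the imaginary CONTAINS function from the first formula above.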
Try this in cell A2 (after you delete everything in the A2:C range):
=SORTN(SORT({INDIRECT("F2:F"&COUNTA(F2:F)+1),
TRANSPOSE(SUBSTITUTE(TRIM(QUERY(QUERY(QUERY({F2:G},
"select max(Col2) group by Col2 pivot Col1"), "offset 1"),,999^99)), " ", ",")),
VLOOKUP(INDIRECT("G2:G"&COUNTA(F2:F)+1), I2:J, 2, 0)}, 1, 1, 3, 0), 999^99, 2, 1, 1)
I've created a CLR function for SQL Server 2014 that should calculate the difference between the first and the last value in the [Value] column.
Here is the table:
Date_Time Value
-------------------------------------
2018-03-29 09:30:02.533 6771
2018-03-29 10:26:23.557 6779
2018-03-29 13:12:04.550 6787
2018-03-29 13:55:44.560 6795
Here is the code:
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

[Serializable]
[SqlUserDefinedAggregate(Format.Native,
    IsInvariantToDuplicates = false,
    IsInvariantToNulls = true,
    IsInvariantToOrder = false,
    IsNullIfEmpty = true,
    Name = "SUBT")]
public struct SUBT
{
    private double first;
    private double last;
    private int count;

    public void Init()
    {
        first = 0.0;
        last = 0.0;
        count = 0;
    }

    public void Accumulate(SqlDouble Value)
    {
        if (!Value.IsNull)
        {
            if (count == 0)
                first = (double)Value;
            else
                last = (double)Value;
            count += 1;
        }
    }

    public void Merge(SUBT Group)
    {
        first = Group.first;
        last = Group.last;
        count += 1;
    }

    public SqlDouble Terminate()
    {
        double value = (double)(last - first);
        return new SqlDouble(value);
    }
}
So the result should be [Value]=24, i.e. 6795 - 6771, but I get 6795 :(
Where is the error?
This aggregate function is dependent on order but there is no ordering guarantee for the aggregation input stream. Consequently, the results are execution plan dependent.
Assuming the Date_Time value is the desired ordering, you could provide both Date_Time and Value as function arguments, save the value with the lowest and highest Date_time values and use those in the Merge and Terminate methods.
There are a couple of problems here, I believe:
In the Merge method you are making two bad assumptions:
You are assuming that the incoming Group has a value, yet it could have been called on one or more NULL values, in which case all 3 internal variables are 0. Yet you are overwriting the current values of first and last, which could have non-0 values prior to Merge being called, but then will end up back at 0 due to being overwritten.
You are assuming that at least one of the instances, the current one or the incoming Group, has values set (i.e. has been called at least once on a non-NULL value). In the case that both instances have only been called with a NULL value, you will have 0 for first and last, yet you will increment count. I am not sure if Accumulate will be called again once things are being merged, but if it is, you will skip setting first. This is not your problem at the moment since you do not have multiple (or any) NULL values, but it is a potential problem for real data sets.
In the case where both instances have been called on non-NULL values, both will at least have first set, and maybe last (or maybe not). By overwriting the current instance with the incoming Group, you could be losing the real first value, losing the real last value.
As @DanGuzman mentioned in his answer, there is no guaranteed ordering of User-Defined Aggregates (the IsInvariantToOrder property of the SqlUserDefinedAggregate attribute is ignored / unused). And, as he noted, you will need to pass in the Date_Time value in order to handle this aspect of the operation manually. However, it won't be used in the Terminate method. It will instead be compared to two new variables, firstDate and lastDate, initialized to a date far in the future and a date far in the past, respectively (this will likely require changing the Format to UserDefined and then adding custom Read and Write methods -- unless you can store the full DateTime values as ticks, perhaps).
Get rid of the count variable.
In the Accumulate method, you will need to:
Compare the incoming Date_Time against firstDate. If it is before firstDate, store it as firstDate and store Value as first; otherwise,
compare the incoming Date_Time against lastDate. If it is after lastDate, store it as lastDate and store Value as last; otherwise do nothing.
In the Merge method, do a similar comparison for firstDate between both instances and keep the earlier one (date and value). Do the same with lastDate and keep the later one (date and value). (Note: these changes should fix all of the Merge issues noted above in # 1)
The terminate method shouldn't change
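The steps above can be sketched outside of SQLCLR to check the logic. This is a minimal Python model of the aggregate, not the C# implementation: the class name `Subt` and the method names are illustrative. It also deviates from the list in one respect, which is an assumption on my part: both bounds are checked independently instead of with ELSE, so that a group with a single row sets both first and last.

```python
from datetime import datetime


class Subt:
    """Order-independent model of the aggregate: last value minus first value."""

    def __init__(self):
        # Start with bounds that any real timestamp will beat.
        self.first_date = datetime.max
        self.last_date = datetime.min
        self.first = None
        self.last = None

    def accumulate(self, date_time, value):
        if value is None:
            return  # ignore NULLs
        if date_time < self.first_date:
            self.first_date, self.first = date_time, value
        if date_time > self.last_date:
            self.last_date, self.last = date_time, value

    def merge(self, other):
        # Keep the earlier "first" and the later "last" of the two partial results.
        if other.first_date < self.first_date:
            self.first_date, self.first = other.first_date, other.first
        if other.last_date > self.last_date:
            self.last_date, self.last = other.last_date, other.last

    def terminate(self):
        if self.first is None:
            return None  # no non-NULL rows were seen
        return self.last - self.first


# The four rows from the question, deliberately out of order and split in two groups.
rows = [
    (datetime(2018, 3, 29, 13, 12, 4, 550000), 6787.0),
    (datetime(2018, 3, 29, 9, 30, 2, 533000), 6771.0),
    (datetime(2018, 3, 29, 13, 55, 44, 560000), 6795.0),
    (datetime(2018, 3, 29, 10, 26, 23, 557000), 6779.0),
]
left, right = Subt(), Subt()
for date_time, value in rows[:2]:
    left.accumulate(date_time, value)
for date_time, value in rows[2:]:
    right.accumulate(date_time, value)
left.merge(right)
result = left.terminate()
print(result)
```

Because every comparison is against a stored timestamp, the result does not depend on the order in which rows arrive or on how the engine splits and merges groups.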
For what it's worth, I ran the code exactly as you have posted in the question, and it returns the expected value, using the following test query:
CREATE TABLE #Test ([Date_Time] DATETIME, [Value] FLOAT);
-- TRUNCATE TABLE #Test;
INSERT INTO #Test VALUES ('2018-03-29 09:30:02.533', 6771);
INSERT INTO #Test VALUES ('2018-03-29 10:26:23.557', 6779);
INSERT INTO #Test VALUES ('2018-03-29 13:12:04.550', 6787);
INSERT INTO #Test VALUES ('2018-03-29 13:55:44.560', 6795);
SELECT dbo.SUBT([Value])
FROM #Test;
-- 24
So, if you are still having issues, then you will need to post more info, such as the test query (and maybe table) that you are using. But even if it appears to work, like it does on my system, it still has the potential ordering problem and will need to be updated as noted above regardless.
Other notes:
In the Accumulate method you have (double)Value. There is no need to cast the incoming parameter. All Sql* types have a Value property that returns the value in the native .NET type. In this case, just use Value.Value. That isn't great for readability, so consider changing the name of the input parameter ;-).
You never use the value of count, so why increment it? You could instead just use a bool, and set it to true here. Setting it to true for each non-NULL value will not change the operation. However, this is a moot point since you truly need to set either first or last in each call to this UDA, based on the current Date_Time value.
I have a cell array called BodyData in MATLAB that has around 139 columns and 3500 odd rows of skeletal tracking data.
I need to extract all rows between two string values (these are timestamps when an event happened) that I have
e.g.
BodyData{}=
Column 1 2 3
'10:15:15.332' 'BASE05' ...
...
'10:17:33:230' 'BASE05' ...
The two timestamps should match a value in the array but might also be within a few ms of those in the array e.g.
TimeStamp1 = '10:15:15.560'
TimeStamp2 = '10:17:33.233'
I have several questions!
How can I return an array of all the data between the two string values, plus or minus a small threshold of, say, 100 ms?
Also, can I add another condition so that the string value in column 2 must match as well, and rows are otherwise ignored? For example, only return the rows between timestamps A and B if column 2 is 'BASE02'.
Many thanks,
The best approach to the first part of your problem is probably to change from strings to numeric date values. In MATLAB this can be done quite painlessly with datenum.
For the second part you can just use logical indexing. This is where you put a condition (i.e. that the second column is BASE02) within the indexing expression.
A self-contained example:
% some example data:
BodyData = {'10:15:15.332', 'BASE05', 'foo';...
'10:15:16.332', 'BASE02', 'bar';...
'10:15:17.332', 'BASE05', 'foo';...
'10:15:18.332', 'BASE02', 'foo';...
'10:15:19.332', 'BASE05', 'bar'};
% create column vector of numeric times, and define start/end times
dateValues = datenum(BodyData(:, 1), 'HH:MM:SS.FFF');
startTime = datenum('10:15:16.100', 'HH:MM:SS.FFF');
endTime = datenum('10:15:18.500', 'HH:MM:SS.FFF');
% select data in range, and where second column is 'BASE02'
BodyData(dateValues > startTime & dateValues < endTime & strcmp(BodyData(:, 2), 'BASE02'), :)
Returns:
ans =
'10:15:16.332' 'BASE02' 'bar'
'10:15:18.332' 'BASE02' 'foo'
References: datenum manual page, matlab help page on logical indexing.
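The example above does not show the plus-or-minus threshold from the question. One way to handle it is to widen the window by the tolerance before comparing. The sketch below shows that idea in Python rather than MATLAB; the function name `rows_between`, the `tolerance_ms` parameter and the sample rows are illustrative assumptions.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"  # e.g. '10:15:15.332'


def rows_between(rows, start, end, label, tolerance_ms=100):
    """Rows whose time lies in [start - tol, end + tol] and whose column 2 equals label."""
    tol = timedelta(milliseconds=tolerance_ms)
    lo = datetime.strptime(start, FMT) - tol
    hi = datetime.strptime(end, FMT) + tol
    return [
        row for row in rows
        if lo <= datetime.strptime(row[0], FMT) <= hi and row[1] == label
    ]


body_data = [
    ["10:15:15.332", "BASE05", "foo"],
    ["10:15:16.332", "BASE02", "bar"],
    ["10:15:17.332", "BASE05", "foo"],
    ["10:15:18.332", "BASE02", "foo"],
    ["10:15:19.332", "BASE05", "bar"],
]

# Both timestamps are within 100 ms of a row, so the widened window still catches them.
selected = rows_between(body_data, "10:15:16.400", "10:15:18.300", "BASE02")
print(selected)
```

The same widening should carry over to the datenum version by subtracting the tolerance from startTime and adding it to endTime, with the tolerance expressed in days because datenum values are in days.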