Couchbase: Create Index on Array containing Array of objects

I am trying to create an index on the following structure:
"creators": [
{
"ag_name": "Travel",
"ag_ids": [
{
"id": "1234",
"type": "TEST"
}
]
}
]
The index that I created is the following:
CREATE INDEX `Test_Index` ON `bucket`((ARRAY(ARRAY [t.ag_name, v] FOR v IN OBJECT_VALUES(t.`ag_ids`) END) FOR t IN `indexed_data`.`pos` END))
WHERE ((SUBSTR0((META().`id`), 0, 2) = "tt") AND (`indexed_data` IS VALUED))
Question
I started using Couchbase a couple of hours ago, and I was wondering whether the index I created is correct. It is created successfully, but I am not sure whether it covers all the fields, including the ones in the nested array.
Query
SELECT META().id
FROM bucket
WHERE SUBSTR0((META().`id`), 0, 2) = "tt"
AND indexed_data.reservation_type = "HOLA"
AND chain_code="FOO1"
AND indexed_data.property_code="BAR1"
AND ANY creator IN indexed_data.creators SATISFIES creator.ag_name="FOO" END
AND ANY creator IN indexed_data.creators SATISFIES (ANY ag in creator.ag_ids SATISFIES ag.id="1234" END AND ANY ag in creator.ag_ids SATISFIES ag.type="TEST" END) END

The only way to get a covering index for the query above is to index the indexed_data.creators array as a whole. See Example 1 at https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/indexing-arrays.html#covering-array-index. You can also create an array index on one of the fields from the array. However, because you reference multiple fields from the array, you will not be able to use the implicit covering array index described at that link.
CREATE INDEX ix1 ON bucket (chain_code,indexed_data.reservation_type, indexed_data.property_code, indexed_data.creators )
WHERE SUBSTR0((META().`id`), 0, 2) = "tt";
Also, you are ANDing multiple ANY clauses over the same array. That means each clause can match at a different position in the array. If you need all conditions to match at the same position, you should use the following query.
SELECT META().id
FROM bucket
WHERE SUBSTR0((META().`id`), 0, 2) = "tt"
AND indexed_data.reservation_type = "HOLA"
AND chain_code="FOO1"
AND indexed_data.property_code="BAR1"
AND (ANY c IN indexed_data.creators
SATISFIES c.ag_name = "FOO"
AND (ANY ag IN c.ag_ids
SATISFIES ag.id = "1234" AND ag.type = "TEST"
END)
END);
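The difference between the two query shapes can be illustrated outside N1QL. The following is a minimal Python sketch, not Couchbase code: the sample `creators` list and the function names `separate_any` and `nested_any` are hypothetical, chosen only to mimic the semantics of ANDed ANY clauses versus a single nested ANY.

```python
# Hypothetical sample data shaped like indexed_data.creators in the question.
creators = [
    {"ag_name": "FOO", "ag_ids": [{"id": "9999", "type": "TEST"}]},
    {
        "ag_name": "BAR",
        "ag_ids": [
            {"id": "1234", "type": "OTHER"},
            {"id": "5678", "type": "TEST"},
        ],
    },
]


def separate_any(creators):
    """Mimics the original query: independent ANY clauses joined with AND.

    Each condition is free to match a different array element.
    """
    name_ok = any(c["ag_name"] == "FOO" for c in creators)
    ids_ok = any(
        any(ag["id"] == "1234" for ag in c["ag_ids"])
        and any(ag["type"] == "TEST" for ag in c["ag_ids"])
        for c in creators
    )
    return name_ok and ids_ok


def nested_any(creators):
    """Mimics the rewritten query: one creator and one ag_ids entry
    must satisfy every condition at the same position."""
    return any(
        c["ag_name"] == "FOO"
        and any(ag["id"] == "1234" and ag["type"] == "TEST" for ag in c["ag_ids"])
        for c in creators
    )


print(separate_any(creators))  # conditions matched by different elements
print(nested_any(creators))    # no single element matches everything
```

With this sample data the first form matches even though no single creator has the name "FOO" together with an entry whose id is "1234" and type is "TEST"; the nested form does not match.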

I don't know if this is the best way to determine if an index is covering or not, but if you click "Plan" in the Query Workbench, you will see all the various steps visualized. If you see a "Fetch" step, then the index(es) being used are not covering your query.
Further, if you click "Advice", a covering index will be recommended for your query.

Related

What is wrong with this expression? =CountRows(ReportItems!Textbox58.Value = "Intervene"). I want to count each row which says Intervene

What is wrong with this expression? =CountRows(ReportItems!Textbox58.Value = "Intervene")
I want to count each row which says Intervene.
As Larnu has commented, you cannot use CountRows against the ReportItems collection.
Probably what you need to do is
Look at the expression in Textbox58 and see where it gets its data from. In this example, let's say it comes from Fields!myFieldName.Value.
Now we need to count the rows where Fields!myFieldName.Value = "Intervene". Rather than using a count function, we can convert each row to 1 when it matches and 0 when the field is not "Intervene", then sum the results.
So the expression would look something like this
=SUM(IIF(Fields!myFieldName.Value = "Intervene", 1, 0))
This will sum the rows within the current scope, so if this is contained in a row group, for example, it will only sum the rows in that row group.
If you need to count based on a different scope (e.g. the entire dataset), you can specify that in the SUM() function like this
=SUM(IIF(Fields!myFieldName.Value = "Intervene", 1, 0), "DataSet1")
Here we are summing across the entire dataset where the dataset name is DataSet1
Update based on OP comment
As your expression is
=SUM(IIF(Fields!Actual_Duration.Value >= 10, "Intervene", "No Intervention Needed"))
What we actually need to count is instances where Actual_Duration is >= 10.
So the final expression should be
=SUM(IIF(Fields!Actual_Duration.Value >= 10, 1, 0))
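The "sum of 1s and 0s" idea behind the expression above is a general conditional-count pattern. As a rough illustration only, here is the same logic in Python; the `rows` list and the function name `count_interventions` are made up for this sketch and are not part of SSRS.

```python
# Hypothetical rows standing in for the report's dataset.
rows = [
    {"Actual_Duration": 12},
    {"Actual_Duration": 3},
    {"Actual_Duration": 10},
]


def count_interventions(rows, threshold=10):
    """Count rows at or above the threshold by summing 1 for a match, 0 otherwise.

    This mirrors SUM(IIF(condition, 1, 0)): the condition is evaluated per row
    and the numeric results are added up.
    """
    return sum(1 if row["Actual_Duration"] >= threshold else 0 for row in rows)


print(count_interventions(rows))
```

Summing the label strings ("Intervene", "No Intervention Needed") cannot work because they are not numbers, which is why the condition itself has to be mapped to 1 or 0 first.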

order results by score of two combined sets of data (Solr)

I have docs with following structure:
{
id: 1
type: 1
prop1: "any value"
prop2: "any value"
...
...
}
type can be 1 or 2
Now I would like to create a query which returns all of type 1 and limited (LIMIT = 100) results of type 2 with filtering props and ordering by score.
My attempt so far is as follows. It isn't correct; more precisely, the sorting by score isn't correct:
I combine two queries:
prepare a first query for using in the mainquery : type:2 AND commonfilters, size=LIMIT, sort by score, ID -> returns a list of id's
main query : (type:1 AND commonfilters) OR (id:[ids from first query]), sort by score, ID
The order isn't correct (sort by score), because it was sorted for two different independent sets of data and not sorted over all id's in combination.
What I need is something like the following SQL Query:
select * from data where commonfilters order by score, id MINUS (select * from data where rowcount > LIMIT)
Does anyone know how to achieve correct ordering for this case?

Convert string to variable name in Lua

In Lua, I have a set of tables:
Column01 = {}
Column02 = {}
Column03 = {}
ColumnN = {}
I am trying to access these tables dynamically depending on a value. So, later on in the programme, I am creating a variable like so:
local currentColumn = "Column" .. variable
Where variable is a number 01 to N.
I then try to do something to all elements in my array like so:
for i = 1, #currentColumn do
currentColumn[i] = *do something*
end
But this doesn't work as currentColumn is a string and not the name of the table. How can I convert the string into the name of the table?
If I understand correctly, you're saying that you'd like to access a variable based on its name as a string? I think what you're looking for is the global variable, _G.
Recall that in a table, you can make strings as keys. Think of _G as one giant table where each table or variable you make is just a key for a value.
Column1 = {"A", "B"}
string1 = "Column".."1" --concatenate column and 1. You might switch out the 1 for a variable. If you use a variable, make sure to use tostring, like so:
var = 1
string2 = "Column"..tostring(var) --becomes "Column1"
print(_G[string2]) --prints the location of the table. You can index it like any other table, like so:
print(_G[string2][1]) --prints the 1st item of the table. (A)
So if you wanted to loop through 5 tables called Column1,Column2 etc, you could use a for loop to create the string then access that string.
C1 = {"A"} --I shortened the names to just C for ease of typing this example.
C2 = {"B"}
C3 = {"C"}
C4 = {"D"}
C5 = {"E"}
for i=1, 5 do
local v = "C"..tostring(i)
print(_G[v][1])
end
Output
A
B
C
D
E
Edit: I'm a doofus and I overcomplicated everything. There's a much simpler solution. If you only want to access the columns within a loop instead of accessing individual columns at certain points, the easier solution here for you might just be to put all your columns into a bigger table then index over that.
columns = {{"A", "1"},{"B", "R"}} --each anonymous table is a column. If it has a key attached to it like "column1 = {"A"}" it can't be numerically iterated over.
--You could also insert on the fly.
column3 = {"C"}
table.insert(columns, column3)
for i,v in ipairs(columns) do
print(i, v[1]) --i is the index and v is the table. This prints which column you're on and the 1st item in that table.
end
Output:
1 A
2 B
3 C
To future readers: If you want a general solution to getting tables by their name as a string, the first solution with _G is what you want. If you have a situation like the asker, the second solution should be fine.

How can I use an array as input for FILTER function in Google Spreadsheet?

So this might be trivial, but it's kinda hard to ask. I'd like to FILTER a range based on the results of another FILTER.
I'll try to explain from inside out (related to image below):
I use filter to find all names for given id (the results are joined in column B). This works fine and returns an array of values. This is the inner FILTER.
I want to use this array of names to find all values for them using another outer FILTER.
In other words: Find maximum value for all names for given id.
Here is what I've figured:
=MAX(FILTER(J:J, CONTAINS???(FILTER(G:G, F:F = A2), I:I)))
^--- imaginary function returning TRUE for every value in I
that is contained in the array
=MAX(FILTER(J:J, I:I = FILTER(G:G, F:F = A2)))
^--- equal does not work here when filter returns more than 1 value
=MAX(FILTER(J:J, REGEXMATCH(JOIN(",", FILTER(G:G, F:F = A2)), I:I)))
^--- this approach WORKS but is ineffective and slow on 10k+ cells
https://docs.google.com/spreadsheets/d/1k5lOUYMLebkmU7X2SLmzWGiDAVR3u3CSAF3dYZ_VnKE
I hope to find a better CONTAINS function than the REGEXMATCH/JOIN combo, or to do the task using another approach.
Try this in cell A2 (after you delete everything in the A2:C range):
=SORTN(SORT({INDIRECT("F2:F"&COUNTA(F2:F)+1),
TRANSPOSE(SUBSTITUTE(TRIM(QUERY(QUERY(QUERY({F2:G},
"select max(Col2) group by Col2 pivot Col1"), "offset 1"),,999^99)), " ", ",")),
VLOOKUP(INDIRECT("G2:G"&COUNTA(F2:F)+1), I2:J, 2, 0)}, 1, 1, 3, 0), 999^99, 2, 1, 1)
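For readers who want to check results against the sheet, the computation described in the question (the maximum value over all names that belong to a given id) can be sketched in Python. This is not a translation of the formula above; the function name `max_value_for_id` and the sample pairs are hypothetical, with `id_names` standing in for columns F:G and `name_values` for columns I:J.

```python
def max_value_for_id(id_names, name_values, wanted_id):
    """Return the largest value among all names that belong to wanted_id.

    id_names:    iterable of (id, name) pairs    (columns F:G in the question)
    name_values: iterable of (name, value) pairs (columns I:J in the question)
    """
    # Inner FILTER: collect the names for the given id into a set so that
    # membership tests are fast, unlike the REGEXMATCH/JOIN workaround.
    names = {name for id_, name in id_names if id_ == wanted_id}
    # Outer FILTER: keep only the values whose name is in that set.
    values = [value for name, value in name_values if name in names]
    return max(values) if values else None


# Hypothetical sample data.
id_names = [(1, "a"), (1, "b"), (2, "c")]
name_values = [("a", 5), ("b", 9), ("c", 20), ("a", 7)]

print(max_value_for_id(id_names, name_values, 1))
```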

Loop over data values in neo4j

I have a movies.csv file, which has a feature vector per line (E.g - id|Name|0|1|1|0|0|0|1 has 2 features for name and id, 7 features for genre classification)
I want a node m from class Movies to have a relationship [:HAS_GENRE] with nodes g from class Genres. For that, I need to loop over all the '|' separated features and only make a relationship if the value is 1.
In essence, I want something like this:
x = a //where a is the index of the first genre feature
while (x < lim) //lim is the last index of the feature vector
{
if line[x] is 1:
(m{id:toInt(line[0]})-[:HAS_GENRE]->(g{id=line[x]})
}
How do I do that?
Try this:
WITH ["Genre1","Genre2",...] AS genres
// FIELDTERMINATOR comes after the AS clause in LOAD CSV
LOAD CSV FROM "file:movies.pdv" AS row FIELDTERMINATOR "|"
MERGE (m:Movie {id: row[0]}) ON CREATE SET m.title = row[1]
// the list comprehension keeps only the indexes whose genre flag is "1"
FOREACH (idx IN [i IN range(0, size(genres)-1) WHERE row[2+i] = "1"] |
MERGE (g:Genre {name: genres[idx]})
CREATE (m)-[:HAS_GENRE]->(g)
)
It loads each row of the file as a collection.
The first two elements are used to create a movie.
It then filters the potential indexes, range(0,size(genres)-1), by the presence of a "1" at the matching position in the input row.
The resulting list of indexes is used to look up the genre name or id
and to connect the movie with the genre.
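The index-filtering step described above can be tried out without a database. The following Python sketch only reproduces the row-to-genre mapping, not the graph writes; the genre names in `GENRES`, the sample rows, and the function name `genres_for_row` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical genre names; the real list depends on the data file.
GENRES = ["Action", "Comedy", "Drama"]


def genres_for_row(row, genres=GENRES):
    """Return the genre names whose flag column is "1".

    Columns 0 and 1 hold the id and name, so genre i lives at row[2 + i].
    """
    return [genres[i] for i in range(len(genres)) if row[2 + i] == "1"]


# Hypothetical pipe-delimited input in the shape id|Name|flag|flag|flag.
sample = "1|Toy Story|0|1|0\n2|Heat|1|0|1\n"
rows = list(csv.reader(io.StringIO(sample), delimiter="|"))

# Each (movie id, genre name) pair corresponds to one HAS_GENRE relationship.
edges = [(row[0], genre) for row in rows for genre in genres_for_row(row)]
print(edges)
```

Note that the flags are compared as the string "1", because values read from a delimited file arrive as text, which is also why the Cypher query compares against "1".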
