Is there any way to use more than one vertex set in a SELECT statement? - graph-databases

I was wondering if there is any way to use more than one vertex set in a SELECT statement.
I would think it should be possible because... why not?
For example, say we have this basic query:
CREATE QUERY coolQuery(VERTEX<Foo> foo, String bar, String biz) FOR GRAPH cool_graph SYNTAX v2 {
f = {foo};
x = SELECT i
FROM SomeVertex:i -(PathType1>)- f
y = SELECT i
FROM x:i -(<PathType2)- BarVertex:br
WHERE br.id == bar;
z = SELECT i
FROM y:i -(PathType3>.PathType4>)- BizVertex:bz
WHERE bz.id == biz;
PRINT z;
}
Now, that's all fine and dandy, but what if I know the other vertices whose ids are bar and biz?
Can I use more than one known vertex set in a SELECT statement?
The goal here is to arrive at the final SomeVertex set as quickly as possible by using the indexed vertex id values.
This is what I'm thinking:
CREATE QUERY coolQuery2(VERTEX<FooVertex> foo, VERTEX<BarVertex> bar, Vertex<BizVertex> biz) FOR GRAPH cool_graph SYNTAX v2 {
f = {foo};
br = {bar};
bz = {biz};
x = SELECT i
FROM SomeVertex:i -(PathType1>)- f
y = SELECT i
FROM x:i -(<PathType2)- br
z = SELECT i
FROM y:i -(PathType3>.PathType4>)- bz
PRINT z;
}
I get syntax errors with this, and I can't find anything in the docs that uses more than one known vertex set in a SELECT statement.

In your case, it is recommended to write the query this way:
Version 1:
CREATE QUERY coolQuery2(VERTEX<FooVertex> foo, VERTEX<BarVertex> bar, Vertex<BizVertex> biz) FOR GRAPH cool_graph SYNTAX v2 {
OrAccum<BOOL> @hasFoo, @hasBar, @hasBiz;
f = {foo, bar, biz};
result = select t from f:s-((PathType1|PathType2|PathType3):e)-:t
accum case when s == foo then t.@hasFoo += true end,
case when s == bar then t.@hasBar += true end,
case when s == biz then t.@hasBiz += true end
having t.@hasFoo and t.@hasBar and t.@hasBiz;
print result;
}
Version 2:
CREATE QUERY coolQuery2(VERTEX<FooVertex> foo, VERTEX<BarVertex> bar, Vertex<BizVertex> biz) FOR GRAPH cool_graph SYNTAX v2 {
OrAccum<BOOL> @hasFoo, @hasBar, @hasBiz;
f = {foo};
br = {bar};
bz = {biz};
fooSet = select t from f-(PathType1)-:t;
barSet = select t from br-(PathType1)-:t;
bizSet = select t from bz-(PathType1)-:t;
result = fooSet intersect barSet intersect bizSet;
print result;
}
In this case, Version 1 is recommended since it has better concurrency and does only one SELECT.
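To make the logic of the two versions concrete, here is a minimal Python sketch (not GSQL) that models the graph as a plain adjacency dictionary. The graph data and the names GRAPH, neighbors, reached_from_all and reached_from_all_single_pass are hypothetical and only illustrate the idea: Version 2 runs one traversal per start vertex and intersects the results, while Version 1 makes a single pass and flags each target with the start vertices that reached it.

```python
from functools import reduce

# Hypothetical adjacency list: start vertex -> set of vertices one hop away.
GRAPH = {
    "foo": {"s1", "s2", "s3"},
    "bar": {"s2", "s3"},
    "biz": {"s3", "s4"},
}

def neighbors(vertex):
    """Return the set of vertices one hop away from `vertex`."""
    return GRAPH.get(vertex, set())

def reached_from_all(starts):
    """Version 2 idea: one traversal per start vertex, then intersect the results."""
    return reduce(set.intersection, (set(neighbors(s)) for s in starts))

def reached_from_all_single_pass(starts):
    """Version 1 idea: one pass that flags each target with the starts that reached it."""
    flags = {}
    for s in starts:
        for t in neighbors(s):
            flags.setdefault(t, set()).add(s)
    # Keep only targets flagged by every start vertex (the HAVING clause).
    return {t for t, seen in flags.items() if seen == set(starts)}

result = reached_from_all(["foo", "bar", "biz"])
print(result)  # {'s3'}
```

Both functions return the same set; the single-pass form mirrors the accumulator flags and the HAVING filter of Version 1.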

Related

How can I check in xarray if a coordinate exists?

I want something similar to this: if fileObj.is_file() == True: but for a dataset.
I want to check if a date exists before I select it.
y_begin = 2007
y_end = 2020
begin_date = '05-01'
end_date = '09-31'
ds_so_merge = None
for y in range(y_begin, y_end + 1):
    begin = str(y) + '-' + begin_date
    end = str(y) + '-' + end_date
    # TODO: check here whether the date exists and, if not, try the following date
    ds_so = dataset.sel(time=slice(begin, end))
    if ds_so_merge is None:
        ds_so_merge = ds_so
    else:
        ds_so_merge = ds_so.merge(ds_so_merge)
You can check whether a coordinate contains a specific value with value in coord, just as you could with a NumPy array or a pandas index.
Another option, since you're using slices, is to pull all elements that match the slice criteria and then select the first matched element.
Something like the following should work:
first_time_matching_slice = dataset.sel(time=slice(begin, end)).isel(time=0)
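The membership test and the "try the following date" fallback can be sketched with the standard library alone. In this sketch a sorted list of datetime.date values stands in for the time coordinate; time_coord and first_available are hypothetical names, and the same idea would have to be adapted to the actual dataset.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical stand-in for the sorted values of a time coordinate.
time_coord = [date(2007, 5, 2), date(2007, 5, 3), date(2007, 9, 30)]

def first_available(coord, wanted):
    """Return `wanted` if it is in `coord`, else the next later value (or None)."""
    if wanted in coord:               # same membership test as `value in coord`
        return wanted
    idx = bisect_left(coord, wanted)  # position of the first value >= wanted
    return coord[idx] if idx < len(coord) else None

print(first_available(time_coord, date(2007, 5, 1)))  # 2007-05-02
```

Returning None when nothing later exists leaves it to the caller to decide whether to skip that year.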

Stored procedure - get anticipated columns before fully executing statement?

I'm working through a stored procedure and wondering if there's a way to retrieve the anticipated result column list from a sql statement before fully executing.
Scenarios:
dynamic SQL
a UDF that might vary the columns outside of our control
EX:
//inbound parameter
SET QUERY_DEFINITION_ID = 12345;
//Initial statement pulls query text from bank of queries
var sqlText = getQueryFromQueryBank(QUERY_DEFINITION_ID);
//now we run our query
var cmd = {sqlText: sqlText };
stmt = snowflake.createStatement(cmd);
What I'd like to be able to do is say "right - before you run this, give me the anticipated column list" so I can compare it to what's expected.
EX:
Expected: [col1, col2, col3, col4]
Got: [col1]
Result: Oops. Don't run.
Rationale here is that I want to short-circuit the execution if something is missing - before it potentially runs for a while. I can validate all of this after the fact, but it would be really helpful to stop early.
Any ideas very much appreciated!
This sample SP code shows how to get a list of columns that a query will project into the result before you run the query. It should only be used for large, long-running queries because it will take a few seconds to get the column list.
There are a couple of caveats. 1) It will only return the names of the columns. It won't tell you how they were built, that is, whether they're aliased, direct from a table, calculated, etc. 2) The example query I used is straight from the Snowflake documentation here https://docs.snowflake.com/en/user-guide/sample-data-tpcds.html#functional-query-definition. For convenience, I minimized the query to a single line. The output of the columns includes object qualifiers in addition to the column names, so V1.I_CATEGORY, V1.D_YEAR, V1.D_MOY, etc. If you don't want them to make it easier to compare names, you can strip off the qualifiers using the JavaScript split function on the dot and take index 1 of the resulting array.
create or replace procedure EXPLAIN_BEFORE_RUNNING()
returns string
language javascript
execute as caller
as
$$
// Set the context for the session to the TPC-DS sample data:
executeNonQuery("use schema snowflake_sample_data.tpcds_sf10tcl;");
// Here's a complex query from the Snowflake docs (minimized to one line for convenience):
var sql = `with v1 as( select i_category, i_brand, cc_name, d_year, d_moy, sum(cs_sales_price) sum_sales, avg(sum(cs_sales_price)) over(partition by i_category, i_brand, cc_name, d_year) avg_monthly_sales, rank() over (partition by i_category, i_brand, cc_name order by d_year, d_moy) rn from item, catalog_sales, date_dim, call_center where cs_item_sk = i_item_sk and cs_sold_date_sk = d_date_sk and cc_call_center_sk= cs_call_center_sk and ( d_year = 1999 or ( d_year = 1999-1 and d_moy =12) or ( d_year = 1999+1 and d_moy =1)) group by i_category, i_brand, cc_name , d_year, d_moy), v2 as( select v1.i_category ,v1.d_year, v1.d_moy ,v1.avg_monthly_sales ,v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum from v1, v1 v1_lag, v1 v1_lead where v1.i_category = v1_lag.i_category and v1.i_category = v1_lead.i_category and v1.i_brand = v1_lag.i_brand and v1.i_brand = v1_lead.i_brand and v1.cc_name = v1_lag.cc_name and v1.cc_name = v1_lead.cc_name and v1.rn = v1_lag.rn + 1 and v1.rn = v1_lead.rn - 1) select * from v2 where d_year = 1999 and avg_monthly_sales > 0 and case when avg_monthly_sales > 0 then abs(sum_sales - avg_monthly_sales) / avg_monthly_sales else null end > 0.1 order by sum_sales - avg_monthly_sales, 3 limit 100;`;
// Before actually running the query, generate an explain plan.
executeNonQuery("explain " + sql);
// Now read the column list from the explain plan from the result set.
var columnList = executeSingleValueQuery("COLUMN_LIST", `select "expressions" as COLUMN_LIST from table(result_scan(last_query_id())) where "operation" = 'Result';`);
// For now, just exit with the column list as the output...
return columnList;
// Your code here...
// Helper functions:
function executeNonQuery(queryString) {
    // Run a statement whose result set is not needed.
    var stmt = snowflake.createStatement({sqlText: queryString});
    stmt.execute();
}
function executeSingleValueQuery(columnName, queryString) {
    // Run a query and return the named column from its first row.
    var stmt = snowflake.createStatement({sqlText: queryString});
    try {
        var rs = stmt.execute();
        rs.next();
        return rs.getColumnValue(columnName);
    }
    catch (err) {
        if (err.message.substring(0, 18) == "ResultSet is empty") {
            throw "ERROR: No rows returned in query.";
        } else {
            throw "ERROR: " + err.message.replace(/\n/g, " ");
        }
    }
}
$$;
call Explain_Before_Running();
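Once the column list is available, the comparison against the expected columns is simple string handling. The following Python sketch illustrates that step only. It assumes the column list arrives as a comma-separated string of qualified names, as described above; strip_qualifier, missing_columns and the sample values are hypothetical. It takes the last dot-separated part rather than index 1, so unqualified names also work.

```python
def strip_qualifier(name):
    """Drop an object qualifier such as 'V1.' and normalise case."""
    return name.strip().split(".")[-1].upper()

def missing_columns(expected, column_list):
    """Return the expected columns absent from a comma-separated column list."""
    got = {strip_qualifier(c) for c in column_list.split(",") if c.strip()}
    return [c for c in expected if c.upper() not in got]

# Hypothetical value in the shape described above (qualifier.column, comma-separated).
column_list = "V1.I_CATEGORY, V1.D_YEAR, V1.D_MOY"
missing = missing_columns(["I_CATEGORY", "D_YEAR", "SUM_SALES"], column_list)
print(missing)  # ['SUM_SALES']
```

A non-empty result would be the signal to stop before running the real query.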

PostgreSQL / database - traceout on graph with stop criterion (topology)

I have an undirected graph and want to make a traceout. This works fine with:
WITH RECURSIVE path AS (
SELECT edge_id, start_node, end_node
FROM simulation.edge_data
WHERE start_node = 1 OR end_node = 1
UNION
SELECT e.edge_id, e.start_node, e.end_node
FROM simulation.edge_data e, path s
WHERE s.start_node = e.start_node OR s.start_node = e.end_node OR s.end_node = e.end_node OR s.end_node = e.start_node)
SELECT * FROM path;
Now I want to stop at certain nodes if the corresponding record of the node has status = closed.
The trace should not continue past such nodes, so nodes that are only reachable through a closed node should not be investigated.
I tried to add something like this to the WHERE clause:
(SELECT status FROM getRecord(e.start_node)) = 'open'
This gives me all the correct nodes but not all edges.
I can provide more code if needed; I don't want to overdo it if the answer is quite simple!
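One possible reading of the desired behavior, sketched in Python rather than SQL: an edge leading into a closed node is still reported, but the trace does not continue through that node. Filtering edges on the node status alone would drop such edges, which may explain the missing ones. The function trace_out and the sample edges are hypothetical, and the semantics are an assumption based on the description above.

```python
from collections import deque

def trace_out(edges, start, closed):
    """Collect every node and edge reachable from `start` without passing through closed nodes."""
    adjacency = {}
    for edge_id, a, b in edges:
        adjacency.setdefault(a, []).append((edge_id, b))
        adjacency.setdefault(b, []).append((edge_id, a))
    seen_nodes, seen_edges = {start}, set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for edge_id, other in adjacency.get(node, []):
            seen_edges.add(edge_id)       # the edge itself is always reported
            if other not in seen_nodes:
                seen_nodes.add(other)
                if other not in closed:   # closed nodes are reached but not expanded
                    queue.append(other)
    return seen_nodes, seen_edges

# Hypothetical (edge_id, start_node, end_node) rows; node 3 is closed.
edges = [(10, 1, 2), (11, 2, 3), (12, 3, 4), (13, 2, 5)]
nodes, edge_ids = trace_out(edges, start=1, closed={3})
print(sorted(nodes), sorted(edge_ids))  # [1, 2, 3, 5] [10, 11, 13]
```

Edge 12 and node 4 lie behind the closed node 3, so they are never visited.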

Save google maps polygons in database and find the polygon that contains a location [duplicate]

I have created the table below
CREATE TABLE geom (g GEOMETRY);
and have inserted many rows, example below:
INSERT INTO geom (g)
VALUES(PolygonFromText('POLYGON((
9.190586853 45.464518970,
9.190602686 45.463993916,
9.191572471 45.464001929,
9.191613325 45.463884676,
9.192136130 45.463880767,
9.192111509 45.464095594,
9.192427961 45.464117804,
9.192417811 45.464112862,
9.192509035 45.464225851,
9.192493139 45.464371079,
9.192448471 45.464439002,
9.192387444 45.464477861,
9.192051402 45.464483037,
9.192012814 45.464643592,
9.191640825 45.464647090,
9.191622331 45.464506215,
9.190586853 45.464518970))')
);
Now I want to search all the data and return the entries where a lat/long I have falls within any of the polygons.
How can this be done using MySQL? Or is anyone aware of any links that will point me in the right direction?
MySQL as of v5.1 only supports operations on the minimum bounding rectangles (MBR). While there is a "Contains" function which would do what you need, it is not fully implemented and falls back to using MBRContains.
From the relevant manual page:
Currently, MySQL does not implement
these functions according to the
specification. Those that are
implemented return the same result as
the corresponding MBR-based functions.
This includes functions in the
following list other than Distance()
and Related().
These functions may be implemented in
future releases with full support for
spatial analysis, not just MBR-based
support.
What you could do is let MySQL give you an approximate result based on MBR, and then post process it to perform a more accurate test. Alternatively, switch to PostGIS!
(Update May 2012 - thanks Mike Toews)
MySQL 5.6.1+ offers functions which use object shapes rather than MBR
MySQL originally implemented these functions such that they used
object bounding rectangles and returned the same result as the
corresponding MBR-based functions. As of MySQL 5.6.1, corresponding
versions are available that use precise object shapes. These versions
are named with an ST_ prefix. For example, Contains() uses object
bounding rectangles, whereas ST_Contains() uses object shapes.
If you cannot change dbs to one that has spatial operators implemented correctly like PostgreSQL's PostGIS extension http://postgis.refractions.net/ , you can solve this problem using a two-part approach.
First, let MySQL give you a pre-filtered result based on the bounding box (that is what it does by default) using its Intersects operator (http://dev.mysql.com/doc/refman/5.1/en/functions-that-test-spatial-relationships-between-geometries.html#function_intersects).
If queries are slow, make sure that you have an index on your geometry field first.
Then hydrate the original geometry that you used in your query into a geometry object of a GIS geometry library such as GEOS (http://trac.osgeo.org/geos/) (C++ based, although it also has bindings for other languages like Python), Shapely (http://trac.gispython.org/lab/wiki/Shapely), OGR, or the Java Topology Suite (JTS) (http://www.vividsolutions.com/jts/jtshome.htm).
Test each of the geometries that you get back from your query result using the appropriate operator like within or intersects. Any of these libraries will give you a boolean result.
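The two-part approach can be sketched in plain Python: a cheap bounding-box check first, then an exact point-in-polygon test on the survivors. This is only an illustration under simple assumptions (polygons as lists of (x, y) tuples, no holes, points exactly on an edge not treated specially). It uses a hand-written ray-casting test instead of one of the libraries named above, and all function names are hypothetical.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) of a polygon given as (x, y) tuples."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def in_bounding_box(point, polygon):
    """Step 1: cheap pre-filter, equivalent in spirit to an MBR test."""
    x, y = point
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    return min_x <= x <= max_x and min_y <= y <= max_y

def in_polygon(point, polygon):
    """Step 2: ray casting, toggling `inside` at each edge crossed to the right."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def containing_polygons(point, polygons):
    """Return the polygons that really contain the point."""
    candidates = [p for p in polygons if in_bounding_box(point, p)]
    return [p for p in candidates if in_polygon(point, p)]

# Hypothetical L-shaped polygon: (3, 3) is inside its bounding box but outside the shape.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
print(len(containing_polygons((3, 3), [l_shape])))      # 0
print(len(containing_polygons((0.5, 0.5), [l_shape])))  # 1
```

The L-shaped example shows why the bounding-box result alone is only approximate.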
Personally, I would look at the samples for OGR since it has a big community that is ready to help.
Oh yeah, and sorry for putting the links like that... I guess since I am "new" I can only post one link (?)
The function given in this post on the MySQL forums works perfectly for me.
It's not very quick, and you have to ensure the parameter 'mp' is the same type as the spatial column you are using (I used ogr2ogr to import an Ordnance Survey shapefile into MySQL, so I had to change it from 'MULTIPOLYGON' to 'GEOMETRY').
I have rewritten the function given in the previous post by @danherd so that it works with real multipolygons consisting of more than one polygon. It should help those of you who are still using an old MySQL version.
Here it is:
DELIMITER //
CREATE FUNCTION GISWithin(pt POINT, mp MULTIPOLYGON) RETURNS INT(1) DETERMINISTIC
BEGIN
DECLARE str_big, str, xy LONGTEXT;
DECLARE x, y, p1x, p1y, p2x, p2y, m, xinters DECIMAL(16, 13) DEFAULT 0;
DECLARE counter INT DEFAULT 0;
DECLARE p, pb, pe, sb, se, ct DECIMAL(16, 0) DEFAULT 0;
SELECT MBRWithin(pt, mp) INTO p;
IF p != 1 OR ISNULL(p) THEN
return p;
END IF;
SELECT X(pt), Y(pt), ASTEXT(mp) INTO x, y, str_big;
SET str_big = REPLACE(str_big, 'MULTIPOLYGON(((','');
SET str_big = REPLACE(str_big, ')))', '');
SET str_big = REPLACE(str_big, ')),((', '|');
SET str_big = CONCAT(str_big, '|');
SET sb = 1;
SET se = LOCATE('|', str_big);
SET str = SUBSTRING(str_big, sb, se - sb);
WHILE se > 0 DO
SET ct = ct + 1;
SET str = SUBSTRING(str_big, sb, se - sb);
SET pb = 1;
SET pe = LOCATE(',', str);
SET xy = SUBSTRING(str, pb, pe - pb);
SET p = INSTR(xy, ' ');
SET p1x = SUBSTRING(xy, 1, p - 1);
SET p1y = SUBSTRING(xy, p + 1);
SET str = CONCAT(str, xy, ',');
WHILE pe > 0 DO
SET xy = SUBSTRING(str, pb, pe - pb);
SET p = INSTR(xy, ' ');
SET p2x = SUBSTRING(xy, 1, p - 1);
SET p2y = SUBSTRING(xy, p + 1);
IF p1y < p2y THEN SET m = p1y; ELSE SET m = p2y; END IF;
IF y > m THEN
IF p1y > p2y THEN SET m = p1y; ELSE SET m = p2y; END IF;
IF y <= m THEN
IF p1x > p2x THEN SET m = p1x; ELSE SET m = p2x; END IF;
IF x <= m THEN
IF p1y != p2y THEN
SET xinters = (y - p1y) * (p2x - p1x) / (p2y - p1y) + p1x;
END IF;
IF p1x = p2x OR x <= xinters THEN
SET counter = counter + 1;
END IF;
END IF;
END IF;
END IF;
SET p1x = p2x;
SET p1y = p2y;
SET pb = pe + 1;
SET pe = LOCATE(',', str, pb);
END WHILE;
SET sb = se + 1;
SET se = LOCATE('|', str_big, sb);
END WHILE;
RETURN counter % 2;
END //
DELIMITER ;

What is LINQ equivalent of SQL’s "IN" keyword

How can I write the SQL query below in LINQ?
select * from Product where ProductTypePartyID IN
(
select Id from ProductTypeParty where PartyId = 34
)
There is no direct equivalent in LINQ. Instead, you can use Contains() or some other technique to achieve the same result. Here's an example that uses Contains:
String [] s = new String [5];
s [0] = "34";
s [1] = "12";
s [2] = "55";
s [3] = "4";
s [4] = "61";
var result = from d in context.TableName
where s.Contains (d.fieldname)
select d;
check this link for details: in clause Linq
int[] productList = new int[] { 1, 2, 3, 4 };
var myProducts = from p in db.Products
where productList.Contains(p.ProductID)
select p;
Syntactic variations aside, you can write it in practically the same way.
from p in ctx.Product
where (from ptp in ctx.ProductTypeParty
where ptp.PartyId == 34
select ptp.Id).Contains(p.ProductTypePartyID)
select p
I prefer using the existential quantifier, though:
from p in ctx.Product
where (from ptp in ctx.ProductTypeParty
where ptp.PartyId == 34
&& ptp.Id == p.ProductTypePartyID).Any()
select p
I expect that this form will resolve to an EXISTS (SELECT * ...) in the generated SQL.
You'll want to profile both, in case there's a big difference in performance.
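To see that the IN form and the EXISTS form select the same rows, here is a small sketch using Python's built-in sqlite3 module. It does not show what a LINQ provider actually generates; the table contents are hypothetical and SQLite is used only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductTypeParty (Id INTEGER PRIMARY KEY, PartyId INTEGER);
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, ProductTypePartyID INTEGER);
    INSERT INTO ProductTypeParty VALUES (1, 34), (2, 34), (3, 99);
    INSERT INTO Product VALUES (100, 1), (101, 3), (102, 2), (103, NULL);
""")

# The original query from the question, using IN with a subquery.
in_rows = conn.execute("""
    SELECT Id FROM Product
    WHERE ProductTypePartyID IN (SELECT Id FROM ProductTypeParty WHERE PartyId = 34)
    ORDER BY Id
""").fetchall()

# The correlated EXISTS form that the Any() query is expected to resemble.
exists_rows = conn.execute("""
    SELECT p.Id FROM Product p
    WHERE EXISTS (SELECT 1 FROM ProductTypeParty ptp
                  WHERE ptp.PartyId = 34 AND ptp.Id = p.ProductTypePartyID)
    ORDER BY p.Id
""").fetchall()

print(in_rows, exists_rows)  # [(100,), (102,)] [(100,), (102,)]
```

The results match here, but execution plans can differ between databases, which is why profiling both forms is still worthwhile.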
Something similar to this
var partyProducts = from p in dbo.Product
join pt in dbo.ProductTypeParty on p.ProductTypePartyID equals pt.Id
where pt.PartyId == 34
select p
You use the Contains in a Where clause.
Something along these lines (untested):
var results = Product.Where(product => ProductTypeParty
.Where(ptp => ptp.PartyId == 34)
.Select(ptp => ptp.Id)
.Contains(product.ProductTypePartyID)
);
