geometry data type of snowflake - snowflake-cloud-data-platform

I am trying to use the geometry data type and was wondering what unit the spatial functions use. The documentation says the following. Is there a way to pass a unit to the function? If not, how do I find out the unit?
"The measurement functions (e.g. ST_LENGTH) use the same units as the coordinate system"
Thanks

I was surprised by the results, and the docs could be clearer.
With a geography, st_length() returns the distance in meters, as described in the docs. For example, for a line between SF and a point roughly 10 km to the east:
select st_length(
to_geography('LINESTRING(-122.4194 37.7749,-122.3094 37.7749 )')
);
-- 9,668.032993573
However, that's not what you get when you have a geometry between the same points:
select st_length(
to_geometry('LINESTRING(-122.4194 37.7749,-122.3094 37.7749 )')
);
-- 0.11
So where does the 0.11 come from? It is the Cartesian length in coordinate units: the coordinates here are degrees, and the two points differ by 0.11 degrees of longitude. Vertica says:
"For GEOMETRY objects, the length is measured in Cartesian coordinate units. For GEOGRAPHY objects, the length is measured in meters."
Same on PostGIS:
"For geometry types: returns the 2D Cartesian length of the geometry [...]"
"For geography types: [...] Units of length are in meters. [...]"
https://postgis.net/docs/ST_Length.html
I'll notify our docs team so we make the necessary update on https://docs.snowflake.com/en/sql-reference/functions/st_length.html#returns.
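To see both numbers side by side outside the database, here is a minimal Python sketch. It assumes a spherical Earth with a mean radius of 6,371,000 m, which is my assumption and not necessarily the model Snowflake uses, so the meter value only approximates the GEOGRAPHY result. The function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius; Snowflake's exact model may differ


def cartesian_length(p, q):
    """Planar length in coordinate units (degrees here), as a GEOMETRY would measure it."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def haversine_m(p, q):
    """Great-circle length in meters between two (lon, lat) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


sf = (-122.4194, 37.7749)
east = (-122.3094, 37.7749)

length_degrees = cartesian_length(sf, east)  # about 0.11, a number of degrees
length_meters = haversine_m(sf, east)        # roughly 9.67 km, expressed in meters

print(length_degrees, length_meters)
```

The first value is just the difference in longitude, which is why the GEOMETRY result carries no real-world unit until the coordinates are projected.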

Related

Difference between geodist() and dist() for Geo-Spatial Search

What is the difference between geodist(sfield,x,y) and dist(2,x,y,a,b) in Apache Solr for geospatial searches?
dist(2,x,y,0,0) calculates the Euclidean distance between (0,0) and (x,y) for each document. It returns the distance between two vectors (points) in an n-dimensional space.
I was previously using the geodist() distance function for geospatial searches on my website, but its response time was high. So I did a POC (proof of concept) with different distance functions and found that dist(2,x,y,0,0) takes roughly half the time. I want to know the reason behind this and which algorithms the two functions use to calculate the distance.
I have to put together a comparison matrix for these functions to pass it on.
The main difference is that geodist() is intended to work with spatial field types.
Most spatial implementations are based on Lucene's Points API, which is a BKD index. This field type is strictly limited to coordinates in lat/lon decimal degrees. Behind the scenes, latitude and longitude are indexed as separate numbers. Four main field types are available for spatial search:
LatLonPointSpatialField
LatLonType (now deprecated) and its non-geodetic twin PointType
SpatialRecursivePrefixTreeFieldType (RPT for short), including RptWithGeometrySpatialField, a derivative
BBoxField (for areas, 4 instances of another field type referred to by numberType)
In geodist(sfield, x, y), sfield is a spatial field that represents a point with two coordinates (lat,lon), so the direct equivalent using dist() would be dist(2, sfieldX, sfieldY, x, y), with sfieldX and sfieldY being respectively the lat and lon coordinates of sfield.
Using dist(power, a, b, ...) you can't query a spatial field type. In order to perform the same spatial search, you would have to specify each of the point's dimensions separately. It would require 2 indexed fields (or at least 2 values per field) for 2 dimensions, 3 for 3D, and so on. That makes a huge difference because you would have to index every coordinate of each point separately.
Besides, you can also use geodist() as is with the BBoxField field type, which indexes a single rectangle per document field and supports searching via a bounding box. To do the same with dist() you would have to compute the center point of the box and pass each of its coordinates as a function argument, so it would be too much hassle to get the same result if you want to use an area as a parameter.
Lastly, LatLonPointSpatialField, for example, does distance calculations based on the Haversine formula (great circle); BBoxField does it a little faster because the rectangular shape is faster to compute. It's true that dist() may be even faster, but remember that it requires more fields to be indexed and a lot of preprocessing at query time to yield the same calculated distance, and, as mentioned by Mats, it wouldn't take the earth's curvature into account.
A Euclidean distance doesn't account for the curvature of the earth. If you're only sorting by the distance, the behavior can be OK, but only if your hits are within a small geographical area (the value of a unit compared to meters changes greatly as you get closer to the poles).
There's an extensive and good answer that explains the difference between a Euclidean distance and a proper geographical distance (usually calculated using haversine) available at the GIS Stack Exchange.
Although at small scales any smooth surface looks like a plane, the accuracy of the Pythagorean formula depends on the coordinates used. When those coordinates are latitude and longitude on a sphere (or ellipsoid), we can expect that
Distances along lines of longitude will be reasonably accurate.
Distances along the Equator will be reasonably accurate.
All other distances will be erroneous, in rough proportion to the differences in latitude and longitude.
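The three cases in that quote can be checked numerically. The sketch below is only an illustration under stated assumptions: a spherical Earth of radius 6,371,000 m, and a "planar" distance that applies the Pythagorean formula to raw degrees and scales the result by one fixed meters-per-degree factor. The helper names and sample points are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed spherical Earth
METERS_PER_DEGREE = math.radians(1.0) * EARTH_RADIUS_M  # length of 1 degree of a great circle


def planar_m(lon1, lat1, lon2, lat2):
    """Pythagorean distance on raw degrees, scaled to meters with one fixed factor."""
    return math.hypot(lon2 - lon1, lat2 - lat1) * METERS_PER_DEGREE


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# (lon1, lat1, lon2, lat2) for each case
cases = {
    "1 deg along a meridian at 60N": (10.0, 60.0, 10.0, 61.0),
    "1 deg along the Equator": (10.0, 0.0, 11.0, 0.0),
    "1 deg east-west at 60N": (10.0, 60.0, 11.0, 60.0),
}

ratios = {name: planar_m(*c) / haversine_m(*c) for name, c in cases.items()}
for name, ratio in ratios.items():
    print(f"{name}: planar / great-circle = {ratio:.3f}")
```

The first two ratios come out close to 1, while the east-west case at 60 degrees north is overestimated by roughly a factor of two, because a degree of longitude shrinks with the cosine of the latitude.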

Buffer Polygon on Point in Polygon Query

I would like to buffer the warning polygon by two miles, so that EMA personnel within two miles of the warning are listed. Can anyone help me with this? I've been trying to use ST_Buffer (to expand the polygon coverage for the search) but can't seem to get it right. Is it in meters (3218.69)? I'm using the latest OpenGeo Suite.
SELECT DISTINCT ON (ema.name)
ST_X(ema.geom),ST_Y(ema.geom),ema."name", torpoly.expire
FROM ema INNER JOIN torpoly ON ST_Within(ema.geom, ST_BUFFER(torpoly.geom)
ORDER BY ema."name"
Your options are either:
Use an appropriate projected coordinate system for the region that uses linear units in metres or feet (UTM, State plane, etc.). All distance calculations on geometry types use a Cartesian coordinate system, which is quick and simple.
Use the geography type, which does distance calculations on objects with EPSG:4326 (lat/lon) with distance units in metres. If you don't want to change the data types, you can use a geom::geography cast, and maybe make an index on that cast.
And never do ST_Within(.., ST_Buffer()) for this type of analysis. It is slower and imperfect. Instead, use ST_DWithin, which finds all geometry/geography objects within a distance threshold of each other, which is just like a buffer. This function may use a spatial GiST index, if present.
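As a rough sketch of the second option combined with ST_DWithin, the Python snippet below converts the two-mile threshold to meters and prepares a parameterized query. The table and column names come from the question. The assumptions are mine: that both geom columns hold lon/lat (EPSG:4326) coordinates, and that the query is run through a driver that accepts %(name)s placeholders (psycopg style).

```python
METERS_PER_MILE = 1609.344  # exact, by definition of the international mile


def miles_to_meters(miles):
    """Convert statute miles to meters."""
    return miles * METERS_PER_MILE


# Assumes ema.geom and torpoly.geom hold lon/lat (EPSG:4326) coordinates, so the
# ::geography cast makes ST_DWithin interpret the threshold in meters.
QUERY = """
SELECT DISTINCT ON (ema."name")
       ST_X(ema.geom), ST_Y(ema.geom), ema."name", torpoly.expire
FROM ema
INNER JOIN torpoly
        ON ST_DWithin(ema.geom::geography, torpoly.geom::geography, %(radius_m)s)
ORDER BY ema."name"
"""

# Two miles expressed in meters, passed as a bound parameter.
params = {"radius_m": miles_to_meters(2)}
print(params)
```

With a driver such as psycopg this would be executed as cursor.execute(QUERY, params). Keeping the radius as a parameter makes the unit conversion explicit instead of hard-coding 3218.69.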

Mix and match spatial-reference systems?

Assuming that I have a table of postal codes, with a Geography column that was populated with Global - WGS84 (SRID 4326), can I accurately compare them (using STDistance) with a Geography point that has been populated with North America – NAD83 (SRID 4269)?
In short, no.
SQL Server requires that all items passed to a spatial function have the same SRID. This is because the SRID provides other information in the background that is used to calculate distances and the like on an ellipsoidal model.
That said, you could have a second column which is calculated to have a common SRID and use that for distance calculations. It's as simple as:
DECLARE @commonSrid geography = geography::STGeomFromWKB(<existing column>.STAsBinary(), 4326);
In doing this, you must be sure that all SRIDs are based on latitude and longitude decimal coordinates, and not, for example, grid references. Also, because you're not doing a proper conversion between them, you may find distances are not 100% accurate, but they will be very, very close.

SRID meaning in postgis

I would like to find out what the practical meaning of SRID (spatial reference ID) is in PostGIS.
I really do not understand what it is for. Can anyone shed some light on the matter?
For instance, I noticed that the PostGIS function ST_GeomFromText(text WKT, integer srid) accepts such an (optional) parameter as its second argument. Why would I need to pass it in to get PostGIS to turn the text representation into a binary one? What value does it add?
Thanks
Spatial reference ID refers to the spatial reference system being employed -- this is important when going from a geographic view of the world to a projected view of the world, i.e., what you see when you look at a two-dimensional paper map.
Spatial reference systems contain a couple of elements.
Firstly, the reference ellipsoid (loosely called the geoid), which is a model of the shape of the earth -- the earth is not a sphere (sh, don't tell Google), it is in fact an oblate spheroid. The ellipsoid used for GPS is known as WGS84, which is a model that works fairly well globally. National mapping agencies use other ellipsoids that might be a better fit to local geographies.
Secondly, the projection type. This is essentially the mathematical model used to go from a 3D to a 2D representation of the world. Types include Mercator, Transverse Mercator (both cylindrical), Azimuthal, Conic, etc. All of these have trade-offs between accurately measuring distance, area or direction -- you can't preserve all three.
So, essentially, when you declare a SRID in PostGIS you are saying: use this ellipsoid and this projection model. Under the hood, PostGIS uses a library called Proj.4, and based on the SRID information, it can convert from one coordinate system to another.
So, for example, to convert from lat/lon, which is known as 4326 in SRID terms, to 900913, which is spherical Mercator, as used by Google/Bing maps and other web mapping frameworks, you could run something like:
select st_astext(st_transform(st_setsrid(st_makepoint(-.5,52),4326),900913));
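For intuition about what such a transform does for this particular pair of systems, here is a minimal Python sketch of the forward spherical Mercator formula. It is not what PostGIS or Proj.4 runs internally, just the textbook projection on a sphere of radius 6,378,137 m. The function name is hypothetical.

```python
import math

SPHERE_RADIUS_M = 6_378_137.0  # sphere radius used by spherical (web) Mercator


def lonlat_to_spherical_mercator(lon_deg, lat_deg):
    """Forward spherical Mercator projection: degrees in, meters out."""
    x = SPHERE_RADIUS_M * math.radians(lon_deg)
    y = SPHERE_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Same point as in the query above: longitude -0.5, latitude 52.
x, y = lonlat_to_spherical_mercator(-0.5, 52.0)
print(f"POINT({x:.2f} {y:.2f})")
```

The inputs are angles in degrees and the outputs are meters on a flat map, which is exactly the kind of unit change an SRID records.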
This is an example of a query I use. It uses the Lambert azimuthal equal-area projection (ETRS89-LAEA, srid = 3035).
SELECT ST_GeomFromText('POINT(2843711.1098048678 2279498.6551480694)', 3035);
If you don't pass the srid, PostGIS will not know which spatial reference system to use.

Geometry column: STGeomFromText and SRID (what is an SRID?)

I'm playing with the new geography column in SQL Server 2008 and the STGeomFromText function. Here is my code (works with AdventureWorks2008):
DECLARE @region geography;
SET @region = geography::STGeomFromText('POLYGON((
-80.0 50.0, -90.0 50.0,
-90.0 25.0, -80.0 25.0,
-80.0 50.0))', 4326);
SELECT @region;
My question is about the 4326 in the code. It is supposed to be a spatial reference ID. When I go to MSDN there isn't a lot on it. If I change the value to 56 I get an error telling me the value must be in the sys.spatial_reference_systems table.
You can look at that table by executing:
select * from sys.spatial_reference_systems
There is a well_known_text column in that table, but it doesn't tell me much. The value for 4326 is:
GEOGCS["WGS 84", DATUM["World Geodetic System 1984", ELLIPSOID["WGS 84", 6378137, 298.257223563]], PRIMEM["Greenwich", 0], UNIT["Degree", 0.0174532925199433]]
Can anyone explain this mystery to me? What is the SRID?
So I ended up talking with an ex-military guy yesterday who was a radar/mapping specialist.
Basically, he knew exactly what that number (4326) was, where it came from, and why it is there.
It is an industry standard for computing geography. The problem is that the earth is not a perfect sphere (it bulges in the middle), and SRID 4326 accounts for that.
As I stated, the table sys.spatial_reference_systems lists all of the codes and what they are. But the short version is that you are really only going to use 4326 unless you have a very specific reason to use something different.
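The "bulge" can be read straight off the well_known_text value quoted in the question. The sketch below assumes the usual WKT convention that the two ELLIPSOID numbers are the semi-major axis in meters and the inverse flattening, and that the UNIT number is the size of one degree in radians. The variable names are mine.

```python
import math

# Values copied from the well_known_text for SRID 4326 quoted in the question.
SEMI_MAJOR_AXIS_M = 6378137.0            # equatorial radius
INVERSE_FLATTENING = 298.257223563
RADIANS_PER_DEGREE = 0.0174532925199433  # UNIT["Degree", ...]

flattening = 1.0 / INVERSE_FLATTENING
semi_minor_axis_m = SEMI_MAJOR_AXIS_M * (1.0 - flattening)  # polar radius
bulge_m = SEMI_MAJOR_AXIS_M - semi_minor_axis_m

unit_is_pi_over_180 = math.isclose(RADIANS_PER_DEGREE, math.pi / 180)

print(f"polar radius: {semi_minor_axis_m:.3f} m")
print(f"equatorial bulge: {bulge_m:.3f} m")
print(f"UNIT factor equals pi/180: {unit_is_pi_over_180}")
```

Under that reading, the equatorial radius exceeds the polar radius by about 21.4 km, and the UNIT entry simply says that coordinates are angles measured in degrees.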
SRID = Spatial Reference IDentifier
Coordinates must use the same SRID to be comparable. Otherwise you'd end up comparing kilometers and miles, or something similar.
There are a lot of systems for mapping the earth. For example, say you want to map some state in the USA. You can set its most south-eastern point as 0,0 and map all other spatial coordinates relative to this point. On the other hand, you may want to map some spatial data that spans the whole map. In any case you must choose some point as 0,0. In addition, you must select some sort of measurement unit: miles, kilometers, degrees, or some other magical unit that suits you better. Over the years a lot of such systems were developed. Each has its own zero point, its own coordinates, and its own rules about whether the earth is flat or not. The SRID (or SRS) is the id of such a system. Using this id you can map a point expressed in one system to another system, although sometimes it involves some pretty complex math.
And about SRID 4326: it is also called the "WGS 84" system (http://en.wikipedia.org/wiki/World_Geodetic_System). It's the most common system for representing points on a spherical (not flat) earth. It expresses coordinates in degrees, and its x and y coordinates are usually called longitude and latitude.
The most widely used non-spherical (flat) earth projection is called UTM. You can read about it here: http://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system
Anyway, as long as you are not doing any spatial conversions from one system to another, you don't really need to care about the system that your data uses.
I have found this website: http://spatialreference.org/ref/epsg/4326/ quite helpful in understanding the SRID you intend to use. It provides a handy map, some bounding box information and other links.
For other SRIDs simply change the digits at the end of the URL to what you are after.
The distance returned depends on the "Spatial Reference Identifier (SRID)" you define for your geography types.
In the example below, the default SRID of 4326 is used; see the second argument of STGeomFromText. This means the distance returned is in meters. You can confirm this by querying the catalog view sys.spatial_reference_systems, i.e. select srs.unit_of_measure from sys.spatial_reference_systems as srs where srs.spatial_reference_id = 4326
As an alternative to STGeomFromText, you can use geography::Parse, which assumes a SRID of 4326, so you don't have to specify one explicitly.
When calculating the distance between two points, you must use the same SRID for both geography values, otherwise you get an error. Example:
DECLARE @address1 GEOGRAPHY
DECLARE @address2 GEOGRAPHY
DECLARE @distance float
-- WKT coordinate order is longitude first, then latitude
SET @address1 = GEOGRAPHY::STGeomFromText('POINT(-2.991673 53.046908)', 4326)
SET @address2 = GEOGRAPHY::STGeomFromText('POINT(-0.126236 51.500152)', 4326)
SET @distance = @address1.STDistance(@address2)
SELECT @distance --this is the distance in meters

Resources