How to use JaguarDB to store geometric objects

I see that Jaguar Database 3.0 introduces geometric functions as a new feature. I am wondering how they work. What can I do with these objects?

It is possible to store geometric objects (2D and 3D) and apply some basic utility functions to them afterwards.
You can calculate area, angle, dimension, etc.
You can compute unions, intersections, etc.
You can apply rotations or scalings.
More details are available in the user manual.

Related

What is the most efficient way of changing Cartesian topology to one that supports MKL Cluster FFT?

I have a 3-dimensional array of size = [Nx, Ny, Nz] currently distributed among nprocs = nprocs_y * nprocs_z processes as subarrays of local_size = [Nx, Ny/nprocs_y, Nz/nprocs_z] with the data stored in column-major (Fortran) order.
I wish to Fourier transform this data concurrently. However, according to Intel's documentation on MKL Cluster FFT, the distribution of data has to be such that local_size_new = [Nx, Ny, Nz/nprocs]. The documentation does not seem to suggest that the cluster FFT technology can work with arbitrary topologies.
This forces me to attempt a redistribution of data according to the topology supported by the cluster FFT functions provided by Intel. Could you please suggest some ideas as to how this could be done most efficiently? Thank you.
The order of FFT dimensions is the same as the order of array dimensions in the programming language. For example, a 3-dimensional FFT with Lengths=(m,n,l) can be computed over an array Ar[m][n][l].
You can redistribute the data across the processes as your task requires; a sketch of one way to do so follows below. See the link below for details on distributing data among processes.
https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/fourier-transform-functions/cluster-fft-functions/distributing-data-among-processes.html
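One way to do this redistribution is a pencil-to-slab transpose: the nprocs_y ranks that share the same z-range exchange blocks via MPI_Alltoall, after which each rank holds all of Ny but only Nz/nprocs planes in z. Below is a minimal C++/MPI sketch of that idea; the function name, the divisibility assumptions, and the rank ordering are mine, not from the MKL documentation.

// Sketch: redistribute [Nx, Ny/Py, Nz/Pz] pencils into the [Nx, Ny, Nz/(Py*Pz)]
// slabs that MKL cluster FFT expects. Assumes column-major storage (i fastest),
// Ny % Py == 0, Nz % (Py*Pz) == 0, and that the Py ranks sharing a z-range have
// consecutive ranks in `comm` (adjust the MPI_Comm_split color to your topology).
#include <mpi.h>
#include <complex>
#include <vector>

void pencil_to_slab(const std::vector<std::complex<double>>& in,
                    std::vector<std::complex<double>>& out,
                    int Nx, int Ny, int Nz, int Py, int Pz, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    // Ranks with the same color share one z-range and exchange among themselves.
    MPI_Comm row;
    MPI_Comm_split(comm, rank / Py, rank, &row);

    const int ly    = Ny / Py;         // local y-extent before the transpose
    const int lz    = Nz / Pz;         // local z-extent before the transpose
    const int lz2   = lz / Py;         // local z-extent after the transpose
    const int chunk = Nx * ly * lz2;   // elements exchanged with each peer

    // Pack: peer p gets our y-block restricted to its z-sub-range.
    std::vector<std::complex<double>> sendbuf(in.size()), recvbuf(in.size());
    for (int p = 0; p < Py; ++p)
        for (int k = 0; k < lz2; ++k)
            for (int j = 0; j < ly; ++j)
                for (int i = 0; i < Nx; ++i)
                    sendbuf[p * chunk + (k * ly + j) * Nx + i] =
                        in[((p * lz2 + k) * ly + j) * Nx + i];

    MPI_Alltoall(sendbuf.data(), chunk, MPI_CXX_DOUBLE_COMPLEX,
                 recvbuf.data(), chunk, MPI_CXX_DOUBLE_COMPLEX, row);

    // Unpack: the block from peer p carries y-indices [p*ly, (p+1)*ly).
    out.resize(static_cast<size_t>(Nx) * Ny * lz2);
    for (int p = 0; p < Py; ++p)
        for (int k = 0; k < lz2; ++k)
            for (int j = 0; j < ly; ++j)
                for (int i = 0; i < Nx; ++i)
                    out[(k * Ny + (p * ly + j)) * Nx + i] =
                        recvbuf[p * chunk + (k * ly + j) * Nx + i];

    MPI_Comm_free(&row);
}

If you would rather not hand-roll the packing, libraries built around 2D (pencil) decompositions, such as 2DECOMP&FFT or P3DFFT, implement this kind of global transpose for you.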

Defining Unit Vectors in Spherical Coordinates for use with Eigen3

I'm posting here because I'm at a bit of a loss.
I'm trying to implement a solution to Maxwell's equations (p. 47, eq. 2-2), which is given in spherical coordinates, in C++ so it may be used in a larger modeling project. I'm using Eigen3 as a base for linear algebra, which as far as I can find doesn't explicitly support spherical coordinates (I'm open to alternatives).
To implement the solution I need (or at least I think I need) to define the spherical unit vectors in spherical coordinates; however, they're not constants as in Cartesian coordinates, and I don't understand how to do this.
I'm hesitant to convert the solution to Cartesian coordinates, as I don't think I understand the implications of doing so (is it even valid?).
Any and all input and advice is appreciated.
The solution, which seems obvious now that I have found it, is to implement the spherical unit vector identities as three functions (one for each unit vector) that take r, theta, and phi as arguments and return a vector.
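For example, with the physics convention (theta is the polar angle from +z, phi the azimuth from +x), a minimal sketch of those three functions using Eigen3 might look as follows. Note that the unit vectors depend only on the two angles, not on r, and the function names here are just illustrative:

#include <Eigen/Dense>
#include <cmath>

// Spherical unit vectors expressed in Cartesian components (physics
// convention: theta = polar angle from +z, phi = azimuth from +x).
Eigen::Vector3d r_hat(double theta, double phi) {
    return Eigen::Vector3d(std::sin(theta) * std::cos(phi),
                           std::sin(theta) * std::sin(phi),
                           std::cos(theta));
}

Eigen::Vector3d theta_hat(double theta, double phi) {
    return Eigen::Vector3d(std::cos(theta) * std::cos(phi),
                           std::cos(theta) * std::sin(phi),
                           -std::sin(theta));
}

Eigen::Vector3d phi_hat(double phi) {
    return Eigen::Vector3d(-std::sin(phi), std::cos(phi), 0.0);
}

// A field given in spherical components (Er, Etheta, Ephi) at (theta, phi)
// can then be assembled as an ordinary Cartesian vector:
Eigen::Vector3d to_cartesian(double Er, double Etheta, double Ephi,
                             double theta, double phi) {
    return Er * r_hat(theta, phi)
         + Etheta * theta_hat(theta, phi)
         + Ephi * phi_hat(phi);
}

This also bears on the conversion question: returning the unit vectors in Cartesian components like this is perfectly valid; the spherical basis simply varies from point to point, and these functions capture exactly that variation.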

Difference between geodist() and dist() for Geo-Spatial Search

What is the difference between geodist(sfield,x,y) and dist(2,x,y,a,b) in Apache Solr for geo-spatial searches?
dist(2,x,y,0,0) calculates the Euclidean distance between (0,0) and (x,y) for each document; in general, dist() returns the distance between two vectors (points) in an n-dimensional space.
I was previously using the geodist() function for geo-spatial searches on my website, but its response time was large, so I did a proof of concept (POC) with different distance functions and found that dist(2,x,y,0,0) takes roughly half the time. I want to know the reason behind this, and the algorithms both functions use to calculate the distance.
I have to make a comparison matrix of the differences to pass this along.
The main difference is that geodist() is intended to work with spatial field types.
Most spatial implementations are based on Lucene's Points API, which is a BKD index. This field type is strictly limited to coordinates in lat/lon decimal degrees. Behind the scenes, latitude and longitude are indexed as separate numbers. Four main field types are available for spatial search:
LatLonPointSpatialField
LatLonType (now deprecated) and its non-geodetic twin PointType
SpatialRecursivePrefixTreeFieldType (RPT for short), including RptWithGeometrySpatialField, a derivative
BBoxField (for areas, 4 instances of another field type referred to by numberType)
In geodist(sfield, x, y), sfield is a spatial field that stores a point as a (lat,lon) pair, so the direct equivalent using dist() would be dist(2, sfieldX, sfieldY, x, y), with sfieldX and sfieldY being respectively the (lat,lon) coordinates of sfield.
With dist(power, a, b, ...) you can't query a spatial field type. To perform the same spatial search, you would have to specify each dimension of every point separately. That requires 2 indexed fields (or at least 2 values per field) for 2 dimensions, 3 for 3D, and so on. It makes a big difference because you would have to index every coordinate of each point separately.
Besides, you can also use geodist() as-is with the BBoxField field type, which indexes a single rectangle per document field and supports searching via a bounding box. To do the same with dist() you would have to compute the center point of the box and pass each of its coordinates as a function argument, so it would be too much hassle to get the same result if you want to use an area as a parameter.
Lastly, LatLonPointSpatialField, for example, does distance calculations based on the haversine formula (great circle); BBoxField does it a little faster because the rectangular shape is faster to compute. It's true that dist() may be even faster, but remember that it requires more fields to be indexed and a lot of preprocessing at query time to yield the same calculated distance, and, as mentioned by Mats, it wouldn't take the earth's curvature into account.
A Euclidean distance doesn't account for the curvature of the earth. If you're only sorting by distance the behavior can be OK, but only if your hits are within a small geographical area (the value of a unit relative to meters changes greatly as you get closer to the poles).
There's an extensive and good answer explaining the difference between a Euclidean distance and a proper geographical distance (usually calculated using the haversine formula) available on the GIS Stack Exchange:
Although at small scales any smooth surface looks like a plane, the accuracy of the Pythagorean formula depends on the coordinates used. When those coordinates are latitude and longitude on a sphere (or ellipsoid), we can expect that
Distances along lines of longitude will be reasonably accurate.
Distances along the Equator will be reasonably accurate.
All other distances will be erroneous, in rough proportion to the differences in latitude and longitude.
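For reference, the two calculations being compared in this thread are, in essence, a plane (Pythagorean) distance versus the haversine great-circle distance:

$$d_{\text{euclid}} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

$$d_{\text{haversine}} = 2R \arcsin\!\sqrt{\sin^2\frac{\varphi_2-\varphi_1}{2} + \cos\varphi_1\cos\varphi_2\,\sin^2\frac{\lambda_2-\lambda_1}{2}}$$

where $\varphi$ is latitude, $\lambda$ is longitude (both in radians) and $R$ is the earth's radius. The first formula needs only a few multiplications and one square root, while the second involves several trigonometric calls per document, which is consistent with dist() responding in roughly half the time of geodist(), at the cost of the accuracy caveats quoted above.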

Feature extraction of 3D image dataset

Assume a workflow for 2D image feature extraction using the SIFT, SURF, or MSER methods, followed by bag-of-words/bag-of-features encoding that is subsequently used to train classifiers.
I was wondering if there is an analogous approach for 3D datasets, for example, a 3D volume of MRI data. When dealing with 2D images, each image represents an entity with features to be detected and indexed. However, in a 3D dataset is it possible to extract features from the three-dimensional entity? Does this have to be done slice-by-slice, by decomposing the 3D images to multiple 2D images (slices)? Or is there a way of reducing the 3D dimensionality to 2D while retaining the 3D information?
Any pointers would be greatly appreciated.
You can perform feature extraction by passing your 3D volumes through a pre-trained 3D convolutional neural network. Because pre-trained 3D CNNs are hard to find, you could consider training your own on a similar, but distinct, dataset.
Here is a link to code for a 3D CNN in Lasagne. The authors use 3D CNN versions of VGG and ResNet.
Alternatively, you can perform 2D feature extraction on each slice of the volume and then combine the features for each slice, using PCA to reduce the dimensionality to something reasonable. For this, I recommend using an ImageNet pre-trained ResNet-50 or VGG.
In Keras, these can be found here.
Assume a grey-scale 2D image, which can mathematically be described as a matrix. Generalizing the concept of a matrix leads to the theory of tensors (informally, you can think of a tensor as a multidimensional array). For example, an RGB 2D image is represented as a tensor of size [width, height, 3], and an RGB 3D image as a tensor of size [width, height, depth, 3]. Moreover, as with matrices, you can perform tensor-tensor multiplications.
For instance, consider a typical neural network with 2D images as input. Such a network does essentially nothing but matrix-matrix multiplications (apart from the elementwise non-linear operations at the nodes). In the same way, a neural network operates on tensors by performing tensor-tensor multiplications.
Now back to your question of feature extraction: indeed, the problem with tensors is their high dimensionality. Hence, much modern research concerns the efficient decomposition of tensors while retaining the initial (most meaningful) information. To extract features from tensors, a tensor decomposition approach might be a good start for reducing the rank of the tensor; a concrete sketch follows the paper list below. A few papers on tensors in machine learning are:
Tensor Decompositions for Learning Latent Variable Models
Supervised Learning With Quantum-Inspired Tensor Networks
Optimal Feature Extraction and Classification of Tensors via Matrix Product State Decomposition
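As a concrete sketch of the decomposition idea discussed above (this is the classical CP/PARAFAC model, one of several decompositions treated in these papers): a third-order tensor is approximated by a sum of $R$ rank-one tensors,

$$\mathcal{X} \;\approx\; \sum_{r=1}^{R} \lambda_r \, a_r \circ b_r \circ c_r, \qquad \mathcal{X} \in \mathbb{R}^{I \times J \times K},\ a_r \in \mathbb{R}^{I},\ b_r \in \mathbb{R}^{J},\ c_r \in \mathbb{R}^{K},$$

where $\circ$ denotes the outer product. For small $R$, the weights $\lambda_r$ and the factor vectors give a compact, fixed-length description of the 3D volume that can be fed to a classifier in place of the raw voxels.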
Hope this helps, even though the math behind it is not easy.

SRID meaning in PostGIS

I would like to find out the pragmatic meaning of the SRID (spatial reference ID) in PostGIS.
I really do not understand what it is for. Can anyone throw some light on the matter?
For instance, I noticed that the PostGIS function ST_GeomFromText(text WKT, integer srid) accepts such an (optional) parameter as its second argument. Why would I need to pass it to get PostGIS to turn the text representation into a binary one? What value does it add?
Thanks
Spatial reference ID refers to the spatial reference system being employed -- this is important when going from a geographic view of the world to a projected view of the world, i.e., what you see when you look at a two-dimensional paper map.
Spatial reference systems contain a couple of elements.
Firstly, the geoid is a model of the shape of the earth -- the earth is not a sphere (sh, don't tell Google), it is in fact an oblate spheroid. The geoid shape used for GPS is known as WGS84, which is a model that works fairly well globally. National mapping agencies use other geoids that might be a better fit to local geographies.
Secondly, the projection type. This is essentially the mathematical model used to go from a 3D to a 2D representation of the world. Types include Mercator, Transverse Mercator (both cylindrical), Azimuthal, Conic, etc. All of these trade off between accurately measuring distance, area, or direction -- you can't preserve all three.
So, essentially when you declare a SRID in Postgis you are saying use this geoid and this projection model. Under the hood, Postgis uses a library called Proj.4, and based on the SRID information, it can convert from one coordinate system to another.
So, for example, to convert from lat/lon, which is known as 4326 in SRID terms, to 900913, which is the spherical Mercator used by Google/Bing maps and other web mapping frameworks, you could run something like:
select st_astext(st_transform(st_setsrid(st_makepoint(-.5,52),4326),900913)); -- tag the (lon,lat) point as WGS84 (4326), reproject to spherical Mercator (900913), print as WKT
This is an example of a query I use. It uses the Lambert azimuthal equal-area projection (ETRS89-LAEA, srid = 3035).
ST_GeomFromText('POINT(2843711.1098048678 2279498.6551480694)', 3035);
If you don't pass the SRID, PostGIS will not know which spatial reference system to use.
