Subtracting a SQL Server geometry from another - sql-server

Is there a way to subtract one geometry from another? A kind of reverse STUnion.
The problem I am having is that I need to ensure a shape fits within another (without changing the larger shape). I thought I could use STIntersection to get the part of the shape that's "in". However, STIntersection is not accurate and produces a shape that can (and does) differ from the true intersection.
You can easily see this if you then take the STDifference against the original shape.
So, given two shapes, what I would like to do is subtract one from the other, e.g. take the STIntersection and then subtract the STDifference.
Any ideas?
Edit: For now, I have created my intersection from an STBuffer(-1) version of the bigger shape. This should account for the mathematical variation of STIntersection, at the cost of a slight reduction in accuracy. However, I would still love to know if you can subtract one geometry from another.

Just use .STDifference(). No need to intersect first, then subtract the intersection. Just subtract directly.

Did you try STWithin?


Should I use Halfcomplex2Real or Complex2Complex

Good morning, I'm trying to perform a 2D FFT as 2 1-Dimensional FFT.
The problem setup is the following:
There's a matrix of complex numbers generated by an inverse FFT on an array of real numbers, let's call it arr[-nx..+nx][-nz..+nz].
Now, since the original array was made up of real numbers, I exploit the symmetry and reduce my array to be arr[0..nx][-nz..+nz].
My problem starts here, with arr[0..nx][-nz..nz] provided.
Now I need to get back to the domain of real numbers.
The question is: what kind of transformation should I use in the two directions?
In x I use fftw_plan_r2r_1d( .., .., .., FFTW_HC2R, ..), the halfcomplex-to-real transformation, because in that direction I've exploited the symmetry, and that's OK I think.
But in the z direction I can't figure out whether I should use the same transformation or the complex-to-complex (C2C) transformation.
Which is the correct one, and why?
In case it is needed, the HC2R transformation is briefly described here, on page 11.
Thank you
"To easily retrieve a result comparable to that of fftw_plan_dft_r2c_2d(), you can chain a call to fftw_plan_dft_r2c_1d() and a call to the complex-to-complex DFT fftw_plan_many_dft(). The arguments howmany and istride can easily be tuned to match the pattern of the output of fftw_plan_dft_r2c_1d(). Contrary to fftw_plan_dft_r2c_1d(), the r2r_1d(...FFTW_R2HC...) transform stores the real and imaginary components of each frequency separately. A second FFTW_R2HC can be applied and would be comparable to fftw_plan_dft_r2c_2d(), but not exactly the same.
As stated on page 11 of the documentation that you judiciously linked,
'Half of these column transforms, however, are of imaginary parts, and should therefore be multiplied by I and combined with the r2hc transforms of the real columns to produce the 2d DFT amplitudes; ... Thus, ... we recommend using the ordinary r2c/c2r interface.'
Since you have an array of complex numbers, you can either use c2r transforms or unfold the real/imaginary parts and try to use HC2R transforms. The former option seems the most practical. Which one might solve your issue?"
-#Francis
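To illustrate why the direction without exploited symmetry needs a full complex transform, here is a minimal pure-Python sketch. It uses a naive DFT instead of FFTW, and it assumes the half spectrum is kept along x and that indices run 0..n-1 rather than -nz..+nz. The function names (dft, forward_half_spectrum, inverse_to_real) are made up for this illustration and are not FFTW calls.

```python
import cmath

def dft(seq, sign):
    """Naive unnormalised 1D DFT: sign=-1 is forward, sign=+1 is inverse."""
    n = len(seq)
    return [sum(v * cmath.exp(sign * 2j * cmath.pi * m * k / n)
                for k, v in enumerate(seq))
            for m in range(n)]

def forward_half_spectrum(field):
    """2D forward DFT of a real field[x][z], keeping only kx = 0..nx//2."""
    nx, nz = len(field), len(field[0])
    along_z = [dft(row, -1) for row in field]            # along_z[x][kz]
    along_x = [dft([along_z[x][kz] for x in range(nx)], -1)
               for kz in range(nz)]                      # along_x[kz][kx]
    return [[along_x[kz][kx] for kz in range(nz)]
            for kx in range(nx // 2 + 1)]

def inverse_to_real(half, nx):
    """Invert a half spectrum half[kx][kz] back to a real field[x][z]."""
    nz = len(half[0])
    # Step 1: z has no symmetry left, so each row needs a complex-to-complex inverse.
    g = [dft(row, +1) for row in half]
    out = [[0.0] * nz for _ in range(nx)]
    for z in range(nz):
        # Step 2: along x the missing half follows from Hermitian symmetry,
        # which is what a complex-to-real transform exploits.
        column = [g[kx][z] if kx <= nx // 2 else g[nx - kx][z].conjugate()
                  for kx in range(nx)]
        values = dft(column, +1)
        for x in range(nx):
            out[x][z] = values[x].real / (nx * nz)
    return out

# Hypothetical test data: a small real 4 x 3 field.
field = [[1.0, 2.0, 0.5], [-1.0, 0.25, 3.0], [4.0, -2.0, 1.5], [0.0, 1.0, -0.75]]
restored = inverse_to_real(forward_half_spectrum(field), len(field))
max_error = max(abs(a - b) for ra, rb in zip(field, restored) for a, b in zip(ra, rb))
print(max_error)
```

In this sketch the round trip only recovers the real field because the z direction is inverted with a full complex transform first, and the real-output step is applied last, along the direction whose symmetry was exploited.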

Multiple IF QUARTILEs returning wrong values

I am using a nested IF statement within a QUARTILE wrapper, and it only kind of works: it returns values that are slightly off from what I would expect when I calculate the quartile manually from the same range of values.
I've looked around, but most of the posts and research are about designing the formula. I haven't come across anything that explains the odd behaviour I'm observing.
My formula (entered with Ctrl+Shift+Enter, as it's an array formula): =QUARTILE(IF(((F2:$F$10=$W$4)*($Q$2:$Q$10=$W$3))*($E$2:$E$10=W$2),IF($O$2:$O$10<>"",$O$2:$O$10)),1)
The full dataset:
0.868997877*
0.99480118
0.867040346*
0.914032128*
0.988150438
0.981207615*
0.986629288
0.984750004*
0.988983643*
*The formula has 3 AND conditions that need to be met and should return range:
0.868997877
0.867040346
0.914032128
0.981207615
0.984750004
0.988983643
The 25th percentile is then calculated from that range.
If I take the output from the formula, the 25th percentile (QUARTILE with quart = 1) is 0.8803, but if I calculate it manually from the data points right above, it comes out to 0.8685, and I can't see why.
I suspect the IF statements are picking up a slightly different range, for example values from different rows.
If you look at the table here, you can see that there is more than one way of estimating a quartile (or other percentile) from a sample, and Excel has two. The one you are doing by hand must be like Quartile.exc, and the one you are using in the formula is like Quartile.inc.
Basically both formulas work out the rank of the quartile value. If it isn't an integer it interpolates (e.g. if it was 1.5, that means the quartile lies half way between the first and second numbers in ascending order). You might think that there wouldn't be much difference, but for small samples there is a massive difference:
Quartile.exc Rank=(N+1)/4
Quartile.inc Rank=(N+3)/4
Here's how it would look with your data
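As a rough check of the two rank formulas above, here is a small Python sketch applied to the six filtered values from the question. The helper name value_at_rank is made up for this illustration; it is not an Excel function.

```python
def value_at_rank(values, rank):
    """Interpolate the value at a 1-based, possibly fractional, rank."""
    ordered = sorted(values)
    lower = int(rank)          # whole part of the rank
    fraction = rank - lower    # how far to move towards the next value
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

# The six values that meet all three conditions in the question.
filtered = [0.868997877, 0.867040346, 0.914032128,
            0.981207615, 0.984750004, 0.988983643]
n = len(filtered)

q1_inc = value_at_rank(filtered, (n + 3) / 4)  # rank 2.25, like Quartile.inc
q1_exc = value_at_rank(filtered, (n + 1) / 4)  # rank 1.75, like Quartile.exc

print(round(q1_inc, 4))  # 0.8803
print(round(q1_exc, 4))  # 0.8685
```

With N = 6 the two ranks are 2.25 and 1.75, so the first interpolates between the second and third smallest values and the other between the first and second. That accounts for both numbers reported in the question.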

Does anyone know of potential problems with st_line_substring in postGIS?

Specifically, I'm getting a result that I do not understand. It is possible that my understanding is simply wrong, but I don't think so. So I'm hoping that someone will either say "yes, that's a known problem" or "no, it is working correctly and here is why your understanding is wrong".
Here is my example.
To start I have the following geometry of lat/longs.
LINESTRING(-1.32007599 51.06707497,-1.31192207 51.09430508,-1.30926132 51.10206677,-1.30376816 51.11133597,-1.29261017 51.12981493,-1.27510071 51.15906713,-1.27057314 51.16440941,-1.26606703 51.16897072,-1.26235485 51.17439257,-1.26089573 51.17875111,-1.26044512 51.1833917,-1.25793457 51.19727033,-1.25669003 51.20141159,-1.25347137 51.20630532,-1.24845028 51.21110444,-1.23325825 51.22457158,-1.2274003 51.22821321,-1.22038364 51.23103494,-1.20326042 51.23596583,-1.1776185 51.24346193,-1.16356373 51.24968088,-1.13167763 51.26363353,-1.12247229 51.2659966,-1.11629248 51.26682901,-1.10906124 51.26728549,-1.09052181 51.26823871,-1.08522177 51.26885628,-1.07013702 51.27070895,-1.03683472 51.27350122,-1.00917578 51.27572955,-0.98243952 51.2779175,-0.9509182 51.28095094,-0.9267354 51.28305811,-0.90499878 51.28511151,-0.86051702 51.2883055,-0.83661318 51.29023789,-0.7534647 51.29708113,-0.74908733 51.29795323,-0.7400322 51.2988924,-0.71535587 51.30125366,-0.68475723 51.29863749,-0.65746307 51.30220618,-0.63246489 51.30380261,-0.60542822 51.30645873,-0.58150291 51.3103219,-0.57603121 51.31150225,-0.57062387 51.31317883,-0.54195642 51.32475227,-0.4855442 51.34771616,-0.4553318 51.36283147)
This is in a column called "geom" in my table, called "fibre_lines". When I run the following query,
select st_length(geography(geom), false) as full_length,
st_length(geography(st_line_substring(geom, 0, 1)), false) as full_length_2,
st_length(geography(st_line_substring(geom, 0, 0.5)), false) as first_half,
st_length(geography(st_line_substring(geom, 0.5, 1)), false) as second_half
from fibre_lines
where id = 10;
I get the following result...
76399.4939375278 76399.4939375278 41008.9667229201 35390.5272197668
The first two make sense to me, they are simply the length of my line assuming a spherical earth. The first is just using the obvious function while the second is using st_line_substring to get the length of the entire line. These two values agree.
But the last two have me puzzled. I am asking for the length of the first half of the line, then I'm asking for the length of the last half. My expectation was that these would be equal or nearly equal. Instead the first half is about 6km longer than the second half.
If you plot the geometry on the map you will see that the first third of the line is fairly north/south oriented and the remaining two thirds are more east/west. I wouldn't have thought that would make a difference when asking for the length on a spherical earth, but I am happy to be told that I'm wrong (so long as it is also explained why I'm wrong).
For reference the PostGIS I am using is 1.5.8. If this is a bug, upgrading to a newer version is possible, but not trivial, so I would prefer to only do that if it is necessary.
Anyone have ideas?
While Arunas' comments didn't directly answer my question, they did lead me to some research that I think identifies the problem. I'm posting it here partly to get it straight in my own mind and partly in case others are wondering.
It seems the key is the PostGIS distinction between a "geometry" and a "geography". A geometry is a 2D planar geometry that is typically in UTMs and used with a projection of the globe onto a flat surface (which projection is configurable). A geography, on the other hand, is designed to store latitude/longitude information specifically and is used to work either on a sphere or a spheroid. So the essential problem I have is twofold:
1. Perhaps not obvious from my original post is that I am using a geometry object to store lat/long information rather than UTMs. I cast that to a geography most of the time so that I get the correct answers, but it would be more correct if I actually stored it as a geography object. That would eliminate the need for a number of the casts in my code as well as allow PostGIS to tell me when I am doing something wrong.
2. While ST_Length will work with either a geometry or a geography, ST_Line_Substring only works with geometries. Hence when I ask it for the halfway point, I am asking it for the halfway point of a flat geometry. This will give me the correct answer for the latitude coordinate, but for the longitude it will have an error term that increases (for most projections) the farther I am from the equator.
I've looked into newer versions of PostGIS and they don't seem to have an ST_Line_Substring or anything similar that will give me the 50% point of a geography, so I will have to do it the "hard" way by using ST_Length to give me all my segment lengths and then adding them up and doing the math needed for my interpolation.
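The effect, and the "hard" way of interpolating by segment lengths, can be sketched in Python. This is a minimal illustration and not PostGIS code: it assumes a spherical earth with a mean radius of 6371008.8 m, interpolates linearly in lon/lat within a segment, and uses a made-up three-point line (one north/south leg, one east/west leg) rather than the line from the question. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean earth radius for this sketch

def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def planar_degrees(p, q):
    """Flat distance in degrees, which is what a lon/lat geometry measures."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def point_at_fraction(line, fraction, seg_length):
    """Walk the segments and interpolate the point at the given fraction of the length."""
    segments = list(zip(line, line[1:]))
    lengths = [seg_length(a, b) for a, b in segments]
    target = fraction * sum(lengths)
    for (a, b), length in zip(segments, lengths):
        if length > 0 and target <= length:
            t = target / length
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= length
    return line[-1]

# Hypothetical line: 1 degree north, then 1 degree east at latitude 52.
line = [(0.0, 51.0), (0.0, 52.0), (1.0, 52.0)]

planar_mid = point_at_fraction(line, 0.5, planar_degrees)
sphere_mid = point_at_fraction(line, 0.5, haversine_m)

# Lengths on the sphere of the two "halves" produced by the planar split.
first_half = haversine_m(line[0], planar_mid)
second_half = haversine_m(planar_mid, line[2])
print(planar_mid, sphere_mid, first_half, second_half)
```

In this made-up case the planar midpoint sits at the corner, so the north/south "half" is much longer on the sphere than the east/west "half", which is the same kind of imbalance as in the query results above. The midpoint computed from great-circle segment lengths falls partway along the north/south leg instead.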
Sorry, I can't add comments, so I will provide this as an answer.
I experienced the same problem and resolved it by transforming my lat-lon geometries to UTM geometries inside the st_line_substring function call. Then I was getting sub-geometries with the proper length. Of course, I had to transform them back to lat-lon afterward.

object / shape / piece fitting

I've been thinking for a few days about the best solution for this but can't seem to get the right idea on how to do this.
I have some pieces (objects) and I want to fit them in the smallest possible space.
What I'm ultimately looking for is something like this
http://i.stack.imgur.com/Yg09E.gif
But a simpler version that just calculates the best possible fit of two lines (stripes), like the ones on the right, would already do for now
http://i.stack.imgur.com/HijMo.jpg
What I have is 2 arrays of points (vertices) on an xy axis representing two lines (stripes), and I'd like to arrange them in such a manner that there is 10 or 20 mm of space between the closest points of the two.
I was thinking of looking at the first half of the array and finding the highest point, then looking at the second half and finding its highest point, and then comparing the two, but that doesn't really seem to be a proper solution.
And I can't really imagine writing a program that fits shapes as in the first image is even possible using such methods.
Can anyone guide me in the right direction?
Well, this is really possible.
All you would have to do is build an area function and a distance function. You might need to add different algorithms for different kinds of shapes.
For the ones you have provided in the first picture, it is difficult to calculate the area, so you will probably have to specify the distance between vertices instead. You also need to add a condition to make sure that the loci of the shapes do not coincide at any point.
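As a starting point for the area and distance functions mentioned above, here is a minimal Python sketch. It assumes each shape is a list of (x, y) vertices in millimetres, and it only measures vertex-to-vertex distance, which can overestimate the true gap when an edge of one shape passes close to the other. The function names are made up for this illustration.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    points = list(vertices)
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def min_vertex_distance(a, b):
    """Smallest distance between any vertex of a and any vertex of b."""
    return min(math.dist(p, q) for p in a for q in b)

def has_clearance(a, b, gap_mm):
    """True when the two vertex sets are at least gap_mm apart."""
    return min_vertex_distance(a, b) >= gap_mm

# Hypothetical data: two short stripes.
stripe_a = [(0, 0), (1, 0)]
stripe_b = [(0, 10), (1, 13)]
print(min_vertex_distance(stripe_a, stripe_b))  # 10.0
print(has_clearance(stripe_a, stripe_b, 10))    # True
```

A placement routine could then move one stripe and re-check has_clearance until the gap is as close as possible to the required 10 or 20 mm.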

KD-Trees and missing values (vector comparison)

I have a system that stores vectors and allows a user to find the n most similar vectors to the user's query vector. That is, a user submits a vector (I call it a query vector) and my system spits out "here are the n most similar vectors." I generate the similar vectors using a KD-Tree and everything works well, but I want to do more. I want to present a list of the n most similar vectors even if the user doesn't submit a complete vector (a vector with missing values). That is, if a user submits a vector with three dimensions, I still want to find the n nearest vectors (stored vectors are of 11 dimensions) I have stored.
I have a couple of obvious solutions, but I'm not sure either one seems very good:
1. Create multiple KD-Trees, each built using one of the most popular subsets of dimensions a user will search for. That is, if a user submits a query vector of three dimensions, x, y, z, I match that query to my already built KD-Tree which only contains vectors of three dimensions, x, y, z.
2. Ignore KD-Trees when a user submits a query vector with missing values and compare the query vector to the vectors (stored in a table in a DB) one by one using something like a dot product.
This has to be a common problem, any suggestions? Thanks for the help.
Your first solution might be fastest for queries (since the tree-building doesn't consider splits in directions that you don't care about), but it would definitely use a lot of memory. And if you have to rebuild the trees repeatedly, it could get slow.
The second option looks very slow unless you only have a few points. And if that's the case, you probably didn't need a kd-tree in the first place :)
I think the best solution involves getting your hands dirty in the code that you're working with. Presumably the nearest-neighbor search computes the distance between the point in the tree leaf and the query vector; you should be able to modify this to handle the case where the point and the query vector are different sizes. E.g. if the points in the tree are given in 3D, but your query vector is only length 2, then the "distance" between the point (p0, p1, p2) and the query vector (x0, x1) would be
sqrt( (p0-x0)^2 + (p1-x1)^2 )
I didn't dig into the Java code that you linked to, but I can try to find exactly where the change would need to go if you need help.
-Chris
PS - you might not need the sqrt in the equation above, since distance squared is usually equivalent.
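The reduced distance described above can be sketched in Python. This is an illustration only, not the Java library's Checker interface: the function names are made up, and a missing dimension is assumed to be represented either by a shorter query vector or by None.

```python
def partial_sq_distance(point, query):
    """Squared Euclidean distance over only the dimensions the query supplies."""
    # zip stops at the shorter vector, and None marks a missing value.
    return sum((p - q) ** 2 for p, q in zip(point, query) if q is not None)

def nearest_bruteforce(points, query, n):
    """The n stored points closest to the query under the partial distance."""
    return sorted(points, key=lambda p: partial_sq_distance(p, query))[:n]

print(partial_sq_distance((1, 2, 3), (4, 6)))        # 25
print(partial_sq_distance((1, 2, 3), (4, None, 7)))  # 25
```

The squared distance is enough for ranking neighbours, which is why the sqrt is left out here.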
EDIT
Sorry, didn't realize it would be so obvious in the source code. You should use this version of the neighbor function:
nearest(double [] key, int n, Checker<T> checker)
And implement your own Checker class; see their EuclideanDistance.java to see the Euclidean version. You may also need to comment out any KeySizeException that the query code throws, since you know that you can handle differently sized keys.
Your second option looks like a reasonable solution for what you want.
You could also populate the missing dimensions with the most important( or average or whatever you think it should be) values if there are any.
You could try using the existing KD tree -- by taking both branches when the split is for a dimension that is not supplied by the source vector. This should take less time than doing a brute force search, and might be less trouble than trying to maintain a bunch of specialized trees for dimension subsets.
You would need to adapt your N-closest algorithm (without more info I can't advise you on that...), and for distance you would use the sum of the squares of only those elements supplied by the source vector.
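The "take both branches" idea can be sketched with a small self-contained KD-tree in Python. This is a minimal illustration under my own assumptions, not the library used in the question: missing query dimensions are marked with None, the tree is a nested tuple, and all names are hypothetical.

```python
import heapq

def partial_sq_distance(point, query):
    """Sum of squared differences over the dimensions the query supplies."""
    return sum((p - q) ** 2 for p, q in zip(point, query) if q is not None)

def build(points, depth=0):
    """Build a KD-tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (ordered[mid], axis,
            build(ordered[:mid], depth + 1),
            build(ordered[mid + 1:], depth + 1))

def _search(node, query, n, heap):
    if node is None:
        return
    point, axis, left, right = node
    d = partial_sq_distance(point, query)
    # heap is a max-heap (negated distances) holding the n best points so far.
    if len(heap) < n:
        heapq.heappush(heap, (-d, point))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, point))
    q = query[axis]
    if q is None:
        # The split dimension is missing from the query: neither side can be ruled out.
        _search(left, query, n, heap)
        _search(right, query, n, heap)
        return
    diff = q - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    _search(near, query, n, heap)
    # Visit the far side only if the splitting plane is closer than the current worst match.
    if len(heap) < n or diff * diff < -heap[0][0]:
        _search(far, query, n, heap)

def nearest(tree, query, n):
    """Return the n closest stored points, nearest first."""
    heap = []
    _search(tree, query, n, heap)
    return [point for _, point in sorted(heap, reverse=True)]

# Hypothetical data: 3D stored vectors, query with the middle dimension missing.
stored = [(x * 7 % 11, x * 5 % 13, x * 3 % 17) for x in range(40)]
tree = build(stored)
print(nearest(tree, (4, None, 9), 5))
```

When many dimensions are missing, most splits fall back to visiting both branches, so the search degrades towards a full scan. It still gives the same neighbours as a brute-force comparison under the partial distance.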
Here's what I ended up doing: when a user didn't specify a value (when their query vector lacked a dimension), I simply adjusted my matching range (in the API) to something huge so that I match any value.
