I have a data table with destinations and LAT/LON data (~100K records)
DESTINATIONS {
id,
lat,
lon,
...
}
Now I need to insert distances into a new table...
DISTANCES {
id_a,
id_b,
distance
}
What's the best way to do that?
I don't need all data (cartesian product), only the 100 closest.
No duplicates (id_a:id_b == id_b:id_a), e.g. [NYC:Chicago] == [Chicago:NYC] (same distance).
No self-pairs (id_a != id_b), because it is 0 miles from [NYC:NYC] ;)
This is the calculation (result in meters):
ROUND(111045
* DEGREES(ACOS(COS(RADIANS(A.lat))
* COS(RADIANS(B.lat))
* COS(RADIANS(A.lon) - RADIANS(B.lon))
+ SIN(RADIANS(A.lat))
* SIN(RADIANS(B.lat)))),0)
AS 'distance'
Okay, the JOIN is no problem, but how can I implement the three "filters"?
Maybe with a WHILE loop and SUBSELECT LIMIT/TOP 100 ORDER BY distance ASC?
Or is it also possible to INSERT by JOIN?
Does somebody have an idea?
Pseudocode:
INSERT INTO [newTable] (ColumnList...)
SELECT TOP 100 a.id, b.id, DistanceFormula(a.id, b.id)
FROM Destination a
CROSS JOIN Destination b
WHERE a.id<b.id
ORDER BY DistanceFormula(a.id, b.id) ASC
EDIT to get 100 b for every a:
INSERT INTO [newTable] (ColumnList...)
SELECT a.id, b.id, DistanceFormula(a.id, b.id)
FROM Destination a
INNER JOIN Destination b
ON b.id IN (
SELECT TOP 100 c.id
FROM Destination c
WHERE a.id<c.id
ORDER BY DistanceFormula(a.id, c.id) ASC
)
I've simplified it (distcalc)...
INSERT INTO [DISTANCES] (id_a, id_b, distance)
SELECT
A.id,
B.id,
25 /*ROUND(111045 * DEGREES(ACOS(COS(RADIANS(A.geo_lat)) * COS(RADIANS(B.geo_lat)) * COS(RADIANS(A.geo_lon) - RADIANS(B.geo_lon)) + SIN(RADIANS(A.geo_lat)) * SIN(RADIANS(B.geo_lat)))),0)*/
FROM [DESTINATIONS] AS A
INNER JOIN [DESTINATIONS] AS B
ON b.id IN(
SELECT TOP 100
C.id
FROM [DESTINATIONS] AS C
WHERE
A.id < C.id
ORDER BY A.id /*ROUND(111045 * DEGREES(ACOS(COS(RADIANS(A.geo_lat)) * COS(RADIANS(C.geo_lat)) * COS(RADIANS(A.geo_lon) - RADIANS(C.geo_lon)) + SIN(RADIANS(A.geo_lat)) * SIN(RADIANS(C.geo_lat)))),0)*/ ASC
)
You mean like this?
Okay. That works. :)
But it is definitely too slow!
I'll program a routine that returns only the 100 nearest results on request.
And another (sub)routine will insert/update these results (on the program side) with a timestamp into the distances table, so that any existing results can be accessed on the next call.
But thank you very very much! :)
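A minimal sketch of what such an on-request routine could look like on the program side, assuming the destinations are already loaded into a dict of id -> (lat, lon). The names `distance_m` and `nearest` are made up for illustration. The distance mirrors the SQL expression above (with the ACOS argument clamped, since rounding can push it just outside [-1, 1]), and each pair is returned with the smaller id first so [A:B] and [B:A] end up as the same row.

```python
import heapq
import math

METERS_PER_DEGREE = 111045  # same constant as in the SQL expression


def distance_m(lat_a, lon_a, lat_b, lon_b):
    """Spherical law of cosines, mirroring the SQL formula; result in meters."""
    cos_angle = (
        math.cos(math.radians(lat_a)) * math.cos(math.radians(lat_b))
        * math.cos(math.radians(lon_a) - math.radians(lon_b))
        + math.sin(math.radians(lat_a)) * math.sin(math.radians(lat_b))
    )
    # Floating-point rounding can give 1.0000000000000002; clamp before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return round(METERS_PER_DEGREE * math.degrees(math.acos(cos_angle)))


def nearest(destinations, origin_id, limit=100):
    """Return up to `limit` (id_a, id_b, distance) rows closest to origin_id.

    `destinations` maps id -> (lat, lon) (hypothetical in-memory layout).
    """
    lat_a, lon_a = destinations[origin_id]
    candidates = (
        (distance_m(lat_a, lon_a, lat_b, lon_b), other_id)
        for other_id, (lat_b, lon_b) in destinations.items()
        if other_id != origin_id  # skip the 0-distance self pair
    )
    return [
        # Smaller id first, so the pair is stored only one way round.
        (min(origin_id, other_id), max(origin_id, other_id), dist)
        for dist, other_id in heapq.nsmallest(limit, candidates)
    ]
```

Each call scans all destinations once, so it trades the huge up-front self-join for a per-request pass over ~100K rows; the returned tuples can then be written to DISTANCES with whatever timestamp column you add.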
I have three tables that I want to base some calculations on. However, the query below did not work.
Could someone give me any feedback?
Thank you,
Formula:
(running cost / total(gas_production of each year)) * gas_production of each year
as:
CTE c (id,filed,year_1,year_2,year_3)
as( select g.id, g.field,
(r.year_1/sum(g.year_1))*g.year_1 ,
(r.year_2/sum(g.year_2))*g.year_2 ,
(r.year_3/sum(g.year_3))*g.year_3 ,
from group_1 as g
inner join ref_fee as r
on r.id=g.id
group by g.field )
select c.id, c.filed,
c.year_1*b.year_1 as year_1,
c.year_2*b.year_2 as year_2,
c.year_3*b.year_3 as year_3
from c
inner join back b
on b.id=c.id
group by c.field;
"Did not work" is difficult to debug.
What is evident is that, in Oracle, table aliases can't use the AS keyword (column aliases can). With that fixed, the query looks like this:
WITH
c (id,
filed,
year_1,
year_2,
year_3)
AS
( SELECT g.id,
g.field,
(r.year_1 / SUM (g.year_1)) * g.year_1,
(r.year_2 / SUM (g.year_2)) * g.year_2,
(r.year_3 / SUM (g.year_3)) * g.year_3
FROM group_1 g INNER JOIN ref_fee r ON r.id = g.id
GROUP BY g.field)
SELECT c.id,
c.filed,
c.year_1 * b.year_1 AS year_1,
c.year_2 * b.year_2 AS year_2,
c.year_3 * b.year_3 AS year_3
FROM c INNER JOIN back b ON b.id = c.id
GROUP BY c.field;
I have no idea whether it will work, as I don't have your tables, nor do I know what "calculations" you're trying to perform.
I have a database with 3 tables; tblCustomers, tblBookings, tblFlights.
In the Bookings table I have the number of tickets sold for each flight, and in the Flight table the Capacity of each flight. I want to return the remaining number of tickets left for each flight.
I have tried subtracting the tickets from the capacity, but can't get the syntax right. I know I have created a JOIN, but it does not return the correct information.
I have tried:
SELECT *, (Capacity - Tickets)
from tblFlights, tblBookings
where (Capacity - Tickets)
You are actually cross joining the tables, but you should do an INNER or LEFT join based on the related columns of the tables, which I believe must have names like flight_id:
select *, (f.Capacity - b.Tickets) tickets_left
from tblFlights f inner join tblBookings b
on b.flight_id = f.flight_id
where (f.Capacity - b.Tickets) > 0
I kept the where clause because you use it in your code.
If the relation of tblFlights and tblBookings is not 1:1 then you also need aggregation:
select f.*, (f.Capacity - coalesce(b.Tickets, 0)) tickets_left
from tblFlights f left join (
select flight_id, sum(Tickets) Tickets
from tblBookings
group by flight_id
) b on b.flight_id = f.flight_id
Your syntax should be something like this:
SELECT *, (Capacity - Tickets) as Remaining
from tblFlights Tf, tblBookings Tb
where Tb.id = Tf.id and (Capacity - Tickets) > 0
You can also use a join statement:
SELECT *, (Capacity - Tickets) as Remaining
from tblFlights Tf
join tblBookings Tb on (Tf.id = Tb.id)
where (Capacity - Tickets) > 0
What you had initially creates a Cartesian product of the two tables (every flight paired with every booking).
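To see the difference on a small scale, here is a sketch using Python's built-in sqlite3 module. The `flight_id` and `booking_id` columns and the sample rows are assumptions, since the real table layout isn't shown in the question. It counts the rows the comma join produces and then computes the remaining tickets with bookings summed per flight first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: flight_id links bookings to flights.
    CREATE TABLE tblFlights (flight_id INTEGER PRIMARY KEY, Capacity INTEGER);
    CREATE TABLE tblBookings (
        booking_id INTEGER PRIMARY KEY,
        flight_id  INTEGER,
        Tickets    INTEGER
    );
    INSERT INTO tblFlights VALUES (1, 100), (2, 50), (3, 80);
    INSERT INTO tblBookings (flight_id, Tickets) VALUES (1, 30), (1, 20), (2, 50);
""")

# Comma join without a join condition: every flight paired with every booking.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM tblFlights, tblBookings"
).fetchone()[0]

# Sum the bookings per flight first, then LEFT JOIN so that flights
# without any bookings still show their full capacity.
remaining = conn.execute("""
    SELECT f.flight_id, f.Capacity - COALESCE(b.Tickets, 0) AS tickets_left
    FROM tblFlights f
    LEFT JOIN (
        SELECT flight_id, SUM(Tickets) AS Tickets
        FROM tblBookings
        GROUP BY flight_id
    ) b ON b.flight_id = f.flight_id
    ORDER BY f.flight_id
""").fetchall()

print(cross_rows)  # 9 rows from 3 flights x 3 bookings
print(remaining)   # [(1, 50), (2, 0), (3, 80)]
```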
I want to sum/subtract 'salevalue' from the two tables in my procedure. Sale 1 has receipts, the 2nd has returns. But I am out of ideas.
SELECT *
FROM #possale1
SELECT *
FROM #possale2
SELECT sum(salevalue) AS S1
FROM #possale1
SELECT sum(salevalue)*-1 AS S2
FROM #possale2
select sum(sum(a.salevalue)-sum(b.salevalue))
from #possale1 a
inner join #possale2 b on a.receiptdate=b.receiptdate
Without a join, the following should do:
select ((SELECT sum(salevalue) FROM #possale1) - (SELECT sum(salevalue) FROM #possale2)) as balance
Are you trying to do this?
SELECT SUM(ISNULL(a.salevalue,0) - ISNULL(b.salevalue,0))
FROM #possale1 a FULL OUTER JOIN #possale2 b on a.receiptdate=b.receiptdate
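One thing worth checking with any join-based version is whether a receiptdate can occur more than once in either table, because the join then repeats rows. A small sketch with Python's sqlite3 module shows the effect. The table names drop the # prefix (SQLite has no #temp tables) and the dates and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE possale1 (receiptdate TEXT, salevalue INTEGER);  -- receipts
    CREATE TABLE possale2 (receiptdate TEXT, salevalue INTEGER);  -- returns
    INSERT INTO possale1 VALUES
        ('2024-01-01', 100), ('2024-01-01', 50), ('2024-01-02', 70);
    INSERT INTO possale2 VALUES
        ('2024-01-01', 20), ('2024-01-03', 10);
""")

# Two independent totals: 220 receipts minus 30 returns.
balance = conn.execute("""
    SELECT COALESCE((SELECT SUM(salevalue) FROM possale1), 0)
         - COALESCE((SELECT SUM(salevalue) FROM possale2), 0)
""").fetchone()[0]

# Joining on the date applies the 20 return to both receipts of 2024-01-01
# and drops the dates that exist in only one table.
joined = conn.execute("""
    SELECT SUM(a.salevalue - b.salevalue)
    FROM possale1 a
    JOIN possale2 b ON a.receiptdate = b.receiptdate
""").fetchone()[0]

print(balance)  # 190
print(joined)   # 110
```

The COALESCE wrappers are only there so an empty table yields 0 instead of NULL.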
I have two identical tables, each with a different purpose.
Table A
ID,TypeofAsset, Amount
1,C,300
2,A,40
3,F,90
Table B
ID,TypeofAsset,amount
1,G,500
2,A,20
3,C,150
Result with Query (Table A id= 1 compare with Table B ID =3)
Col, Result
TypeofAsset, match -- (C)
Amount, 150 --(Absolute value of Amount difference)
Any help will be appreciated.
Thanks
You can do a JOIN on the TypeofAsset column, like this:
select t1.TypeofAsset,
case when t1.Amount > t2.Amount then t1.Amount - t2.Amount
else t2.Amount - t1.Amount end as diff_amount
from tablea t1 join tableb t2 on t1.TypeofAsset = t2.TypeofAsset;
You can also use the ABS() function, as suggested in the comments:
select t1.TypeofAsset,
ABS(t1.Amount - t2.Amount) as diff_amount
from tablea t1 join tableb t2 on t1.TypeofAsset = t2.TypeofAsset;
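As a quick check against the sample rows from the question, the ABS() version can be run through Python's sqlite3 module. This is only a sketch; the table names tablea/tableb follow the answer, not necessarily the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (ID INTEGER, TypeofAsset TEXT, Amount INTEGER);
    CREATE TABLE tableb (ID INTEGER, TypeofAsset TEXT, Amount INTEGER);
    INSERT INTO tablea VALUES (1, 'C', 300), (2, 'A', 40), (3, 'F', 90);
    INSERT INTO tableb VALUES (1, 'G', 500), (2, 'A', 20), (3, 'C', 150);
""")

# Only asset types present in both tables survive the inner join.
diffs = conn.execute("""
    SELECT t1.TypeofAsset, ABS(t1.Amount - t2.Amount) AS diff_amount
    FROM tablea t1
    JOIN tableb t2 ON t1.TypeofAsset = t2.TypeofAsset
    ORDER BY t1.TypeofAsset
""").fetchall()

print(diffs)  # [('A', 20), ('C', 150)]
```

Types F and G have no partner in the other table, so they do not appear; a LEFT or FULL join would be needed if those should be reported too.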
I inherited an old SQL script that I want to optimize, but after several tests I must admit that all my attempts only produce huge SQL with repetitive blocks. I would like to know if someone can propose better code for the following pattern (see code below). I don't want to use temporary tables (WITH). For simplicity, I only show 3 levels (TMP_C, TMP_D and TMP_E), but the original SQL has 8 levels.
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A),
TMP_B AS(
SELECT DISTINCT
ID,
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=B.ID),
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Many thanks in advance for your help.
First off, SELECT DISTINCT already removes duplicates from the result set, so you are overworking the condition. Adding the "WITH" definitions and nesting their use makes the query more confusing to follow. The data ultimately all comes from table "B", restricted to rows that also have a matching key in "A". Let's start with just that. And since you are not using anything from (B)Field_Y or (A)Field_X in your result set, leave them out to avoid adding to the confusion.
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
The WHERE clause keeps only the Field_Z values that map to one of the categories you want, and the CASE expression still labels each row with its category.
Now, if you actually needed other values from your "Field_Y" or "Field_X", then that would generate a different query. However, your Tmp_C, Tmp_D and Tmp_E are only asking for the ID and CATEG columns anyhow.
This may perform better:
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...
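One difference to be aware of: in the original script, the NOT IN conditions mean an ID that qualifies for several categories is reported only once, under the first category it matches. If that behaviour is needed, one possible alternative (not the queries above) is to take the MIN of the CASE result per ID. The sketch below runs this through Python's sqlite3 module on made-up rows; it relies on the labels sorting in priority order as strings ('CATEG_1' < 'CATEG_2' < ...), which holds for single-digit suffixes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, Field_X TEXT);
    CREATE TABLE B (ID INTEGER, Field_Y TEXT, Field_Z TEXT);
    INSERT INTO A VALUES (1, 'x'), (2, 'x'), (3, 'x'), (4, 'x');
    INSERT INTO B VALUES
        (1, 'y', 'TEST_1'), (1, 'y', 'TEST_3'),  -- matches CATEG_1 and CATEG_2
        (2, 'y', 'TEST_4'), (2, 'y', 'TEST_5'),  -- matches CATEG_2 and CATEG_3
        (3, 'y', 'TEST_6'),                      -- matches CATEG_3 only
        (4, 'y', 'TEST_9'),                      -- no wanted category
        (5, 'y', 'TEST_1');                      -- no matching row in A
""")

rows = conn.execute("""
    SELECT B.ID,
           MIN(CASE
                   WHEN B.Field_Z IN ('TEST_1', 'TEST_2') THEN 'CATEG_1'
                   WHEN B.Field_Z IN ('TEST_3', 'TEST_4') THEN 'CATEG_2'
                   ELSE 'CATEG_3'
               END) AS CATEG
    FROM B
    JOIN A ON A.ID = B.ID
    WHERE B.Field_Z IN ('TEST_1', 'TEST_2', 'TEST_3',
                        'TEST_4', 'TEST_5', 'TEST_6')
    GROUP BY B.ID
    ORDER BY B.ID
""").fetchall()

print(rows)  # [(1, 'CATEG_1'), (2, 'CATEG_2'), (3, 'CATEG_3')]
```

With 8 levels the same pattern only needs more WHEN branches and a longer IN list, instead of one extra CTE per level.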