three different tables nested with CTE and Join SQL functions - database

I have three tables that I want to base a calculation on. However, the query below did not work.
Could someone give me any feedback?
Thank you,
formula:
(running cost / total(gas_production of each year)) * (gas_production of each year)
as:
CTE c (id,filed,year_1,year_2,year_3)
as( select g.id, g.field,
(r.year_1/sum(g.year_1))*g.year_1 ,
(r.year_2/sum(g.year_2))*g.year_2 ,
(r.year_3/sum(g.year_3))*g.year_3 ,
from group_1 as g
inner join ref_fee as r
on r.id=g.id
group by g.field )
select c.id, c.filed,
c.year_1*b.year_1 as year_1,
c.year_2*b.year_2 as year_2,
c.year_3*b.year_3 as year_3
from c
inner join back b
on b.id=c.id
group by c.field;

"Did not work" is difficult to debug.
What is evident is that table aliases, in Oracle, can't take the AS keyword (column aliases can). With that fixed, the query looks like this:
WITH
c (id,
filed,
year_1,
year_2,
year_3)
AS
( SELECT g.id,
g.field,
(r.year_1 / SUM (g.year_1)) * g.year_1,
(r.year_2 / SUM (g.year_2)) * g.year_2,
(r.year_3 / SUM (g.year_3)) * g.year_3
FROM group_1 g INNER JOIN ref_fee r ON r.id = g.id
GROUP BY g.field)
SELECT c.id,
c.filed,
c.year_1 * b.year_1 AS year_1,
c.year_2 * b.year_2 AS year_2,
c.year_3 * b.year_3 AS year_3
FROM c INNER JOIN back b ON b.id = c.id
GROUP BY c.field;
I have no idea whether it will work, as I don't have your tables, nor do I know what "calculations" you're about to perform.
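Since the query above could not be tested against real tables, here is one possible reading of the formula as a runnable sketch. It uses Python's built-in sqlite3 module rather than Oracle, keeps only the year_1 column, and leaves out the final multiplication by the back table. It assumes that "total" means the sum of gas production per field; that interpretation and the sample rows are assumptions, not something the question confirms. The totals are computed in a separate grouped subquery and joined back, so aggregated and non-aggregated columns are not mixed in one GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE group_1 (id INTEGER, field TEXT, year_1 REAL);
    CREATE TABLE ref_fee (id INTEGER, year_1 REAL);
    -- Hypothetical sample data, invented for illustration.
    INSERT INTO group_1 VALUES (1, 'A', 10.0), (2, 'A', 30.0), (3, 'B', 5.0);
    INSERT INTO ref_fee VALUES (1, 100.0), (2, 100.0), (3, 40.0);
""")

# (cost / total production of the field) * production of the row,
# written as cost * production / total.
shares = conn.execute("""
    SELECT g.id, g.field,
           r.year_1 * g.year_1 / t.total_1 AS year_1
    FROM group_1 g
    JOIN ref_fee r ON r.id = g.id
    JOIN (SELECT field, SUM(year_1) AS total_1
          FROM group_1
          GROUP BY field) t ON t.field = g.field
    ORDER BY g.id
""").fetchall()
print(shares)
```

If the total is meant to run over the whole table rather than per field, the subquery would drop its GROUP BY and the join condition on field.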

Related

SQL How to display people with highest sum

SELECT EMPLOYEE.Fname,EMPLOYEE.Lname,
D.Dnumber,
SUM(WORKS_ON.HOURS) AS SUMHOUR
FROM PROJECT
INNER JOIN DEPARTMENT D ON D.Dnumber = PROJECT.Dnum
INNER JOIN EMPLOYEE ON PROJECT.Dnum= EMPLOYEE.Dno
INNER JOIN WORKS_ON ON WORKS_ON.Pno = PROJECT.Pnumber
GROUP BY EMPLOYEE.Fname,EMPLOYEE.Lname, D.Dnumber
I'm writing a query that lists the people with the highest SUMHOUR.
Now, I've found who has the biggest sum, but I can't use a condition like max(sum()) to display only them.
This is my output. In this image, people with Dnumber '5' have highest SUMHOUR '150' and I want to display them. What should I do?
One simple approach uses TOP:
SELECT TOP 1 WITH TIES
e.Fname,
e.Lname,
d.Dnumber,
SUM(w.HOURS) AS SUMHOUR
FROM PROJECT p
INNER JOIN DEPARTMENT d
ON d.Dnumber = p.Dnum
INNER JOIN EMPLOYEE e
ON p.Dnum = e.Dno
INNER JOIN WORKS_ON w
ON w.Pno = p.Pnumber
GROUP BY
e.Fname,
e.Lname,
d.Dnumber
ORDER BY
SUMHOUR DESC;
You have put Dnumber in the GROUP BY, so it returns the highest SUMHOUR in each Dnumber.
So the solution is to remove Dnumber from the GROUP BY; then it returns only the highest SUMHOUR.
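TOP 1 WITH TIES is SQL Server syntax. A more portable way to express the "max(sum())" condition is to compute the sums once and then compare each sum with the maximum of those sums. The sketch below shows that idea with Python's sqlite3 module; the single works_on table, its columns, and the sample rows are simplified assumptions and not the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE works_on (emp TEXT, hours INTEGER);
    -- Hypothetical sample data: Ann and Bob tie at 150 hours.
    INSERT INTO works_on VALUES
        ('Ann', 100), ('Ann', 50), ('Bob', 150), ('Cid', 40);
""")

top = conn.execute("""
    WITH totals AS (
        SELECT emp, SUM(hours) AS sumhour
        FROM works_on
        GROUP BY emp
    )
    SELECT emp, sumhour
    FROM totals
    WHERE sumhour = (SELECT MAX(sumhour) FROM totals)  -- keeps ties
    ORDER BY emp
""").fetchall()
print(top)
```

Like WITH TIES, the equality test against the maximum returns every row that shares the highest sum.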

SSIS merge join lacks row (and also How to simulate SSIS join with tsql query)

In my project I have a merge join transformation that uses an inner join. It is supposed to join the files lookup with the rest of the data flow. However, the join seems to leave out some rows that have files, even though it should include them. I'm trying to simulate the join in T-SQL, but I seem to be doing it wrong, as my query does show the missing rows.
Here are the outputs I'm trying to join
Input A:
SELECT *
FROM
tblExpense expense
OUTER APPLY(
SELECT TOP 1 *
FROM tblExpenseDtl Details
WHERE expense.intExpenseID = Details.intExpenseID
ORDER BY Details.sintLineNo
) details
WHERE
expense.dtUpdateDateTime > '2017-06-01'
ORDER BY expense.intExpenseID desc
Input B:
SELECT f.*
FROM dbo.tblExpense e
JOIN tblExpenseDtl d ON d.intExpenseID = e.intExpenseID
JOIN tblExpReceiptFile f ON f.intExpenseDtlID = d.intExpenseDtlID
WHERE
e.dtUpdateDateTime > '2017-06-01'
ORDER BY e.intExpenseID desc
And the sql query that I thought would produce the same result as my SSIS inner join
SELECT *
FROM
tblExpense expense
OUTER APPLY(
SELECT TOP 1 *
FROM tblExpenseDtl Details
WHERE expense.intExpenseID = Details.intExpenseID
ORDER BY Details.sintLineNo
) details
inner join ( SELECT f.*
FROM dbo.tblExpense e
JOIN tblExpenseDtl d ON d.intExpenseID = e.intExpenseID
JOIN tblExpReceiptFile f ON f.intExpenseDtlID = d.intExpenseDtlID
WHERE
e.dtUpdateDateTime > '2017-06-01'
ORDER BY e.intExpenseID desc
) innerJ
WHERE
expense.dtUpdateDateTime > '2017-06-01'
ORDER BY expense.intExpenseID desc
The join key in SSIS is expense.intExpenseID = e.intExpenseID.
Input A gives 1 row with expenseID=X, and input B gives 2 rows with expenseID=X.
How are you sorting the data before merging it? According to this, SSIS sorts data differently from SQL Server (in most cases). Maybe that is the problem.
Edit: What type is intExpenseID?
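To see why the sort order matters, here is a minimal sketch of a sort-merge inner join in Python. It is not SSIS's actual implementation, and the function name and sample keys are made up for illustration. A merge join walks both inputs in step and relies on them being sorted by the same rules it uses to compare keys; if one input was sorted under different rules (here, case-insensitive versus binary string order), matching rows can be skipped without any error.

```python
def merge_inner_join(left, right):
    """Sort-merge inner join over (key, value) pairs.

    Both inputs are expected to be sorted by key using the same
    ordering as Python's < operator on the keys.
    """
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit the current left row against the whole run of equal keys
            # on the right; j stays put so duplicate left keys also match.
            j2 = j
            while j2 < len(right) and right[j2][0] == lk:
                out.append((lk, left[i][1], right[j2][1]))
                j2 += 1
            i += 1
    return out


right = sorted([("a", "y"), ("B", "x"), ("c", "z")])      # binary order: B, a, c
left_same = sorted([("a", 1), ("B", 2), ("c", 3)])        # same ordering
left_other = sorted([("a", 1), ("B", 2), ("c", 3)],
                    key=lambda kv: kv[0].lower())         # case-insensitive: a, B, c

good = merge_inner_join(left_same, right)
bad = merge_inner_join(left_other, right)
print(len(good), len(bad))
```

With consistently sorted inputs all three keys match; with the mismatched ordering the row for "B" is dropped. For an integer key the two orderings would normally agree, which is why the type of intExpenseID is relevant.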

SQL Server : INSERT "cartesian product"

I have a data table with destinations and LAT/LON data (~100K records)
DESTINATIONS {
id,
lat,
lon,
...
}
Now I need to insert distances into a new table...
DISTANCES {
id_a,
id_b,
distance
}
What's the best way to do that?
I don't need all data (cartesian product), only the 100 closest.
No duplicates (a_id+b_id == b_id+a_id), e.g. [NYC:Chicago] == [Chicago:NYC] (same distance)
Not by itself (a_id != b_id), because it is 0 miles from [NYC:NYC] ;)
This is the calculation (in kilometers/meters):
ROUND(111045
* DEGREES(ACOS(COS(RADIANS(A.lat))
* COS(RADIANS(B.lat))
* COS(RADIANS(A.lon) - RADIANS(B.lon))
+ SIN(RADIANS(A.lat))
* SIN(RADIANS(B.lat)))),0)
AS 'distance'
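For checking values outside the database, the same expression can be written as a small Python function. This is only a sketch: the function name distance_m is made up, and the clamp is an addition that the SQL expression above does not have. The clamp guards against the cosine term drifting slightly outside [-1, 1] through floating-point rounding (typically for identical or nearly identical points), which would make acos fail.

```python
import math


def distance_m(lat_a, lon_a, lat_b, lon_b):
    """Distance in meters via the spherical law of cosines (111045 m per degree)."""
    cos_angle = (
        math.cos(math.radians(lat_a)) * math.cos(math.radians(lat_b))
        * math.cos(math.radians(lon_a) - math.radians(lon_b))
        + math.sin(math.radians(lat_a)) * math.sin(math.radians(lat_b))
    )
    # Clamp to the valid acos domain; rounding can overshoot 1.0 slightly.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return round(111045 * math.degrees(math.acos(cos_angle)))


print(distance_m(0.0, 0.0, 0.0, 1.0))      # one degree of longitude on the equator
print(distance_m(40.7, -74.0, 40.7, -74.0))  # a point against itself
```

A similar guard may be worth considering in the SQL version if identical coordinates can occur in the data.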
Okay, the JOIN is no problem, but how can I implement the three "filters"?
Maybe with a WHILE loop and SUBSELECT LIMIT/TOP 100 ORDER BY distance ASC?
Or is it also possible to INSERT by JOIN?
Does somebody have an idea?
Pseudocode:
INSERT INTO [newTable] (ColumnList...)
SELECT TOP 100 a.id, b.id, DistanceFormula(a.id, b.id)
FROM Destination a
CROSS JOIN Destination b
WHERE a.id<b.id
ORDER BY DistanceFormula(a.id, b.id) ASC
EDIT to get 100 b for every a:
INSERT INTO [newTable] (ColumnList...)
SELECT a.id, b.id, DistanceFormula(a.id, b.id)
FROM Destination a
INNER JOIN Destination b
ON b.id IN (
SELECT TOP 100 c.id
FROM Destination c
WHERE a.id<c.id
ORDER BY DistanceFormula(a.id, c.id) ASC
)
I've simplified it (distcalc)...
INSERT INTO [DISTANCES] (id_a, id_b, distance)
SELECT
A.id,
B.id,
25 /*ROUND(111045 * DEGREES(ACOS(COS(RADIANS(A.geo_lat)) * COS(RADIANS(B.geo_lat)) * COS(RADIANS(A.geo_lon) - RADIANS(B.geo_lon)) + SIN(RADIANS(A.geo_lat)) * SIN(RADIANS(B.geo_lat)))),0)*/
FROM [DESTINATIONS] AS A
INNER JOIN [DESTINATIONS] AS B
ON b.id IN(
SELECT TOP 100
C.id
FROM [DESTINATIONS] AS C
WHERE
A.id < C.id
ORDER BY A.id /*ROUND(111045 * DEGREES(ACOS(COS(RADIANS(A.geo_lat)) * COS(RADIANS(C.geo_lat)) * COS(RADIANS(A.geo_lon) - RADIANS(C.geo_lon)) + SIN(RADIANS(A.geo_lat)) * SIN(RADIANS(C.geo_lat)))),0)*/ ASC
)
You mean like this?
Okay. That works. :)
But it is definitely too slow!
I'll program a routine that returns only the 100 nearest results on request.
And another (sub)routine will insert/update these results (on the program side) with a timestamp into the distances table, so that the next call can access any existing results.
But thank you very very much! :)

TSQL optimizing code for NOT IN

I inherited an old SQL script that I want to optimize, but after several attempts I must admit that all of them only produce huge SQL with repetitive blocks. I would like to know if someone can propose better code for the following pattern (see code below). I don't want to use temporary tables (WITH). For simplicity, I only show 3 levels (TMP_C, TMP_D and TMP_E), but the original SQL has 8 levels.
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A),
TMP_B AS(
SELECT DISTINCT
ID,
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=TMP_B.ID),
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Many thanks in advance for your help.
First off, SELECT DISTINCT already prevents duplicates in the result set, so you are overworking the condition. Adding the "WITH" definitions and nesting their use makes it more confusing to follow. The data ultimately all comes from the "B" table, for rows that also have a key match in "A". Let's start with just that. And since you are not using (B)Field_Y or (A)Field_X in your result set, don't add them to the mix of confusion.
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
The WHERE clause includes only the values that qualify for one of the categories you want, and you still get the result per category.
Now, if you actually needed other values from your "Field_Y" or "Field_X", then that would generate a different query. However, your Tmp_C, Tmp_D and Tmp_E are only asking for the ID and CATEG columns anyhow.
This may perform better:
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...
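The NOT IN chain in the question gives each ID only its highest-priority category: an ID that qualifies for CATEG_1 is excluded from CATEG_2 and CATEG_3. One way to keep that behavior without the chain, sketched here with Python's sqlite3 module, is to take MIN over the CASE expression per ID. This relies on the labels sorting in priority order as strings ('CATEG_1' < 'CATEG_2' < 'CATEG_3'); with other labels a numeric rank would be needed instead. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER, Field_Z TEXT);
    -- Hypothetical data: ID 1 and ID 2 each qualify for two categories.
    INSERT INTO A VALUES (1), (2), (3), (4);
    INSERT INTO B VALUES
        (1, 'TEST_1'), (1, 'TEST_3'),
        (2, 'TEST_4'), (2, 'TEST_5'),
        (3, 'TEST_6'), (4, 'OTHER');
""")

rows = conn.execute("""
    SELECT B.ID,
           MIN(CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
                    WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
                    ELSE 'CATEG_3'
               END) AS CATEG
    FROM B JOIN A ON B.ID = A.ID
    WHERE B.Field_Z IN ('TEST_1','TEST_2','TEST_3','TEST_4','TEST_5','TEST_6')
    GROUP BY B.ID
    ORDER BY B.ID
""").fetchall()
print(rows)
```

Each ID appears once, with the first category it qualifies for, and extending the pattern to 8 levels only adds WHEN branches rather than more NOT IN subqueries.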

TSQL - Return recent date

Having issues getting a dataset to return with one date per client in the query.
Requirements:
Must return the most recent transaction date per client in the user's client list
Must be able to run through EXEC
Current Query:
SELECT
c.client_uno
, c.client_code
, c.client_name
, c.open_date
into #AttyClnt
from hbm_client c
join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = @login
and c.status_code = 'C'
select
ba.payr_client_uno as client_uno
, max(ba.tran_date) as tran_date
from blt_bill_amt ba
left outer join #AttyClnt ac on ba.payr_client_uno = ac.client_uno
where ba.tran_type IN ('RA', 'CR')
group by ba.payr_client_uno
Currently, this query produces at least 1 row per client with a date. The problem is that some clients have between 2 and 10 dates associated with them, bloating the result to about 30,000 rows instead of the ideal 246 rows or fewer.
When I try max(tran_uno) to get the most recent transaction number, I get the same result: some clients have 1 value and others have multiple values.
The bigger picture has 4 other queries doing other parts; I have only included the parts that pertain to the question.
Edit (2011-10-14 @ 1:45PM):
select
ba.payr_client_uno as client_uno
, max(ba.row_uno) as row_uno
into #Bills
from blt_bill_amt ba
inner join hbm_matter m on ba.matter_uno = m.matter_uno
inner join hbm_client c on m.client_uno = c.client_uno
inner join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = @login
and c.status_code = 'C'
and ba.tran_type in ('CR', 'RA')
group by ba.payr_client_uno
order by ba.payr_client_uno
--Obtain list of Transaction Date and Amount for the Transaction
select
b.client_uno
, ba.tran_date
, ba.tc_total_amt
from blt_bill_amt ba
inner join #Bills b on ba.row_uno = b.row_uno
Not quite sure what was going on, but it seems the temp tables were not behaving correctly at all. Ideally I would have 246 rows of data, but the previous query produced anywhere from 400 to 5000 rows, obviously with duplicated data.
I think you can use ranking to achieve what you want:
WITH ranked AS (
SELECT
client_uno = ba.payr_client_uno,
ba.tran_date,
ba.tc_total_amt,
rnk = ROW_NUMBER() OVER (
PARTITION BY ba.payr_client_uno
ORDER BY ba.tran_uno DESC
)
FROM blt_bill_amt ba
INNER JOIN hbm_matter m ON ba.matter_uno = m.matter_uno
INNER JOIN hbm_client c ON m.client_uno = c.client_uno
INNER JOIN hbm_persnl p ON c.resp_empl_uno = p.empl_uno
WHERE p.login = @login
AND c.status_code = 'C'
AND ba.tran_type IN ('CR', 'RA')
)
SELECT
client_uno,
tran_date,
tc_total_amt
FROM ranked
WHERE rnk = 1
ORDER BY client_uno
Useful reading:
Ranking Functions (Transact-SQL)
ROW_NUMBER (Transact-SQL)
WITH common_table_expression (Transact-SQL)
Using Common Table Expressions
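The ranking idea can be tried in isolation with Python's sqlite3 module, which supports ROW_NUMBER when the bundled SQLite is version 3.25 or newer. The bill table, its columns, and the rows below are simplified assumptions that stand in for blt_bill_amt and its joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bill (client_uno INTEGER, tran_uno INTEGER,
                       tran_date TEXT, amt REAL);
    -- Hypothetical data: client 1 has two transactions, client 2 has one.
    INSERT INTO bill VALUES
        (1, 10, '2011-01-01', 5.0),
        (1, 12, '2011-03-01', 7.5),
        (2, 11, '2011-02-01', 9.0);
""")

latest = conn.execute("""
    WITH ranked AS (
        SELECT client_uno, tran_date, amt,
               ROW_NUMBER() OVER (PARTITION BY client_uno
                                  ORDER BY tran_uno DESC) AS rnk
        FROM bill
    )
    SELECT client_uno, tran_date, amt
    FROM ranked
    WHERE rnk = 1          -- newest transaction per client
    ORDER BY client_uno
""").fetchall()
print(latest)
```

Because ROW_NUMBER assigns exactly one row the rank 1 in each partition, the result has one row per client even when several transactions share a date.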

Resources