SQL nested case with value differences between 2 tables - sql-server

I have imported two Excel sheets as tables in Microsoft's SQL Server Management Studio 2007, and they are both identical, except for the fact that they are from 2 different dates.
I'm looking to do 2 things that I'm struggling to do:
Calculate the monthly difference between values for the 2 tables, which I can do with CAST and an INNER JOIN, but I have not managed to sum those values with a nested CASE like so:
SELECT SUM(
CASE WHEN ID <>'MISSING' THEN
CASE WHEN SUM(VALUE)>=0 THEN
SUM(VALUE)
ELSE
0
END
END)
I've tried many different ways, but one of the main errors I get is:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
The data would look like so:
dbo.Table1
date(dd/mm/yy) | name | id | value
---------------+------+---------+-------
1/1/14 | A | MISSING | 56
1/1/14 | A | MISSING | -1
1/1/14 | B | YES | 56
1/1/14 | B | YES | -1
dbo.Table2
date(dd/mm/yy) | name | id | value
---------------+------+---------+-------
1/2/14 | A | MISSING | 24
1/2/14 | A | MISSING | -11
1/2/14 | B | YES | 24
1/2/14 | B | YES | -11

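Don't use SUM inside another SUM; keep only the outer SUM. SQL Server does not allow an aggregate directly inside another aggregate, which is what the error message is telling you. Also, your outer CASE has no ELSE part, but you'd probably want to have 0 in that scenario as well, so why not use a single CASE condition?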
SELECT SUM(CASE WHEN ID <> 'MISSING' AND VALUE >= 0 THEN VALUE ELSE 0 END)
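To see the single-CASE version at work, here is a minimal sketch that runs it against the Table1 sample rows. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained); the date column is left out because the query does not use it.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT, id TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [("A", "MISSING", 56), ("A", "MISSING", -1),
     ("B", "YES", 56), ("B", "YES", -1)],
)

# One SUM over one CASE: rows that are 'MISSING' or negative contribute 0.
total = conn.execute(
    "SELECT SUM(CASE WHEN id <> 'MISSING' AND value >= 0 THEN value ELSE 0 END) "
    "FROM Table1"
).fetchone()[0]

print(total)  # 56: only the B / YES / 56 row passes both conditions
```

Only one row is both non-MISSING and non-negative, so the sum is 56.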

Related

Running count of duplicate values

I have a table showing pallets and the amount of product ("units") on those pallets. Individual pallets can have multiple records due to multiple possible defect codes. This means when I am trying to sum the total units on all pallets, the same pallet could get counted more than once, which is undesirable. I would like (but don't know how) to add a running tally column to show how many times a specific pallet ID has appeared so that I can filter out any record where the count is greater than 1:
| Pallet_ID | Units | Defect_Code | COUNT |
+-----------+-------+-------------+-------+
| A1 | 100 | 03 | 1 |
| A1 | 100 | 05 | 2 |
| B1 | 95 | 03 | 1 |
| C1 | 300 | 05 | 1 |
| C1 | 300 | 06 | 2 |
| D1 | 210 | 03 | 1 |
| A1 | 100 | 10 | 3 |
| D1 | 210 | 03 | 2 |
In the above example, the correct sum total of units should be 705. A solution in SQL or in DAX would work (although I lean towards SQL). I have searched for a long time but could not find a solution that fits this particular scenario. Many thanks in advance for your time and consideration!
You may use the windowing function row_number() with the over clause where you partition by the pallet. Within each partition you can control which row is assigned the number 1 by using the order by inside the over clause.
select
*
from (
select
Pallet_ID
, Units
, Defect_Code
, row_number() over(partition by Pallet_ID order by defect_code) as count_of
from yourtable
) as d
where count_of = 1
Note I have arbitrarily used the column defect_code to order by, as I don't know what other columns may exist. If your table has a date/time value for when the row was created you could use this instead, or perhaps the unique key of the table.
side note:
I would not recommend using column alias of "count" as it's a SQL reserved word
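As a quick check of the row_number() approach, the sketch below loads the sample pallet rows and sums Units over the rows numbered 1. It runs on SQLite through Python's sqlite3 module (an assumption for the demo; window functions need SQLite 3.25 or later), and the table name pallets is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pallets (Pallet_ID TEXT, Units INTEGER, Defect_Code TEXT)"
)
conn.executemany("INSERT INTO pallets VALUES (?, ?, ?)", [
    ("A1", 100, "03"), ("A1", 100, "05"), ("B1", 95, "03"), ("C1", 300, "05"),
    ("C1", 300, "06"), ("D1", 210, "03"), ("A1", 100, "10"), ("D1", 210, "03"),
])

# Number the rows within each pallet, then keep only the first row per pallet.
total_units = conn.execute("""
    SELECT SUM(Units)
    FROM (
        SELECT Units,
               ROW_NUMBER() OVER (PARTITION BY Pallet_ID
                                  ORDER BY Defect_Code) AS count_of
        FROM pallets
    ) AS d
    WHERE count_of = 1
""").fetchone()[0]

print(total_units)  # 705 = 100 + 95 + 300 + 210
```

Each pallet is counted exactly once, which gives the 705 the question expects.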

Maximum Daisy Chain Length

I have a bunch of value pairs (Before, After) by users in a table. In ideal scenarios these values should form an unbroken chain. e.g.
| UserId | Before | After |
|--------|--------|-------|
| 1 | 0 | 10 |
| 1 | 10 | 20 |
| 1 | 20 | 30 |
| 1 | 30 | 40 |
| 1 | 40 | 30 |
| 1 | 30 | 52 |
| 1 | 52 | 0 |
Unfortunately, these records originate in multiple different tables and are imported into my investigation table. The other values in the table do not lend themselves to ordering (e.g. CreatedDate) due to some quirks in the system saving them out of order.
I need to produce a list of users with gaps in their data. e.g.
| UserId | Before | After |
|--------|--------|-------|
| 1 | 0 | 10 |
| 1 | 10 | 20 |
| 1 | 20 | 30 |
// Row Deleted (30->40)
| 1 | 40 | 30 |
| 1 | 30 | 52 |
| 1 | 52 | 0 |
I've looked at the other Daisy Chaining questions on SO (and online in general), but they all appear to be on a given problem space, where one value in the pair is always lower than the other in a predictable fashion. In my case, there can be increases or decreases.
Is there a way to quickly calculate the longest chain that can be created? I do have a CreatedAt column that would provide some (very rough) relative ordering (when the dates are more than about 10 seconds apart, we could consider them orderable).
Are you not therefore simply after this to get the first row where the "chain" is broken?
SELECT UserID, Before, After
FROM dbo.YourTable YT
WHERE NOT EXISTS (SELECT 1
FROM dbo.YourTable NE
WHERE NE.UserID = YT.UserID
AND NE.After = YT.Before)
AND YT.Before != 0;
If you want the last row before the "chain" is broken instead, just swap the aliases on the columns in the WHERE inside the NOT EXISTS.
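The NOT EXISTS check can be tried on the question's sample data with the sketch below. It uses SQLite via Python's sqlite3 module (an assumption for the demo); the table name chain and the helper broken_links are hypothetical, and the subquery is correlated on UserId so that one user's rows cannot hide another user's gap.

```python
import sqlite3

def broken_links(rows):
    """Return rows whose Before value is not the After value of any other
    row of the same user (ignoring the chain start, Before = 0)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE chain (UserId INTEGER, "Before" INTEGER, "After" INTEGER)'
    )
    conn.executemany("INSERT INTO chain VALUES (?, ?, ?)", rows)
    return conn.execute("""
        SELECT c.UserId, c."Before", c."After"
        FROM chain AS c
        WHERE NOT EXISTS (SELECT 1
                          FROM chain AS n
                          WHERE n.UserId = c.UserId
                            AND n."After" = c."Before")
          AND c."Before" <> 0
    """).fetchall()

full = [(1, 0, 10), (1, 10, 20), (1, 20, 30), (1, 30, 40),
        (1, 40, 30), (1, 30, 52), (1, 52, 0)]
# Same data with the 30 -> 40 row deleted, as in the question.
gapped = [r for r in full if r != (1, 30, 40)]

complete = broken_links(full)
with_gap = broken_links(gapped)
print(complete)  # []
print(with_gap)  # [(1, 40, 30)]
```

With the complete chain nothing is returned; once the 30 -> 40 row is removed, the row starting at 40 is reported because no row ends at 40.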
The following performs hierarchical recursion on your example data and calculates a "chain" count column called 'h_level'.
;with recur_cte([UserId], [Before], [After], h_level) as (
select [UserId], [Before], [After], 0
from dbo.test_table
where [Before] is null
union all
select tt.[UserId], tt.[Before], tt.[After], rc.h_level+1
from dbo.test_table tt join recur_cte rc on tt.UserId=rc.UserId
and tt.[Before]=rc.[After]
where tt.[Before]<tt.[after])
select * from recur_cte;
Results:
UserId Before After h_level
1 NULL 10 0
1 10 20 1
1 20 30 2
1 30 40 3
1 30 52 3
Is this helpful? Could you further define which rows to exclude?
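The recursive query above can be reproduced with the sketch below, which runs it on SQLite through Python's sqlite3 module (an assumption for the demo; SQLite needs the RECURSIVE keyword, SQL Server does not). The sample rows follow the answer's results, where the first row has a NULL Before.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE test_table (UserId INTEGER, "Before" INTEGER, "After" INTEGER)'
)
conn.executemany("INSERT INTO test_table VALUES (?, ?, ?)", [
    (1, None, 10), (1, 10, 20), (1, 20, 30), (1, 30, 40),
    (1, 40, 30), (1, 30, 52), (1, 52, 0),
])

rows = conn.execute("""
    WITH RECURSIVE recur_cte (UserId, "Before", "After", h_level) AS (
        -- Anchor: the row that starts the chain.
        SELECT UserId, "Before", "After", 0
        FROM test_table
        WHERE "Before" IS NULL
        UNION ALL
        -- Step: follow After -> Before links, only along increasing pairs.
        SELECT tt.UserId, tt."Before", tt."After", rc.h_level + 1
        FROM test_table AS tt
        JOIN recur_cte AS rc
          ON tt.UserId = rc.UserId AND tt."Before" = rc."After"
        WHERE tt."Before" < tt."After"
    )
    SELECT UserId, "Before", "After", h_level
    FROM recur_cte
    ORDER BY h_level, "After"
""").fetchall()

longest = max(r[3] for r in rows)
for r in rows:
    print(r)
print(longest)  # 3
```

The filter on increasing pairs is what stops the recursion here; without it, the 30 -> 40 -> 30 loop in the sample data would recurse forever.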
If you want users that have more than one chain:
select t.UserID
from <T> as t left outer join <T> as t2
on t2.UserID = t.UserID and t2.Before = t.After
where t2.UserID is null
group by t.UserID
having count(*) > 1;

Using Where clause in Case to compare

I am new to T-SQL.
I am using the following Query:
SELECT e.Id,e.cAvg,
CASE
WHEN e.cAvg<=0.8 and cAvg>=0 THEN t.Model when t.Cr='0.8' then t.Model
WHEN e.cAvg>0.8 and cAvg<=5.4 THEN t.Model WHEN t.Cr='5.4' then t.Model
WHEN e.cAvg>5.4 and cg<=8 THEN t.Model WHEN t.Cr='8' then t.Model
ELSE 'No Change Required'
END
from A e, B t;
What I am trying to do is:
Select id and cAvg columns in Table A.
Compare cAvg in Table A with Cr in Table B.
Use the comparison in CASE to select the particular row which satisfies the condition.
Use the selected row to give query results.
t.Model is a column of table B. I want to select t.Model value of the selected row in the case statement.
I feel the way is to somehow include an equivalent of the WHERE clause inside the WHEN of CASE.
Need Direction!!
The table schema:
Table A:
+----+------+
| id | cAvg |
+----+------+
| 1 | .8 |
| 2 | 5.4 |
| 3 | 6.0 |
+----+------+
Table B:
+-----+-------+
| Cr | Model |
+-----+-------+
| 2 | M1 |
| 5.5 | M2 |
| 8 | M3 |
+-----+-------+
I want to do the following:
Compare the values of cAvg with a condition => (cAvg<=8 And cAvg>=5.5 => the model selected must be M3.)
The result I want to get is:
+----+------+-------+
| id | cAvg | Model |
+----+------+-------+
| 1 | .8 | M1 |
| 2 | 5.4 | M2 |
| 3 | 6.0 | M3 |
+----+------+-------+
I tried JOIN as suggested in the comments. Many thanks, I learnt a lot from it!
My problem is that there are no common columns to join on.
Also, I need to compare a column in one table with a column in another table and then give a result based on the comparison.
I referred to many answers on Stack Overflow, but all of them assume that there is a common column.
I tried the following:
Inner Join
Cases
I need some direction as to which way to go.
Thank you!!
First of all, you're selecting from 2 tables without any link restriction, so every row of A is compared with every row of B.
If there is a matching key between the tables, it should be used in a JOIN so that only relevant pairs of rows are compared:
A e JOIN B t ON e.id = t.id
Second, in order to select the relevant rows, you have to decide what these are.
You can define the relevant cases inside a WHERE clause:
WHERE
e.cAvg > 12
You can also use a CASE expression inside WHERE, but then its result has to be compared with something so that the condition evaluates to TRUE:
SELECT e.Id, e.cAvg, t.Model
FROM A e JOIN B t ON e.id = t.id
WHERE
CASE WHEN e.cAvg<=6 THEN t.Model WHEN t.Cr=6 THEN t.Model
WHEN e.cAvg>6 AND e.cAvg<=12 THEN t.Model
WHEN t.Cr='12' THEN t.Model
WHEN e.cAvg>12 AND e.cAvg<=24 THEN t.Model
WHEN t.Cr='24' THEN t.Model
ELSE '-1' END <> '-1'
EDIT
Following your question edit, I think that what you need is a JOIN with a condition.
Basically, instead of joining the tables on an equal key, join them on an inequality.
Since you're looking for cAvg between two t.Cr rows, 2 JOINs are needed:
SELECT e.Id,e.cAvg, t.Model
FROM
A e JOIN B t ON
e.cAvg >= t.Cr
JOIN B t2 ON
e.cAvg < t2.Cr
WHERE
t.Cr IS NOT NULL
AND t2.Cr IS NOT NULL
The idea is that only where the 2 conditions are met, you would get the results of e
Hope that helps
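As an alternative sketch of the range lookup (not the query above), the code below picks, for each row of A, the row of B with the smallest Cr that is still greater than or equal to cAvg. Treating Cr as the upper bound of each band is an assumption inferred from the expected result in the question. It runs on SQLite via Python's sqlite3 module, so it uses LIMIT 1; in T-SQL the subquery would use TOP 1 instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, cAvg REAL)")
conn.execute("CREATE TABLE B (Cr REAL, Model TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, 0.8), (2, 5.4), (3, 6.0)])
conn.executemany("INSERT INTO B VALUES (?, ?)",
                 [(2, "M1"), (5.5, "M2"), (8, "M3")])

# For each A row, look up the B row with the smallest Cr >= cAvg.
# Assumption: Cr is the inclusive upper bound of a band.
rows = conn.execute("""
    SELECT e.id,
           e.cAvg,
           (SELECT t.Model
            FROM B AS t
            WHERE t.Cr >= e.cAvg
            ORDER BY t.Cr
            LIMIT 1) AS Model
    FROM A AS e
    ORDER BY e.id
""").fetchall()

models = [r[2] for r in rows]
print(models)  # ['M1', 'M2', 'M3']
```

A cAvg above the largest Cr has no matching band, so Model comes back as NULL for such a row.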
I found a possible workaround for the problem.
Problem Statement:
Compare two tables with no common column.
Use the comparison in CASE to select a particular row.
A WHERE Clause inside CASE is not accepted in T-SQL.
My Work Around :
Add a new id column (Bid) to the second table.
Assign an id from Table B to each row of Table A.
Use the assigned id to select the required row in Table B.
Tables:
Table A:
+----+------+
| id | cAvg |
+----+------+
| 1 | .8 |
| 2 | 5.4 |
| 3 | 6.0 |
+----+------+
Table B
+-----+-----+-------+
| Bid | Cr | Model |
+-----+-----+-------+
| 1 | 2 | M1 |
| 2 | 5.5 | M2 |
| 3 | 8 | M3 |
+-----+-----+-------+
Query to assign id's:
CREATE VIEW [AssignIDView] AS
SELECT DISTINCT e.id,
e.cAvg,
(CASE
WHEN e.cAvg>=0 and e.cAvg<=2 THEN 1
WHEN e.cAvg>2 and e.cAvg<=5.5 THEN 2
WHEN e.cAvg>5.5 and e.cAvg<=8 THEN 3
ELSE NULL
END) As BId
FROM A e, B t;
The result of the above Query will be a view as follows:
+----+------+-----+
| id | cAvg | Bid |
+----+------+-----+
| 1 | .8 | 1 |
| 2 | 5.4 | 2 |
| 3 | 6.0 | 3 |
+----+------+-----+
Now use Bid to select rows from table B to assign Model from B:
Query:
CREATE VIEW [ModelAssignView] AS
select e.id,
e.cAvg,
t.Model as [Model]
FROM AssignIDView e, B t where e.BId = t.Bid;
The result of the Query will be as follows:
+----+------+-------+
| id | cAvg | Model |
+----+------+-------+
| 1 | .8 | M1 |
| 2 | 5.4 | M2 |
| 3 | 6.0 | M3 |
+----+------+-------+
The intention of my question was to do the above.
For that I wanted to find an Equivalent of A WHERE Clause inside CASE.
But the above method achieved the solution for me.
Hope it helps:)!

SQL apply functions to multiple id rows

I'm using SQL Server 2008 and trying to gather individual customer data that appears over multiple rows in my table. An example of my data is as follows:
custID | status | type | value
-------------------------
1 | 1 | A | 150
1 | 0 | B | 100
1 | 0 | A | 153
1 | 0 | A | 126
2 | 0 | A | 152
2 | 0 | B | 101
2 | 0 | B | 103
For each custID, my task is to find a flag if status=1 for any row, if type=B for any row, and the average of value in all cases where type=B. So my solution should look like:
custID | statusFlag | typeFlag | valueAv
-------------------------------------------
1 | 1 | 1 | 100
2 | 0 | 1 | 102
I can get answers for this using lots of row_number() over (partition by .. ), to create ids, and creating subtables for each column selecting the desired id. My issue is this method is awkward and time consuming, as I have many more columns than shown above to do this over, and many tables to repeat it for. My ideal solution would be to define my own aggregate() function so I could just do:
select custID, ag1(statusFlag), ag2(typeFlag)
group by custID
but as far as I can tell custom aggregates can't be defined in SQL Server. Is there a nicer general approach to this problem, which doesn't require defining lots of ids?
Use CASE WHEN to evaluate the value and apply the aggregate function accordingly:
select custID,
statusFlag = max(status),
typeFlag = max(case when type = 'B' then 1 else 0 end),
valueAv = avg(case when type = 'B' then value end)
from samples
group by custID
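The conditional aggregation can be checked against the sample rows with the sketch below. It runs on SQLite via Python's sqlite3 module (an assumption for the demo), so the T-SQL `alias = expression` form is written as `expression AS alias`, and AVG returns a floating-point value where SQL Server would return an integer for an int column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (custID INTEGER, status INTEGER, type TEXT, value INTEGER)"
)
conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", [
    (1, 1, "A", 150), (1, 0, "B", 100), (1, 0, "A", 153), (1, 0, "A", 126),
    (2, 0, "A", 152), (2, 0, "B", 101), (2, 0, "B", 103),
])

rows = conn.execute("""
    SELECT custID,
           MAX(status)                                  AS statusFlag,
           MAX(CASE WHEN type = 'B' THEN 1 ELSE 0 END)  AS typeFlag,
           -- No ELSE: non-B rows become NULL, which AVG ignores.
           AVG(CASE WHEN type = 'B' THEN value END)     AS valueAv
    FROM samples
    GROUP BY custID
    ORDER BY custID
""").fetchall()

for r in rows:
    print(r)
```

Leaving out the ELSE in the AVG branch matters: an ELSE 0 would pull the average down by counting the non-B rows as zeros.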

TSQL Multiple column unpivot with named rows possible?

I know there are several unpivot / cross apply discussions here but I was not able to find any discussion that covers my problem. What I've got so far is the following:
SELECT Perc, Salary
FROM (
SELECT jobid, Salary_10 AS Perc10, Salary_25 AS Perc25, [Salary_Median] AS Median
FROM vCalculatedView
WHERE JobID = '1'
GROUP BY JobID, SourceID, Salary_10, Salary_25, [Salary_Median]
) a
UNPIVOT (
Salary FOR Perc IN (Perc10, Perc25, Median)
) AS calc1
Now, what I would like is to add several other columns, eg. one named Bonus which I also want to put in Perc10, Perc25 and Median Rows.
As an alternative, I also made a query with cross apply, but here, it seems as if you can not "force" sort the rows like you can with unpivot. In other words, I can not have a custom sort, but only a sort that is according to a number within the table, if I am correct? At least, here I do get the result like I wish to have, but the rows are in a wrong order and I do not have the rows names like Perc10 etc. which would be nice.
SELECT crossapplied.Salary,
crossapplied.Bonus
FROM vCalculatedView v
CROSS APPLY (
VALUES
(Salary_10, Bonus_10)
, (Salary_25, Bonus_25)
, (Salary_Median, Bonus_Median)
) crossapplied (Salary, Bonus)
WHERE JobID = '1'
GROUP BY crossapplied.Salary,
crossapplied.Bonus
Perc stands for Percentile here.
Output is intended to be something like this:
+--------------+---------+-------+
| Calculation | Salary | Bonus |
+--------------+---------+-------+
| Perc10 | 25 | 5 |
| Perc25 | 35 | 10 |
| Median | 27 | 8 |
+--------------+---------+-------+
Am I missing something, or did I do something wrong? I'm using MSSQL 2014, and the output is going into SSRS. Thanks a lot in advance for any hint!
Edit for clarification: The Unpivot-Method gives the following output:
+--------------+---------+
| Calculation | Salary |
+--------------+---------+
| Perc10 | 25 |
| Perc25 | 35 |
| Median | 27 |
+--------------+---------+
so it lacks the column "Bonus" here.
The Cross-Apply-Method gives the following output:
+---------+-------+
| Salary | Bonus |
+---------+-------+
| 35 | 10 |
| 25 | 5 |
| 27 | 8 |
+---------+-------+
So if you compare it to the intended output, you'll notice that the column "Calculation" is missing and the row sorting is wrong (note that the line 25 | 5 is in the second row instead of the first).
Edit 2: View's definition and sample data:
The view basically just adds computed columns of the table. In the table, I've got Columns like Salary and Bonus for each JobID. The View then just computes the percentiles like this:
Select
Percentile_Cont(0.1)
within group (order by Salary)
over (partition by jobID) as Salary_10,
Percentile_Cont(0.25)
within group (order by Salary)
over (partition by jobID) as Salary_25
from Tabelle
So the output is like:
+----+-------+---------+-----------+-----------+
| ID | JobID | Salary | Salary_10 | Salary_25 |
+----+-------+---------+-----------+-----------+
| 1 | 1 | 100 | 60 | 70 |
| 2 | 1 | 100 | 60 | 70 |
| 3 | 2 | 150 | 88 | 130 |
| 4 | 3 | 70 | 40 | 55 |
+----+-------+---------+-----------+-----------+
In the end, the view will be parameterized in a stored procedure.
Might this be your approach?
After your edits I understand that your solution with CROSS APPLY comes back with the right data, but not in the correct output format. You can add constant values to your VALUES and do the sorting in a wrapper SELECT:
SELECT wrapped.Calculation,
wrapped.Salary,
wrapped.Bonus
FROM
(
SELECT crossapplied.*
FROM vCalculatedView v
CROSS APPLY (
VALUES
(1,'Perc10',Salary_10, Bonus_10)
, (2,'Perc25',Salary_25, Bonus_25)
, (3,'Median',Salary_Median, Bonus_Median)
) crossapplied (SortOrder,Calculation,Salary, Bonus)
WHERE JobID = '1'
GROUP BY crossapplied.SortOrder,
crossapplied.Calculation,
crossapplied.Salary,
crossapplied.Bonus
) AS wrapped
ORDER BY wrapped.SortOrder
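The idea of tagging each unpivoted row with a label and a sort key can be illustrated with the sketch below. SQLite (used here via Python's sqlite3 module) has no CROSS APPLY, so the sketch swaps in UNION ALL to produce the three rows; the label and SortOrder columns and the outer ORDER BY are the same as in the query above. The sample values and the plain table standing in for vCalculatedView are assumptions taken from the intended output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A plain table stands in for the view (assumption for the demo).
conn.execute("""
    CREATE TABLE vCalculatedView (
        JobID TEXT,
        Salary_10 INTEGER, Salary_25 INTEGER, Salary_Median INTEGER,
        Bonus_10 INTEGER,  Bonus_25 INTEGER,  Bonus_Median INTEGER
    )
""")
# The view repeats the percentiles on every source row, so insert two copies.
conn.executemany(
    "INSERT INTO vCalculatedView VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("1", 25, 35, 27, 5, 10, 8)] * 2,
)

rows = conn.execute("""
    SELECT wrapped.Calculation, wrapped.Salary, wrapped.Bonus
    FROM (
        SELECT DISTINCT 1 AS SortOrder, 'Perc10' AS Calculation,
               Salary_10 AS Salary, Bonus_10 AS Bonus
        FROM vCalculatedView WHERE JobID = '1'
        UNION ALL
        SELECT DISTINCT 2, 'Perc25', Salary_25, Bonus_25
        FROM vCalculatedView WHERE JobID = '1'
        UNION ALL
        SELECT DISTINCT 3, 'Median', Salary_Median, Bonus_Median
        FROM vCalculatedView WHERE JobID = '1'
    ) AS wrapped
    ORDER BY wrapped.SortOrder
""").fetchall()

for r in rows:
    print(r)
```

The constant SortOrder column is what gives a custom row order that does not depend on the salary values themselves.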

Resources