AVG giving a Count instead of Average - sql-server

This is probably a silly mistake on my end, but I can't quite figure it out on my own.
I'm trying to calculate an average over a set of data pulled from a sub-query, which returns the following:
TotalPDMPs  DefaultClinicID
13996       -1
134         23
432         29
123         26
39          27
13          21
40          24
46          30
1           25
The problem is that the average of 'TotalPDMPs' calculated for each 'DefaultClinicID' comes out exactly the same as the data above.
Here's my query for calculating the average:
select DefaultClinicID as ClinicID, AVG(TotalPDMPs)
from
    (select count(p.PatientID) as TotalPDMPs, DefaultClinicID
     from PatientPrescriptionRegistry ppr, Patient p
     where p.PatientID = ppr.PatientID
       and p.NetworkID = 2
     group by DefaultClinicID) p
group by DefaultClinicID
Can someone tell me what I'm doing wrong here?
Thanks.

The group-by column is the same in both queries: the inner query gets a count per DefaultClinicID, and the outer query then takes an average per the same DefaultClinicID.
Does that make sense? Each outer group contains exactly one row, and AVG over a single row returns that row's value. So for clinic 23 the average calculation would be: 134 / 1 = 134.
I think you just need to do the average in your inner query to get what you want. Or maybe avg(distinct p.patientID) is what you are after?

In the inner sub-query you have already grouped by DefaultClinicID,
so every unique DefaultClinicID already has only one row.
And the avg of x is x.
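A minimal sketch of the point made in both answers, using Python's sqlite3 module as a runnable stand-in for SQL Server (an assumption made only so the example can be executed). The table name ClinicCounts is hypothetical and represents the result of the inner sub-query; the figures are the ones from the question.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for runnability).
conn = sqlite3.connect(":memory:")

# ClinicCounts is a hypothetical table holding the inner sub-query's output.
conn.execute("CREATE TABLE ClinicCounts (DefaultClinicID INTEGER, TotalPDMPs INTEGER)")
conn.executemany(
    "INSERT INTO ClinicCounts VALUES (?, ?)",
    [(-1, 13996), (23, 134), (29, 432), (26, 123), (27, 39),
     (21, 13), (24, 40), (30, 46), (25, 1)],
)

# Grouping again by DefaultClinicID: every group holds one row,
# so AVG simply returns that row's value.
per_clinic = dict(conn.execute(
    "SELECT DefaultClinicID, AVG(TotalPDMPs) FROM ClinicCounts "
    "GROUP BY DefaultClinicID"
).fetchall())

# Without the outer GROUP BY, AVG runs across all clinics instead.
overall = conn.execute("SELECT AVG(TotalPDMPs) FROM ClinicCounts").fetchone()[0]

print(per_clinic[23])  # same as the input row for clinic 23
print(overall)         # one average over all nine clinics
```

If an average across clinics is what is wanted, dropping the outer GROUP BY is one way to get it. Note that SQLite's AVG returns a floating-point value, whereas SQL Server's AVG over an int column returns an int.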

Related

Need help solving a SQL problem: using ORDER BY to order building floors

I have this building floor data selected:
6
5
4
3
2
1
UG
GM
G
LG
5B
5A
B1
B2
To get this sorting I use the following ORDER BY:
order by
(case when ISNUMERIC(floorNo) = 1 then CAST(floorNo AS Int) end) desc ,
(case when ISNUMERIC(left(floorNo,1)) = 0 and ISNUMERIC(substring(floorNo,2,1)) = 1 then floorNo end) asc,
(case when ISNUMERIC(floorNo) = 0 and left(floorNo,1) <>'L' then floorNo end) desc
But I want the result to look like this:
6
5B
5A
5
4
3
2
1
UG
GM
G
LG
B1
B2
Can anyone help me solve it?
If you make a complicated enough (set of) case statement(s), you would eventually be able to handle all the possibilities, but it is likely to run very slowly if you have a lot of data.
If I had to do this, I would probably make a separate lookup table (FloorOrder) with two columns: the floor code and an order column (integer). Create a script to populate the lookup table with all the various possibilities: pick a maximum number of floors, basements, and subfloors per floor, and generate all of the combinations with some loops. Then add all the various floors near the ground floor. Make sure the order numbers are spread out enough that you can easily add other codes in between when somebody comes up with a new option (because they will). Something like this subset:
Code  Order
2     2000
1C    1300
1B    1200
1A    1100
1     1000
UG    800
GM    500
G     0
LG    -300
B1    -1000
It doesn't really matter what the order codes are, as long as they sort the list in the right order, can be easily generated when creating the table, and leave gaps for fitting new entries in. Whenever somebody comes up with a new weird floor code (some I've seen near me are things like M for Mezzanine, UM for Upper Mezzanine, etc.), add new records to the FloorOrder table to fit them in. Make sure your table has an index on the floor codes.
To use it, join to the FloorOrder table, sort by the Order column.
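A minimal sketch of the lookup-table approach, run through Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability). The Floors table and the SortOrder column name are hypothetical; SortOrder is used instead of Order because ORDER is a reserved word. The order values follow the spacing scheme in the subset above, extended by assumption to the floors from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Floors is a hypothetical table holding the floor codes from the question.
conn.execute("CREATE TABLE Floors (floorNo TEXT)")
# The PRIMARY KEY on Code also provides the index on the floor codes.
conn.execute("CREATE TABLE FloorOrder (Code TEXT PRIMARY KEY, SortOrder INTEGER)")

floor_order = [
    ("6", 6000), ("5B", 5200), ("5A", 5100), ("5", 5000), ("4", 4000),
    ("3", 3000), ("2", 2000), ("1", 1000), ("UG", 800), ("GM", 500),
    ("G", 0), ("LG", -300), ("B1", -1000), ("B2", -2000),
]
conn.executemany("INSERT INTO FloorOrder VALUES (?, ?)", floor_order)

# Insert the floors in the order they were listed in the question.
unsorted = ["6", "5", "4", "3", "2", "1", "UG", "GM", "G", "LG",
            "5B", "5A", "B1", "B2"]
conn.executemany("INSERT INTO Floors VALUES (?)", [(f,) for f in unsorted])

# Join to the lookup table and sort by its order column.
sorted_floors = [row[0] for row in conn.execute(
    "SELECT f.floorNo FROM Floors f "
    "JOIN FloorOrder o ON o.Code = f.floorNo "
    "ORDER BY o.SortOrder DESC"
)]
print(sorted_floors)
```

A new code such as a mezzanine only needs one extra row in FloorOrder with a value placed in the appropriate gap; the query itself does not change.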

Weird SUM result in nested Row Group

I have three nested row groups:
The first one depends on whether a field in the dataset is true or false for each case. This is where the error is worst. The second is nested in the first and is based on a group variable in the cases (1 to many); the third is the ref number of the cases.
The sums don't work for a column that is produced by a join, depending on the ID of the second group. It seems to pull the right value, but multiplies it by the number of cases. I can divide by the number of cases inside the last nested group (ref#) to get the right value. I have tried using "Count", blank, and "Add Total After".
If I try to sum the column with "=Sum(ReportItems!Textbox231.Value)", it produces:
The Value expression for the textrun 'Textbox232.Paragraphs[0].TextRuns[0]' uses an aggregate function on a report item. Aggregate functions can be used only on report items contained in page headers and footers.
The sums work fine for the non-joined values in all three nested row groups. But for the joined values they are out by an order of magnitude. Why is this?
SUM not working for 3rd column
SUM yields weird results
SELECT DISTINCT
Here is a common reason why this kind of problem happens.
The likely reason for the SUM being wrong is that the DISTINCT in your select hides duplicates in the underlying query. Since the SUM is evaluated before the DISTINCT is applied, it adds up rows that you no longer see once the DISTINCT has filtered them out.
Instead of DISTINCT, use a GROUP BY query. Then you can either write a base query that does not produce duplicates (so there is nothing to hide with a DISTINCT) or, if you can't get rid of the duplicates, aggregate your column before displaying it with a MIN, a MAX or an AVG.
I'd be happy to help more but there's not enough information in your question to reproduce the problem on my computer.
There are other reasons why a SUM can return unexpected results: typically an implicit cast (SQL Server picks an unexpected data type and rounds the numbers), and in some situations a CASE clause which is executed either before or after a WHERE condition. But these don't seem to be the problem here.
Example
DECLARE @T TABLE (ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, NumVal INT)
DECLARE @i INT
SET @i = 1
WHILE @i < 1000
BEGIN
    -- insert every number from 1 to 999 once
    INSERT INTO @T (NumVal) VALUES (@i)
    -- insert numbers ending in 7 a second time to create duplicates
    IF RIGHT(CAST(@i AS VARCHAR(12)), 1) = '7'
    BEGIN INSERT INTO @T (NumVal) VALUES (@i) END
    SET @i = @i + 1
END
SELECT DISTINCT NumVal, SUM(NumVal) FROM @T GROUP BY NumVal
In the example above, I have inserted 999 distinct values into a table, but duplicated every number that ends with 7. The SELECT DISTINCT gives the impression that there are only 999 entries, while the SUM counts the numbers ending with 7 twice. Your situation is probably more complicated, but what I want to show here is that duplicates in the underlying data become invisible with a DISTINCT and reappear with a SUM:
NumVal Sum
1 1
2 2
3 3
4 4
5 5
6 6
7 14
8 8
9 9
10 10
11 11
12 12
13 13
14 14
15 15
16 16
17 34
18 18
19 19
20 20
21 21
22 22
23 23
24 24
25 25
26 26
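As a rough sketch of how such hidden duplicates can be found and neutralised, the following uses Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability) with a smaller version of the same data: the values 1 to 19, with 7 and 17 inserted twice. The GROUP BY ... HAVING query lists the duplicated values, and summing over a de-duplicated sub-query gives the total that the DISTINCT output suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, NumVal INTEGER)")

# Values 1..19; any value ending in 7 is inserted a second time.
for i in range(1, 20):
    conn.execute("INSERT INTO T (NumVal) VALUES (?)", (i,))
    if i % 10 == 7:
        conn.execute("INSERT INTO T (NumVal) VALUES (?)", (i,))

# SUM over the raw rows counts the duplicates.
raw_sum = conn.execute("SELECT SUM(NumVal) FROM T").fetchone()[0]

# Listing the values that occur more than once exposes the duplicates.
duplicates = conn.execute(
    "SELECT NumVal, COUNT(*) FROM T GROUP BY NumVal "
    "HAVING COUNT(*) > 1 ORDER BY NumVal"
).fetchall()

# Summing over a de-duplicated sub-query removes their effect.
dedup_sum = conn.execute(
    "SELECT SUM(NumVal) FROM (SELECT DISTINCT NumVal FROM T)"
).fetchone()[0]

print(raw_sum, duplicates, dedup_sum)
```

The same check applied to the dataset query behind a report (grouping by the columns that are expected to be unique) is one way to confirm whether a join is multiplying rows.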

SQL Server : WHERE Clause on a selected variable of a column

I have a table of 10k records and would like to filter the data by a criterion.
Basically, the criterion involves two columns, one holding an int and the other text.
Sample Data :
Label Value
A 24
A 18
A 15
A 35
A 27
A 37
B 18
B 29
B 18
B 16
B 16
I wanted to filter and display the data excluding the Value < 20 and Label = A.
Please do help me out in getting an answer for this issue.
Thanks in advance.
How about this simple query?
Select * From MyTable Where Value >= 20 And Label <> 'A'
I think you were nearly there. Try this:
Select * From MyTable
Where [Label] = 'B' OR ([Label] = 'A' AND [Value] >= 20)
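The requirement can be read in two ways, and the two readings give very different results. The sketch below runs both against the sample data, using Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability). Which reading the asker intended is not stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Label TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("A", 24), ("A", 18), ("A", 15), ("A", 35), ("A", 27), ("A", 37),
     ("B", 18), ("B", 29), ("B", 18), ("B", 16), ("B", 16)],
)

# Reading 1: drop only rows that are BOTH Label = 'A' and Value < 20.
drop_both = conn.execute(
    "SELECT Label, Value FROM MyTable WHERE NOT (Label = 'A' AND Value < 20)"
).fetchall()

# The same filter rewritten with De Morgan's law
# (equivalent here because the sample data contains no NULLs).
de_morgan = conn.execute(
    "SELECT Label, Value FROM MyTable WHERE Label <> 'A' OR Value >= 20"
).fetchall()

# Reading 2: drop every row with Value < 20 and every row with Label = 'A'.
drop_either = conn.execute(
    "SELECT Label, Value FROM MyTable WHERE Value >= 20 AND Label <> 'A'"
).fetchall()

print(len(drop_both), len(de_morgan), drop_either)
```

Under the first reading, the NOT (...) form also keeps working if labels other than A and B are added later, which a condition that names 'B' explicitly would not.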

Optimize MDX query

I have two requirements for my query.
First: a product list sorted by my measure, so that products with higher sales appear first.
ProductCode Sales
----------- ------------
123 18
332 17
245 16
656 15
Second : to have cumulative sum on my presorted product list.
ProductCode Sales ACC
----------- ------------ ----
123 18 18
332 17 35
245 16 51
656 15 66
I wrote below MDX in order to achieve above goal:
WITH
SET SortedProducts AS
    Order([DIMProduct].[ProductCode].[ProductCode].AllMEMBERS, [Measures].[Sales], BDESC)
MEMBER [Measures].[ACC] AS
    Sum
    (
        Head
        (
            [SortedProducts], Rank([DIMProduct].[ProductCode].CurrentMember, [SortedProducts])
        )
        ,[Measures].[Sales]
    )
SELECT
    {[Measures].[Sales], [Measures].[ACC]}
    ON COLUMNS,
    SortedProducts
    ON ROWS
FROM [Model]
But it takes about 3 minutes to run. Any suggestions on how to optimize my code, or is this normal?
I have 9635 products in total.
If you do a quick search on Google, you will find different ways to achieve this (there are many answers here as well).
That said, I would try this alternative way to calculate your running total:
MEMBER [Measures].[SortedRank] AS Rank([Product].[Product].CurrentMember, [SortedProducts])
MEMBER [Measures].[ACC2] AS SUM(TopCount([SortedProducts], [Measures].[SortedRank]) ,[Measures].[Internet Sales Amount])
I don't know whether TopCount will perform faster than Head in your case; for example, on my test machine your query against the AdventureWorks cube takes the same time with either Head or TopCount.
Hope this helps
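To make the target calculation concrete, here is a small Python sketch (not MDX) of what the ACC column contains, using the four sample products from the question. It computes the running total twice: once by re-summing a growing prefix for every row, which mirrors the Sum(Head(..., Rank(...))) formulation, and once in a single pass. Whether the MDX engine actually evaluates the prefix sum row by row is an assumption, not something the question or answer confirms.

```python
from itertools import accumulate

# Sample data from the question: product code -> sales.
sales = {"123": 18, "332": 17, "245": 16, "656": 15}

# Sort products by sales, highest first (the SortedProducts set).
sorted_products = sorted(sales.items(), key=lambda kv: kv[1], reverse=True)

# Prefix re-summing: for the product at rank r, sum the first r sales values.
# This does roughly n * n / 2 additions for n products.
acc_by_prefix = [
    sum(amount for _, amount in sorted_products[:rank])
    for rank in range(1, len(sorted_products) + 1)
]

# Single pass: carry the running total forward, one addition per product.
acc_single_pass = list(accumulate(amount for _, amount in sorted_products))

rows = [
    (code, amount, acc)
    for (code, amount), acc in zip(sorted_products, acc_single_pass)
]
print(rows)
```

Both approaches give the same ACC values; the difference is only in how much work is repeated, which grows quickly with the number of products.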

SQLServer Calculate Average of Multiple Columns

I have generated a table using PIVOT, and the output columns are dynamic. One such output is given below:
user test1 test2 test3
--------------------------------
A1 10 20 30
A2 90 87 75
A3 78 12 34
The output of above table represents a list of users attending tests. The tests will be added dynamically, so the columns are dynamic in nature.
Now, I want to find out average marks of each user as well as average marks of each test.
I am able to calculate the average of each test, but I am stuck on finding the average of each user.
Is there a way to do this?
Please help.
Mahesh
You can add the marks for each user, then divide by the number of columns:
SELECT
    [user], -- bracketed because USER is a reserved word in SQL Server
    (test1 + test2 + test3) / 3.0 AS average_mark -- 3.0 avoids integer division
FROM users
Or to ignore NULL values:
SELECT
    [user],
    (ISNULL(test1, 0) + ISNULL(test2, 0) + ISNULL(test3, 0)) * 1.0 / NULLIF(
        CASE WHEN test1 IS NULL THEN 0 ELSE 1 END +
        CASE WHEN test2 IS NULL THEN 0 ELSE 1 END +
        CASE WHEN test3 IS NULL THEN 0 ELSE 1 END
    , 0) AS average_mark -- NULLIF returns NULL instead of dividing by zero when no test was taken
FROM users
Your table structure has two disadvantages:
Because your table structure is created dynamically you would also have to construct this query dynamically.
Because some students will not have taken all tests, you may have some NULL values.
You may want to consider changing your table structure to fix both of these problems. I would suggest that you use the following structure for your table:
user test mark
-------------------
A1 1 10
A2 1 90
A3 1 78
A1 2 20
A2 2 87
A3 2 12
A1 3 30
A2 3 75
A3 3 34
Then you can do this to get the average mark per user:
SELECT [user], AVG(1.0 * mark) AS average_mark -- 1.0 * keeps AVG from truncating integer marks
FROM users
GROUP BY [user]
And this to get the average mark per test:
SELECT test, AVG(1.0 * mark) AS average_mark
FROM users
GROUP BY test
Can you do it on your data source before you pivot it?
The simple answer is to UNPIVOT the same way you just PIVOTed. But the best answer is to not do the PIVOT in the first place! Store the unpivoted data in a table first, then from that do your PIVOT and your average.
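A minimal sketch of averaging over the unpivoted (user, test, mark) layout, run through Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability). The table name marks and the column name user_name are hypothetical, chosen to avoid the reserved word USER; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (user, test) pair instead of one column per test.
conn.execute("CREATE TABLE marks (user_name TEXT, test INTEGER, mark INTEGER)")
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [("A1", 1, 10), ("A2", 1, 90), ("A3", 1, 78),
     ("A1", 2, 20), ("A2", 2, 87), ("A3", 2, 12),
     ("A1", 3, 30), ("A2", 3, 75), ("A3", 3, 34)],
)

# Average per user: works unchanged however many tests are added.
per_user = dict(conn.execute(
    "SELECT user_name, AVG(mark) FROM marks GROUP BY user_name"
).fetchall())

# Average per test: the same query shape, grouped by the other column.
per_test = dict(conn.execute(
    "SELECT test, AVG(mark) FROM marks GROUP BY test"
).fetchall())

print(per_user)
print(per_test)
```

A missing mark is simply a missing row in this layout, so AVG skips it without any NULL handling. SQLite's AVG returns a floating-point value; in SQL Server, AVG over an int column returns an int, so a conversion such as multiplying by 1.0 is needed there to keep the fractional part.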
