Selecting counts with conditions and grouping - sql-server

Some easy points for somebody here. Take the following table of data:
EventCode ProcessId
---------- ---------
2 1
-6 3
42 1
-6 2
-12 2
23 4
4 2
-23 1
12 3
-26 1
I need a query that will get me a count of all process IDs that have at least one negative event code. So from the dataset above the result would be 3 (process IDs 1, 2 and 3 have negative event codes; process ID 4 does not).
It's probably really simple and involves grouping, but I just can't see the wood for the trees.
As a relevant aside, there are millions of rows in this table.

SELECT COUNT(DISTINCT ProcessID) FROM table
WHERE EventCode < 0

In case there are multiple negative values with the same ProcessId:
SELECT COUNT(DISTINCT ProcessId) FROM table WHERE EventCode < 0;
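To check the query against the sample data, here is a minimal sketch that loads the rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite standing in for SQL Server and the table name events are assumptions made only for this illustration.

```python
import sqlite3

# Sample rows from the question: (EventCode, ProcessId).
rows = [(2, 1), (-6, 3), (42, 1), (-6, 2), (-12, 2),
        (23, 4), (4, 2), (-23, 1), (12, 3), (-26, 1)]

conn = sqlite3.connect(":memory:")
# "events" is a hypothetical table name chosen for this sketch.
conn.execute("CREATE TABLE events (EventCode INTEGER, ProcessId INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Distinct processes that logged at least one negative event code.
distinct_processes = conn.execute(
    "SELECT COUNT(DISTINCT ProcessId) FROM events WHERE EventCode < 0"
).fetchone()[0]

# Plain row count of negative event codes, for comparison.
negative_rows = conn.execute(
    "SELECT COUNT(*) FROM events WHERE EventCode < 0"
).fetchone()[0]

print(distinct_processes, negative_rows)  # 3 5
```

The DISTINCT matters here: five rows have a negative code, but they belong to only three processes.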

If you want the counts grouped by ProcessId then this:
SELECT COUNT(*), [ProcessId]
FROM TBL
WHERE [EventCode] < 0
GROUP BY [ProcessId]
If you want the entire negative count:
SELECT COUNT(DISTINCT([ProcessId])) FROM Tbl WITH (NOLOCK) WHERE [EventCode] < 0
For performance:
Create a filtered non-clustered index on column EventCode and include column ProcessId, where EventCode < 0.
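As a rough illustration of the grouped counts and the filtered index idea, the sketch below uses SQLite through Python's sqlite3 module. SQLite calls this a partial index and has no INCLUDE clause, so both columns are simply index keys here; the table name events and the index name ix_negative_events are hypothetical, and nothing in the sketch measures SQL Server performance.

```python
import sqlite3

# Sample rows from the question: (EventCode, ProcessId).
rows = [(2, 1), (-6, 3), (42, 1), (-6, 2), (-12, 2),
        (23, 4), (4, 2), (-23, 1), (12, 3), (-26, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (EventCode INTEGER, ProcessId INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Partial index covering only the negative rows (SQLite's analogue of a
# SQL Server filtered index; the index name is hypothetical).
conn.execute(
    "CREATE INDEX ix_negative_events ON events (EventCode, ProcessId) "
    "WHERE EventCode < 0"
)

# Number of negative event codes per process.
per_process = conn.execute(
    "SELECT ProcessId, COUNT(*) FROM events "
    "WHERE EventCode < 0 GROUP BY ProcessId ORDER BY ProcessId"
).fetchall()

# Number of distinct processes with a negative event code.
total = conn.execute(
    "SELECT COUNT(DISTINCT ProcessId) FROM events WHERE EventCode < 0"
).fetchone()[0]

print(per_process, total)
```

The index only stores rows that satisfy the filter, so on a table with millions of mostly non-negative rows it stays small.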

The quick answer is
SELECT COUNT(DISTINCT t.ProcessId)
FROM mytable t
WHERE t.EventCode < 0
AND t.ProcessId IS NOT NULL
(It's not necessary to include the predicate on ProcessId; I do here, to point out that COUNT won't include a NULL value.)
That's the simplest approach, but not the only one.
The performance of other possible queries is really going to depend on the organization of the table (HEAP or CLUSTERED), and what indexes are available.
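The point about NULL can be checked with a small sketch, again using SQLite through Python's sqlite3 module as an assumed stand-in for SQL Server. The extra row with a NULL ProcessId is hypothetical and is not part of the question's data.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical row whose
# ProcessId is NULL (None in Python).
rows = [(2, 1), (-6, 3), (42, 1), (-6, 2), (-12, 2),
        (23, 4), (4, 2), (-23, 1), (12, 3), (-26, 1),
        (-5, None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (EventCode INTEGER, ProcessId INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

without_predicate = conn.execute(
    "SELECT COUNT(DISTINCT t.ProcessId) FROM mytable t "
    "WHERE t.EventCode < 0"
).fetchone()[0]

with_predicate = conn.execute(
    "SELECT COUNT(DISTINCT t.ProcessId) FROM mytable t "
    "WHERE t.EventCode < 0 AND t.ProcessId IS NOT NULL"
).fetchone()[0]

# COUNT(DISTINCT col) skips NULLs, so both queries agree.
print(without_predicate, with_predicate)  # 3 3
```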

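If you only want the number of rows with a negative event code (rather than the number of distinct process IDs), that would be:
select count(*) from YourTableNameHere where EventCode < 0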

Related

Classifying rows into a grouping column that shows the data is related to prior rows

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.
The data is as follows:
customer_id  sequence_id  prior_sequence_id
-----------  -----------  -----------------
1            1            0
1            2            1
1            3            2
2            4            0
2            5            4
2            6            0
2            7            6
Ideally, I would like to create the following grouping column and yield the following results:
customer_id  sequence_id  prior_sequence_id  grouping
-----------  -----------  -----------------  --------
1            1            0                  1
1            2            1                  1
1            3            2                  1
2            4            0                  2
2            5            4                  2
2            6            0                  3
2            7            6                  3
I've tried gaps-and-islands logic using the ROW_NUMBER() function, but without success. I suspect what I need is closer to a recursive CTE, which I am attempting at the moment.
I agree that a recursive CTE will do the job. Something like:
WITH reccte AS
(
    /* Anchor query: the starting point for the recursion.
     * In this case we want all records with no prior sequence
     * (prior_sequence_id = 0); each one starts a new group.
     */
    SELECT
        customer_id,
        sequence_id,
        prior_sequence_id,
        /* establish grouping */
        ROW_NUMBER() OVER (ORDER BY sequence_id) AS grouping
    FROM yourtable
    WHERE prior_sequence_id = 0
    /* SQL Server only accepts UNION ALL in a recursive CTE */
    UNION ALL
    /* Recursive member: join the CTE back to the table and iterate */
    SELECT
        yourtable.customer_id,
        yourtable.sequence_id,
        yourtable.prior_sequence_id,
        reccte.grouping
    FROM reccte
    INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;
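The recursive approach can be tried against the sample data with the sketch below, which runs an equivalent query in SQLite through Python's sqlite3 module. Assumptions: SQLite (3.25 or later, for window functions) stands in for the asker's database, it needs the RECURSIVE keyword, and the output column is named grp rather than grouping.

```python
import sqlite3

# Sample rows from the question:
# (customer_id, sequence_id, prior_sequence_id).
rows = [(1, 1, 0), (1, 2, 1), (1, 3, 2),
        (2, 4, 0), (2, 5, 4), (2, 6, 0), (2, 7, 6)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable ("
    "customer_id INTEGER, sequence_id INTEGER, prior_sequence_id INTEGER)"
)
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)

query = """
WITH RECURSIVE reccte AS (
    -- Anchor: rows with no prior sequence each start a new group.
    SELECT customer_id, sequence_id, prior_sequence_id,
           ROW_NUMBER() OVER (ORDER BY sequence_id) AS grp
    FROM yourtable
    WHERE prior_sequence_id = 0
    UNION ALL
    -- Recursive step: a row inherits the group of the row it points back to.
    SELECT y.customer_id, y.sequence_id, y.prior_sequence_id, r.grp
    FROM reccte AS r
    JOIN yourtable AS y ON r.sequence_id = y.prior_sequence_id
)
SELECT customer_id, sequence_id, prior_sequence_id, grp
FROM reccte
ORDER BY sequence_id
"""

grouped = conn.execute(query).fetchall()
for row in grouped:
    print(row)
```

Each chain is followed from its starting row, so this also works when rows of different groups are interleaved by sequence_id.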
It looks like you could use a simple correlated query, at least given your sample data:
select *, (
select Sum(Iif(prior_sequence_id = 0, 1, 0))
from t t2
where t2.sequence_id <= t.sequence_id
) Grouping
from t;
See Example Fiddle
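The correlated query is effectively a running count of group-starting rows. The sketch below, which is not part of the original answer, runs it next to a window-function version of the same running sum in SQLite (3.25 or later assumed) via Python's sqlite3 module, using CASE in place of Iif and the column name grp in place of Grouping.

```python
import sqlite3

# Sample rows from the question:
# (customer_id, sequence_id, prior_sequence_id).
rows = [(1, 1, 0), (1, 2, 1), (1, 3, 2),
        (2, 4, 0), (2, 5, 4), (2, 6, 0), (2, 7, 6)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t ("
    "customer_id INTEGER, sequence_id INTEGER, prior_sequence_id INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Correlated subquery: count group starts at or before the current row.
correlated = conn.execute("""
    SELECT sequence_id,
           (SELECT SUM(CASE WHEN t2.prior_sequence_id = 0 THEN 1 ELSE 0 END)
            FROM t AS t2
            WHERE t2.sequence_id <= t.sequence_id) AS grp
    FROM t
    ORDER BY sequence_id
""").fetchall()

# Same running sum expressed as a window function.
windowed = conn.execute("""
    SELECT sequence_id,
           SUM(CASE WHEN prior_sequence_id = 0 THEN 1 ELSE 0 END)
               OVER (ORDER BY sequence_id) AS grp
    FROM t
    ORDER BY sequence_id
""").fetchall()

print(correlated == windowed, windowed)
```

Both versions rely on each group's rows being contiguous in sequence_id order, which holds for the sample data but is an assumption about the real table.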

SQL Pivot only select rows

I am attempting to pivot a database so that only certain rows become columns. Below is what my table looks like:
ID  QType   CharV     NumV
1   AccNum            10
1   EmpNam  John Inc  0
1   UW      Josh      0
2   AccNum            11
2   EmpNam  CBS       0
2   UW      Dan       0
I would like the table to look like this:
ID  AccNum  EmpNam
1   10      John Inc
2   11      CBS
I have two main problems I am trying to account for.
1st: the value that I am trying to get isn't always in the same column. AccNum is always in the NumV column, while EmpNam is always in the CharV column.
2nd: I need to find a way to ignore data that I don't want. In this example it would be the row with UW in the QType column.
Below is the code that I have:
SELECT *
FROM testTable
Pivot(
MAX(NumV)
FOR[QType]
In ([AccNum],[TheValue])
)p
But it's giving me the below result:
ID  CharV     AccNum  TheValue
1             10      NULL
2             11      NULL
2   CBS       NULL    NULL
2   Dan       NULL    NULL
1   John Inc  NULL    NULL
1   Josh      NULL    NULL
In this case grouping with conditional aggregation should work. Try something like:
SELECT ID
, MAX(CASE WHEN QType = 'AccNum' THEN NumV END) AS AccNum
, MAX(CASE WHEN QType = 'EmpNam' THEN CharV END) AS EmpNam
FROM testTable
GROUP BY ID
Since the inner CASE only yields a value when the WHEN condition is met (and NULL otherwise), the MAX function will give you the desired value. Of course, this only works as long as each QType appears at most once per ID.
Generally, PIVOT in SQL Server doesn't work in one step when your conditions are complex, especially when you need values from different columns. You could pivot your table in two queries and join those, but it would perform poorly and be less readable than my suggestion.
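To see the conditional aggregation work on the question's data, here is a small sketch using SQLite through Python's sqlite3 module. SQLite standing in for SQL Server is an assumption, and the blank CharV cells in the question are loaded as NULL.

```python
import sqlite3

# Sample rows from the question: (ID, QType, CharV, NumV).
# Blank CharV cells are stored as NULL (None).
rows = [
    (1, "AccNum", None, 10), (1, "EmpNam", "John Inc", 0), (1, "UW", "Josh", 0),
    (2, "AccNum", None, 11), (2, "EmpNam", "CBS", 0), (2, "UW", "Dan", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testTable (ID INTEGER, QType TEXT, CharV TEXT, NumV INTEGER)"
)
conn.executemany("INSERT INTO testTable VALUES (?, ?, ?, ?)", rows)

pivoted = conn.execute("""
    SELECT ID,
           MAX(CASE WHEN QType = 'AccNum' THEN NumV END)  AS AccNum,
           MAX(CASE WHEN QType = 'EmpNam' THEN CharV END) AS EmpNam
    FROM testTable
    GROUP BY ID
    ORDER BY ID
""").fetchall()

print(pivoted)  # [(1, 10, 'John Inc'), (2, 11, 'CBS')]
```

The UW rows never match either CASE, so they contribute only NULLs, which MAX ignores.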

SQL Server query to display all columns but with distinct values in one of the columns (not grouping anything)

I have a table with 106 columns. One of those columns is a "Type" column with 16 types.
I want 16 rows, where the Type is distinct. So, row 1 has a type of "Construction", row 2 has a type of "Elevator PVT", etc.
Using Navicat.
From what I've found (and understood) so far, I can't use Distinct (because that looks across all rows), I can't use Group By (because that's for aggregating data, which I'm not looking to do), so I'm stuck.
Please be gentle- I'm really really new at this.
Below is a part of the table (how can I share this normally?)- it's really big so I didn't share the whole thing. Below is a partial result I'm looking for, where the Violation_Type is unique and the rest of the columns display.
Got it.. Sheesh... (took me forever, but got it...)
D_ID B_ID V_ID V_Type S_ID c_f d_y l_u p_s du_p
------ ------ ------- -------------- ------ ----- ------ ------ ----- ------
184 117 V 032 Elevator PVT 2 8 0 0
4 140 V 100 Construction 1 8 0 0
10 116 V 122 Electric 1 8 2005 0 0
11 117 V 033 Boiler Local 1 0 2005 0 0
You can use ROW_NUMBER for this:
SELECT *
FROM(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY V_Type ORDER BY (SELECT NULL))
FROM tbl
)t
WHERE rn = 1
Modify the ORDER BY depending on what row you want to prioritize.
From the documentation:
Returns the sequential number of a row within a partition of a result
set, starting at 1 for the first row in each partition.
This means that for every row within a partition (specified by the PARTITION BY clause), SQL Server assigns a number starting from 1, in the order specified by the ORDER BY clause.
ROW_NUMBER requires an ORDER BY clause. SELECT NULL tells SQL Server that we do not want to enforce a particular order; we just want the rows numbered within each partition.
The WHERE rn = 1 keeps only the rows that have a ROW_NUMBER of 1. This gives you one row for every V_Type available.
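A minimal sketch of the same pattern, run in SQLite (3.25 or later assumed) through Python's sqlite3 module. Only D_ID and V_Type are modeled, the two extra rows that duplicate a type are hypothetical, and the ORDER BY uses D_ID so that the chosen row is deterministic.

```python
import sqlite3

# (D_ID, V_Type): the first four pairs come from the question's sample;
# the last two are hypothetical duplicates added to show the filtering.
rows = [(184, "Elevator PVT"), (4, "Construction"), (10, "Electric"),
        (11, "Boiler Local"), (185, "Elevator PVT"), (5, "Construction")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (D_ID INTEGER, V_Type TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

one_per_type = conn.execute("""
    SELECT D_ID, V_Type
    FROM (
        SELECT *,
               -- Number the rows inside each V_Type, lowest D_ID first.
               ROW_NUMBER() OVER (PARTITION BY V_Type ORDER BY D_ID) AS rn
        FROM tbl
    ) AS t
    WHERE rn = 1
    ORDER BY V_Type
""").fetchall()

for row in one_per_type:
    print(row)
```

Six rows go in and four come out, one per distinct V_Type, with all other columns of the chosen row still available.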

group by with 'pre-defined row'

Say I have the following PaymentTransaction table:
ID Amount PayMethodID
----------------------------
10254 100 1
15789 150 1
15790 200 0
16954 300 0
17864 400 1
19364 500 1
PayMethodID Desc
----------------------------
0 CASH
1 VISA
2 MASTER
3 AMEX
4 ETC
I can simply use a GROUP BY to group the transactions under PayMethodID 1 and 0.
What I am trying to do is also show the PayMethodIDs that do not exist in the transaction table.
My current result with a simple GROUP BY statement is:
PayMethodID TotalAmount
-------------------------
0 500
1 1150
Expected result (showing 0 if the PayMethodID does not exist in the transaction table):
PayMethodID TotalAmount
-------------------------
0 500
1 1150
2 0
3 0
4 0
This might be a simple, duplicate question, but I just can't find the right keyword to search for. I will remove this post if you can point me to a duplicate. Thanks.
You can use a LEFT JOIN, so all rows from the leftmost table (TableA) are shown whether or not they have matching values in the other table.
SELECT a.PayMethodID,
TotalAmount = ISNULL(SUM(b.Amount), 0)
FROM TableA AS a -- <== contains list of card type
LEFT JOIN TableB AS b -- <== contains the payment list
ON a.PayMethodID = b.PayMethodID
GROUP BY a.PayMethodID
A regular LEFT (OUTER) JOIN will give you all rows from the PayMethod table whether or not they exist in the PaymentTransaction table, with the sums for unmatched rows being NULL. You can then use COALESCE to turn those NULLs into zero:
SELECT pm.PayMethodID, COALESCE(SUM(pt.Amount), 0) TotalAmount
FROM PayMethod pm
LEFT JOIN PaymentTransaction pt
ON pm.PayMethodID = pt.PayMethodID
GROUP BY pm.PayMethodID
An SQLfiddle to test with.
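For a quick local check, the sketch below loads both sample tables into SQLite through Python's sqlite3 module and runs the LEFT JOIN query. SQLite standing in for SQL Server is an assumption, and the Desc column is renamed to Description here because DESC is a keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PayMethod (PayMethodID INTEGER, Description TEXT)")
conn.execute(
    "CREATE TABLE PaymentTransaction ("
    "ID INTEGER, Amount INTEGER, PayMethodID INTEGER)"
)

# Sample rows from the question.
conn.executemany(
    "INSERT INTO PayMethod VALUES (?, ?)",
    [(0, "CASH"), (1, "VISA"), (2, "MASTER"), (3, "AMEX"), (4, "ETC")],
)
conn.executemany(
    "INSERT INTO PaymentTransaction VALUES (?, ?, ?)",
    [(10254, 100, 1), (15789, 150, 1), (15790, 200, 0),
     (16954, 300, 0), (17864, 400, 1), (19364, 500, 1)],
)

totals = conn.execute("""
    SELECT pm.PayMethodID, COALESCE(SUM(pt.Amount), 0) AS TotalAmount
    FROM PayMethod pm
    LEFT JOIN PaymentTransaction pt
           ON pm.PayMethodID = pt.PayMethodID
    GROUP BY pm.PayMethodID
    ORDER BY pm.PayMethodID
""").fetchall()

print(totals)  # [(0, 500), (1, 1150), (2, 0), (3, 0), (4, 0)]
```

Grouping on the column from the PayMethod side is what keeps the unmatched methods in the result.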

Rolling a number from rows with a flag into the next row without the flag

I'm a bit stumped about how to solve this particular piece of a problem I'm working on. I started with a much bigger problem, but I managed to simplify it into this while keeping good performance intact.
Say I have the following result set. AggregateMe is something I'm deriving from SQL conditionals.
MinutesElapsed AggregateMe ID Type RowNumber
1480 1 1 A 1
1200 0 1 A 2
1300 0 1 B 3
1550 0 1 C 4
725 1 1 A 5
700 0 1 A 6
1900 1 2 A 7
3300 1 2 A 8
4900 0 2 A 9
If AggregateMe is 1 (true), or, if you prefer, if the underlying conditions are true, I want the minutes to be aggregated into the next row where AggregateMe (or the conditions) does not evaluate to true.
Aggregate functions or Subqueries are fair game as is PARTITION BY.
For example, the above result set would become:
MinutesElapsed ID Type
2680 1 A
1300 1 B
1550 1 C
1425 1 A
10100 2 A
Is there a clean way to do this? If you want, I can share more about the original problem, but it is a bit more complicated.
Edited to add: SUM and GROUP BY alone won't work, because some sums would be rolled into the wrong row. My sample data did not reflect this case, so I added rows where this case can occur. In the updated sample data, using an aggregate function in the simplest way would cause the 2680 count and the 1425 count to be rolled together, which I do not want.
EDIT: And if you're wondering how I got here in the first place, here you go. I'm going to aggregate statistics about how long our program left something in a certain ActionType, and my first step was by creating this subquery. Please feel free to criticize:
select
ROW_NUMBER() over(order by claimid, insertdate asc) as RowNbr,
DateDiff(mi, ahCurrent.InsertDate, CASE WHEN ahNext.NextInsertDate is null THEN GetDate() ELSE ahNext.NextInsertDate END) as MinutesInActionType,
ahCurrent.InsertDate, ahNext.NextInsertDate,
ahCurrent.ClaimID, ahCurrent.ActionTypeID,
case when ahCurrent.ActionTypeID = ahNext.NextActionTypeID and ahCurrent.ClaimID = ahNext.NextClaimID then 1 else 0 end as aggregateme
FROM
(
select ROW_NUMBER () over(order by claimid, insertdate asc) as RowNum, ClaimID, InsertDate, ActionTypeID
From autostatushistory
--Where AHCurrent is not AHPast
) ahCurrent
LEFT JOIN
(
select ROW_NUMBER() over(order by claimid, insertdate asc) as RowNum, ClaimID as NextClaimID, InsertDate as NextInsertDate, ActionTypeID as NextActionTypeID
FROM autostatushistory
) ahNext
ON (ahCurrent.ClaimID = ahNext.NextClaimID AND ahCurrent.RowNum = ahNext.RowNum - 1 and ahCurrent.ActionTypeID = ahNext.NextActionTypeID)
Here is the query you need to execute.
It's not clean; you may be able to optimize it:
WITH cte AS( /* Create a table containing row number */
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ROW,
MinutesElapsed,
AggregateMe,
ID,
TYPE
FROM rolling
)
SELECT MinutesElapsed + (CASE /* adding minutes from next valid records*/
WHEN cte.AggregateMe <> 1 /*if current record is 0 then */
THEN 0 /*skip it*/
ELSE
(SELECT SUM(MinutesElapsed) /* calculating sum of all -> */
FROM cte localTbl
WHERE
cte.ROW < localTbl.ROW /* next records -> */
AND
localTbl.ROW <= ( /* until we find aggregate = 0 */
SELECT MIN(ROW)
FROM cte sTbl
WHERE sTbl.AggregateMe = 0
AND
sTbl.ROW > cte.ROW
)
AND
(localTbl.AggregateMe = 0 OR /* just to be sure :) */
localTbl.AggregateMe = 1))
END) as MinutesElapsed,
AggregateMe,
ID,
TYPE
FROM cte
WHERE cte.ROW = 1 OR NOT( /* not showing records used that are used in sum, skipping 1 record*/
( /* records with agregate 0 after record with aggregate 1 */
cte.AggregateMe = 0
AND
(
SELECT AggregateMe
FROM cte tblLocal
WHERE cte.ROW = (tblLocal.ROW + 1)
)>0
)
OR
( /* record with aggregate 1 after record with aggregate 1 */
cte.AggregateMe = 1
AND
(
SELECT AggregateMe
FROM cte tblLocal
WHERE cte.ROW = (tblLocal.ROW + 1)
)= 1
)
);
test here
Hope it helps with your problem.
Feel free to ask questions.
Looking at your result set, it seems the following would work:
SELECT ID,Type,SUM(MinutesElapsed)
FROM mytable
GROUP BY ID,Type
But I cannot tell for sure without seeing the original dataset.
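Because a plain GROUP BY ID, Type would merge the two separate runs for ID 1, Type A, one alternative is a gaps-and-islands style grouping: give each row an island number equal to the count of AggregateMe = 0 rows that precede it, then sum per island. The sketch below is not from either answer; it runs in SQLite (3.25 or later assumed) through Python's sqlite3 module, and the name island is made up for the illustration.

```python
import sqlite3

# Sample rows from the question:
# (MinutesElapsed, AggregateMe, ID, Type, RowNumber).
rows = [
    (1480, 1, 1, "A", 1), (1200, 0, 1, "A", 2), (1300, 0, 1, "B", 3),
    (1550, 0, 1, "C", 4), (725, 1, 1, "A", 5), (700, 0, 1, "A", 6),
    (1900, 1, 2, "A", 7), (3300, 1, 2, "A", 8), (4900, 0, 2, "A", 9),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rolling (MinutesElapsed INTEGER, AggregateMe INTEGER, "
    "ID INTEGER, Type TEXT, RowNumber INTEGER)"
)
conn.executemany("INSERT INTO rolling VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH islands AS (
    SELECT MinutesElapsed, ID, Type,
           -- Rows with AggregateMe = 0 close an island, so the island number
           -- is the count of such rows strictly before the current row.
           COALESCE(SUM(CASE WHEN AggregateMe = 0 THEN 1 ELSE 0 END) OVER (
               ORDER BY RowNumber
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS island
    FROM rolling
)
SELECT SUM(MinutesElapsed) AS MinutesElapsed, MIN(ID) AS ID, MIN(Type) AS Type
FROM islands
GROUP BY island
ORDER BY island
"""

rolled = conn.execute(query).fetchall()
for row in rolled:
    print(row)
```

MIN(ID) and MIN(Type) are safe here only because AggregateMe is 1 exactly when the next row has the same ID and Type, so every row in an island shares both values.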
