SQL Server SUM based on subsequent records - sql-server

Microsoft SQL Server 2012 (SP1) - 11.0.3156.0 (X64)
I am not sure of the best way to word this and have tried a few different searches with different combinations of words without success.
I only want to sum the Sequence = 1 rows when the same ID also has rows with Sequence > 1; in the table below, those Sequence = 1 rows are marked with *. I don't need to check that Sequence 2, 3, etc. follow the same pattern, because if they exist at all I need to sum them.
I have data that looks like this:
| Sequence | ID | Num | OtherID |
|----------|----|-----|---------|
| 1 | 1 | 10 | 1 |*
| 2 | 1 | 15 | 1 |
| 3 | 1 | 20 | 1 |
| 1 | 2 | 10 | 1 |*
| 2 | 2 | 15 | 1 |
| 1 | 3 | 10 | 1 |
| 1 | 1 | 40 | 3 |
I need to sum the Num column but only when there is more than one sequence. My output would look like this:
| Sequence | Sum | OtherID |
|----------|-----|---------|
| 1 | 20 | 1 |
| 2 | 30 | 1 |
| 3 | 20 | 1 |
I have tried grouping the queries in a bunch of different ways but really by the time I get to the sum, I don't know how to look ahead to make sure there are greater than 1 sequences for an ID.
My query at the moment looks something like this:
select Sequence, Sum(Num) as [Sum], OtherID
from tbl
where ID in (Select ID from tbl where Sequence > 1)
Group by Sequence, OtherID
tbl is a CTE that I wrapped around my query and it partially works, but is not really the filter I wanted.
If this is something that just shouldn't be done or can't be done then I can handle that, but if it's something I am missing I'd like to fix the query.
Edit:
I can't give the full query here but I started with this table/data (to get the above output). The OtherID is there because the data has the same ID/Sequence combinations but that OtherID helps separate them out so the rows are not identical (multiple questions on a form).
Create table #tmpTable (ID int, Sequence int, Num int, OtherID int)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 1, 10, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 2, 15, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 3, 20, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (2, 1, 10, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (2, 2, 15, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (3, 1, 10, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 1, 40, 3)

The following will sum over Sequence and OtherID, but only when either Sequence is greater than 1, or there is another row with the same ID and OtherID but a different Sequence.
Query:
select Sequence, Sum(Num) as SumNum, OtherID from #tmpTable a
where Sequence > 1
or exists (select * from #tmpTable b
where a.ID = b.ID
and a.OtherID = b.OtherID
and b.Sequence <> a.Sequence)
group by Sequence, OtherID;
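To check the EXISTS filter against the sample data, here is a minimal sketch that runs the same query shape in SQLite through Python's sqlite3 module. This is an assumption made for easy local testing, not SQL Server itself: the temp table is named tmpTable without the # prefix, and the function name sum_multi_sequence is hypothetical.

```python
import sqlite3

# Sample rows from the question: (ID, Sequence, Num, OtherID)
ROWS = [
    (1, 1, 10, 1), (1, 2, 15, 1), (1, 3, 20, 1),
    (2, 1, 10, 1), (2, 2, 15, 1),
    (3, 1, 10, 1),
    (1, 1, 40, 3),
]

def sum_multi_sequence(rows):
    """Sum Num per (Sequence, OtherID), keeping rows with Sequence > 1 or
    whose ID/OtherID pair also appears with a different Sequence."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tmpTable (ID INT, Sequence INT, Num INT, OtherID INT)"
    )
    con.executemany("INSERT INTO tmpTable VALUES (?, ?, ?, ?)", rows)
    result = con.execute("""
        SELECT a.Sequence, SUM(a.Num) AS SumNum, a.OtherID
        FROM tmpTable a
        WHERE a.Sequence > 1
           OR EXISTS (SELECT 1 FROM tmpTable b
                      WHERE b.ID = a.ID
                        AND b.OtherID = a.OtherID
                        AND b.Sequence <> a.Sequence)
        GROUP BY a.Sequence, a.OtherID
        ORDER BY a.OtherID, a.Sequence
    """).fetchall()
    con.close()
    return result

print(sum_multi_sequence(ROWS))
```

ID 3 and the OtherID = 3 row each have only a single Sequence, so neither contributes to any sum.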

It looks like you are trying to sum by Sequence and OtherID if the Count of ID >1, so you could do something like below:
select Sequence, Sum(Num) as [Sum], OtherID
from tbl
where ID in (Select ID from tbl where Sequence > 1)
Group by Sequence, OtherID
Having count(id)>1

Related

Query items that don't have a related record in link table but return results with Id from link table

I have an Item table:
Id | Title | Active
====================
1 | Item 1 | 1
2 | Item 2 | 1
A Location table:
Id | Name
=========
1 | A1
2 | B1
and a link table, where EventId specifies a cycle count event:
Id | EventId | ItemId | LocationId
=============|====================
1 | 1 | 1 | 2
2 | 1 | 2 | 1
3 | 2 | 1 | 1
4 | 2 | 2 | 2
5 | 3 | 1 | 1
I need to determine which items haven't been cycle-counted for a specified EventId (which in this example would be ItemId 2 for EventId 3). We're using a code generation tool that only supports tables and views with a simple filter, so I can't use a sproc or table-valued function. Ideally we'd like to be able to do this:
SELECT [EventId], [ItemId] FROM [SomeView] WHERE [EventId] = 3
and get a result like
EventId | ItemId
================
3 | 2
I've tried to wrap my head around this -- unsuccessfully -- because I know it's difficult to query a negative. Is this even possible?
Is something like the following what you're after?
select l.eventId, x.Id ItemId
from Link l
cross apply (
select *
from Items i
where i.Id != l.ItemId
)x
where l.EventId = 3;
--data to work with
DECLARE @items TABLE (ID int, Title nvarchar(100), Active int)
INSERT INTO @items VALUES (1, 'Item 1', 1)
INSERT INTO @items VALUES (2, 'Item 2', 1)
DECLARE @location TABLE (ID int, Name nvarchar(100))
INSERT INTO @location VALUES (1, 'A1')
INSERT INTO @location VALUES (2, 'B1')
DECLARE @linkTable TABLE (ID int, EventId int, ItemId int, LocationId int)
INSERT INTO @linkTable VALUES (1, 1, 1, 2)
INSERT INTO @linkTable VALUES (2, 1, 2, 1)
INSERT INTO @linkTable VALUES (3, 2, 1, 1)
INSERT INTO @linkTable VALUES (4, 2, 2, 2)
INSERT INTO @linkTable VALUES (5, 3, 1, 1)
INSERT INTO @linkTable VALUES (6, 4, 2, 1)
--query you want
SELECT 3 as EventID, ID as ItemID
FROM @items i
WHERE ID not in (SELECT ItemId
FROM @linkTable
WHERE EventId = 3)
Get all the ItemIds from the link table for the event, then get all the items from the items table that don't have the sync event. You can replace the 3 in the WHERE and SELECT clauses with whatever event you are looking for. And if you want all such pairs of event + item, then this should do it:
SELECT subData.EventId, subData.ItemID
FROM (SELECT i.ID as ItemID, cj.EventId
FROM @items i CROSS JOIN (SELECT DISTINCT EventId
FROM @linkTable) cj) subData
left join @linkTable lt ON lt.EventId = subData.EventId and lt.ItemId = subData.ItemID
WHERE lt.ID is null
This could be heavy on performance because of the CROSS JOIN, the DISTINCT and the subquery, but it gets the job done. First you build the set of all possible item and event pairs, then left join the link table to it. If the link table's ID is null, there is no row for that event + item pair, which means that the item is not synced for that event.
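Since the question asks for something a plain view with a simple filter can serve, here is a rough sketch of the all-pairs anti-join wrapped in a view. It runs in SQLite through Python's sqlite3 module, which is an assumption for local testing rather than SQL Server. The table names Item and LinkTable are illustrative, SomeView is the placeholder name from the question, and the data includes the extra EventId 4 row from the answer's setup.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Item (Id INT, Title TEXT, Active INT);
    INSERT INTO Item VALUES (1, 'Item 1', 1), (2, 'Item 2', 1);

    CREATE TABLE LinkTable (Id INT, EventId INT, ItemId INT, LocationId INT);
    INSERT INTO LinkTable VALUES
        (1, 1, 1, 2), (2, 1, 2, 1), (3, 2, 1, 1),
        (4, 2, 2, 2), (5, 3, 1, 1), (6, 4, 2, 1);

    -- Every (event, item) pair that has no matching link row.
    CREATE VIEW SomeView AS
    SELECT e.EventId, i.Id AS ItemId
    FROM Item i
    CROSS JOIN (SELECT DISTINCT EventId FROM LinkTable) e
    LEFT JOIN LinkTable lt
           ON lt.EventId = e.EventId AND lt.ItemId = i.Id
    WHERE lt.Id IS NULL;
""")

# The simple filter the code generation tool would issue.
missing_for_3 = con.execute(
    "SELECT EventId, ItemId FROM SomeView WHERE EventId = 3"
).fetchall()
all_missing = con.execute(
    "SELECT EventId, ItemId FROM SomeView ORDER BY EventId, ItemId"
).fetchall()
print(missing_for_3)
print(all_missing)
```

One caveat of this shape: an event only shows up in the view if it has at least one row in the link table, because the event list is derived from that table.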

Extract Quantity and price from text

I have these data
CREATE TABLE #Items (ID INT , Col VARCHAR(300))
INSERT INTO #Items VALUES
(1, 'Dave sold 10 items are sold to ABC servercies at 2.50 each'),
(2, '21 was sold to Tray Limited 3.90 each'),
(3, 'Consulting ordered 15 at 7.11 per one'),
(4, 'Returns from Murphy 7 at a cost of 6.10 for each item')
From the Col column I want to extract the Quantity and the Price.
I have written the below query which extract the quantity
SELECT
ID,
Col,
LEFT(SUBSTRING(Col, PATINDEX('%[0-9]%', Col), LEN(Col)),2) AS Qty
FROM #Items
My difficulty is that I don't know how to extract the Price.
Expected output
You were told already, that storing values within such a string is a real no-no-go.
But - if you have to deal with external input - you might try this:
DECLARE @items TABLE(ID INT , Col VARCHAR(300))
INSERT INTO @items VALUES
(1, 'Dave sold 10 items are sold to ABC servercies at 2.50 each'),
(2, '21 was sold to Tray Limited 3.90 each'),
(3, 'Consulting ordered 15 at 7.11 per one'),
(4, 'Returns from Murphy 7 at a cost of 6.10 for each item');
SELECT i.ID
,i.Col
,A.Casted.value('/x[not(empty(. cast as xs:int?))][1]','int') AS firstNumberAsInt
,A.Casted.value('/x[not(empty(. cast as xs:decimal?))][2]','decimal(10,4)') AS SecondNumberAsDecimal
FROM @items i
CROSS APPLY(SELECT CAST('<x>' + REPLACE((SELECT i.Col AS [*] FOR XML PATH('')),' ','</x><x>') + '</x>' AS XML)) A(Casted);
The idea in short:
We use some string methods to transform your string into XML, where each word is within its own <x>-element.
We use XML-XQuery's abilities to pick only nodes which answer a predicate.
We use the predicate not(empty(. cast as someType)). This will return an element only in cases where its content can be cast to that type. Any other element is omitted.
The result:
+----+------------------------------------------------------------+------------------+-----------------------+
| ID | Col | firstNumberAsInt | SecondNumberAsDecimal |
+----+------------------------------------------------------------+------------------+-----------------------+
| 1 | Dave sold 10 items are sold to ABC servercies at 2.50 each | 10 | 2.5000 |
+----+------------------------------------------------------------+------------------+-----------------------+
| 2 | 21 was sold to Tray Limited 3.90 each | 21 | 3.9000 |
+----+------------------------------------------------------------+------------------+-----------------------+
| 3 | Consulting ordered 15 at 7.11 per one | 15 | 7.1100 |
+----+------------------------------------------------------------+------------------+-----------------------+
| 4 | Returns from Murphy 7 at a cost of 6.10 for each item | 7 | 6.1000 |
+----+------------------------------------------------------------+------------------+-----------------------+
I'm sure you know that there are millions of cases where this kind of parsing will break...
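The same token-filtering idea can be sketched outside the database. The following Python is only an illustration of the logic, not a translation of the XQuery: it splits on spaces, keeps tokens that look like an integer or a decimal, and picks the first integer-like token and the second decimal-like token. The second one is used because the quantity itself also passes the decimal test. The function name and the regular expressions are assumptions.

```python
import re
from decimal import Decimal

# Tokens that would survive a cast to an integer or to a decimal.
INT_RE = re.compile(r"[+-]?\d+")
DEC_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def extract_qty_price(text):
    """Return (quantity, price) from a free-text line, or None for a part
    that cannot be found."""
    tokens = text.split(" ")
    ints = [t for t in tokens if INT_RE.fullmatch(t)]
    decs = [t for t in tokens if DEC_RE.fullmatch(t)]
    qty = int(ints[0]) if ints else None
    # The quantity is also a valid decimal, so the price is the second hit.
    price = Decimal(decs[1]) if len(decs) > 1 else None
    return qty, price

samples = [
    "Dave sold 10 items are sold to ABC servercies at 2.50 each",
    "21 was sold to Tray Limited 3.90 each",
    "Consulting ordered 15 at 7.11 per one",
    "Returns from Murphy 7 at a cost of 6.10 for each item",
]
for line in samples:
    print(extract_qty_price(line))
```

As with the SQL version, this relies on the quantity appearing before the price and on both being separated from other words by spaces.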
First things first: DON'T store things like that in a DB and expect to be able just "extract" data. I can give you a solution given the data you have, but it's going to fall down pretty quickly if anyone enters something silly, for example "Sold ice creams 1.50 each x 10" or "Bought 5 sorbets total 20".
What we will do is use CROSS APPLY in series to calculate the positions of each number.
SELECT
ID,
Col,
CAST(SUBSTRING(Col, FirstNum, EndFirst - 1) AS int) AS Qty,
CAST(SUBSTRING(Col, FirstNum + EndFirst + SecondNum - 2, EndSecond) AS decimal(18,2)) AS Price
FROM #Items
CROSS APPLY (VALUES (PATINDEX('%[0-9]%', Col) ) ) v1(FirstNum)
CROSS APPLY (VALUES (PATINDEX('%[^0-9]%', SUBSTRING(Col, FirstNum, LEN(Col))) ) ) v2(EndFirst)
CROSS APPLY (VALUES (PATINDEX('%[0-9.]%', SUBSTRING(Col, FirstNum + EndFirst - 1, LEN(Col))) ) ) v3(SecondNum)
CROSS APPLY (VALUES (PATINDEX('%[^0-9.]%', SUBSTRING(Col, FirstNum + EndFirst - 1 + SecondNum, LEN(Col))) ) ) v4(EndSecond)

SQL: Number of occurrences grouped by frequency

It's not clear to me exactly which statement to use here. I want to know how many times certain occurrences happen in the table when the value is A. So for some sample data:
user | value
1 | A
1 | A
1 | B
4 | A
4 | A
4 | B
5 | A
5 | A
5 | A
Would result in:
Occurrence | Frequency
1 | 0
2 | 2
3 | 1
Which reads as: there are 0 users that have 1 value A. There are 2 users that have two value A etc.
I feel like I should use a group by and a count(*), but it's not clear to me how to construct it.
Since you want the occurrences even for 0 frequencies, you need a recursive cte which return all occurrences from 1 to the max number of occurrences.
Then you join this cte with a LEFT join to a query that aggregates on the table and aggregate once more to get the frequencies:
with
cte as (
select count(*) counter
from tablename
where value = 'A'
group by [user]
),
top_counter as (select max(counter) counter from cte),
occurrences as (
select 1 occurrence
union all
select occurrence + 1
from occurrences
where occurrence < (select counter from top_counter)
)
select o.occurrence, count(c.counter) frequency
from occurrences o left join cte c
on c.counter = o.occurrence
group by o.occurrence
See the demo.
Results:
> occurrence | frequency
> ---------: | --------:
> 1 | 0
> 2 | 2
> 3 | 1
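For anyone who wants to try the recursive-CTE approach without a SQL Server instance, here is a sketch of the same query shape in SQLite via Python's sqlite3 module. That choice is an assumption for local testing: SQLite spells it WITH RECURSIVE, and the top_counter CTE is folded into a scalar subquery.

```python
import sqlite3

rows = [
    (1, "A"), (1, "A"), (1, "B"),
    (4, "A"), (4, "A"), (4, "B"),
    (5, "A"), (5, "A"), (5, "A"),
]

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE tablename ("user" INT, value TEXT)')
con.executemany("INSERT INTO tablename VALUES (?, ?)", rows)

frequencies = con.execute("""
    WITH RECURSIVE
    cte AS (                         -- number of 'A' rows per user
        SELECT COUNT(*) AS counter
        FROM tablename
        WHERE value = 'A'
        GROUP BY "user"
    ),
    occurrences(occurrence) AS (     -- 1, 2, ..., max(counter)
        SELECT 1
        UNION ALL
        SELECT occurrence + 1
        FROM occurrences
        WHERE occurrence < (SELECT MAX(counter) FROM cte)
    )
    SELECT o.occurrence, COUNT(c.counter) AS frequency
    FROM occurrences o
    LEFT JOIN cte c ON c.counter = o.occurrence
    GROUP BY o.occurrence
    ORDER BY o.occurrence
""").fetchall()
print(frequencies)
```

COUNT(c.counter) ignores the NULLs produced by the left join, which is what yields a frequency of 0 for occurrence 1.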
You do use COUNT, just 2 of them:
WITH Counts AS(
SELECT V.[User],
COUNT([Value]) AS Frequency
FROM (VALUES(1,'A'),
(1,'A'),
(1,'B'),
(4,'A'),
(4,'A'),
(4,'B'),
(5,'A'),
(5,'A'),
(5,'A'))V([User],[Value]) --USER is a reserved keyword and should not be used for object names
WHERE V.[Value] = 'A'
GROUP BY V.[user])
SELECT V.I,
COUNT(C.Frequency) AS Frequency
FROM (VALUES(1),(2),(3))V(I)
LEFT JOIN Counts C ON V.I = C.Frequency
GROUP BY V.I;
Here's my take:
with cte as (
select * from (values
(1, 'A'),
(1, 'A'),
(1, 'B'),
(4, 'A'),
(4, 'A'),
(4, 'B'),
(5, 'A'),
(5, 'A'),
(5, 'A')
) as x([User], [Value])
)
select c, count(*)
from (
select [User], count(*) as c
from cte
where [Value] = 'A'
group by [User]
) as s
group by c;
The common table expression isn't important here - it's just setting up your test data.
What you're after is an aggregation of aggregations. That is, the first level aggregate is a "count of value by user". But then you're going to get a "count of (count of value by user) by (that count)". Note, my set doesn't produce the "0 users that have 1 value A". Nor does it produce "0 users that have 17 value A". If it's important that it produce certain negative results, you'll need a list of which ones you care about and join that list with this set of results with an outer join.
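The "count of counts" idea, plus the explicit list of values that should appear even when their frequency is zero, can be illustrated in a few lines of plain Python. This is only a sketch of the logic, not SQL; the variable names are made up for the example.

```python
from collections import Counter

rows = [
    (1, "A"), (1, "A"), (1, "B"),
    (4, "A"), (4, "A"), (4, "B"),
    (5, "A"), (5, "A"), (5, "A"),
]

# First-level aggregate: how many 'A' rows each user has.
per_user = Counter(user for user, value in rows if value == "A")

# Second-level aggregate: how many users share each count.
count_of_counts = Counter(per_user.values())

# The outer-join step: the list of occurrences we care about,
# with 0 filled in where no user has that count.
wanted = [1, 2, 3]
report = [(n, count_of_counts.get(n, 0)) for n in wanted]
print(report)
```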

Insert multiple rows at once with a calculated column from prior inserts into SQL Server

I'm trying to figure out how to do a multi-row insert as one statement in SQL Server, but where one of the columns is computed based on the data as it stands after every inserted row.
Let's say I run this simple query and get back 3 records:
SELECT *
FROM event_courses
WHERE event_id = 100
Results:
id | event_id | course_id | course_priority
---+----------+-----------+----------------
10 | 100 | 501 | 1
11 | 100 | 502 | 2
12 | 100 | 503 | 3
Now I want to insert 3 more records into this table, except I need to be able to calculate the priority for each record. The priority should be the count of all courses in this event. But if I run a sub-query, I get the same priority for all new courses:
INSERT INTO event_courses (event_id, course_id, course_priority)
VALUES (100, 500,
(SELECT COUNT (id) + 1 AS cnt_event_courses
FROM event_courses
WHERE event_id = 100)),
(100, 501,
(SELECT COUNT (id) + 1 AS cnt_event_courses
FROM event_courses
WHERE event_id = 1))
Results:
id | event_id | course_id | course_priority
---+----------+-----------+-----------------
10 | 100 | 501 | 1
11 | 100 | 502 | 2
12 | 100 | 503 | 3
13 | 100 | 504 | 4
14 | 100 | 505 | 4
15 | 100 | 506 | 4
Now I know I could easily do this in a loop outside of SQL and just run a bunch of insert statement, but that's not very efficient. There's got to be a way to calculate the priority on the fly during a multi-row insert.
Big thanks to @Sean Lange for the answer. I was able to simplify it even further for my application. Great lead! Learned 2 new syntax tricks today ;)
DECLARE @eventid int = 100
INSERT event_courses
SELECT @eventid AS event_id,
course_id,
course_priority = existingEventCourses.prioritySeed + ROW_NUMBER() OVER(ORDER BY tempid)
FROM (VALUES
(1, 501),
(2, 502),
(3, 503)
) courseInserts (tempid, course_id) -- This basically creates a temp table in memory at run-time
CROSS APPLY (
SELECT COUNT(id) AS prioritySeed
FROM event_courses
WHERE event_id = @eventid
) existingEventCourses
SELECT *
FROM event_courses
WHERE event_id = @eventid
Here is an example of how you might be able to do this. I have no idea where your new row values are coming from, so I just tossed them in a derived table. I doubt your final solution would look like this, but it demonstrates how you can leverage ROW_NUMBER to accomplish this type of thing.
declare @EventCourse table
(
id int identity
, event_id int
, course_id int
, course_priority int
)
insert @EventCourse values
(100, 501, 1)
,(100, 502, 2)
,(100, 503, 3)
select *
from @EventCourse
insert @EventCourse
(
event_id
, course_id
, course_priority
)
select x.eventID
, x.coursePriority
, NewPriority = y.MaxPriority + ROW_NUMBER() over(partition by x.eventID order by x.coursePriority)
from
(
values(100, 504)
,(100, 505)
,(100, 506)
)x(eventID, coursePriority)
cross apply
(
select max(course_priority) as MaxPriority
from @EventCourse ec
where ec.event_id = x.eventID
) y
select *
from @EventCourse
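Here is a small runnable sketch of the seed-plus-ROW_NUMBER pattern. It uses SQLite through Python's sqlite3 module, which is an assumption for local testing: it needs SQLite 3.25 or later for window functions, and the CROSS APPLY is replaced by an uncorrelated scalar subquery. The CTE name new_rows is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE event_courses (
        id INTEGER PRIMARY KEY,
        event_id INT,
        course_id INT,
        course_priority INT
    );
    INSERT INTO event_courses VALUES
        (10, 100, 501, 1), (11, 100, 502, 2), (12, 100, 503, 3);
""")

event_id = 100
con.execute("""
    WITH new_rows(tempid, course_id) AS (
        VALUES (1, 504), (2, 505), (3, 506)
    )
    INSERT INTO event_courses (event_id, course_id, course_priority)
    SELECT :event_id,
           n.course_id,
           -- seed = number of courses already on the event,
           -- offset by the position of each new row
           (SELECT COUNT(*) FROM event_courses WHERE event_id = :event_id)
             + ROW_NUMBER() OVER (ORDER BY n.tempid)
    FROM new_rows n
    ORDER BY n.tempid
""", {"event_id": event_id})

priorities = con.execute(
    "SELECT course_id, course_priority FROM event_courses "
    "WHERE event_id = ? ORDER BY id",
    (event_id,),
).fetchall()
print(priorities)
```

The seed is evaluated once for the whole statement, so each new row gets its priority from its row number rather than from a recount of the table.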

updating min value on the second column when the first column appears more than once

I'm struggling with how to do this in one step.
I have a column with values which vary between 1 and +-20. Linked to this is a second value which is normally between 1 and 5.
What I want to do is: when a value in the Number 1 column appears more than once, update the Number 2 column to 99, but only on the row with the highest Number 2 value.
I have added a pic to explain better.
Basically id is unique; if value 1 appears more than once, I need to update value 2 on the row where value 2 is the highest.
You can use row_number() to find the row with the highest No2 value, and you can use count() over() to check whether there is more than one row present for a No1 value.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table YourTable
(
No1 int,
No2 int
);
insert into YourTable values
(1, 3),
(1, 2),
(2, 1);
Query 1:
with C as
(
select No2,
row_number() over(partition by No1 order by No2 desc) as rn,
count(*) over(partition by No1) as c
from YourTable
)
update C
set No2 = 99
where rn = 1 and
c > 1
Query 2:
select *
from YourTable
Results:
| NO1 | NO2 |
|-----|-----|
| 1 | 99 |
| 1 | 2 |
| 2 | 1 |
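To experiment with the row_number() plus count() over() combination locally, here is a sketch using SQLite through Python's sqlite3 module (an assumption for testing; window functions need SQLite 3.25 or later). SQLite cannot update through a CTE the way SQL Server can, so the sketch selects the target rows by their rowid instead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE YourTable (No1 INT, No2 INT);
    INSERT INTO YourTable VALUES (1, 3), (1, 2), (2, 1);
""")

con.execute("""
    UPDATE YourTable
    SET No2 = 99
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   -- rn = 1 marks the highest No2 within each No1
                   ROW_NUMBER() OVER (PARTITION BY No1 ORDER BY No2 DESC) AS rn,
                   -- c counts how many rows share the No1 value
                   COUNT(*) OVER (PARTITION BY No1) AS c
            FROM YourTable
        )
        WHERE rn = 1 AND c > 1
    )
""")

updated = con.execute(
    "SELECT No1, No2 FROM YourTable ORDER BY rowid"
).fetchall()
print(updated)
```

The No1 = 2 row is left alone because its partition contains only one row.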
