Subtract data from the same column - sql-server

I am having difficulties subtracting data from the same column. My data structure looks like this:
TimeStamp   ID  State  Gallons
1/01/2012   1   NY     100
1/05/2012   1   NY     90
1/10/2012   1   NY     70
1/15/2012   1   NY     40
So I would like to get the result of (100-40) = 60 gallons used for that time period. I tried using the following query:
SELECT T1.Gallons, T1.Gallons -T1.Gallons AS 'Gallons used'
FROM Table 1 AS T1
where Date = '2012-07-01'
ORDER BY Gallons
It seems like it should be a simple thing to do. However, I cannot find an efficient way to bring the timestamp into the query so that it retrieves the right values (e.g. the gallon value on 1/1/2012 minus the gallon value on 1/30/2012).
P.S. The value of "Gallons" will always decrease.

select max(gallons) - min(gallons)
from table1
group by Date
having Date = '2012-07-01'

try
select max(gallons) - min(gallons)
from table1 t1
where date between '2012-01-01' and '2012-01-31'

SELECT MAX(gallons) - MIN(gallons) AS 'GallonsUsed'
FROM table;
would work in this instance; if you are doing a monthly report, this could work too. How is it that the value of gallons never increases? What happens when you run out?
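If you do want a monthly report, a grouped version of the same idea is sketched below (this assumes your table is named Table1, has a Date column, and that Gallons only ever decreases within a month; adjust the names to your schema):
-- Gallons used per month = max reading - min reading within that month.
SELECT YEAR([Date]) AS [Year],
       MONTH([Date]) AS [Month],
       MAX(Gallons) - MIN(Gallons) AS GallonsUsed
FROM Table1
GROUP BY YEAR([Date]), MONTH([Date])
ORDER BY [Year], [Month];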

Will you always be comparing the maximum and minimum amounts? If not, maybe something like the following (a generic sample) will suffice. The first time you run the script you'll get an error, because the DROP TABLE has nothing to drop yet, but you can disregard it.
drop table TableName
create table TableName (id int, value int)
insert into TableName values (3, 20)
insert into TableName values (4, 17)
insert into TableName values (5, 16)
insert into TableName values (6, 12)
insert into TableName values (7, 10)
select t1.value - t2.value
from TableName as t1 inner join TableName as t2
on t1.id = 5 and t2.id = 6
The part you're asking about is the last line. After inner joining the table onto itself, we simply ask to match the two rows with certain IDs, or, in your case, with certain dates.
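Applied to your table, a sketch along the same lines (assuming the table is called Table1 and that TimeStamp is stored as a date) might be:
-- Self-join on the two dates of interest and subtract the readings.
SELECT t_start.Gallons - t_end.Gallons AS GallonsUsed
FROM Table1 AS t_start
INNER JOIN Table1 AS t_end
    ON t_start.ID = t_end.ID
   AND t_start.TimeStamp = '2012-01-01'
   AND t_end.TimeStamp = '2012-01-30';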

SQL Function to return sequential IDs

Consider this simple INSERT
INSERT INTO Assignment (CustomerId,UserId)
SELECT CustomerId,123 FROM Customers
That will obviously assign UserId=123 to all customers.
What I need to do is assign them to 3 UserIds sequentially, so that 3 users each get one third of the accounts.
INSERT INTO Assignment (CustomerId,UserId)
SELECT CustomerId,fnGetNextId() FROM Customers
Could I create a function that returns values sequentially from a list of 3 IDs, i.e. each time the function is called it returns the next one in the list?
Thanks
Could I create a function that returns values sequentially from a list of 3 IDs?
If you create a SEQUENCE, then you can assign incremental numbers with the NEXT VALUE FOR (Transact-SQL) expression.
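For example, a sketch using a cycling sequence, assuming the three user IDs happen to be 1, 2 and 3 (the sequence name is made up; if your IDs are different, map the sequence value onto them):
-- Cycles 1 -> 2 -> 3 -> 1 -> ...
CREATE SEQUENCE dbo.UserCycle AS int
    START WITH 1 INCREMENT BY 1
    MINVALUE 1 MAXVALUE 3
    CYCLE;

INSERT INTO Assignment (CustomerId, UserId)
SELECT CustomerId, NEXT VALUE FOR dbo.UserCycle
FROM Customers;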
This is a strange requirement, but the modulus operator (%) should help you out without the need for functions, sequences, or altering your database structure. This assumes that the IDs are integers. If they're not, you can use ROW_NUMBER or a number of other tactics to get a distinct number value for each customer.
Obviously, you would replace the SELECT statement with an INSERT once you're satisfied with the code, but it's good practice to SELECT first while developing, before you INSERT.
SETUP WITH SAMPLE DATA:
DECLARE @Users TABLE (ID int, [Name] varchar(50))
DECLARE @Customers TABLE (ID int, [Name] varchar(50))
DECLARE @Assignment TABLE (CustomerID int, UserID int)
INSERT INTO @Customers
VALUES
(1, 'Joe'),
(2, 'Jane'),
(3, 'Jon'),
(4, 'Jake'),
(5, 'Jerry'),
(6, 'Jesus')
INSERT INTO @Users
VALUES
(1, 'Ted'),
(2, 'Ned'),
(3, 'Fred')
QUERY:
SELECT C.Name AS [CustomerName], U.Name AS [UserName]
FROM @Customers C
JOIN @Users U
ON
CASE WHEN C.ID % 3 = 0 THEN 1
WHEN C.ID % 3 = 1 THEN 2
WHEN C.ID % 3 = 2 THEN 3
END = U.ID
You would replace the THEN 1 with whatever your first UserID is, THEN 2 with the second UserID, and THEN 3 with the third UserID. If you end up with another user and want to split the customers 4 ways, you would replace the CASE expression with the following:
CASE WHEN C.ID % 4 = 0 THEN 1
WHEN C.ID % 4 = 1 THEN 2
WHEN C.ID % 4 = 2 THEN 3
WHEN C.ID % 4 = 3 THEN 4
END = U.ID
OUTPUT:
CustomerName  UserName
------------  --------
Joe           Ned
Jane          Fred
Jon           Ted
Jake          Ned
Jerry         Fred
Jesus         Ted

(6 row(s) affected)
Lastly, you will want to select the IDs for your actual insert, but I selected the names so the results are easier to understand. Please let me know if this needs clarification.
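If the IDs are not plain integers you can take % of, here is a sketch of the ROW_NUMBER variant mentioned above, reusing the same sample tables (numbering customers and users from 0 and distributing them round-robin):
-- Derive a dense position for each customer and user, then join on the modulo.
SELECT C.[Name] AS CustomerName, U.[Name] AS UserName
FROM (SELECT ID, [Name],
             ROW_NUMBER() OVER (ORDER BY ID) - 1 AS rn
      FROM @Customers) AS C
JOIN (SELECT ID, [Name],
             ROW_NUMBER() OVER (ORDER BY ID) - 1 AS rn,
             COUNT(*) OVER () AS UserCount
      FROM @Users) AS U
  ON C.rn % U.UserCount = U.rn;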
Here's one way to produce Assignment as an automatically rebalancing view:
CREATE VIEW dbo.Assignment WITH SCHEMABINDING AS
WITH SeqUsers AS (
SELECT UserID, ROW_NUMBER() OVER (ORDER BY UserID) - 1 AS _ord
FROM dbo.Users
), SeqCustomers AS (
SELECT CustomerID, ROW_NUMBER() OVER (ORDER BY CustomerID) - 1 AS _ord
FROM dbo.Customers
)
-- INSERT Assignment(CustomerID, UserID)
SELECT SeqCustomers.CustomerID, SeqUsers.UserID
FROM SeqUsers
JOIN SeqCustomers ON SeqUsers._ord = SeqCustomers._ord % (SELECT COUNT(*) FROM SeqUsers)
;
This shifts assignments around if you insert a new user, which could be quite undesirable, and it's also not efficient if you have to JOIN on it. You can easily repurpose the query it contains for one-time inserts (the commented-out INSERT). The key technique there is joining on ROW_NUMBER()s.

Retrieve Sorted Column Value in SQL Server

What I have:
I have a column:
ID  SerialNo
1   101
2   102
3   103
4   104
5   105
6   116
7   117
8   118
9   119
10  120
These are just 10 dummy rows; the actual table has over 100,000 rows.
What I Want to get:
A method or formula, like some sorting technique, that returns the starting and ending element of the [SerialNo] column for every sub-series. For example:
Expected result: 101-105, 116-120
The comma separation in the above result is not important; only the starting and ending elements matter.
What I have tried:
I did it procedurally, running a loop that stores the starting and ending elements in a table.
But due to the number of rows (over 100,000), the query execution takes around 2 minutes.
I have also searched for sorting techniques for SQL Server but found nothing, because processing every row one at a time takes far longer than a set-based (sorting) approach would.
Assuming every sub-series contains 5 records, I got the expected result using the SQL below. I hope this helps.
DECLARE @subSeriesRange INT = 5;
CREATE TABLE #Temp(ID INT,SerialNo INT);
INSERT INTO #Temp VALUES(1,101),
(2,102),
(3,103),
(4,104),
(5,105),
(6,116),
(7,117),
(8,115),
(9,119),
(10,120);
SELECT STUFF((SELECT CONCAT(CASE ID%@subSeriesRange WHEN 1 THEN ',' ELSE '-' END,SerialNo)
FROM #Temp
WHERE ID%@subSeriesRange = 1 OR ID%@subSeriesRange = 0
ORDER BY ID
FOR XML PATH('')),1,1,''
);
DROP TABLE #Temp;
Just finding the start and end of each series is quite straightforward:
declare @t table (ID int not null, SerialNo int not null)
insert into @t(ID,SerialNo) values
(1 ,101), (2 ,102), (3 ,103),
(4 ,104), (5 ,105), (6 ,116),
(7 ,117), (8 ,118), (9 ,119),
(10,120)
;With Starts as (
select t1.SerialNo,ROW_NUMBER() OVER (ORDER BY t1.SerialNo) as rn
from
@t t1
left join
@t t1_no
on t1.SerialNo = t1_no.SerialNo + 1
where t1_no.ID is null
), Ends as (
select t1.SerialNo,ROW_NUMBER() OVER (ORDER BY t1.SerialNo) as rn
from
@t t1
left join
@t t1_no
on t1.SerialNo = t1_no.SerialNo - 1
where t1_no.ID is null
)
select
s.SerialNo as StartSerial,
e.SerialNo as EndSerial
from
Starts s
inner join
Ends e
on s.rn = e.rn
The logic is that a Start is a row for which no row exists with a SerialNo one less than it, and an End is a row for which no row exists with a SerialNo one more than it.
This may still perform poorly if there is no index on the SerialNo column.
Results:
StartSerial EndSerial
----------- -----------
101 105
116 120
Hopefully that's acceptable, since you didn't seem to care exactly what the results look like, and it keeps things set-based.
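If the self-joins turn out to be too slow on 100,000+ rows, the same islands can be found in a single pass with the SerialNo - ROW_NUMBER() grouping trick; a sketch against the same @t sample table, assuming consecutive SerialNo values within a series always differ by exactly 1:
-- Rows in the same consecutive run share the same (SerialNo - row number) value.
;WITH Grouped AS (
    SELECT SerialNo,
           SerialNo - ROW_NUMBER() OVER (ORDER BY SerialNo) AS grp
    FROM @t
)
SELECT MIN(SerialNo) AS StartSerial,
       MAX(SerialNo) AS EndSerial
FROM Grouped
GROUP BY grp
ORDER BY StartSerial;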

SUM() column based on other columns

I have a table with sales plan data for every week, which consists of a few columns:
SAL_DTDGID -- which is date of every Sunday, for example 20160110, 20160117
SAL_MQuantity --sum of sales plan value
SAL_MQuantityYTD --sum of plans since first day of the year
SAL_CoreElement --sales plan data for few core elements
SAL_Site --unique identifier of place, where sale has happened
How do I fill SAL_MQuantityYTD with the sum of the SAL_MQuantity values from the first record in 2016 up to 'now', for every site and every core element?
Every site mentioned in SAL_Site has 52 rows, corresponding to the number of weeks in a year, for each of 5 different SAL_CoreElement values.
Example:
SAL_DTDGID|SAL_MQuantity|SAL_MQuantityYTD|SAL_CoreElement|SAL_Site
20160110 |20000 |20000 |1 |1234
20160117 |10000 |30000 |1 |1234
20160124 |30000 |60000 |1 |1234
If something isn't clear I'll try to explain.
Not sure I completely understand your question, but this should allow you to recreate the running sum for SAL_MQuantityYTD. Replace #test with whatever your table/view is called.
SELECT *,
(SELECT SUM(SAL_MQuantity)
FROM #test T2
WHERE T2.SAL_DTDGID <= T1.SAL_DTDGID
AND T2.SAL_Site = T1.SAL_Site
AND T2.SAL_coreElement = T1.SAL_coreElement) AS RunningTotal
FROM #test T1
If you wanted to create the yearly figure then you could also use a correlated subquery like this
SELECT *,
(SELECT SUM(SAL_MQuantity)
FROM #test T2
WHERE cast(left(T2.SAL_DTDGID,4) as integer) = cast(left(T1.SAL_DTDGID,4) as integer)
AND T2.SAL_Site = T1.SAL_Site
AND T2.SAL_coreElement = T1.SAL_coreElement) AS RunningTotal
FROM #test T1
Edit: I've just seen that another answer does basically the same thing using a window function.
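For reference, a sketch of that windowed form of the running total, assuming SQL Server 2012 or later and the same #test table and column names as above:
-- Running YTD total per site, core element and year (year taken from the first 4 digits of SAL_DTDGID).
SELECT *,
       SUM(SAL_MQuantity) OVER (
           PARTITION BY SAL_Site, SAL_CoreElement, LEFT(CAST(SAL_DTDGID AS varchar(8)), 4)
           ORDER BY SAL_DTDGID
           ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM #test;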
Let me explain an idea. Please try the query below.
Select A, B,
(Select SUM(SAL_MQuantity)
FROM [Your Table]
WHERE [your date column] between '20160101' AND '[Present date]') AS SAL_MQuantityYTD
FROM [Your Table]
My understanding from your question is that you want the YTD sum of SAL_MQuantity for each year (you can simply add a WHERE afterwards if you only want 2016), SAL_Site and SAL_CoreElement.
The code below should achieve that and will run on SQL 2008 R2 (I'm running 2005).
'##t1' is the temp table name I used for testing; replace it with your table name.
Select distinct
sum (SAL_MQuantity) over (partition by
left (cast (cast (SAL_DTDGID as int) as varchar (8)),4)
, SAL_Site
, SAL_CoreElement
) as Sum_SAL_DTDGID
,left (cast (cast (SAL_DTDGID as int) as varchar (8)),4) as Time_Period
, SAL_Site
, SAL_CoreElement
from ##t1

Getting Random Number for each row

I have a table with some names in it. For each row I want to generate a random name. I wrote the following query:
BEGIN transaction t1
Create table TestingName
(NameID int,
FirstName varchar(100),
LastName varchar(100)
)
INSERT INTO TestingName
SELECT 0,'SpongeBob','SquarePants'
UNION
SELECT 1, 'Bugs', 'Bunny'
UNION
SELECT 2, 'Homer', 'Simpson'
UNION
SELECT 3, 'Mickey', 'Mouse'
UNION
SELECT 4, 'Fred', 'Flintstone'
SELECT FirstName from TestingName
WHERE NameID = ABS(CHECKSUM(NEWID())) % 5
ROLLBACK Transaction t1
The problem is the "ABS(CHECKSUM(NEWID())) % 5" portion of this query sometime returns more than 1 row and sometimes returns 0 rows. I must be missing something but I can't see it.
If I change the query to
DECLARE #n int
set #n= ABS(CHECKSUM(NEWID())) % 5
SELECT FirstName from TestingName
WHERE NameID = #n
Then everything works and I get a random number per row.
If you paste the query above into SQL Server Management Studio and run the first SELECT a bunch of times, you will see what I am attempting to describe.
The final update query will look like
Update TableWithABunchOfNames
set [FName] = (SELECT FirstName from TestingName
WHERE NameID = ABS(CHECKSUM(NEWID())) % 5)
This does not work because sometimes I get more than 1 row and sometimes I get no rows.
What am I missing?
The problem is that you are getting a different random value for each row. The query is probably doing a full table scan, and the WHERE clause is evaluated for each row, so a different random number is generated each time.
So you might get a sequence of random numbers where none of the IDs match, or a sequence where more than one matches. On average you'll have one match, but you don't want "on average", you want a guarantee.
This is when you want rand(), which produces only one random number per query:
SELECT FirstName
from TestingName
WHERE NameID = floor(rand() * 5);
This should get you one value.
Why not use top 1?
Select top 1 firstName
From testingName
Order by newId()
This worked for me:
WITH
CTE
AS
(
SELECT
ID
,FName
,CAST(5 * (CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5) AS int) AS rr
FROM
dbo.TableWithABunchOfNames
)
,CTE_ForUpdate
AS
(
SELECT
CTE.ID
, CTE.FName
, dbo.TestingName.FirstName AS RandomName
FROM
CTE
LEFT JOIN dbo.TestingName ON dbo.TestingName.NameID = CTE.rr
)
UPDATE CTE_ForUpdate
SET FName = RandomName
;
This solution depends on how smart the optimizer is.
For example, if I use INNER JOIN instead of LEFT JOIN (LEFT JOIN being the correct choice for this query), the optimizer moves the calculation of the random numbers so that it happens as part of the join, and the end result is not what we expect.
I created a table TestingName with 5 rows as in the question and a table TableWithABunchOfNames with 100 rows.
In the execution plan with LEFT JOIN, the Compute Scalar that calculates the random numbers is done before the join loop, and 100 rows were updated.
In the execution plan with INNER JOIN, the Compute Scalar that calculates the random numbers is done after the join loop, with an extra Filter. That query may not update every row in TableWithABunchOfNames, and some rows may be updated several times. The Filter left 102 rows and the Stream Aggregate left only 69 rows, which means only 69 rows were eventually updated and there were multiple matches for some rows (102 - 69 = 33).
To guarantee that the result is what you expect, you should generate a random number for each row in TableWithABunchOfNames and explicitly remember the result, i.e. materialize the CTE shown above, and then join this temporary result with the TestingName table.
You can add a column to TableWithABunchOfNames to store the generated random numbers, or save the CTE to a temp table or table variable.
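A minimal sketch of the temp-table route, assuming TableWithABunchOfNames has an ID key and an FName column as in the CTE above (#RandomPick is a made-up name):
-- Materialize one random NameID per row first, so it cannot be re-evaluated during the join.
SELECT ID, ABS(CHECKSUM(NEWID())) % 5 AS rr
INTO #RandomPick
FROM dbo.TableWithABunchOfNames;

UPDATE t
SET t.FName = n.FirstName
FROM dbo.TableWithABunchOfNames AS t
JOIN #RandomPick AS r ON r.ID = t.ID
JOIN dbo.TestingName AS n ON n.NameID = r.rr;

DROP TABLE #RandomPick;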

SQL Newbie Needs Assistance w/ Query

Below is what I am trying to do in SQL Server 2012. I want to update Table 2 with the % of the total in Table 1 that each AMT value represents, but the denominator for the % should only be the total of the rows that have the same MasterDept. I can use this SELECT query to get the correct percentages when I load the table with only one MasterDept, but I don't know how to do it when there are multiple MasterDepts. The first 2 columns in each table are identical, both in structure and in data.
SELECT ABCID,
[AMT%] = ClientSpreadData.AMT/CONVERT(DECIMAL(16,4),(SELECT SUM(ClientSpreadData.AMT)
FROM ClientSpreadData))
FROM ClientSpreadData
Table data
TABLE 1 (MasterDept varchar(4), ABCID varchar(20), AMT INT)
Sample Data (4700, 1, 25),
(4300, 2, 30),
(4700, 3, 50),
(4300, 4, 15)
TABLE 2 (MasterDept varchar(4), ABCID varchar(20), [AMT%] INT)
Sample Data (4700, 1, AMT%)
AMT% should equal AMT / SUM(AMT). SUM(AMT) should only be summing the values where the MasterDept on Table 1 matches the MasterDept from the record on Table 2.
Does that make sense?
You can use a window function to get a partitioned SUM():
SELECT MasterDept, ABCID, AMT, SUM(AMT) OVER(PARTITION BY MasterDept)
FROM #Table1
You can use that to get the percentage for each row to update your second table (this assumes 1 row per MasterDept/ABCID combination):
UPDATE A
SET A.[AMT%] = B.[AMT%]
FROM Table2 A
JOIN (SELECT MasterDept
, ABCID
, AMT
, CASE WHEN SUM(AMT) OVER(PARTITION BY MasterDept) = 0 THEN 0
ELSE AMT*1.0/SUM(AMT) OVER(PARTITION BY MasterDept)
END 'AMT%'
FROM #Table1
) B
ON A.MasterDept = B.MasterDept
AND A.ABCID = B.ABCID
As you can see in the subquery, a percent of total can be added to your Table1, so perhaps you don't even need Table2 as it's a bit redundant.
Update: You can use a CASE statement to handle a SUM() of 0.
