Running total with purchases and sales in query column - sql-server

I have a table like below:
How can I have a column like below using Transact-SQL (Order By Date)?
I'm using SQL Server 2016.

The thing you need is called an aggregate windowing function, specifically SUM ... OVER.
The problem is that a 'running total' like this only makes sense if you can specify the order of the rows deterministically. The sample data does not include an attribute that could be used to provide this required ordering. Tables, by themselves, do not have an explicit order.
If you have something like an entry date column, a solution like the following would work:
DECLARE @T table
(
    EntryDate datetime2(0) NOT NULL,
    Purchase money NULL,
    Sale money NULL
);
INSERT @T
    (EntryDate, Purchase, Sale)
VALUES
    ('20180801 13:00:00', $1000, NULL),
    ('20180801 14:00:00', NULL, $400),
    ('20180801 15:00:00', NULL, $400),
    ('20180801 16:00:00', $5000, NULL);
SELECT
    T.Purchase,
    T.Sale,
    Remaining =
        SUM(ISNULL(T.Purchase, $0) - ISNULL(T.Sale, $0)) OVER (
            ORDER BY T.EntryDate
            ROWS UNBOUNDED PRECEDING)
FROM @T AS T;
Demo: db<>fiddle
Using ROWS UNBOUNDED PRECEDING in the window frame is shorthand for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The behaviour of ROWS is different w.r.t duplicates (and generally better-performing) than the default RANGE. There are strong arguments to say that ROWS ought to have been the default, but that is not what we were given 🙂.
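To make the duplicate-handling difference concrete, here is a minimal sketch. The table variable @Demo and its values are made up for illustration and are not from the question; it assumes SQL Server 2012 or later.

```sql
-- Hypothetical data: two rows share the same EntryDate (ties in the ordering).
DECLARE @Demo table
(
    EntryDate datetime2(0) NOT NULL,
    Amount money NOT NULL
);

INSERT @Demo (EntryDate, Amount)
VALUES
    ('20180801 13:00:00', $100),
    ('20180801 14:00:00', $10),
    ('20180801 14:00:00', $20),
    ('20180801 15:00:00', $5);

SELECT
    D.EntryDate,
    D.Amount,
    -- ROWS: the frame ends at the current row, so tied rows get different totals.
    RowsTotal = SUM(D.Amount) OVER (
        ORDER BY D.EntryDate
        ROWS UNBOUNDED PRECEDING),
    -- RANGE (what you get by default when only ORDER BY is specified):
    -- the frame includes every peer row with the same EntryDate.
    RangeTotal = SUM(D.Amount) OVER (
        ORDER BY D.EntryDate
        RANGE UNBOUNDED PRECEDING)
FROM @Demo AS D
ORDER BY D.EntryDate;
```

Both 14:00 rows should report the same RangeTotal (130), whereas RowsTotal differs between them, and which of the two tied rows is counted first is not guaranteed. That is another reason to order by something unique if you can.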
For more information see How to Use Microsoft SQL Server 2012's Window Functions by Itzik Ben-Gan, and his excellent book on the topic.

Related

How would one create a DATE column that is updated via a DATETIME column

Consider the following:
CREATE TABLE mytable
(
[id] INT NOT NULL,
[foobar] VARCHAR(25) NULL,
[created_on] DATETIME NOT NULL
);
SELECT *
FROM mytable
WHERE CAST(created_on AS DATE) = '2019-01-01';
I have a lot of queries like this, where I need to store the full date and time for audit (and sorting) purposes, but most queries only care about the date portion when it comes to searching.
In order to improve performance, I was considering adding a sister column that stores the value as a DATE, and then update it via triggers; but before I go down that rabbit hole, I wanted to know if there is a better way to solve this issue. Is there some mechanism in SQL Server that offers a better solution to this issue?
I am currently stuck on SQL Server 2008, but I am open to solutions that use newer versions
My preference would be to just write a sargable
WHERE created_on >= '2019-01-01' and created_on < '2019-01-02';
The
CAST(created_on AS DATE) = '2019-01-01';
is in fact mostly sargable but somewhat suboptimal ...
... and splitting it out into a separate indexed column can help other cases like GROUP BY date
If you decide you do need a separate column you can create a computed column and index that.
This is preferable to triggers: it has less performance overhead, and it allows SQL Server to match both the column name and the original expression. (An index on a column populated by a trigger won't be matched to a query containing CAST(created_on AS DATE).)
CREATE TABLE mytable
(
[id] INT NOT NULL,
[foobar] VARCHAR(25) NULL,
[created_on] DATETIME NOT NULL,
[created_on_date] AS CAST(created_on AS DATE)
);
CREATE INDEX ix_created_on_date
ON mytable(created_on_date)
INCLUDE (foobar, id, created_on);
SELECT foobar,
id,
created_on
FROM mytable
WHERE CAST(created_on AS DATE) = '2019-01-01';
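If the search date arrives as a parameter, the half-open range form mentioned earlier can be written roughly as follows. This is only a sketch: @SearchDate is a hypothetical variable name, and mytable is the table from the question.

```sql
-- @SearchDate is a hypothetical parameter standing in for the date being searched.
DECLARE @SearchDate date = '20190101';

SELECT foobar,
       id,
       created_on
FROM mytable
WHERE created_on >= @SearchDate                    -- from midnight at the start of the day
  AND created_on <  DATEADD(DAY, 1, @SearchDate);  -- up to, but excluding, the next day
```

Because created_on is compared directly, without being wrapped in a function, an ordinary index on created_on can be used for a range seek.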

T-SQL Select where Subselect or Default

I have a SELECT that retrieves ROWS comparing a DATETIME field to the highest available value of another TABLE.
The Two Tables have the following structure
DeletedRecords
- Id (Guid)
- RecordId (Guid)
- TableName (varchar)
- DeletionDate (datetime)
And Another table which keep track of synchronizations using the following structure
SynchronizationLog
- Id (Guid)
- SynchronizationDate (datetime)
In order to get all the RECORDS that have been deleted since the last synchronization, I run the following SELECT:
SELECT
[Id],[RecordId],[TableName],[DeletionDate]
FROM
[DeletedRecords]
WHERE
[TableName] = '[dbo].[Person]'
AND [DeletionDate] >
(SELECT TOP 1 [SynchronizationDate]
FROM [dbo].[SynchronizationLog]
ORDER BY [SynchronizationDate] DESC)
The problem occurs when I do not have any synchronizations yet: the T-SQL SELECT does not return any rows, while it should return all the rows because there are no synchronization records available.
Is there a T-SQL function like COALESCE that I can use with DateTime?
Your subquery should look something like this:
SELECT COALESCE(MAX([SynchronizationDate]), '0001-01-01')
FROM [dbo].[SynchronizationLog]
It says: Get the last date, but if there is no record (or all values are NULL), then use the '0001-01-01' date as start date.
NOTE: '0001-01-01' is for DATETIME2. If you are using the old DATETIME data type, it should be '1753-01-01'.
Also please note (from https://msdn.microsoft.com/en-us/library/ms187819(v=sql.100).aspx)
Use the time, date, datetime2 and datetimeoffset data types for new work. These types align with the SQL Standard. They are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications.
EDIT
An alternative solution is to use NOT EXISTS (you will have to test whether it performs better or not):
SELECT
[Id],[RecordId],[TableName],[DeletionDate]
FROM
[DeletedRecords] DR
WHERE
[TableName] = '[dbo].[Person]'
AND NOT EXISTS (
SELECT 1
FROM [dbo].[SynchronizationLog] SL
WHERE DR.[DeletionDate] <= SL.[SynchronizationDate]
)
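For completeness, the COALESCE subquery dropped into the original statement might look like the sketch below. It assumes the columns are DATETIME as described in the question, hence the '17530101' fallback; the aliases DR and SL are just for readability.

```sql
SELECT
    DR.[Id], DR.[RecordId], DR.[TableName], DR.[DeletionDate]
FROM
    [DeletedRecords] AS DR
WHERE
    DR.[TableName] = '[dbo].[Person]'
    AND DR.[DeletionDate] >
        -- MAX without GROUP BY always returns one row (NULL when the log is empty),
        -- so COALESCE can supply the fallback date.
        (SELECT COALESCE(MAX(SL.[SynchronizationDate]), '17530101')
         FROM [dbo].[SynchronizationLog] AS SL);
```

The TOP 1 ... ORDER BY version returns no row at all when the log is empty, so the comparison evaluates to unknown and every record is filtered out. The aggregate avoids that.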

GROUP BY multiple derived Max() and Min() columns

I am trying to create a script that will group a table's contents by a series of columns that I have taken MAX() of.
It is difficult to describe without the scenario:
I have a table of bookings from which I have to create a table of single customers. The customers are split by res system, and each res system requires a different grouping using different columns.
I.E:
277000 bookings for Ipcos
289300 bookings for Daph
300000 bookings for Tard
They are all stored in the same table. I want to take the MAX() of all columns except a couple that I have to cast to integers, and SUM() some other columns.
My problem comes when I have to group by the cast value I have created and the MIN() value I have created.
I tried joining the table onto itself but that didn't work. Can someone point me in the right direction please, as I am getting very frustrated?
Code for selecting Ipcos
SELECT
SUM(CAST(TotalCost AS MONEY)) AS TotalCost,
NettCost = Null,
Paid = Null,
Balance = Null,
Discount = Null,
Commission = Null,
Max(Adults) AS Adults,
Max(Children)AS Children,
Max(Infants) AS Infants,
Max(PAX) AS PAX,
Sum(CAST(Duration AS int)) AS Duration,
MAX([IdentityValue]) AS IdentityValue,
MAX([FileName]) AS FileName,
MAX([SheetName]) AS SheetName,
MAX([LineNum]) AS LineNum,
MAX([BookingDate]) AS BookingDate,
CAST([DepartureDate] AS Int) AS IntDepartureDate,
CAST([BookingDate] AS int) AS IntBookingDate,
MIN(LineNum) AS MinLineNum
FROM
Booking
WHERE ResID = 3
GROUP BY
MinLineNum, IntBookingDate, IntDepartureDate
The other systems are very similar to this.
Any help would be fantastic, cheers in advance.
You can use the OVER (PARTITION BY ...) clause in order to group by different columns. See: OVER clause
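As a rough sketch of what that could look like for the Ipcos case: the table variable below is a cut-down, hypothetical stand-in for the Booking table, and the column types are assumptions (the question does not show them).

```sql
-- Hypothetical, cut-down version of the Booking table, for illustration only.
DECLARE @Booking table
(
    ResID int NOT NULL,
    LineNum int NOT NULL,
    BookingDate datetime NOT NULL,
    DepartureDate datetime NOT NULL,
    TotalCost varchar(20) NOT NULL
);

INSERT @Booking (ResID, LineNum, BookingDate, DepartureDate, TotalCost)
VALUES
    (3, 1, '20120101', '20120301', '100.00'),
    (3, 2, '20120101', '20120301', '50.00'),
    (3, 3, '20120102', '20120301', '75.00');

SELECT
    B.LineNum,
    IntBookingDate = CAST(B.BookingDate AS int),
    IntDepartureDate = CAST(B.DepartureDate AS int),
    -- Window aggregates are computed per partition without collapsing the rows,
    -- so the partitioning expressions are repeated instead of using their aliases.
    MinLineNum = MIN(B.LineNum) OVER (
        PARTITION BY CAST(B.BookingDate AS int), CAST(B.DepartureDate AS int)),
    GroupTotalCost = SUM(CAST(B.TotalCost AS money)) OVER (
        PARTITION BY CAST(B.BookingDate AS int), CAST(B.DepartureDate AS int))
FROM @Booking AS B
WHERE B.ResID = 3;
```

Note that column aliases such as IntBookingDate cannot be referenced in GROUP BY or PARTITION BY in the same SELECT, which is why the original query fails. If you want one row per group rather than one row per booking, repeating the CAST expressions in a plain GROUP BY is another option.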

SQL Server strange distinct query

I use SQL Server 2008 and C#. I have a table which contains about 20000 rows; many of them are similar, and there are about 900 distinct rows. This is my table structure:
tblCourse
courselevel, coursecode, coursename, branchcode...
For example I have 20 rows with the same coursecode/coursename but with different branchcode or courselevel. I want a result that contains only one item per unique coursecode.
here is a little sample of my table:
... courselevel=1,coursecode=1200,coursename=A,branchcode=200...
... courselevel=2,coursecode=1200,coursename=A,branchcode=200...
... courselevel=1,coursecode=1200,coursename=A,branchcode=220...
... courselevel=1,coursecode=1200,coursename=A,branchcode=230...
... courselevel=1,coursecode=1200,coursename=A,branchcode=240...
... courselevel=1,coursecode=1200,coursename=A,branchcode=250...
... courselevel=2,coursecode=1200,coursename=A,branchcode=251...
... courselevel=1,coursecode=1200,coursename=A,branchcode=225...
I want to have only the first row:
... courselevel=1,coursecode=1200,coursename=A,branchcode=200...
because all rows have similar coursecode,
What should I do?
How should I write my select query string?
I have tested different methods (group by, distinct, max(ID)...) with no luck, please help me!
thanks
You can GROUP BY the similar columns and use any Aggregate Function on the other columns to have them just return one record. What that one value would be entirely depends on the aggregate function you use.
Aggregate Functions
Aggregate functions perform a calculation on a set of values and
return a single value. Except for COUNT, aggregate functions ignore
null values. Aggregate functions are frequently used with the GROUP BY
clause of the SELECT statement
In this example, I have used the min/max and avg aggregate functions.
SELECT courselevel
, coursecode
, coursename
, MIN(branchcode)
, MAX(othercolumn)
, AVG(numberColumn)
, ...
FROM yourTable
GROUP BY
courselevel
, coursecode
, coursename
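If the goal is exactly one whole row per coursecode (rather than a mix of aggregated values), a different technique, ROW_NUMBER(), may fit better. This is a sketch only: the ORDER BY that decides which row counts as "first" is my assumption, since the question does not say how the rows are ordered.

```sql
WITH Ranked AS
(
    SELECT
        courselevel,
        coursecode,
        coursename,
        branchcode,
        -- Number the rows within each coursecode; the ordering here is an assumption.
        rn = ROW_NUMBER() OVER (
            PARTITION BY coursecode
            ORDER BY courselevel, branchcode)
    FROM tblCourse
)
SELECT courselevel, coursecode, coursename, branchcode
FROM Ranked
WHERE rn = 1;
```

With the sample rows from the question, this ordering picks courselevel=1, branchcode=200 for coursecode 1200. ROW_NUMBER() is available in SQL Server 2008.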

SQL Server ORDER BY date and nulls last

I am trying to order by date. I want the most recent dates coming in first. That's easy enough, but there are many records that are null and those come before any records that have a date.
I have tried a few things with no success:
ORDER BY ISNULL(Next_Contact_Date, 0)
ORDER BY ISNULL(Next_Contact_Date, 999999999)
ORDER BY coalesce(Next_Contact_Date, 99/99/9999)
How can I order by date and have the nulls come in last? The data type is smalldatetime.
smalldatetime has range up to June 6, 2079 so you can use
ORDER BY ISNULL(Next_Contact_Date, '2079-06-05T23:59:00')
If no legitimate records will have that date.
If this is not an assumption you fancy relying on, a more robust option is sorting on two columns.
ORDER BY CASE WHEN Next_Contact_Date IS NULL THEN 1 ELSE 0 END, Next_Contact_Date
Neither of the above suggestions is able to use an index to avoid a sort, however, and they give similar-looking plans.
One other possibility if such an index exists is
SELECT 1 AS Grp, Next_Contact_Date
FROM T
WHERE Next_Contact_Date IS NOT NULL
UNION ALL
SELECT 2 AS Grp, Next_Contact_Date
FROM T
WHERE Next_Contact_Date IS NULL
ORDER BY Grp, Next_Contact_Date
According to Itzik Ben-Gan, author of T-SQL Fundamentals for MS SQL Server 2012, "By default, SQL Server sorts NULL marks before non-NULL values. To get NULL marks to sort last, you can use a CASE expression that returns 1 when the" Next_Contact_Date column is NULL, "and 0 when it is not NULL. Non-NULL marks get 0 back from the expression; therefore, they sort before NULL marks (which get 1). This CASE expression is used as the first sort column." The Next_Contact_Date column "should be specified as the second sort column. This way, non-NULL marks sort correctly among themselves." Here is the solution query for your example for MS SQL Server 2012 (and SQL Server 2014):
ORDER BY
CASE
WHEN Next_Contact_Date IS NULL THEN 1
ELSE 0
END, Next_Contact_Date;
Equivalent code using IIF syntax:
ORDER BY
IIF(Next_Contact_Date IS NULL, 1, 0),
Next_Contact_Date;
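Since the question asks for the most recent dates first, the second sort column would need DESC. A small self-contained sketch, where the @Contacts table variable and its values are invented for the demo:

```sql
-- Hypothetical sample data, including NULLs.
DECLARE @Contacts table (Next_Contact_Date smalldatetime NULL);

INSERT @Contacts (Next_Contact_Date)
VALUES ('20190105'), (NULL), ('20190301'), (NULL), ('20190210');

SELECT C.Next_Contact_Date
FROM @Contacts AS C
ORDER BY
    -- 0 for real dates, 1 for NULLs, so NULLs sort last.
    CASE WHEN C.Next_Contact_Date IS NULL THEN 1 ELSE 0 END,
    -- Most recent date first among the non-NULL rows.
    C.Next_Contact_Date DESC;
```

This should return 2019-03-01, 2019-02-10, 2019-01-05 and then the two NULL rows.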
order by -cast([Next_Contact_Date] as bigint) desc
If your SQL doesn't support NULLS FIRST or NULLS LAST, the simplest way to do this is to use the value IS NULL expression:
ORDER BY Next_Contact_Date IS NULL, Next_Contact_Date
to put the nulls at the end (NULLS LAST) or
ORDER BY Next_Contact_Date IS NOT NULL, Next_Contact_Date
to put the nulls at the front. This doesn't require knowing the type of the column and is easier to read than the CASE expression.
EDIT: Alas, while this works in other SQL implementations like PostgreSQL and MySQL, it doesn't work in MS SQL Server. I didn't have a SQL Server to test against and relied on Microsoft's documentation and testing with other SQL implementations. According to Microsoft, value IS NULL is an expression that should be usable just like any other expression. And ORDER BY is supposed to take expressions just like any other statement that takes an expression. But it doesn't actually work.
The best solution for SQL Server therefore appears to be the CASE expression.
A bit late, but maybe someone finds it useful.
For me, ISNULL was out of the question due to the table scan. UNION ALL would have needed me to repeat a complex query, and because I was selecting only the TOP X it would not have been very efficient.
If you are able to change the table design, you can:
Add another field, just for sorting, such as Next_Contact_Date_Sort.
Create a trigger that fills that field with a large (or small) value, depending on what you need:
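CREATE TRIGGER FILL_SORTABLE_DATE ON YOUR_TABLE AFTER INSERT, UPDATE AS
BEGIN
    SET NOCOUNT ON;
    IF (UPDATE(Next_Contact_Date))
    BEGIN
        -- Assumes Next_Contact_Date_Sort is a datetime column and that key1/key2 identify a row.
        -- '99991231' is a sentinel that sorts after every real date (IIF needs SQL Server 2012+).
        UPDATE YOUR_TABLE
        SET Next_Contact_Date_Sort = IIF(i.Next_Contact_Date IS NULL, CAST('99991231' AS datetime), CAST(i.Next_Contact_Date AS datetime))
        FROM inserted i
        WHERE YOUR_TABLE.key1 = i.key1 AND YOUR_TABLE.key2 = i.key2;
    END
END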
Use desc and multiply by -1 if necessary. Example for ascending int ordering with nulls last:
select *
from
(select null v union all select 1 v union all select 2 v) t
order by -t.v desc
I know this is old but this is what worked for me
Order by Isnull(Date,'12/31/9999')
I think I found a way to show nulls in the end and still be able to use indexes for sorting.
The idea is super simple: create a computed column based on the existing column, and put an index on it.
ALTER TABLE dbo.Users
ADD [FirstNameNullLast]
AS (case when [FirstName] IS NOT NULL AND (ltrim(rtrim([FirstName]))<>N'' OR [FirstName] IS NULL) then [FirstName] else N'ZZZZZZZZZZ' end) PERSISTED
So, we are creating a persisted computed column in which all blank and NULL values are replaced by 'ZZZZZZZZZZ'. This means that if we sort on that column, all the NULL or blank values will appear at the end.
Now we can use it in our new index.
Like this:
CREATE NONCLUSTERED INDEX [IX_Users_FirstNameNullLast] ON [dbo].[Users]
(
[FirstNameNullLast] ASC
)
So, this is an ordinary nonclustered index. We can change it however we want, e.g. include extra columns, increase the number of indexed columns, change the sort order, etc.
I know this is an old thread, but in SQL Server NULLs always sort lower than non-NULL values. So it's only necessary to order by DESC.
In your case ORDER BY Next_Contact_Date DESC should be enough.
Source: order by with nulls- LearnSql

Resources