GROUP BY on a column to remove nulls - sql-server

I have the following table
Order_ID Loc_ID OrderDate ShippingDate DeliveryDate
10 2 10/12/2018 null null
10 2 null null 18/12/2018
10 2 null 13/12/2018 null
Basically, every time a date is recorded, it is added as a row. I want the table to look like this:
Order_ID Loc_ID OrderDate ShippingDate DeliveryDate
10 2 10/12/2018 13/12/2018 18/12/2018
Can someone tell me how I should do this?

Use MAX:
SELECT Order_ID,
Loc_ID,
MAX(OrderDate) AS OrderDate,
MAX(ShippingDate) AS ShippingDate,
MAX(DeliveryDate) AS DeliveryDate
FROM dbo.YourTable
GROUP BY Order_ID,
Loc_ID;
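Aggregate functions such as MAX ignore NULL values, so within each group MAX returns the single non-NULL value (it returns NULL only when every value in the group is NULL). This is also consistent with ordering in SQL Server, where NULL sorts lowest, so any non-NULL value counts as "greater".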
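To try this without touching a real table, here is a minimal sketch that loads the question's three rows into a table variable. The name @Orders and the dd/mm/yyyy reading of the sample dates are assumptions made for illustration. MIN is included next to MAX to show that the aggregate skips NULLs rather than out-ranking them:

```sql
-- Hypothetical table variable holding the three rows from the question.
DECLARE @Orders TABLE (
    Order_ID     int,
    Loc_ID       int,
    OrderDate    date,
    ShippingDate date,
    DeliveryDate date
);

INSERT INTO @Orders (Order_ID, Loc_ID, OrderDate, ShippingDate, DeliveryDate)
VALUES (10, 2, '20181210', NULL,       NULL),
       (10, 2, NULL,       NULL,       '20181218'),
       (10, 2, NULL,       '20181213', NULL);

-- One row per (Order_ID, Loc_ID); each aggregate skips the NULLs.
SELECT Order_ID,
       Loc_ID,
       MAX(OrderDate)    AS OrderDate,
       MAX(ShippingDate) AS ShippingDate,
       MIN(ShippingDate) AS ShippingDateViaMin,  -- same value as the MAX column
       MAX(DeliveryDate) AS DeliveryDate
FROM @Orders
GROUP BY Order_ID,
         Loc_ID;
```

SQL Server will also print the warning "Null value is eliminated by an aggregate or other SET operation", which is expected here. If a column could hold more than one non-NULL date per order, MAX and MIN would differ and you would have to decide which one you want.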

A simple aggregation should do the trick
Example
Select Order_ID
,Loc_ID
,OrderDate = max(OrderDate)
,ShippingDate = max(ShippingDate)
,DeliveryDate = max(DeliveryDate)
From YourTable
Group By Order_ID,Loc_ID

Related

Select a grouped table (by Id) filtered by a datetime column according to NULL and MAX(date) values

Imagine I have a table with quite a few columns, but the result has to be filtered just by Id and EndDate.
Id EndDate ...
1 NULL
1 01.01.2022 15:25
1 01.01.2022 15:24
2 15.01.2022 10:00
2 15.01.2022 11:00
2 17.01.2022 00:00
3 NULL
3 10.10.2022 22:12
4 18.05.2022 17:15
4 18.05.2022 17:17
4 19.05.2022 00:00
The resulting table must be the following:
Id EndDate ...
1 NULL
2 17.01.2022 00:00
3 NULL
4 19.05.2022 00:00
For each Id, the record with a NULL EndDate must be picked if there is one; otherwise the record with the MAX(EndDate). As the resulting table shows, Id = 1 has a record with a NULL EndDate, so that record is picked; Id = 4 has no NULL EndDate, so the record with MAX(EndDate) is returned.
I tried different approaches with joins and UNIONs, but none of them worked. I also considered CTEs, but could not see how they would help. The solution should also perform well, because the resulting table will be joined to another table.
Even just an idea of how to get the desired result would be appreciated.
You can use ROW_NUMBER in a common table expression to define the priority. Replace NULL with a date far in the future, such as 9999-12-31, so that ordering by the date in descending order puts the NULL row first.
WITH cte
AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY ISNULL(EndDate,'99991231') DESC) AS RN
FROM dbo.myTable
)
SELECT *
FROM cte
WHERE cte.RN = 1;
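Here is a self-contained sketch of the ROW_NUMBER approach using the question's sample data. The table variable @myTable and the extra Note column are hypothetical; Note only stands in for the "..." columns to show that the whole row comes back:

```sql
-- Hypothetical stand-in for dbo.myTable; Note represents the other columns.
DECLARE @myTable TABLE (Id int, EndDate datetime, Note varchar(20));

INSERT INTO @myTable (Id, EndDate, Note)
VALUES (1, NULL,                  'row a'),
       (1, '2022-01-01T15:25:00', 'row b'),
       (1, '2022-01-01T15:24:00', 'row c'),
       (2, '2022-01-15T10:00:00', 'row d'),
       (2, '2022-01-15T11:00:00', 'row e'),
       (2, '2022-01-17T00:00:00', 'row f'),
       (3, NULL,                  'row g'),
       (3, '2022-10-10T22:12:00', 'row h'),
       (4, '2022-05-18T17:15:00', 'row i'),
       (4, '2022-05-18T17:17:00', 'row j'),
       (4, '2022-05-19T00:00:00', 'row k');

WITH cte AS
(
    SELECT Id, EndDate, Note,
           -- NULL is treated as 9999-12-31, so it ranks first when sorted descending.
           ROW_NUMBER() OVER (PARTITION BY Id
                              ORDER BY ISNULL(EndDate, '99991231') DESC) AS RN
    FROM @myTable
)
SELECT Id, EndDate, Note
FROM cte
WHERE RN = 1
ORDER BY Id;
-- Returns rows a, f, g and k: NULL for Id 1 and 3, the latest date for Id 2 and 4.
```

If an Id can have several NULL rows (or several rows sharing the latest date), ROW_NUMBER picks one of them arbitrarily unless you add a tiebreaker column to the ORDER BY.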
With simple aggregation and a CASE expression where you check if there are any null dates for each Id:
SELECT Id,
CASE WHEN COUNT(*) = COUNT(EndDate) THEN MAX(EndDate) END AS EndDate
FROM tablename
GROUP BY Id;
The condition COUNT(*) = COUNT(EndDate) is satisfied only if all dates are not null.
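To see why the COUNT comparison works, it helps to display both counts next to the result. A minimal sketch with the question's data; the table variable @t is a hypothetical stand-in for tablename:

```sql
-- Hypothetical stand-in for the real table; only the two relevant columns.
DECLARE @t TABLE (Id int, EndDate datetime);

INSERT INTO @t (Id, EndDate)
VALUES (1, NULL),
       (1, '2022-01-01T15:25:00'),
       (1, '2022-01-01T15:24:00'),
       (2, '2022-01-15T10:00:00'),
       (2, '2022-01-15T11:00:00'),
       (2, '2022-01-17T00:00:00'),
       (3, NULL),
       (3, '2022-10-10T22:12:00'),
       (4, '2022-05-18T17:15:00'),
       (4, '2022-05-18T17:17:00'),
       (4, '2022-05-19T00:00:00');

SELECT Id,
       COUNT(*)       AS AllRows,      -- counts every row in the group
       COUNT(EndDate) AS NonNullRows,  -- counts only rows where EndDate is not NULL
       CASE WHEN COUNT(*) = COUNT(EndDate) THEN MAX(EndDate) END AS EndDate
FROM @t
GROUP BY Id
ORDER BY Id;
-- Id 1: 3 vs 2 -> NULL          Id 2: 3 vs 3 -> 2022-01-17 00:00
-- Id 3: 2 vs 1 -> NULL          Id 4: 3 vs 3 -> 2022-05-19 00:00
```

Unlike the ROW_NUMBER version, this returns only Id and EndDate, so it suits the case where the other columns are not needed or can be joined back afterwards.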

How to return results from 2 SQL Server tables where one column in common

I've been reading for about 2 hours this afternoon and trying different things to get the results that I need but so far have failed.
Table: Schedule
ScheduleID NOT NULL
EmployeeID NOT NULL
ItemDate NOT NULL
Table: Holiday
HolidayID NOT NULL
EmployeeID NOT NULL
ItemDate NOT NULL
I want to return a result set that has all of the Schedule dates and all of the Holiday dates for a given EmployeeID
Sample data:
Schedule:
ScheduleID EmployeeID ItemDate
------------------------------------
1 1 1/1/2021
2 1 3/1/2021
Holiday:
HolidayID EmployeeID ItemDate
-----------------------------------
1 1 2/1/2021
Should return the following result set
ScheduleID 1 EmployeeID 1 ItemDate 1/1/2021
HolidayID 1 EmployeeID 1 ItemDate 2/1/2021
ScheduleID 2 EmployeeID 1 ItemDate 3/1/2021
I have tried all sorts of joins, inner, outer, right, left but I can't seem to find any scenario that works for what I want.
I'm happy to have NULL values for any of the columns in the returned result set as I can handle this in the code.
The closest I've got is this but I need to have the HolidayID (even if NULL) and/or the ScheduleID (even if NULL) in the results.
SELECT ScheduleID, HolidayID, Schedule.EmployeeID, Schedule.ItemDate
FROM Schedule
FULL OUTER JOIN Holiday ON Holiday.EmployeeID = Schedule.EmployeeID
WHERE Schedule.EmployeeID = 1
ORDER BY Schedule.ItemDate
Thanks
A simple way to do this is with a UNION operator: https://www.w3schools.com/sql/sql_union.asp
A UNION appends the results of multiple SELECT statements into one result set. It requires every SELECT to return the same number of columns, in the same order, with compatible data types. Note that UNION also removes duplicate rows; use UNION ALL if you want to keep them. I am putting the union in a WITH clause (a CTE), which lets you filter on a specific employee ID once. Without it you would need a WHERE clause in each SELECT of the union.
WITH Dates AS (
SELECT ScheduleID, EmployeeID, ItemDate
FROM Schedule
UNION
SELECT HolidayID, EmployeeID, ItemDate
FROM Holiday
)
SELECT *
FROM Dates
WHERE EmployeeID = 1
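In the query above the first column holds either a ScheduleID or a HolidayID with nothing to tell them apart, while the question asks for both ID columns (NULL where not applicable). One possible variant pads each SELECT with a NULL placeholder. This is a sketch with table variables standing in for the real tables; the m/d/yyyy reading of the sample dates is an assumption:

```sql
-- Hypothetical stand-ins for the Schedule and Holiday tables.
DECLARE @Schedule TABLE (ScheduleID int NOT NULL, EmployeeID int NOT NULL, ItemDate date NOT NULL);
DECLARE @Holiday  TABLE (HolidayID  int NOT NULL, EmployeeID int NOT NULL, ItemDate date NOT NULL);

INSERT INTO @Schedule (ScheduleID, EmployeeID, ItemDate)
VALUES (1, 1, '20210101'),
       (2, 1, '20210301');

INSERT INTO @Holiday (HolidayID, EmployeeID, ItemDate)
VALUES (1, 1, '20210201');

WITH Dates AS (
    -- Schedule rows: no holiday, so HolidayID is a typed NULL placeholder.
    SELECT ScheduleID, CAST(NULL AS int) AS HolidayID, EmployeeID, ItemDate
    FROM @Schedule
    UNION ALL
    -- Holiday rows: no schedule entry, so ScheduleID is NULL.
    SELECT NULL, HolidayID, EmployeeID, ItemDate
    FROM @Holiday
)
SELECT ScheduleID, HolidayID, EmployeeID, ItemDate
FROM Dates
WHERE EmployeeID = 1
ORDER BY ItemDate;
-- Three rows in date order: schedule 1, holiday 1, schedule 2.
```

UNION ALL is used because the two sets cannot overlap once the placeholder columns are added, so there is nothing to de-duplicate.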

Select rowset with null value in first row of group by result set

I am stuck with a problem.
I have some data like this:
Id Creation date Creation date hour range Id vehicule Id variable Value
1 2017-03-01 9:10 2017-03-01 9:00 1 6 0.18
2 2017-03-01 9:50 2017-03-01 9:00 1 3 0.50
3 2017-03-01 9:27 2017-03-01 9:00 1 3 null
4 2017-03-01 10:05 2017-03-01 10:00 1 3 0.35
5 2017-03-01 10:17 2017-03-01 10:00 1 3 0.12
6 2017-03-01 9:05 2017-03-01 9:00 1 5 0.04
7 2017-03-01 9:57 2017-03-01 9:00 1 5 null
I need to select the rows, grouped by Id vehicule, Id variable and Creation date hour range and ordered by Id vehicule, Id variable and Creation date, for the groups where the first Value is null but the second, third, ... values are not null. So, in the sample above, the following rowset:
Id Creation date Creation date hour range Id vehicule Id variable Value
3 2017-03-01 9:27 2017-03-01 9:00 1 3 null
2 2017-03-01 9:50 2017-03-01 9:00 1 3 0.50
Could you help me please ?
Thank you
You will have no luck with a GROUP BY in this case. I would put two EXISTS conditions into the WHERE clause to keep only the rows whose group fits your criteria:
(for example / not tested / may be slow on a large table)
select *
from yourTable y1
where
--the first row of the group (earliest CreationDate) must have a null Value
exists (select 1 from yourTable y2
where y2.IdVehicule = y1.IdVehicule and y2.IdVariable = y1.IdVariable
and y2.CreationDateHourRange = y1.CreationDateHourRange
and y2.Value is null
and y2.CreationDate = (select min(y3.CreationDate) from yourTable y3
where y3.IdVehicule = y1.IdVehicule and y3.IdVariable = y1.IdVariable
and y3.CreationDateHourRange = y1.CreationDateHourRange))
AND
--the group must also contain a row whose Value is not null
exists (select 1 from yourTable y4
where y4.IdVehicule = y1.IdVehicule and y4.IdVariable = y1.IdVariable
and y4.CreationDateHourRange = y1.CreationDateHourRange
and y4.Value is not null)
Hope that helps :)
Include the Value field in the ORDER BY clause and it will be sorted to the top because NULL has a lower practical value than a non-NULL value.
Assuming (because your middle paragraph is hard to understand) you want all the fields output but you want the 4th and 5th columns to produce some grouping of the output, with Value = NULL at the top of each group:
SELECT Id, CreationDate, CreationDateHourRange, IdVehicule, IdVariable, Value
FROM yourTable
ORDER BY IdVehicule, IdVariable, Value
I don't see any need for an actual GROUP BY clause.
I think it is unclear as to whether you want to limit the NULL Value rows in each block to just one row of NULL, but if you do you would need to state the order for which the datetime columns are sorted.
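For reference, a small sketch of how SQL Server orders NULL, using three of the question's values in a hypothetical table variable @v:

```sql
-- Hypothetical three-row sample taken from the question's Value column.
DECLARE @v TABLE (Id int, Value decimal(9,2));

INSERT INTO @v (Id, Value)
VALUES (2, 0.50),
       (3, NULL),
       (4, 0.35);

-- Ascending: NULL sorts first, so the order is Id 3, 4, 2.
SELECT Id, Value FROM @v ORDER BY Value;

-- Descending: NULL sorts last, so the order is Id 2, 4, 3.
SELECT Id, Value FROM @v ORDER BY Value DESC;
```

This is SQL Server behaviour; other database engines may place NULL differently, so it is worth checking before relying on it elsewhere.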
Indeed, GROUP BY was of no use here. Also, I wasn't sure where your 10:00 records were supposed to go. Does this help?
;WITH CTE_ADD_SOME_LOGIC
AS
(
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value
, CASE WHEN Value IS NULL THEN 1 ELSE 0 END AS VALUE_IS_NULL FROM tbl
),
CTE_MORE_LOGIC
AS
(
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value,VALUE_IS_NULL
, RANK() OVER (ORDER BY CreationDateHourRange,VALUE_IS_NULL) AS RN FROM CTE_ADD_SOME_LOGIC),
CTE_ORDER
AS
(
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value,VALUE_IS_NULL, RN
, ROW_NUMBER() OVER(PARTITION BY RN ORDER BY RN,IdVehicle,IdVariable,CreationDate, VALUE_IS_NULL DESC) AS HIERARCHY FROM CTE_MORE_LOGIC
)
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value FROM CTE_ORDER WHERE HIERARCHY = 1
ORDER BY Id
Try this Query
DECLARE @Nulloccurrence INT=1 -- Give like 1,2,3 value to get first null occurrence 2 for 2nd null occurrence
SELECT TOP 2 *
FROM cte
WHERE Id <= (
SELECT ID FROM
(
SELECT Id, ROW_NUMBER()OVER( Order by id) AS Seq
FROM cte
WHERE (
CASE
WHEN CAST(variableValue AS VARCHAR) IS NULL
THEN 'P'
ELSE CAST(variableValue AS VARCHAR)
END
) = 'P'
)Dt
WHERE Dt.Seq=@Nulloccurrence
)
ORDER BY 1 DESC
Expected Result
Id Creationdate Creationdatehourrange Ids vehicleId variableValue
------------------------------------------------------------------------
3 2017-03-01 9:27 2017-03-01 9:00 1 3 NULL
2 2017-03-01 9:50 2017-03-01 9:00 1 3 0.50
By 'where the first Value is null but second value, third value, ... is not null' I suppose you want to keep only the groups that contain both a null and a non-null [Value]. This cannot be done in a plain WHERE clause, because WHERE evaluates each row on its own: a row cannot 'see' other rows unless you use a subquery. You need a HAVING clause (the commented-out line is for groups with one or more null records).
This will work:
DECLARE @mytbl TABLE(Id INT, [Creation date] DATETIME, [Creation date hour range] DATETIME, [Id veh] INT, [Id var] INT, Value DECIMAL(9,2))

INSERT INTO @mytbl VALUES (1,'2017-03-01 9:10','2017-03-01 9:00',1, 6, 0.18)
INSERT INTO @mytbl VALUES (2,'2017-03-01 9:50','2017-03-01 9:00',1, 3, 0.50)
INSERT INTO @mytbl VALUES (3,'2017-03-01 9:27','2017-03-01 9:00',1, 3, NULL)
INSERT INTO @mytbl VALUES (4,'2017-03-01 10:05','2017-03-01 10:00',1, 3, 0.35)
INSERT INTO @mytbl VALUES (5,'2017-03-01 10:17','2017-03-01 10:00',1, 3, 0.12)
INSERT INTO @mytbl VALUES (6,'2017-03-01 9:05','2017-03-01 9:00',1, 5, 0.04)
INSERT INTO @mytbl VALUES (7,'2017-03-01 9:57','2017-03-01 9:00',1, 5, NULL)

SELECT [Id veh], [Id var],[Creation date hour range]
FROM @mytbl
GROUP BY [Id veh], [Id var],[Creation date hour range]
HAVING COUNT([Id veh]) - COUNT(Value) = 1
--HAVING COUNT([Id veh]) - COUNT(Value) >= 1
ORDER BY [Id veh], [Id var],[Creation date hour range]

Calculate minimum total value and delete all other rows from Table

Here's my table.
Table MyTable
-------------
ID Distance1 Cost1 Distance2 Cost2 Distance3 Cost3
1 711.9 6196.90432379846 NULL NULL NULL NULL
2 672.4 7316.33 NULL NULL 103.5 900.941
3 787.7 8570.9 252 2193.59 NULL NULL
What I want is to find the row with the minimum total (Cost1+Cost2+Cost3), keep that row, and delete everything else.
So far I have achieved this. This gives me row which has minimum total value.
select TOP 1 *, ISNULL(Cost1, 0 )+ISNULL(Cost2, 0 )+ISNULL(Cost3, 0 ) as TotalCost from MyTable order by TotalCost
I also want to delete the other rows. Is there any way I can do this in one statement?
Use CTE and Row_Number window function to delete
;with cte as
(
select Row_number()over(order by ISNULL(Cost1, 0)+ISNULL(Cost2, 0 )+ISNULL(Cost3, 0)) rn,*
from MyTable
)
Delete from cte where rn > 1
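A self-contained way to try the CTE delete before running it against real data is shown below. It is only a sketch: the table variable @MyTable is a hypothetical copy of the question's rows, and ID is added to the ORDER BY as a tiebreaker so the surviving row is deterministic when two totals are equal:

```sql
-- Hypothetical copy of the question's table.
DECLARE @MyTable TABLE (
    ID int,
    Distance1 float, Cost1 float,
    Distance2 float, Cost2 float,
    Distance3 float, Cost3 float
);

INSERT INTO @MyTable (ID, Distance1, Cost1, Distance2, Cost2, Distance3, Cost3)
VALUES (1, 711.9, 6196.90432379846, NULL, NULL,    NULL,  NULL),
       (2, 672.4, 7316.33,          NULL, NULL,    103.5, 900.941),
       (3, 787.7, 8570.9,           252,  2193.59, NULL,  NULL);

WITH cte AS
(
    -- rn = 1 marks the cheapest row; ID breaks ties between equal totals.
    SELECT ROW_NUMBER() OVER (ORDER BY ISNULL(Cost1, 0) + ISNULL(Cost2, 0) + ISNULL(Cost3, 0),
                                       ID) AS rn
    FROM @MyTable
)
DELETE FROM cte
WHERE rn > 1;

-- Only ID 1 (the lowest total) is left.
SELECT ID,
       ISNULL(Cost1, 0) + ISNULL(Cost2, 0) + ISNULL(Cost3, 0) AS TotalCost
FROM @MyTable;
```

Deleting through the CTE removes the rows from the underlying table, so on a real table it is safer to run this inside a transaction first and check the result before committing.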

Query trick - kind of unpivot

I have the following table
SnapShotDay OperationalUnitNumber IsOpen StatusDate
1-01-2014 001 1 1-01-2014
2-01-2014 NULL NULL NULL
3-01-2014 001 0 3-01-2014
4-01-2014 NULL NULL NULL
5-01-2014 001 1 5-01-2014
I obtain this with a SELECT construct, but what I need to do now is fill in the "NULL"ed rows by taking values from the first Non nulled row before. The latter would give:
SnapShotDay OperationalUnitNumber IsOpen StatusDate
1-01-2014 001 1 1-01-2014
2-01-2014 001 1 1-01-2014
3-01-2014 001 0 3-01-2014
4-01-2014 001 0 3-01-2014
5-01-2014 001 1 5-01-2014
In functional terms: I have event records that give me an event on a date for an operational unit; the event is either IsOpen or IsClosed. Chaining those events together by date gives a series of ranges. What I need is to generate daily records for those ranges (the target is a fact table).
I am trying to achieve this in plain SQL query (no stored procedure).
Can you think of a trick ?
Declare @t table(
SnapShotDay date,
OperationalUnitNumber int,
IsOpen bit,
StatusDate date
)
insert into @t
select '1-01-2014', 001 , 1 , '1-01-2014' union all
select '2-01-2014', NULL, NULL, NULL union all
select '3-01-2014', 001 , 0 ,'3-01-2014' union all
select '4-01-2014', NULL,NULL,NULL union all
select '5-01-2014', 001 ,1,'5-01-2014'
;
with CTE as
(
-- number the rows in date order so that rn-1 is the previous day
select *,row_number()over( order by SnapShotDay)rn from @t
)
select *,
case when a.isopen is null then (
select IsOpen from cte where rn=a.rn-1
) else a.isopen end as IsOpenFilled
from cte a
OK, I got it. Add one more CTE, cte1, after the first one:
,cte1 as
(
select top 1 rn ,IsOpen from cte where IsOpen is not null order by rn desc
)
--select * from Statuses
select *,
case
when a.rn<=(select b.rn from cte1 b) and a.IsOpen is null then
(
select
a1.IsOpen
from
cte a1
where
a1.rn=a.rn-1
)
when a.rn>=(select b.rn from cte1 b) and a.IsOpen is null then
(select IsOpen from cte1)
else
a.isopen
end
from
cte a
Try this. In the main query we're looking for the previous date with not null values. Then just JOIN this table with this LastDate.
WITH T1 AS
(
SELECT *, (SELECT MAX(SnapShotDay)
FROM T
WHERE SnapShotDay<=TMain.SnapShotDay
AND OPERATIONALUNITNUMBER IS NOT NULL)
as LastDate
FROM T as TMain
)
SELECT T1.SnapShotDay,
T.OperationalUnitNumber,
T.IsOpen,
T.StatusDate
FROM T1
JOIN T ON T1.LastDate=T.SnapShotDay
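The "latest earlier day that has values" lookup can be tried on the question's data as follows. This is a sketch: the table variable @T is a hypothetical stand-in for T, and OperationalUnitNumber is stored as char(3) only to keep the leading zeros:

```sql
-- Hypothetical stand-in for table T with the question's five days.
DECLARE @T TABLE (
    SnapShotDay           date,
    OperationalUnitNumber char(3),
    IsOpen                bit,
    StatusDate            date
);

INSERT INTO @T (SnapShotDay, OperationalUnitNumber, IsOpen, StatusDate)
VALUES ('20140101', '001', 1,    '20140101'),
       ('20140102', NULL,  NULL, NULL),
       ('20140103', '001', 0,    '20140103'),
       ('20140104', NULL,  NULL, NULL),
       ('20140105', '001', 1,    '20140105');

-- For every day, join to the most recent day (including itself) that has values.
SELECT d.SnapShotDay,
       src.OperationalUnitNumber,
       src.IsOpen,
       src.StatusDate
FROM @T AS d
JOIN @T AS src
  ON src.SnapShotDay = (SELECT MAX(p.SnapShotDay)
                        FROM @T AS p
                        WHERE p.SnapShotDay <= d.SnapShotDay
                          AND p.OperationalUnitNumber IS NOT NULL)
ORDER BY d.SnapShotDay;
-- 2014-01-02 takes the values of 2014-01-01; 2014-01-04 takes those of 2014-01-03.
```

Because this is an inner join, any leading days that have no earlier non-null row would drop out of the result; switch to a LEFT JOIN if those days must be kept. With more than one operational unit, the lookup would also need to match on the unit.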
SELECT
t1.SnapShotDay,
CASE WHEN t1.OperationalUnitNumber IS NOT NUll
THEN t1.OperationalUnitNumber
ELSE (SELECT TOP 1 t2.OperationalUnitNumber FROM YourTable t2 WHERE t2.SnapShotDay < t1.SnapShotDay AND t2.OperationalUnitNumber IS NOT NULL ORDER BY SnapShotDay DESC)
END AS OperationalUnitNumber,
CASE WHEN t1.IsOpen IS NOT NUll
THEN t1.IsOpen
ELSE (SELECT TOP 1 t2.IsOpen FROM YourTable t2 WHERE t2.SnapShotDay < t1.SnapShotDay AND t2.IsOpen IS NOT NULL ORDER BY SnapShotDay DESC)
END AS IsOpen,
CASE WHEN t1.StatusDate IS NOT NUll
THEN t1.StatusDate
ELSE (SELECT TOP 1 t2.StatusDate FROM YourTable t2 WHERE t2.SnapShotDay < t1.SnapShotDay AND t2.StatusDate IS NOT NULL ORDER BY SnapShotDay DESC)
END AS StatusDate
FROM YourTable t1
You asked for 'plain sql', here is a tested attempt using SQL, with comments, that gives the required answer.
I have tested the code using 'sqlite' and 'mysql' on windows xp. It is pure SQL and should work everywhere.
SQL is about 'sets' and combining them and ordering the results.
This problem seems to be about two separate sets:
1) The 'snap shot day' that have readings.
2) the 'snap shot day' that don't have readings.
I have added extra columns so that we can easily see where values came from.
Let us deal with the easy set first.
This is the set of 'supplied' readings:
SELECT dss.SnapShotDay theDay,
'supplied' readingExists,
dss.OperationalUnitNumber,
dss.IsOpen,
dss.StatusDate
FROM dailysnapshot dss
WHERE dss.OperationalUnitNumber IS NOT NULL
results:
theDay readingExists OperationalUnitNumber IsOpen StatusDate
2014-01-01 supplied 001 1 2014-01-01
2014-01-03 supplied 001 0 2014-01-03
2014-01-05 supplied 001 1 2014-01-05
Now let us deal with the set of 'days that have missing readings'. For each such day we need the most recent earlier day that has readings, and we assume the same values as on that day.
It sounds complex but it isn't. It asks:
for each day without a reading, get me the closest earlier date that has readings, and I will use those readings.
Here is the query:
SELECT emptyDSS.SnapShotDay,
'missing' readingExists,
maxPrevDSS.OperationalUnitNumber,
maxPrevDSS.IsOpen,
maxPrevDSS.StatusDate
FROM dailysnapshot emptyDSS
INNER JOIN dailysnapshot maxPrevDSS ON maxPrevDSS.SnapShotDay =
(SELECT MAX(dss.SnapShotDay)
FROM dailysnapshot dss
WHERE dss.SnapShotDay < emptyDSS.SnapShotDay
AND dss.OperationalUnitNumber IS NOT NULL)
WHERE emptyDSS.OperationalUnitNumber IS NULL
results:
SnapShotDay readingExists OperationalUnitNumber IsOpen StatusDate
2014-01-02 missing 001 1 2014-01-01
2014-01-04 missing 001 0 2014-01-03
This is not about efficiency! It is about getting the correct 'result set' with the easiest to understand SQL code. I assume the database engine will optimize the query. The query can be 'tweaked' later if required.
We now need to combine the two queries and order the results in the manner we require.
The standard way of combining results from SQL queries is with set operators (union, intersection, minus).
We use 'union' and an 'order by' on the result set.
This gives the final query:
SELECT dss.SnapShotDay theDay,
'supplied' readingExists,
dss.OperationalUnitNumber,
dss.IsOpen,
dss.StatusDate
FROM dailysnapshot dss
WHERE dss.OperationalUnitNumber IS NOT NULL
UNION
SELECT emptyDSS.SnapShotDay theDay,
'missing' readingExists,
maxPrevDSS.OperationalUnitNumber,
maxPrevDSS.IsOpen,
maxPrevDSS.StatusDate
FROM dailysnapshot emptyDSS
INNER JOIN dailysnapshot maxPrevDSS ON maxPrevDSS.SnapShotDay =
(SELECT MAX(dss.SnapShotDay)
FROM dailysnapshot dss
WHERE dss.SnapShotDay < emptyDSS.SnapShotDay
AND dss.OperationalUnitNumber IS NOT NULL)
WHERE emptyDSS.OperationalUnitNumber IS NULL
ORDER BY theDay ASC
result:
theDay readingExists dss.OperationalUnitNumber dss.IsOpen dss.StatusDate
2014-01-01 supplied 001 1 2014-01-01
2014-01-02 missing 001 1 2014-01-01
2014-01-03 supplied 001 0 2014-01-03
2014-01-04 missing 001 0 2014-01-03
2014-01-05 supplied 001 1 2014-01-05
I enjoyed doing this.
It should work with most SQL engines.

Resources