Can someone show me how to write a T-SQL query that lets me view the distribution of data in columns?
For example, the sample table has a column called model, and 50% of its values are Fiestas. I would like a query that helps determine the distribution of the data in each column.
I have included some sample code to help:
CREATE TABLE #tmpTable
(
registration varchar(50),
make varchar(50),
model varchar(50),
engine_size float
)
INSERT INTO #tmpTable VALUES
('JjFw5a0','SKODA','OCTAVIA',1.8),
('VkfCDpZ','FORD','FIESTA',1.7),
('5E93ZEq','SKODA','OCTAVIA',1.3),
('L2PPN0m','FORD','FIESTA',1.1),
('9xKghxp','FORD','FIESTA',1.5),
('WHShdBm','FORD','FIESTA',1.4),
('TNRHyy7','NISSAN','QASHQAI',1.2),
('6RNX0XG','SKODA','OCTAVIA',1.4),
('tJ9bOD8','FORD','FIESTA',1.1),
('ablFUSC','FORD','FIESTA',1),
('4B7RLYL','MERCEDED_BENZ','E CLASS',1.3),
('tlJiwVY','FORD','FIESTA',1),
('Fb9lcvG','FORD','FIESTA',1.4),
('nW4lqBC','FORD','FIESTA',1.6),
('LggTmL5','HYUNDAI','I20',1),
('2mGgSjS','FORD','FIESTA',1.1),
('IDvOzcM','FORD','FIESTA',1.3),
('JefpXK2','FORD','FIESTA',1.5),
('0h1uWfZ','MERCEDED_BENZ','E CLASS',1.4),
('ylBoGbV','MERCEDED_BENZ','E CLASS',1.7),
('XzoILDK','VAUXHALL','CORSA',1.8),
('Xhocs1Z','FORD','FIESTA',1.5),
('Lh2yWGa','KIA','RIO',1.5),
('hM5GWA0','FORD','FIESTA',1.3),
('PbpxkFt','FORD','FIESTA',1.7),
('SDHWV2r','FORD','FIESTA',1.2),
('n83Je2D','FORD','FIESTA',1.8),
('sDN0gex','FORD','FIESTA',1.2),
('7EICOZY','KIA','RIO',1.5),
('PUuMmIH','FORD','FIESTA',1),
('HiBwSg2','FORD','FIESTA',1.8),
('1yk1vDm','KIA','RIO',1.7),
('cMpH72R','HYUNDAI','I20',1.1),
('ZgQL0gt','MERCEDED_BENZ','E CLASS',1.3),
('jhpamQG','KIA','RIO',1.1),
('pk0lU2F','VAUXHALL','CORSA',1.4),
('fDCUeq1','FORD','FIESTA',1.1),
('ono5QFC','FORD','FIESTA',1.7),
('VohWwGR','FORD','FIESTA',1.5),
('Hih8dKc','SUZUKI','SWIFT',1.2),
('D2RNn3h','SUZUKI','SWIFT',1.2),
('QaYQulE','FORD','FIESTA',1.1),
('xmQPxAG','FORD','FIESTA',1.8),
('vmTqkTO','FORD','FIESTA',1.2),
('lvUtVUA','MERCEDED_BENZ','E CLASS',1),
('SFoj00d','FORD','FIESTA',1),
('9S6wrWV','MERCEDED_BENZ','E CLASS',1),
('0SBnW0z','FORD','FIESTA',1.1),
('HnDHdfj','MERCEDED_BENZ','E CLASS',1),
('RV7q947','FORD','FIESTA',1.4),
('JZqCtTg','FORD','FIESTA',1.7),
('XVgBwgi','FORD','FIESTA',1.8),
('iqJDsIF','FORD','FIESTA',1.6),
('CMbpRFa','FORD','FIESTA',1.6),
('vF7K5Xg','SUZUKI','SWIFT',1.1),
('3j6XGDH','FORD','FIESTA',1.5),
('ommqugM','FORD','FIESTA',1.1),
('LMQkPnw','NISSAN','QASHQAI',1.4),
('1dKgcdd','FORD','FIESTA',1.5),
('hC8BxiP','MERCEDED_BENZ','E CLASS',1.1),
('wLTWol7','FORD','FIESTA',1.6),
('TY8ChYN','FORD','FIESTA',1.6),
('Gw1CpI8','FORD','FIESTA',1.4),
('L4OPAJq','FORD','FIESTA',1.1),
('6TyYpfi','NISSAN','QASHQAI',1.6),
('ozoOcGL','FORD','FIESTA',1.4),
('6IME19U','FORD','FIESTA',1.4),
('BxpmJO5','FORD','FIESTA',1.4),
('0zc2n5A','FORD','FIESTA',1.3),
('FqbBZE2','FIAT','500',1.7),
('2EkTOTz','FORD','FIESTA',1.4),
('fNBvIvg','MERCEDED_BENZ','E CLASS',1.2),
('u5j4R4S','KIA','RIO',1.4),
('zpWaUZo','FORD','FIESTA',1.1),
('FQPVQYc','NISSAN','QASHQAI',1.7),
('8RBQADq','KIA','RIO',1.7),
('TOz2bcT','HYUNDAI','I20',1.7),
('jebhCex','FORD','FIESTA',1.3),
('cdHA1gL','FORD','FIESTA',1.2),
('FoaN4AT','FORD','FIESTA',1.7),
('atGn288','FORD','FIESTA',1.5),
('es8VNdW','FIAT','500',1.3),
('hDWoMXa','KIA','RIO',1.4),
('Q9C6Br1','KIA','RIO',1.5),
('mFSy4aF','FORD','FIESTA',1.6),
('bbbKnrM','SKODA','OCTAVIA',1.5),
('qY7lz6I','FORD','FIESTA',1),
('8Ch2OeU','VAUXHALL','CORSA',1.3),
('dcWsjJv','VAUXHALL','CORSA',1.3),
('bnnoBPg','SKODA','OCTAVIA',1.8),
('mvDyYkK','FORD','FIESTA',1.4),
('KpWDYap','FORD','FIESTA',1.3),
('7EK9K4z','FORD','FIESTA',1.3),
('ZPLHtlP','FORD','FIESTA',1.6),
('4EpYeSB','FORD','FIESTA',1.6),
('O1eZ20M','FORD','FIESTA',1),
('WfVntKk','FORD','FIESTA',1.7),
('6VlkBdi','FORD','FIESTA',1.1),
('hFQfKjk','KIA','RIO',1.4),
('3Y4njNP','KIA','RIO',1),
('3UuNqG0','FORD','FIESTA',1.7),
('qpvMYAu','FORD','FIESTA',1.1),
('NCYJUqx','FORD','FIESTA',1.3),
('M0AvWzg','FORD','FIESTA',1.6),
('XbVmtFf','FORD','FIESTA',1.3),
('l8qZy0H','SKODA','OCTAVIA',1.3),
('EDUbxaU','MERCEDED_BENZ','E CLASS',1.6),
('nWLd82o','FORD','FIESTA',1.7),
('4AkoyWx','FORD','FIESTA',1),
('nOoO25v','FORD','FIESTA',1.3),
('VAm5aV8','NISSAN','QASHQAI',1.4),
('zbd3cie','FORD','FIESTA',1.5),
('hyAN71W','NISSAN','QASHQAI',1),
('FxACHDf','FIAT','500',1.7),
('wOZdaeV','FORD','FIESTA',1.6),
('gfxZl99','VAUXHALL','CORSA',1.1),
('06HhwEJ','SKODA','OCTAVIA',1.7),
('PCTgYiG','KIA','RIO',1.7),
('U54WXZQ','KIA','RIO',1.6),
('FHgrRiF','FORD','FIESTA',1.6),
('R3jP73p','SKODA','OCTAVIA',1.5),
('etVPKX9','SUZUKI','SWIFT',1.1),
('BE3yReB','FORD','FIESTA',1.7),
('zXmX878','FORD','FIESTA',1.6),
('wdM3P2m','FORD','FIESTA',1.7),
('tb727BM','FORD','FIESTA',1.1)
SELECT * FROM #tmpTable
You can apply a Windowed Aggregate to get the overall count:
SELECT make
     , model
     , count(*) as cnt                 -- count per Model
     , cast(count(*) * 100.0           -- compared to all counts
            / sum(count(*)) over ()
            as dec(5,2)) as distribution
FROM #tmptable
GROUP BY make
       , model
ORDER BY distribution desc;
If you want the percentage of the Model for each Make you need to add PARTITION BY:
SELECT make
     , model
     , count(*) as cnt                 -- count per Model
     , cast(count(*) * 100.0
            / sum(count(*))            -- compared to all counts per Make
              over (partition by make)
            as dec(5,2)) as distribution
FROM #tmptable
GROUP BY make
       , model
ORDER BY make, distribution desc;
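If the goal is to profile one column at a time, the same windowed pattern can be pointed at a single column. The following is only a sketch against the sample #tmpTable from the question; grouping by model alone is an assumption about what "distribution of a column" means here, and the column can be swapped for make or engine_size.

```sql
-- Sketch: share of each distinct value in one column (model)
SELECT model
     , COUNT(*) AS cnt                                  -- rows per model
     , CAST(COUNT(*) * 100.0
            / SUM(COUNT(*)) OVER ()                     -- total rows in the table
            AS decimal(5,2)) AS distribution            -- percentage of all rows
FROM #tmpTable
GROUP BY model
ORDER BY cnt DESC;
```

Note that rows where the column is NULL form their own group, because COUNT(*) counts rows rather than non-NULL values.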
You can use conditional aggregation to get the ratio of the count of Ford Fiestas and the total count.
SELECT 100.0
       * count(CASE
                 WHEN make = 'FORD'
                      AND model = 'FIESTA' THEN
                   1
               END)
       / count(*)
FROM #tmptable;
Edit:
If you want the figures for all car models, you can group by make and model to get the count for each car model, then divide that by the total count, which you can get via a subquery.
SELECT make,
       model,
       100.0
       * count(*)
       / (SELECT count(*)
          FROM #tmptable)
FROM #tmptable
GROUP BY make,
         model;
I am trying to find the last date, by location and product, on which a sum was positive. The only way I can think of to do it is with a cursor, and if that's the case I may as well do it in application code. Before I go down that route, I was hoping someone might have a better idea.
Table:
Product, Date, Location, Quantity
The scenario is: I find the quantity by location and product as at a particular date; if it is negative, I need to get the sum and the date when the group was last positive.
select
    Product,
    Location,
    SUM(Quantity) Qty,
    SUM(Value) Value
from
    ProductTransactions PT
where
    Date <= @AsAtDate
group by
    Product,
    Location
I am looking for the last date on which the sum of the transactions up to and including that date is positive.
Based on your revised question and your comment, here is another solution that I hope answers your question.
select Product, Location, max(Date) as Date
from (
    select a.Product, a.Location, a.Date
    from ProductTransactions as a
    join ProductTransactions as b
      on a.Product = b.Product and a.Location = b.Location
    where b.Date <= a.Date
    group by a.Product, a.Location, a.Date
    having sum(b.Value) >= 0
) as T
group by Product, Location
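The subquery (table T) produces a list of {product, location, date} rows for which the sum of the values up to and including that date is non-negative (use > 0 in the HAVING clause if a zero total should not count as positive, or sum b.Quantity if the test should be on quantity). From that set, we select the last date for each {product, location} pair.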
This can be done in a set based way using windowed aggregates in order to construct the running total. Depending on the number of rows in the table this could be a bit slow but you can't really limit the time range going backwards as the last positive date is an unknown quantity.
I've used a CTE for convenience to construct the aggregated data set, but converting that to a temp table should be faster. (SQL Server does not materialize a CTE, so it is evaluated again each time it is referenced, whereas a temp table is populated only once.)
The basic theory is to construct the running totals for all of the previous days using the OVER clause to partition and order the SUM aggregates. This data set is then used and filtered to the expected date. When a row in that table has a quantity less than zero it is joined back to the aggregate data set for all previous days for that product and location where the quantity was greater than zero.
Since this may return multiple positive date rows, the ROW_NUMBER() function is used to order the rows based on the date of the positive quantity day. This is done in descending order so that row number 1 is the most recent positive day. It isn't possible to use a simple MAX() here because the row with the MAX([Date]) is not necessarily the row with the MAX(Quantity).
WITH x AS (
    -- running totals per product and location, up to the as-at date
    SELECT [Date],
           Product,
           [Location],
           SUM(Quantity) OVER (PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS Quantity,
           SUM([Value]) OVER (PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS [Value]
    FROM ProductTransactions
    WHERE [Date] <= @AsAtDate
)
SELECT [Date], Product, [Location], Quantity, [Value], Positive_date, Positive_date_quantity
FROM (
    SELECT x1.[Date], x1.Product, x1.[Location], x1.Quantity, x1.[Value],
           x2.[Date] AS Positive_date, x2.[Quantity] AS Positive_date_quantity,
           -- row 1 is the most recent earlier date with a positive running quantity
           ROW_NUMBER() OVER (PARTITION BY x1.Product, x1.[Location] ORDER BY x2.[Date] DESC) AS Positive_date_row
    FROM x AS x1
    LEFT JOIN x AS x2 ON x1.Product=x2.Product AND x1.[Location]=x2.[Location]
        AND x2.[Date]<x1.[Date] AND x1.Quantity<0 AND x2.Quantity>0
    WHERE x1.[Date] = @AsAtDate
) AS y
WHERE Positive_date_row=1
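The temp-table variant mentioned above might look roughly like the following. This is only a sketch: #RunningTotals is a hypothetical name, and @AsAtDate is assumed to be a date variable declared by the caller (the value shown is a placeholder).

```sql
-- Sketch: materialize the running totals once instead of using a CTE
DECLARE @AsAtDate date = '20170104';   -- placeholder value, an assumption

SELECT [Date],
       Product,
       [Location],
       SUM(Quantity) OVER (PARTITION BY Product, [Location] ORDER BY [Date]) AS Quantity,
       SUM([Value])  OVER (PARTITION BY Product, [Location] ORDER BY [Date]) AS [Value]
INTO #RunningTotals                     -- hypothetical temp table name
FROM ProductTransactions
WHERE [Date] <= @AsAtDate;

-- The outer query can then read FROM #RunningTotals AS x1
-- LEFT JOIN #RunningTotals AS x2 ... in place of the CTE x.
```

Whether this is actually faster depends on the row counts and indexes involved, so it is worth comparing both forms against real data.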
Do you mean that you want the last date on which the group's running quantity became positive?
For example, if you are using SQL Server 2012+:
In the following scenario, the running total of the quantity reaches 1 (-10+5+6) on 01/03/2017.
Is it possible for the quantity to become negative again on a later date?
;WITH tb(Product, Location,[Date],Quantity) AS(
    SELECT 'A','B',CONVERT(DATETIME,'01/01/2017'),-10 UNION ALL
    SELECT 'A','B','01/02/2017',5 UNION ALL
    SELECT 'A','B','01/03/2017',6 UNION ALL
    SELECT 'A','B','01/04/2017',2
)
SELECT t.Product,t.Location,SUM(t.Quantity) AS Qty,
       -- MIN picks the first date on which the running total is positive
       MIN(CASE WHEN t.CurrentSum>0 THEN t.Date ELSE NULL END ) AS LastPositiveDate
FROM (
    -- partition so that each product/location gets its own running total
    SELECT *,SUM(tb.Quantity)OVER(PARTITION BY tb.Product, tb.Location ORDER BY [Date]) AS CurrentSum FROM tb
) AS t GROUP BY t.Product,t.Location
Product Location Qty LastPositiveDate
------- -------- ----------- -----------------------
A B 3 2017-01-03 00:00:00.000
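If the running total can drop below zero again, MIN only finds the first date on which it turned positive. A possible variant for that case, assuming SQL Server 2012+ and a hypothetical fourth row that pushes the total negative again, takes the MAX of the dates on which the running total is positive:

```sql
;WITH tb(Product, Location, [Date], Quantity) AS (
    SELECT 'A','B',CONVERT(date,'20170101'),-10 UNION ALL
    SELECT 'A','B','20170102',5 UNION ALL
    SELECT 'A','B','20170103',6 UNION ALL
    SELECT 'A','B','20170104',-4          -- hypothetical row: total goes negative again
),
running AS (
    -- running total per product/location, ordered by date
    SELECT Product, Location, [Date], Quantity,
           SUM(Quantity) OVER (PARTITION BY Product, Location
                               ORDER BY [Date]) AS RunningQty
    FROM tb
)
SELECT Product, Location,
       SUM(Quantity) AS Qty,                                          -- total as at the last date
       MAX(CASE WHEN RunningQty > 0 THEN [Date] END) AS LastPositiveDate
FROM running
GROUP BY Product, Location;
```

With these four rows the running totals are -10, -5, 1 and -3, so the query should report 2017-01-03 as the last positive date alongside a final quantity of -3.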