capture reoccurring seventh day in new column - sql-server

I have the below table...
run_dt check_type curr_cnt
6/1/21 ALL 50
5/31/21 ALL 25
5/26/21 ALL 43
5/25/21 ALL 70
6/1/21 SUB 23
5/25/21 SUB 49
I would like to capture the curr_cnt value that the same check_type had seven days before run_dt, i.e. the value from the same weekday of the previous week.
Something like...
run_dt check_type curr_cnt prev_nt
6/1/21 ALL 50 70
5/31/21 ALL 25
5/26/21 ALL 43
5/25/21 ALL 70
6/1/21 SUB 23 49
5/25/21 SUB 49
Can I use lead/lag or CTE?
What's the best option here, appreciate the feedback.

You could join the table to itself:
SELECT
a.run_dt,
a.check_type,
a.curr_cnt,
b.curr_cnt as prev_nt
from table a
left join table b on b.run_dt = dateadd(d,-7,a.run_dt)
and b.check_type = a.check_type
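The self-join idea can be tried out with the sample rows from the question. This is only a sketch: it uses SQLite through Python's sqlite3 module rather than SQL Server, so SQLite's date(..., '-7 days') stands in for DATEADD, the table name checks is made up, and the dates are assumed to be stored as ISO strings. It also matches on check_type so that ALL and SUB rows are not paired with each other, which is what the expected output in the question suggests.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings (assumption).
SAMPLE = [
    ("2021-06-01", "ALL", 50),
    ("2021-05-31", "ALL", 25),
    ("2021-05-26", "ALL", 43),
    ("2021-05-25", "ALL", 70),
    ("2021-06-01", "SUB", 23),
    ("2021-05-25", "SUB", 49),
]

def previous_week_counts(rows):
    """Return (run_dt, check_type, curr_cnt, prev_nt) for each input row."""
    con = sqlite3.connect(":memory:")
    # "checks" is a hypothetical table name used only for this sketch.
    con.execute("CREATE TABLE checks (run_dt TEXT, check_type TEXT, curr_cnt INTEGER)")
    con.executemany("INSERT INTO checks VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT a.run_dt, a.check_type, a.curr_cnt, b.curr_cnt AS prev_nt
        FROM checks a
        LEFT JOIN checks b
          ON b.run_dt = date(a.run_dt, '-7 days')   -- same weekday, one week earlier
         AND b.check_type = a.check_type            -- keep ALL and SUB separate
        ORDER BY a.check_type, a.run_dt DESC
        """
    ).fetchall()
    con.close()
    return result

for row in previous_week_counts(SAMPLE):
    print(row)
```

Rows with no match seven days earlier come back with prev_nt as None (NULL in SQL), because of the left join.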

Related

Formatting an sql column to contain only time in minutes

I have an SQL table that has this data
I need to format the data so that, instead of this string of numbers and characters, the column shows only the time in minutes. For example (in minutes):
88
85
85
67
63
76
71
75
75
42
I agree with Larnu's comment.
You can try something like below.
declare @string varchar(20) = '1 hour 28 mins'
Select @string,case when CHARINDEX('hour',@string)>1 then
SUBSTRING(@string,1,CHARINDEX('hour',@string)-1) * 60 else 0 end
+
case when CHARINDEX('mins',@string)>1 then
SUBSTRING(@string,CHARINDEX('mins',@string)-3,2) else 0 end
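The same arithmetic (hours times 60 plus minutes) can be sketched outside the database to check the logic. The question does not show the real column values, so the input format below ('1 hour 28 mins') is an assumption taken from the answer's test string, and to_minutes is a made-up helper name.

```python
import re

def to_minutes(text):
    """Convert a string such as '1 hour 28 mins' to a number of minutes.

    Assumes the hour and minute parts, when present, look like
    '<digits> hour(s)' and '<digits> min(s)'; a missing part counts as 0.
    """
    hours = re.search(r"(\d+)\s*hours?", text)
    mins = re.search(r"(\d+)\s*mins?", text)
    total = 0
    if hours:
        total += int(hours.group(1)) * 60
    if mins:
        total += int(mins.group(1))
    return total

print(to_minutes("1 hour 28 mins"))  # 88
print(to_minutes("42 mins"))         # 42
```

Unlike the fixed-width SUBSTRING in the SQL, the regular expression does not depend on the minutes being exactly two characters wide.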

Pulling ordered list using array functions in Excel

I have a report in excel that displays the sales results from each employee. The columns are Location, Region, Username & Sales. It is sorted by Sales descending, showing which employee has the best sales in the company.
I am attempting to have an additional sheet per region that displays the results for all employees in that region, also sorted by Sales (to avoid sorting the results of the many regions myself every day).
An example version of the first 12 rows of the Data Sheet:
G H I J K X
Row Location Username Sales Region Region
1 38 John.Doe 85 North1 North1
2 154 John.Smith 83 South2
3 23 E.Williams 83 North1
4 210 M.Williams 79 East5
5 139 Joe.Dawn 77 North2
6 22 Kay.Smith 69 South2
7 51 Jay.Smith 69 South2
8 125 L.Smith 69 East2
9 51 L.Day 69 South2
10 23 23.Guest2 67 North1
11 92 U.Goode 65 North4
I have successfully created an array function that pulls the Sales column of only the results in the specified region.
{=LARGE(SMALL(IF(IF(ISERROR(K:K),"",K:K)=$X$2,J:J),
ROW(INDIRECT("1:"&COUNTIF(K:K,$X$2)))),F2)}
I am attempting now for an array function that pulls the Username that matches the corresponding sales amount in the original array, and also matches the region. I am having trouble when a single region has 'ties' or more than one employee with the same sales that month. Here is what I started with for that function:
=INDEX(I:I,MATCH(1,(Y2=J:J)*($X$1=K:K),0))
but that fails when a single region has multiple users with the same sales. So I am trying a conditional to accommodate that, falling back to the function I know works when only one user in the region has that sales amount.
{=IF(COUNTIF($AB$2:AB2,AB2)>1,
INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2))),
INDEX(I:I,MATCH(1,(AC2=J:J)*($AB$2=K:K),0)))}
The inner piece may be sufficient if it worked, excluding the need for the conditional:
{=INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2)))}
I'll use the same function for Username.
Expected results for two regions:
X Y Z AA AB AC AD AE
Region Sales Username Location Region Sales Username Location
North1 85 John.Doe 38 South2 83 John.Smith 154
83 E.Williams 23 69 Kay.Smith 22
67 23.Guest2 23 69 Jay.Smith 51
69 L.Day 51
Since beginning to type this question I have found a work around that includes a few additional columns to complete the calculation, but still wanted to ask this to see if it was possible for knowledge's sake.
With North1 in X2, these are the formulas for Y2 (Sales), Z2 (Username) and AA2 (Location).
=IFERROR(AGGREGATE(14, 6, ($J$2:$J$999)/($K$2:$K$999=X$2), ROW(1:1)), "")
=IFERROR(INDEX($I:$I, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
=IFERROR(INDEX($H:$H, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
Fill down as necessary.
With South2 in AB2, copy Y2:AA2 to AC2:AE2 and fill down as necessary.
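To see what the formulas are doing, the same logic can be sketched in Python: keep only the rows for one region, order them by Sales descending, and let tied sales keep their original sheet order (the role COUNTIF plays as a tie counter in the formulas). The data is the sample from the question; region_ranking is a made-up name for illustration.

```python
# (location, username, sales, region) in the same order as the data sheet.
DATA = [
    (38, "John.Doe", 85, "North1"),
    (154, "John.Smith", 83, "South2"),
    (23, "E.Williams", 83, "North1"),
    (210, "M.Williams", 79, "East5"),
    (139, "Joe.Dawn", 77, "North2"),
    (22, "Kay.Smith", 69, "South2"),
    (51, "Jay.Smith", 69, "South2"),
    (125, "L.Smith", 69, "East2"),
    (51, "L.Day", 69, "South2"),
    (23, "23.Guest2", 67, "North1"),
    (92, "U.Goode", 65, "North4"),
]

def region_ranking(rows, region):
    """Return (sales, username, location) for one region, best sales first."""
    in_region = [r for r in rows if r[3] == region]
    # sorted() is stable, so rows with equal sales stay in sheet order,
    # mirroring "k-th matching row" picked via COUNTIF in the formulas.
    ranked = sorted(in_region, key=lambda r: -r[2])
    return [(sales, user, loc) for loc, user, sales, _ in ranked]

for line in region_ranking(DATA, "South2"):
    print(line)
```

For South2 this lists John.Smith first, followed by the three employees tied on 69 in the order they appear on the data sheet.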

Sum of multiple variables by group

I have a dataset with over 900 observations, each observation represents the population of a sub-geographical area for a given year by gender (male, female, all) and 20 different age groups.
I have dropped the variable for the sub-geographical area and I want to collapse into the greater geographical area (called Geo).
I am having a difficult time doing a SUM or PROC MEANS because I have so many age groups to sum up and I am trying to avoid writing them all out. I want to collapse across the group year, geo, sex so that I only have 3 observations per Geo (my raw data could have as many as 54 observations).
This is an example of what a tiny section of the raw data looks like:
Year Geo Sex Age0005 Age0610 Age1115 (etc)
2010 1 1 92 73 75
2010 1 2 57 81 69
2010 1 3 159 154 144
2010 1 1 41 38 43
2010 1 2 52 41 39
2010 1 3 93 79 82
2010 2 1 71 66 68
2010 2 2 63 64 70
2010 2 3 134 130 138
2010 2 1 32 35 34
2010 2 2 29 31 36
2010 2 3 61 66 70
This is how I want it to look:
Year Group Sex Age0005 Age0610 Age1115 (etc)
2010 1 1 133 111 118
2010 1 2 109 122 108
2010 1 3 252 233 226
2010 2 1 103 101 102
2010 2 2 92 95 106
2010 2 3 195 196 208
Any ideas? Please help!
You don't have to write out each variable name individually - there are ways of getting around that. E.g. if all of the age group variables that need to be summed up start with age then you can use a : wildcard to match them:
proc summary nway data = have;
var age:;
class year geo sex;
output out = want sum=;
run;
If your variables don't have a common prefix, but are all next to each other in one big horizontal group in your dataset, you can use a double dash list instead:
proc summary nway data = have;
var age0005--age1115; /*Includes all variables between these two*/
class year geo sex;
output out = want sum=;
run;
Note also the use of sum= - this means that each summarised variable is reproduced with its original name in the output dataset.
I personally like to use proc sql for this, since it makes it very clear what you're summing and grouping by.
data old ;
input Year Geo Sex Age0005 Age0610 Age1115 ;
datalines;
2010 1 1 92 73 75
2010 1 2 57 81 69
2010 1 3 159 154 144
2010 1 1 41 38 43
2010 1 2 52 41 39
2010 1 3 93 79 82
2010 2 1 71 66 68
2010 2 2 63 64 70
2010 2 3 134 130 138
2010 2 1 32 35 34
2010 2 2 29 31 36
2010 2 3 61 66 70
;
run;
proc sql ;
create table new as select
year
, geo label = 'Group'
, sex
, sum(age0005) as age0005
, sum(age0610) as age0610
, sum(age1115) as age1115
from old
group by geo, year, sex ;
quit;
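As a cross-check of the expected totals, the group-and-sum step can be sketched in plain Python on the sample rows. This is not SAS, just an illustration of what both approaches compute; sum_by_group is a made-up name, and the first three fields are assumed to be the grouping keys (Year, Geo, Sex).

```python
# Sample rows from the question: Year, Geo, Sex, Age0005, Age0610, Age1115
ROWS = [
    (2010, 1, 1, 92, 73, 75),
    (2010, 1, 2, 57, 81, 69),
    (2010, 1, 3, 159, 154, 144),
    (2010, 1, 1, 41, 38, 43),
    (2010, 1, 2, 52, 41, 39),
    (2010, 1, 3, 93, 79, 82),
    (2010, 2, 1, 71, 66, 68),
    (2010, 2, 2, 63, 64, 70),
    (2010, 2, 3, 134, 130, 138),
    (2010, 2, 1, 32, 35, 34),
    (2010, 2, 2, 29, 31, 36),
    (2010, 2, 3, 61, 66, 70),
]

def sum_by_group(rows, n_keys=3):
    """Sum every non-key column for each distinct combination of key columns."""
    totals = {}
    for row in rows:
        key, values = tuple(row[:n_keys]), row[n_keys:]
        if key not in totals:
            totals[key] = list(values)
        else:
            # Add this row's values column by column, however many there are.
            totals[key] = [t + v for t, v in zip(totals[key], values)]
    return totals

for key, sums in sorted(sum_by_group(ROWS).items()):
    print(*key, *sums)
```

Because the value columns are handled positionally, the 20 age groups never have to be written out, which is the same convenience the age: and double-dash lists give in PROC SUMMARY.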

How do I sum up my data in 4 rows?

Select
AvHours.LineNumber,
(SProd.PoundsMade / (AvHours.AvailableHRS - SUM (ProdDtime.DownTimeHRS))) AS Throughput,
SUM (ProdDtime.DownTimeHRS) AS [Lost Time],
(SUM(cast(ProdDtime.DownTimeHRS AS decimal(10,1))) * 100) / (cast(AvHours.AvailableHRS AS decimal(10,1))) AS [%DownTime],
SUM(SProd.PoundsMade) AS [Pounds Made],
(SProd.PoundsMade / (AvHours.AvailableHRS - SUM (ProdDtime.DownTimeHRS))) * SUM (ProdDtime.DownTimeHRS) AS [Pounds Lost]
FROM rpt_Line_Shift_AvailableHrs AvHours
inner join rpt_Line_Shift_Prod SProd on
AvHours.LineNumber=SProd.LineNumber AND AvHours.Shiftnumber=SProd.Shiftnumber
inner join rpt_Line_Shift_ProdDownTime ProdDtime on
(AvHours.LineNumber=ProdDtime.LineNumber AND AvHours.Shiftnumber=ProdDtime.Shiftnumber)
GROUP BY AvHours.LineNumber,SProd.PoundsMade,AvHours.AvailableHRS
ORDER BY AvHours.LineNumber
The query above gives the following result set:
Line#,Throughput,Lost Time, %downtime,Pounds Made,Pounds Lost
1 53 49 27.222222 97538 2597
1 44 39 20.312500 116229 1716
1 47 40 22.222222 92190 1880
1 55 31 16.145833 133215 1705
1 111 49 27.222222 204442 5439
1 13 31 16.145833 33540 403
1 86 49 27.222222 159432 4214
1 81 31 16.145833 197145 2511
1 74 40 22.222222 146202 2960
1 63 49 27.222222 115920 3087
1 76 39 20.312500 199172 2964
2 64 40 22.222222 126028 2560
2 149 49 27.222222 273966 7301
2 35 39 20.312500 92616 1365
3 49 39 20.312500 129591 1911
3 65 40 22.222222 129248 2600
3 84 39 20.312500 219997 3276
4 95 31 16.145833 229485 2945
4 76 40 22.222222 149996 3040
4 94 31 16.145833 228375 2914
4 99 39 20.312500 259794 3861
What I actually want is just 4 lines (Line# = 1,2,3 or 4) and all the other fields summed.
I'm not sure how to do it. Can anybody help?
Get rid of PoundsMade and AvailableHrs in your group by. It sounds like you only want to group by the Linenumber.
You can use your SQL as a nested table and then group by the line number, like the SQL below.
Select LineNumber, Sum(Throughput), Sum([Lost Time]), Sum([%DownTime]), Sum([Pounds Made]), Sum([Pounds Lost])
From
(Select
AvHours.LineNumber,
(SProd.PoundsMade / (AvHours.AvailableHRS - SUM (ProdDtime.DownTimeHRS))) AS Throughput,
SUM (ProdDtime.DownTimeHRS) AS [Lost Time],
(SUM(cast(ProdDtime.DownTimeHRS AS decimal(10,1))) * 100) / (cast(AvHours.AvailableHRS AS decimal(10,1))) AS [%DownTime],
SUM(SProd.PoundsMade) AS [Pounds Made],
(SProd.PoundsMade / (AvHours.AvailableHRS - SUM (ProdDtime.DownTimeHRS))) * SUM (ProdDtime.DownTimeHRS) AS [Pounds Lost]
FROM rpt_Line_Shift_AvailableHrs AvHours
inner join rpt_Line_Shift_Prod SProd on
AvHours.LineNumber=SProd.LineNumber AND AvHours.Shiftnumber=SProd.Shiftnumber
inner join rpt_Line_Shift_ProdDownTime ProdDtime on
(AvHours.LineNumber=ProdDtime.LineNumber AND AvHours.Shiftnumber=ProdDtime.Shiftnumber)
GROUP BY AvHours.LineNumber,SProd.PoundsMade,AvHours.AvailableHRS
) A
Group BY LineNumber
ORDER BY LineNumber
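The nested-table pattern can be tried on a handful of rows from the result set in the question. This sketch uses SQLite via Python's sqlite3 module instead of SQL Server, and the table per_shift with its three columns is hypothetical: it stands in for the output of the original query, reduced to Lost Time and Pounds Made for lines 2 and 3.

```python
import sqlite3

# Rows copied from the posted result set: (line_number, lost_time, pounds_made)
PER_SHIFT = [
    (2, 40, 126028), (2, 49, 273966), (2, 39, 92616),
    (3, 39, 129591), (3, 40, 129248), (3, 39, 219997),
]

def totals_per_line(rows):
    """Collapse the per-shift rows to one row per line number."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE per_shift (line_number INTEGER, lost_time INTEGER, pounds_made INTEGER)"
    )
    con.executemany("INSERT INTO per_shift VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT line_number, SUM(lost_time), SUM(pounds_made)
        FROM (
            -- the inner query plays the role of the original statement
            SELECT line_number, lost_time, pounds_made FROM per_shift
        ) AS a
        GROUP BY line_number
        ORDER BY line_number
        """
    ).fetchall()
    con.close()
    return result

print(totals_per_line(PER_SHIFT))
```

Each line number comes back exactly once, with the inner rows summed, which is the shape the question asks for.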
I don't have a SQL Server instance available right now to test this, but let me know if you run into any issues.
Please mark this as the answer if it helped resolve your issue.

Display result of query without primary key

I want to join two tables (each table has 200 columns) so the purpose of this is to have a table with 400 columns, but how do I get the result without the primary key?
id a1 a2 a3 ... a200
-----------------------
1 23 4 5 7
2 24 6 8 17
3 13 14 52 73
...
id b1 b2 b3 ... b200
-----------------------
1 53 14 15 87
2 64 16 18 87
3 73 74 12 83
...
So the result I want is like
a1 a2 a3 ... a200 b1 b2 b3 .... b200
--------------------------------------
23 4 5 7 53 14 15 87
24 6 8 17 64 16 18 87
13 14 52 73 73 74 12 83
...
I have this
SELECT * FROM a as T1 join b as T2 on T1.id=T2.id;
There is no way to say SELECT (* EXCEPT some_col), sorry. However, it is quite easy to generate the list by dragging the "Columns" node for each table from Object Explorer onto the query window, and then simply remove the PK columns from the list. Click on the Columns node for a view or table, then drag it onto the query window:
Voila!
You will have to specify each individual column in the SELECT statement:
SELECT a1, a2, a3, ..., a200, b1, ..., b200
FROM a AS T1
join b AS T2 on T1.id = T2.id
Clearly this is overly cumbersome.
I would take a look at why you have so many columns and whether your data is properly normalised. Alternatively, is there the potential to simply use the columns you need nearer to the UI (if there is one?)
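If typing 400 column names by hand is the main obstacle, the statement can also be generated. Below is a small sketch that builds the SELECT text from two lists of column names and leaves out the key column. The function name build_select and the way the column lists are obtained are assumptions; in SQL Server the names could, for instance, be read from INFORMATION_SCHEMA.COLUMNS first.

```python
def build_select(table_a, cols_a, table_b, cols_b, key="id"):
    """Build a join query that selects every column except the key."""
    wanted = [f"T1.{c}" for c in cols_a if c != key]
    wanted += [f"T2.{c}" for c in cols_b if c != key]
    return (
        f"SELECT {', '.join(wanted)} "
        f"FROM {table_a} AS T1 JOIN {table_b} AS T2 ON T1.{key} = T2.{key}"
    )

# Column lists for the two tables; only the first few are spelled out here.
cols_a = ["id"] + [f"a{i}" for i in range(1, 4)]
cols_b = ["id"] + [f"b{i}" for i in range(1, 4)]
print(build_select("a", cols_a, "b", cols_b))
```

The key is still used in the join condition; it is only dropped from the output columns.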
