I'm trying to figure out whether it's possible to transform table rows into columns when the number of rows included changes at query time. Here's a sample of what I'm trying to do:
Characteristics Table
strategy       year  month  aaa  aa  a
InvestmentA    2020  12     5    4   10
InvestmentB    2020  12     8    15  25
Investment(n)  2020  12     x    x   x
Output
year  month  Credit Type  InvestmentA  InvestmentB  Investment(n)
2020  12     aaa          5            8            x
2020  12     aa           4            15           x
2020  12     a            10           25           x
I have a dataset that looks like this:
data have;
input ID P1 P2 P3 P4;
datalines;
12 10 15 20 30
12 . 20 5 3
12 . . 25 33
12 . . . 30
19 10 15 20 30
19 . 10 17 30
19 . . 5 30
19 . . . 30
;
run;
I am trying to add a variable called Year so that, within each ID, each row of P1-P4 represents one year. The dataset would then look like this:
data want;
input ID P1 P2 P3 P4 Year;
datalines;
12 10 15 20 30 2017
12 . 20 5 3 2018
12 . . 25 33 2019
12 . . . 30 2020
19 10 15 20 30 2017
19 . 10 17 30 2018
19 . . 5 30 2019
19 . . . 30 2020
;
run;
I originally used this code:
Data Year;
do ID = 1 to 8;
do Year = 2017 to 2020;
output;
end;
end;
run;
data Final;
set have;
Merge Year;
run;
But now that I am working with a different dataset each time and I don't know the structure of the ID, I can't keep changing ID=1 to 8 to suit the dataset each time.
My question: Is there a way to do this through the dataset, possibly a count?
Count ID = 2017;
Year = count + 1;
There is no need to create a second data set to merge with the first.
You do need to make assumptions about the grouping in the have data set: the data must already be sorted or arranged by ID, so that a monotonically increasing year value can be assigned to each sequential row in each group.
data want;
  set have;
  by id;
  if first.id
    then year = 2017;  /* initial year for a group */
    else year + 1;     /* increment year for subsequent rows of a group */
run;
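For readers who don't use SAS, the same per-group counter idea can be sketched in Python. This is only an illustration under the same assumption as above (rows already ordered by ID); the sample rows and the add_year helper are hypothetical names, not part of the original answer.

```python
from itertools import groupby

# Hypothetical sample rows, already ordered by ID (mirrors the `have` data set;
# only P4 is kept to stay short).
rows = [
    {"ID": 12, "P4": 30}, {"ID": 12, "P4": 3}, {"ID": 12, "P4": 33}, {"ID": 12, "P4": 30},
    {"ID": 19, "P4": 30}, {"ID": 19, "P4": 30}, {"ID": 19, "P4": 30}, {"ID": 19, "P4": 30},
]

def add_year(rows, start_year=2017):
    """Restart the year at start_year for each ID and add 1 per following row."""
    out = []
    # groupby only groups consecutive rows, so the input must be ordered by ID
    for _, group in groupby(rows, key=lambda r: r["ID"]):
        for offset, row in enumerate(group):
            out.append({**row, "Year": start_year + offset})
    return out

want = add_year(rows)
```

No list of IDs has to be known in advance: the counter resets whenever the ID changes, which is what first.id does in the data step.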
I have a matrix that shows values by month, and in the last column I want to show the variance between the current month and the previous month. I have this dataset (Months):
Servername Month Year Reference Value Previous_Value
SV1 8 2017 80 11 Null
SV1 9 2017 80 13 11
SV1 10 2017 80 18 13
SV1 11 2017 80 21 18
SV1 12 2017 80 12 21
SV1 1 2018 80 18 12
Basically, I want to build an expression that returns the value from the row with MAX(Month) and MAX(Year). I tried this:
=IIF(Fields!Month.Value = max(Fields!Month.Value, "Months") and Fields!Year.Value = max(Fields!Year.Value, "Months"),Fields!Previous_Value.Value,0)
But when I run the report I get 0 for all of my machines. The final matrix should look like this:
Servername 8 9 10 11 12 1 Previous_Value
SV1 11 13 18 21 12 18 12
How can I do this?
Thanks!
Your expression says: if Month = MAX(Fields!Month.Value), which is 12, and Year = MAX(Fields!Year.Value), which is 2018, then show Previous_Value.
As none of your rows match both conditions (there is no row for month 12 of 2018), the expression always returns 0.
I've not tested this, but try combining the month and year into a single value and comparing against that. This is easiest to do in SQL by adding a new column:
SELECT *, ([Year] *100) + [Month] as YearMonthKey FROM myTable
Now your expression can just check YearMonthKey against MAX(Fields!YearMonthKey.Value)
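To see why the combined key helps, here is a small Python sketch using the rows from the question. It is only an illustration of the comparison logic, not SSRS code, and year_month_key is a name made up for this example.

```python
# Rows from the question's Months dataset: (month, year, value, previous_value)
rows = [
    (8, 2017, 11, None),
    (9, 2017, 13, 11),
    (10, 2017, 18, 13),
    (11, 2017, 21, 18),
    (12, 2017, 12, 21),
    (1, 2018, 18, 12),
]

def year_month_key(month, year):
    """Same arithmetic as the SQL column: (Year * 100) + Month."""
    return year * 100 + month

# Original approach: the maxima are taken independently, giving month 12 and year 2018
max_month = max(r[0] for r in rows)
max_year = max(r[1] for r in rows)
separate = [r for r in rows if r[0] == max_month and r[1] == max_year]

# Combined key: the maximum now identifies one real row (January 2018)
max_key = max(year_month_key(r[0], r[1]) for r in rows)
latest = [r for r in rows if year_month_key(r[0], r[1]) == max_key]
```

Here separate is empty, which corresponds to the 0 shown in the report, while latest holds the January 2018 row whose previous value is 12.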
I already have a SQL query that gets the data I need, but I'm struggling to figure out how to get that into a chart. This is sample data returned by my query:
year month day mode amount duration
2013 2 22 0 1 36001
2013 7 7 1 1 55062
2015 12 23 1 6 13
2015 12 23 4 4 11
2015 12 23 7 31 104
2015 12 23 8 2 4
2015 12 23 12 11 21
2015 12 23 13 3 8
2016 3 24 1 207 519
If I wanted to graph, let's say, amount grouped by year, month and day, how would that be done in JFreeChart?
I have an array that I want to add years and months sequentially to using a SAS program:
Original:
ID
1
2
3
End result:
ID YEAR MONTH
1 2014 11
1 2014 12
1 2015 1
1 2015 2
1 2015 3
2 2014 11
2 2014 12
2 2015 1
2 2015 2
2 2015 3
3 2014 11
3 2014 12
3 2015 1
3 2015 2
3 2015 3
I also need to set the upper and lower limits for the years and months I want to add to the table.
Any help is appreciated. Thanks!
As the comments suggest, I'm taking a bit of a guess at what you're looking for. From what you're asking, I'd recommend using a data step to loop through your original data, outputting multiple rows for each line in the original data.
This uses the intnx function to advance to the next month.
*Enter start and end date here;
%Let startdt = '01NOV2014'd;
%Let enddt = '01MAR2015'd;
data want (drop=_date);
  set original;
  *Create multiple records for each observation in 'original'- one for each month;
  _date = &startdt;
  DO UNTIL (_date > &enddt);
    year = year(_date);
    month = month(_date);
    output;
    *Advance to next month;
    _date = intnx('month', _date, 1, 'beginning');
  END;
run;
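As a language-neutral illustration of the same loop, here is a minimal Python sketch. The month_range helper and the ids list are hypothetical names for this example, and the start and end dates are assumed to fall on the first of a month, as in the macro variables above.

```python
from datetime import date

def month_range(start, end):
    """Yield (year, month) pairs from start through end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        # Advance to the next month, rolling over into the next year after December
        month += 1
        if month > 12:
            year, month = year + 1, 1

ids = [1, 2, 3]  # stands in for the 'original' data set
want = [
    (i, y, m)
    for i in ids
    for y, m in month_range(date(2014, 11, 1), date(2015, 3, 1))
]
```

Each ID is repeated once per month in the range, so three IDs and five months give fifteen rows, matching the end result in the question.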
Let's say I have maximum temperature data for the last 20 years. My data frame has columns for month, day, year and MAX_C (temperature data). I want to calculate the mean (and standard deviation, and range) of the maximum temperature from July 1 of one year to June 30 of the following year (i.e. mean max daily temp from July 1, 1991 to June 30, 1992). Is there an efficient way to do this?
My approach, thus far, has been to create an array:
maxt.prev12<-tapply(maxt$MAX_C,INDEX=list(maxt$month,maxt$day,maxt$year),mean)
I put mean in as the function because tapply would not produce an array without a function after the INDEX, but mean is not actually calculating anything here. Then I was thinking about taking January through June from one of the matrices (i.e. 1992) and July through December from the preceding matrix (i.e. 1991), and then computing the mean. I'm not entirely sure how to do that part, however; there must be a more efficient way of performing these calculations in R.
EDIT
Here is a simple sample set of data
maxt
day month year MAX_C
1 1 1990 29
1 2 1990 28
1 3 1990 32
1 4 1990 26
1 5 1990 24
1 6 1990 32
1 7 1990 30
1 8 1990 28
1 9 1990 28
1 10 1990 24
1 11 1990 30
1 12 1990 30
1 1 1991 25
1 2 1991 26
1 3 1991 28
1 4 1991 25
1 5 1991 24
1 6 1991 32
1 7 1991 26
1 8 1991 32
1 9 1991 26
1 10 1991 26
1 11 1991 27
1 12 1991 26
1 1 1992 27
1 2 1992 25
1 3 1992 29
1 4 1992 32
1 5 1992 27
1 6 1992 27
1 7 1992 24
1 8 1992 25
1 9 1992 28
1 10 1992 26
1 11 1992 31
1 12 1992 27
I would create an "indicator year" column that equals the year when the month is in July-Dec and year-1 when the month is in Jan-June.
EDITED month reference in light of the fact it was numeric rather than character:
> maxt$year2 <- maxt$year
> # months are numeric: Jan-June (1:6) belong to the period that began the previous July
> maxt[ maxt$month %in% 1:6, "year2"] <-
+      maxt[ maxt$month %in% 1:6, "year"] - 1
>
> mean_by_year <- tapply(maxt$MAX_C, maxt$year2, mean, na.rm=TRUE)
> mean_by_year
1989 1990 1991 1992
28.50000 27.50000 27.50000 26.83333
If you wanted to change the labels so they reflected the non-calendar year derivation:
> names(mean_by_year) <- paste(substr(names(mean_by_year),3,4),
+      as.character( as.numeric(substr(names(mean_by_year),3,4))+1),
+      sep="_")
> mean_by_year
89_90 90_91 91_92 92_93
28.50000 27.50000 27.50000 26.83333
Although it will not be quite right at the turn of the millennium: the label for 1999 would come out as "99_100".
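For comparison, the same "indicator year" grouping can be sketched in Python using the sample data from the question. This is just an illustration of the idea, not a replacement for the R code; period_start_year is a made-up helper name, and the labels use four-digit years so the century rollover is not an issue.

```python
from collections import defaultdict
from statistics import mean

# Sample MAX_C values from the question, one per month (January..December)
max_c = {
    1990: [29, 28, 32, 26, 24, 32, 30, 28, 28, 24, 30, 30],
    1991: [25, 26, 28, 25, 24, 32, 26, 32, 26, 26, 27, 26],
    1992: [27, 25, 29, 32, 27, 27, 24, 25, 28, 26, 31, 27],
}
rows = [
    (month, year, temp)
    for year, temps in max_c.items()
    for month, temp in enumerate(temps, start=1)
]

def period_start_year(month, year):
    """July-December stay in their own year; January-June go to the previous one."""
    return year if month >= 7 else year - 1

groups = defaultdict(list)
for month, year, temp in rows:
    groups[period_start_year(month, year)].append(temp)

# Label each July-to-June period with both four-digit years
mean_by_period = {f"{y}-{y + 1}": mean(ts) for y, ts in sorted(groups.items())}
```

The standard deviation and range asked about in the question could be computed from the same groups, for example with statistics.stdev and max(ts) - min(ts).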