I want a SQL Server query to calculate cumulative P/L on stock trading (FIFO-based calculation).
Input table:
EXECTIME            share_name   Quantity   Price   Buy/Sell
2013-01-01 12:25    abc          100        100     B
2013-01-01 12:26    abc          10         102     S
2013-01-01 12:27    abc          10         102     S
2013-01-01 12:28    abc          10         95      S
2013-01-01 12:29    abc          10         99      S
2013-01-01 12:30    abc          10         105     S
2013-01-01 12:31    abc          100        102     B
2013-01-01 12:32    abc          150        101     S
OUTPUT:
EXECTIME            Cumulative P/L   Winning Streak   Losing Streak
2013-01-01 12:26    20               1                0
2013-01-01 12:27    40               1                0
2013-01-01 12:28    -10              0                1
2013-01-01 12:29    -20              0                2
2013-01-01 12:30    30               1                0
2013-01-01 12:32    -20              0                1
Explanation:
1st row - 10 shares sold at 102 which were purchased at 100, so profit = (102-100) * 10 = 20.
6th row - 150 shares sold at 101:
50 were purchased at 100 - 1st row (50 of those 100 were already sold above, 50 left)
100 were purchased at 102 - 7th row
150 * 101 - [(50 * 100) + (100 * 102)] = -50
cumulative P/L = 30 + (-50) = -20
Winning streak - 1 for a profitable trade.
Losing streak - 1, 2, ... for consecutive losses; resets again after a profit.
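To make the FIFO matching concrete, here is a minimal sketch in Python (not the requested T-SQL) that reproduces the sample output above. The trade list is the question's input table; one assumption is that a break-even trade counts toward the losing streak, since the sample data contains no zero-P/L trade:

```python
from collections import deque

# (time, side, qty, price) -- the sample trades from the question
trades = [
    ("12:25", "B", 100, 100),
    ("12:26", "S", 10, 102),
    ("12:27", "S", 10, 102),
    ("12:28", "S", 10, 95),
    ("12:29", "S", 10, 99),
    ("12:30", "S", 10, 105),
    ("12:31", "B", 100, 102),
    ("12:32", "S", 150, 101),
]

def fifo_pl(trades):
    lots = deque()          # open buy lots, oldest first: [remaining_qty, price]
    cum, win, loss = 0, 0, 0
    rows = []
    for t, side, qty, price in trades:
        if side == "B":
            lots.append([qty, price])
            continue
        # match the sell against the oldest open lots (FIFO)
        pl, remaining = 0, qty
        while remaining:
            lot = lots[0]
            take = min(remaining, lot[0])
            pl += (price - lot[1]) * take
            lot[0] -= take
            remaining -= take
            if lot[0] == 0:
                lots.popleft()
        cum += pl
        if pl > 0:
            win, loss = 1, 0        # winning streak is just a flag per the sample
        else:
            win, loss = 0, loss + 1  # consecutive losses count up, reset on profit
        rows.append((t, cum, win, loss))
    return rows

for row in fifo_pl(trades):
    print(row)
```

Running this prints exactly the six output rows above, including the -20 cumulative P/L on the final partial-lot sell.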
I am attempting to return the day of the week (i.e. Monday = 1, Tuesday = 2, etc.) based on a date column ("Posting_date"). I tried a for loop but got it wrong:
# First date of table was a Sunday (31 March 2019) => so counter starts at 7
posting_df3['Day'] = (posting_df3['Posting_date'] - dt.datetime(2019,3,31)).dt.days.astype('int16')
# Start counter on the right date (31 March 2019 is a Sunday)
count = 7
for x in posting_df3['Day']:
    if count != 7:
        count = 1
    else:
        count = count + 1
    posting_df3['Day'] = count
Not sure if there are other ways of doing this. Here is a snapshot of my dataframe:
level_0 Posting_date Reservation date Book_window ADR Day
0 9 2019-03-31 2019-04-01 -1 156.00 0
1 25 2019-04-01 2019-04-01 0 152.15 1
2 11 2019-04-01 2019-04-01 0 149.40 1
3 42 2019-04-01 2019-04-01 0 141.33 1
4 45 2019-04-01 2019-04-01 0 159.36 1
... ... ... ... ... ... ...
4278 739 2020-02-21 2019-04-17 310 253.44 327
4279 739 2020-02-22 2019-04-17 310 253.44 328
4280 31 2020-03-11 2019-04-01 345 260.00 346
In the final output, the Day column should return 7 for 2019-03-31 since it is a Sunday,
and 1 for 2019-04-01 since it is a Monday, etc.
You can do it this way:
df['weekday']=pd.to_datetime(df['Posting_date']).dt.weekday+1
Input
level_0 Posting_date Reservation_date Book_window ADR Day
0 9 3/31/2019 4/1/2019 -1 156.00 0
1 25 4/1/2019 4/1/2019 0 152.15 1
2 11 4/1/2019 4/1/2019 0 149.40 1
3 42 4/1/2019 4/1/2019 0 141.33 1
4 45 4/1/2019 4/1/2019 0 159.36 1
Output
level_0 Posting_date Reservation_date Book_window ADR Day weekday
0 9 3/31/2019 4/1/2019 -1 156.00 0 7
1 25 4/1/2019 4/1/2019 0 152.15 1 1
2 11 4/1/2019 4/1/2019 0 149.40 1 1
3 42 4/1/2019 4/1/2019 0 141.33 1 1
4 45 4/1/2019 4/1/2019 0 159.36 1 1
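The one-liner can be verified with a minimal runnable snippet (the dates below are taken from the sample above; pandas numbers Monday as 0 and Sunday as 6, hence the +1):

```python
import pandas as pd

# Minimal frame mirroring the question's Posting_date column
df = pd.DataFrame({"Posting_date": ["3/31/2019", "4/1/2019", "4/2/2019"]})

# .dt.weekday gives Monday=0 .. Sunday=6; adding 1 shifts to Monday=1 .. Sunday=7
df["weekday"] = pd.to_datetime(df["Posting_date"]).dt.weekday + 1

print(df["weekday"].tolist())  # -> [7, 1, 2]
```

2019-03-31 (a Sunday) maps to 7 and 2019-04-01 (a Monday) maps to 1, as the question requires.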
I have a large number of matrices (over 50) stored in a single array.
Each matrix represents a year (1951, 1952, and so on).
Each matrix contains observations of 4 plants at 80 locations,
so each matrix has 4 columns and 80 rows.
I want to rearrange my data into 4 data frames,
one data frame per plant, meaning the third dimension of my array (the different years) becomes my colnames and the different locations become my rownames.
1951
10 12 13 24
2 NA NA NA 288
3 114 139 NA 287
4 104 128 NA 285
5 105 128 NA 289
6 107 123 NA 282
7 112 121 NA 289
8 110 130 NA 287
9 112 128 NA 290
10 107 125 NA 284
. . . . .
. . . . .
1952
10 12 13 24
2 45 34 345 45
3 345 139 NA 287
4 104 128 345 285
5 105 128 NA 289
6 137 123 NA 282
7 112 141 123 239
8 110 130 NA 287
9 112 128 123 230
10 307 125 NA 284
. . . . .
. . . . .
Is there any quick way to do this?
This would be of great advantage for my following calculations!
Suppose we have the 9x4x2 array a shown reproducibly in the Note at the end. Then we can use apply to get a list of data frames from it. Replace 2 with 1 or 3 to get other variations.
apply(a, 2, as.data.frame)
giving:
$`10`
1951 1952
2 45 45
3 345 345
4 104 104
5 105 105
6 137 137
7 112 112
8 110 110
9 112 112
10 307 307
$`12`
1951 1952
2 34 34
3 139 139
4 128 128
5 128 128
6 123 123
7 141 141
8 130 130
9 128 128
10 125 125
$`13`
1951 1952
2 345 345
3 NA NA
4 345 345
5 NA NA
6 NA NA
7 123 123
8 NA NA
9 123 123
10 NA NA
$`14`
1951 1952
2 45 45
3 287 287
4 285 285
5 289 289
6 282 282
7 239 239
8 287 287
9 230 230
10 284 284
Note
a <- array(data = c(45L, 345L, 104L, 105L, 137L, 112L, 110L, 112L, 307L, 34L, 139L,
128L, 128L, 123L, 141L, 130L, 128L, 125L, 345L, NA, 345L, NA,
NA, 123L, NA, 123L, NA, 45L, 287L, 285L, 289L, 282L, 239L, 287L,
230L, 284L, 45L, 345L, 104L, 105L, 137L, 112L, 110L, 112L, 307L,
34L, 139L, 128L, 128L, 123L, 141L, 130L, 128L, 125L, 345L, NA,
345L, NA, NA, 123L, NA, 123L, NA, 45L, 287L, 285L, 289L, 282L,
239L, 287L, 230L, 284L),
dim = c(9, 4, 2),
dimnames = list(c("2", "3", "4", "5", "6", "7", "8", "9", "10"), c("10",
"12", "13", "14"), c("1951", "1952"))
)
I made some small example data called year_dfs for the thing you are trying to achieve. It should also work if you use a list of matrices instead of data frames.
library(tidyverse)
years <- 1951:1953
year_dfs <- list(data.frame(a = 1:5, b = 6:10),
data.frame(a = 11:15, b = 16:20),
data.frame(a = 21:25, b = 26:30)) %>%
`names<-`(years)
year_dfs
$`1951`
a b
1 1 6
2 2 7
3 3 8
4 4 9
5 5 10
$`1952`
a b
1 11 16
2 12 17
3 13 18
4 14 19
5 15 20
$`1953`
a b
1 21 26
2 22 27
3 23 28
4 24 29
5 25 30
lapply(1:ncol(year_dfs[[1]]), function(plant)
lapply(1:length(year_dfs), function(year)
year_dfs[[year]][,plant]) %>%
as.data.frame %>%
`colnames<-`(years)
) %>% `names<-`(colnames(year_dfs[[1]]))
$a
1951 1952 1953
1 1 11 21
2 2 12 22
3 3 13 23
4 4 14 24
5 5 15 25
$b
1951 1952 1953
1 6 16 26
2 7 17 27
3 8 18 28
4 9 19 29
5 10 20 30
I am new to R and I've been stuck on this. I have the data set below, in which I created a new list-column variable called 'amountOfTxn_array' that contains three numeric values in sequential order. These are the transaction amounts from Jan to Mar. My objective is to create new variables from this list column by iterating over each element of 'amountOfTxn_array'.
> head(myData_05_Array)
Index accountID amountOfTxn_array
1:00 8887 c(36.44, 75.00,185.24)
2:00 13462 c(639.45,656.10,237.00)
3:00 47249 c(0, 24, 2012)
4:00 49528 c(1189.20,2326.26,1695.89)
5:00 57201 c(24.67, 0.00, 0.00)
6:00 57206 c(0.00, 661.98,2957.68)
str(myData_05_Array)
Classes ‘data.table’ and 'data.frame': 3176 obs. of 4 variables:
$ accountID : int 8887 13462 47249 49528 57201 57206 58522 79073 80465 81032 ...
$ amountOfTxn_200501: num 36.4 639.5 0 1189.2 24.7 ...
$ amountOfTxn_200502: num 75 656 24 2326 0 ...
$ amountOfTxn_200503: num 185 237 2012 1696 0 ...
$ amountOfTxn_array :List of 3176
Also, example code for creating a new variable is provided below, in which I would like to tag 1 if a value in the array is greater than 100 and 0 otherwise. When I ran the example code, I got the error "Error: (list) object cannot be coerced to type 'double'". May I ask for a solution to this? I would highly appreciate any response.
Thanks!
> for(i in 1:3)
+ {
+ if(myData_05_Array$amountOfTxn_array[i] > 100){
+ myData_05_Array$testArray[i] <- 1
+ }
+ else{
+ myData_05_Array$testArray[i] <- 0
+ }
+ }
Error: (list) object cannot be coerced to type 'double'
What I am expecting as the output is as follows:
amountOfTxn_testArray
c(0, 0, 1)
c(1, 1, 1)
c(0, 0, 0)
c(1, 1, 1)
c(0, 0, 0)
c(0, 1, 1)
"Doing calculations for 24 columns is quite cumbersome"
Ha! Welcome to the dplyr world:
library(dplyr)
#generate dummy data
dummyDf <-read.table(text='Index accountID Jan Feb March
1:00 8887 36.44 75.00 185.24
2:00 13462 639.45 656.10 237.00
3:00 47249 0 24 2012
4:00 49528 1189.20 2326.26 1695.89
5:00 57201 24.67 0.00 0.00
6:00 57206 0.00 661.98 2957.68', header=TRUE, stringsAsFactors=FALSE)
Mutate columns by column index
# the dot (.) argument refers to the focal column
dummyDf %>% mutate_at(3:5, funs(as.numeric(. > 100)))
Mutate columns by predefined names
changeVars <- c("Jan", "Feb", "March")
dummyDf %>% mutate_at(.cols = changeVars, funs(as.numeric(. > 100)))
Mutate columns if some condition is met
dummyDf %>% mutate_if(is.double, funs(as.numeric(. > 100)))
output:
Index accountID Jan Feb March
1 1:00 8887 0 0 1
2 2:00 13462 1 1 1
3 3:00 47249 0 0 1
4 4:00 49528 1 1 1
5 5:00 57201 0 0 0
6 6:00 57206 0 1 1
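For readers more familiar with pandas, the same column-wise thresholding can be sketched in Python (a hypothetical equivalent of the dplyr calls above, using the question's first three accounts):

```python
import pandas as pd

# First three rows of the question's dummy data
df = pd.DataFrame({
    "accountID": [8887, 13462, 47249],
    "Jan": [36.44, 639.45, 0.0],
    "Feb": [75.00, 656.10, 24.0],
    "March": [185.24, 237.00, 2012.0],
})

# Apply the same ">100 -> 1 else 0" rule to all month columns at once
months = ["Jan", "Feb", "March"]
df[months] = (df[months] > 100).astype(int)

print(df)
```

As in the dplyr output, selecting the columns by name (or by position with `df.iloc[:, 2:5]`) avoids writing the comparison 24 times.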
I have two tables.
Table1:
Label Date CT
A 2014-01-01 19
A 2014-02-01 10
A 2014-03-01 19
A 2014-04-01 18
B 2014-01-01 20
B 2014-02-01 16
B 2014-03-01 14
B 2014-04-01 16
C 2014-01-01 13
C 2014-02-01 12
C 2014-03-01 19
C 2014-04-01 14
Table2 :
Label Date CT
D 2014-01-01 19
D 2014-02-01 10
D 2014-03-01 19
D 2014-04-01 18
E 2014-01-01 20
E 2014-02-01 16
E 2014-03-01 14
E 2014-04-01 16
F 2014-01-01 13
F 2014-02-01 12
F 2014-03-01 19
F 2014-04-01 14
Desired Output :
Label Jan'14 Feb'14 Mar'14 Apr'14 Total
A 19 10 19 18 66
B 20 16 14 16 66
C 13 12 19 14 58
D 19 10 19 18 66
E 20 16 14 16 66
F 13 12 19 14 58
I'm new to PostgreSQL.
I want to take the unique values of the Label column from both tables
and produce, for each label, its monthly counts plus the total.
I can combine both tables in a straightforward way using UNION ALL,
but that won't give me the desired output, i.e. a pivot-like view.
I googled this but nothing helped.
I came across this on SO and I'm still trying it out,
but I actually don't have a clue whether it can be done or not.
Can someone help me get the desired output?
Thanks in advance!!
Try something like this (crosstab comes from the tablefunc extension, so run CREATE EXTENSION IF NOT EXISTS tablefunc first):
select *, ("Jan ''14" + "Feb ''14" + "Mar ''14" + "Apr ''14") as total
from crosstab($$
    select id, to_char(da, 'Mon ''yy') as tt, no from t2
    union all
    select id, to_char(da, 'Mon ''yy') as tt, no from "T1"
    order by 1
$$, $$ values ('Jan ''14'), ('Feb ''14'), ('Mar ''14'), ('Apr ''14') $$) as at
(id text, "Jan ''14" integer, "Feb ''14" integer, "Mar ''14" integer,
 "Apr ''14" integer) order by id