When creating a tensor with an array of timestamps, the numbers are incorrect - tensorflow.js

Looking for some kind of solution to this issue:
trying to create a tensor from an array of timestamps
[
1612892067115,
],
but here is what happens
tf.tensor([1612892067115]).arraySync()
> [ 1612892078080 ]
as you can see, the result is incorrect.
Somebody pointed out, I may need to use the datatype int64, but this doesn't seem to exist in tfjs 😭
I have also tried to divide my timestamp to a small float, but I get a similar result
tf.tensor([1.612892067115, 1.612892068341]).arraySync()
[ 1.6128920316696167, 1.6128920316696167 ]
If you know a way to work around using timestamps in a tensor, please help :)
:edit:
As an attempted workaround, I tried to remove my year, month, and date from my timestamp
Here are my subsequent input values:
[
56969701,
56969685,
56969669,
56969646,
56969607,
56969602
]
and their outputs:
[
56969700,
56969684,
56969668,
56969648,
56969608,
56969600
]
as you can see, they are still incorrect, and should be well within the acceptable range

found a solution that worked for me:
Since I only require a subset of the timestamp (just the date / hour / minute / second / ms) for my purposes, I simply truncate out the year / month:
export const subts = (ts: number) => {
  // a sub timestamp which can be used over the period of a month
  const yearMonth = +new Date(new Date().getFullYear(), new Date().getMonth())
  return ts - yearMonth
}
then I can use this with:
const subTimestamps = timestamps.map(ts => subts(ts))
const x_vals = tf.tensor(subTimestamps, [subTimestamps.length], 'int32')
now all my results work as expected.

TensorFlow.js currently has no int64 dtype. The only integer dtype is int32, and a millisecond timestamp such as 1612892067115 is far outside its range (maximum 2147483647). By default tf.tensor uses float32, which represents integers exactly only up to 2^24 = 16777216, which is why even your smaller values come back rounded.
Until int64 is supported, this can be solved by using a relative timestamp. A timestamp in js is the number of ms elapsed since 1 January 1970. A relative timestamp uses another origin and stores the number of ms elapsed since that date instead. That way, we get a smaller number that can be represented using int32. The best origin to take is usually the starting date of the records.
const a = Date.now() // computing a tensor directly out of it will give an inaccurate result since the number is out of range
const origin = new Date("02/01/2021").getTime()
const relative = a - origin
const tensor = tf.tensor(relative, undefined, 'int32')
// get back the data
const data = tensor.dataSync()[0]
// get the initial date
const initialDate = new Date(data + origin)
In other scenarios, if millisecond precision is not of interest, using the number of seconds elapsed would be better. Counted from 1 January 1970, this is called Unix time.
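To make the rounding and the workaround concrete, here is a minimal sketch that runs without tfjs. It assumes Math.fround (which rounds to the nearest float32) is a fair stand-in for a float32 tensor, and an Int32Array for an int32 tensor. The helper names toRelative and fromRelative and the 1 February 2021 UTC origin are made up for illustration.

```javascript
// Math.fround rounds a number to the nearest float32, the default dtype of tf.tensor.
const ts = 1612892067115;           // timestamp from the question
const asFloat32 = Math.fround(ts);  // 1612892078080, the wrong value seen in the question

// Hypothetical origin for this sketch: 1 February 2021, 00:00 UTC.
const ORIGIN_MS = Date.UTC(2021, 1, 1);
const INT32_MIN = -(2 ** 31);
const INT32_MAX = 2 ** 31 - 1;

// Convert an absolute ms timestamp to an offset from the origin,
// refusing values that an int32 cannot hold.
function toRelative(tsMs, originMs = ORIGIN_MS) {
  const rel = tsMs - originMs;
  if (!Number.isInteger(rel) || rel < INT32_MIN || rel > INT32_MAX) {
    throw new RangeError(`offset ${rel} does not fit in int32`);
  }
  return rel;
}

// Recover the absolute timestamp from a stored offset.
function fromRelative(rel, originMs = ORIGIN_MS) {
  return rel + originMs;
}

// Int32Array stands in for an int32 tensor so the sketch has no dependencies.
const stored = Int32Array.from([ts].map((t) => toRelative(t)));
const restored = fromRelative(stored[0]);

console.log(asFloat32, stored[0], restored);
```

Note that int32 only covers about 24.8 days of milliseconds (2147483647 / 86400000), so a millisecond offset overflows for records further than that from the origin. Switching to seconds, or picking an origin close to the data, avoids this.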

Related

lapply calling .csv for changes in a parameter

Good afternoon
I am currently trying to pull some data from pushshift but I am maxing out at 100 posts. Below is the code for pulling one day that works great.
testdata1<-getPushshiftData(postType = "submission", size = 1000, before = "1546300800", after= "1546200800", subreddit = "mysubreddit", nest_level = 1)
I have a list of Universal Time Codes for the beginning and ending of each day for a month. What I would like to do is get the syntax to replace the "after" and "before" values for each day and for each day to be added to the end of the pulled data. Even if it placed the data to a bunch of separate smaller datasets I could work with it.
Here is my (feeble) attempt. "links" is the data frame with the UTCs
mydata<- lapply(1:30, function(x) getPushshiftData(postType = "submission", size = 1000, after= links$utcstart[,x],before = links$utcendstart[,x], subreddit = "mysubreddit", nest_level = 1))
Here is the error message I get: Error in links$utcstart[, x] : incorrect number of dimensions
I've also tried without the "function (x)" argument and get the following message:
Error in ifelse(is.null(after), "", sprintf("&after=%s", after)) :
object 'x' not found
Can anyone help with this?

How can I calculate the difference between two dates in Luxon?

I'm doing a project in React js and I need to calculate the difference between dates.
First, the user chooses the startTime (state) and the endTime (state), and the program gives the difference in years, months and days (this part is done and working).
Now I'm trying to calculate the difference in days between 1-01-2022 and another date that the user chooses.
I have tried changing the format, but I receive errors.
//this is working
const date1 = DateTime.fromISO(endTime.toISOString());
const date2 = DateTime.fromISO(startTime.toISOString());
let durationDate = date1.diff(date2, ["years", "months", "days"]);
//--------------
//this is not working
const date3 = DateTime.fromISO(endTime.toISOString());
const dateNow = DateTime.now().startOf("year").toISO();
let dateTermination = endTime.diff(dateNow, ["years", "months", "days"]).toObject();
//----------------

Flink SlidingEventTimeWindows doesnt work as expected

I have a stream execution configured as
object FlinkSlidingEventTimeExample extends App {
  case class Trx(timestamp:Long, id:String, trx:String, count:Int)
  val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI()
  val watermarkS1 = WatermarkStrategy
    .forBoundedOutOfOrderness[Trx](Duration.ofSeconds(15))
    .withTimestampAssigner(new SerializableTimestampAssigner[Trx] {
      override def extractTimestamp(element: Trx, recordTimestamp: Long): Long = element.timestamp
    })
  val s1 = env.socketTextStream("localhost", 9999)
    .flatMap(l => l.split(" "))
    .map(l => Trx(timestamp = l.split(",")(0).toLong, id = l.split(",")(1), trx = l.split(",")(2), count = 1))
    .assignTimestampsAndWatermarks(watermarkS1)
    .keyBy(l => l.id)
    .window(SlidingEventTimeWindows.of(Time.seconds(20),Time.seconds(5))) // Not working
    //.window(SlidingProcessingTimeWindows.of(Time.seconds(20),Time.seconds(5))) // Working
    .sum("count")
    .print
  env.execute("FlinkSlidingEventTimeExample")
}
I have already defined a watermark, but couldn't figure out why it is not producing anything. Does anyone have any ideas? My Flink version is 1.14.0
My build.sbt is like below:
scalaVersion := "2.12.15"
libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % "1.14.0"
libraryDependencies += "org.apache.flink" %% "flink-runtime-web" % "1.14.0"
libraryDependencies += "org.apache.flink" %% "flink-clients" % "1.14.0"
libraryDependencies += "org.apache.flink" % "flink-queryable-state-runtime" % "1.14.0"
I am entering input data from socket(port:9999) like below:
1640375790000,1,trx1
1640375815000,1,trx2
1640375841000,1,trx3
1640375741000,1,trx4
I tried giving timestamps larger than the window size, but it is still not working.
Flink Web UI screenshots (images not included): web-ui, watermarks
Earlier answer deleted; it was based on faulty assumptions about the setup.
When event time windows fail to produce results it's always something to do with watermarking.
The timestamps in your input correspond to
December 24, 2021 19:56:30
December 24, 2021 19:56:55
December 24, 2021 19:57:21
December 24, 2021 19:55:41
so there's more than enough data to trigger the closure of several sliding windows. For example, trx2 has a large enough timestamp that it can generate a watermark large enough to close these windows that contain 19:56:30:
19:56:15 - 19:56:34.999
19:56:20 - 19:56:39.999
However, your execution graph looks something like this:
The problem is the rebalance between the socket source and the task that follows (the one doing flatmap -> map -> watermarks). Each of your four events is going to a different instance of the watermark strategy, and some instances aren't receiving any events. That's why there are no watermarks being generated.
What you want to do instead is to chain the input parsing and watermark generation to the source at the same parallelism, so that your execution graph looks like this instead:
This code will do that:
env
  .socketTextStream("localhost", 9999)
  .map(l => {
    val input = l.split(",")
    Trx(timestamp = input(0).toLong, id = input(1), trx = input(2), count = 1)
  })
  .setParallelism(1)
  .assignTimestampsAndWatermarks(watermarkS1)
  .setParallelism(1)
  .keyBy(l => l.id)
  .window(SlidingEventTimeWindows.of(Time.seconds(20), Time.seconds(5)))
  .sum("count")
  .print
In general it's not necessary to do watermarking at a parallelism of one, but it is necessary that every instance of the watermark generator either has enough events to work with, or is configured with withIdleness. (And if every instance is idle then you won't get any results either.)
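For anyone who wants to check the window arithmetic in this answer, here is a small dependency-free JavaScript sketch (not Flink code). It assumes the bounded-out-of-orderness watermark is the largest timestamp seen minus the allowed lateness minus 1 ms, and that a window can fire once the watermark reaches its last millisecond. windowStarts and watermarkAfter are hypothetical helper names.

```javascript
// Sliding window parameters from the question, in milliseconds.
const SIZE = 20000;
const SLIDE = 5000;
const OUT_OF_ORDERNESS = 15000;

// Start times of every sliding window that contains timestamp ts
// (assumes non-negative timestamps and a zero window offset).
function windowStarts(ts, size = SIZE, slide = SLIDE) {
  const starts = [];
  const lastStart = ts - (ts % slide);
  for (let start = lastStart; start > ts - size; start -= slide) {
    starts.push(start);
  }
  return starts;
}

// Watermark after a single generator instance has seen maxTs as its largest timestamp.
function watermarkAfter(maxTs, outOfOrderness = OUT_OF_ORDERNESS) {
  return maxTs - outOfOrderness - 1;
}

const trx1 = 1640375790000; // 19:56:30
const trx2 = 1640375815000; // 19:56:55
const wm = watermarkAfter(trx2);

// A window [start, start + SIZE) can fire once the watermark reaches start + SIZE - 1.
const closed = windowStarts(trx1).filter((start) => start + SIZE - 1 <= wm);

// Prints the windows starting at 19:56:20 and 19:56:15 UTC.
console.log(closed.map((start) => new Date(start).toISOString()));
```

This only holds when one watermark generator instance sees both trx1 and trx2, which is exactly what the parallelism change above is meant to guarantee.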

Exception: decimal.InvalidOperation raised when saving a Django data model

I am storing crypto-currency data into a Django data model (using Postgres database). The vast majority of the records are saved successfully. But, on one record in particular I am getting an exception decimal.InvalidOperation.
The weird thing is, I can't see anything different about the values being saved in the problematic record from any of the others that save successfully. I have included a full stack trace on paste bin. Before the data is saved, I have outputted raw values to the debug log. The following is the data model I'm saving the data to. And the code that saves the data to the data model.
I'm stumped! Anyone know what the problem is?
Data Model
class OHLCV(m.Model):
    """ Candles-stick data (open, high, low, close, volume) """
    # class variables
    _field_names = None
    timeframes = ['1m', '1h', '1d']
    # database fields
    timestamp = m.DateTimeField(default=timezone.now)
    market = m.ForeignKey('bc.Market', on_delete=m.SET_NULL, null=True, related_query_name='ohlcv_markets', related_name='ohlcv_market')
    timeframe = m.DurationField() # 1 minute, 5 minute, 1 hour, 1 day, or the like
    open = m.DecimalField(max_digits=20, decimal_places=10)
    high = m.DecimalField(max_digits=20, decimal_places=10)
    low = m.DecimalField(max_digits=20, decimal_places=10)
    close = m.DecimalField(max_digits=20, decimal_places=10)
    volume = m.DecimalField(max_digits=20, decimal_places=10)
Code Which Saves the Data Model
@classmethod
def fetch_ohlcv(cls, market:Market, timeframe:str, since=None, limit=None):
    """
    Fetch OHLCV data and store it in the database
    :param market:
    :type market: bc.models.Market
    :param timeframe: '1m', '5m', '1h', '1d', or the like
    :type timeframe: str
    :param since:
    :type since: datetime
    :param limit:
    :type limit: int
    """
    global log
    if since:
        since = since.timestamp()*1000
    exchange = cls.get_exchange()
    data = exchange.fetch_ohlcv(market.symbol, timeframe, since, limit)
    timeframe = cls.parse_timeframe_string(timeframe)
    for d in data:
        try:
            timestamp = datetime.fromtimestamp(d[0] / 1000, tz=timezone.utc)
            log.debug(f'timestamp={timestamp}, market={market}, timeframe={timeframe}, open={d[1]}, high={d[2]}, low={d[3]}, close={d[4]}, volume={d[5]}')
            cls.objects.create(
                timestamp=timestamp,
                market=market,
                timeframe=timeframe,
                open=d[1],
                high=d[2],
                low=d[3],
                close=d[4],
                volume=d[5],
            )
        except IntegrityError:
            pass
        except decimal.InvalidOperation as e:
            error_log_stack(e)
Have a look at your data and check if it fits within the field limitations:
The mantissa must fit in the max_digits;
The number of decimal places should not exceed decimal_places;
And according to the DecimalValidator : the number of whole digits should not be greater than max_digits - decimal_places;
Not sure how your fetch_ohlcv function fills the data array, but if there is a division involved, it is possible that a value ends up with more than 10 decimal places.
The problem I had, that brought me here, was too many digits in the integer part therefore failing the last requirement.
Check this answer for more information on a similar issue.
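As a rough way to spot the offending record, the three limits listed above can be checked on the raw values before saving. The sketch below is plain JavaScript and only inspects the decimal string. It is not Django's DecimalValidator, and fitsDecimalField is a hypothetical helper name.

```javascript
// Hypothetical helper: checks a plain decimal string against DecimalField-style limits.
// Assumes values look like "123.45"; exponent forms such as "1e-7" are rejected.
function fitsDecimalField(value, maxDigits, decimalPlaces) {
  const match = /^-?(\d+)(?:\.(\d+))?$/.exec(String(value).trim());
  if (!match) return false;
  const whole = match[1].replace(/^0+/, ""); // leading zeros do not count as digits
  const fraction = match[2] || "";
  const wholeOk = whole.length <= maxDigits - decimalPlaces;
  const fractionOk = fraction.length <= decimalPlaces;
  return wholeOk && fractionOk;
}

// With max_digits=20 and decimal_places=10, at most 10 whole digits are allowed.
console.log(fitsDecimalField("1234567890.1234567890", 20, 10)); // true
console.log(fitsDecimalField("12345678901.5", 20, 10));         // false: 11 whole digits
console.log(fitsDecimalField("0.12345678901", 20, 10));         // false: 11 decimal places
```

Running the logged open/high/low/close/volume values of the failing record through a check like this should show which field breaks a limit.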

How to loop through table based on unique date in MATLAB

I have this table named BondData which contains the following:
Settlement   Maturity    Price     Coupon
8/27/2016    1/12/2017   106.901   9.250
8/27/2019    1/27/2017   104.79    7.000
8/28/2016    3/30/2017   106.144   7.500
8/28/2016    4/27/2017   105.847   7.000
8/29/2016    9/4/2017    110.779   9.125
For each day in this table, I am about to perform a certain task which is to assign several values to a variable and perform necessary computations. The logic is like:
do while Settlement is the same
m_settle=current_row_settlement_value
m_maturity=current_row_maturity_value
and so on...
my_computation_here...
end
It's like I wanted to loop through my settlement dates and perform task for as long as the date is the same.
EDIT: Just to clarify my issue, I am implementing Yield Curve fitting using Nelson-Siegel and Svensson models. Here is my code so far:
function NS_SV_Models()
load bondsdata
BondData=table(Settlement,Maturity,Price,Coupon);
BondData.Settlement = categorical(BondData.Settlement);
Settlements = categories(BondData.Settlement); % get all unique Settlement
for k = 1:numel(Settlements)
rows = BondData.Settlement==Settlements(k);
Bonds.Settle = Settlements(k); % current_row_settlement_value
Bonds.Maturity = BondData.Maturity(rows); % current_row_maturity_value
Bonds.Prices=BondData.Price(rows);
Bonds.Coupon=BondData.Coupon(rows);
Settle = Bonds.Settle;
Maturity = Bonds.Maturity;
CleanPrice = Bonds.Prices;
CouponRate = Bonds.Coupon;
Instruments = [Settle Maturity CleanPrice CouponRate];
Yield = bndyield(CleanPrice,CouponRate,Settle,Maturity);
NSModel = IRFunctionCurve.fitNelsonSiegel('Zero',Settlements(k),Instruments);
SVModel = IRFunctionCurve.fitSvensson('Zero',Settlements(k),Instruments);
NSModel.Parameters
SVModel.Parameters
end
end
Again, my main objective is to get each model's parameters (beta0, beta1, beta2, etc.) on a per day basis. I am getting an error in Instruments = [Settle Maturity CleanPrice CouponRate]; because Settle contains only one record (8/27/2016). It's supposed to have two, since there are two rows for this date. Also, I noticed that Maturity, CleanPrice and CouponRate contain all records. They should only contain the respective data for each day.
Hope I made my issue clearer now. By the way, I am using MATLAB R2015a.
Use a categorical array. Here is your function (without its header line, and with all the rows I can't run commented out):
BondData = table(datetime(Settlement),datetime(Maturity),Price,Coupon,...
'VariableNames',{'Settlement','Maturity','Price','Coupon'});
BondData.Settlement = categorical(BondData.Settlement);
Settlements = categories(BondData.Settlement); % get all unique Settlement
for k = 1:numel(Settlements)
rows = BondData.Settlement==Settlements(k);
Settle = BondData.Settlement(rows); % current_row_settlement_value
Mature = BondData.Maturity(rows); % current_row_maturity_value
CleanPrice = BondData.Price(rows);
CouponRate = BondData.Coupon(rows);
Instruments = [datenum(char(Settle)) datenum(char(Mature))...
CleanPrice CouponRate];
% Yield = bndyield(CleanPrice,CouponRate,Settle,Mature);
%
% NSModel = IRFunctionCurve.fitNelsonSiegel('Zero',Settlements(k),Instruments);
% SVModel = IRFunctionCurve.fitSvensson('Zero',Settlements(k),Instruments);
%
% NSModel.Parameters
% SVModel.Parameters
end
Keep in mind the following:
You cannot concatenate variables of different types, as you try to do in: Instruments = [Settle Maturity CleanPrice CouponRate];
There is no need for the structure Bonds, since you don't really use it (e.g. Settle = Bonds.Settle;).
Use the relevant functions to convert between a datetime object and string or numbers. For instance, in the code above: datenum(char(Settle)). I don't know what kind of input you need to pass to the following functions.
