Can someone tell me if there is a way to convert Flink SQL TIMESTAMP(3) or TIMESTAMP_LTZ(3) to milliseconds without involving a UDF?
Or maybe there is a way to represent Kafka event time in milliseconds using the Table API?
I think something like this might work:
SELECT (1000 * EXTRACT(EPOCH FROM ts)) + EXTRACT(MILLISECOND FROM ts)
Thanks to @David Anderson and Flink's error messages, I found this solution:
1000 * UNIX_TIMESTAMP(CAST(eTime AS STRING)) + EXTRACT(MILLISECOND FROM eTime)
Maybe in the future we'll have a built-in function for it...
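For reference, the arithmetic that expression performs can be sketched in plain Python (the helper name and sample timestamp below are illustrative, not part of Flink):

```python
from datetime import datetime, timezone

def to_epoch_millis(ts: datetime) -> int:
    # Mirrors 1000 * UNIX_TIMESTAMP(...) + EXTRACT(MILLISECOND FROM ...):
    # whole seconds since the epoch, scaled to ms, plus the sub-second part.
    return int(ts.timestamp()) * 1000 + ts.microsecond // 1000

ts = datetime(2021, 6, 1, 12, 0, 0, 123000, tzinfo=timezone.utc)
print(to_epoch_millis(ts))  # epoch milliseconds, keeping the .123 fraction
```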
I am using Flink 1.13 SQL. I have a Kafka table like
create table my_table (
  id string,
  event_time timestamp(3),
  watermark for event_time as ...
)
I want to group messages into 10-minute tumbling windows, and in addition I want to recompute the results for late messages that arrive within 1 hour.
One way I know of is to use a UDF, like
select count(1) from my_table
where event_time >= '1 hour ago'
group by ten_minutes_udf(event_time)
But this way the Flink state never expires, and I can't find a suitable window TVF aggregation to do it.
Is there another way to do this?
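For illustration, the bucketing such a ten_minutes_udf would perform amounts to truncating the event time down to a 10-minute boundary; a minimal Python sketch (the function name and sample values here are made up):

```python
def ten_minute_bucket(epoch_millis: int) -> int:
    # Truncate an epoch-millisecond timestamp down to its 10-minute window start.
    window_size = 10 * 60 * 1000  # 10 minutes in milliseconds
    return epoch_millis - (epoch_millis % window_size)

# Two events 3 minutes apart land in the same bucket:
print(ten_minute_bucket(1_622_548_800_000))
print(ten_minute_bucket(1_622_548_980_000))
```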
In Flink 1.14 a current_watermark() function was added that can be used to detect and operate on late events.
Since before 1.13 there has been an experimental table.exec.emit.allow-lateness configuration option that can be used with the (now legacy) window operations (but not with window TVFs).
So I want to alert when my watermark falls behind.
I want to use the metrics reported by Flink's job manager. Something like this, but this doesn't work the way I'd like:
(timestamp(flink_taskmanager_job_task_operator_currentInputWatermark{task_name=~"my_window.*"})-(4*60*60*1000))-flink_taskmanager_job_task_operator_currentInputWatermark{task_name=~"my_window.*"}
Verbally: I'd like to get the diff between the current time (the time when the metric was reported) and the watermark timestamp.
The (4*60*60*1000) is there to convert to EDT -- is there a better way to do this?
OK, so the above query was almost perfect. What I was doing wrong was shifting an already-EDT timestamp by another -4h. Below is the query that does this correctly:
timestamp(flink_taskmanager_job_task_operator_currentInputWatermark{task_name="my_window",job_name="session"})*1000-flink_taskmanager_job_task_operator_currentInputWatermark{task_name="my_window",job_name="session"}
timestamp() doesn't report in milliseconds (it reports in seconds), while flink_taskmanager_job_task_operator_currentInputWatermark does report in milliseconds, hence the *1000 conversion.
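For what it's worth, the lag the corrected query computes reduces to subtracting two epoch values after a unit conversion; a small Python sketch of the same arithmetic (the function name and sample values are illustrative):

```python
def watermark_lag_ms(sample_time_secs: float, watermark_ms: int) -> int:
    # PromQL's timestamp() yields the sample time in *seconds*, while Flink's
    # currentInputWatermark metric is epoch *milliseconds* -- hence the * 1000.
    return int(sample_time_secs * 1000) - watermark_ms

# A watermark 5 seconds behind the sample time:
print(watermark_lag_ms(1_622_548_805.0, 1_622_548_800_000))
```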
So I'm simulating a streaming task using the Flink DataStream API, and I want to execute a SQL query on each window.
Let's say this is the query
SELECT name, age, sum(days), avg(salary)
FROM employees
WHERE age > 25
GROUP BY name, age
ORDER BY name, age
I'm having a hard time translating it to Flink. From my understanding, to calculate the average I need to do it manually using .apply() and a WindowFunction. But how do I calculate the sum then? Also manually, in the same WindowFunction?
I'm also wondering if it is possible to do order by on the whole window?
Below is the pseudocode of what I've thought of so far. Any help would be appreciated! Thanks!
employeesStream
    .filter(new FilterFunction() ...)               // WHERE clause
    .keyBy(nameIndex, ageIndex)                     // GROUP BY??
    .timeWindow(Time.seconds(10), Time.seconds(1))
    .apply(new WindowFunction() ...)                // calculate average (and sum?)
    // ORDER BY??
I checked the Table API, but it seems that for streaming not many operations are supported, e.g. orderBy.
Ordering in streaming is not trivial. How do you sort something that never ends? In your example you calculate an average and a sum, which is just one value per window. You cannot sort one value.
One possibility is to buffer all values and wait for an indicator of completeness before sorting. Thanks to event time and watermarks, it is possible to sort a stream if you know that you have seen all values up to a certain time (i.e., the watermark).
Event-time sort has been introduced recently and will be part of Flink 1.4 Table API. See here for an example.
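To make the earlier point concrete: once a window is complete, the sum, the average, and the ordering can all happen in one per-window step. A rough Python sketch of that logic (the function and field names are assumptions for illustration, not Flink API):

```python
from collections import defaultdict

def aggregate_window(rows):
    """rows: (name, age, days, salary) tuples belonging to one closed window."""
    acc = defaultdict(lambda: [0, 0.0, 0])  # key -> [sum_days, sum_salary, count]
    for name, age, days, salary in rows:
        if age > 25:                         # WHERE age > 25
            a = acc[(name, age)]             # GROUP BY name, age
            a[0] += days
            a[1] += salary
            a[2] += 1
    # ORDER BY name, age -- possible here because the window is complete
    return [(name, age, s_days, s_sal / n)
            for (name, age), (s_days, s_sal, n) in sorted(acc.items())]

print(aggregate_window([("bob", 30, 5, 100.0), ("ann", 28, 3, 90.0),
                        ("bob", 30, 2, 110.0), ("eve", 20, 9, 50.0)]))
# → [('ann', 28, 3, 90.0), ('bob', 30, 7, 105.0)]
```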
I have a table with a datetime column. I want to retrieve the time in hh:mm:ss format, where the hours are on a 12-hour clock.
I have tried the following:
convert(varchar(10), t2.CSM_START_TIME,108),
convert(varchar(2),datepart(hour,t2.CSM_START_TIME))
+':'+convert(varchar(2),datepart(mm,t2.CSM_START_TIME))
+':'+convert(varchar(2), datepart(SECOND,t2.CSM_START_TIME))
as START_TIME
SELECT LTRIM(RIGHT(CONVERT(CHAR(20),GETDATE(),22),11));
Result:
11:40:15 PM
Or if you really don't want the AM/PM (which I don't understand):
SELECT LTRIM(LEFT(RIGHT(CONVERT(CHAR(20), GETDATE(),22),11),8));
Result:
11:40:15
Much, much, much better to format this at the client. For example, if you are using C#, look at String.Format() and .ToString(). Don't make SQL Server do your dirty presentation work, especially when you have much more flexible features in a more powerful language.
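As an illustration of the client-side approach, the same formatting in Python via strftime (%I is the 12-hour clock, %p the AM/PM marker):

```python
from datetime import datetime

t = datetime(2023, 5, 1, 23, 40, 15)  # sample value for illustration
print(t.strftime("%I:%M:%S %p"))      # 12-hour clock with AM/PM
print(t.strftime("%I:%M:%S"))         # without the AM/PM marker
```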
This got me what I wanted.
,substring(convert(varchar(40), t2.CSM_START_TIME,109),12,9)+' '+substring(convert(varchar(40), t2.CSM_START_TIME,109),25,2)
You should try this (note that DATE_FORMAT is a MySQL function, not SQL Server):
DATE_FORMAT(NOW(),'%h:%i %p')
You will find more here: http://www.w3schools.com/sql/func_date_format.asp
Assume I have an object which represents a TASK. A task has a due date.
How do I create a query to get all tasks which are due today?
Some working code like
"select t from Task t where dueDate = :today"
would be useful.
Thank you in advance.
You assume that @Temporal is supported by Google's GAE/J plugin. It isn't, despite being reported to them over a year ago:
http://code.google.com/p/datanucleus-appengine/issues/detail?id=20&colspec=ID%20Stars%20Type%20Status%20Priority%20FoundIn%20TargetRelease%20Owner%20Summary
Assuming your dueDate is a Date annotated like this:
@Temporal(TemporalType.DATE)
private Date dueDate;
Then you can do the following query in JPQL:
select t from Task t where t.dueDate = current_date
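A related caveat: if dueDate is stored with a time component (i.e. without TemporalType.DATE), equality against current_date won't match, and the usual workaround is a half-open range over the day's bounds, passed in as query parameters. The bound computation, sketched in Python (names are illustrative):

```python
from datetime import date, datetime, time, timedelta

def today_bounds(today: date):
    # Half-open interval [start, end): dueDate >= start AND dueDate < end
    start = datetime.combine(today, time.min)
    end = start + timedelta(days=1)
    return start, end

start, end = today_bounds(date(2023, 5, 1))
print(start, end)  # midnight today and midnight tomorrow
```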