When using the TDengine database, all streams stopped working for no reason

All streams stopped working at the same time for no apparent reason.
One of the stream creation SQL statements is:
CREATE STREAM IF NOT EXISTS cq_action_info_1m_s TRIGGER WINDOW_CLOSE
INTO cq_action_info_1m SUBTABLE(CONCAT('cai1_', tbname)) AS
SELECT _wstart AS _ts,
       avg(ave_time_top) AS ave_time_top,
       avg(average_time) AS average_time,
       max(handle_now_count) AS handle_now_count,
       max(last_time) AS last_time,
       max(max_time_top) AS max_time_top,
       max(p_error_rate) AS p_error_rate,
       max(profession_error) AS profession_error,
       max(req_count) AS req_count,
       max(req_second_count) AS req_second_count,
       max(total_time) AS total_time,
       max(unusual_count) AS unusual_count,
       action, action_name, host, port, shelltype
FROM zzdb.action_info
PARTITION BY action, action_name, host, port, shelltype
INTERVAL(1m);
Some system information is as follows:
TDengine version: 3.0.2.2
show dnodes:
(screenshot of the show dnodes output)
streams:
(screenshot of the stream list)
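For reference, a minimal sketch of the diagnostic statements behind the screenshots above, run from the TDengine CLI (the information_schema query is an assumption about what a 3.0.x build exposes):

-- list the dnodes and their status
SHOW DNODES;
-- list the defined streams and their current status
SHOW STREAMS;
-- if exposed by your build, the system table shows the same stream details
SELECT * FROM information_schema.ins_streams;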

Related

TDengine all streams stopped working for no reason

But I'm not sure whether the result of the last write is complete.
This is my latest query result; it still stops at the 30th. I don't know whether there is a local log file that would show the reason:

Flink SQL Tumble Aggregation result not written out to filesystem locally

Context
I have a Flink job written with the Python SQL API. It consumes source data from Kinesis and produces results to Kinesis. I want to run a local test to ensure the Flink application code is correct, so I mocked out both the source Kinesis and the sink Kinesis with the filesystem connector and then ran the test pipeline locally. The local Flink job always runs successfully, but when I look into the sink file, it is always empty. This was also the case when I ran the code in the Flink SQL Client.
Here is my code:
CREATE TABLE incoming_data (
    requestId VARCHAR(4),
    groupId VARCHAR(32),
    userId VARCHAR(32),
    requestStartTime VARCHAR(32),
    processTime AS PROCTIME(),
    requestTime AS TO_TIMESTAMP(SUBSTR(REPLACE(requestStartTime, 'T', ' '), 0, 23), 'yyyy-MM-dd HH:mm:ss.SSS'),
    WATERMARK FOR requestTime AS requestTime - INTERVAL '5' SECOND
) WITH (
    'connector' = 'filesystem',
    'path' = '/path/to/test/json/file.json',
    'format' = 'json',
    'json.timestamp-format.standard' = 'ISO-8601'
)
CREATE TABLE user_latest_request (
    groupId VARCHAR(32),
    userId VARCHAR(32),
    latestRequestTime TIMESTAMP
) WITH (
    'connector' = 'filesystem',
    'path' = '/path/to/sink',
    'format' = 'csv'
)
INSERT INTO user_latest_request
SELECT groupId,
userId,
MAX(requestTime) as latestRequestTime
FROM incoming_data
GROUP BY TUMBLE(processTime, INTERVAL '1' SECOND), groupId, userId;
Curious what I am doing wrong here.
Note:
I am using Flink 1.11.0
If I directly dump data from source to sink without windowing and grouping, it works fine. That means the source and sink tables are set up correctly, so the problem seems to be around the tumbling and grouping with the local filesystem.
This code works fine with Kinesis source and sink.
Have you enabled checkpointing? This is required if you are in STREAMING mode, which appears to be the case. See https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/file_sink/
The most likely cause is that there isn't enough data in the file being read to keep the job running long enough for the window to close. You have a processing-time-based window that is 1 second long, which means that the job will have to run for at least one second to guarantee that the first window will produce results.
Otherwise, once the source runs out of data the job will shut down, regardless of whether the window contains unreported results.
If you switch to event-time-based windowing, then when the file source runs out of data it will send one last watermark with the value MAX_WATERMARK, which will trigger the window.
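A minimal sketch of that suggestion against the tables defined in the question: the only change is grouping on the event-time attribute requestTime (which already has a watermark declared) instead of processTime, so the file source's final MAX_WATERMARK fires the last window before the job finishes.

INSERT INTO user_latest_request
SELECT groupId,
       userId,
       MAX(requestTime) AS latestRequestTime
FROM incoming_data
-- event-time tumbling window instead of the processing-time one
GROUP BY TUMBLE(requestTime, INTERVAL '1' SECOND), groupId, userId;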

Modifying values in master..sysprocesses [Sybase]

Is there a way to modify the column program_name in table master..sysprocesses?
I have found two methods, but both set the name during the creation of the connection:
Using parameter appname when executing an isql command
Adding parameter APP= in a connection string when opening an ODBC connection.
I am looking for a way to modify it AFTER it has been created.
I tried the following example:
sp_configure "allow updates",1
go
UPDATE master..sysprocesses
SET program_name = 'test'
where hostname = 'server'
and hostprocess = '23240'
go
sp_configure "allow updates",0
go
But failed:
Could not execute statement.
Table 'sysprocesses' can't be modified.
Sybase error code=270
Severity Level=16, State=1, Transaction State=0
Line 4
You can continue executing or stop.
Changes to the column sysprocesses.program_name are not allowed after the connection has been created. But there are three columns in sysprocesses that can be changed after the connection is established:
sysprocesses.clientname
sysprocesses.clientapplname
sysprocesses.clienthostname
Excerpt from the Sybase Infocenter website:
Changing user session information
The set command includes options
that allow you to assign each client an individual name, host name,
and application name. This is useful for differentiating among clients
in a system where many clients connect to Adaptive Server using the
same name, host name, or application name.
The partial syntax for the set command is:
set [clientname client_name | clienthostname host_name | clientapplname application_name]
where:
client_name – is the name you are assigning the client.
host_name – is the name of the host from which the client is
connecting.
application_name – is the application that is connecting to Adaptive
Server.
These parameters are stored in the clientname, clienthostname, and
clientapplname columns of the sysprocesses table.
For example, if a user logs in to Adaptive Server as "client1", you
can assign them an individual client name, host name, and application
name using commands similar to:
set clientname 'alison'
set clienthostname 'money1'
set clientapplname 'webserver2'
[...]
Use the client’s system process ID to view their connection
information. For example, if the user “alison” described above
connects with a spid of 13, issue the following command to view all
the connection information for this user:
select * from sysprocesses where spid = 13
To view the connection information for the current client connection (for example, if the user “alison” wanted to view her own connection information), enter:
select * from sysprocesses where spid = @@spid
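Putting the excerpt together, a minimal sketch of renaming an existing session and then checking the result from within that session (isql syntax; program_name itself still cannot be changed, only the three client* columns):

-- rename the current session
set clientname 'alison'
set clienthostname 'money1'
set clientapplname 'webserver2'
go
-- verify the change for this session
select clientname, clienthostname, clientapplname
from master..sysprocesses
where spid = @@spid
go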

Flink job submitted through sql-client.sh sometimes has no checkpoints (how can this be changed?), and how to recover in case of failure

For example, with sql-client.sh embedded:
insert into wap_fileused_daily(orgId, pdate, platform, platform_count)
select u.orgId, u.pdate, coalesce(p.platform, 'other'), sum(u.isMessage) as platform_count
from users as u
left join ua_map_platform as p on u.uaType = p.uatype
where u.isMessage = 1
group by u.orgId, u.pdate, p.platform
It shows up as in the attached screenshot, and there is never any checkpoint.
Questions:
1) How can I trigger checkpoints (alter the job)?
2) How can I recover in case of failure?
You can specify execution configuration parameters in the SQL Client YAML file. For example, the following should work:
configuration:
  execution.checkpointing.interval: 42
There is a related feature request on Flink: https://cwiki.apache.org/confluence/display/FLINK/FLIP-147%3A+Support+Checkpoints+After+Tasks+Finished
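As a sketch of the same idea applied per session rather than via the YAML file (newer SQL Client versions accept these options through SET; the savepoint path below is only a placeholder to illustrate recovery):

-- enable periodic checkpoints for jobs submitted from this SQL Client session
SET 'execution.checkpointing.interval' = '60s';
-- to recover after a failure, resume the next submission from a retained checkpoint/savepoint
SET 'execution.savepoint.path' = '/path/to/savepoint-or-retained-checkpoint';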

Event processing by using Flink SQL API

My use case
Collect events for a particular duration and then group them based on the key.
Objective
After processing, the user can save the data of a particular duration based on the key.
How I am planning to do it
1) Receive events from Kafka
2) Create a data stream of the events
3) Associate a table with it and collect data for a particular duration by running a SQL query
4) Associate a new table with the step-3 output and group the collected data according to the key
5) Save the data in the DB
Solution I tried
I am able to:
1) Receive events from Kafka
2) Set up a data stream (let's say sensorDataStream):
DataStream<SensorEvent> sensorDataStream =
    source.flatMap(new FlatMapFunction<String, SensorEvent>() {
        @Override
        public void flatMap(String catalog, Collector<SensorEvent> out) {
            // create a SensorEvent(id, sensor notification value, notification time) and emit it via out.collect(...)
        }
    });
3) Associate a table (let's say table1) with the data stream and run a SQL query like:
SELECT id, sensorNotif, notifTime FROM SENSORTABLE WHERE notifTime > t1_Timestamp AND notifTime < t2_Timestamp
Here t1_Timestamp and t2_Timestamp are predefined epoch times and will change based on some predefined conditions.
4) I am able to print this SQL query result on the console by using:
tableEnv.toAppendStream(table1, Row.class).print();
5) Created a new table (let's say table2) from table1 with the following type of SQL query:
Table table2 = tableEnv.sqlQuery("SELECT id AS SensorID, COUNT(sensorNotif) AS SensorNotificationCount FROM table1 GROUP BY id");
6) Collected and printed the data by using:
tableEnv.toRetractStream(table2 , Row.class).print();
Problem
1) I am not able to see the output of step 6 on the console.
I did some experiments and found that if I skip the table1 setup step (that is, no clubbing of sensor data for a duration) and directly associate my sensorDataStream with table2, then I can see the output of step 6. But since this is a retract stream, whenever a new event arrives the stream invalidates the previously printed data and prints the newly calculated data.
Suggestions I would like to have
1) How can I merge step 5 and step 6 (meaning table1 and table2)? I have already merged these tables, but since the data is not visible on the console I have doubts. Am I doing something wrong, or is the data merged but just not visible? (See the sketch after this list.)
2) My plan is to:
2.a) Filter the data in two passes: in the first pass, filter the data for a particular interval, and in the second pass, group this data.
2.b) Save the output of 2.a in the DB.
Will this approach work? (I have doubts because I am using a data stream, and table1's output is an append stream while table2's output is a retract stream.)
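For illustration only, a minimal sketch of what merging the two queries (the interval filter from step 3 and the grouping from step 5) into a single statement could look like; t1_Timestamp and t2_Timestamp remain placeholders exactly as in the question:

-- filter the interval and group by key in one pass
SELECT id AS SensorID,
       COUNT(sensorNotif) AS SensorNotificationCount
FROM SENSORTABLE
WHERE notifTime > t1_Timestamp
  AND notifTime < t2_Timestamp
GROUP BY id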
