Flink Task Manager hangs - flink-streaming

Here is the program:
final StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);
ParameterTool parameters = ParameterTool.fromArgs(args);
String ftpUri = "ftp://data-injest:12345#ftp-server:21/input/" + parameters.get("input-file-name");
env.readTextFile(ftpUri, "UTF-8")
        .map(mapFunction)
        .keyBy(tuple2 -> tuple2.f0)
        .window(TumblingProcessingTimeWindows.of(Time.seconds(2)))
        .reduce((tuple2, t1) -> {
            // merge the two collections for this key
            Collection<OpisRecord> newCol = new ArrayList<>(tuple2.f1);
            newCol.addAll(t1.f1);
            return new Tuple2<>(tuple2.f0, newCol);
        })
        .addSink(new SinktoDistributedCache());
env.execute();
Works fine for record sizes from 10k to 40k, but hangs for anything above 40k.
I have tried increasing the number of task managers and the parallelism, but with no gain.
Any clues?

Related

Cumulative Acknowledgement is not happening in the flink-connector-pulsar

We are using the following libraries:
Flink - 1.15.0
Pulsar - 2.8.2
flink-connector-pulsar - 1.15.0
TestJob.java
public class TestJob {
    public static void main(String[] args) throws Exception {
        String authParams = String.format("token:%s", PULSAR_CLIENT_AUTH_TOKEN);
        String topicPattern = "persistent://a/b/test";
        List<String> topics = new ArrayList<>();
        topics.add(topicPattern);

        Properties properties = new Properties();
        properties.setProperty(PulsarOptions.PULSAR_AUTH_PLUGIN_CLASS_NAME.key(),
                AuthenticationToken.class.getName());
        properties.setProperty(PulsarOptions.PULSAR_AUTH_PARAMS.key(), authParams);
        properties.setProperty(PulsarOptions.PULSAR_TLS_TRUST_CERTS_FILE_PATH.key(), PULSAR_CERT_PATH);
        properties.setProperty(PulsarOptions.PULSAR_SERVICE_URL.key(), PULSAR_HOST);
        properties.setProperty(PulsarOptions.PULSAR_CONNECT_TIMEOUT.key(), "600000");
        properties.setProperty(PulsarOptions.PULSAR_READ_TIMEOUT.key(), "600000");
        properties.setProperty(PulsarSourceOptions.PULSAR_ENABLE_AUTO_ACKNOWLEDGE_MESSAGE.key(), Boolean.TRUE.toString());
        properties.setProperty(PulsarOptions.PULSAR_REQUEST_TIMEOUT.key(), "600000");

        PulsarSource<String> src = PulsarSource.builder()
                .setServiceUrl(PULSAR_HOST)
                .setAdminUrl(PULSAR_ADMIN_HOST)
                .setProperties(properties)
                .setConfig(PulsarSourceOptions.PULSAR_PARTITION_DISCOVERY_INTERVAL_MS, 10000000L)
                .setStartCursor(StartCursor.earliest())
                .setDeserializationSchema(PulsarDeserializationSchema.flinkSchema(new SimpleStringSchema()))
                .setSubscriptionName("test-subscription-local")
                .setSubscriptionType(SubscriptionType.Failover)
                .setConsumerName("test-consumer-local")
                .setTopics(topics)
                .build();

        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.getConfig().setAutoWatermarkInterval(0L);
        env.addDefaultKryoSerializer(DateTime.class, JodaDateTimeSerializer.class);

        String sourceName = "pulsar-source-local";
        DataStream<String> stream = env.fromSource(src, WatermarkStrategy.noWatermarks(), sourceName)
                .setParallelism(1)
                .uid(sourceName)
                .name(sourceName);

        stream
                .process(new TestProcessFunction()).setParallelism(1)
                .uid("test-job-pf")
                .name("test-job-pf")
                .addSink(new TestSink()).setParallelism(1)
                .uid("sink-job")
                .name("sink-job");

        env.execute("TestJob"); // required to actually submit the job (assumed; not shown in the snippet as posted)
    }
}
Messages = M-1 ..... M-10
Expected behavior:
Once a message has been acknowledged, it should not appear again.
Observed behavior:
Upon job restart, even after ensuring the job has processed all the messages, the messages keep coming back.
We saw that the cumulativeAcknowledgement() function is invoked every time, with or without checkpointing enabled.

Job Executor: job manager or task manager?

public static void main(String[] args) throws Exception {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setRuntimeMode(RuntimeExecutionMode.BATCH);
    ParameterTool parameters = ParameterTool.fromArgs(args);
    String ftpUri = "ftp://data-injest:12345#ftp-server:21/input/" + parameters.get("input-file-name");
    String fileUri = parameters.get("ftp").toUpperCase(Locale.ROOT).equals("TRUE") ? ftpUri : localUri;
    MapFunction<String, Tuple2<Long, Collection<Some>>> mapFunction = { some code };
    SomeSink sink = new SomeSink();
    env.readTextFile(fileUri, "UTF-8")
            .map(mapFunction)
            .keyBy(tuple2 -> tuple2.f0)
            .reduce((tuple2, t1) -> {
                // some logic, including loggers
            })
            .addSink(sink);
    env.execute("OPIS-PRICE-FEED-with-" + parameters.get("input-file-name"));
}
Which node executes this logic, e.g. the ftpUri definition above?
I have tried attaching a debugger to both the job manager and the task manager with breakpoints, but I don't see those lines being hit.
If a logger statement is added in the same section, which node's log would contain it?
That setup code is executed in the client, and not in the job manager or task managers.
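A quick way to see the split for yourself is to log from both places; the job below is a minimal, purely illustrative sketch and not taken from your program. Anything executed while the job graph is being assembled appears in the client's output, while anything inside an operator runs on a task manager and appears in that task manager's log:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class WhereDoesItRun {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Runs in the client while the job graph is being built,
        // so this line shows up in the client's output, not on the cluster.
        System.out.println("building the job graph in the client");

        env.fromElements(1, 2, 3)
                .map(value -> {
                    // Runs inside a task, so this line shows up in a task manager's log/stdout.
                    System.out.println("processing " + value + " on a task manager");
                    return value * 2;
                })
                .print();

        env.execute("where-does-it-run");
    }
}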

Flink JDBC Sink part 2

I posted a question a few days back: Flink JDBC sink.
Now I am trying to use the sink provided by Flink.
I have written the code and it runs, but nothing gets saved to the DB and no exceptions are thrown. With my previous sink the job never finished (which is expected, since it is a streaming app), but with the following code I get no errors and nothing is saved to the DB.
public class CompetitorPipeline implements Pipeline {

    private final StreamExecutionEnvironment streamEnv;
    private final ParameterTool parameter;
    private static final Logger LOG = LoggerFactory.getLogger(CompetitorPipeline.class);

    public CompetitorPipeline(StreamExecutionEnvironment streamEnv, ParameterTool parameter) {
        this.streamEnv = streamEnv;
        this.parameter = parameter;
    }

    @Override
    public KeyedStream<CompetitorConfig, String> start(ParameterTool parameter) throws Exception {
        CompetitorConfigChanges competitorConfigChanges = new CompetitorConfigChanges();
        KeyedStream<CompetitorConfig, String> competitorChangesStream = competitorConfigChanges.run(streamEnv, parameter);

        // Add the JDBC sink
        competitorChangesStream.addSink(JdbcSink.sink(
                "insert into competitor_config_universe(marketplace_id,merchant_id, competitor_name, comp_gl_product_group_desc," +
                        "category_code, competitor_type, namespace, qualifier, matching_type," +
                        "zip_region, zip_code, competitor_state, version_time, compConfigTombstoned, last_updated) values (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)",
                (ps, t) -> {
                    ps.setInt(1, t.getMarketplaceId());
                    ps.setLong(2, t.getMerchantId());
                    ps.setString(3, t.getCompetitorName());
                    ps.setString(4, t.getCompGlProductGroupDesc());
                    ps.setString(5, t.getCategoryCode());
                    ps.setString(6, t.getCompetitorType());
                    ps.setString(7, t.getNamespace());
                    ps.setString(8, t.getQualifier());
                    ps.setString(9, t.getMatchingType());
                    ps.setString(10, t.getZipRegion());
                    ps.setString(11, t.getZipCode());
                    ps.setString(12, t.getCompetitorState());
                    ps.setTimestamp(13, Timestamp.valueOf(t.getVersionTime()));
                    ps.setBoolean(14, t.isCompConfigTombstoned());
                    ps.setTimestamp(15, new Timestamp(System.currentTimeMillis()));
                    System.out.println("sql" + ps);
                },
                new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                        .withUrl("jdbc:mysql://127.0.0.1:3306/database")
                        .withDriverName("com.mysql.cj.jdbc.Driver")
                        .withUsername("xyz")
                        .withPassword("xyz#")
                        .build()));
        return competitorChangesStream;
    }
}
You need to enable auto-commit mode for the JDBC sink:
new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
        .withUrl("jdbc:mysql://127.0.0.1:3306/database;autocommit=true")
It looks like SimpleBatchStatementExecutor only works in auto-commit mode. If you need to commit and roll back batches, then you have to write your own JdbcBatchStatementExecutor.
Have you tried including JdbcExecutionOptions?
dataStream.addSink(JdbcSink.sink(
        sql_statement,
        (statement, value) -> {
            /* Prepared Statement */
        },
        JdbcExecutionOptions.builder()
                .withBatchSize(5000)
                .withBatchIntervalMs(200)
                .withMaxRetries(2)
                .build(),
        new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                .withUrl("jdbc:mysql://127.0.0.1:3306/database")
                .withDriverName("com.mysql.cj.jdbc.Driver")
                .withUsername("xyz")
                .withPassword("xyz#")
                .build()));

Apache Flink EventTime processing not working

I am trying to perform a stream-stream join using a Flink v1.11 app on KDA. The join works with ProcessingTime, but with EventTime I don't see any output records from Flink.
Here is my code with EventTime processing, which is not working:
public static void main(String[] args) throws Exception {
    final StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

    DataStream<Trade> input1 = createSourceFromInputStreamName1(env)
            .assignTimestampsAndWatermarks(
                    WatermarkStrategy.<Trade>forMonotonousTimestamps()
                            .withTimestampAssigner((event, l) -> event.getEventTime())
            );
    DataStream<Company> input2 = createSourceFromInputStreamName2(env)
            .assignTimestampsAndWatermarks(
                    WatermarkStrategy.<Company>forMonotonousTimestamps()
                            .withTimestampAssigner((event, l) -> event.getEventTime())
            );

    DataStream<String> joinedStream = input1.join(input2)
            .where(new TradeKeySelector())
            .equalTo(new CompanyKeySelector())
            .window(TumblingEventTimeWindows.of(Time.seconds(30)))
            .apply(new JoinFunction<Trade, Company, String>() {
                @Override
                public String join(Trade t, Company c) {
                    return t.getEventTime() + ", " + t.getTicker() + ", " + c.getName() + ", " + t.getPrice();
                }
            });

    joinedStream.addSink(createS3SinkFromStaticConfig());
    env.execute("Flink S3 Streaming Sink Job");
}
I got a similar join working with ProcessingTime
public static void main(String[] args) throws Exception {
    final StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);

    DataStream<Trade> input1 = createSourceFromInputStreamName1(env);
    DataStream<Company> input2 = createSourceFromInputStreamName2(env);

    DataStream<String> joinedStream = input1.join(input2)
            .where(new TradeKeySelector())
            .equalTo(new CompanyKeySelector())
            .window(TumblingProcessingTimeWindows.of(Time.milliseconds(10000)))
            .apply(new JoinFunction<Trade, Company, String>() {
                @Override
                public String join(Trade t, Company c) {
                    return t.getEventTime() + ", " + t.getTicker() + ", " + c.getName() + ", " + t.getPrice();
                }
            });

    joinedStream.addSink(createS3SinkFromStaticConfig());
    env.execute("Flink S3 Streaming Sink Job");
}
Sample records from two streams which I am trying to join:
{'eventTime': 1611773705, 'ticker': 'TBV', 'price': 71.5}
{'eventTime': 1611773705, 'ticker': 'TBV', 'name': 'The Bavaria'}
I don't see anything obviously wrong, but any of the following could cause this job to produce no output:
A problem with watermarking. For example, if one of the streams becomes idle, the watermarks cease to advance. If there are no events after a window, the watermark will not advance far enough to close that window. And if the timestamps aren't actually in ascending order (with the forMonotonousTimestamps strategy, the events should be in order by timestamp), the pipeline could be silently dropping all of the out-of-order events. (One way to handle idle streams is sketched below.)
The StreamingFileSink only finalizes its output during checkpointing, and does not finalize whatever files are pending if and when the job is stopped.
A windowed join behaves like an inner join and requires at least one event from each input stream in a given window interval in order to produce any results for that window. From the example you shared, this does not look like the issue.
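If idle inputs turn out to be the culprit, a rough sketch of a mitigation, reusing the Trade source from your job, would look like the following. The 30-second idleness timeout and 60-second checkpoint interval are arbitrary assumptions, and withIdleness takes a java.time.Duration:

// Checkpointing also lets the StreamingFileSink finalize its pending files.
env.enableCheckpointing(60_000);

DataStream<Trade> input1 = createSourceFromInputStreamName1(env)
        .assignTimestampsAndWatermarks(
                WatermarkStrategy.<Trade>forMonotonousTimestamps()
                        // If no events arrive for 30 seconds, mark this stream as idle so the
                        // overall watermark can still advance and close open windows.
                        .withIdleness(Duration.ofSeconds(30))
                        .withTimestampAssigner((event, l) -> event.getEventTime()));

The Company stream would get the same treatment.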
Update:
Given that what you (appear to) want to do is to join each Trade with the latest Company record available at the time of the Trade, a lookup join or a temporal table join seem like they might be good approaches.
Here are a couple of examples:
https://github.com/ververica/flink-sql-cookbook/blob/master/joins/04/04_lookup_joins.md
https://github.com/ververica/flink-sql-cookbook/blob/master/joins/03/03_kafka_join.md
Some documentation:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/joins.html#event-time-temporal-join
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/versioned_tables.html
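For orientation, here is a rough sketch of what an event-time temporal join looks like in Flink SQL, following the event-time temporal join documentation linked above. This syntax requires Flink 1.12 or later, and it assumes Trades and Companies are already registered tables where Companies is a versioned table (primary key ticker plus an event-time attribute); the table and column names here are illustrative, not taken from your job:

// Each trade is joined with the version of the company row that was
// current as of the trade's event time.
Table joined = tableEnv.sqlQuery(
        "SELECT t.trade_time, t.ticker, c.name, t.price " +
        "FROM Trades AS t " +
        "JOIN Companies FOR SYSTEM_TIME AS OF t.trade_time AS c " +
        "ON t.ticker = c.ticker");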

Flink SQL: How can I use a Long type column as rowtime?

Flink 1.9.1
I am reading a CSV file and want to use a Long type column in a TUMBLE window.
I use a UDF to convert the Long value to the Timestamp type, but it doesn't work.
Error message: Window can only be defined over a time attribute column.
I tried to debug: the column's type is a plain Timestamp rather than a TimeIndicatorRelDataType, and I don't know how to convert it, or why that is required.
def isTimeIndicatorType(relDataType: RelDataType): Boolean = relDataType match {
  case ti: TimeIndicatorRelDataType => true
  case _ => false
}
CODE
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    env.setParallelism(1);

    // read csv
    URL fileUrl = HotItemsSql.class.getClassLoader().getResource("UserBehavior-less.csv");
    CsvTableSource csvTableSource = CsvTableSource.builder().path(fileUrl.getPath())
            .field("userId", BasicTypeInfo.LONG_TYPE_INFO)
            .field("itemId", BasicTypeInfo.LONG_TYPE_INFO)
            .field("categoryId", BasicTypeInfo.LONG_TYPE_INFO)
            .field("behavior", BasicTypeInfo.LONG_TYPE_INFO)
            .field("optime", BasicTypeInfo.LONG_TYPE_INFO)
            .build();

    // convert to a stream with ascending timestamps
    DataStream<Row> csvDataStream = csvTableSource.getDataStream(env)
            .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Row>() {
                @Override
                public long extractAscendingTimestamp(Row element) {
                    return Timestamp.valueOf(element.getField(5).toString()).getTime();
                }
            }).broadcast();

    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
    tableEnv.registerDataStream("T_UserBehavior", csvDataStream, "userId,itemId,categoryId,behavior,optime");
    tableEnv.registerFunction("Long2DateTime", new DateTransFunction());

    Table result = tableEnv.sqlQuery("select userId," +
            "TUMBLE_START(Long2DateTime(optime), INTERVAL '10' SECOND) as window_start," +
            "TUMBLE_END(Long2DateTime(optime), INTERVAL '10' SECOND) as window_end " +
            "from T_UserBehavior " +
            "group by TUMBLE(Long2DateTime(optime),INTERVAL '10' SECOND),userId");
    tableEnv.toRetractStream(result, Row.class).print();

    env.execute(); // required to actually run the job (assumed; not shown in the original snippet)
}
UDF
import java.sql.Timestamp;

public class DateTransFunction extends ScalarFunction {
    public Timestamp eval(Long longTime) {
        try {
            return new Timestamp(longTime);
        } catch (Exception e) {
            return null;
        }
    }
}
error stack
Exception in thread "main" org.apache.flink.table.api.ValidationException: Window can only be defined over a time attribute column.
at org.apache.flink.table.plan.rules.datastream.DataStreamLogicalWindowAggregateRule.getOperandAsTimeIndicator$1(DataStreamLogicalWindowAggregateRule.scala:85)
at org.apache.flink.table.plan.rules.datastream.DataStreamLogicalWindowAggregateRule.translateWindowExpression(DataStreamLogicalWindowAggregateRule.scala:90)
at org.apache.flink.table.plan.rules.common.LogicalWindowAggregateRule.onMatch(LogicalWindowAggregateRule.scala:68)
at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560)
at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419)
at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256)
at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215)
at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202)
at org.apache.flink.table.plan.Optimizer.runHepPlanner(Optimizer.scala:228)
at org.apache.flink.table.plan.Optimizer.runHepPlannerSequentially(Optimizer.scala:194)
at org.apache.flink.table.plan.Optimizer.optimizeNormalizeLogicalPlan(Optimizer.scala:150)
at org.apache.flink.table.plan.StreamOptimizer.optimize(StreamOptimizer.scala:65)
at org.apache.flink.table.planner.StreamPlanner.translateToType(StreamPlanner.scala:410)
at org.apache.flink.table.planner.StreamPlanner.org$apache$flink$table$planner$StreamPlanner$$translate(StreamPlanner.scala:182)
Since you already managed to assign a timestamp in the DataStream API, you should be able to call:
tableEnv.registerDataStream(
    "T_UserBehavior",
    csvDataStream,
    "userId, itemId, categoryId, behavior, rt.rowtime");
The .rowtime suffix instructs the API to create a column from the timestamp stored in every stream record coming from the DataStream API.
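With that registration in place, the query from the question can group on the new rt attribute directly, and the Long2DateTime UDF is no longer needed inside TUMBLE (a sketch reusing your table and column names):

// rt is now a proper rowtime attribute, so TUMBLE can be defined on it directly.
Table result = tableEnv.sqlQuery("select userId," +
        "TUMBLE_START(rt, INTERVAL '10' SECOND) as window_start," +
        "TUMBLE_END(rt, INTERVAL '10' SECOND) as window_end " +
        "from T_UserBehavior " +
        "group by TUMBLE(rt, INTERVAL '10' SECOND),userId");
tableEnv.toRetractStream(result, Row.class).print();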
The community is currently working on making programs like yours easier to write. In Flink 1.10 you should be able to define your CSV table, including the rowtime attribute, directly in a SQL DDL.
