How to make sure that a Flink job has finished executing and then perform some tasks - apache-flink

I want to perform some tasks after the Flink job is completed. I am not having any issues when I run the code in IntelliJ, but there are issues when I run the Flink jar from a shell file. I am using the line below to make sure that execution of the Flink program is complete:
//start the execution
JobExecutionResult jobExecutionResult = environment.execute(" Started the execution ");
is_job_finished = jobExecutionResult.isJobExecutionResult();
I am not sure if the above check is correct or not.
Then I am using the above variable in the method below to perform some tasks:
if (print_mode && is_job_finished) {
    System.out.println(" \n \n -- System related variables -- \n");
    System.out.println(" Stream_join Window length = " + WindowLength_join__ms + " milliseconds");
    System.out.println(" Input rate for stream RR = " + input_rate_rr_S + " events/second");
    System.out.println("Stream RR Runtime = " + Stream_RR_RunTime_S + " seconds");
    System.out.println(" # raw events in stream RR = " + Total_Number_Of_Events_in_RR + "\n");
}
Any suggestions?

You can register a JobListener with the execution environment.
For example:
env.registerJobListener(new JobListener {
  // Callback on job submission.
  override def onJobSubmitted(jobClient: JobClient, throwable: Throwable): Unit = {
    if (throwable == null) {
      log.info("SUBMIT SUCCESS")
    } else {
      log.info("FAIL")
    }
  }

  // Callback on job execution finished, successfully or unsuccessfully.
  override def onJobExecuted(jobExecutionResult: JobExecutionResult, throwable: Throwable): Unit = {
    if (throwable == null) {
      log.info("SUCCESS")
    } else {
      log.info("FAIL")
    }
  }
})

Register a JobListener to your StreamExecutionEnvironment.
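If your job is written in Java rather than Scala, the equivalent registration looks like the minimal sketch below (the pipeline and log messages are placeholders; JobListener lives in org.apache.flink.core.execution):

import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.core.execution.JobListener;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ListenerExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.registerJobListener(new JobListener() {
            @Override
            public void onJobSubmitted(JobClient jobClient, Throwable throwable) {
                // Callback on job submission; throwable is non-null if submission failed.
                System.out.println(throwable == null ? "SUBMIT SUCCESS" : "SUBMIT FAIL");
            }
            @Override
            public void onJobExecuted(JobExecutionResult jobExecutionResult, Throwable throwable) {
                // Callback when the job finishes; this is the place for post-job tasks.
                System.out.println(throwable == null ? "JOB FINISHED" : "JOB FAILED");
            }
        });
        env.fromElements(1, 2, 3).print(); // placeholder pipeline
        env.execute("listener-example");
    }
}

Note that in attached mode env.execute() blocks until the job finishes, so onJobExecuted fires just before execute() returns.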

JobListener works well as long as you are not using the SQL API.
If you use the SQL API, onJobExecuted will never be called. Here is an idea you can adapt: the source is Kafka, and the sink can be of any type.
Let me explain it:
EndSign: a sentinel record that follows the last data record. When your Flink job has consumed it, the rest of that partition is empty.
Close logic:
When your Flink job processes an EndSign, the job needs to call a JobController, which increments a counter.
Once the counter equals the partition count, the JobController checks the consumer group lag, to ensure the Flink job got all the data.
Now we know the job is finished.
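A minimal Java sketch of the counting part of this idea (JobController, onEndSign, and the lag check are hypothetical names used only for illustration):

import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical controller that waits for one EndSign sentinel per Kafka partition.
public class JobController {
    private final int partitionCount;
    private final AtomicInteger endSignsSeen = new AtomicInteger();

    public JobController(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    // Called by the Flink job each time it processes an EndSign record.
    public void onEndSign() {
        if (endSignsSeen.incrementAndGet() == partitionCount) {
            // Every partition has drained past its EndSign; additionally check the
            // consumer group lag (e.g. via Kafka's AdminClient) before declaring done.
            System.out.println("All partitions saw EndSign -- job is finished");
        }
    }
}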

Related

Running cmd from winform

I have a question about running cmd from a WinForms app.
I have managed to connect to and get information from the remote machine (an Infotrend disk server).
However, I could not perform an operation on the remote machine such as "create disk part".
Below is the code I have written:
InfotrendProcess.StartInfo.UseShellExecute = false;
InfotrendProcess.StartInfo.RedirectStandardInput = true;
InfotrendProcess.StartInfo.RedirectStandardOutput = true;
InfotrendProcess.StartInfo.RedirectStandardError = true;
InfotrendProcess.StartInfo.WorkingDirectory = workingPath;
InfotrendProcess.StartInfo.FileName = "C:\\Windows\\System32\\cmd.exe";
InfotrendProcess.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
InfotrendProcess.StartInfo.CreateNoWindow = true;
const string quote = "\"";
cmd_message = "java -jar " + quote + "runCLI" + quote; // STEP 1: managed to start the CLI run command
InfotrendProcess.StartInfo.Arguments = "/K " + cmd_message;
InfotrendProcess.OutputDataReceived += InfotrendProcess_OutputDataReceived1;
InfotrendProcess.ErrorDataReceived += InfotrendProcess_ErrorDataReceived;
InfotrendProcess.Start();
using (StreamWriter sw = InfotrendProcess.StandardInput) // STEP 2: managed to connect and send the delete part command
{
    sw.WriteLine("connect " + IPNumber);
    Thread.Sleep(1000);
    sw.WriteLine("del part 01D9F2C6614DF837"); // could not send the y/n response; it makes a new line
    sw.WriteLine("y");
}
I could not send the "y/n" response because I have to send it on the same line after the prompt; instead, it starts a new line. Should I use a different approach? Could anyone help me with how I can run the final command?

My H2/c3p0/Hibernate setup does not seem to be preserving prepared statements?

I am finding my database is the bottleneck in my application, and as part of this it looks like prepared statements are not being reused.
For example, here is a method I use:
public static CoverImage findCoverImageBySource(Session session, String src)
{
    try
    {
        Query q = session.createQuery("from CoverImage t1 where t1.source=:source");
        q.setParameter("source", src, StandardBasicTypes.STRING);
        CoverImage result = (CoverImage)q.setMaxResults(1).uniqueResult();
        return result;
    }
    catch (Exception ex)
    {
        MainWindow.logger.log(Level.SEVERE, ex.getMessage(), ex);
    }
    return null;
}
But using the YourKit profiler it says:
com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery() Count 511
com.mchange.v2.c3p0.impl.NewProxyConnection.prepareStatement() Count 511
and I assume that the count for the prepareStatement() call should be lower, as it looks like we create a new prepared statement every time instead of reusing one.
https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html
I am using c3p0 connection pooling, which complicates things a little, but as I understand it I have it configured correctly:
public static Configuration getInitializedConfiguration()
{
    //See https://www.mchange.com/projects/c3p0/#hibernate-specific
    Configuration config = new Configuration();
    config.setProperty(Environment.DRIVER,"org.h2.Driver");
    config.setProperty(Environment.URL,"jdbc:h2:"+Db.DBFOLDER+"/"+Db.DBNAME+";FILE_LOCK=SOCKET;MVCC=TRUE;DB_CLOSE_ON_EXIT=FALSE;CACHE_SIZE=50000");
    config.setProperty(Environment.DIALECT,"org.hibernate.dialect.H2Dialect");
    System.setProperty("h2.bindAddress", InetAddress.getLoopbackAddress().getHostAddress());
    config.setProperty("hibernate.connection.username","jaikoz");
    config.setProperty("hibernate.connection.password","jaikoz");
    config.setProperty("hibernate.c3p0.numHelperThreads","10");
    config.setProperty("hibernate.c3p0.min_size","1");
    //Consider that if we have lots of busy threads waiting on next stages, we could possibly have a lot of active
    //connections.
    config.setProperty("hibernate.c3p0.max_size","200");
    config.setProperty("hibernate.c3p0.max_statements","5000");
    config.setProperty("hibernate.c3p0.timeout","2000");
    config.setProperty("hibernate.c3p0.maxStatementsPerConnection","50");
    config.setProperty("hibernate.c3p0.idle_test_period","3000");
    config.setProperty("hibernate.c3p0.acquireRetryAttempts","10");
    //Cancel any connection that is more than 30 minutes old.
    //config.setProperty("hibernate.c3p0.unreturnedConnectionTimeout","3000");
    //config.setProperty("hibernate.show_sql","true");
    //config.setProperty("org.hibernate.envers.audit_strategy", "org.hibernate.envers.strategy.ValidityAuditStrategy");
    //config.setProperty("hibernate.format_sql","true");
    config.setProperty("hibernate.generate_statistics","true");
    //config.setProperty("hibernate.cache.region.factory_class", "org.hibernate.cache.ehcache.SingletonEhCacheRegionFactory");
    //config.setProperty("hibernate.cache.use_second_level_cache", "true");
    //config.setProperty("hibernate.cache.use_query_cache", "true");
    addEntitiesToConfig(config);
    return config;
}
Using H2 1.3.172, Hibernate 4.3.11 and the corresponding c3p0 for that Hibernate version.
With a reproducible test case we have these Hibernate stats:
HibernateStatistics.getQueryExecutionCount() 28
HibernateStatistics.getEntityInsertCount() 119
HibernateStatistics.getEntityUpdateCount() 39
HibernateStatistics.getPrepareStatementCount() 189
Profiler, method counts:
GooGooStatementCache.acquireStatement() 35
GooGooStatementCache.checkinStatement() 189
GooGooStatementCache.checkoutStatement() 189
NewProxyPreparedStatement.init() 189
I don't know what I should be counting as creation of a prepared statement rather than reuse of an existing prepared statement?
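For reference, those Hibernate counters can be read programmatically; a minimal sketch assuming an open SessionFactory (hibernate.generate_statistics is already set to true in the config above):

import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;

public static void dumpStatementStats(SessionFactory sessionFactory) {
    // Mirrors the HibernateStatistics numbers quoted above.
    Statistics stats = sessionFactory.getStatistics();
    System.out.println("Queries executed:    " + stats.getQueryExecutionCount());
    System.out.println("Entities inserted:   " + stats.getEntityInsertCount());
    System.out.println("Entities updated:    " + stats.getEntityUpdateCount());
    System.out.println("Statements prepared: " + stats.getPrepareStatementCount());
}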
I also tried enabling c3p0 logging by adding a c3p0 logger and making it use the same log file in my LogProperties, but it had no effect:
String logFileName = Platform.getPlatformLogFolderInLogfileFormat() + "songkong_debug%u-%g.log";
FileHandler fe = new FileHandler(logFileName, LOG_SIZE_IN_BYTES, 10, true);
fe.setEncoding(StandardCharsets.UTF_8.name());
fe.setFormatter(new com.jthink.songkong.logging.LogFormatter());
fe.setLevel(Level.FINEST);
MainWindow.logger.addHandler(fe);
Logger c3p0Logger = Logger.getLogger("com.mchange.v2.c3p0");
c3p0Logger.setLevel(Level.FINEST);
c3p0Logger.addHandler(fe);
Now that I have eventually got c3p0-based logging working, I can confirm that the suggestion of @Stevewaldman is correct.
If you enable
public static Logger c3p0ConnectionLogger = Logger.getLogger("com.mchange.v2.c3p0.stmt");
c3p0ConnectionLogger.setLevel(Level.FINEST);
c3p0ConnectionLogger.setUseParentHandlers(false);
Then you get log output of the form
24/08/2019 10.20.12:BST:FINEST: com.mchange.v2.c3p0.stmt.DoubleMaxStatementCache ----> CACHE HIT
24/08/2019 10.20.12:BST:FINEST: checkoutStatement: com.mchange.v2.c3p0.stmt.DoubleMaxStatementCache stats -- total size: 347; checked out: 1; num connections: 13; num keys: 347
24/08/2019 10.20.12:BST:FINEST: checkinStatement(): com.mchange.v2.c3p0.stmt.DoubleMaxStatementCache stats -- total size: 347; checked out: 0; num connections: 13; num keys: 347
making it clear when you get a cache hit. When there is no cache hit you don't get the first line, but you do get the other two lines.
This is using c3p0 0.9.2.1.

Missing events for listeners dronekit

I'm using dronekit with event listeners to keep track of the camera's video recording status. This is because I didn't find a way to identify the recording status directly, so I'm keeping track of the commands that I'm sending and changing the modes if they are successful.
But I observed that my listener is not receiving all events. Is this a common issue? Can it be fixed? Is there a frequency setting that I need to change?
@vehicle.on_message('GOPRO_SET_RESPONSE')
def listener(self, name, message):
    global mode, recording, way_points, nadir_taken
    if message.cmd_id == 2:
        log.debug('Shutter:%s' % message)
        if message.status == 0:
            if mode == MODE_VIDEO:
                if recording:
                    recording = False
                    log.info("Stopped video")
                    # message_handler.set(message_handler.get() + " Stopped Recording.")
                    record_handler.set(NO_STRING)
                    plot.info(STOP_STRING_VIDEO)
                    note.info(STOP_STRING_VIDEO)
                    thread.start_new(speak, (VIDEO_RECORD_ON_MSG,))
                else:
                    recording = True
                    log.info("started recording video")
                    # message_handler.set(message_handler.get() + "\n Started Recording.")
                    record_handler.set(YES_STRING)
                    plot.info(START_STRING_VIDEO)
                    note.info(START_STRING_VIDEO)
                    thread.start_new(speak, (VIDEO_RECORD_OFF_MSG,))
            else:
                log.info("Image Captured at %s", str(loc))
    else:
        log.info('Unidentified Message:%s' % message)

Solr 6.0.0 - SolrCloud java example

I have Solr installed on my localhost.
I started the standard Solr cloud example with embedded ZooKeeper:
collection: gettingstarted
shards: 2
replication: 2
Processing 500 records/docs took 115 seconds [localhost testing].
Why is it taking this much time to process just 500 records?
Is there a way to improve this to some milliseconds/nanoseconds?
NOTE:
I have tested the same against a remote-machine Solr instance, with localhost indexing data on the remote Solr [commented out inside the Java code].
I started my Solr myCloudData collection as an ensemble with a single ZooKeeper:
2 Solr nodes,
1 standalone ZooKeeper ensemble
collection: myCloudData,
shards: 2,
replication: 2
SolrCloud Java code:
package com.test.solr.basic;

import java.io.IOException;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class SolrjPopulatorCloudClient2 {

    public static void main(String[] args) throws IOException, SolrServerException {
        //String zkHosts = "64.101.49.57:2181/solr";
        String zkHosts = "localhost:9983";
        CloudSolrClient solrCloudClient = new CloudSolrClient(zkHosts, true);
        //solrCloudClient.setDefaultCollection("myCloudData");
        solrCloudClient.setDefaultCollection("gettingstarted");
        /*
        // Thread Safe
        solrClient = new ConcurrentUpdateSolrClient(urlString, queueSize, threadCount);
        */
        // Deprecated - client
        //HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        long start = System.nanoTime();
        for (int i = 0; i < 500; ++i) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("cat", "book");
            doc.addField("id", "book-" + i);
            doc.addField("name", "The Legend of the Hobbit part " + i);
            solrCloudClient.add(doc);
            if (i % 100 == 0)
                System.out.println(" Every 100 records flush it");
            solrCloudClient.commit(); // periodically flush
        }
        solrCloudClient.commit();
        solrCloudClient.close();
        long end = System.nanoTime();
        long seconds = TimeUnit.NANOSECONDS.toSeconds(end - start);
        System.out.println(" All records are indexed, took " + seconds + " seconds");
    }
}
You are committing after every new document, which is not necessary. It will run a lot faster if you change the if (i % 100 == 0) block to read:
if (i % 100 == 0) {
    System.out.println(" Every 100 records flush it");
    solrCloudClient.commit(); // periodically flush
}
On my machine, this indexes your 500 records in 14 seconds. If I remove the commit() call from the for loop, it indexes in 7 seconds.
Alternatively, you can add a commitWithinMs parameter to the solrCloudClient.add() call:
solrCloudClient.add(doc, 15000);
This will guarantee your records are committed within 15 seconds, and also increase your indexing speed.
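Separately from commit frequency (this is a common SolrJ pattern rather than part of the answer above), every add() call is a network round trip, so batching documents into a single add(Collection, commitWithinMs) call also helps; a sketch assuming the same solrCloudClient:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.common.SolrInputDocument;

List<SolrInputDocument> batch = new ArrayList<>();
for (int i = 0; i < 500; ++i) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("cat", "book");
    doc.addField("id", "book-" + i);
    doc.addField("name", "The Legend of the Hobbit part " + i);
    batch.add(doc);
    if (batch.size() == 100) {
        solrCloudClient.add(batch, 15000); // one round trip per 100 docs, commit within 15s
        batch.clear();
    }
}
if (!batch.isEmpty()) {
    solrCloudClient.add(batch, 15000);
}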

Apache Camel consumer template to copy file, cannot copy one file twice

Hi, I am using Apache Camel 2.15.2. I have a consumer template so that I can copy files with dynamic file names:
if (fileInfo != null) {
    filename = fileInfo.getFileName();
    String camelUri = "file://" + fileInfo.getCopyFilePath() + "/?fileName=RAW("
            + filename + ")&noop=false&idempotent=false&readLock=changed";
    System.out.println("Camel uri: " + camelUri);
    logger.info("Camel uri: " + camelUri);
    Exchange ex = consumerTemplate.receive(camelUri);
    ....
As you can see, I have set noop and idempotent explicitly to achieve copying the same file more than once. But it does not do that: it hangs on the receive method on subsequent tries to copy a file with the same name. It can copy the file again only if we restart the application. Any suggestions would be much appreciated. It might be something similar to this issue, but I do not have access to that solution. Thanks in advance.
When I debugged through the Camel code, it seems it is calling EventDrivenPollingConsumer's receive method, and it hangs when it calls queue.take() (line 110, EventDrivenPollingConsumer). And even inside that, the 'count' variable is zero in ArrayBlockingQueue:
while (count == 0)
    notEmpty.await();
Adding this just in case it helps anyone having a clue.
OK, if I call consumerTemplate.doneUoW(ex), it does copy multiple times. But at the same time it was deleting the source file (actually moving it to the .camel folder), which I did not want! I then had to set noop=true:
if (fileInfo != null) {
    filename = fileInfo.getFileName();
    String camelUri = "file://" + fileInfo.getCopyFilePath() + "/?fileName=RAW("
            + filename + ")&noop=true&idempotent=false&readLock=none";
    System.out.println("Camel uri: " + camelUri);
    logger.info("Camel uri: " + camelUri);
    Exchange ex = consumerTemplate.receive(camelUri);
    // consumerTemplate.r
    logger.info("File received: " + fileInfo.getFileName());
    exchange.getOut().setBody(ex.getIn().getBody());
    consumerTemplate.doneUoW(ex);
}
Now, it works as expected.
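If you want to guard against the indefinite hang on queue.take() altogether, ConsumerTemplate also has a receive variant that takes a timeout in milliseconds and returns null when nothing arrives in time; a sketch assuming the same camelUri as above:

// Wait at most 5 seconds for a file; a null Exchange means nothing arrived in time.
Exchange ex = consumerTemplate.receive(camelUri, 5000);
if (ex != null) {
    exchange.getOut().setBody(ex.getIn().getBody());
    consumerTemplate.doneUoW(ex);
} else {
    logger.info("No file received within timeout");
}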
