Flink JDBC sink not committing in web UI - apache-flink

I have a problem with one of my newly developed Flink jobs.
When I run it in IntelliJ, the job works fine and commits records to the database.
The next step was to upload it to the Flink web UI and execute it there.
The database connection is established and the inserts appear to be sent to the Oracle database, but the data does not seem to be committed.
I'm using a DataStream with the following setup:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10000); // checkpoint interval in milliseconds (10 seconds)
...
DataStreamSink<POJO> pojoSink = filteredStream
    .addSink(JdbcSink.sink(
        sqlString,
        JdbcStatementBuilder,
        new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
            .withUrl(url)
            .withDriverName(driver)
            .withUsername(user)
            .withPassword(password)
            .build()));
I have no clue why it works on my laptop in the IDE but not on the server via the web UI.
The server logs show no errors and do show the checkpoints.
Maybe someone has a suggestion about where I can look to find the problem.
Cheers

It seems to have been a one-time error. The next time, the job ran perfectly.

Related

Using Flink SQL client to submit SQL query. How do I restore from a checkpoint or savepoint?

I have started a local Flink cluster using
./start-cluster.sh
I have started a local SQL client using
./sql-client.sh
I am able to submit SQL statements in the Flink SQL terminal.
I have run SET 'state.checkpoints.dir' = 'file:///tmp/flink-savepoints-directory-from-set'; and I can see the checkpoint folder being created and updated while the SQL job is running. (The SQL job reads from a Kafka topic, does some joins, and writes to another topic.)
When I cancel the job from the Flink UI and submit the SQL again, the job does not restore from the state. (I am basing this on the fact that the final sink emits the same messages on every restart, as if the job were reading the source topic from the beginning again.)
I have not shut down the Flink cluster or the Kafka cluster.
I have two questions:
How do I get the SQL query to restore from state?
Is there a way to use the flink run -s ... command to submit a SQL query directly instead of packaging it as a jar?
This is documented at https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sqlclient/#start-a-sql-job-from-a-savepoint
Set the savepoint path in the SQL client before submitting the query:
SET 'execution.savepoint.path' = '/tmp/flink-savepoints/savepoint-cca7bc-bb1e257f0dab';

Purging Logs in SQL Database - Serilog

I am using Serilog for logging in my Web API and it is working fine. I log to SQL Server, and the following is the Serilog config for that.
__serilogLogger = new LoggerConfiguration()
    .Enrich.WithProperty("ApplicationIPv4", _ipv4)
    .Enrich.WithProperty("ApplicationIPv6", _ipv6)
    .WriteTo.MSSqlServer(connectionString, tableName /*, columnOptions: columnOptions*/)
    .WriteTo.Seq(ConfigurationManager.AppSettings["SerilogServer"])
    .CreateLogger();
I am a beginner with Serilog. My confusion is how to purge the logs in the database. Is there any option in Serilog to keep only, say, the last 3 months of data?
Based on a chat in the Serilog Gitter, there is no option for that. It can be done using a SQL Server Agent job or any other scheduled job.

Oracle change notification Exception

One of my tables holds data for business transactions, and I have to run a job when there has been no transaction for an interval of 5 minutes. I am trying to achieve this using Timer() in Java. To get notified when any transaction is executed, I need some kind of trigger (I do not have code access, as it is a 3rd-party tool), and for that purpose I am using database change notification.
However, while running this I get the error below very often. I am using Java 1.6 and ojdbc6.jar for the connection, and the application is running on WebLogic with an Oracle 11g database.
Exception in thread "Thread-4" java.lang.IndexOutOfBoundsException
    at java.nio.Buffer.checkIndex(Buffer.java:540)
    at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
    at oracle.jdbc.driver.NTFConnection.unmarshalOneNSPacket(NTFConnection.java:334)
    at oracle.jdbc.driver.NTFConnection.run(NTFConnection.java:182)
Please adapt the example at http://appcrawler.com/wordpress/2012/08/28/jdbc-and-oracle-database-change-notification/ for your listener and check whether the issue still exists. My understanding is that the issue is not related to the Oracle DB but to the Java implementation in your code. Please also add the java tag to your question.
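The "run a job after 5 minutes without a transaction" part of the question can be sketched independently of the notification error. Below is a minimal sketch, assuming the change-notification listener calls touch() on every event. IdleWatchdog, touch and shutdown are hypothetical names, and the sketch uses ScheduledExecutorService instead of the Timer mentioned in the question. It avoids lambdas so that it should also compile on Java 1.6.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a task once no activity was reported for a quiet period. */
public class IdleWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable onIdle;
    private final long quietMillis;
    private ScheduledFuture<?> pending;

    public IdleWatchdog(long quietMillis, Runnable onIdle) {
        this.quietMillis = quietMillis;
        this.onIdle = onIdle;
        touch(); // start the first countdown
    }

    /** Assumed to be called from the change-notification listener on every event. */
    public synchronized void touch() {
        if (pending != null) {
            pending.cancel(false); // drop the countdown that was running
        }
        // Restart the countdown; onIdle runs only if no further touch() arrives in time.
        pending = scheduler.schedule(onIdle, quietMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the scheduler thread; touch() must not be called afterwards. */
    public void shutdown() {
        scheduler.shutdownNow();
    }
}
```

For the 5-minute case, the watchdog would be created as new IdleWatchdog(5 * 60 * 1000L, job), where job is the Runnable to execute. After the task has run, the next touch() starts a new countdown.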

Data synchronization issue when using a connection pool on Tomcat server

I am developing a Servlet application. It obtains a database connection from the connection pool provided by the Tomcat container to query and update database data.
I have run into a problem. The Servlet gets a database connection and then adds a new table row or deletes a table row. After that, it commits the change. Later, a connection is obtained to execute queries. I find that the data returned by the queries on the second connection do not reflect the change made with the first database connection.
Isn't it strange? The changes made with the first database connection have been committed successfully. Why do the newly inserted rows not appear in the later query? Why do the deleted rows still appear in the later query?
Is it related to the transaction isolation level setting?
Can anyone help?
03-12: More Information (#1):
I use MySQL Community Server 5.6.
My servlet runs on Tomcat 7.0.41.0.
The Resource element in the conf/server.xml is as follows:
<Resource type="javax.sql.DataSource"
    name="jdbc/storewscloud"
    factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
    driverClassName="com.mysql.jdbc.Driver"
    url="jdbc:mysql://localhost:3306/myappdb"
    maxActive="100"
    minIdle="10"
    maxWait="10000"
    initialSize="10"
    removeAbandonedTimeout="60"
    removeAbandoned="true"
    logAbandoned="true"
    username="root"
    password="xxxxxxxxxx"
/>
I do not use any cache explicitly.
Every time the servlet gets a database connection, it turns the auto-commit mode of the connection off.
When the servlet is invoked, a database connection is obtained. The servlet uses it to update data in the database. After that, it commits the changes. Then, it uses Apache HttpClient to invoke the same servlet to do something else, which also obtains a database connection and executes a query. The later query returns 'old' data. If I refresh the web page, the latest data are shown. It looks like some party, the MySQL JDBC driver or the connection object, caches the data somewhere. I have no clue.
03-12: More Information (#2):
I did an experiment in which I obtained a connection without using the connection pool. The result was correct. So, the problem is caused by the connection pool.
To make the query on the 2nd connection from the pool return the right data, I need not only to commit the data changes on the 1st connection from the pool but also to CLOSE the 1st connection.
It seems that the data changes are not completely saved in the database, even after commit() is called, until close() is called.
Why?
I found that a new version of the C3P0 connection pool was released recently. I gave it a try. It works! The problems I had do not occur. Therefore, I use it to replace the bundled connection pool of the Tomcat server. For those who encounter the same problem as I did, C3P0 may be a solution for you too.
C3P0 Project URL
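The observation above, that the second connection only saw the changes once the first was both committed and closed, suggests ending the transaction and returning the connection to the pool in one place. The following is a minimal sketch, assuming a javax.sql.DataSource obtained from the container. PooledTx, Work and inTransaction are hypothetical names, and this is not a confirmed fix for the behavior described in the question.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper that commits (or rolls back) and always closes a pooled connection. */
public class PooledTx {

    /** Hypothetical callback type for the statements executed inside one transaction. */
    public interface Work {
        void run(Connection con) throws SQLException;
    }

    public static void inTransaction(DataSource ds, Work work) throws SQLException {
        Connection con = ds.getConnection();
        try {
            con.setAutoCommit(false); // the servlet in the question also disables auto-commit
            try {
                work.run(con);
                con.commit();
            } catch (SQLException e) {
                con.rollback(); // do not leave a half-finished transaction on the connection
                throw e;
            } catch (RuntimeException e) {
                con.rollback();
                throw e;
            }
        } finally {
            con.close(); // for a pooled connection this hands it back to the pool
        }
    }
}
```

With this shape, no code path keeps a pooled connection open after its commit, which matches the commit-then-close sequence that produced correct results in the experiment above.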

Manage Test Data: Can you enlist all db connections into a single transaction?

We are currently using WatiN to do UI testing on our web application. In effect we are doing integration testing from top to bottom, since we are using a test database and not mocking.
In order to make sure the test database is in an expected state, we have previously been using SQL Server's snapshot feature to roll back the database at the beginning of each test. This is fairly slow and also causes an error immediately after the snapshot is restored.
Since each of the tests invokes the UI and potentially uses multiple db connections, we have no way of starting a transaction on each connection.
I was wondering if it is possible to somehow attach all database connections to a single transaction and roll them back at a later point? This would probably have to happen at the db level itself.
If anyone has any other suggestions on how to reset our test data for each UI test I'd love to hear your ideas.
If you fire up, in process, an instance of the Visual Studio development web server, and then run your WatiN test, then you can wrap the test in a single block like so:
using (new TransactionScope())
{
    var server = new Server(PORT_NUMBER, VIRTUAL_PATH, PHYSICAL_PATH);
    server.Start();
    try
    {
        using (var ie = new IE())
        {
            // TODO: perform necessary testing using ie object
        }
    }
    finally
    {
        server.Stop();
    }
}
and all your database connections will in theory enlist in a single distributed transaction and their changes will all be rolled back when the TransactionScope is disposed.
To run the dev web server in process, you will need to extract WebDev.WebHost.dll from the GAC and reference it in your project - this is the source of the Server class in the snippet above. Please let me know if you need more detailed instructions.
You'll need to make sure MSDTC is running, and if there are firewalls between you and the databases then depending on the port settings you may struggle. One added bonus of firing up the server in process is that WatiN tests can now contribute to measurements of code coverage.

Resources