How to handle query timeout for Snowflake JDBC?

We are using the JDBC driver to connect to Snowflake and perform inserts, calling setQueryTimeout on the PreparedStatement to get the desired timeout behavior. Auto-commit is left at its default, i.e. enabled.
We are observing that on timeout the driver tries to cancel the query, yet the query still commits data into the table.
Below is a sample program which uses a 1-second timeout to reproduce the scenario quickly:
boolean flag = false;
PreparedStatement ps = connection.prepareStatement("insert into Test_Int values (?)");
ps.setQueryTimeout(1);
for (int v = 1; v < 200; v++) {
    ps.setInt(1, v);
    ps.addBatch();
    flag = true;
    if (v % 50 == 0) {
        try {
            ps.executeBatch();
            flag = false;
        } catch (SQLException se) {
            // do not stop execution; continue with the other batches
        }
    }
}
if (flag) {
    try {
        ps.executeBatch();
    } catch (SQLException se) {
        // do not stop execution; continue with the other batches
    }
}
As per our requirements we continue with the next batch on SQLException, and all the data ends up committed into the table even though there was a timeout.
Questions:
1. How does the timeout work?
2. Does the driver also retry or renew the connection in this case?
3. If the driver initiates a cancel command, is query cancellation on the DB guaranteed to roll back, or does it depend?
4. How can timeout-related exceptions be handled better in the code?
Thank you for the help in advance.
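One approach we are considering for question 4 is to disable auto-commit and roll back explicitly when a batch fails on timeout. A minimal sketch of that idea follows (not verified against the Snowflake driver; whether the server-side cancel itself prevents a partial commit is exactly what we are asking about):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Variant of the sample above with auto-commit disabled, so a timed-out
// batch can be rolled back explicitly instead of being auto-committed.
static void insertWithExplicitCommit(Connection connection) throws SQLException {
    connection.setAutoCommit(false);
    try (PreparedStatement ps = connection.prepareStatement("insert into Test_Int values (?)")) {
        ps.setQueryTimeout(1);
        int pending = 0;
        for (int v = 1; v < 200; v++) {
            ps.setInt(1, v);
            ps.addBatch();
            pending++;
            if (v % 50 == 0) {
                flushBatch(connection, ps);
                pending = 0;
            }
        }
        if (pending > 0) {
            flushBatch(connection, ps);
        }
    } finally {
        connection.setAutoCommit(true);
    }
}

static void flushBatch(Connection connection, PreparedStatement ps) throws SQLException {
    try {
        ps.executeBatch();
        connection.commit();   // commit only batches that executed fully
    } catch (SQLException se) {
        // timeouts surface as an SQLException (e.g. SQLTimeoutException);
        // discard this batch and continue with the next one
        connection.rollback();
    }
}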

Related

In KinesisStreamsSinkWriter, Flink performs retry on Fatal Exception

I had a chance to go through some Kinesis Streams classes and found the following method in KinesisStreamsSinkWriter. The first negated condition does not look correct to me: if the error is fatal, the method moves on to the failOnError check and then proceeds to retry (when failOnError is false), whereas if the error is not fatal, the writer will not retry.
private boolean isRetryable(Throwable err) {
    if (!KINESIS_FATAL_EXCEPTION_CLASSIFIER.isFatal(err, getFatalExceptionCons())) {
        return false;
    }
    if (failOnError) {
        getFatalExceptionCons()
                .accept(new KinesisStreamsException.KinesisStreamsFailFastException(err));
        return false;
    }
    return true;
}
Can you help me confirm if this is a bug or an intended implementation? Thanks in advance.
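To make the branches explicit, here is a self-contained restatement of the method with the classifier call replaced by a plain boolean (the names are hypothetical, chosen only to spell out the truth table of the quoted code):
// classifierSaysFatal stands in for KINESIS_FATAL_EXCEPTION_CLASSIFIER.isFatal(..)
static boolean isRetryableSimplified(boolean classifierSaysFatal, boolean failOnError) {
    if (!classifierSaysFatal) {
        return false;          // first branch: classifier returned false -> no retry
    }
    if (failOnError) {
        return false;          // second branch: fail fast instead of retrying
    }
    return true;               // retry only when the classifier returned true and failOnError is false
}

// Truth table implied by the quoted code:
//   classifier returns false                    -> not retryable
//   classifier returns true,  failOnError=true  -> not retryable (fail fast)
//   classifier returns true,  failOnError=false -> retryable
Read literally as "isFatal returns true for fatal errors", this means only fatal errors are ever retried, which is the inversion being asked about; whether isFatal(..) in fact has those semantics is what needs confirming.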

Insufficient space allocated to copy array contents

I am building up some code that requests several pieces of information from a database in order to build a timetable in my front end, involving multiple DB requests.
The problem is that with one particular request, where I am using SwiftKuery and DispatchGroups, I occasionally (but not always) get an error in Xcode. It cannot be reproduced with a specific request; it just sometimes happens.
Here is a snippet of my code:
var profWorkDaysBreak = [time_workbreaks]()
let groupServiceWorkDayBreaks = DispatchGroup()
...
/// WorkdaysBreak ENTER AsyncCall
// Unreliable code?
profWorkDays.forEach { workDay in
    groupServiceWorkDayBreaks.enter()
    time_workbreaks.getAll(weekDayId: workDay.id) { results, error in
        if let error = error {
            print(error)
        }
        if let results = results {
            profWorkDaysBreak.append(contentsOf: results) // The error happens here!
        }
        groupServiceWorkDayBreaks.leave()
    }
}
...
groupServiceWorkDayBreaks.wait()
The results and profWorkDaysBreak variables are the same; just sometimes I get the message:
Fatal error: Insufficient space allocated to copy array contents
This stops execution.
I assume the loop might sometimes finish an earlier execution in the DispatchGroup, but that is the only idea I have...
Most likely this is caused by a race condition, due to the fact that you modify the array from multiple threads; if two threads happen to alter the array at the same time, you get into problems.
Make sure you serialize access to the array; this should solve the problem. You can use a semaphore for that:
var profWorkDaysBreak = [time_workbreaks]()
let groupServiceWorkDayBreaks = DispatchGroup()
let semaphore = DispatchSemaphore(value: 1) // start at 1 so the first wait() doesn't block
...
profWorkDays.forEach { workDay in
    groupServiceWorkDayBreaks.enter()
    time_workbreaks.getAll(weekDayId: workDay.id) { results, error in
        if let error = error {
            print(error)
        }
        if let results = results {
            // acquire the lock, perform the non-thread-safe operation, release the lock
            semaphore.wait()
            profWorkDaysBreak.append(contentsOf: results)
            semaphore.signal()
        }
        groupServiceWorkDayBreaks.leave()
    }
}
...
groupServiceWorkDayBreaks.wait()
The semaphore here acts like a mutex, allowing at most one thread to operate on the array at a time. I would also like to emphasize that the lock should be held for the least amount of time possible, so that the other threads don't have to wait too long.
Here is the only way I have gotten my code running reliably so far: I skipped append(contentsOf:) completely and just used a forEach loop.
var profWorkDaysBreak = [time_workbreaks]()
let groupServiceWorkDayBreaks = DispatchGroup()
...
/// WorkdaysBreak ENTER AsyncCall
// Unreliable code?
profWorkDays.forEach { workDay in
    groupServiceWorkDayBreaks.enter()
    time_workbreaks.getAll(weekDayId: workDay.id) { results, error in
        if let error = error {
            print(error)
        }
        if let results = results {
            results.forEach { profWorkDayBreak in
                profWorkDaysBreak.append(profWorkDayBreak)
            }
            /*
            // Alternative causes the error!
            profWorkDaysBreak.append(contentsOf: results)
            */
        }
        groupServiceWorkDayBreaks.leave()
    }
}
...
groupServiceWorkDayBreaks.wait()

JDBC Connection pooling for SQL Server: DBCP vs C3P0 vs No Pooling

I have a Java webapp which communicates heavily with a SQL Server database, and I want to decide how to manage the connections to this DB efficiently. The first option that comes to mind is a third-party connection pool. I chose C3P0 and DBCP and prepared some test cases to compare the approaches, as follows:
No Pooling:
public static void main(String[] args) {
    long startTime = System.currentTimeMillis();
    try {
        for (int i = 0; i < 100; i++) {
            Connection conn = ConnectionManager_SQL.getInstance().getConnection();
            String query = "SELECT * FROM MyTable;";
            PreparedStatement prest = conn.prepareStatement(query);
            ResultSet rs = prest.executeQuery();
            if (rs.next()) {
                System.out.println(i + ": " + rs.getString("CorpName"));
            }
            conn.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Finished in: " + (System.currentTimeMillis() - startTime) + " milli secs");
}
DBCP:
public static void main(String[] args) {
    long startTime = System.currentTimeMillis();
    try {
        for (int i = 0; i < 100; i++) {
            Connection conn = ConnectionManager_SQL_DBCP.getInstance().getConnection();
            String query = "SELECT * FROM MyTable;";
            PreparedStatement prest = conn.prepareStatement(query);
            ResultSet rs = prest.executeQuery();
            if (rs.next()) {
                System.out.println(i + ": " + rs.getString("CorpName"));
            }
            conn.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Finished in: " + (System.currentTimeMillis() - startTime) + " milli secs");
}
C3P0:
public static void main(String[] args) {
    long startTime = System.currentTimeMillis();
    try {
        for (int i = 0; i < 100; i++) {
            Connection conn = ConnectionManager_SQL_C3P0.getInstance().getConnection();
            String query = "SELECT * FROM MyTable;";
            PreparedStatement prest = conn.prepareStatement(query);
            ResultSet rs = prest.executeQuery();
            if (rs.next()) {
                System.out.println(i + ": " + rs.getString("CorpName"));
            }
            conn.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Finished in: " + (System.currentTimeMillis() - startTime) + " milli secs");
}
And here are the results:
Max pool size for c3p0 and dbcp = 10
c3p0: 5534 milli secs
dbcp: 4807 milli secs
No pooling: 2660 milli secs

Max pool size for c3p0 and dbcp = 100
c3p0: 4937 milli secs
dbcp: 4798 milli secs
No pooling: 2660 milli secs
One might say the initialization and startup time of the pooling libraries could affect these results, but I have repeated the tests with larger loop counts and the results are almost the same.
Surprisingly, the no-pooling approach is much faster than the connection-pooling methods, even though I would assume that when we physically close a connection, getting a new one should be more time consuming.
So, what's going on here?
EDIT_01: c3p0 and dbcp configurations
c3p0:
cpds.setMinPoolSize(5);
cpds.setAcquireIncrement(5);
cpds.setMaxPoolSize(100);
cpds.setMaxStatements(1000);
dbcp:
basicDataSource.setMinIdle(5);
basicDataSource.setMaxIdle(30);
basicDataSource.setMaxTotal(100);
basicDataSource.setMaxOpenPreparedStatements(180);
The rest of the configuration is left at defaults. It is worth mentioning that all connections are established to a DB on localhost.
What version of c3p0 are you using? If you think it is deader than a doornail, are you using an old version? You should be using 0.9.5.2. c3p0 is not deader than a doornail: it is old but (somewhat) actively maintained. Whether newer alternatives better suit your application is for you to decide.
The outcome of the test as you've defined it will be highly dependent on lots of things that are difficult to evaluate with the information you've provided. As Mark Rotteveel points out, you've not shown any information about your config, and you've not said anything about the location of the SQL Server. You'll notice greater benefit from a connection pool when the database is remote than when it is local, as some of the performance improvement comes from amortizing the network latency of connection acquisition over multiple client uses. Your test executes a query and iterates through the result set; the longer the result set, the more the overhead of the connection pool (which must proxy the ResultSet) overtakes the benefit of faster connection acquisition. (The numbers you are getting look unusually bad, though. c3p0 typically has very fast ResultSet passthrough performance.) With sufficiently long queries, the cost of connection acquisition becomes negligible while the per-row overhead of the pooling library keeps accruing, making a connection pool not so useful.
But this is far from the typical use case for web or mobile clients, which usually make short queries, inserts, and updates. For those, the cost of a de novo connection acquisition can be very large relative to the execution of the query itself; this is the use case for which connection pools offer a large improvement. It may not be what you are testing; it depends on how big MyTable is.
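For comparison, here is a sketch of a test closer to that short-query use case, reusing the ConnectionManager_SQL_DBCP class from the question (a quick illustration, not a rigorous benchmark):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public static void main(String[] args) {
    long startTime = System.currentTimeMillis();
    try {
        // Many cheap round trips: connection acquisition dominates the cost,
        // which is where a pool should pull ahead of opening connections anew.
        for (int i = 0; i < 10000; i++) {
            Connection conn = ConnectionManager_SQL_DBCP.getInstance().getConnection();
            PreparedStatement prest = conn.prepareStatement("SELECT 1;");
            ResultSet rs = prest.executeQuery();
            rs.next();
            rs.close();
            prest.close();
            conn.close(); // with a pool, this returns the connection instead of closing it
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Finished in: " + (System.currentTimeMillis() - startTime) + " milli secs");
}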

Behavior of initial.min.cluster.size

Does Hazelcast always block in case initial.min.cluster.size is not reached? If not, in which situations does it not?
Details:
I use the following code to initialize Hazelcast:
Config cfg = new Config();
cfg.setProperty("hazelcast.initial.min.cluster.size",
        Integer.toString(minimumInitialMembersInHazelCluster)); // 2 in this case
cfg.getGroupConfig().setName(clusterName);
NetworkConfig network = cfg.getNetworkConfig();
JoinConfig join = network.getJoin();
join.getMulticastConfig().setEnabled(false);
join.getTcpIpConfig().addMember("192.168.0.1").addMember("192.168.0.2")
        .addMember("192.168.0.3").addMember("192.168.0.4")
        .addMember("192.168.0.5").addMember("192.168.0.6")
        .addMember("192.168.0.7").setRequiredMember(null).setEnabled(true);
network.getInterfaces().setEnabled(true).addInterface("192.168.0.*");
join.getMulticastConfig().setMulticastTimeoutSeconds(MCSOCK_TIMEOUT / 100);
hazelInst = Hazelcast.newHazelcastInstance(cfg);
distrDischargedTTGs = hazelInst.getList(clusterName);
and get log messages like:
debug: starting Hazel pullExternal from Hazelcluster with 1 members.
Does that necessarily mean another member joined and left already? It does not look that way from the log files of the other instance. Hence I wonder whether there are situations where hazelInst = Hazelcast.newHazelcastInstance(cfg); does not block even though it is the only instance in the Hazelcast cluster.
newHazelcastInstance blocks until the cluster has the required number of members.
See the code below for how it is implemented:
private static void awaitMinimalClusterSize(HazelcastInstanceImpl hazelcastInstance, Node node, boolean firstMember)
        throws InterruptedException {
    final int initialMinClusterSize = node.groupProperties.INITIAL_MIN_CLUSTER_SIZE.getInteger();
    while (node.getClusterService().getSize() < initialMinClusterSize) {
        try {
            hazelcastInstance.logger.info("HazelcastInstance waiting for cluster size of " + initialMinClusterSize);
            //noinspection BusyWait
            Thread.sleep(TimeUnit.SECONDS.toMillis(1));
        } catch (InterruptedException ignored) {
        }
    }
    if (initialMinClusterSize > 1) {
        if (firstMember) {
            node.partitionService.firstArrangement();
        } else {
            Thread.sleep(TimeUnit.SECONDS.toMillis(3));
        }
        hazelcastInstance.logger.info("HazelcastInstance starting after waiting for cluster size of "
                + initialMinClusterSize);
    }
}
If you set logging to debug, you can perhaps see better what is happening; member joining and leaving should already be visible at info level.
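As a quick way to observe the blocking behavior yourself, here is a minimal single-JVM sketch with two members (class and thread names are hypothetical; it assumes the default multicast join and Hazelcast on the classpath):
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;

public class MinClusterSizeDemo {
    private static Config cfg() {
        Config cfg = new Config();
        cfg.setProperty("hazelcast.initial.min.cluster.size", "2");
        return cfg;
    }

    public static void main(String[] args) throws InterruptedException {
        // The first member blocks inside newHazelcastInstance, logging
        // "HazelcastInstance waiting for cluster size of 2" about once a second.
        Thread first = new Thread(() -> Hazelcast.newHazelcastInstance(cfg()));
        first.start();
        Thread.sleep(5000);                    // the first member is still waiting here
        Hazelcast.newHazelcastInstance(cfg()); // second member joins; both calls now return
        first.join();
        Hazelcast.shutdownAll();
    }
}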

Nested transactions on Google App Engine datastore

The question is: does ds.put(employee) happen in a transaction? Or does the outer transaction get erased/overridden by the transaction in saveRecord(..)?
Once an error is thrown at the line datastore.put(..) at some point in the for-loop (let's say at i == 5), will the previous puts originating on the same line get rolled back?
What about the puts happening in saveRecord(..)? I suppose those will not get rolled back.
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Transaction txn = datastore.beginTransaction();
try {
    for (int i = 0; i < 10; i++) {
        Key employeeKey = KeyFactory.createKey("Employee", "Joe");
        Entity employee = datastore.get(employeeKey);
        employee.setProperty("vacationDays", 10);
        datastore.put(employee);
        Entity employeeRecord = createRecord("record", employeeKey);
        saveRecord(employeeRecord);
    }
    txn.commit();
} finally {
    if (txn.isActive()) {
        txn.rollback();
    }
}
public void saveRecord(Entity entity) {
    datastore.beginTransaction();
    try {
        // do some logic in here, delete activity and commit txn
        datastore.put(entity);
    } finally {
        if (datastore.getCurrentTransaction().isActive()) {
            datastore.getCurrentTransaction().rollback();
        }
    }
}
OK, I'll assume you are using the low-level Datastore API. Note that getTransaction() does not exist; I'll assume you meant datastoreService.getCurrentTransaction().
DatastoreService.beginTransaction() returns a Transaction that is considered the current transaction on the same thread until you call beginTransaction() again. Since you call beginTransaction() in a loop inside the "outer" transaction, it breaks your "outer" code: after the loop finishes, ds.getCurrentTransaction() no longer returns the same transaction. Also, put() implicitly uses the current transaction.
So first you must fix the outer code to keep a reference to the transaction, as shown in this example:
public void put(EventPlan eventPlan) {
    Transaction txn = ds.beginTransaction();
    try {
        for (final Activity activity : eventPlan.getActivities()) {
            save(activity, getPlanKey(eventPlan)); // PUT
            // IMPORTANT - also pass the transaction and use it
            // (I assume this is some internal helper method)
            ds.put(txn, activity, getSubPlanKey(eventPlan)); // subplan's parent is eventPlan
        }
        txn.commit();
    } finally {
        if (txn.isActive())
            txn.rollback();
    }
}
Now on to the questions:
Yes, because all the puts are now part of the same transaction (after you fix the code) and you call txn.rollback() on it in case of errors.
No, of course not; they are part of different transactions.
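Following the same idea, saveRecord(..) would keep a reference to its own transaction instead of relying on getCurrentTransaction(); a sketch using the same low-level API:
public void saveRecord(Entity entity) {
    Transaction txn = datastore.beginTransaction();
    try {
        datastore.put(txn, entity); // operate on this transaction explicitly
        txn.commit();
    } finally {
        if (txn.isActive()) {
            txn.rollback();
        }
    }
}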
