We have a Spring Java application that connects to a two-node MS SQL Server cluster (2016 SP2, Standard edition).
We are testing failover: if a node fails, the application needs 90 seconds before it reconnects to the other node, which is too long for production.
After reading the HikariCP documentation several times, I tried to reproduce the scenario with DataGrip: I ran a long query (inserting a row into a table every 500 ms for 10 minutes) and got the same issue: the database was unavailable for 90 seconds after one node failed.
Maybe the issue is on the cluster side and not the application side...
Is there any SQL Server cluster configuration that prevents us from reconnecting in less than 90 seconds?
How can the connection come back sooner than 90 seconds? Is there any caching or default configuration that we should change?
Thanks a lot for your help
EDIT
The test was wrong. I described the issue I am actually getting in the comments:
it reconnects as soon as the first node is back. The issue appears after a second failover: no connection can be established at that point (I wait for the two nodes to synchronize before triggering the second failover).
Related
I am fairly new to PostgreSQL. I have set up my PostgreSQL database on Azure Cloud.
It's an Ubuntu 18.04 LTS machine (4 vCPU, 8 GB RAM) running PostgreSQL 9.6.
The problem is that when a connection to the database stays idle for some time, say 2 to 10 minutes, it stops responding: the request is never fulfilled and the query keeps processing.
The same happens with my Java Spring Boot application. The connection doesn't respond and the query keeps processing.
This happens randomly, so the timing is hard to trace: sometimes it happens after 2 minutes, sometimes after 10 minutes, and sometimes not at all.
I have tried changing parameters in the PostgreSQL configuration file. I have tried:
tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count.
I also tried the statement_timeout and session_timeout parameters, but nothing changed.
Any suggestion or help would be appreciated.
Thank You
If you are setting up a PostgreSQL connection on an Azure VM, be aware that there are inbound and outbound connection timeouts. According to
https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-outbound-connections#idletimeout, outbound connections have a 4-minute idle timeout that is not adjustable. The inbound timeout can be changed in the Azure Portal.
We ran into a similar issue and were able to resolve it on the client side by changing the Spring Boot default Hikari configuration as follows:
hikari:
  connection-timeout: 20000
  validation-timeout: 20000
  idle-timeout: 30000
  max-lifetime: 40000
  minimum-idle: 1
  maximum-pool-size: 3
  connection-test-query: SELECT 1
  connection-init-sql: SELECT 1
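As a rough cross-check of the values above, the sketch below compares pool timeouts (in milliseconds, as Hikari expects) with the 4-minute Azure idle timeout quoted from the linked page. The class and method names are hypothetical and nothing here calls HikariCP itself; it only encodes the rule of thumb that the pool should retire a connection before the network silently drops it.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of HikariCP or Spring Boot.
final class IdleTimeoutCheck {
    // Azure outbound idle timeout quoted in the answer above (4 minutes).
    static final Duration AZURE_IDLE_TIMEOUT = Duration.ofMinutes(4);

    // True when the pool retires a connection strictly before the network cutoff.
    static boolean retiresBeforeCutoff(long poolTimeoutMillis, Duration cutoff) {
        return Duration.ofMillis(poolTimeoutMillis).compareTo(cutoff) < 0;
    }

    public static void main(String[] args) {
        long idleTimeoutMillis = 30_000; // idle-timeout from the config above
        long maxLifetimeMillis = 40_000; // max-lifetime from the config above
        System.out.println("idle-timeout below cutoff: "
                + retiresBeforeCutoff(idleTimeoutMillis, AZURE_IDLE_TIMEOUT));
        System.out.println("max-lifetime below cutoff: "
                + retiresBeforeCutoff(maxLifetimeMillis, AZURE_IDLE_TIMEOUT));
        // HikariCP's default max-lifetime is 30 minutes, which is above the cutoff.
        System.out.println("default max-lifetime below cutoff: "
                + retiresBeforeCutoff(Duration.ofMinutes(30).toMillis(), AZURE_IDLE_TIMEOUT));
    }
}
```

Both configured values sit well below the cutoff, whereas the library default for max-lifetime does not, which is consistent with the change helping in this case.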
I have two geographically distributed SQL Server databases with transactional replication. SQL Server Agent syncs the two servers every minute. After working for 10 minutes, server B (the subscriber) hits an error that lasts 10-15 minutes and then clears by itself. It then works for another 10 minutes and fails again. On server A (the publisher) I have a log backup schedule that runs every 10 minutes. Maybe there is a conflict between the two jobs?
SQLServerAgent Error: Request to run job XXX (from User sa)
refused because the job is already running from a request by Schedule 14 (Replication agent schedule.).
Changed database context to 'XXX'. (.Net SqlClient Data Provider)
How can I fix it?
We are facing an issue with long-running threads in our WebLogic Server 11g.
What happens is this: when we make a request in our application and the thread handling it takes more than 5 minutes, WebLogic Server 11g creates a new thread for the same request. That means we have 2 long-running threads for the same request after 5 minutes (after 10 minutes we have 3, and so on). This repeats every 5 minutes until all the threads in the WebLogic server are stuck, the server goes into "warning" status, and the application hangs.
I suspected a session replication issue, but we are not using a clustered environment, so I don't believe session replication is the reason for this strange behaviour.
Any suggestion on how to resolve this issue is highly appreciated.
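This could be because the Idempotent setting of the WebLogic Server plug-in is ON and WLIOTimeoutSecs is left at its default value of 300 seconds. With Idempotent on, the plug-in resends a request when the server has not responded within WLIOTimeoutSecs, which would match a new thread appearing every 5 minutes.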
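To see how that would line up with the numbers in the question, here is a deliberately simplified model of the resend behaviour. It is not WebLogic code: the class and method names are made up, and it assumes one extra attempt per elapsed timeout with no retry limit.

```java
// Simplified model for illustration only; not WebLogic or plug-in code.
// Assumption: the plug-in resends once per elapsed I/O timeout and never gives up.
final class ResendModel {
    // Default WLIOTimeoutSecs mentioned in the answer above.
    static final int WL_IO_TIMEOUT_SECS = 300;

    // Number of threads working on the same request after elapsedSeconds.
    static long threadsInFlight(long elapsedSeconds, int ioTimeoutSecs) {
        if (elapsedSeconds < 0 || ioTimeoutSecs <= 0) {
            throw new IllegalArgumentException("elapsed must be >= 0 and timeout > 0");
        }
        // The original request plus one resend for every full timeout that has passed.
        return 1 + elapsedSeconds / ioTimeoutSecs;
    }

    public static void main(String[] args) {
        for (long minutes : new long[] {0, 5, 10, 15}) {
            System.out.println(minutes + " min -> "
                    + threadsInFlight(minutes * 60, WL_IO_TIMEOUT_SECS) + " thread(s)");
        }
    }
}
```

Under this model there are 2 threads after 5 minutes and 3 after 10, the same pattern the question describes.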
I have a SQL Server 2008 database, and I need merge replication because I want to sync with mobile devices afterwards.
So I created a replication, but when it comes to starting the snapshot agent, the agent tries to start for about 20 minutes and then shows the message
The replication agent has not logged a progress message in 10 minutes.
This might indicate an unresponsive agent or high system activity.
Verify that records are being replicated to the destination and that
connections to the Subscriber, Publisher, and Distributor are still
active.
There aren't any other error messages, neither in the snapshot agent status window nor in the agent log window.
I don't have the domain administrator account, but I do have the local administrator and a domain user with admin privileges. Both have all rights to the database and are in the access list of the replication.
The server agent runs under the local administrator account, and there are 3 merge replications on the server that are working.
The job also runs under the local administrator.
Thank you for your help, Karl
So it works again...
Maybe someone else will run into the same issue one day, so I'm posting the solution here:
I investigated on the server and found that the SQL Server service is running under a local user. The reason is that there were problems with the backup system used by our customers, so they changed it years ago.
Because of the local user account, error 15404 occurs.
Knowing that I mustn't use domain accounts, I also solved the initial problem with my snapshot agent. I searched for hours (nearly days ;) ) and it came down to this small change:
When the replication is created, the job is created too. The job has three steps. The job owner is the local admin, which is also the account of the server agent service. But the second step of my job (the replication snapshot) has one setting: "Run as". By default this isn't the job owner but the user who created the replication, in my case my domain account.
Now that I have set it to the local administrator as well, everything works fine again.
Thanks, Karl
I had the same issue, and the following fixed it. The replication agent was timing out after 10 minutes, and changing the heartbeat interval from 10 to 30 minutes solved the issue.
Run the command below:
exec sp_changedistributor_property @property = 'heartbeat_interval', @value = 30;
and then restart the SQL Server Agent on the subscriber to continue syncing.
We have a Java server connecting to a MySQL 5 database using Hibernate as our persistence layer, which uses c3p0 for DB connection pooling.
I've tried following the c3p0 and hibernate documentation:
Hibernate - HowTo Configure c3p0 connection pool
C3P0 Hibernate properties
C3P0.properties configuration
We're getting an error on our production servers stating that:
... Caused by:
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException:
No operations allowed after connection
closed.Connection was implicitly
closed due to underlying
exception/error:
BEGIN NESTED EXCEPTION
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException
MESSAGE: The last packet successfully
received from the server was45000
seconds ago.The last packet sent
successfully to the server was 45000
seconds ago, which is longer than the
server configured value of
'wait_timeout'. You should consider
either expiring and/or testing
connection validity before use in your
application, increasing the server
configured values for client timeouts,
or using the Connector/J connection
property 'autoReconnect=true' to avoid
this problem.
STACKTRACE:
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
The last packet successfully received
from the server was45000 seconds
ago.The last packet sent successfully
to the server was 45000 seconds ago,
which is longer than the server
configured value of 'wait_timeout'.
You should consider either expiring
and/or testing connection validity
before use in your application,
increasing the server configured
values for client timeouts, or using
the Connector/J connection property
'autoReconnect=true' to avoid this
problem.
We have our c3p0 connection pool properties setup as follows:
hibernate.c3p0.max_size=10
hibernate.c3p0.min_size=1
hibernate.c3p0.timeout=5000
hibernate.c3p0.idle_test_period=300
hibernate.c3p0.max_statements=100
hibernate.c3p0.acquire_increment=2
The default MySQL wait_timeout is 28800 seconds (8 hours), yet the reported error says it has been over 45000 seconds (about 12.5 hours). The c3p0 configuration states that it will time out idle connections that haven't been used for 5000 seconds and that it checks every 300 seconds, so an idle connection should never live longer than 5299 seconds, right?
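The arithmetic behind that expectation can be written out as a small sketch. The class and method names are hypothetical, and the bound follows the question's own reasoning (idle timeout plus one test period) rather than any guarantee documented by c3p0.

```java
// Hypothetical helper for illustration; it does not use c3p0 or Hibernate.
final class IdleBoundCheck {
    // Upper bound on how long an idle connection should survive, following the
    // reasoning in the question: hibernate.c3p0.timeout plus one idle_test_period.
    static long worstCaseIdleSeconds(long timeoutSecs, long idleTestPeriodSecs) {
        return Math.addExact(timeoutSecs, idleTestPeriodSecs);
    }

    // True when an observed idle time is longer than the expected bound.
    static boolean exceedsBound(long observedIdleSecs, long boundSecs) {
        return observedIdleSecs > boundSecs;
    }

    public static void main(String[] args) {
        long bound = worstCaseIdleSeconds(5000, 300); // production settings above
        long mysqlWaitTimeout = 28_800;               // MySQL default, 8 hours
        long observedIdle = 45_000;                   // value from the stack trace

        System.out.println("expected bound (s): " + bound);
        System.out.println("bound below wait_timeout: " + (bound < mysqlWaitTimeout));
        System.out.println("observed idle exceeds bound: " + exceedsBound(observedIdle, bound));
    }
}
```

An observed idle time of 45000 seconds is far above both the expected bound and wait_timeout, so the production pool does not appear to be applying these settings.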
I've tested locally by setting wait_timeout=60 on my developer MySQL (my.ini on Windows, my.cnf on Unix) and lowering the c3p0 idle timeout values below 60 seconds, and it properly times out idle connections and creates new ones. I also checked that we're not leaking DB connections or holding onto a connection, and it doesn't appear we are.
Here are the hibernate.properties and c3p0.properties files I'm using in my developer environment to verify that c3p0 is handling connections properly.
hibernate.properties (testing with MySQL wait_timeout=60)
hibernate.c3p0.max_size=10
hibernate.c3p0.min_size=1
hibernate.c3p0.timeout=20
hibernate.c3p0.max_statements=100
hibernate.c3p0.idle_test_period=5
hibernate.c3p0.acquire_increment=2
c3p0.properties
com.mchange.v2.log.FallbackMLog.DEFAULT_CUTOFF_LEVEL=ALL
com.mchange.v2.log.MLog=com.mchange.v2.log.FallbackMLog
c3p0.debugUnreturnedConnectionStackTraces=true
c3p0.unreturnedConnectionTimeout=10
Make sure that c3p0 really is starting by examining the log. For some reason I had two versions of Hibernate (hibernate-core3.3.1.jar and hibernate-3.2.6GA.jar) on my classpath. I also used Hibernate Annotations version 3.4.0GA, which is not compatible with 3.2.x (I don't know whether that had anything to do with the original problem).
After removing one of the Hibernate jars (I can't remember which one I deleted, probably hibernate-3.2.6GA.jar), c3p0 finally started and I got rid of the annoying com.mysql.jdbc.exceptions.jdbc4.CommunicationsException that happened after 8 hours of inactivity.
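Besides reading the log, a quick probe like the one below can show whether a class is visible on the classpath and which jar it was loaded from, which helps spot duplicate Hibernate jars. This is only a sketch: ClasspathProbe is a made-up helper, and the two class names in main are examples to adapt to your own setup.

```java
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical diagnostic helper; not part of Hibernate or c3p0.
final class ClasspathProbe {
    // True if the named class can be found without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Location (usually a jar) the class was loaded from, if the JVM exposes it.
    static Optional<String> locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return Optional.empty(); // typical for JDK classes
            }
            return Optional.of(source.getLocation().toString());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Example class names; the c3p0 one is its pooled DataSource implementation.
        String[] names = {"com.mchange.v2.c3p0.ComboPooledDataSource", "org.hibernate.Session"};
        for (String name : names) {
            System.out.println(name + " present=" + isOnClasspath(name)
                    + " from=" + locationOf(name).orElse("unknown"));
        }
    }
}
```

If org.hibernate.Session turns out to come from a different jar than the one you expected, that points to the kind of version clash described above.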