I am using HikariCP with a Spring Boot app which has more than 1000 concurrent users.
I have set the max pool size:
spring.datasource.hikari.maximum-pool-size=300
When I look at the processlist of MySQL using
show processlist;
It shows at most 300, which is equal to the pool size. It never goes above the max pool size. Is this intended?
I thought the pool size meant the number of connections maintained so that they can be reused for future requests to the database, but that more connections could be created when the need arises.
Also, when I remove the max pool config, I immediately get:
HikariPool-0 - Connection is not available, request timed out after 30000ms.
How do I resolve this problem? Thanks in advance.
Yes, it's intended. Quoting the documentation:
This property controls the maximum size that the pool is allowed to reach, including both idle and in-use connections. Basically this value will determine the maximum number of actual connections to the database backend. A reasonable value for this is best determined by your execution environment. When the pool reaches this size, and no idle connections are available, calls to getConnection() will block for up to connectionTimeout milliseconds before timing out. Please read about pool sizing. Default: 10
So basically, when all 300 connections are in use and you try to make a 301st connection, Hikari won't create a new one (as maximumPoolSize is the absolute maximum); instead it will wait (by default 30 seconds) until a connection becomes available again.
This also explains why you get the exception you mentioned, because the default (when not configuring a maximumPoolSize) is 10 connections, which you'll probably immediately reach.
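To make the blocking behaviour concrete, here is a minimal sketch of what happens at the call site (plain JDBC against the Spring-managed Hikari DataSource is assumed; the class name and query are made up):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

public class OrderCounter {                      // hypothetical class
    private final DataSource dataSource;         // Hikari-backed pool injected by Spring

    public OrderCounter(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public int countOrders() throws Exception {
        // If all maximumPoolSize connections are in use, this call blocks for up to
        // connectionTimeout (30 000 ms by default) and then fails with
        // "Connection is not available, request timed out after 30000ms."
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement("SELECT COUNT(*) FROM orders");
             ResultSet rs = ps.executeQuery()) {
            rs.next();
            return rs.getInt(1);
        } // closing the connection returns it to the pool for the next request
    }
}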
To solve this issue, you have to find out why these connections are blocked for more than 30 seconds. Even in a situation with 1000 concurrent users, there should be no problem if your query takes a few milliseconds or a few seconds at most.
Increasing the pool size
If you are invoking really complex queries that take a long time, there are a few possibilities. The first one is to increase the pool size. This, however, is not recommended, because the suggested formula for calculating the maximum pool size is:
connections = ((core_count * 2) + effective_spindle_count)
Quoting the About Pool Sizing article:
A formula which has held up pretty well across a lot of benchmarks for years is that for optimal throughput the number of active connections should be somewhere near ((core_count * 2) + effective_spindle_count). Core count should not include HT threads, even if hyperthreading is enabled. Effective spindle count is zero if the active data set is fully cached, and approaches the actual number of spindles as the cache hit rate falls. ... There hasn't been any analysis so far regarding how well the formula works with SSDs.
As described in the same article, that means a 4-core server with 1 hard disk should only have about 10 connections. Even if you have more cores, I'm assuming you don't have enough to warrant the 300 connections you're already making, let alone even more.
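Plugging that example into the formula:
connections = (4 * 2) + 1 = 9, i.e. roughly 10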
Increasing connection timeout
Another possibility is to increase the connection timeout. As mentioned before, when all connections are in use, it will wait for 30 seconds by default, which is the connection timeout.
You can increase this value so that the application will wait longer before going in timeout. If your complex query takes 20 seconds, and you have a connection pool of 300 and 1000 concurrent users, you should theoretically configure your connection timeout to be at least 20 * 1000 / 300 = 67 seconds.
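In Spring Boot that could look like the following (67000 ms is just the figure calculated above; connection-timeout is the standard Hikari timeout property, in milliseconds):
spring.datasource.hikari.maximum-pool-size=300
spring.datasource.hikari.connection-timeout=67000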
Be aware though, that means that your application might take a long time before showing a response to the user. If you have a 67 second connection timeout and an additional 20 seconds before your complex query completes, your user might have to wait up to a minute and a half.
Improve execution time
As mentioned before, your primary goal would be to find out why your queries are taking so long. With a connection pool of 300, a connection timeout of 30 seconds and 1000 concurrent users, it means that your queries are taking at least 9 seconds before completing, which is a lot.
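The arithmetic behind that estimate: 1000 users / 300 connections ≈ 3.3 users per connection, and 30 s / 3.3 ≈ 9 s per query before the last users in line start hitting the timeout.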
Try to improve the execution time by:
Add proper indexes.
Write your queries properly.
Improve the database hardware (disks, cores, network, ...).
Limit the amount of records you're dealing with by introducing pagination, ... .
Divide the work. Check whether the query can be split into smaller queries that produce intermediary results which can then be used in another query, and so on. As long as you're not working in transactions, the connection will be freed up in between, allowing you to serve multiple users at the cost of some performance (see the sketch after this list).
Use caching.
Precalculate the results: if you're doing some resource-heavy calculation, you could try to pre-calculate the results at a moment when the application isn't used as often, e.g. at night, and store them in a separate table that can be queried easily.
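To illustrate the "divide the work" point, here is a minimal sketch under assumed names (the report table, the processed column and the chunk size of 500 are made up; dataSource is the pooled Hikari DataSource): each chunk borrows one connection and returns it, so other users can be served in between.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import javax.sql.DataSource;

public class ChunkedRecalculation {                  // hypothetical job
    private final DataSource dataSource;

    public ChunkedRecalculation(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void recalculate(List<Long> ids) throws Exception {
        int chunkSize = 500;                         // made-up value
        for (int from = 0; from < ids.size(); from += chunkSize) {
            List<Long> chunk = ids.subList(from, Math.min(from + chunkSize, ids.size()));
            // No surrounding transaction: each chunk borrows one pooled connection...
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "UPDATE report SET processed = 1 WHERE id = ?")) {  // made-up query
                for (Long id : chunk) {
                    ps.setLong(1, id);
                    ps.addBatch();
                }
                ps.executeBatch();
            } // ...and returns it to the pool here, so other requests can use it
        }
    }
}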
...
Related
How can we keep a fixed number of active concurrent users/requests at a time for a scenario?
I have a unique testing problem where I am required to do performance testing of services with a fixed number of requests at any given moment, for a given time period like 10 minutes, 30 minutes or 1 hour.
I am not looking for a per-second rate; what I am looking for is that we start with N requests and, as any of those N requests completes, we add one more, so that at any given moment there are only N concurrent requests.
Things I tried: rampUsers(100) over 10 seconds, but what I see is that sometimes there are more than 50 users at a given instant.
constantUsersPerSec(20) during (1 minute) also took the number of requests to 50+ for some time.
atOnceUsers(20) seems related, but I don't see any way to keep it running for a given number of seconds and adding more requests as previous ones complete.
Thank you community in advance, expecting some direction from your side.
There is a throttling mechanism (https://gatling.io/docs/3.0/general/simulation_setup/#throttling) which allows you to set a maximum number of requests per second, but you must remember that users are injected into the simulation independently of that: you must inject enough users to produce that maximum request rate, otherwise you will end up with a lower req/s. Also, users that are injected but cannot send a request because of the throttle will wait in a queue for their turn. This may result in a huge load just after the throttle ends, or may extend your simulation, so it is always better to make the throttle time longer than the injection time and to add the maxDuration() option to the simulation setup.
You should also keep in mind that a throttled simulation is far from the natural way users behave. They never wait for another user to finish before opening a page or taking an action, so in real life you will always end up with a variable number of requests per second.
Use the closed workload model injection supported by Gatling 3.0. In your case, to simulate and maintain 20 active users/requests for a minute, you can use an injection like:
Script.<Controller>.<Scenario>.inject(constantConcurrentUsers(20) during (60 seconds))
I am parsing thousands of CSV files from my application, and for each parsed row I am making an insert into Cassandra. It seems that after letting it run for a while, it stops at 2048 inserts and throws a BusyConnection error.
What's the best way for me to make about 1 million inserts?
Should I export the inserts as strings into a file and then run that file directly from CQL to make these massive inserts, so I don't actually do it over the network?
We solve such issues using script(s).
The script goes through the input data and:
1. takes a specific amount of data from the input each time,
2. waits for a specific amount of time,
3. continues reading and inserting data.
Regarding 1: for our configuration and data (at most 10 columns with mostly numbers and short texts) we found that 500 to 1000 rows are optimal.
Regarding 2: we define the wait time as n * t, where n is the number of rows processed in a single run of the script and t is a time constant in milliseconds. The value of t strongly depends on your configuration; for us, t = 70 ms is enough to make the process smooth.
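For example, with the values above, a run of 500 rows is followed by a wait of 500 * 70 ms = 35 000 ms, i.e. 35 seconds, before the next batch.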
One million requests is not such a big number, really; you can load the data from cqlsh using the COPY FROM command. But you can load this data via your Java code as well.
From the error message it looks like you're using the asynchronous API. You can use it for high-performance inserts, but you need to control how many requests are being processed at the same time (so-called in-flight requests).
There are several aspects here:
Starting with version 3 of the protocol, you may have up to 32k in-flight requests per connection instead of the 1024 used by default. You can configure this when creating the Cluster object.
You need to control how many requests are in flight by wrapping session.executeAsync with some counter, for example like in this example (not the best approach, because it limits the total requests per session rather than the requests to individual hosts - doing that properly requires much more logic, especially around token-aware requests).
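A minimal sketch of that counter idea, using a Semaphore with the 3.x Java driver (the contact point, keyspace, table, column values and the in-flight limit of 256 are all made up; adapt them to your schema):
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;
import java.util.concurrent.Semaphore;

public class ThrottledInserts {
    public static void main(String[] args) throws Exception {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace");                        // made-up keyspace
        PreparedStatement insert =
                session.prepare("INSERT INTO rows (id, value) VALUES (?, ?)");  // made-up table

        Semaphore inFlight = new Semaphore(256);   // made-up cap on concurrent async inserts
        for (long i = 0; i < 1_000_000; i++) {
            inFlight.acquire();                    // blocks once 256 inserts are pending
            ResultSetFuture future = session.executeAsync(insert.bind(i, "value-" + i));
            Futures.addCallback(future, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { inFlight.release(); }
                public void onFailure(Throwable t)  { inFlight.release(); t.printStackTrace(); }
            }, MoreExecutors.directExecutor());
        }
        inFlight.acquire(256);                     // wait for the last in-flight inserts
        cluster.close();
    }
}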
I'm writing a Gatling load test which simply bombards a given endpoint over HTTP for a given period of time. I have it gradually ramp up connections per second, and then hold it there for the duration of the test. My setup looks like this:
setUp(
scn.inject(
rampUsersPerSec(10) to 70 during (1 minute),
constantUsersPerSec(70) during(9 minutes)
).protocols(httpConf).throttle(jumpToRps(70) holdFor(10 minutes))
)
This works, but the problem is that our requests take a long time, sometimes much longer than a second.
What ends up happening is that the server slows down and requests start taking longer and longer, and instead of maintaining 70 connections to the server at a time, this quickly grows linearly and I'll have something like 1000 open connections at any given time.
Is there a way to "limit the pool" of Gatling users to maintain X open connections at a given time? I've so far been unsuccessful in trying to throttle it.
What you want is a closed injection model.
In order to do that with Gatling, you have to wrap your scenario content with a loop, and possibly flush the HTTP caches and cookie jars. Search the doc.
Note that this model is nowhere near realistic, except if your system really does limit the number of users it lets enter, with an upfront queue. A typical use case is a call center.
Why are there both SetMaxOpenConns and SetMaxIdleConns? From the doc:
SetMaxIdleConns
SetMaxIdleConns sets the maximum number of connections in the idle connection pool.
If MaxOpenConns is greater than 0 but less than the new MaxIdleConns, then the new MaxIdleConns will be reduced to match the MaxOpenConns limit.
If n <= 0, no idle connections are retained.
SetMaxOpenConns
SetMaxOpenConns sets the maximum number of open connections to the database.
If MaxIdleConns is greater than 0 and the new MaxOpenConns is less than MaxIdleConns, then MaxIdleConns will be reduced to match the new MaxOpenConns limit.
If n <= 0, then there is no limit on the number of open connections.
The default is 0 (unlimited).
Why have both functions rather than a single function to adjust both idle and open connections, like a MaxConns that would be MaxIdleConns + MaxOpenConns? Why would a developer have to arrange how many open and idle connections there can be instead of defining the total pool size?
The DB pool may contain zero or more idle connections to the database. These are connections that were made and used and, rather than being closed, were kept around for future use. The number of these we can keep around is MaxIdleConns.
When you request one of these idle connections, it becomes an open connection, available for you to use. The number of these you can use is MaxOpenConns.
Now, there is no point in ever having more idle connections than the maximum allowed open connections, because if you could instantaneously grab all the allowed open connections, the remaining idle connections would always stay idle. It's like having a bridge with four lanes but only ever allowing three vehicles to drive across it at once.
Therefore, we would like to ensure that
MaxIdleConns <= MaxOpenConns
The functions are written to preserve this invariant by reducing MaxIdleConns whenever it exceeds MaxOpenConns. Note that, per the documentation, only MaxIdleConns is ever reduced to match MaxOpenConns, never the other way around.
To answer the question of why a developer might want to adjust these separately: consider an application that is usually quiet but occasionally needs to open a large number of connections. You may wish to specify a large MaxOpenConns but a very small MaxIdleConns, so that your application can open as many connections as it requires whenever it needs to, but releases those resources quickly, freeing up memory both for itself and for the database. Keeping an idle connection alive is not free, and it's usually only worth it because you expect to turn it into a usable connection soon.
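For example (numbers made up): SetMaxOpenConns(100) with SetMaxIdleConns(5) lets such an application burst to 100 connections under load while keeping only 5 of them warm once the burst is over.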
So the reason there are two numbers here is that these are two parameters you might have a good reason to vary individually. Of course, the semantics of the API mean that if you don't care about setting both values, you can just set the one you care about, which is probably MaxOpenConns.
Say I'm expecting about 100 requests a second, and each request should take anywhere between 1 and 3 seconds (in a perfect world).
Would I create a pool of 300 connections? Or something slightly higher to compensate for potential spikes?
That depends on the distribution of arriving events.
Queuing theory can give you a formula (for a given distribution) for how many connections you need so that the probability of failure (no free connection, in your case) is no more than a certain percentage.
You may want to look at these notes (page 17), which give you some formulas, such as the probability that you have n requests being served at the same time, or that you have a non-empty queue (the state you want to avoid).
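As a rough sanity check that ignores the distribution (which is exactly what those formulas account for), Little's law gives the average number of busy connections as arrival rate * average service time:
100 requests/s * 2 s (mid-range) = 200 busy connections on average, and 100 * 3 = 300 at the slow end.
So a pool of 300 covers the average-to-worst case with little to spare; how much extra you need for spikes is what the queuing formulas above tell you.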