I was playing with the option "Maximum Number of Connections" in a Microsoft SSAS Multidimensional OLAP solution. According to the following article, SSAS will open more connections to the database and process several partitions in parallel.
http://henkvandervalk.com/how-to-process-a-ssas-molap-cube-as-fast-as-possible-part-2
When I change the value (16, 20, etc.) and monitor the connections in Activity Monitor, I always see that SSAS opens only 10 connections at the same time. Is this option related to other options? When I change it to 4, then only 4 partitions are processed in parallel.
Thanks for any advice and hints.
My SSAS Options:
- ThreadPool \ Process \ MaxThreads: 320
- ThreadPool \ Query \ MaxThreads: 64
My Server:
- 32 Cores
- 512 GB RAM
Look at the data source connection to the underlying database. The data source's own "Maximum number of connections" property defaults to 10, which is why you never see more than 10 connections no matter how high the server-level setting goes.
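If the relational source is SQL Server, you can confirm the count from the database side by listing the sessions SSAS opens during processing. A minimal sketch, assuming the SSAS sessions report a program name containing "Analysis Services" (the exact name can vary by version):
-- sessions opened by SSAS against the relational source
select session_id, login_name, program_name
from sys.dm_exec_sessions
where program_name like '%Analysis Services%';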
Related
I have an Azure Data Factory pipeline which launches 4 Databricks activities in parallel.
The 4 activities do almost the same thing:
Write different data to 4 different SQL Server tables in the same database:
// read the source data and bulk-write it into the SQL Server table
val df = spark.sql("SELECT * FROM TAB1")
df
  .write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .mode("overwrite")
  .option("truncate", "true")                // truncate the existing table instead of dropping it
  .option("reliabilityLevel", "BEST_EFFORT")
  .option("tableLock", "false")
  .option("url", url)
  .option("dbtable", "dbo.TAB1")
  .option("user", u)
  .option("password", p)
  .option("schemaCheckEnabled", "false")
  .option("batchsize", "1048576")            // rows per bulk-insert batch
  .save()
We noticed that although the job usually executes successfully, it sometimes fails with:
SQLServerException: The connection is closed.
The data we try to write to SQL takes between 10 and 20 minutes to finish entirely.
I am thinking that the fact that we execute the 4 jobs in parallel may be the source of the problem, but I am not sure.
Any help is appreciated.
As you mentioned, you are running 4 Databricks activities in parallel, those activities write data to 4 different tables in the same database, and the error occurs only sometimes, so there is a high chance that you are facing a capacity issue on the SQL Server side.
If you are using Azure SQL Database, you can upgrade the tier and this should work fine.
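If you are on Azure SQL Database, a quick way to check for such a capacity problem is to look at the recent resource snapshots via sys.dm_db_resource_stats (a sketch; this DMV exists in Azure SQL Database only and keeps roughly one hour of 15-second samples):
-- values close to 100 suggest you are hitting the tier's limits during the load
select top 20 end_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent
from sys.dm_db_resource_stats
order by end_time desc;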
Also, make sure the Azure Integration Runtime has sufficient cores and memory to run your 4 pipelines in parallel. If the data is huge, the IR may be unable to process it; create a new IR and use it to run your pipelines.
Alternatively, you can run your pipelines serially instead of in parallel; this will leave the database and IR with enough memory to deal with the data.
I am using DBeaver to connect to my MS SQL database hosted locally. I am trying to export my tables as CSV files. When the query is rather big (40k rows, which takes a couple of minutes), the export gets stopped with the message
"SQL Error: The connection is closed".
I kept the default parameters for the DBeaver database connection, and my SQL Server timeout is the default one (10 minutes, which is more than it takes to trigger the error).
Any idea where it might come from?
Binary values are extremely large and heavy, so they take a long time to transfer over the network. That is likely the reason why you are getting the error. In my opinion:
- Split your query and fetch the data in multiple batches (say, 1k records at a time), as sketched after this list.
- Fetch exactly the items you need (a WHERE condition, and only the columns you need instead of all of them).
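A minimal sketch of that batched fetch, assuming SQL Server 2012 or later and a sortable key column; dbo.MyTable, id, col1, and col2 are placeholder names:
-- fetch 1k rows at a time; increase OFFSET by 1000 for each following batch
select id, col1, col2
from dbo.MyTable
order by id
offset 0 rows fetch next 1000 rows only;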
Every database driver allows you to configure connectTimeout, a parameter that declares how long the client (DBeaver) will wait before deciding something went wrong.
You can change this parameter by right-clicking on the name of the server, choosing Edit Connection, then clicking on the Driver properties tab and searching for the connectTimeout parameter (or something equivalent). Increase the value you find there.
I had this problem with PostgreSQL 13, found a connectTimeout = 20ms and increased it to 200ms to overcome the issue.
An old MySQL driver showed connectTimeout = 20000, most probably in milliseconds.
sqlservr.exe is showing memory usage greater than the max server memory limit; Lock Pages in Memory is also enabled. It's confusing.
As stated here:
SQL Server's max memory setting defines the limit for buffer pool usage only. There will be variable but significant allocations required over and above that limit.
Jonathan Kehayias, Christian Bolton and John Samson have level 300/400 posts on the topic. Brent Ozar has an easier-to-read article that might be a better place to start.
Also related: SQL Server 2008 R2 “Ghost Memory”
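One way to see how much memory the sqlservr.exe process actually holds, including locked-page allocations, is the process-memory DMV; a sketch for SQL Server 2008 or later:
-- total memory held by the SQL Server process, beyond just the buffer pool
select physical_memory_in_use_kb / 1024 as physical_memory_in_use_mb,
       locked_page_allocations_kb / 1024 as locked_pages_mb,
       memory_utilization_percentage
from sys.dm_os_process_memory;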
Min & Max Server Memory
Microsoft SQL Server Management Studio → Right-click the Server → Properties → Memory → Server Memory Options → Minimum server memory (in MB) = 0 and Maximum server memory (in MB) = 2147483647
Configure this memory allocation based on the RAM installed in the DB server.
For example:
If the DB server has 6 GB of RAM, leave roughly 20% headroom for the OS installed on the server.
For 6 GB of RAM, the maximum server memory (in MB) for SQL Server will then be 4915.
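The same setting can be applied with T-SQL instead of the SSMS dialog; 4915 MB here is just the 80%-of-6-GB example from above:
-- 'max server memory (MB)' is an advanced option, so expose it first
exec sys.sp_configure 'show advanced options', 1;
reconfigure;
exec sys.sp_configure 'max server memory (MB)', 4915;
reconfigure;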
Right-click the Server → Properties → Security → Login Auditing → enable Failed logins only. This avoids logging every successful login, which saves log writes and log space.
I am a bit new to PostgreSQL. I have set up my PostgreSQL DB on the Azure cloud.
It's an Ubuntu 18.04 LTS (4 vCPU, 8 GB RAM) machine with PostgreSQL 9.6.
The problem is that when the connection to the PostgreSQL DB stays idle for some time, say 2 to 10 minutes, the connection stops responding: it never fulfills the request and the query keeps processing.
The same happens with my Java Spring Boot application: the connection doesn't respond and the query keeps processing.
This happens randomly, and the timing is not traceable: sometimes it happens in 2 minutes, sometimes in 10 minutes, and sometimes not at all.
I have tried the PostgreSQL configuration file parameters:
tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count.
I have also tried the statement_timeout & session_timeout parameters, but nothing changes.
Any suggestions or help would be appreciated.
Thank you.
If you are setting up a PostgreSQL DB connection on an Azure VM, you have to be aware that there are inbound and outbound connection timeouts. According to
https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-outbound-connections#idletimeout, outbound connections have a 4-minute idle timeout, and this timeout is not adjustable. For the inbound timeout there is an option to change it in the Azure Portal.
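Since the outbound idle timeout is 4 minutes and cannot be changed, one server-side workaround is to make PostgreSQL send TCP keepalives more often than that. A sketch using the keepalive parameters mentioned in the question (the values are illustrative, not tuned):
alter system set tcp_keepalives_idle = 120;     -- seconds of idleness before the first probe
alter system set tcp_keepalives_interval = 30;  -- seconds between probes
alter system set tcp_keepalives_count = 3;      -- failed probes before the connection is dropped
select pg_reload_conf();                        -- apply without restarting the server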
We ran into a similar issue and were able to resolve it on the client side. We changed the Spring Boot default Hikari configuration as follows:
hikari:
  connection-timeout: 20000
  validation-timeout: 20000
  idle-timeout: 30000
  max-lifetime: 40000
  minimum-idle: 1
  maximum-pool-size: 3
  connection-test-query: SELECT 1
  connection-init-sql: SELECT 1
I have a project in Spring Boot (1.5.1.RELEASE) which is using a Postgres DB (9.1-901-1).
While running this application in production, it creates up to 100 idle connections in the DB.
So I overrode the default configuration to control the number of idle connections it creates. Please check the configuration below:
datasource:
  driverClassName: org.postgresql.Driver
  url: jdbc:postgresql://localhost:5432/db_name
  username: root
  password: root
  tomcat:
    # default value is 100 but postgres' default is 100 as well. To prevent "PSQLException: FATAL: sorry, too many
    # clients already", we decrease the max-active value here. Which should be sufficient, by the way
    max-active: 10
    max-idle: 10
    min-idle: 5
    max-wait: 30000
    time-between-eviction-runs-millis: 5000
    min-evictable-idle-time-millis: 60000
    jmx-enabled: true
Now it creates 5 idle connections to the DB.
I verify that by executing the query below:
select * from pg_stat_activity;
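A slightly more targeted variant, assuming PostgreSQL 9.2 or later (where pg_stat_activity has a state column), counts the sessions per state instead of listing them all:
-- how many connections are idle vs. active right now
select state, count(*)
from pg_stat_activity
group by state;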
Now my question is: do I really need 5 idle connections in a production environment?
What will happen if I change my configuration as below? Will this work without any problems?
max-active: 1
max-idle: 1
min-idle: 0
I would also like to know how PgBouncer would help in this case. Is it necessary to have PgBouncer for Postgres?
The configuration you have proposed is definitely not recommended. A complete DB connection cycle goes through:
1. Establish the TCP connection
2. Validate the credentials
3. Connection is ready
4. Execute commands
5. Disconnect
By maintaining idle connections (a connection pool) with the DB, you save the time spent on steps 1-3, thus achieving better performance.
You should tune the settings on the DB based on the maximum number of microservice instances that will connect. For example, if the maximum number of microservice instances is 5 and each service is configured to maintain 50 idle connections, then ensure that your DB is configured to cater to at least 250 connections, as sketched below.
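As a sketch of that capacity check on the PostgreSQL side (300 is an illustrative value; max_connections only takes effect after a server restart):
-- current ceiling and current usage
show max_connections;
select count(*) from pg_stat_activity;
-- raise the ceiling if needed; requires a restart to take effect
alter system set max_connections = 300;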
To arrive at the minimum connection settings for the microservices, you will need to run some tests based on your non-functional requirements and load tests on the services.