I am upgrading our Airflow instance from 1.9 to 1.10.3, and now whenever the scheduler runs I get a warning that the database connection has been invalidated and that it is trying to reconnect. A bunch of these warnings show up in a row. The console also indicates that tasks are being scheduled, but if I check the database, nothing is ever written.
The following warning shows up where it didn't before:
[2019-05-21 17:29:26,017] {sqlalchemy.py:81} WARNING - DB connection invalidated. Reconnecting...
Eventually, I'll also get this error:
FATAL: remaining connection slots are reserved for non-replication superuser connections
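Since the FATAL error comes from Postgres, I can watch connection usage directly on the metadata database. A quick sketch (pg_stat_activity and max_connections are standard Postgres):
-- Count open connections by application and state
SELECT application_name, state, count(*)
FROM pg_stat_activity
GROUP BY application_name, state
ORDER BY count(*) DESC;
-- The server-wide cap that the FATAL message is hitting
SHOW max_connections;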
I've tried increasing the SQLAlchemy pool size setting in airflow.cfg, but that had no effect:
# The SqlAlchemy pool size is the maximum number of database connections in the pool.
sql_alchemy_pool_size = 10
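For reference, the neighboring [core] settings look like this (from what I can tell, sql_alchemy_pool_recycle has existed since 1.9, while sql_alchemy_max_overflow only appears in later 1.10.x releases, so it may not be available on 1.10.3). Note also that the webserver, scheduler, and each worker keep separate pools, so the total connection count can be a multiple of pool_size:
# Recycle pooled connections after this many seconds.
sql_alchemy_pool_recycle = 1800
# Extra connections allowed beyond the pool size (later 1.10.x releases only).
sql_alchemy_max_overflow = 10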
I'm using the CeleryExecutor, and I'm thinking that maybe the number of workers is overloading the database with connections.
I run three commands: airflow webserver, airflow scheduler, and airflow worker, so there should only be one worker, and I don't see why that would overload the database.
How do I resolve the database connection errors? Is there a setting to increase the number of database connections, and if so, where is it? Do I need to handle the workers differently?
Update:
Even with no workers running, starting the webserver and scheduler fresh, the DB connection warnings start to appear as soon as the scheduler fills up the Airflow pools.
Update 2:
I found the following issue in the Airflow Jira: https://issues.apache.org/jira/browse/AIRFLOW-4567
There is some activity there, with others saying they see the same issue. It is unclear whether this directly causes the crashes that some people are seeing or whether it is just an annoying cosmetic log message. As yet, there is no resolution to the problem.
This has been resolved in the latest version of Airflow, 1.10.4.
I believe it was fixed by AIRFLOW-4332, which updated SQLAlchemy to a newer version.
Pull request
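A quick way to pick up the fix, assuming a pip-managed install (adjust the extras to your setup, e.g. apache-airflow[celery]):
pip install --upgrade apache-airflow==1.10.4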
1) I have installed Apache Flink on my local machine (Ubuntu 16.04), developed some Java programs, created JAR files, and am trying to run them as jobs in the Flink web front end. I am able to run each job individually, but I am not able to run multiple jobs in parallel.
Please let me know if any configuration has to be modified so that I can run them simultaneously.
2) I am also unable to run a Flink job that has many tasks (more than 500 tasks in a single job). I get the following exceptions:
(a) org.apache.flink.runtime.io.network.partition.PartitionNotFoundException
(b) org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerDetailsHandler - Implementation error: Unhandled exception
(c) a heap memory exception
Please let me know how to overcome these and what configuration is needed to run the job.
I have already tried increasing the memory to 2048 MB.
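For context, I assume these are the relevant flink-conf.yaml knobs (a sketch only; the exact key names vary across Flink versions, e.g. taskmanager.heap.size was taskmanager.heap.mb in older releases):
# Slots per TaskManager; a standalone cluster can only run jobs
# concurrently while free task slots remain.
taskmanager.numberOfTaskSlots: 4
# Default parallelism for jobs that do not set their own.
parallelism.default: 2
# TaskManager heap; 2048m matches the value I tried above.
taskmanager.heap.size: 2048m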
This is my first experience with SSIS, so bear with me...
I am using SSIS to migrate tables from Oracle to SQL Server, and there are some very large tables I am trying to transfer (50+ million rows). SSIS is now completely freezing up and restarting Visual Studio when I am just trying to save the package (not even running it). It keeps returning insufficient-memory errors; however, I am working on a remote server that has far more RAM than it takes to run this package.
Error Message when trying to save
The only thing I can think of is that when this package attempts to run, my Ethernet Kbps go through the roof right as the package starts. Maybe I need to update my pipeline?
Ethernet Graph
Also, my largest table fails when importing due to byte sizes (again, while not coming close to using all the memory on the server). We are using an ODBC Source, as this was the only way we were able to get other large tables to load more than 1 million rows.
I have tried creating a temporary buffer file to help with memory pressure, but that made no difference. I have changed AutoAdjustBufferSize to True with no change in results, and also changed DefaultBufferMaxRows and DefaultBufferSize, again with no change.
ERRORS WHEN RUNNING LARGE TABLE:
Information: 0x4004300C at SRC_STG_TABLENAME, SSIS.Pipeline: Execute phase is beginning.
Information: 0x4004800D at SRC_STG_TABLENAME: The buffer manager failed a memory allocation call for 810400000 bytes, but was unable to swap out any buffers to relieve memory pressure. 2 buffers were considered and 2 were locked. Either not enough memory is available to the pipeline because not enough are installed, other processes were using it, or too many buffers are locked.
Information: 0x4004800F at SRC_STG_TABLENAME: Buffer manager allocated 1548 megabyte(s) in 2 physical buffer(s).
Information: 0x40048010 at SRC_STG_TABLENAME: Component "ODBC Source" (60) owns 775 megabyte(s) physical buffer.
Information: 0x4004800D at SRC_STG_TABLENAME: The buffer manager failed a memory allocation call for 810400000 bytes, but was unable to swap out any buffers to relieve memory pressure. 2 buffers were considered and 2 were locked. Either not enough memory is available to the pipeline because not enough are installed, other processes were using it, or too many buffers are locked.
Information: 0x4004800F at SRC_STG_TABLENAME: Buffer manager allocated 1548 megabyte(s) in 2 physical buffer(s).
Information: 0x40048010 at SRC_STG_TABLENAME: Component "ODBC Source" (60) owns 775 megabyte(s) physical buffer.
Information: 0x4004800D at SRC_STG_TABLENAME: The buffer manager failed a memory allocation call for 810400000 bytes, but was unable to swap out any buffers to relieve memory pressure. 2 buffers were considered and 2 were locked. Either not enough memory is available to the pipeline because not enough are installed, other processes were using it, or too many buffers are locked.
Information: 0x4004800F at SRC_STG_TABLENAME: Buffer manager allocated 1548 megabyte(s) in 2 physical buffer(s).
Information: 0x40048010 at SRC_STG_TABLENAME: Component "ODBC Source" (60) owns 775 megabyte(s) physical buffer.
Error: 0xC0047012 at SRC_STG_TABLENAME: A buffer failed while allocating 810400000 bytes.
Error: 0xC0047011 at SRC_STG_TABLENAME: The system reports 26 percent memory load. There are 68718940160 bytes of physical memory with 50752466944 bytes free. There are 4294836224 bytes of virtual memory with 914223104 bytes free. The paging file has 84825067520 bytes with 61915041792 bytes free.
Information: 0x4004800F at SRC_STG_TABLENAME: Buffer manager allocated 1548 megabyte(s) in 2 physical buffer(s).
Information: 0x40048010 at SRC_STG_TABLENAME: Component "ODBC Source" (60) owns 775 megabyte(s) physical buffer.
Error: 0x279 at SRC_STG_TABLENAME, ODBC Source [60]: Failed to add row to output buffer.
Error: 0x384 at SRC_STG_TABLENAME, ODBC Source [60]: Open Database Connectivity (ODBC) error occurred.
Error: 0xC0047038 at SRC_STG_TABLENAME, SSIS.Pipeline: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on ODBC Source returned error code 0x80004005. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
This is really holding up my work. HELP!
I suggest reading the data in chunks:
Instead of loading the whole table at once, try to split the data into chunks and import them into SQL Server. A while ago I wrote a similar answer related to SQLite; I will try to adapt it to Oracle syntax:
Step-by-step guide
In this example each chunk contains 10000 rows.
Declare two variables of type Int32 (#[User::RowCount] and #[User::IncrementValue]).
Add an Execute SQL Task that executes a SELECT COUNT(*) command and stores the result set in the variable #[User::RowCount].
Add a For Loop with the following settings:
InitExpression: @[User::IncrementValue] = 0
EvalExpression: @[User::IncrementValue] < @[User::RowCount]
AssignExpression: @[User::IncrementValue] = @[User::IncrementValue] + 10000
Inside the For Loop container, add a Data Flow Task.
Inside the Data Flow Task, add an ODBC Source and an OLE DB Destination.
In the ODBC Source, select the SQL Command option and write a SELECT * FROM TABLE query (to retrieve metadata only).
Map the columns between source and destination.
Go back to the Control Flow, click on the Data Flow Task, and hit F4 to view the Properties window.
In the Properties window, go to Expressions and assign the following expression to the [ODBC Source].[SQLCommand] property (for more info, refer to How to pass SSIS variables in ODBC SQLCommand expression?):
"SELECT * FROM MYTABLE ORDER BY ID_COLUMN
OFFSET " + (DT_WSTR,50)#[User::IncrementValue] + "FETCH NEXT 10000 ROWS ONLY;"
Where MYTABLE is the source table name and ID_COLUMN is your primary key or identity column.
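To make the loop concrete, the expression above evaluates to queries like the following on the first two iterations (note that OFFSET ... FETCH requires Oracle 12c or later; older versions need ROWNUM-based paging):
-- Iteration 1: #[User::IncrementValue] = 0
SELECT * FROM MYTABLE ORDER BY ID_COLUMN
OFFSET 0 ROWS FETCH NEXT 10000 ROWS ONLY
-- Iteration 2: #[User::IncrementValue] = 10000
SELECT * FROM MYTABLE ORDER BY ID_COLUMN
OFFSET 10000 ROWS FETCH NEXT 10000 ROWS ONLY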
Control Flow Screenshot
References
ODBC Source - SQL Server
How to pass SSIS variables in ODBC SQLCommand expression?
HOW TO USE SSIS ODBC SOURCE AND DIFFERENCE BETWEEN OLE DB AND ODBC?
How do I limit the number of rows returned by an Oracle query after ordering?
Getting top n to n rows from db2
Update 1 - Other possible workarounds
While searching for similar issues, I found some additional workarounds that you can try:
(1) Change the SQL Server max memory
SSIS: The Buffer Manager Failed a Memory Allocation Call
-- Expose advanced options such as 'max server memory'
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
-- Value is in MB: cap SQL Server at 4 GB so SSIS can use the remaining RAM
sp_configure 'max server memory', 4096;
GO
RECONFIGURE;
GO
(2) Enable Named pipes
[Fixed] The buffer manager detected that the system was low on virtual memory, but was unable to swap out any buffers
Go to Control Panel -> Administrative Tools -> Computer Management.
Under Protocols for the SQL instance, set Named Pipes = Enabled.
Restart the SQL instance service.
After that, try to import the data again; it will now be fetched in chunks instead of all at once. Hope that works for you and saves you some time.
(3) If using SQL Server 2008 install hotfixes
The SSIS 2008 runtime process crashes when you run the SSIS 2008 package under a low-memory condition
Update 2 - Understanding the error
In the following MSDN link, the cause of the error is described as follows:
Virtual memory is a superset of physical memory. Processes in Windows typically do not specify which they are to use, as that would (greatly) inhibit how Windows can multitask. SSIS allocates virtual memory. If Windows is able to, all of these allocations are held in physical memory, where access is faster. However, if SSIS requests more memory than is physically available, then that virtual memory spills to disk, making the package operate orders of magnitude slower. And in worst cases, if there is not enough virtual memory in the system, then the package will fail.
This matches the log above: it reports 4294836224 bytes (about 4 GB) of virtual memory with under 1 GB free, even though the server has 64 GB of physical RAM, which suggests the package is running in a 32-bit process that is exhausting its address space rather than the server running out of memory.
Are you running your packages in parallel? If yes, change to serial.
You can also try to divide this big table into subsets using an operation like modulo. See this example:
http://henkvandervalk.com/reading-as-fast-as-possible-from-a-table-with-ssis-part-ii
(in the example, he is running in parallel, but you can put this in serial)
Also, if you are running the SSIS package on a computer that is running an instance of SQL Server, set the Maximum server memory option for the SQL Server instance to a smaller value when you run the package.
That increases the memory available to SSIS.
I am using version 2.3.7 for Java 6. I have set maximumPoolSize to 200 and connectionTimeout to 30 s. I run into SQLTimeoutExceptions from BaseHikariPool.getConnection in one of our load test cases, which involves 70 simultaneous users uploading 10 files each. I turned on debug logging and obtained pool stats, so it would seem that the pool isn't being exhausted; rather, HikariCP takes longer than connectionTimeout to create new connections. How can I debug this part of the process? The underlying data source is SQLServerDataSource version 4.1.
connectionTimeout is the maximum time to wait to obtain a connection from the pool.
It is NOT a timeout for creating a connection from the data source; there is no such setting.
You may want to consider reducing the pool size: begin load testing with the minimum and gradually increase it until SQL Server begins to take much longer to create connections.
See the HikariCP wiki page about pool sizing.
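For reference, the rough starting-point formula from that page (a guideline, not a hard rule):
connections = (core_count * 2) + effective_spindle_count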
HTH
It might be because in HikariCP opening a connection is a blocking call (https://github.com/brettwooldridge/HikariCP/issues/1287).
You might find the com.zaxxer.hikari.blockUntilFilled system property useful. With this option, the connection pool will open minimumIdle connections in parallel during initialization instead of initializing connections lazily.
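A minimal sketch of how that property is typically passed, as a JVM flag at startup (app.jar stands in for your application jar):
java -Dcom.zaxxer.hikari.blockUntilFilled=true -jar app.jar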
We have a SQL Server machine with 132 GB of memory in it, and my SQL Server instance is allocated a max memory of 110 GB. This morning, I saw an alert saying:
MSSQL 2014: SQL Server has failed to allocate sufficient memory to run the query
Source: MSSQLSERVER
Description: There is insufficient system memory in resource pool 'default' to run this query.
Now, I can see the memory utilization through Task Manager, and it is showing 88% utilized (which I see every day when there are no issues). I do not see any errors in the SQL Server log or the event log.
There are no complex queries running now.
Is there any way to find out what caused the insufficient memory issue last night? How can this be prevented from recurring?
If you use some kind of batch upload (i.e., a series of INSERT INTO ... statements), this time the batch size (combined with the data) may have stepped over the limit.
Or you have stored procedure(s) with a parameter of type sql_variant, and the parameter value exceeded the limit.
Try to do some "social engineering" to find out which client did something unusual (in terms of data size) around the time of the exception.
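If you want a server-side starting point before asking around, one option is to inspect the resource monitor ring buffer, which keeps a short history of memory pressure notifications (a sketch; the ring buffer holds only recent records, so it may no longer reach back to last night):
-- Recent memory pressure notifications from SQL Server's resource monitor;
-- the record column is XML containing indicators and tick-based timestamps
SELECT record
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR';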