SSIS Buffer allocation failed - sql-server

I have a parent package in SSIS with a Foreach Loop container that executes 2 child packages sequentially. The connection values are passed to the child packages as parameters. Each child package dynamically connects to a flat-file source, a Derived Column transformation replaces the "\N" values with NULL, and the data is finally loaded into a SQL Server destination. The total size of the flat-file source is 3 GB. The Foreach Loop completes its first iteration, but fails on the second with a buffer allocation error.
Error message
The system reports an 85 percent memory load. There are 32767590400 bytes of physical memory with 4636983296 bytes free. There are 4294836224 bytes of virtual memory with 362987520 bytes free. The paging file has 43437899776 bytes with 13638811648 bytes free.
The Data Flow task failed to create a buffer to call PrimeOutput for output "Flat File Source" (23) on component "Flat File Source Output" (27). This error usually occurs due to an out-of-memory condition.
Memory pressure was alleviated, buffer manager is not throttling allocations anymore
(in the buffer tuning log)
I have changed the default buffer size from 10 MB to 50 MB and DefaultBufferMaxRows from 10,000 to 50,000. I also changed the rows per batch to 10,000 for files larger than 100 MB.
Still, I am facing the same issue. Can anyone help me resolve it?
thanks,
Abhishek

I would suggest setting AutoAdjustBufferSize to True.
Then you need to calculate the maximum row size of the destination table.
You can use this SQL, replacing the table and schema names in the quotes:
SELECT
     SUM(max_length)                 [row_length]
    ,104857600.0 / SUM(max_length)   [Max_100MB_Buffer_Rows]
    ,2147483647.0 / SUM(max_length)  [Max_2GB_Buffer_Rows]
FROM sys.tables t
JOIN sys.columns c ON t.object_id = c.object_id
JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE t.name = '{my_table_name}' AND s.name = '{my_schema_name}'
Then, depending on which version your packages target (2015 or 2017), you work with either a 100 MB buffer limit (2015) or a 2 GB buffer limit (2017).
To be on the safe side, use the 100 MB buffer.
Basically, you need to set a sensible DefaultBufferMaxRows (not too high) depending on how wide your data is; a worked example follows below.
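As a purely illustrative calculation (the 500-byte row length below is hypothetical, not taken from the question): if the query reports row_length = 500, a 100 MB buffer holds roughly 209,715 rows, so DefaultBufferMaxRows should be set at or below that.
-- hypothetical example: the query above returned row_length = 500 bytes
SELECT
     500                  AS row_length_bytes
    ,104857600.0 / 500    AS rows_per_100MB_buffer   -- ~209715: upper bound for DefaultBufferMaxRows
    ,2147483647.0 / 500   AS rows_per_2GB_buffer;    -- only relevant where the 2 GB buffer limit applies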

Related

Need to clean users01.dbf in Oracle 12c

I have Oracle Database 12c in a Docker container. At one point the tablespace grew (the users01.dbf file reached 32 GB), so I added a new file to this tablespace - users02.dbf. I identified the tables and indexes that occupy the most space and truncated them. I can see that the size of the largest tables and indexes has decreased, but the users01.dbf and users02.dbf files remain the same size:
users01.dbf 32G
users02.dbf 8.8G
(A screenshot showed the sizes before the TRUNCATE on the left and after it on the right.) How can I clean up or reduce the size of the users01.dbf and users02.dbf files without breaking the database?
I use a query to find out how much can actually be shrunk from each data file.
I have been using it for almost 4 years; I don't remember exactly where I got it, but it came from one of the good Oracle blogs.
The following script generates a series of commands that can be executed against the database to shrink the data files, i.e. reclaim their free space (only where more than 1 MB can be reclaimed), down to the right size without errors.
set linesize 1000 pagesize 0 feedback off trimspool on
with
  hwm as (
    -- get the highest block id from each datafile (from x$ktfbue, so we don't need all the joins from dba_extents)
    select /*+ materialize */ ktfbuesegtsn ts#, ktfbuefno relative_fno, max(ktfbuebno+ktfbueblks-1) hwm_blocks
    from sys.x$ktfbue group by ktfbuefno, ktfbuesegtsn
  ),
  hwmts as (
    -- join ts# with tablespace_name
    select name tablespace_name, relative_fno, hwm_blocks
    from hwm join v$tablespace using(ts#)
  ),
  hwmdf as (
    -- join with datafiles, using a 5M minimum for datafiles with no extents
    select file_name, nvl(hwm_blocks*(bytes/blocks),5*1024*1024) hwm_bytes, bytes, autoextensible, maxbytes
    from hwmts right join dba_data_files using(tablespace_name, relative_fno)
  )
select
  case when autoextensible='YES' and maxbytes>=bytes
  then -- generate resize statements only if autoextensible can grow back to the current size
    '/* reclaim '||to_char(ceil((bytes-hwm_bytes)/1024/1024),999999)
    ||'M from '||to_char(ceil(bytes/1024/1024),999999)||'M */ '
    ||'alter database datafile '''||file_name||''' resize '||ceil(hwm_bytes/1024/1024)||'M;'
  else -- generate only a comment when autoextensible is off
    '/* reclaim '||to_char(ceil((bytes-hwm_bytes)/1024/1024),999999)
    ||'M from '||to_char(ceil(bytes/1024/1024),999999)
    ||'M after setting autoextensible maxsize higher than current size for file '
    || file_name||' */'
  end SQL
from hwmdf
where
  bytes-hwm_bytes > 1024*1024 -- resize only if at least 1MB can be reclaimed
order by bytes-hwm_bytes desc
/
It will generate commands like the following:
/* reclaim 1934M from 2048M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\TEJASH_DATAFILE_01.DBF' resize 115M;
/* reclaim 158M from 200M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\UNDO_DF_02.DBF' resize 43M;
/* reclaim 59M from 1060M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\O1_MF_SYSAUX_G9K5LYTT_.DBF' resize 1002M;
/* reclaim 3M from 840M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\O1_MF_SYSTEM_G9K5KK2J_.DBF' resize 838M;
You can execute all of them directly and do not have to calculate anything yourself.
Please note that this script only generates resize statements for data files that have autoextensible set to ON.
I hope this helps you out.
Cheers!!
alter database datafile 'path_to_datafile/users01.dbf' resize 150M;
Repeat the same for users02.dbf. Make sure the path and file names are correct.
Since you truncated the tables, the segments should have been dropped altogether, so the files should be able to shrink. If you get ORA-03297: file contains used data beyond requested RESIZE value, it means there is still data beyond the 150M mark, so keep increasing the resize target until the error goes away.
As always, you shouldn't be doing this directly on production; test it out first.
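As a rough sanity check before resizing, a query along these lines (a sketch; adjust the tablespace name to yours) compares each data file's allocated size with its total free space. Keep in mind that dba_free_space reports total free space, not necessarily space at the end of the file, so the script above remains the more precise guide:
select df.file_name,
       round(df.bytes/1024/1024)              allocated_mb,
       round(nvl(fs.free_bytes,0)/1024/1024)  free_mb
from   dba_data_files df
       left join (select file_id, sum(bytes) free_bytes
                  from   dba_free_space
                  group  by file_id) fs
         on fs.file_id = df.file_id
where  df.tablespace_name = 'USERS'
order  by df.file_name;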

Sqlite3 Disk I/O error encountered after a while, but worked after using copy of db

I'm using an sqlite3 database to record data every second. The interface to it is provided by Flask-SQLAlchemy.
This can work fine for a couple of months, but eventually (as the .db file approaches 8 GB), an error prevents any more data from being written to the database:
Failed to commit: (sqlite3.OperationalError) disk I/O error
The journal file does not seem to be the issue here - if I restart the application and use the pragma journal_mode=TRUNCATE, the journal file is created but the disk I/O error persists.
Here's the .dbinfo (obtained from sqlite3.exe):
database page size: 1024
write format: 1
read format: 1
reserved bytes: 0
file change counter: 5200490
database page count: 7927331
freelist page count: 0
schema cookie: 12
schema format: 4
default cache size: 0
autovacuum top root: 0
incremental vacuum: 0
text encoding: 1 (utf8)
user version: 0
application id: 0
software version: 3008011
number of tables: 6
number of indexes: 7
number of triggers: 0
number of views: 0
schema size: 5630
data version 2
However, this worked:
I made a copy of the .db file (call them app.db and copy.db).
I renamed app.db to orig.db.
I renamed copy.db to app.db (so effectively, I swapped the two so that the copy becomes the app).
When I started my application again, it was able to write to the app.db file once more! So I could write to a copy I made of the database.
The drive is an SSD (Samsung 850 EVO mSATA). I wonder if that's got something to do with it? Does anyone have any ideas on how I can prevent it from happening again?
EDIT: I've used the sqlite3.exe CLI to execute an INSERT INTO command manually, and this actually completed successfully (and wrote to the disk). However, when I re-ran my Flask-SQLAlchemy interface to write to it, it still came up with the disk I/O error.
UPDATE:
A colleague pointed out that this might be related to another question: https://stackoverflow.com/a/49506243/3274353
I strongly suspect now that this is a filesystem issue - in my system, the database file is being updated constantly alongside some other files which are also being written to.
So to reduce the amount of fragmentation, I'm getting the database to pre-allocate some disk space now using the answer provided in aforementioned question: https://stackoverflow.com/a/49506243/3274353
Something like this:
CREATE TABLE t(x);
INSERT INTO t VALUES(zeroblob(500*1024*1024)); -- 500 MB
DROP TABLE t;
To know whether this needs to be done, I check the freelist_count pragma:
PRAGMA schema.freelist_count;
which, per the SQLite documentation, returns the number of unused pages in the database file.
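As a rough illustration (not from the original post), the amount of space already sitting unused inside the file can be estimated by combining two pragmas:
PRAGMA page_size;       -- bytes per page (1024 in the .dbinfo output above)
PRAGMA freelist_count;  -- number of unused pages
-- unused space inside the file is roughly page_size * freelist_count bytes;
-- if that figure is already large, pre-allocating more with zeroblob() adds little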

SQL Server 2016 R Services: sp_execute_external_script returns 0x80004005 error

I run some R code after querying 100M records and get the following error after the process runs for over 6 hours:
Msg 39004, Level 16, State 19, Line 300
A 'R' script error occurred during execution of 'sp_execute_external_script'
with HRESULT 0x80004005.
HRESULT 0x80004005 appears to be associated in Windows with Connectivity, Permissions or an "Unspecified" error.
I know from logging in my R code that the process never reaches the R script at all. I also know that the entire procedure completes after 4 minutes on a smaller number of records, for example, 1M. This leads me to believe that this is a scaling problem or some issue with the data, rather than a bug in my R code. I have not included the R code or the full query for proprietary reasons.
However, I would expect a disk or memory error to display a 0x80004004 Out of memory error if that were the case.
One clue I noticed in the SQL ERRORLOG is the following:
SQL Server received abort message and abort execution for major error : 18
and minor error : 42
However the time of this log line does not coincide with the interruption of the process, although it does occur after it started. Unfortunately, there is precious little on the web about "major error 18".
A SQL Trace when running from SSMS shows the client logging in and logging out every 6 minutes or so, but I can only assume this is normal keepalive behaviour.
The sanitized sp_execute_external_script call:
EXEC sp_execute_external_script
      @language = N'R'
    , @script = N'#We never get here
#returns name of output data file'
    , @input_data_1 = N'SELECT TOP 100000000 * FROM DATA'
    , @input_data_1_name = N'x'
    , @output_data_1_name = N'output_file_df'
WITH RESULT SETS ((output_file varchar(100) not null))
Server Specs:
8 cores
256 GB RAM
SQL Server 2016 CTP 3
Any ideas, suggestions or debugging hints would be greatly appreciated!
UPDATE:
I set TRACE_LEVEL=3 in rlauncher.config to turn on a higher level of logging and re-ran the process. The log reveals a cleanup process that ran and removed the session files at the time the entire process failed, after 6.5 hours.
[2016-05-30 01:35:34.419][00002070][00001EC4][Info] SQLSatellite_LaunchSatellite(1, A187BC64-C349-410B-861E-BFDC714C8017, 1, 49232, nullptr) completed: 00000000
[2016-05-30 01:35:34.420][00002070][00001EC4][Info] < SQLSatellite_LaunchSatellite, dllmain.cpp, 223
[2016-05-30 08:04:02.443][00002070][00001EC4][Info] > SQLSatellite_LauncherCleanUp, dllmain.cpp, 309
[2016-05-30 08:04:07.443][00002070][00001EC4][Warning] Session A187BC64-C349-410B-861E-BFDC714C8017 cleanup wait failed with 258 and error 0
[2016-05-30 08:04:07.444][00002070][00001EC4][Info] Session(A187BC64-C349-410B-861E-BFDC714C8017) logged 2 output files
[2016-05-30 08:04:07.444][00002070][00001EC4][Warning] TryDeleteSingleFile(C:\PROGRA~1\MICROS~1\MSSQL1~1.MSS\MSSQL\EXTENS~1\MSSQLSERVER06\A187BC64-C349-410B-861E-BFDC714C8017\Rscript1878455a2528) failed with 32
[2016-05-30 08:04:07.445][00002070][00001EC4][Warning] TryDeleteSingleDirectory(C:\PROGRA~1\MICROS~1\MSSQL1~1.MSS\MSSQL\EXTENS~1\MSSQLSERVER06\A187BC64-C349-410B-861E-BFDC714C8017) failed with 32
[2016-05-30 08:04:08.446][00002070][00001EC4][Info] Session A187BC64-C349-410B-861E-BFDC714C8017 removed from MSSQLSERVER06 user
[2016-05-30 08:04:08.447][00002070][00001EC4][Info] SQLSatellite_LauncherCleanUp(A187BC64-C349-410B-861E-BFDC714C8017) completed: 00000000
It appears the only way to allow my long-running process to continue is to:
a) Extend the Job Cleanup wait time to allow the job to finish
b) Disable the Job Cleanup process
I have thus far been unable to find the value that sets the Job Cleanup wait time in the MSSQLLaunchpad service.
While a JOB_CLEANUP_ON_EXIT flag exists in rlauncher.config, setting it to 0 has no effect. The service seems to reset it to 1 when it is restarted.
Again, any suggestions or assistance would be much appreciated!
By default, SQL Server reads all input data into R memory as a data frame before starting execution of the R script. Given that the script works with 1M rows and fails to start with 100M rows, this could well be an out-of-memory condition. To resolve memory issues (other than adding memory to the machine or reducing the data size), you can try one of these solutions:
Increase the memory allocation for R process execution via the external resource pool's max_memory_percent setting (visible in sys.resource_governor_external_resource_pools). By default, SQL Server limits R process execution to 20% of memory. A sketch of this change follows below.
Use streaming execution for the R script instead of loading all the data into memory. Note that streaming can only be used when the output of the R script doesn't depend on reading or looking at the entire set of rows.
The warnings in RLauncher.log about data cleanup happened after the R script execution; they can be safely ignored and are probably not the root cause of the failures you are seeing.
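For illustration, raising the external pool limit looks roughly like this (the 50% figure is an arbitrary example, not a recommendation from the answer):
-- allow external scripts (R) to use up to 50% of server memory instead of the default 20%
ALTER EXTERNAL RESOURCE POOL [default] WITH (MAX_MEMORY_PERCENT = 50);
ALTER RESOURCE GOVERNOR RECONFIGURE;
-- verify the new setting
SELECT name, max_memory_percent
FROM sys.resource_governor_external_resource_pools;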
Unable to resolve this issue in SQL, I simply avoided the SQL Server Launchpad service, which was interrupting the processing, and pulled the data from SQL using the R RODBC library. The pull took just over 3 hours (instead of 6+ using sp_execute_external_script).
This might implicate the SQL Launchpad service, and suggests that memory was not the issue.
Please try your scenario in SQL Server 2016 RTM. There have been many functional and performance fixes made since CTP3.
For more information on how to get SQL Server 2016 RTM, check out the "SQL Server 2016 is generally available today" blog post.
I had almost the same issue with SQL Server 2016 RTM-CU1. My query failed with error 0x80004004 instead of 0x80004005, and it failed beginning with 10,000,000 records, but that could be related to only having 16 GB of memory and/or different data.
I got around it by using a field list instead of "*". Even if the field list contains all the fields from the data source (a rather complicated view in my case), a query with an explicit field list always succeeds, while "SELECT TOP x * FROM ..." always fails for some large x.
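In terms of the sanitized call from the question, that change amounts to something like the following (the column names are placeholders, since the real schema isn't shown):
-- instead of:  , @input_data_1 = N'SELECT TOP 100000000 * FROM DATA'
   , @input_data_1 = N'SELECT TOP 100000000 Col1, Col2, Col3 FROM DATA'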
I've had a similar error (0x80004004), and the problem was that one of the rows in one of the columns contained a "very" special character (I'm saying "very" because other special characters did not cause this error).
When I replaced 'Folkelånet Telefinans' with 'Folkelanet Telefinans', the problem went away.
In your case, maybe at least one of the values in the last 99M rows contains something like that character, and you just have to replace it. I hope that Microsoft will resolve this issue at some point.
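If you want to hunt for such values, a pattern like this can flag rows containing characters outside the printable ASCII range (a sketch with placeholder table and column names):
-- find rows whose value contains any character outside space (0x20) through tilde (0x7E)
SELECT TOP 100 SomeColumn
FROM DATA
WHERE SomeColumn LIKE '%[^ -~]%' COLLATE Latin1_General_BIN;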

Pulling my hair out at Generic Error Message

I have a data flow that goes from a SharePoint List Source to an ADO.NET Destination. In SSIS 2008, when I run this I get the error below. I have been through the 40 columns I am bringing through and checked the input sizes against the sizes of the database columns, and they look fine. One point to note is that I am mapping memo fields to the ntext data type. The expected result set is 600 rows imported from the SharePoint list. When I run this data flow I get the error below, and only 200 of the 600 rows get written to the database.
3 interesting tests:
Test 1: increasing the buffer size to 30 million, I now get 390 rows imported, then I get the error below.
Test 2: upping the value to 50 million causes the error to happen straight away. I have not touched the batch size, which is 2000.
Test 3: I unmap a few random columns in the middle of the SharePoint list source and all the rows now get imported.
I do not understand what is going on. It seems I am hitting some kind of internal limit; is this a SharePoint adapter problem?
Error
Microsoft.SqlServer.Dts.Pipeline.DoesNotFitBufferException: The value is too large to fit in the column data area of the buffer.
at Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer.SetString(Int32 columnIndex, String value)
at Microsoft.Samples.SqlServer.SSIS.SharePointListAdapters.SharePointListSource.PrimeOutput(Int32 outputs, Int32[] outputIDs, PipelineBuffer[] buffers)
at Microsoft.SqlServer.Dts.Pipeline.ManagedComponentHost.HostPrimeOutput(IDTSManagedComponentWrapper100 wrapper, Int32 outputs, Int32[] outputIDs, IDTSBuffer100[] buffers, IntPtr ppBufferWirePacket)
Resolved. The SharePoint list source reported a field length of 100 when the field is actually 150 in real life. So frustrating. I changed the field size on the source.
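If you hit the same exception, one way to confirm which column is the culprit (placeholder names below; run it against the rows that did load, or against an export of the list) is to compare the longest value actually present with the length the source metadata claims:
-- longest value actually stored vs. the length the source metadata reported
SELECT MAX(DATALENGTH(SomeMemoColumn)) AS max_actual_bytes
FROM dbo.DestinationTable;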

Out of memory error when inserting a 600MB file into SQL Server Express as filestream data

(please read the update section below, I leave the original question too for clarity)
I am inserting many files into a SQL Server database configured for filestream.
I am looping over the files in a folder and inserting them into a database table.
Everything goes fine until I try to insert a 600 MB file.
As it is inserted, Task Manager shows memory usage grow by more than 600 MB, and I get the error.
The DB size is < 1 GB and the total size of the documents is 8 GB. I am using SQL Server Express R2, and according to the documentation I should only have problems if I try to insert a document larger than 10 GB (the Express limitation) minus the current DB size.
Can anyone tell me why I get this error? It is very crucial for me.
UPDATE FOR BOUNTY:
I offered 150 because it is very crucial for me!
This seems to be a limitation of the Delphi memory manager when trying to insert a document bigger than 500 MB (I didn't check the exact threshold, but it is somewhere between 500 and 600 MB). I use SDAC components, in particular a TMSQuery (but I think the same can be done with any TDataSet descendant). To insert the document into a table that has a PK (ID_DOC_FILE) and a varbinary(max) field (DOCUMENT), I do:
procedure UploadBigFile;
var
sFilePath: String;
begin
sFilePath := 'D:\Test\VeryBigFile.dat';
sqlInsertDoc.ParamByName('ID_DOC_FILE').AsInteger := 1;
sqlInsertDoc.ParamByName('DOCUMENT').LoadFromFile(sFilePath, ftblob);
sqlInsertDoc.Execute;
sqlInsertDoc.Close;
end;
The SDAC team told me this is a limitation of the Delphi memory manager. Since SDAC doesn't support filestream, I cannot do what was suggested in C# in the first answer. Is the only solution to report it to Embarcadero and ask for a bug fix?
FINAL UPDATE:
Thanks, really, to all of you who answered me. Inserting big blobs can certainly be a problem for the Express edition (because of its 1 GB RAM limitation), but I also had the error on the Enterprise edition, and it was a "Delphi" error, not a SQL Server one. So I think the answer I accepted really hits the problem, even if I have no time to verify it now.
SDAC team told me this is a limitation of Delphi memory manager
To me that looked like a simplistic answer, and I investigated. I don't have the SDAC components and I don't use SQL Server either; my favorites are Firebird SQL and the IBX component set. I tried inserting a 600 MB blob into a table using IBX, then tried the same using ADO (covering two connection technologies, both TDataSet descendants). I discovered the truth is somewhere in the middle: it's not really the memory manager, and it's not SDAC's fault (well... they are in a position to do something about it if many more people attempt inserting 600 MB blobs into databases, but that's irrelevant to this discussion). The "problem" is with the DB code in Delphi. As it turns out, Delphi insists on using a single Variant to hold whatever type of data one might load into a parameter. And that makes sense; after all, we can load lots of different things into a parameter for an INSERT. The second problem is that Delphi wants to treat that Variant like a VALUE type: it copies it around at least twice and maybe three times! The first copy is made right when the parameter is loaded from the file. The second copy is made when the parameter is prepared to be sent to the database engine.
Writing this is easy:
var V1, V2:Variant;
V1 := V2;
and it works just fine for Integer, Date and small Strings, but when V2 is a 600 MB Variant array, that assignment apparently makes a full copy! Now think about the memory space available to a 32-bit application that's not running in "3G" mode. Only 2 GB of address space are available. Some of that space is reserved, some is used for the executable itself, then there are the libraries, and then there's some space reserved for the memory manager. After making the first 600 MB allocation, there just might not be enough available address space to allocate another 600 MB buffer! Because of this it's safe to blame it on the memory manager, but then again, why exactly does the DB code need another copy of the 600 MB monster?
One possible fix
Try splitting up the file into smaller, more manageable chunks. Set up the database table to have 3 fields: ID_DOCUMENT, SEQUENCE, DOCUMENT, and make the primary key of the table (ID_DOCUMENT, SEQUENCE). Next try this:
procedure UploadBigFile(id_doc: Integer; sFilePath: String);
var
  FS: TFileStream;
  MS: TMemoryStream;
  AvailableSize, ReadNow: Int64;
  Sequence: Integer;
const
  MaxPerSequence = 10 * 1024 * 1024; // 10 Mb
begin
  FS := TFileStream.Create(sFilePath, fmOpenRead);
  try
    AvailableSize := FS.Size;
    Sequence := 0;
    while AvailableSize > 0 do
    begin
      if AvailableSize > MaxPerSequence then
      begin
        ReadNow := MaxPerSequence;
        Dec(AvailableSize, MaxPerSequence);
      end
      else
      begin
        ReadNow := AvailableSize;
        AvailableSize := 0;
      end;
      Inc(Sequence); // Prep sequence; first sequence into DB will be "1"
      MS := TMemoryStream.Create;
      try
        MS.CopyFrom(FS, ReadNow);
        sqlInsertDoc.ParamByName('ID_DOC_FILE').AsInteger := id_doc;
        sqlInsertDoc.ParamByName('SEQUENCE').AsInteger := Sequence;
        sqlInsertDoc.ParamByName('DOCUMENT').LoadFromStream(MS, ftBlob);
        sqlInsertDoc.Execute;
      finally
        MS.Free;
      end;
    end;
  finally
    FS.Free;
  end;
  sqlInsertDoc.Close;
end;
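For completeness, a minimal sketch of the chunked table the procedure above assumes (the table name is made up; the field names follow the answer):
CREATE TABLE dbo.DOCUMENT_CHUNKS (
    ID_DOCUMENT int            NOT NULL,   -- document identifier
    SEQUENCE    int            NOT NULL,   -- chunk number, starting at 1
    DOCUMENT    varbinary(max) NOT NULL,   -- up to MaxPerSequence bytes of file data per row
    CONSTRAINT PK_DOCUMENT_CHUNKS PRIMARY KEY (ID_DOCUMENT, SEQUENCE)
);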
You could loop through the byte stream of the object you are trying to insert and essentially buffer a piece of it at a time into your database until you have your entire object stored.
I would take a look at the Buffer.BlockCopy() method if you're using .NET.
Off the top of my head, the method to parse your file could look something like this:
// open the source file for reading
var file = new FileStream(@"c:\file.exe", FileMode.Open, FileAccess.Read);
byte[] fileBytes = new byte[file.Length];
byte[] buffer = new byte[100];
file.Read(fileBytes, 0, fileBytes.Length);
for (int i = 0; i < fileBytes.Length; i += 100)
{
    // copy the next chunk (the final chunk may be shorter than 100 bytes)
    int chunk = Math.Min(100, fileBytes.Length - i);
    Buffer.BlockCopy(fileBytes, i, buffer, 0, chunk);
    // Do database processing on buffer[0..chunk)
}
Here is an example that reads a disk file and saves it into a FILESTREAM column. (It assumes that you already have the transaction context and file path in the variables "txContext" and "filePath".)
'Open the FILESTREAM data file for writing
Dim fs As New SqlFileStream(filePath, txContext, FileAccess.Write)
'Open the source file for reading
Dim localFile As New FileStream("C:\temp\microsoftmouse.jpg",
FileMode.Open,
FileAccess.Read)
'Start transferring data from the source file to FILESTREAM data file
Dim bw As New BinaryWriter(fs)
Const bufferSize As Integer = 4096
Dim buffer As Byte() = New Byte(bufferSize) {}
Dim bytes As Integer = localFile.Read(buffer, 0, bufferSize)
While bytes > 0
bw.Write(buffer, 0, bytes)
bw.Flush()
bytes = localFile.Read(buffer, 0, bufferSize)
End While
'Close the files
bw.Close()
localFile.Close()
fs.Close()
You're probably running into memory fragmentation issues somewhere. Playing around with really large blocks of memory, especially in any situation where they might need to be reallocated tends to cause out of memory errors when in theory you have enough memory to do the job. If it needs a 600mb block and it can't find a hole that's 600mb wide that's it, out of memory.
While I have never tried it, my inclination for a workaround would be to create a very minimal program that does ONLY this one operation. Keep it absolutely as simple as possible to keep memory allocation minimal. When faced with a risky operation like this, call the external program to do the job. The program runs, does the one operation and exits. The point is that the new program runs in its own address space.
The only true fix is 64 bit and we don't have that option yet.
I recently experienced a similar problem while running DBCC CHECKDB on a very large table. I would get this error:
There is insufficient system memory in resource pool 'internal' to run this query.
This was on SQL Server 2008 R2 Express. The interesting thing was that I could control the occurrence of the error by adding or deleting a certain number of rows in the table.
After extensive research and discussions with various SQL Server experts, I came to the conclusion that the problem was a combination of memory pressure and the 1 GB memory limitation of SQL Server Express.
The recommendation given to me was to either:
acquire a machine with more memory and a licensed edition of SQL Server, or
partition the table into sizeable chunks that DBCC CHECKDB could handle (a sketch of the second option follows below).
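By way of illustration only (my sketch, not part of the original answer; object names are placeholders), the second option can also be approximated by checking objects individually, or by running a lighter physical-only pass, both of which keep the memory footprint of any single check smaller:
-- check one large table at a time instead of the whole database
DBCC CHECKTABLE ('dbo.BigTable') WITH NO_INFOMSGS;
-- or run a lighter-weight, physical-only check of the whole database
DBCC CHECKDB ('MyDatabase') WITH PHYSICAL_ONLY, NO_INFOMSGS;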
Due to the complicated nature of parsing these files into the FILESTREAM object, I would recommend the filesystem method and simply using SQL Server to store the locations of the files.
"While there are no limitations on the number of databases or users supported, it is limited to using one processor, 1 GB memory and 4 GB database files (10 GB database files from SQL Server Express 2008 R2)." It is not the size of the database files that is the issue but the "1 GB memory" limit. Try splitting the 600 MB+ file when putting it in the stream.
