I have an extremely large database and most of the space is index size. I moved several indexes to a different filegroup (just to experiment), but no matter what I do I cannot reduce the size of the MDF.
I tried shrink database, shrink files, and rebuilding the clustered index. What can I do to reclaim that space in the MDF? I've moved 25 GB worth of indexes to a different filegroup. Is it even possible to reduce my MDF by that same 25 GB (or close to it)?
SQL Server 2008 Enterprise
Group      Total Space (MB)   Avail Space (MB)
PRIMARY 388485.000000 27126.3125000
Index 24778.375000 26.6250000
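(For reference, per-filegroup numbers like the ones above can be produced with a query along these lines; a sketch, sizes in MB:)
SELECT fg.name AS [Group],
       SUM(df.size) / 128.0 AS [Total Space],
       SUM(df.size - FILEPROPERTY(df.name, 'SpaceUsed')) / 128.0 AS [Avail Space]
FROM sys.database_files AS df
JOIN sys.filegroups AS fg ON fg.data_space_id = df.data_space_id
GROUP BY fg.name;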
Options I tried for shrink:
TSQL:
1. DBCC SHRINKFILE (1, 10);
2. DBCC SHRINKFILE (1, TRUNCATE_ONLY);
UI: Shrink database:
1. Without 'reorganize files...'
2. With 'reorganize files...'
2a. set max free space = 10%
2b. set max free space = 90%
2c. set max free space = 75%
UI: Shrink files: (type =data, group = primary, filename = db name)
1. Release unused space
2. reorganize pages... setting shrink file = 2GB (max it allows)
3. reorganize pages... setting shrink file = 361,360 (min allowed)
I did not try the 'Empty file' option because that doesn't seem like what I want.
I've noticed in the past that shrinking the data file in smaller chunks can be more effective than trying to shrink it all in one go. If you want to attempt a similar strategy, you'd do something like the following:
-- Shrink from roughly the current size down to the desired size in 300 MB steps
DECLARE @targetSize AS INT = 388000;
DECLARE @desiredFinalSize AS INT = 362000;
DECLARE @increment AS INT = 300;
DECLARE @sql AS VARCHAR(200);

WHILE @targetSize > @desiredFinalSize
BEGIN
    SET @sql = 'DBCC SHRINKFILE(''MyDataFileName'', ' + CAST(@targetSize AS VARCHAR(10)) + ');';
    SELECT @sql;   -- print the statement before running it
    EXEC (@sql);
    SET @targetSize = @targetSize - @increment;
END
OK, try a "Shrink Database" through the UI with 0% max free space.
The 10% setting you used before "worked", but your 27 GB of free space is already under 10% of your total DB (it's about 7%), so there was nothing for it to release.
Bear in mind this may cause performance issues down the road, since your DB will need to grow again, and it can also cause performance issues now, because shrinking leads to fragmentation, which is bad.
However, if space is your primary concern, the above method should work.
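If the fragmentation caused by the shrink becomes a problem, the affected indexes can be rebuilt afterwards; a minimal sketch, with placeholder table and index names:
-- Rebuild one fragmented index after the shrink
ALTER INDEX IX_MyIndex ON dbo.MyTable REBUILD;
-- Or rebuild every index on the table
ALTER INDEX ALL ON dbo.MyTable REBUILD;
Keep in mind that rebuilding needs free space and will grow the file again, which is why shrinking and rebuilding tend to work against each other.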
This step looks wrong
UI: Shrink files: (type =data, group = primary, filename = db name)
1. Release unused space
2. reorganize pages... setting shrink file = 2GB (max it allows)
3. reorganize pages... setting shrink file = 361,360 (min allowed)
Among your question's steps, this is the right one; the problem is the value you entered.
Shrink File
File type (Data), group (PRIMARY) Name ()
Shrink Action = reorganize pages (=move to front), Shrink File *
Shrink File is entered in MB, so a value of 361,360 would be a third of a terabyte. The easiest way to set it to the minimum is to enter 0 and tab out; the dialog replaces it with the minimum allowed.
Right now we need to get the size down to something manageable for backups
This is completely the wrong way to go about it. Backups do not include free space, so whether your primary MDF file has 100 GB of free space or 1 TB, as long as the data occupies 200 GB, that is the size of the backup file. On SQL Server 2008 you can also back up WITH COMPRESSION, which on average brings the size down to about a sixth.
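A minimal sketch of such a backup (the database name and path are placeholders):
BACKUP DATABASE MyDb
TO DISK = N'D:\Backups\MyDb.bak'
WITH COMPRESSION, INIT;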
Sometimes I've noticed that if I shrink a file, it won't shrink below the initial size set on the file. Could you check what the initial size of that file is?
Related
I have Oracle Database 12c in a Docker container. At one point the tablespace grew (the users01.dbf file reached 32 GB in size), so I added a new file to the tablespace, users02.dbf. I identified the tables and indexes that occupy the most space and truncated them. I can see that the size of the largest tables and indexes has decreased, but the users01.dbf and users02.dbf files remain the same size:
users01.dbf 32G
users02.dbf 8.8G
(A screenshot showed the segment sizes before the TRUNCATE on the left and after it on the right.) How can I clean up or reduce the size of the users01.dbf and users02.dbf files without breaking the database?
I use the following query to find out how much space can actually be reclaimed from the data files. I've been using it for almost four years; I don't remember exactly where I got it, but it came from one of the good Oracle blogs.
The script generates a series of commands that can be run against the database to shrink the data files, that is, to reclaim the free space above the high-water mark (only where more than 1 MB can be reclaimed), without hitting a resize error.
set linesize 1000 pagesize 0 feedback off trimspool on
with
hwm as (
-- get highest block id from each datafiles ( from x$ktfbue as we don't need all joins from dba_extents )
select /*+ materialize */ ktfbuesegtsn ts#,ktfbuefno relative_fno,max(ktfbuebno+ktfbueblks-1) hwm_blocks
from sys.x$ktfbue group by ktfbuefno,ktfbuesegtsn
),
hwmts as (
-- join ts# with tablespace_name
select name tablespace_name,relative_fno,hwm_blocks
from hwm join v$tablespace using(ts#)
),
hwmdf as (
-- join with datafiles, put 5M minimum for datafiles with no extents
select file_name,nvl(hwm_blocks*(bytes/blocks),5*1024*1024) hwm_bytes,bytes,autoextensible,maxbytes
from hwmts right join dba_data_files using(tablespace_name,relative_fno)
)
select
case when autoextensible='YES' and maxbytes>=bytes
then -- we generate resize statements only if autoextensible can grow back to current size
'/* reclaim '||to_char(ceil((bytes-hwm_bytes)/1024/1024),999999)
||'M from '||to_char(ceil(bytes/1024/1024),999999)||'M */ '
||'alter database datafile '''||file_name||''' resize '||ceil(hwm_bytes/1024/1024)||'M;'
else -- generate only a comment when autoextensible is off
'/* reclaim '||to_char(ceil((bytes-hwm_bytes)/1024/1024),999999)
||'M from '||to_char(ceil(bytes/1024/1024),999999)
||'M after setting autoextensible maxsize higher than current size for file '
|| file_name||' */'
end SQL
from hwmdf
where
bytes-hwm_bytes>1024*1024 -- resize only if at least 1MB can be reclaimed
order by bytes-hwm_bytes desc
/
It will generate commands something like the following:
/* reclaim 1934M from 2048M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\TEJASH_DATAFILE_01.DBF' resize 115M;
/* reclaim 158M from 200M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\UNDO_DF_02.DBF' resize 43M;
/* reclaim 59M from 1060M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\O1_MF_SYSAUX_G9K5LYTT_.DBF' resize 1002M;
/* reclaim 3M from 840M */ alter database datafile 'C:\APP\TEJASH\VIRTUAL\ORADATA\ORCL\DATAFILE\O1_MF_SYSTEM_G9K5KK2J_.DBF' resize 838M;
You can execute all of them directly and don't have to calculate anything yourself.
Please note that the script only generates resize statements for data files that are autoextensible (and whose maxsize is at least the current size); for the rest it only emits a comment.
I hope this will help you out.
Cheers!!
alter database datafile 'path_to_datafile/users01.dbf' resize 150M;
Repeat the same for users02.dbf. Make sure the path and file names are correct.
Since you truncated the tables, their storage should have been deallocated, so the files should be able to shrink. If you get "ORA-03297: file contains used data beyond requested RESIZE value", it means there is still data above the 150 MB mark, so keep raising the resize target until the error goes away.
As always, don't run this directly against production; test it first.
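If you want to know in advance how far each file can go, a rough sketch like the following (joining dba_data_files with the highest allocated extent per file) estimates the smallest size each data file can be resized to; treat it as an approximation:
SELECT df.file_name,
       CEIL(df.bytes / 1024 / 1024)                          AS current_mb,
       CEIL(NVL(hwm.max_block, 1) * blk.value / 1024 / 1024) AS min_size_mb
FROM   dba_data_files df,
       (SELECT file_id, MAX(block_id + blocks - 1) AS max_block
          FROM dba_extents
         GROUP BY file_id) hwm,
       (SELECT value FROM v$parameter WHERE name = 'db_block_size') blk
WHERE  df.file_id = hwm.file_id(+);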
I have a PostgreSQL table named census. I have run ANALYZE on the table, and the statistics are recorded in pg_stats.
There are entries in pg_stats from other database tables as well, as is to be expected.
However, I want to know the space consumed by the histogram_bounds for the census table alone. Is there a good, fast way to find it?
PS: I have tried dumping the relevant pg_stats rows to disk to measure their size, using
select * into copy_table from pg_stats where tablename='census';
However, it failed because of the pseudo-type anyarray.
Any ideas there too?
In the following I use the table pg_type and its column typname for demonstration purposes. Replace these with your table and column name to get the answer for your case (you didn't say which column you are interested in).
You can use the pg_column_size function to get the size of any column:
SELECT pg_column_size(histogram_bounds)
FROM pg_stats
WHERE schemaname = 'pg_catalog'
AND tablename = 'pg_type'
AND attname = 'typname';
pg_column_size
----------------
1269
(1 row)
To convert an anyarray to a regular array, you can first cast it to text and then to the desired array type:
SELECT histogram_bounds::text::name[] FROM pg_stats ...
If you measure the size of that converted array, you'll notice that it is much bigger than the result above.
The reason is that pg_column_size measures the actual size on disk, and histogram_bounds is big enough to be stored out of line in the TOAST table, where it will be compressed. The converted array is not compressed, because it is not stored in a table.
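For example, to see the difference on the same demonstration column as above:
SELECT pg_column_size(histogram_bounds)               AS toasted_size,
       pg_column_size(histogram_bounds::text::name[]) AS converted_size
FROM pg_stats
WHERE schemaname = 'pg_catalog'
  AND tablename = 'pg_type'
  AND attname = 'typname';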
What is the max size of a file that I can insert using varbinary(max) in SQL Server 2008 R2? I tried to change the max value of the column to more than 8,000 bytes but it won't let me, so I'm guessing the max is 8,000 bytes; but according to this article on MSDN, the max storage size is 2^31-1 bytes:
varbinary [ ( n | max) ]
Variable-length binary data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size is the actual length of the data entered + 2 bytes. The data that is entered can be 0 bytes in length. The ANSI SQL synonym for varbinary is binary varying.
So how can I store larger files in a varbinary field? I'm not considering FILESTREAM, since the files I want to save are only 200 KB to 1 MB. The code I'm using:
UPDATE [table]
SET file = (SELECT * FROM OPENROWSET(BULK 'C:\A directory\A file.ext', SINGLE_BLOB) alias)
WHERE idRow = 1
I have been able to execute that code successfully with files of 8,000 bytes or less; if I try with a file of 8,001 bytes it fails. The column in my table is called "file" and its type is varbinary(8000), which, as I said, I can't change to a bigger value.
I cannot reproduce this scenario. I tried the following:
USE tempdb;
GO
CREATE TABLE dbo.blob(col VARBINARY(MAX));
INSERT dbo.blob(col) SELECT NULL;
UPDATE dbo.blob
SET col = (SELECT BulkColumn
FROM OPENROWSET( BULK 'C:\Folder\File.docx', SINGLE_BLOB) alias
);
SELECT DATALENGTH(col) FROM dbo.blob;
Results:
--------
39578
If this is getting capped at 8K then I would guess that either one of the following is true:
The column is actually VARBINARY(8000).
You are selecting the data in Management Studio and measuring the length of what is displayed there. Results to Text is limited to a maximum of 8,192 characters, so if that is the case, using DATALENGTH() directly against the column is a much better approach.
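If the column really is VARBINARY(8000), it can be widened in place; a minimal sketch (the table name is assumed from the question):
ALTER TABLE dbo.[table] ALTER COLUMN [file] VARBINARY(MAX) NULL;  -- match the column's existing nullability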
I would dare to say: use FILESTREAM for files bigger than 1 MB, based on the following from MS TechNet | FILESTREAM Overview.
In SQL Server, BLOBs can be standard varbinary(max) data that stores the data in tables, or FILESTREAM varbinary(max) objects that store the data in the file system. The size and use of the data determines whether you should use database storage or file system storage. If the following conditions are true, you should consider using FILESTREAM:
Objects that are being stored are, on average, larger than 1 MB.
Fast read access is important.
You are developing applications that use a middle tier for application logic.
For smaller objects, storing varbinary(max) BLOBs in the database often provides better streaming performance.
"SET TEXTSIZE" Specifies the size of varchar(max), nvarchar(max), varbinary(max), text, ntext, and image data returned by a SELECT statement.
SELECT @@TEXTSIZE;
The SQL Server Native Client ODBC driver and SQL Server Native Client OLE DB Provider for SQL Server automatically set TEXTSIZE to 2147483647 when connecting. The maximum setting for SET TEXTSIZE is 2 gigabytes (GB), specified in bytes. A setting of 0 resets the size to the default (4 KB).
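A quick sketch of checking and adjusting the limit for the current session (the value is in bytes):
SELECT @@TEXTSIZE;        -- current session setting
SET TEXTSIZE 2147483647;  -- raise it to the 2 GB maximum
SET TEXTSIZE 0;           -- reset to the default (4 KB)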
As mentioned, for big files you should prefer FILESTREAM.
(Please read the update section below; I'm leaving the original question in place for clarity.)
I am inserting many files into a SQL Server db configured for filestream.
I am inserting the files from a folder into a database table in a loop.
Everything goes fine until I try to insert a 600 MB file.
As it is being inserted, Task Manager shows memory usage climbing by more than 600 MB, and then I get the error.
The DB size is < 1 GB and the total size of the documents is 8 GB. I am using SQL Server 2008 R2 Express, and according to the documentation I should only have problems when trying to insert a document larger than the 10 GB Express limit minus the current DB size.
Can anyone tell me why I get this error? It is very important to me.
UPDATE FOR BOUNTY:
I offered 150 because this is crucial for me!
This seems to be a limitation of the Delphi memory manager when inserting a document bigger than roughly 500 MB (I didn't check the exact threshold, but it is somewhere between 500 and 600 MB). I use SDAC components, in particular a TMSQuery (but I think the same applies to any TDataSet descendant), to insert the document into a table that has a PK (ID_DOC_FILE) and a varbinary(max) field (DOCUMENT). I do:
procedure UploadBigFile;
var
sFilePath: String;
begin
sFilePath := 'D:\Test\VeryBigFile.dat';
sqlInsertDoc.ParamByName('ID_DOC_FILE').AsInteger := 1;
sqlInsertDoc.ParamByName('DOCUMENT').LoadFromFile(sFilePath, ftblob);
sqlInsertDoc.Execute;
sqlInsertDoc.Close;
end;
The SDAC team told me this is a limitation of the Delphi memory manager. Since SDAC doesn't support FILESTREAM, I cannot do what was suggested in C# in the first answer. Is the only solution to report it to Embarcadero and ask for a bug fix?
FINAL UPDATE:
Thanks, really, to all of you who answered. Inserting big blobs can certainly be a problem for the Express edition (because of its 1 GB RAM limitation); however, I got the error on the Enterprise edition too, and it was a "Delphi" error, not a SQL Server one. So I think the answer I accepted really hits the problem, even if I have no time to verify it right now.
SDAC team told me this is a limitation of Delphi memory manager
To me that looked like a simplistic answer, so I investigated. I don't have the SDAC components and I also don't use SQL Server; my favorites are Firebird SQL and the IBX component set. I tried inserting a 600 MB blob into a table using IBX, then tried the same using ADO (covering two connection technologies, both TDataSet descendants). I discovered the truth is somewhere in the middle: it's not really the memory manager, and it's not SDAC's fault (well... they are in a position to do something about it if many more people attempt inserting 600 MB blobs into databases, but that's irrelevant to this discussion). The "problem" is with the DB code in Delphi. As it turns out, Delphi insists on using a single Variant to hold whatever type of data one might load into a parameter. That makes sense, since we can load lots of different things into a parameter for an INSERT. The second problem is that Delphi wants to treat that Variant like a VALUE type: it copies it around at least twice, maybe three times! The first copy is made when the parameter is loaded from the file. The second copy is made when the parameter is prepared to be sent to the database engine.
Writing this is easy:
var V1, V2:Variant;
V1 := V2;
and works just fine for Integer and Date and small Strings, but when V2 is a 600 MB Variant array, that assignment apparently makes a full copy! Now think about the memory space available to a 32-bit application that's not running in "3G" mode. Only 2 GB of address space is available. Some of that space is reserved, some is used by the executable itself, then there are the libraries, and then there's some space reserved for the memory manager. After making the first 600 MB allocation, there just might not be enough contiguous address space left to allocate another 600 MB buffer! Because of this it's easy to blame the memory manager, but then again, why exactly does the DB layer need another copy of the 600 MB monster?
One possible fix
Try splitting the file into smaller, more manageable chunks. Set up the database table with three fields: ID_DOCUMENT, SEQUENCE, DOCUMENT, and make the primary key on the table (ID_DOCUMENT, SEQUENCE); a SQL sketch of this table follows the procedure below. Next try this:
procedure UploadBigFile(id_doc:Integer; sFilePath: String);
var FS:TFileStream;
MS:TMemoryStream;
AvailableSize, ReadNow:Int64;
Sequence:Integer;
const MaxPerSequence = 10 * 1024 * 1024; // 10 Mb
begin
FS := TFileStream.Create(sFilePath, fmOpenRead);
try
AvailableSize := FS.Size;
Sequence := 0;
while AvailableSize > 0 do
begin
if AvailableSize > MaxPerSequence then
begin
ReadNow := MaxPerSequence;
Dec(AvailableSize, MaxPerSequence);
end
else
begin
ReadNow := AvailableSize;
AvailableSize := 0;
end;
Inc(Sequence); // Prep sequence; First sequence into DB will be "1"
MS := TMemoryStream.Create;
try
MS.CopyFrom(FS, ReadNow);
sqlInsertDoc.ParamByName('ID_DOC_FILE').AsInteger := id_doc;
sqlInsertDoc.ParamByName('SEQUENCE').AsInteger := sequence;
sqlInsertDoc.ParamByName('DOCUMENT').LoadFromStream(MS, ftblob);
sqlInsertDoc.Execute;
finally MS.Free;
end;
end;
finally FS.Free;
end;
sqlInsertDoc.Close;
end;
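For reference, the chunk table described above could be created with something like this (a sketch; the table name is assumed, the column names follow the text, and ID_DOCUMENT corresponds to the ID_DOC_FILE parameter used in the procedure):
CREATE TABLE dbo.DOC_CHUNKS (
    ID_DOCUMENT int            NOT NULL,
    SEQUENCE    int            NOT NULL,
    DOCUMENT    varbinary(max) NOT NULL,
    CONSTRAINT PK_DOC_CHUNKS PRIMARY KEY (ID_DOCUMENT, SEQUENCE)
);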
You could loop through the byte stream of the object you are trying to insert and essentially buffer a piece of it at a time into your database until you have your entire object stored.
I would take a look at the Buffer.BlockCopy() method if you're using .NET
Off the top of my head, the method to parse your file could look something like this:
var file = new FileStream(@"c:\file.exe", FileMode.Open, FileAccess.Read);
byte[] fileBytes = new byte[file.Length];
file.Read(fileBytes, 0, fileBytes.Length);   // read the whole file into memory

byte[] buffer = new byte[100];
for (int i = 0; i < fileBytes.Length; i += 100)
{
    int count = Math.Min(100, fileBytes.Length - i);   // the last chunk may be shorter
    Buffer.BlockCopy(fileBytes, i, buffer, 0, count);
    // Do database processing on the first "count" bytes of "buffer"
}
file.Close();
Here is an example that reads a disk file and saves it into a FILESTREAM column. (It assumes that you already have the transaction context and the file path in the variables "txContext" and "filePath".)
'Open the FILESTREAM data file for writing
Dim fs As New SqlFileStream(filePath, txContext, FileAccess.Write)
'Open the source file for reading
Dim localFile As New FileStream("C:\temp\microsoftmouse.jpg",
FileMode.Open,
FileAccess.Read)
'Start transferring data from the source file to FILESTREAM data file
Dim bw As New BinaryWriter(fs)
Const bufferSize As Integer = 4096
Dim buffer As Byte() = New Byte(bufferSize) {}
Dim bytes As Integer = localFile.Read(buffer, 0, bufferSize)
While bytes > 0
bw.Write(buffer, 0, bytes)
bw.Flush()
bytes = localFile.Read(buffer, 0, bufferSize)
End While
'Close the files
bw.Close()
localFile.Close()
fs.Close()
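The file path and transaction context that the snippet assumes are usually obtained inside an open transaction with something like this (table and column names are placeholders):
BEGIN TRANSACTION;

SELECT DOCUMENT.PathName()                  AS filePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext
FROM dbo.Documents
WHERE ID_DOC_FILE = 1;

-- ...stream the data with SqlFileStream, then COMMIT TRANSACTION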
You're probably running into memory fragmentation issues somewhere. Playing around with really large blocks of memory, especially in any situation where they might need to be reallocated, tends to cause out-of-memory errors even when in theory you have enough memory to do the job. If it needs a 600 MB block and it can't find a hole that's 600 MB wide, that's it: out of memory.
While I have never tried it, my inclination for a workaround would be to create a very minimal program that does ONLY this one operation. Keep it absolutely as simple as possible to keep the memory allocation minimal. When faced with a risky operation like this, call the external program to do the job. The program runs, does the one operation and exits. The point is that the new program runs in its own address space.
The only true fix is 64-bit, and we don't have that option yet.
I recently experienced a similar problem while running DBCC CHECKDB on a very large table. I would get this error:
There is insufficient system memory in resource pool 'internal' to run this query.
This was on SQL Server 2008 R2 Express. The interesting thing was that I could control the occurrence of the error by adding or deleting a certain number of rows to the table.
After extensive research and discussions with various SQL Server experts, I came to the conclusion that the problem was a combination of memory pressure and the 1 GB memory limitation of SQL Server Express.
The recommendation given to me was to either:
1. acquire a machine with more memory and a licensed edition of SQL Server, or
2. partition the table into sizeable chunks that DBCC CHECKDB can handle (one way to spread the work is sketched below).
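A sketch of running the consistency checks object by object instead of as one big DBCC CHECKDB (the table name is a placeholder):
-- Check one large table at a time
DBCC CHECKTABLE ('dbo.BigTable') WITH NO_INFOMSGS;

-- Allocation and catalog checks for the rest of the database
DBCC CHECKALLOC WITH NO_INFOMSGS;
DBCC CHECKCATALOG;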
Due to the complicated nature of parsing these files into the FILESTREAM object, I would recommend the filesystem method: simply use SQL Server to store the locations of the files.
"While there are no limitations on the number of databases or users supported, it is limited to using one processor, 1 GB memory and 4 GB database files (10 GB database files from SQL Server Express 2008 R2)." It is not the size of the database files that is the issue here, but the 1 GB memory limit. Try splitting the 600 MB+ file before putting it into the stream.
In my app I am creating a real-time trace (not sure how yet, but I am!), and for the sp_trace_create function in SQL Server I know that @maxfilesize defaults to 5; but in my app the trace is going to be stopped whenever the user wants to stop it... any ideas how this can be done?
Because I don't want to have to save the files... I'm not sure how the rollover works?
Right now I'm using a timer loop that queries the database with all the specified events and a maximum file size of 1 (it usually doesn't take more than about 2 seconds), merges the result with the old data in my dgview, and deletes the original file. This goes round and round until the user tells it to stop, which stops the timer from querying the database. Not a solid method, but I guess it's a start! All I need now is to find out the data types of the columns, because when I'm setting values in the filters they need to go in as the data type matching the column... anyone have any clue where I can get a list of the data types? MSDN has the list of columns but not their types...
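(Regarding the column data types: if it helps, they can be listed from the sys.trace_columns catalog view, for example:)
SELECT trace_column_id, name, type_name, max_size
FROM sys.trace_columns
ORDER BY trace_column_id;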
To start a trace with file rollover, instead of stopping at a maximum size, start the trace like so:
exec @rc = sp_trace_create @TraceID output, 2, N'InsertFileNameHere', @maxfilesize, NULL
where @maxfilesize defines the size (in MB) a file reaches before a new rollover file is created.
WARNING: be very careful about performing unlimited tracing. If you fill up a production disc, it's your head not mine!
You can stop a running trace like so:
EXEC sp_trace_setstatus @ID, 0  -- stop the trace
EXEC sp_trace_setstatus @ID, 2  -- close it and delete its definition from the server
where @ID is the ID of the trace you want to stop.
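If you don't have the trace ID handy, it can be looked up from sys.traces, for example (the file name is a placeholder):
SELECT id, path, max_size, status
FROM sys.traces
WHERE path LIKE N'%InsertFileNameHere%';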
See this post.
According to the documentation, what you want to do is not possible:
[ @maxfilesize = ] max_file_size
Specifies the maximum size in megabytes (MB) a trace file can grow. max_file_size is bigint, with a default value of 5.
If this parameter is specified without the TRACE_FILE_ROLLOVER option, the trace stops recording to the file when the disk space used exceeds the amount specified by max_file_size.
I don't see why you can't just cycle through the files and load them into a table or your app. Shouldn't be that hard.
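For example, the rollover files can be pulled into a table with fn_trace_gettable (the path is a placeholder); passing DEFAULT for the second argument makes it follow the rollover sequence automatically:
SELECT *
INTO dbo.MyTraceData
FROM sys.fn_trace_gettable(N'C:\Traces\MyTrace.trc', DEFAULT);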