I am in the position of "acting" DBA since our previous DBA left, and I haven't had any formal training in SQL Server. I've searched on TempDB autogrowth terms and can find a lot of information about how to turn it on, how to shrink databases, etc., but nothing specifically about the issue I have, whether it's normal, or whether there's a bunk setting somewhere. I'm hoping some of you Stack Overflow legends can shed some light on my issue.
Starting from a fresh server reboot with a zero-size TempDB:
An action happens where TempDB needs 100MB, so it grows by 100MB. OK.
Another action happens where it needs more, and it grows by 110MB. Eh?? 10% growth should be 10MB, not 110MB.
This continues every time it needs to grow: small at first, but eventually it's growing by gigabytes each time. Because it grew by 4+GB, we quickly chewed up all 150GB of space on the drive, and our ERP system collapsed in a heap and caused an outage while the server was restarted.
Surely this can't be normal?
Some background information on the server:
Autogrowth Settings
As far as I can tell, these are the only settings I can change. I can understand that 100MB -> 110MB is a 10% jump, but the growth appears to be cumulative; in other words, the file size is now 210MB, and when it grows again it will be 331MB. So you can see how I can quickly burn through disk space.
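For reference, the file settings themselves can be checked with a small query like the one below (only a sketch, assuming SQL Server 2005 or later); size and growth come back in 8 KB pages, hence the conversions:

USE tempdb;
SELECT name,
       type_desc,
       size * 8 / 1024 AS current_size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + ' %'             -- growth stored as a percentage
            ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'  -- growth stored in 8 KB pages
       END AS autogrowth
FROM sys.database_files;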
DB Growth Log
Am I interpreting this incorrectly? Shouldn't 10% growth of a 100MB file result in a 110MB file, not 210MB?
For some extra background, this is a single SQL Server instance with around 70 databases, 20 of which are "active" because they are used by our ERP system. The others are historical. Each business division gets its own database, so we run a single ERP system but it connects to 20 different databases, essentially giving us 20 full database installations of it on the server.
Edit (for Craig's response) - To clarify the growth and the use of MB and GB, this is the growth history from the logs:
GrowthInMB
125
137
151
166
183
201
221
243
267
294
324
356
392
431
474
522
574
631
695
764
841
925
1017
1119
1231
1354
1489
1638
1802
1982
2181
2399
2639
2903
3193
3512
3864
4250
4675
5143
So at the end of this, my tempdb is 60GB, not 4.4GB. If the growth is 10% each time, the file should just be 10% larger than before, so 100MB grows to 110MB; but 10% growth is instead resulting in 100MB + 110MB = 210MB.
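One place this kind of growth history can be pulled from is the default trace; the sketch below (assuming the default trace is still enabled, which it is out of the box) lists recent autogrow events, where IntegerData is the growth in 8 KB pages:

DECLARE @tracefile nvarchar(260);
SELECT @tracefile = path FROM sys.traces WHERE is_default = 1;

SELECT DatabaseName,
       FileName,
       StartTime,
       IntegerData * 8 / 1024 AS GrowthInMB   -- IntegerData is the growth in 8 KB pages
FROM sys.fn_trace_gettable(@tracefile, DEFAULT)
WHERE EventClass IN (92, 93)                  -- 92 = Data File Auto Grow, 93 = Log File Auto Grow
ORDER BY StartTime;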
Related
I have text data in DynamoDB tables. There are 8 tables in total, each with a maximum size of 256KB - 300KB, which makes the total size of the DB about 2MB - 2.5MB.
I am reading the tables from an app, making nearly 50 reads across all the tables at any given instant. So that means at any given instant the total reads will be about 100MB in size.
So will the 25 RCUs provided by the AWS DynamoDB free tier be sufficient to carry out the above task, or will I be billed at the end of the month?
I need 50 reads for each table, with eventually consistent reads.
Then you need 400 (50 * 8) eventually consistent reads per second. 1 RCU provides 2 eventually consistent reads per second, which means that performing 400 eventually consistent reads will require 200 RCUs, putting you way over your 25 RCUs.
Consequently, you will have to pay for the excess RCUs that you use.
I am trying to figure out if the current redo log size I have right now is optimal. Here is what I have done:
I used the Oracle documentation to find most of this information:
http://www.oracle.com/technetwork/database/availability/async-2587521.pdf
I used the query below:
select thread#, sequence#,
       blocks*block_size/1024/1024 "MB",
       (next_time-first_time)*86400 "sec",
       (blocks*block_size/1024/1024)/((next_time-first_time)*86400) "MB/s"
from V$ARCHIVED_LOG
where ((next_time-first_time)*86400 <> 0)
  and first_time between to_date('2020/03/28 08:00:00','YYYY/MM/DD HH24:MI:SS')
                      and to_date('2020/05/28 11:00:00','YYYY/MM/DD HH24:MI:SS')
  and dest_id = 3
order by first_time
From the results, I calculated the average rate, which is 7.67 MB/s, and the maximum rate, which is 245 MB/s.
According to the Oracle documentation
See table on recommended redo log group size
Using this query
select * from V$LOGFILE a, V$LOG b where a.GROUP# = b.GROUP#
I discovered that I have 15 groups of 2 GB, so the redo log group size is 30 GB.
Oracle says that "In general we recommend adding an additional 30% on top of the peak rate", so that would mean I should plan for 245 MB/s * 1.3 = 318.5 MB/s. Here is where I get a little lost. Do I use the table in the picture I attached? If so, would I be expected to have a redo log group size of 64GB? Or am I making a connection where there should not be one?
Finally, I did also
select optimal_logfile_size from v$instance_recovery
and that returns 14 GB.
I am having trouble making all the connections and confirming that my redo log size is adequate.
If you have 15 groups of 2GB each, then your group size is 2GB, not 30GB.
The idea is not to switch logs too often: no more than once every 20 minutes or so. So look at how often your log switches are happening. If you still have more than 20 minutes between switches, you are probably fine. If switches ever happen more frequently than that, then you might need bigger logs.
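As a rough sketch (standard V$LOG_HISTORY columns; adjust the window to taste), the switch frequency per hour can be checked with something like:

select to_char(first_time, 'YYYY-MM-DD HH24') "Hour",
       count(*) "Switches"
from V$LOG_HISTORY
where first_time > sysdate - 7
group by to_char(first_time, 'YYYY-MM-DD HH24')
order by 1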
Based on the calculations you performed, a max rate of ~319MB/s would indicate that individual redo log files should be 64GB, and you want a minimum (per best practice) of three redo log groups. That said - how much of your time is spent at peak load? If only a small amount of time per day (your average transaction rate is much lower) then this may be overkill. You don't want log switches to happen too far apart, either, or your ability to do point-in-time recovery after a redo log failure could be compromised.
It may make more sense for you to have log files that are 16GB and maintain a steady switch rate on average and accept a higher switch rate during peak load. You might need more individual log files that way to handle the same total transactions per minute without waiting for incomplete log switches: say three groups of 64GB each vs. 12 groups of 16GB each. The same total log capacity but in smaller chunks for switches and archive logging. That's probably why you have 15 groups of 2GB each configured now...
Ideally the redo logs should not be switching many times per hour. Check the redo log switch frequency and increase the size of the redo logs accordingly.
I found a useful link below that can be used to get all the redo-log-related details:
Find Redo Log Size / Switch Frequency / Location in Oracle
I have a strange situation with a SQL Server database where the actual data in the table is roughly 320 MiB. This is determined by summing up the DATALENGTH of all the columns, which ignores fragmentation, index space and other SQL Server internal overhead. The problem, though, is that the table is roughly 40 GiB in size and it's growing at an alarming rate, very disproportionate to the amount of data (in bytes or rows) that was inserted.
I used the sys.dm_db_index_physical_stats function to look at the physical data and the roughly 40 GiB of data is tied up in LOB_DATA.
Most of the 320 MiB that makes up the table contents is of type ntext. Now, my question is how come SQL Server has allocated 40 GiB of LOB_DATA when there's only roughly 310 MiB of ntext data.
Will the problem go away if we convert the column to nvarchar(max)? Are there any storage engine specifics regarding ntext and LOB_DATA that are causing the LOB_DATA pages not to be reclaimed? Why is it growing at such a disproportionate rate relative to the amount of changes being made?
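For reference, the per-allocation-unit breakdown mentioned above can be produced with a query along these lines (only a sketch; dbo.MyTable is a placeholder for the real table name):

SELECT index_id,
       alloc_unit_type_desc,            -- IN_ROW_DATA, LOB_DATA or ROW_OVERFLOW_DATA
       page_count,
       page_count * 8 / 1024 AS size_mb,
       avg_page_space_used_in_percent   -- low values here mean mostly-empty pages
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'DETAILED');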
I have to build an application that will check for changes to 35 items each second. Each item has 3 values that fit into 5 bytes each, so 15 bytes per item. The values will not change every second, but there isn't a pattern: maybe they change continuously or they stall for a while ...
So I did a small calculation and I got that storing all the fields each second on a relational database (SQL) I will have:
35 items * 15 bytes * 60 seconds * 60 minutes * 24 hours * 365 days = 16.5 GB a year.
This is too much for an SQL database. What would you do to reduce the size of the data? I was thinking of storing the data only when there is a change, but then you need to store when the change happened, and if the data changes too often this approach can require more space than the other one.
I don't know if there are other repositories other than SQL databases that fit better with my requirements.
What do you think?
EDIT: More information.
There is no relation between the data other than any I could create to save space. I just need to store this data and query it. The data could look like this (putting it all in one table and saving the data each second):
Timestamp Item1A Item1B Item1C Item2A Item2B ....
whatever 1.33 2.33 1.04 12.22 1.22
whatever 1.73 2.33 1.04 12.23 1.32
whatever 1.23 2.33 1.34 12.22 1.22
whatever 1.33 2.31 1.04 12.22 1.21
I feel there must be better solutions than this approach ...
EDIT 2:
I will usually query the values of a single item over time; I usually won't query data from more than one item ...
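To make that concrete, a narrow one-row-per-item-per-reading layout is one option; the sketch below is purely illustrative (the table and column names are made up), with a primary key on (item_id, ts) so that querying a single item over a time range stays cheap:

CREATE TABLE item_reading (
    item_id  smallint     NOT NULL,
    ts       timestamp    NOT NULL,
    value_a  numeric(9,2) NOT NULL,
    value_b  numeric(9,2) NOT NULL,
    value_c  numeric(9,2) NOT NULL,
    PRIMARY KEY (item_id, ts)
);

-- Typical query: one item's history over a time range.
SELECT ts, value_a, value_b, value_c
FROM item_reading
WHERE item_id = 7
  AND ts BETWEEN timestamp '2020-01-01 00:00:00' AND timestamp '2020-01-02 00:00:00'
ORDER BY ts;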
This is too much for an SQL database
Since when is it too much?
That's really peanuts for almost any RDBMS out there (~17GB of data every year).
MySQL can do it, so can PostgreSQL, Firebird and plenty of others, though not the likes of SQLite. I'd pick PostgreSQL myself.
Having SQL databases with hundreds of TB of data is not that uncommon these days, so 17GB is nothing to think about, really. Let alone 170GB in 10 years (with the machines of the time).
Even if it gets to 30GB a year to account for other data and indexes, that's still OK for an SQL database.
Edit
Considering your structure, it looks solid to me: the minimal things that you need are already there and there are no extras that you don't need.
You can't really do better than that without using tricks that have more disadvantages than advantages.
I'm currently considering using compressed files instead of an SQL database. I will keep the post updated with the info I get.
I currently have a database that is 20GB in size.
I've run a few scripts which show each table's size (and other incredibly useful information such as index details), and the biggest table is 1.1 million records, which takes up 150MB of data. We have fewer than 50 tables, most of which take up less than 1MB of data.
After looking at the size of each table, I don't understand why the database shouldn't be around 1GB in size after a shrink. The amount of available free space that SQL Server (2005) reports is 0%. The log mode is set to simple. At this point my main concern is that I seem to have 19GB of unaccounted-for used space. Is there something else I should look at?
Normally I wouldn't care and would make this a passive research project except this particular situation calls for us to do a backup and restore on a weekly basis to put a copy on a satellite (which has no internet, so it must be done manually). I'd much rather copy 1GB (or even if it were down to 5GB!) than 20GB of data each week.
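One quick sanity check is comparing the allocated size of each file with the space actually used inside it; a minimal sketch (run inside the database in question) looks like this, with both values reported in 8 KB pages:

SELECT name,
       type_desc,
       size * 8 / 1024 AS allocated_mb,
       FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024 AS used_mb
FROM sys.database_files;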
sp_spaceused reports the following:

database_name          database_size   unallocated space
Navigator-Production   19184.56 MB     3.02 MB

And the second part of it:

reserved       data           index_size    unused
19640872 KB    19512112 KB    108184 KB     20576 KB
I've also found a few other scripts (such as the ones from two of the server database size questions here); they all report the same information found either above or below.
The script I am using is from SqlTeam. Here is the header info:
* BigTables.sql
* Bill Graziano (SQLTeam.com)
* graz#<email removed>
* v1.11
The top few tables show this (table, rows, reserved space, data, index, unused, etc):
Activity 1143639 131 MB 89 MB 41768 KB 1648 KB 46% 1%
EventAttendance 883261 90 MB 58 MB 32264 KB 328 KB 54% 0%
Person 113437 31 MB 15 MB 15752 KB 912 KB 103% 3%
HouseholdMember 113443 12 MB 6 MB 5224 KB 432 KB 82% 4%
PostalAddress 48870 8 MB 6 MB 2200 KB 280 KB 36% 3%
The rest of the tables are either the same in size or smaller. No more than 50 tables.
Update 1:
- All tables use unique identifiers. Usually an int incremented by 1 per row.
I've also re-indexed everything.
I ran the DBCC shrink command, as well as updating the usage before and after, over and over. An interesting thing I found is that after I restarted the server and confirmed no one was using it (and no maintenance procs were running; this is a very new application, under a week old), every now and then the shrink would say something about data having changed. Googling yielded too few useful answers, with the obvious explanations not applying (it was 1am and I disconnected everyone, so it seems impossible that was really the case). The data was migrated via C# code which basically looked at another server and brought things over. The quantity of deletes, at this point in time, is probably under 50k rows. Even if those rows were the biggest rows, I would imagine that wouldn't be more than 100MB.
When I go to shrink via the GUI it reports 0% available to shrink, indicating that I've already gotten it as small as it thinks it can go.
Update 2:
sp_spaceused 'Activity' yields this (which seems right on the money):
name       rows      reserved    data       index_size   unused
Activity   1143639   134488 KB   91072 KB   41768 KB     1648 KB
Fill factor was 90.
All primary keys are ints.
Here is the command I used to 'updateusage':
DBCC UPDATEUSAGE(0);
Update 3:
Per Edosoft's request:
name    rows     total_pages   Size(Kb)
Image   111975   2407773       19262184
It appears as though the image table believes it's the 19GB portion.
I don't understand what this means though.
Is it really 19GB or is it misrepresented?
Update 4:
Talking to a co-worker, I found out that it's because of the pages, as someone else here has also suggested. The only index on the Image table is a clustered PK. Is this something I can fix, or do I just have to deal with it?
The regular script shows the Image table to be 6MB in size.
Update 5:
I think I'm just going to have to deal with it after further research. The images have been resized to roughly 2-5KB each; on a normal file system they wouldn't consume much space, but on SQL Server they seem to consume considerably more. The real answer, in the long run, will likely be separating that table into another partition or something similar.
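If the images really are only 2-5KB each, one option worth a look (a sketch; it only applies if the column is of type text, ntext or image) is the 'text in row' table option, which lets small LOB values be stored on the data page instead of on separate LOB pages:

-- Allow LOB values up to 7000 bytes to be stored in-row for the Image table.
EXEC sp_tableoption 'dbo.Image', 'text in row', '7000';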
Try this query:
-- total_pages is counted in 8 KB pages, so * 8192 / 1024 converts to KB
SELECT object_name(p.object_id) AS name, p.rows, a.total_pages,
       a.total_pages * 8192 / 1024 AS [Size(Kb)]
FROM sys.partitions p
INNER JOIN sys.allocation_units a
    ON p.partition_id = a.container_id
You may also want to update the usage in the systables before you run the query to make sure that they are accurate.
DECLARE @DbName NVARCHAR(128)
SET @DbName = DB_NAME(DB_ID())
DBCC UPDATEUSAGE(@DbName)
What is the fill factor you're using in your reindexing? It has to be high: from 90-100%, depending on the PK datatype.
If your fill factor is low, then you'll have a lot of half-empty pages which can't be shrunk down.
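If a low fill factor does turn out to be the culprit, a rebuild along these lines (the table name comes from the question above; the exact fill factor is a judgment call) repacks the pages more densely:

-- Rebuild all indexes on the Image table with a high fill factor (SQL Server 2005+).
ALTER INDEX ALL ON dbo.[Image] REBUILD WITH (FILLFACTOR = 95);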
Did you try the DBCC command to shrink the database? If you transfer all the data to an empty database, is it also 20GB?
A database uses a page-based file system, so you might be running into a lot of slack (empty space within partially filled pages) due to heavy row removal: if the DBMS expects rows to be inserted at that spot, it might be better to leave the spots open. Do you use uniqueidentifier-based PKs which have a clustered index?
You could try doing a database vacuum; this can often yield large space improvements if you have never done it before.
Hope this helps.
Have you checked the stats under the "Shrink Database" dialog? In SQL Server Management Studio (2005 / 2008), right-click the database, click Tasks -> Shrink -> Database. That'll show you how much space is allocated to the DB, and how much of that allocated space is currently unused.
Have you ensured that the space isn't being consumed by your transaction log? If you're in full recovery mode, the t-log won't be shrinkable until you perform a transaction log backup.
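A rough sketch for checking and, if needed, reclaiming log space (the log's logical file name and the backup path below are assumptions; check sys.database_files for the real name):

-- Log size and percent used for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- If the log is the culprit and the database is in full recovery,
-- back the log up first, then shrink the log file (target size in MB).
BACKUP LOG [Navigator-Production] TO DISK = N'D:\Backups\Navigator-Production_log.trn';
DBCC SHRINKFILE (N'Navigator-Production_log', 100);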