I am trying to figure out whether my current redo log size is optimal. Here is what I have done:
I used the Oracle documentation to find most of this information:
http://www.oracle.com/technetwork/database/availability/async-2587521.pdf
Using SQL, I ran the query below:
select thread#,sequence#,blocks*block_size/1024/1024 "MB",(next_time-first_time)*86400 "sec",
(blocks*block_size/1024/1024)/((next_time-first_time)*86400) "MB/s"
from V$ARCHIVED_LOG
where ((next_time-first_time)*86400<>0) and
first_time between to_date('2020/03/28 08:00:00','YYYY/MM/DD HH24:MI:SS')
and to_date('2020/05/28 11:00:00','YYYY/MM/DD HH24:MI:SS')
and dest_id=3
order by first_time
From the results, I calculated the average rate, 7.67 MB/s, and the maximum rate, 245 MB/s.
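For reference, both figures can also be computed in one pass; a minimal sketch over the same view and date range:
-- average and peak redo generation rate over the same window, in one query
select avg((blocks*block_size/1024/1024) / ((next_time-first_time)*86400)) "avg MB/s",
       max((blocks*block_size/1024/1024) / ((next_time-first_time)*86400)) "max MB/s"
from   v$archived_log
where  (next_time-first_time)*86400 <> 0
and    first_time between to_date('2020/03/28 08:00:00','YYYY/MM/DD HH24:MI:SS')
                       and to_date('2020/05/28 11:00:00','YYYY/MM/DD HH24:MI:SS')
and    dest_id = 3;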
According to the Oracle documentation:
[attached image: table of recommended redo log group sizes by peak redo rate]
Using this query
select * from V$LOGFILE a, V$LOG b where a.GROUP# = b.GROUP#
I discovered that I have 15 groups of 2 GB, so the redo log group size is 30 GB.
Oracle says that "In general we recommend adding an additional 30% on top of the peak rate", so that would mean a planning figure of 245 MB/s * 1.3 = 318.5 MB/s. Here is where I get a little lost. Do I use the table in the picture I attached? If so, should I expect a redo log group size of 64 GB? Or am I making a connection where there should not be one?
Finally, I also ran
select optimal_logfile_size from v$instance_recovery
and that returns 14 GB.
I am having trouble putting all of this together and confirming that my redo log size is adequate.
If you have 15 groups of 2GB each, then your group size is 2GB, not 30GB.
The idea is not to switch logs too often: no more often than about once every 20 minutes. So look at how often your log switches are happening. If you still have more than 20 minutes between switches, then you are probably fine. If you ever have switches more frequent than that, then you might need bigger logs.
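To see how often switches are happening, something like this sketch against V$LOG_HISTORY (counting switches per hour) should give you the picture:
-- one row per hour with the number of log switches in that hour
select to_char(first_time, 'YYYY-MM-DD HH24') "Hour",
       count(*)                               "Switches"
from   v$log_history
group by to_char(first_time, 'YYYY-MM-DD HH24')
order by 1;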
Based on the calculations you performed, a max rate of ~319MB/s would indicate that individual redo log files should be 64GB, and you want a minimum (per best practice) of three redo log groups. That said - how much of your time is spent at peak load? If only a small amount of time per day (your average transaction rate is much lower) then this may be overkill. You don't want log switches to happen too far apart, either, or your ability to do point-in-time recovery after a redo log failure could be compromised.
It may make more sense for you to have log files that are 16GB and maintain a steady switch rate on average and accept a higher switch rate during peak load. You might need more individual log files that way to handle the same total transactions per minute without waiting for incomplete log switches: say three groups of 64GB each vs. 12 groups of 16GB each. The same total log capacity but in smaller chunks for switches and archive logging. That's probably why you have 15 groups of 2GB each configured now...
Ideally there should not be frequent log switches; aim for no more than a few per hour. Check the redo log switch frequency and increase the size of the redo logs accordingly.
Below is a useful link that covers all the redo log related details:
Find Redo Log Size / Switch Frequency / Location in Oracle
My SSISDB is writing a large number of entries, especially in [internal].[event_messages] and [internal].[operation_messages].
I have already set the number of versions to keep and the log retention period to 5 each. After running the maintenance job, selecting the distinct dates in both those tables shows that there are only 6 dates left (including today), as one would expect. Still, the tables I mentioned above have about 6.5 million entries each and a total database size of 35 GB (again, for a retention period of 5 days).
In this particular package, I am using a lot of loops and I suspect that they are causing this rapid expansion.
Does anyone have an idea of how to reduce the number of operational and event messages written by the individual components? Or do you have an idea of what else might be causing this rate of growth? I have packages running on other machines for over a year with a retention period of 150 days and the size of the SSISDB is just about 1 GB.
Thank You!
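One thing worth ruling out here (a guess, not a diagnosis) is the server-wide SSISDB logging level: Performance and especially Verbose logging write far more rows to [internal].[event_messages] than Basic does. A minimal sketch to check it and, if needed, lower it:
-- Check the current server-wide logging level (1 = Basic is the default)
SELECT property_name, property_value
FROM   SSISDB.catalog.catalog_properties
WHERE  property_name = N'SERVER_LOGGING_LEVEL';

-- Lower it if it turns out to be Performance (2) or Verbose (3)
EXEC SSISDB.catalog.configure_catalog
     @property_name  = N'SERVER_LOGGING_LEVEL',
     @property_value = 1;   -- 0 = None, 1 = Basic, 2 = Performance, 3 = Verbose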
After my cluster has been running for a certain time (maybe a day, maybe two days), some queries become very slow, taking about 2-10 minutes to finish. When this happens I need to restart the whole cluster and the queries return to normal, but after some time the very slow queries come back.
The query response time depends on multiple factors, including:
1. Table size; if the table size grows over time, then response time will also increase
2. If it is the open source version, time spent in GC pauses, which in turn will depend on the number of objects/garbage present in the JVM heap
3. The number of concurrent queries being run
4. The amount of data that has overflowed to disk
You will need to describe your usage pattern of SnappyData in detail. Only then will it be possible to characterise the issue.
Some of the questions that should be answered are:
1. What is the cluster size?
2. What are the table sizes?
3. Are writes happening continuously on the tables, or are only queries being executed?
You can engage us on the Slack channel to provide information related to your cluster.
We run SQL Server 2008 R2. The database sizes vary from 20 to 500 GB, and all databases are in full recovery mode.
I find it hard to determine a good initial size for the transaction log files to prevent autogrowth. It depends on multiple factors, but can anyone give a rule of thumb for a good initial log size? Can I see how fast a log has grown over the last xx hours, or something like that?
Any advice on this subject is very welcome.. thanks!
You will need to collect metadata over a period of time to understand the growth pattern of your log files. The query below can be set up as a job to fetch the metadata and populate a table with it; you will then be able to work out what your initial size needs to be.
SELECT GETDATE() AS DateRun
,(size * 8.0)/1024.0 AS size_in_mb
, CASE
WHEN max_size = -1
THEN 9999999 -- Unlimited growth, so handle this how you want
ELSE (max_size * 8.0)/1024.0 END AS max_size_in_mb
FROM YOURDBNAMEHERE.sys.database_files
WHERE data_space_id = 0
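A minimal sketch of that collection job (the history table name is a placeholder):
-- Run once: a table to hold the collected log-file metadata
CREATE TABLE dbo.LogSizeHistory
(
    DateRun        datetime      NOT NULL,
    size_in_mb     decimal(18,2) NOT NULL,
    max_size_in_mb decimal(18,2) NOT NULL
);

-- Scheduled as an Agent job step, e.g. hourly
INSERT INTO dbo.LogSizeHistory (DateRun, size_in_mb, max_size_in_mb)
SELECT GETDATE() AS DateRun
      ,(size * 8.0)/1024.0 AS size_in_mb
      ,CASE
           WHEN max_size = -1 THEN 9999999  -- unlimited growth
           ELSE (max_size * 8.0)/1024.0
       END AS max_size_in_mb
FROM   YOURDBNAMEHERE.sys.database_files
WHERE  data_space_id = 0;  -- data_space_id = 0 returns log files only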
You want to avoid auto-grow because it can introduce latency and load spikes. Small auto-grow increments will also cause VLF fragmentation. Still, auto-grow should be enabled as a safety net.
The goal is to have it on but never have it kick in in practice. The secondary goal is not to waste space and to keep recovery times quick. Log files cannot be instant-initialized.
When I size a log, I size it so that the chance of an auto-grow event during normal operations is low. If I were forced to quantify that, I'd say a 10% chance per year is a pretty safe standard. I set the growth increment to a low value to avoid latency spikes (even timeouts) in case an auto-grow is required.
Determine how much log space is generated at most between two log backups. You probably want to measure the growth rate per hour at peak load and add a safety margin. You can measure the growth rate very crudely by taking note of the allocated space in the log file(s) one hour apart.
Some maintenance operations can also very quickly generate tons of log. You need to measure this as well.
The key is to measure peak log generation between two log backups according to your backup schedule.
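On 2008 R2 a simple way to take those readings is DBCC SQLPERF(LOGSPACE); a sketch that captures its output into a table so it can be sampled on a schedule (table name is a placeholder):
-- Run once: a table to hold samples of DBCC SQLPERF(LOGSPACE)
CREATE TABLE dbo.LogSpaceSample
(
    SampleTime      datetime NOT NULL DEFAULT GETDATE(),
    DatabaseName    sysname  NOT NULL,
    LogSizeMB       float    NOT NULL,
    LogSpaceUsedPct float    NOT NULL,
    Status          int      NOT NULL
);

-- Take a sample; schedule this hourly and around your log backups
INSERT INTO dbo.LogSpaceSample (DatabaseName, LogSizeMB, LogSpaceUsedPct, Status)
EXEC ('DBCC SQLPERF(LOGSPACE)');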
If you want a quicker, less scientific solution: Start with a small log, let it auto-grow by a rather small increment (256MB?) and look how big it got after a week. Then "reformat" the log to remove VLF fragmentation (shrink, then set the final size plus safety margin).
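The "reformat" step at the end could look roughly like this (logical log file name, target size, and growth increment are placeholders):
USE YourDatabase;
-- Shrink the log to collapse the many small VLFs created by repeated auto-grow
DBCC SHRINKFILE (N'YourDatabase_log', 1024);

-- Grow it back out to the final size plus safety margin, with a modest fixed growth increment
ALTER DATABASE YourDatabase
MODIFY FILE (NAME = YourDatabase_log, SIZE = 32GB, FILEGROWTH = 256MB);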
I have a large table which is both heavily read, and heavily written (append only actually).
I'd like to get an understanding of how the indexes are affecting write speed, ideally the duration spent updating them (vs the duration spent inserting), but otherwise some sort of feel for the resources used solely for the index maintenance.
Is this something that exists in SQL Server / Profiler somewhere?
Thanks.
Look at the various ...wait... columns in sys.dm_db_index_operational_stats. This will account for waits on locks and latches; however, it will not account for log write times. For log writes you can do some simple math based on row size (i.e. a new index that is 10 bytes wide on a table that is 100 bytes wide will add roughly 10% more log writes), since log write time is driven purely by the number of bytes written. The Log Flush... counters under the Databases performance object will measure the current overall database-wide log wait times.
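For example, a sketch of that kind of query, run in the database that owns the table (the table name is a placeholder):
SELECT i.name AS index_name,
       os.leaf_insert_count,          -- rows inserted at the leaf level
       os.leaf_update_count,
       os.page_latch_wait_count,
       os.page_latch_wait_in_ms,      -- waits on in-memory page latches
       os.page_io_latch_wait_in_ms,   -- waits on physical page reads
       os.row_lock_wait_in_ms
FROM   sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.YourBigTable'), NULL, NULL) AS os
JOIN   sys.indexes AS i
       ON i.object_id = os.object_id AND i.index_id = os.index_id
ORDER BY os.leaf_insert_count DESC;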
Ultimately, the best measurement is a baseline comparison under a well-controlled test load.
I don't believe there is a way to find the duration of the updates, but you can check the last user update on the index by querying sys.dm_db_index_usage_stats. This will give you some key information on how often the index is read and updated, along with datetime stamps for that activity.
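Something along these lines (a sketch; the table name is a placeholder):
SELECT i.name AS index_name,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates,      -- number of insert/update/delete operations that maintained the index
       us.last_user_update
FROM   sys.dm_db_index_usage_stats AS us
JOIN   sys.indexes AS i
       ON i.object_id = us.object_id AND i.index_id = us.index_id
WHERE  us.database_id = DB_ID()
AND    us.object_id   = OBJECT_ID('dbo.YourBigTable');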
I have a database containing records collected every 0.1 seconds, and I need to time-average a given day's data into 20-minute buckets. So I need to return a day's worth of data averaged into 20-minute intervals, which is 24*3 = 72 values.
Currently I do a separate AVG call to the database for each 20-minute period within the day, which is 24*3 calls. My connection to the database seems a little slow (it is remote) and it takes ~5 minutes to do all the averages. Would it be faster to do a single query in which I access the entire day's worth of data then average it to every 20 minutes? If it helps to answer the question, I have to do some arithmetic to the data before averaging, namely multiplying several table columns.
You can calculate the number of minutes since midnight like:
datepart(hh,datecolumn)*60 + datepart(mi,datecolumn)
If you divide that by 20, you get the number of the 20 minute interval. For example, 00:10 would fall in interval 0, 00:30 in interval 1, and 15:30 in interval 46, and so on. With this formula, you can group on 20 minute intervals like:
select
(datepart(hh,datecolumn)*60 + datepart(mi,datecolumn)) / 20 as IntervalNr
, avg(value)
from YourTable
group by (datepart(hh,datecolumn)*60 + datepart(mi,datecolumn)) / 20
You can do math inside the avg call, like:
avg(col1 * col2 - col3 / col4)
In general reducing the number of queries is a good idea. Aggregate and do whatever arithmetic/filtering/grouping you can in the query (i.e. in the database), and then do 'iterative' computations on the server side (e.g. in PHP).
To be sure whether it would be faster or not, you should measure it.
However, it should be faster: since you have a slow connection to the database, the number of round trips has a bigger impact on the total execution time.
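Putting the interval formula and the in-query arithmetic from the answers above together, the whole day can come back in one round trip; a sketch (column and table names, and the example day, are placeholders):
declare @day datetime = '20200528';   -- hypothetical example day

select (datepart(hh, datecolumn) * 60 + datepart(mi, datecolumn)) / 20 as IntervalNr
     , avg(col1 * col2 - col3 / col4)                                  as AvgValue
from   YourTable
where  datecolumn >= @day
and    datecolumn <  dateadd(day, 1, @day)
group by (datepart(hh, datecolumn) * 60 + datepart(mi, datecolumn)) / 20
order by IntervalNr;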
How about a stored procedure on your database? If your database engine doesn't support one, how about having a script or something do the math and populate a separate 'averages' table on your database server? Then you only have to read the averages from the remote client once a day.
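A sketch of that idea, assuming SQL Server and reusing the interval formula above (names are placeholders): a scheduled job aggregates the previous day into a small summary table, and the remote client reads only that table.
create table dbo.DailyAverages
(
    AvgDate    date  not null,
    IntervalNr int   not null,   -- 0..71, one row per 20-minute bucket
    AvgValue   float not null,
    primary key (AvgDate, IntervalNr)
);

-- scheduled once a day, shortly after midnight, for the previous day
insert into dbo.DailyAverages (AvgDate, IntervalNr, AvgValue)
select cast(datecolumn as date)
     , (datepart(hh, datecolumn) * 60 + datepart(mi, datecolumn)) / 20
     , avg(col1 * col2 - col3 / col4)
from   YourTable
where  datecolumn >= cast(dateadd(day, -1, getdate()) as date)
and    datecolumn <  cast(getdate() as date)
group by cast(datecolumn as date)
       , (datepart(hh, datecolumn) * 60 + datepart(mi, datecolumn)) / 20;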
Computation in one single query would be slightly faster. Think of the overhead of multiple requests: setting up the connection, parsing the query or loading the stored procedure, and so on.
But also make sure that you have accurate indexes, which can result in a huge performance increase. Some operations on huge databases can take anywhere from minutes to hours.
If you are sending a lot of data, and the connection is the bottleneck, how and when you group and send the data doesn't matter. There is no good way to send 100MB every 10 minutes over a 56k modem. Figure out the size of your data and bandwidth and be sure you can even send it.
That said:
First be certain the network is the bottleneck. If so, try to work with a smaller data set if possible, and test different scenarios. In general, 1 large record set will use less bandwidth than 2 recordsets that are half the size.
If possible add columns to your table and compute and store the column product and interval index (see Andomar's post) every time you post data to the database.
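Assuming SQL Server, one way to store both at write time without changing the insert code is persisted computed columns; a sketch reusing the column names from the earlier examples:
alter table YourTable
    add ColProduct as (col1 * col2 - col3 / col4) persisted;   -- precomputed arithmetic

alter table YourTable
    add IntervalNr as ((datepart(hh, datecolumn) * 60 + datepart(mi, datecolumn)) / 20) persisted;   -- 20-minute bucket number
The aggregation then reduces to avg(ColProduct) grouped by IntervalNr.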