Relpages and reltuples under MVCC

If I understand correctly, under MVCC (multi-version concurrency control), dead tuples are left in the page until VACUUM comes along and marks them "unused", and until VACUUM FULL comes along and reorganizes them to defragment the space -- so we use less space for the same data.
I have a table, which in one environment that hasn't done vacuum full has around:
SELECT relpages, reltuples from pg_class where relname='pg_toast_16450';
 relpages  |  reltuples
-----------+---------------
 544447814 | 6.394397e+06
Another environment that has undergone VACUUM FULL has:
SELECT relpages, reltuples from pg_class where relname='pg_toast_16450';
 relpages | reltuples
----------+---------------
  2476625 | 4.439228e+06
It looks like relpages drops dramatically, which matches my understanding. However, reltuples does not (relpages changes by roughly 220x, yet reltuples only by about 1.4x). Does that mean reltuples does not include dead tuples? If so, does the query planner, which uses reltuples to build query plans, have a way to account for dead tuples?

reltuples is an estimate for the number of live rows in a table. As the documentation says,
It is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.
So the number can always be somewhat off, depending on how long ago the last such command (perhaps triggered by autovacuum) ran on the table; VACUUM (FULL) would fix that.
However, there is a second thing to consider, since this is a TOAST table: it may contain entries that belong to dead rows in the main table. The TOAST entries of a dead row need not be dead themselves, but VACUUM (FULL) would not copy them, so the number can shrink further. I suspect that this is what happened here, because the number dropped by more than the 10% I would otherwise expect.
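If you want to cross-check, a query along these lines compares the planner's estimates in pg_class with the live/dead tuple counters from the statistics collector (the TOAST table name is taken from the question):
-- planner estimates vs. tracked tuple counters for the table in question
SELECT c.relname,
       c.relpages,
       c.reltuples,
       s.n_live_tup,
       s.n_dead_tup,
       s.last_vacuum,
       s.last_autovacuum
FROM pg_class c
LEFT JOIN pg_stat_all_tables s ON s.relid = c.oid
WHERE c.relname = 'pg_toast_16450';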

Related

Postgres row_number() doubling table size roughly every 24 hours

I have an Assets table with ~165,000 rows in it. The Assets make up "Collections", and each Collection may have ~10,000 items. I want to save a "rank" for each item so users can see where a given asset ranks within its collection.
The rank can change (based on an internal score), so it needs to be updated periodically (a few times an hour).
That's currently being done on a per-collection basis with this:
UPDATE assets a
SET    rank = a2.seqnum
FROM  (SELECT a2.*,
              row_number() OVER (ORDER BY elo_rating DESC) AS seqnum
       FROM   assets a2
       WHERE  a2.collection_id = #{collection_id}) a2
WHERE  a2.id = a.id;
However, that's causing the size of the table to double (i.e. 1GB to 2GB) roughly every 24 hours.
A VACUUM FULL clears this up, but that doesn't feel like a real solution.
Can the query be adjusted to not create so much (what I assume is) temporary storage?
Running PostgreSQL 13.
Every update writes a new row version in Postgres. So (aside from TOASTed columns) updating every row in the table roughly doubles its size. That's what you observe. Dead tuples can later be cleaned up to shrink the physical size of the table - that's what VACUUM FULL does, expensively. See:
Are TOAST rows written for UPDATEs not changing the TOASTable column?
Alternatively, you might just not run VACUUM FULL and keep the table at ~ twice its minimal physical size. If you run plain VACUUM (without FULL!) often enough - and if you don't have long-running transactions blocking it - Postgres will have marked dead tuples in the free space map by the time the next UPDATE kicks in and can reuse that disk space, so the table stays at ~ twice its minimal size. That's probably cheaper than shrinking and re-growing the table all the time, as the most expensive part is typically to physically grow the table. Be sure to have aggressive autovacuum settings for the table. See:
Aggressive Autovacuum on PostgreSQL
VACUUM returning disk space to operating system
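For illustration, per-table autovacuum settings can be tightened roughly like this; the exact values are placeholder assumptions, not part of the advice above:
-- make autovacuum fire earlier and run unthrottled for this table (values are illustrative)
ALTER TABLE assets SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% of rows are dead
    autovacuum_vacuum_cost_delay   = 0      -- do not throttle autovacuum on this table
);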
Probably better yet, break out the ranking into a minimal separate 1:1 table (a.k.a. "vertical partitioning"), so that only those minimal rows have to be written "a few times an hour" - probably including the elo_rating you mention in the query, which seems to change at least as frequently (?). A sketch follows after the links below.
(LEFT) JOIN to the main table in queries. While that adds some overhead, it may still be (substantially) cheaper overall. It depends on the complete picture, most importantly the average row size in table assets and the typical load apart from your costly updates.
See:
Many columns vs few tables - performance wise
UPDATE or INSERT & DELETE? Which is better for storage / performance with large text columns?
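A minimal sketch of the separate ranking table suggested above; the table and column names are illustrative and it assumes assets.id is the primary key:
-- narrow 1:1 table holding only the volatile columns
CREATE TABLE asset_rank (
    asset_id   bigint PRIMARY KEY REFERENCES assets (id),
    elo_rating numeric NOT NULL,
    rank       integer
);

-- the frequent update now rewrites only these small rows
UPDATE asset_rank r
SET    rank = s.seqnum
FROM  (SELECT ar.asset_id,
              row_number() OVER (ORDER BY ar.elo_rating DESC) AS seqnum
       FROM   asset_rank ar
       JOIN   assets a ON a.id = ar.asset_id
       WHERE  a.collection_id = 123  -- placeholder for #{collection_id}
      ) s
WHERE  r.asset_id = s.asset_id;

-- queries needing the rank join back, e.g.:
-- SELECT a.*, r.rank FROM assets a LEFT JOIN asset_rank r ON r.asset_id = a.id;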

Interpreting SQL Server table statistics

I write queries and procedures, I have no experience as a DB Admin and I am not in a position to be such. I work with hundreds of tables and certain older tables are difficult to work with. I suspect that statistics are a problem but the DBA states that isn't the case.
I don't know how to interpret statistics or even which ones I should look at. As an example, I am currently JOINing 2 tables, it is a simple JOIN that uses an index.
It returns just under 500 rows in 4 columns. It runs very quickly but not when in production with thousands of runs a day. My estimated and actual rows on this JOIN are off by 462%.
I have distilled this stored procedure down to a lot of very basic temp tables to locate the problem areas and it appears to be 2 tables, this example is one of them.
What I want is to know is which commands to run and what statistics to look at to take to the DBA to discuss the specific problem at hand. I do not want to be confrontational but informational. I have a very good professional relationship with this DBA but he is very black and white with his policies so I may not get anywhere with it in the end, but then I can also take that to my lead if I get stonewalled.
I ran a DBCC SHOW_STATISTICS on the table's index. I am not sure if this is the data I need or what I am really looking at. I would really like to know where to start with this. I have googled but all the pages I read are very geared towards DBAs and assume prior knowledge in areas I don't have.
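The command I ran looked something like this (table and index names obfuscated):
DBCC SHOW_STATISTICS ('Billing.dbo.Charges', 'IX_Charges_ForeignID_ChargeCode');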
Below is a sample of my JOIN, obfuscated. My JOIN is on a temp table; the first two conditions are needed for the index, and when the date conditions are removed the JOIN actually becomes much worse, with 10x the reads:
SELECT
    x.UniqueID,
    x.ChargeCode,
    x.dtDate,
    x.uniqueForeignID
INTO
    #AnotherTempTable
FROM
    Billing.dbo.Charges x
JOIN
    #temptable y ON x.uniqueForeignID = y.uniqueID
    AND x.ChargeCode = y.ChargeCode
    AND @PostMonthStart <= x.dtDate
    AND x.dtDate < @PostMonthEnd;
The JOIN above is part of a new plan in which I have been dissecting all the data needed into temp tables to determine the root cause of the high CPU and reads in production. I also captured a list of all the statements that are executing, sorted by the number of reads (not shown here); the second row is this example query, but there are others with similar issues. The execution plan operations for the plan prior to my updates were captured as well (not shown here).
While the new plan has better run time and closer estimates, I worry that I am still going to run into issues if the statistics are off. If I am completely off-base, please tell me and point me in the right direction, I will gladly bark up a different tree if I am making incorrect assumptions.
The first table returned shows some general information. You can see the statistics on this index were last updated 12/25/2019 at 10:19 PM. As of the writing of this answer, that is yesterday evening, so stats were updated recently. That is likely to be some kind of evening maintenance, but it could also be a threshold of data modifications that triggered an automatic statistics update.
There were 222,596,063 rows in the table at the time the statistics were sampled. The statistics update sampled 626,452 of these rows, so the sample rate is 0.2%. This sample size was likely the default sample rate used by a simple update statistics MyTable command.
A sample rate of 0.2% is fast to calculate but can lead to very bad estimates-- especially if an index is being used to support a foreign key. For example, a parent/child relationship may have a ParentKey column on the child table. A low statistics sample rate will result in very high estimates per parent row which can lead to strange decisions in query plans.
Look at the third table (the histogram). The RANGE_HI_KEY corresponds to a specific key value of the first column in this index. The EQ_ROWS column is the histogram's estimate of the number of rows that correspond to this key. If you get a count of the rows in this table by one of these keys in the RANGE_HI_KEY column, does the number in the EQ_ROWS column look like an accurate estimate? If not, a higher sample rate may yield better query plans.
For example, take the value 1475616. Is the count of rows for this key close to the EQ_ROWS value of 3893?
select count(*) from MyTable where FirstIndexColumn = 1475616
If the estimate is very bad, the DBA may need to increase the sample size on this table:
update statistics MyTable with sample 5 percent
If the DBA uses Ola Hallengren's maintenance solution (an excellent choice, in my opinion), this can be done by passing the @StatisticsSample parameter to the IndexOptimize procedure.

order by slows query down massively

Using SQL Server 2014 (SP1-CU3) (KB3094221), Oct 10 2015, x64.
I have the following query
SELECT *
FROM dbo.table1 t1
LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
ORDER BY t1.tradeNo;
There are ~70k, 35k, and 73k rows in t1, t2, and t3 respectively.
When I omit the ORDER BY, this query executes in 3 seconds and returns 73k rows.
As written, the query took 8.5 minutes to return ~50k rows (I stopped it at that point).
Switching the order of the LEFT JOINs makes a difference:
SELECT *
FROM dbo.table1 t1
LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
ORDER BY t1.tradeNo;
This now runs in 3 seconds.
I don't have any indexes on the tables. Adding indexes on t1.tradeNo, t2.trade_id, and t3.TradeReportID has no effect.
Running the query with only one left join (both scenarios) in combination with the order by is fast.
It's fine for me to swap the order of the LEFT JOINs, but that doesn't go far toward explaining why this happens and under what scenarios it may happen again.
The estimated execution plan for the slow query (including the exclamation-mark warning details) and the plan for the fast query with the LEFT JOIN order switched are not reproduced here. I note they are markedly different, but I cannot interpret them to explain the performance difference.
UPDATE
It appears the addition of the ORDER BY clause results in the execution plan using a Table Spool (Lazy Spool), whereas the fast query does not use one.
If I turn off the table spool via DBCC RULEOFF ('BuildSpool');, this 'fixes' the speed, but according to this post that isn't recommended, and it cannot be done per query anyway.
UPDATE 2
One of the columns returned (table3.Text, of type varchar(max)) - if this is changed to nvarchar(512), then the original (slow) query is now fast, i.e. the execution plan now decides not to use the Table Spool. Note that even though the type is varchar(max), the field value is NULL for every one of the rows. This is now fixable, but I am none the wiser.
UPDATE 3
Warnings in the execution plan stated
Type conversion in expression (CONVERT_IMPLICIT(nvarchar(50),[t2].[trade_id],0)) may affect "CardinalityEstimate" in query plan choice, ...
t1.tradeNo is nvarchar(21); the other two are varchar(50). After altering the latter two to match the first, the problem disappears! (This leaves the varchar(max) column from UPDATE 2 unchanged.)
Given that this problem goes away when either UPDATE 2 or UPDATE 3 is rectified, I would guess that it's a combination of the query optimizer using a temp table (table spool) for a column that has an unbounded size - very interesting, given that the varchar(max) column holds no data.
Your likely best fix is to make sure both sides of your joins have the same datatype. There's no need for one to be varchar and the other nvarchar.
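A sketch of that fix, following the asker's UPDATE 3 in which all three key columns were made nvarchar(21); the lengths and nullability below are assumptions, so check for longer existing values and match the real column definitions before altering anything:
ALTER TABLE dbo.table2 ALTER COLUMN trade_id      nvarchar(21) NULL;  -- keep the column's existing NULL/NOT NULL setting
ALTER TABLE dbo.table3 ALTER COLUMN TradeReportID nvarchar(21) NULL;  -- keep the column's existing NULL/NOT NULL setting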
This is a class of problem that comes up quite frequently in databases: the database is assuming the wrong thing about the composition of the data it's about to deal with. The costs shown in your estimated execution plan are likely a long way from what you'd get in the actual plan. We all make mistakes, and really it would be good if SQL Server learned from its own, but currently it doesn't. It will estimate a two-second return time despite being immediately proven wrong again and again. To be fair, I don't know of any DBMS that has machine learning to do better.
Where your query is fast
The hardest part is done up front by sorting table3. That means it can do an efficient merge join which in turn means it has no reason to be lazy about spooling.
Where it's slow
Having an ID that refers to the same thing stored as two different types and data lengths should never be necessary and will never be a good idea for an ID - in this case, nvarchar in one place and varchar in another. When that causes the optimizer to fail to get a cardinality estimate, that is the key flaw, and here's why:
The optimizer is hoping to require only a few unique rows from table3. Just a handful of options like "Male", "Female", "Other". That would be what is known as "low cardinality". So imagine tradeNo actually contained IDs for genders for some weird reason. Remember, it's you with your human skills of contextualisation, who knows that's very unlikely. The DB is blind to that. So here is what it expects to happen: As it executes the query the first time it encounters the ID for "Male" it will lazily fetch the data associated (like the word "Male") and put it in the spool. Next, because it's sorted it expects just a lot more males and to simply re-use what it has already put in the spool.
Basically, it plans to fetch the data from tables 1 and 2 in a few big chunks stopping once or twice to fetch new details from table 3. In practice the stopping isn't occasional. In fact, it may even be stopping for every single row because there are lots of different IDs here. The lazy spool is like going upstairs to get one small thing at a time. Good if you think you just need your wallet. Not so good if you're moving house, in which case you'll want a big box (the eager spool).
The likely reason that shrinking the size of the field in table3 helped is that it meant it estimated less of a comparative benefit in doing the lazy spool over a full sort up front. With varchar it doesn't know how much data there is, just how much there could potentially be. The bigger the potential chunks of data that need shuffling, the more physical work needs doing.
What you can do to avoid in future
Make your table schema and indexes reflect the real shape of the data.
If an ID can be varchar in one table then it's very unlikely to need the extra characters available in nvarchar for another table. Avoid the need for conversions on IDs and also use integers instead of characters where possible.
Ask yourself if any of these tables need tradeNo to be filled in for all rows. If so, make it not nullable on that table. Next, ask if the ID should be unique for any of these tables and set it up as such in the appropriate index. Unique is the definition of maximum cardinality, so it won't make that mistake again.
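Purely as an illustration (whether table1 is the right place for this depends on your data):
-- enforce "filled in for all rows" and "unique" so the optimizer knows the cardinality
ALTER TABLE dbo.table1 ALTER COLUMN tradeNo nvarchar(21) NOT NULL;
CREATE UNIQUE INDEX UX_table1_tradeNo ON dbo.table1 (tradeNo);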
Nudge in the right direction with join order.
The order you have your joins in the SQL is a signal to the database about how powerful/difficult you expect each join to be. (Sometimes, as a human, you know more - e.g. if querying for 50-year-old astronauts you know that filtering for astronauts should be the first filter to apply, but you might begin with the age when searching for 50-year-old office workers.) The heavy stuff should come first. It will ignore you if it thinks it has the information to know better, but in this case it's relying on your knowledge.
If all else fails
A possible fix would be to INCLUDE all the fields you'll need from table3 in the index on TradeReportID. The reason the indexes couldn't help much so far is that they make it easy to work out how to re-sort, but the sort still hasn't physically been done. That is the work the optimizer was hoping to avoid with a lazy spool; if the data were included in the index, it would already be available, so there would be no work left to optimize.
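A sketch of such a covering index; the index name is made up, and you would INCLUDE whichever table3 columns the query actually selects:
CREATE NONCLUSTERED INDEX IX_table3_TradeReportID
    ON dbo.table3 (TradeReportID)
    INCLUDE ([Text]);  -- varchar(max) columns are allowed as included (non-key) columns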
Having indexes on a table is key to speeding up retrieval of data. Start with this and then retry your query to see if the speed is improved when using ORDER BY.

How to optimize a SQL Server full text search

I want to use full-text search for an autocomplete service, which means I need it to work fast - two seconds max.
The search results are drawn from different tables and so I created a view that joins them together.
The SQL function that I'm using is FREETEXTTABLE().
The query runs very slowly, sometimes up to 40 seconds.
To optimize the query execution time, I made sure the base table has a clustered index column that's an integer data type (and not a GUID)
I have two questions:
First, any additional ideas about how to make the full text search faster? (not including upgrading the hardware...)
Second, how come each time after I rebuild the full-text catalog, the search query runs very fast (less than one second), but only for the first run? The second time I run the query it takes a few more seconds, and it's all downhill from there... any idea why this happens?
The reason why your query is very fast the first time after rebuilding the catalog might be very simple:
When you delete the catalog and rebuild it, the indexes have to be rebuilt, which takes some time. If you run a query before the rebuild is finished, the query is faster simply because there is less data; you should also notice that your query result contains fewer rows.
So testing the query speed only makes sense after the index rebuild has finished.
The following SELECT might come in handy to check the size (and also fragmentation) of the indexes. When the size stops growing, the rebuild is finished ;)
-- Compute fragmentation information for all full-text indexes on the database
SELECT c.fulltext_catalog_id,
       c.name AS fulltext_catalog_name,
       i.change_tracking_state,
       i.object_id,
       OBJECT_SCHEMA_NAME(i.object_id) + '.' + OBJECT_NAME(i.object_id) AS object_name,
       f.num_fragments,
       f.fulltext_mb,
       f.largest_fragment_mb,
       100.0 * (f.fulltext_mb - f.largest_fragment_mb) / NULLIF(f.fulltext_mb, 0) AS fulltext_fragmentation_in_percent
FROM sys.fulltext_catalogs c
JOIN sys.fulltext_indexes i
    ON i.fulltext_catalog_id = c.fulltext_catalog_id
JOIN (
    -- Compute fragment data for each table with a full-text index
    SELECT table_id,
           COUNT(*) AS num_fragments,
           CONVERT(DECIMAL(9,2), SUM(data_size / (1024. * 1024.))) AS fulltext_mb,
           CONVERT(DECIMAL(9,2), MAX(data_size / (1024. * 1024.))) AS largest_fragment_mb
    FROM sys.fulltext_index_fragments
    GROUP BY table_id
) f
    ON f.table_id = i.object_id;
Here's a good resource to check out. However if you really want to improve performance you'll have to think about upgrading your hardware. (I saw a significant performance increase by moving my data and full text index files to separate read-optimized disks and by moving logs and tempdb to separate write-optimized disks -- a total of 4 extra disks plus 1 more for the OS and SQL Server binaries.)
Some other non-hardware solutions I recommend:
Customize the built-in stop word list to define more stop words, thereby reducing the size of your full text index.
Change the file structure of tempdb. See here and here.
If your view performs more than one call to FREETEXTTABLE, consider changing your data structure so that the view only has to make one call (a rough sketch follows below).
However none of these by themselves are likely to be the silver bullet solution you're looking for to speed things up. I suspect there may be other factors here (maybe a poor performing server, network latency, resource contention on the server..) especially since you said the full text searches get slower with each execution which is the opposite of what I've seen in my experience.
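To illustrate the single-call idea from the list above, a query along these lines joins one FREETEXTTABLE call back to its base table; the table, column, and key names are made up:
DECLARE @searchTerm nvarchar(200);
SET @searchTerm = N'user input here';

SELECT TOP (10) p.ProductId, p.Name, ft.[RANK]
FROM   FREETEXTTABLE(dbo.Products, (Name, Description), @searchTerm) AS ft
JOIN   dbo.Products p ON p.ProductId = ft.[KEY]
ORDER BY ft.[RANK] DESC;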

The best way to design a Reservation based table

One of my clients has a reservation-based system, similar to airlines, running on MS SQL 2005.
The way the previous company designed it is to create an allocation as a set of rows.
Simple Example Being:
AllocationId | SeatNumber | IsSold
1234         | A01        | 0
1234         | A02        | 0
In the process of selling a seat the system will establish an update lock on the table.
We have a problem at the moment where the locking process is running slow and we are looking at ways to speed it up.
The table is already efficiently indexed, so we are looking at a hardware solution to speed up the process. The table has about 5 million active rows and sits on a RAID 50 SAS array.
I am assuming hard disk seek time is going to be the limiting factor in speeding up update locks when you have 5 million rows and are updating 2-5 rows at a time (I could be wrong).
I've heard about people partitioning indexes over several disk arrays - has anyone had similar experiences with trying to speed up locking? Can anyone give me some advice on a possible solution, i.e. what hardware might be upgraded or what technology we can take advantage of in order to speed up the update locks (without moving to a cluster)?
One last try…
It is clear that there are too many locks held for too long.
Once the system starts slowing down due to too many locks, there is no point in starting more transactions. Therefore you should benchmark the system to find the optimal number of concurrent transactions, then use some queueing system (or otherwise) to limit the number of concurrent transactions. SQL Server may have some settings (number of active connections etc.) to help; otherwise you will have to write this in your application code.
Oracle is good at allowing reads to bypass writes; however, SQL Server is not, as standard...
Therefore I would split the stored proc to use two transactions. The first transaction should just:
be a SNAPSHOT (or READ UNCOMMITTED) transaction,
find the “Id” of the rows for the seats you wish to sell.
You should then commit (or abort) this transaction, and use a second (hopefully very short) transaction that:
most likely is READ COMMITTED (or maybe SERIALIZABLE),
selects each row for update (use a locking hint),
checks it has not been sold in the meantime (abort and start again if it has),
sets the “IsSold” flag on the row.
(You may be able to do the above in a single UPDATE statement using “IN”, and then check that the expected number of rows were updated.)
Sorry, but sometimes you do need to understand what each type of transaction does and how locking works in detail.
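A rough sketch of that two-step flow; everything beyond the AllocationId, SeatNumber, and IsSold columns from the question (the table name, the Id column, and the variables) is an assumption:
DECLARE @AllocationId int, @Updated int;
SET @AllocationId = 1234;

-- step 1: cheap lookup, no locks held while the user decides
DECLARE @SeatIds TABLE (Id int PRIMARY KEY);

INSERT INTO @SeatIds (Id)
SELECT Id
FROM   dbo.SeatAllocation WITH (READUNCOMMITTED)
WHERE  AllocationId = @AllocationId
  AND  SeatNumber IN ('A01', 'A02')
  AND  IsSold = 0;

-- step 2: short transaction that flips the flag only if the seats are still unsold
BEGIN TRANSACTION;

UPDATE sa
SET    IsSold = 1
FROM   dbo.SeatAllocation sa WITH (ROWLOCK, UPDLOCK)
WHERE  sa.Id IN (SELECT Id FROM @SeatIds)
  AND  sa.IsSold = 0;

SET @Updated = @@ROWCOUNT;

IF @Updated = (SELECT COUNT(*) FROM @SeatIds)
    COMMIT TRANSACTION;   -- all requested seats were still free
ELSE
    ROLLBACK TRANSACTION; -- someone got there first: abort and let the caller retry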
If the table is smaller, then the update is shorter and the locks are held for less time.
Therefore consider splitting the table:
so you have a table that JUST contains “AllocationId” and “IsSold”.
This table could be stored as a single b-tree (clustered on AllocationId, i.e. an index-organized table).
As all the other indexes will be on the table that contains the details of the seat, no indexes should be locked by the update.
I don't think you'd get anything out of table partitioning - the only improvement would be fewer disk reads from a smaller (shorter) index tree (each read hits each level of the index at least once, so the fewer levels, the quicker the read). However, I've got a table with a 4M+ row partition, indexed on 4 columns, with a net 10-byte key length. It fits in three index levels, with the topmost level 42.6% full. Assuming you had something similar, it seems reasonable that partitioning might only remove one level from the tree, and I doubt that's much of an improvement.
Some off-the-cuff hardware ideas:
RAID 5 (and 50) can be slower on writes because of the parity calculation. Not an issue (or so I'm told) if the disk I/O cache is large enough to handle the workload, but if that's flooded you might want to look at RAID 10.
Partition the table across multiple drive arrays. Take two (or more) RAID arrays, distribute the table across the volumes [files/filegroups, with or without table partitioning or partitioned views], and you've got twice the disk I/O speed, depending on where the data lies relative to the queries retrieving it. (If everything's on array #1 and array #2 is idle, you've gained nothing.)
Worst case, there's probably leading edge or bleeding edge technology out there that will blow your socks off. If it's critical to your business and you've got the budget, might be worth some serious research.
How long is the update lock held for?
Why is the lock on the “table” and not just the “rows” being sold?
If the lock is held for more than a fraction of a second, that is likely to be your problem. SQL Server does not like you holding locks while users fill in web forms etc.
With SQL Server, you have to implement a “shopping cart” yourself, by temporarily reserving the seat until the user pays for it. E.g. add an “IsReserved” and a “ReservedAt” column, then automatically unreserve any seat that has been reserved for more than n minutes.
This is a hard problem, as a shopper does not expect a seat that is in stock to be sold to someone else while he is checking out. However, you don't know if the shopper will ever complete the checkout, so how do you show this in the UI? Think about having a look at what other booking websites do, then copy one that your users already know how to use.
(Oracle can sometimes cope with locks being held for a long time, but even Oracle is a lot faster and happier if you keep your locking short.)
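A minimal sketch of the reserve-and-expire columns described above; the table name and the 10-minute window are assumptions:
ALTER TABLE dbo.SeatAllocation ADD
    IsReserved bit NOT NULL CONSTRAINT DF_SeatAllocation_IsReserved DEFAULT (0),
    ReservedAt datetime NULL;

-- run periodically (e.g. from a SQL Agent job): release stale reservations
UPDATE dbo.SeatAllocation
SET    IsReserved = 0,
       ReservedAt = NULL
WHERE  IsReserved = 1
  AND  IsSold = 0
  AND  ReservedAt < DATEADD(minute, -10, GETDATE());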
I would first try to figure out why you are locking the table rather than just a row.
One thing to check is the execution plan of the UPDATE statement, to see which indexes it causes to be updated, and then make sure that row-level and page-level locks (allow_row_locks / allow_page_locks) are enabled on those indexes.
You can do so with the following statement.
SELECT allow_row_locks, allow_page_locks
FROM sys.indexes
WHERE name = 'IndexNameHere';
Here are a few ideas:
Make sure your data and logs are on separate spindles, to maximize write performance.
Configure your drives to only use the first 30% or so for data, and have the remainder be for backups (minimize seek / random access times).
Use RAID 10 for the log volume; add more spindles as needed for performance (write performance is driven by the speed of the log)
Make sure your server has enough RAM. Ideally, everything needed for a transaction should be in memory before the transaction starts, to minimize lock times (consider pre-caching). There are a bunch of performance counters you can check for this (a sample query follows after this list).
Partitioning may help, but it depends a lot on the details of your app and data...
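As a starting point for those memory-related counters, something along these lines works on SQL Server 2005 and later:
SELECT [object_name], counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  counter_name IN ('Page life expectancy', 'Buffer cache hit ratio')
  AND  [object_name] LIKE '%Buffer Manager%';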
I'm assuming that the T-SQL, indexes, transaction size, etc, have already been optimized.
In case it helps, I talk about this subject in detail in my book (including SSDs, disk array optimization, etc) -- Ultra-Fast ASP.NET.
