Decreasing column size has not changed the table size - PostgreSQL

Some time ago I read about resizing a column in PostgreSQL by updating atttypmod manually:
https://web.archive.org/web/20111007112138/http://sniptools.com/databases/resize-a-column-in-a-postgresql-table-without-changing-data
It appears to work, but although I resized a 2500-character column to 250, the table size has not changed at all. Should I run a VACUUM? What should I do?
Thanks!

The link you quote should be taken down, as the advice there is dangerous and misleading. Editing atttypmod will not change the size of the existing data. Modifying catalog tables directly is unsupported precisely because it is too easy to end up with data corruption, which is what I would call a varchar(250) column with 1000 characters in it.
You should change the data type by properly rewriting the table:
ALTER TABLE tab ALTER col
TYPE varchar(250) USING substr(col, 1, 250);
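If you want to confirm that space actually comes back, a rough sketch of the check (using the same tab/col names as above):
SELECT pg_size_pretty(pg_total_relation_size('tab'));  -- size before
ALTER TABLE tab ALTER col TYPE varchar(250) USING substr(col, 1, 250);
SELECT pg_size_pretty(pg_total_relation_size('tab'));  -- size after the rewrite
Because the ALTER with a USING clause rewrites the whole table, no separate VACUUM FULL should be needed to reclaim the space.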

Related

Maximum row size error - works with lazy SELECT INTO, does not with INSERT INTO

I should stress that the below is a curiosity arising from a lazy bit of manipulation work, so please don't waste any time on it - only answer if you happen to have a suggestion off the top of your head.
I'm messing around with a convenience piece of work, creating a large flat table for future use in a mail file.
I'm getting this error:
Cannot create a row of size 11559 which is greater than the allowable maximum row size of 8060.
Originally I took the lazy approach and used SELECT * INTO ... FROM to create a template table (~500 columns, all varchar(max)) with no issues. If I then truncate the created table and INSERT INTO it (stressing that this is from the SAME source table), I get the error above.
(For clarity, I've also created the table "manually" (i.e. CREATE TABLE) and get the same issue when inserting into it.)
While I'm familiar with the row size limit (hence the use of varchar(max)), what's especially strange is that DBCC SHOWCONTIG reports a MaximumRecordSize of 3676 on the source table, so I'm unable to determine where the problem lies.
Any suggestions would be gratefully received!
SQL Server has a limit of roughly 8KB (8060 bytes) per table row.
You should review and resize all the nvarchar(nnn) columns and/or split (and, I suspect, normalize) the table into several smaller ones.
Another workaround is to use nvarchar(max), but be careful with it.
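If you want to see how close the rows actually get to the 8060-byte limit (a more modern equivalent of SHOWCONTIG, assuming SQL Server 2005 or later and a table called dbo.SourceTable):
SELECT index_id, index_level, max_record_size_in_bytes, avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.SourceTable'), NULL, NULL, 'DETAILED')
WHERE index_level = 0;  -- the leaf level holds the actual data rows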

sql azure setting value to null increases table size

I had a uniqueidentifier field in SQL Server (SQL Azure to be precise) that I wanted to get rid of. Initially, when I ran the code mentioned in SQL Azure table size to determine the size of the table, it was about 0.19 MB.
As a first step I set all values in the uniqueidentifier field to NULL. There are no constraints/indexes that use the column. When I then ran the code to determine the table sizes, the table had increased in size to about 0.23 MB. No records are being added to the table (it's in a staging environment).
I proceeded to delete the column and the size still hovered in the same range.
Why does the table size show an increase when I null out and then delete a column? Any suggestions?
Setting a uniqueidentifier column to NULL does not change the record size in any way, since it is a fixed-size type (16 bytes). Dropping a fixed-size column does not change the record size either, unless it is the last column in the physical layout and the space can be reused later. ALTER TABLE ... DROP COLUMN is only a logical operation; it simply marks the column as dropped, see SQL Server Columns Under the Hood.
In order to reclaim the space you need to drop the column and then rebuild the clustered index of the table, see ALTER INDEX ... REBUILD.
For the record (SHRINK is not allowed in SQL Azure anyway), on standalone SQL Server SHRINK would have solved nothing: this is not about page reservation but about physical record size.
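A minimal sketch of the reclaim step described above, with made-up table and index names:
ALTER TABLE dbo.MyTable DROP COLUMN OldGuidColumn;  -- logical drop only
ALTER INDEX PK_MyTable ON dbo.MyTable REBUILD;      -- rewrites the records without the dropped column and returns the space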
It's counting the number of reserved pages to calculate the size. Deleting a column may reduce the number of pages that are actually utilized to store data, but the newly-freed pages are probably still reserved for future inserts.
I think you'd need to shrink the database to see the size decrease, as per: http://social.msdn.microsoft.com/Forums/en-US/ssdsgetstarted/thread/ae698613-79d5-4f23-88b4-f42ee4b3ad46/
As an aside, I am fairly sure that setting the value of a non-variable-length column (like a GUID) to NULL will not save you any space at all - only deleting the column will do so. See Space used by nulls in database.

Oracle starts to do full table scans when a column is changed from varchar to nclob

I have a table with about 100,000 rows that used to look more or less like this:
id varchar(20),
omg varchar(10),
ponies varchar(3000)
When adding support for international characters, we had to redefine the ponies column as an nclob, as 3000 (multibyte) characters is too big for an nvarchar2:
id varchar(20),
omg varchar(10),
ponies nclob
We read from the table using a prepared statement in java:
select omg, ponies from tbl where id = ?
After the ponies column was changed to an NCLOB and some other tables were changed to use nchar columns, Oracle 11g decided to do a full table scan instead of using the index on the id column, which causes our application to grind to a halt.
When we add a hint to the query, the index is used and everything is "fine", or rather just a little slower than it was when the column was a varchar.
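For reference, the hinted version looks roughly like this (the index name tbl_id_idx is made up):
select /*+ INDEX(tbl tbl_id_idx) */ omg, ponies from tbl where id = ?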
We have defined the following connection properties:
oracle.jdbc.convertNcharLiterals="true"
defaultNChar=true
By the way, the database statistics are up to date.
I have not had time to look at all queries, so I don't know whether other indexes are ignored, but do I have to worry that the defaultNChar setting is somehow confusing the optimizer, since id is not an nchar? It would be rather awkward to either sprinkle hints on virtually all queries or redefine all keys.
Alternatively, does the optimizer regard the full table scan as insignificant because a "large" nclob is going to be loaded anyway? That assumption seems to be off by three orders of magnitude, and I would like to believe that Oracle is smarter than that.
Or is it just bad luck? Or, something else? Is it possible to fix without hints?
The problem turns out to be the JDBC flag defaultNChar=true.
Oracle's optimizer will not use indexes created on char/varchar2 columns if the parameter is sent as an nchar/nvarchar2. This nearly makes sense, as I suppose you could otherwise get phantom results.
We are mostly using stored procedures, with the parameters defined as char/varchar2 - forcing a conversion before the query is executed - so we didn't notice this effect except in a few places where dynamic SQL is used.
The solution is to convert the database character set to AL32UTF8 and get rid of the nchar columns.
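Until that conversion is done, an explicit cast on the bind variable achieves the same effect as the stored procedure parameters, keeping the comparison on the varchar2 side so the index on id can be used; a sketch only:
select omg, ponies from tbl where id = cast(? as varchar2(20))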
When you regathered the statistics, did you estimate, or did you use dbms_stats.gather_table_stats with an estimate_percent above 50%? If you didn't, rerun dbms_stats with a 100% estimate_percent.
If your table only has 3 columns and these are the ones you're returning, then the best index covers all 3 columns, no matter what you hint and even if the index on id is unique. As it stands, your explain plan should be a unique index scan followed by a table access by rowid. If you index all 3 columns, this becomes just an index scan, as all the information you're returning is already in the index and there's no need to re-access the table to get it. The order would be id, omg, ponies to make use of it in the WHERE clause. This would effectively make your table an index-organized table, which would be easier than having a separate index. Obviously, gather stats afterwards.
Having said all that, I'm not actually certain you can index an nclob, and no matter what you do the size of the column will have an impact: the longer it is, the more disk reads you will have to do.
Sorry, but I don't understand why you had to change your ponies column from varchar to nclob. If the maximum length in this column is 3000 characters, why not use an NVARCHAR2 column instead? As far as I know, nvarchar2 can hold up to 4000 characters.
But you're right, the maximum column size allowed is 2000 characters when the national character set is AL16UTF16 and 4000 when it is UTF8.

Any hidden pitfalls changing a column from varchar(8000) to varchar(max)?

I have a lot (over a thousand places) of legacy T-SQL code that only makes INSERTs into a varchar(8000) column in a utility table. Our needs have changed and now that column needs to be able to handle larger values. As a result I need to make that column varchar(max). This is just a plain data column with no searches performed on it, no index on it, and only one procedure that reads it; it is insert-and-forget for the application (almost like a log entry).
I plan on making changes in only a few places that will actually generate the larger data, and in the single stored procedure that processes this column.
Are there any hidden pitfalls changing a column from varchar(8000) to a varchar(max)?
Will all the T-SQL string functions - LEN(), RTRIM(), SUBSTRING(), etc. - work the same?
Can anyone imagine any reason why I'd have to make any changes to the code that thinks the column is still varchar(8000)?
All MAX types have a small performance penalty, see Performance comparison of varchar(max) vs. varchar(N).
If your maintenance includes online operations (online index rebuilds), you will lose the ability to perform them. Online operations are not supported for tables with LOB columns:
Clustered indexes must be created, rebuilt, or dropped offline when the underlying table contains large object (LOB) data types: image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml.
Nonunique nonclustered indexes can be created online when the table contains LOB data types but none of these columns are used in the index definition as either key or nonkey (included) columns. Nonclustered indexes defined with LOB data type columns must be created or rebuilt offline.
The performance penalty is really small, so I wouldn't worry about it. The loss of ability to do online rebuilds may be problematic for really hot must-be-online operations tables. Unless online operations are a must, I'd vote to go for it and change it to MAX.
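For reference, the change itself is a single ALTER (dbo.UtilityTable and LogText are placeholder names):
ALTER TABLE dbo.UtilityTable ALTER COLUMN LogText varchar(max) NULL;  -- match the column's existing NULL/NOT NULL setting explicitly
Existing values that fit on the page should stay in-row; only new values larger than 8000 bytes move off to LOB storage.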
Crystal Reports 12 (and other versions, as far as I know) doesn't handle varchar(max) properly and interprets it as varchar(255), which leads to truncated data in reports.
So if you're using Crystal Reports, that's a disadvantage of varchar(max). Or a disadvantage of using Crystal, to be precise.
See:
http://www.crystalreportsbook.com/Forum/forum_posts.asp?TID=5843&PID=17503
http://michaeltbeeitprof.blogspot.com/2010/05/crystal-xi-and-varcharmax-aka-memo.html
If you genuinely don't need indexes on it and it is a large column, you should be fine. varchar(max) appears to be exactly what you need, and you will have fewer problems with existing code than you would if you used text.
Make sure to test any updates where text is added to the existing text. It should work using regular concatenation, but I'd want to be able to prove it.
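A throwaway test along these lines should be enough to prove it (note that REPLICATE only produces more than 8000 characters when its input is already varchar(max), which is exactly the kind of subtlety worth catching):
CREATE TABLE #t (big_col varchar(max));
INSERT INTO #t VALUES (REPLICATE(CONVERT(varchar(max), 'x'), 10000));
UPDATE #t SET big_col = big_col + REPLICATE(CONVERT(varchar(max), 'y'), 10000);
SELECT LEN(big_col) FROM #t;  -- expect 20000
DROP TABLE #t;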

How to reduce size of SQL Server table that grew from a datatype change

I have a table on SQL Server 2005 that was about 4gb in size.
(about 17 million records)
I changed one of the fields from datatype char(30) to char(60). (There are 25 fields in total, most of which are char(10), so the char space adds up to about 300 bytes per row.)
This caused the table to double in size (over 9gb)
I then changed the char(60) to varchar(60) and ran a function to trim the extra whitespace out of the data (reducing the average length of the data in the field to about 15 characters).
This did not reduce the table size. Shrinking the database did not help either.
Short of actually recreating the table structure and copying the data over (that's 17 million records!), is there a less drastic way of getting the size back down again?
You have not cleaned or compacted any data, even with a "shrink database".
DBCC CLEANTABLE
Reclaims space from dropped variable-length columns in tables or indexed views.
However, a simple index rebuild, if there is a clustered index, should also do it:
ALTER INDEX ALL ON dbo.Mytable REBUILD
A worked example from Tony Rogerson
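For reference, a sketch of both commands (the database name is assumed):
DBCC CLEANTABLE ('MyDatabase', 'dbo.Mytable');  -- reclaims space from dropped variable-length columns
ALTER INDEX ALL ON dbo.Mytable REBUILD;         -- rebuild, reclaiming space freed by the char -> varchar change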
Well, it's clear you're not getting any space back! :-)
When you changed your text fields to CHAR(60), they were all padded to capacity with spaces, so ALL your fields are now really 60 characters long.
Changing that back to VARCHAR(60) won't help - the fields are still all 60 characters long.
What you really need to do is run a TRIM function over all your fields to reduce them back to their trimmed length, and then do a database shrinking.
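A sketch of that trim pass, with made-up column names (T-SQL before 2017 has no single TRIM, so use RTRIM/LTRIM):
UPDATE dbo.YourTable
SET Col1 = RTRIM(Col1),
    Col2 = RTRIM(Col2);  -- repeat for each converted varchar column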
After you've done that, you need to REBUILD your clustered index in order to reclaim some of that wasted space. The clustered index is really where your data lives - you can rebuild it like this:
ALTER INDEX IndexName ON YourTable REBUILD
By default, your primary key is your clustered index (unless you've specified otherwise).
Marc
I know I'm not answering the question exactly as you asked it, but have you considered archiving some of the data to a history table and working with fewer rows?
Most of the time you might think at first glance that you need all that data all the time, but when you actually sit down and examine it, there are cases where that's not true. Or at least I've been in that situation before.
I had a similar problem here: SQL Server, Converting NTEXT to NVARCHAR(MAX), which was related to changing ntext to nvarchar(max).
I had to do an UPDATE MyTable SET MyValue = MyValue in order to get it to resize everything nicely.
This obviously takes quite a long time with a lot of records. There were a number of suggestions on how to do it better. The key one was a temporary flag indicating whether a row had been done or not, then updating a few thousand rows at a time in a loop until it was all done. This meant I had "some" control over how much it was doing at once.
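A rough sketch of that batching pattern (the Converted flag column and the batch size are assumptions, not what was actually used):
DECLARE @rows int;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (5000) dbo.MyTable
    SET MyValue = MyValue, Converted = 1
    WHERE Converted = 0;
    SET @rows = @@ROWCOUNT;
END;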
On another note, if you really want to shrink the database as much as possible, it can help to turn the recovery model down to simple, shrink the transaction logs, reorganise all the data in the pages, and then set it back to the full recovery model. Be careful though: shrinking databases is generally not advisable, and if you reduce the recovery model of a live database you are asking for something to go wrong.
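If you do go down that road, the sequence looks roughly like this (database and log file names are placeholders, and it should only be run with a good backup and a quiet system):
ALTER DATABASE MyDb SET RECOVERY SIMPLE;
DBCC SHRINKFILE (MyDb_log, 100);  -- shrink the log file to ~100 MB
DBCC SHRINKDATABASE (MyDb);       -- reorganise pages; generally discouraged on a live system
ALTER DATABASE MyDb SET RECOVERY FULL;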
Alternatively, you could do a full table rebuild to ensure there's no extra data hanging around anywhere:
CREATE TABLE tmp_table(<column definitions>);
GO
INSERT INTO tmp_table(<columns>) SELECT <columns> FROM <table>;
GO
DROP TABLE <table>;
GO
EXEC sp_rename N'tmp_table', N'<table>';
GO
Of course, things get more complicated with identity, indexes, etc etc...
