I have a set of indexes in my databases, and I would like to set the Fill Factor to 0. Some of these are currently set to 100. I know that as far as SQL Server is concerned, this is the same, but we have a piece of software which is comparing two databases with the same schema and it is detecting a difference between them if the Fill Factor of an index is 0 in one database and 100 in the other. This is a problem.
Whilst we go about updating our software to be a bit cleverer about this, I would like to be able to set the Fill Factor to 0 on various indexes. SQL Server won't let you specify a Fill Factor of 0 (it must be 1..100), so the only way I can think of doing this is to DROP the index and recreate it (with the server default set to 0).
But is there another (and preferably quicker) way?
If the instance's default fill factor (%) configuration value is set to zero, you can omit FILLFACTOR and recreate the indexes with CREATE INDEX ... WITH (DROP_EXISTING = ON). This will be a bit more efficient than DROP/CREATE because no sort is needed to recreate the index and, in the case of the clustered index, the non-clustered indexes won't need to be rebuilt twice.
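As a sketch, assuming a hypothetical non-clustered index IX_Foo on dbo.MyTable keyed on Col1:

-- FILLFACTOR omitted on purpose: with the instance default set to 0,
-- the recreated index is stored with fill factor 0
CREATE NONCLUSTERED INDEX IX_Foo
ON dbo.MyTable (Col1)
WITH (DROP_EXISTING = ON);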
I'm teaching a Microsoft SQL Server course and a student asked if there was a practical reason why we would ever set the auto-increment to something other than (1,1) (seed starting at 1, increment by 1). Is there a practical reason why we would ever set the seed higher or the increment to a value other than 1? Shouldn't it make no difference, as long as the value is unique?
If there is no practical reason, why do we have the option to set those values for Identity in Microsoft SQL Server?
There are a lot of practical reasons for having a configurable seed and increment:
You may want to insert a few predefined records with well-known IDs, eg Missing, Unknown and Not Applicable records in a dimension or lookup table should probably have predefined IDs. New rows should get IDs outside the range of the predefined numbers.
After loading or replicating data with existing ID values, new records should get IDs that don't conflict with the imported data. The easiest way to do this is by setting the starting ID somewhere above the maximum imported ID.
TRUNCATE TABLE resets the IDENTITY value. To avoid generating duplicate IDs, you need to reseed the table with DBCC CHECKIDENT and set the current value to something other than 1 (see the sketch below).
There are certainly dozens of other reasons.
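As a sketch of the TRUNCATE/reseed scenario above, with a hypothetical dbo.MyTable and a reseed value of 1000:

TRUNCATE TABLE dbo.MyTable;  -- resets the IDENTITY back to its original seed

-- Reseed so new rows don't collide with IDs that still exist elsewhere;
-- on a freshly truncated table the first new row gets exactly 1000
DBCC CHECKIDENT ('dbo.MyTable', RESEED, 1000);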
If you are using a signed integer and start at 1, you're only using half the available range. The default should really be to start at the minimum for the data type, for example about -2.1 billion for a 32-bit int.
In some cases, you may want to combine data from multiple tables. In that case each table should keep a separate range of IDs. You could have each table start at a different number to prevent overlaps, for example one at 1 and another at 1 billion. Or you could use odd numbers for one table (seed 1, increment 2) and even numbers for the other (seed 2, increment 2).
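A sketch of that odd/even scheme, using two hypothetical tables:

-- TableA hands out odd IDs: 1, 3, 5, ...
CREATE TABLE dbo.TableA (ID INT IDENTITY(1, 2) PRIMARY KEY, Name VARCHAR(50));

-- TableB hands out even IDs: 2, 4, 6, ...
CREATE TABLE dbo.TableB (ID INT IDENTITY(2, 2) PRIMARY KEY, Name VARCHAR(50));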
I have a table with around 115k rows. Something like this:
Table: People
Column: ID PRIMARY KEY INT IDENTITY NOT NULL
Column: SpecialCode NVARCHAR(255) NULL
Column: IsActive BIT NOT NULL
Initially, I had an index defined like so:
PK_IDX (clustered) -- clustered index on primary key
IDX_SpecialCode (non-clustered, non-unique) -- index on the SpecialCode column
And I'm doing an update like so:
UPDATE People SET IsActive = 0
WHERE SpecialCode NOT IN ('...enormous list of special codes....')
This enormous list is essentially 99% of the users in the table.
This update takes forever on my server. As a test I trimmed the list of special codes in the "not in" clause to something like 1% of the users in the table, and my execution plan ends up using an INDEX SCAN on the PK_IDX index instead of the IDX_SpecialCode index that I thought it'd use.
So, I thought that maybe I needed to modify the IDX_SpecialCode so that it included the column "IsActive" in it. I did so and I still see the execution plan defaulting to the PK_IDX index scan and my query still takes a very long time to run.
So - what is the more correct way to do an update of this nature? I have the list of users I want to exclude from the update, but I was trying to avoid loading all employees' special codes from the database, filtering out those not in my list on the application side, and then running my query with an IN clause, which will be a much, much smaller list in my actual usage.
Thanks
If you have the employees you want to exclude, why not just populate an indexed table with those PK_IDs and do a:
UPDATE People
SET IsActive = 0
WHERE NOT EXISTS (SELECT NULL
                  FROM lookuptable l
                  WHERE l.PK = People.PK)
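For completeness, a sketch of creating and populating that lookup table (names and values are placeholders, not from the original post):

-- The PRIMARY KEY gives us the index the anti-join needs
CREATE TABLE lookuptable (PK INT NOT NULL PRIMARY KEY);

-- One row per person to exclude; in practice you would bulk load the list
INSERT INTO lookuptable (PK) VALUES (1);
INSERT INTO lookuptable (PK) VALUES (2);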
You are getting index scans because SQL Server is not stupid, and realizes that it makes more sense to just look at the whole table instead of checking 100 different criteria one at a time. If your stats are up to date, the optimizer knows roughly how much of the table is covered by your IN list and will do a table or clustered index scan if it thinks that will be faster.
With SQL Server, indexes are ignored when you use the NOT clause. That is why you are seeing the execution plan ignoring your index. (Ref: page 6, MCTS Exam 70-433 Database Development SQL 2008, which I'm reading at the moment.)
It might be worth taking a look at full-text indexes, although I don't know whether the same thing will happen with those (I don't have access to a box with it set up to test at the moment).
hth
Is there any way you could use the IDs of the users you wish to exclude instead of their codes? Even on indexed values, comparing IDs may be faster than comparing strings.
I think the problem is your SpecialCode NVARCHAR(255) column. String comparisons in SQL Server are very slow. Consider changing your query to work with the IDs. Also, try to avoid NVARCHAR: if you don't care about Unicode, use VARCHAR instead.
Check, too, that your database collation matches the instance collation, and make sure you are not having hard disk performance issues.
I am working on a new system with a SQL Server 2005 database which will soon be going into production. A colleague recently mentioned to me that I should always be specifying the fill factor on my tables. Currently I don't specify fill factor on any of my tables.
My application is OLTP with a mix of reads and writes. A couple of my tables are "reference" tables, i.e. read-only, but most are read-write. The read-only tables are low volume (< 50,000 rows).
From what I've read in the SQL Server documentation I should be sticking with the default fill-factor unless the table is read only.
Can anyone comment on this, both for read-only and read-write tables?
No, you shouldn't specify a fill factor on your tables. The fill factor is ignored by the engine except during one and only one operation: an index build (which includes the initial build of an index on a populated table and any rebuild of an index). So it only makes sense to specify the fill factor in ALTER TABLE ... REBUILD and ALTER INDEX ... REBUILD operations.
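For illustration, a sketch of the one place the setting actually takes effect (index and table names are hypothetical):

-- Pages are rebuilt 90% full; after the rebuild the setting is ignored
-- until the next rebuild
ALTER INDEX IX_Foo ON dbo.MyTable
REBUILD WITH (FILLFACTOR = 90);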
See also A SQL Server DBA myth a day: (25/30) fill factor.
I have a table with about 45 columns, and as more data goes in, the longer the inserts take. I have increased the size of the data and log files and reduced the fill factor on all the indexes on that table, and the insert times are still getting slower and slower. Any ideas would be GREATLY appreciated.
For inserts, you want to DECREASE the fillfactor on the indexes on the table in order to reduce page splitting.
It is somewhat expected that it will take longer to insert as more data goes in, because your indexes just plain get bigger.
Try putting in data in batches instead of row-by-row. SQL Server is more efficient that way.
Make sure you don't have too many indexes on your tables.
Consider using SQL Server 2005's INCLUDE clause on your indexes if you are adding columns to your indexes only so that they cover your queries.
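A sketch of the INCLUDE clause, with made-up table and column names:

-- OrderDate and TotalAmount are stored only at the leaf level, which keeps
-- the index tree narrower than making them extra key columns
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, TotalAmount);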
How big is the table?
What is the context? Is this a batch of many new records?
Can you post the schema including index definition?
Can you SET STATISTICS IO ON, SET STATISTICS TIME ON, and post the display for one iteration?
Is there anything pathological about the data, or the context? Is this on a server or a laptop (testing)?
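For reference, a minimal sketch of capturing those numbers (the commented-out step is a placeholder for your own statement):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- run one representative iteration here, e.g. a single INSERT

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;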
Why don't you drop the indexes before inserting and recreate them on the table afterwards, so you don't need to update statistics?
You could also ensure that the indexes on that table are defragmented.
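A sketch of the two defragmentation options, against a hypothetical index:

-- Lighter-weight, always online; fixes leaf-level fragmentation in place
ALTER INDEX IX_Foo ON dbo.MyTable REORGANIZE;

-- Heavier: rebuilds the index from scratch and refreshes its statistics
ALTER INDEX IX_Foo ON dbo.MyTable REBUILD;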
So for this one project, we have a bunch of queries that are executed on a regular basis (every minute or so). I used "Analyze Query in Database Engine Tuning Advisor" to check on them.
They are pretty simple:
SELECT * FROM tablex WHERE processed = '0'
There is an index on processed, and each query should return fewer than 1,000 rows on a table with 1 million records.
The Analyzer recommended creating some STATISTICS on this. So my question is: what are those statistics? Do they really help performance? How costly are they for a table like the one above?
Please bear in mind that I would by no means call myself an experienced SQL Server user... and this is the first time I'm using this Analyzer.
Statistics are what SQL Server uses to decide which plan for getting at the data is viable.
Let's say, for instance, that you have a table that only has a clustered index on the primary key. When you execute SELECT * FROM tablename WHERE col1=value, SQL Server only has one option, to scan every row in the table to find the matching rows.
Now we add an index on col1, so you assume that SQL Server will use the index to find the matching rows, but that's not always true. Let's say that the table has 200,000 rows and col1 only has 2 values: 1 and 0. When SQL Server uses an index to find data, the index contains pointers back to the clustered index position. Given that there are only two values in the indexed column, SQL Server decides it makes more sense to just scan the table, because using the index would be more work.
Now we'll add another 800,000 rows of data to the table, but this time the values in col1 are widely varied. Now it's a useful index because SQL Server can viably use the index to limit what it needs to pull out of the table. Will SQL Server use the index?
It depends. And what it depends on are the Statistics. At some point in time, with AUTO UPDATE STATISTICS set on, the server will update the statistics for the index and know it's a very good and valid index to use. Until that point, however, it will ignore the index as being irrelevant.
That's one use of statistics. But there is another use and that isn't related to indices. SQL Server keeps basic statistics about all of the columns in a table. If there's enough different data to make it worthwhile, SQL Server will actually create a temporary index on a column and use that to filter. While this takes more time than using an existing index, it takes less time than a full table scan.
Sometimes you will get recommendations to create specific statistics on columns where that would be useful. These aren't indices, but they do keep track of a statistical sampling of the data in the column so SQL Server can determine whether it makes sense to create a temporary index to return data.
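A sketch of creating such column statistics by hand (the names mirror the question's table and are assumptions):

-- Single-column statistics on processed; no index is built
CREATE STATISTICS ST_tablex_processed
ON tablex (processed)
WITH FULLSCAN;  -- scan every row instead of the default sample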
HTH
In SQL Server 2005, set auto create statistics and auto update statistics on. You won't have to worry about creating them or maintaining them yourself, since the database handles this very well itself.
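A sketch of turning both options on (MyDatabase is a placeholder name):

ALTER DATABASE MyDatabase SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS ON;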