SQL Server 2008 R2 DC Inserts Performance Change

I have noticed an interesting performance change that happens around 1.5 million inserted values. Can someone give me a good explanation of why this is happening?
The table is very simple. It consists of (bigint, bigint, bigint, bit, varbinary(max)).
I have a clustered primary key index on the first three bigints. The only data I insert into the varbinary(max) column is the boolean value "true".
From that point on, performance seems pretty constant.
Legend: Y (Time in ms) | X (Inserts 10K)
I am also curious about the constant, relatively small (sometimes very large) spikes I see on the graph.
Actual execution plan from before the spikes.
Legend:
Table I am inserting into: TSMDataTable
1. BigInt DataNodeID - FK
2. BigInt TS - main timestamp
3. BigInt CTS - modification timestamp
4. Bit ICT - flags the last inserted value for a node (increases read performance)
5. VarBinary(max) Data - the data itself (the boolean "true" mentioned above)
Environment
It is local and not sharing any resources.
It is a fixed-size database (large enough that it does not need to expand).
Computer: 4 cores, 8 GB RAM, 7200 rpm disk, Windows 7.
SQL Server 2008 R2 DC, processor affinity on cores 1 and 2, 3 GB memory.

Have you checked the execution plan once the time goes up? The plan may change depending on statistics. Since your data grows fast, the statistics will change and that may trigger a different execution plan.
Nested loops are good for small amounts of data, but as you can see, the time grows with volume. The SQL query optimizer then probably switches to a hash or merge plan which is consistent for large volumes of data.
To confirm this theory quickly, try to disable statistics auto update and run your test again. You should not see the "bump" then.
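If it helps, a minimal sketch of that test (the table name is taken from the question, the database name is a placeholder):

-- Turn off automatic statistics updates for the table under test...
EXEC sp_autostats 'dbo.TSMDataTable', 'OFF';

-- ...or for the whole database (MyTsmDb is an assumed name):
ALTER DATABASE MyTsmDb SET AUTO_UPDATE_STATISTICS OFF;

-- Re-run the insert test, then restore the defaults afterwards:
EXEC sp_autostats 'dbo.TSMDataTable', 'ON';
ALTER DATABASE MyTsmDb SET AUTO_UPDATE_STATISTICS ON;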
EDIT: Since Falcon confirmed that performance changed due to statistics we can work out the next steps.
I guess you do one-by-one inserts, correct? In that case (if you cannot insert in bulk) you'll be much better off inserting into a heap work table, then at regular intervals moving the rows in bulk into the target table. This is because for each inserted row, SQL has to check key duplicates, foreign keys and other constraints, and sort and split pages all the time. If you can afford to postpone these checks until a little later, you'll get superb insert performance, I think.
I used this method for metrics logging. Logging would go into a plain heap table with no indexes, no foreign keys, no checks. Every ten minutes, I create a new table of this kind, then with two "sp_rename"s within a transaction (swift swap) I make the full table available for processing and the new table takes the logging. Then you have the comfort of doing all the checking, sorting, splitting only once, in bulk.
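A rough sketch of that swap, with hypothetical table names (MetricsLog is the live heap taking the inserts, MetricsLog_Next is a freshly created empty copy):

-- Both tables are assumed to exist already, with identical structure (plain heaps, no indexes, no constraints).
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.MetricsLog', 'MetricsLog_Process';   -- full table goes off for processing
EXEC sp_rename 'dbo.MetricsLog_Next', 'MetricsLog';      -- empty table immediately takes over the logging
COMMIT TRANSACTION;
-- dbo.MetricsLog_Process can now be checked, sorted and bulk-moved into the target table at leisure.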
Apart from this, I'm not sure how to improve your situation. You certainly need to update statistics regularly as that is a key to a good performance in general.
Might try using a single column identity clustered key and an additional unique index on those three columns, but I'm doubtful it would help much.
Might try padding the indexes - if your inserted data are not sequential (see the sketch after this list). This would eliminate excessive page splitting, shuffling and fragmentation. You'll need to maintain the padding regularly, which may require an off-time window.
Might try to give it a HW upgrade. You'll need to figure out which component is the bottleneck. It may be the CPU or the disk - my favourite in this case. Memory is not likely imho if you have one-by-one inserts. It should be easy then: if it's not the CPU (the line hanging on top of the graph) then it's most likely your IO holding you back. Try a better controller with more cache and faster disks...
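For the padding idea, a hedged sketch (the fill factor value is illustrative, and padding only pays off if your inserts are not sequential):

-- Rebuild all indexes on the table, leaving 20% free space on leaf and intermediate pages.
ALTER INDEX ALL ON dbo.TSMDataTable
REBUILD WITH (FILLFACTOR = 80, PAD_INDEX = ON);
-- The free space erodes as new rows arrive, so the rebuild has to be repeated during maintenance windows.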

Related

How can a very large table with a single integer primary key index be tuned for massive volume of inserts?

Environment:
SQL Server 2019 on Windows Server 2019, on KVM backed by TrueNAS, 16 cores, 32 GB RAM.
Application runs 50 parallel threads all inserting into the same massive table.
This combination appears to work against the SQL Server architecture.
Additional details
The problem table is both deep and wide: 20,000,000 rows with over 300 columns and 40-50 indexes.
The application uses the JDBC batch APIs. For this particular table, due to its row size, inserts are batched at 1,000 rows.
Tables with more reasonable row sizes are batched at 10,000 rows.
I can't share the actual DDL, but it's pretty mundane apart from the row simply being massive (a surrogate key BIGINT ID column, two natural key VARCHAR columns, 300 or so cargo columns, 0 BLOB/CLOB columns, then 40-50 indexes)
The primary key index DDL is "create unique index mytable_pk on dbo.mytable (keycolumn);"
The only other unique index DDL is "create unique index mytable_ndx1 on dbo.mytable (division, itemnum)";
The product that owns the database is used by hundreds of Fortune 2000 customers, so changing the data model is not an option for me or the product vendor.
Restrictions
Since the database is ultimately a third party's, any changes I make to it must be in-place. Once the data is inserted into it, I no longer have any access to it.
The database is owned by a third-party off-the-shelf application.
The primary key is a sequential integer.
Observations and metrics
Early in the process, we were bottlenecked on CPU resources.
Once we hit about 1,000,000 rows, we were single threading on latches, sometimes spending over two seconds in a latch, and rarely spending less than 500ms in a latch. Latching and IO buffer waits were both excessive. CPU dropped to about 12% usage.
In a second test, I dropped all of the indexes and re-ran the job. The job completed 8 times as quickly, showing zero load on the SQL server and bottlenecking on CPU on the application which is very good from the SQL Server perspective.
After reading Microsoft's literature, I came to the conclusion that the data model is working against SQL Server's indexing architecture for tuning for massive inserts.
I will not always have the option of dropping and recreating the indexes, so I am looking for a way to tune the table itself to distribute the I/O.
** Now to the real question **
Is there a way to tune SQL Server, under the covers, to distribute the IO so that sequential values in an index do not land in the same buffer page when doing massive inserts of sequential data?
There are several well-known approaches to addressing last page insert contention in SQL Server.
Many of these are covered in the documentation at Resolve last-page insert PAGELATCH_EX contention in SQL Server. Summarising the options from that link:
1. Use OPTIMIZE_FOR_SEQUENTIAL_KEY (see the sketch below)
2. Move the primary key off the identity column
3. Make the leading key a non-sequential column
4. Add a non-sequential value as a leading key
5. Use a GUID as a leading key
6. Use table partitioning and a computed column with a hash value
7. Switch to In-Memory OLTP
Method 7 can also be implemented as an in-memory OLTP table to handle a high rate of ingestion with regular batch moves to the final destination table. For the very highest concurrency, use natively compiled code with the in-memory table as much as possible (including for the inserts). The frequency and size of moves is dictated by your requirements.
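For option 1, SQL Server 2019 (which this question is on) exposes it as an index option; a minimal sketch against the index name quoted in the question:

-- mytable_pk is assumed to be the unique index on the sequential surrogate key, per the DDL above.
ALTER INDEX mytable_pk ON dbo.mytable
SET (OPTIMIZE_FOR_SEQUENTIAL_KEY = ON);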
As mentioned in another answer, delayed durability can also improve insert performance in many cases.
Related Q & A: Solving periodic high PAGELATCH_EX Waits. Last page contention?
All that said, you haven't shown evidence of a last-page contention issue at all. More likely, you're encountering problems related to updating all those secondary indexes and a lack of memory on the instance meaning index maintenance often has to wait for pages to be brought in from storage for modification. You don't mention the type of latch you see waits on, but I imagine they'd be PAGEIOLATCH_*.
The primary solution would be to dramatically increase the memory available to SQL Server for its buffer pool so fewer IOs are necessary. Failing that, a faster storage subsystem would be required.
Have you tried using Delayed Durability?
When to use delayed transaction durability
Some of the cases in which you could benefit from using delayed transaction durability are:
You can tolerate some data loss.
If you can tolerate some data loss, for example, where individual records are not critical as long as you have most of the data, then delayed durability may be worth considering. If you cannot tolerate any data loss, do not use delayed transaction durability.
You are experiencing a bottleneck on transaction log writes.
If your performance issues are due to latency in transaction log writes, your application will likely benefit from using delayed transaction durability.
Your workloads have a high contention rate.
If your system has workloads with a high contention level, much time is lost waiting for locks to be released. Delayed transaction durability reduces commit time and thus releases locks faster, which results in higher throughput.
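A minimal sketch of enabling it (the database name is a placeholder); FORCED applies it to every transaction, while ALLOWED lets individual commits opt in:

-- Database-level setting:
ALTER DATABASE MyAppDb SET DELAYED_DURABILITY = ALLOWED;

-- Per-transaction opt-in from the loading code:
BEGIN TRANSACTION;
-- ... batched inserts ...
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);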
The short answer to your "real question" is no because contiguous keys of a disk-based b-tree index must be stored in the same page.
I've never used SQL server, but your problem isn't specific to one database, so maybe this can still help.
When inserting a large number of rows per second the bottlenecks are either going to be parsing overhead (which can be parallelized), index updates (which may be parallelizable or not), primary key sequence generation, or other stuff like postgres' large object support, but that depends on your column types and database quirks. Then at some point any transactional database must generate sequential transaction log entries which is also a concurrency bottleneck.
First thing you should do is check if the inserts are grouped into transactions (not one insert per transaction). Then make sure the IO is fast, look for bottlenecks there, iowait, etc.
In a second test, I dropped all of the indexes and re-ran the job. The job completed 8 times as quickly, showing zero load on the SQL server
So that eliminates some of the candidates and hints that the problem is indices.
For example, if 50 threads each insert a row at the same time, and...
You have a high-cardinality index where each row hits a different page in the index, then these inserts can be parallelized.
You have a low-cardinality index, most of the inserted rows have the same value in the indexed column, and all these threads are fighting for control of the same index page.
This can compound with index/table page splits if your fillfactor is too high, in this case all the threads will want to insert in the same index page, and it's already full, so one thread is splitting the page while all others are waiting.
Unfortunately you didn't post the table info in the question, which you should really do. But you probably know if your indices are low cardinality or high. The first thing you could do is run the same tests again, adding the indices one by one, try to see which one causes trouble.
You can also lower fillfactor so there is less chance the inserts end up in a page that is already full.
If you find a problematic low-cardinality index then you should first wonder if it's actually useful for queries; maybe you can drop it. If you want to keep it, you can hack it into a high-cardinality index by adding a dummy column at the end. For example, if you have an index on (category) which has few distinct values and causes problems for inserts, you can turn it into (category, other_column), which will work just as well for selecting based on category and might provide some extra features like sorting on other_column while selecting on category. However, other_column should not be the PK or a date or any other column whose values end up in the same page across all your concurrent inserts, because that would be back to square one.
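As a hedged illustration of that trick, using the hypothetical names from the paragraph above:

-- Low-cardinality index: concurrent inserts for the same category all fight over the same page.
-- DROP INDEX IX_MyTable_Category ON dbo.MyTable;

-- Widened replacement: still usable for WHERE category = ..., but entries for one category
-- spread over more pages, so concurrent inserts collide less often.
CREATE INDEX IX_MyTable_Category_Other
ON dbo.MyTable (category, other_column);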
Next, you can try single-threading, or a low number of threads. Back to this:
In a second test, I dropped all of the indexes and re-ran the job. The job completed 8 times as quickly, showing zero load on the SQL server and bottlenecking on CPU on the application which is very good from the SQL Server perspective.
This may look nice at first glance but there's a problem here. Basically your application is doing the easy things (processing rows) and delegating the hard things (ie, concurrency) to the database. That's fine until it exceeds the database's capabilities, then it breaks down. Databases are excellent at handling concurrency correctly, but doing it fast is a very hard problem: coordinating several cores on a lock has a hard performance limit, caused by latency of communication between the cores, which is the speed of information propagation, in other words the speed of light, which cannot be negotiated with.
Locks are just memory held as cache lines in CPU caches. So a side effect of the way multicore systems work is, it's much faster for the same core to reacquire a lock it just released, because the line is still in its cache, so there is no slow inter-core communication involved. Likewise, several cores attempting to modify different parts of the same index page will result in cache line exchanges between them and lots of communication to determine what core owns what byte in that page. And that is surprisingly slow, it can take microseconds instead of nanoseconds.
In addition you have 50 client threads, so 50 server threads, and only 16 cores, so on the database server the OS will multitask the 50 threads between the 16 cores. This means the OS will end up putting one thread to sleep while it's holding a lock, and when that happens, performance is destroyed.
So the next test you can do is to compare insertion time, with all your indices in place, between these two scenarios:
Your current one with 50 threads
Then stop it, copy the inserted data from your main table into a temp table, truncate the main table, and insert the exact same data again with:
INSERT INTO yourtable SELECT * FROM temptable
In the second case you're inserting the same data. For the test to be valid it should be in the same order, so you might want to add an ORDER BY primary key while copying the rows into the temp table, so they come out in the proper order. I don't know if the tables are clustered, but you'll find a way to get the order correct.
You can also try various orders, one of the indices may be faster if data is inserted in an order that it likes.
If the second insert is much faster than the multi-threaded one, then that will give you a clue about what you need to do. In this case that's probably a funnel, i.e. a process that gathers rows generated by the many threads and inserts them using a low number of threads, maybe just one.
This can simply be all the threads inserting into a non-indexed table, and a separate task flushing this table into the main one every X milliseconds.
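One way such a funnel could look in T-SQL (table and column names are hypothetical): the 50 threads insert into a bare heap, and a single background job drains it into the indexed table.

-- Run from one connection only, every X milliseconds/seconds.
-- Note: OUTPUT ... INTO requires the target table to have no triggers and no foreign keys;
-- otherwise fall back to INSERT ... SELECT followed by DELETE inside one transaction.
DELETE TOP (10000) s
OUTPUT DELETED.col1, DELETED.col2, DELETED.col3
INTO dbo.MainTable (col1, col2, col3)
FROM dbo.StagingHeap AS s;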

SQL Server: time-series data performance

I have a table of a little over 1 billion rows of time-series data with fantastic insert performance but (sometimes) awful select performance.
Table tblTrendDetails (PK is ordered as shown):
PK TrendTime datetime
PK CavityId int
PK TrendValueId int
TrendValue real
The table is continuously pulling in new data and purging old data, so insert and delete performance needs to remain snappy.
When executing a query such as the following, performance is poor (30 sec):
SELECT *
FROM tblTrendDetails
WHERE TrendTime BETWEEN @inMinTime AND @inMaxTime
AND CavityId = @inCavityId
AND TrendValueId = @inTrendId
If I execute the same query again (with similar times, but any @inCavityId or @inTrendId), performance is very good (1 sec). Performance counters show that disk access is the culprit the first time the query is run.
Any recommendations regarding how to improve performance without (significantly) adversely affecting the insert or delete performance? Any suggestions (including completely changing the underlying database) are welcome.
The fact that subsequent queries of the same or similar data run much faster is probably due to SQL Server caching your data. That said, is it possible to speed this initial query up?
Verify the query plan:
My guess is that your query should result in an Index Seek rather than an Index Scan (or worse, a Table Scan). Please verify this using SET SHOWPLAN_TEXT ON; or a similar feature. Using between and = as your query does should really take advantage of the clustered index, though that's debatable.
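For reference, one way to capture the estimated plan for the query from the question (parameter values are placeholders); the thing to look for is a Clustered Index Seek rather than a scan:

SET SHOWPLAN_TEXT ON;
GO
DECLARE @inMinTime datetime = '20230101',
        @inMaxTime datetime = '20230102',
        @inCavityId int = 1,
        @inTrendId int = 1;   -- placeholder values

SELECT *
FROM tblTrendDetails
WHERE TrendTime BETWEEN @inMinTime AND @inMaxTime
  AND CavityId = @inCavityId
  AND TrendValueId = @inTrendId;
GO
SET SHOWPLAN_TEXT OFF;
GO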
Index Fragmentation:
It is possible that your clustered index (the primary key in this case) is quite fragmented after all of those inserts and deletes. I would probably check this with DBCC SHOWCONTIG (tblTrendDetails).
You can defrag the table's indexes with DBCC INDEXDEFRAG (MyDatabase, tblTrendDetails).
This may take some time, but will allow the table to remain accessible, and you can stop the operation without any nasty side-effects.
You might have to go further and use DBCC DBREINDEX (tblTrendDetails). This is an offline operation, though, so you should only do this when the table does not need to be accessed.
There are some differences described here: Microsoft SQL Server 2000 Index Defragmentation Best Practices.
Be aware that your transaction log can grow quite a bit from defragging a large table, and it can take a long time.
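In sketch form, the commands referred to above (the database name is a placeholder; index_id 1 targets the clustered index):

-- Check fragmentation of the table and its clustered index:
DBCC SHOWCONTIG ('tblTrendDetails');

-- Online defrag of the clustered index; can be stopped mid-way without harm:
DBCC INDEXDEFRAG (MyDatabase, 'tblTrendDetails', 1);

-- Full rebuild of all indexes on the table; offline, so schedule accordingly:
DBCC DBREINDEX ('tblTrendDetails');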
Partitioned Views:
If these do not remedy the situation (or fragmentation is not a problem), you may even wish to look to partitioned views, in which you create a bunch of underlying base tables for various ranges of records, then union them all up in a view (replacing your original table).
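A stripped-down sketch of such a partitioned view over monthly base tables (names and date ranges are purely illustrative); the CHECK constraints are what let the optimizer touch only the relevant base tables:

-- One base table per month, each constrained to its own TrendTime range:
CREATE TABLE dbo.tblTrendDetails_202301 (
    TrendTime datetime NOT NULL
        CONSTRAINT CK_Trend_202301 CHECK (TrendTime >= '20230101' AND TrendTime < '20230201'),
    CavityId int NOT NULL,
    TrendValueId int NOT NULL,
    TrendValue real NULL,
    CONSTRAINT PK_Trend_202301 PRIMARY KEY (TrendTime, CavityId, TrendValueId)
);
-- ...repeat for the other months...
GO
-- The view replaces the original table name for readers:
CREATE VIEW dbo.tblTrendDetails_All
AS
SELECT TrendTime, CavityId, TrendValueId, TrendValue FROM dbo.tblTrendDetails_202301
UNION ALL
SELECT TrendTime, CavityId, TrendValueId, TrendValue FROM dbo.tblTrendDetails_202302;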
Better Stuff:
If performance of these selects is a real business need, you may be able to make the case for better hardware: faster drives, more memory, etc. If your drives are twice as fast, then this query will run in half the time, yeah? Also, this may not be workable for you, but I've simply found newer versions of SQL Server to truly be faster with more options and better to maintain. I'm glad to have moved most of my company's data to 2008R2. But I digress...

Database that can handle >500 millions rows

I am looking for a database that could handle more than 500 million rows (i.e. create an index on a column in a reasonable time and provide results for select queries in less than 3 seconds). Would PostgreSQL or MySQL on a low-end machine (Core 2 CPU 6600, 4 GB, 64-bit system, Windows Vista) handle such a large number of rows?
Update: Asking this question, I am looking for information on which database I should use on a low-end machine in order to provide results for select queries with one or two fields specified in the where clause. No joins. I need to create indices - it cannot take ages like on MySQL - to achieve sufficient performance for my select queries. This machine is a test PC to perform an experiment.
The table schema:
create table mapper (
    key VARCHAR(1000),
    attr1 VARCHAR(100),
    attr1 INT,
    attr2 INT,
    value VARCHAR(2000),
    PRIMARY KEY (key),
    INDEX (attr1),
    INDEX (attr2)
)
MSSQL can handle that many rows just fine. The query time is completely dependent on a lot more factors than just simple row count.
For example, it's going to depend on:
how many joins those queries do
how well your indexes are set up
how much ram is in the machine
speed and number of processors
type and spindle speed of hard drives
size of the row/amount of data returned in the query
Network interface speed / latency
It's very easy to have a small table (less than 10,000 rows) that takes a couple of minutes to execute a query against. For example, using lots of joins, functions in the where clause, and zero indexes on an Atom processor with 512 MB of total RAM. ;)
It takes a bit more work to make sure all of your indexes and foreign key relationships are good, that your queries are optimized to eliminate needless function calls and only return the data you actually need. Also, you'll need fast hardware.
It all boils down to how much money you want to spend, the quality of the dev team, and the size of the data rows you are dealing with.
UPDATE
Updating due to changes in the question.
The amount of information here is still not enough to give a real world answer. You are going to just have to test it and adjust your database design and hardware as necessary.
For example, I could very easily have 1 billion rows in a table on a machine with those specs and run a "select top(1) id from tableA (nolock)" query and get an answer in milliseconds. By the same token, you can execute a "select * from tablea" query and it takes a while, because although the query executes quickly, transferring all of that data across the wire takes a while.
Point is, you have to test. Which means, setting up the server, creating some of your tables, and populating them. Then you have to go through performance tuning to get your queries and indexes right. As part of the performance tuning you're going to uncover not only how the queries need to be restructured but also exactly what parts of the machine might need to be replaced (ie: disk, more ram, cpu, etc) based on the lock and wait types.
I'd highly recommend you hire (or contract) one or two DBAs to do this for you.
Most databases can handle this, it's about what you are going to do with this data and how you do it. Lots of RAM will help.
I would start with PostgreSQL, it's for free and has no limits on RAM (unlike SQL Server Express) and no potential problems with licences (too many processors, etc.). But it's also my work :)
Pretty much every non-stupid database can handle a billion rows today easily. 500 million is doable even on 32 bit systems (albeit 64 bit really helps).
The main problem is:
You need to have enough RAM. How much is enough depends on your queries.
You need to have a good enough disc subsystem. This pretty much means if you want to do large selects, then a single platter for everything is totally out of the question. Many spindles (or a SSD) are needed to handle the IO load.
Both Postgres as well as Mysql can easily handle 500 million rows. On proper hardware.
What you want to look at is the table size limit the database software imposes. For example, as of this writing, MySQL InnoDB has a limit of 64 TB per table, while PostgreSQL has a limit of 32 TB per table; neither limits the number of rows per table. If correctly configured, these database systems should not have trouble handling tens or hundreds of billions of rows (if each row is small enough), let alone 500 million rows.
For best performance handling extremely large amounts of data, you should have sufficient disk space and good disk performance—which can be achieved with disks in an appropriate RAID—and large amounts of memory coupled with a fast processor(s) (ideally server-grade Intel Xeon or AMD Opteron processors). Needless to say, you'll also need to make sure your database system is configured for optimal performance and that your tables are indexed properly.
The following article discusses the import and use of a 16 billion row table in Microsoft SQL.
https://www.itprotoday.com/big-data/adventures-big-data-how-import-16-billion-rows-single-table
From the article:
Here are some distilled tips from my experience:
The more data you have in a table with a defined clustered index, the slower it becomes to import unsorted records into it. At some point, it becomes too slow to be practical.
If you want to export your table to the smallest possible file, make it native format. This works best with tables containing mostly numeric columns because they're more compactly represented in binary fields than character data. If all your data is alphanumeric, you won't gain much by exporting it in native format. Not allowing nulls in the numeric fields can further compact the data. If you allow a field to be nullable, the field's binary representation will contain a 1-byte prefix indicating how many bytes of data will follow.
You can't use BCP for more than 2,147,483,647 records because the BCP counter variable is a 4-byte integer. I wasn't able to find any reference to this on MSDN or the Internet. If your table consists of more than 2,147,483,647 records, you'll have to export it in chunks or write your own export routine.
Defining a clustered index on a prepopulated table takes a lot of disk space. In my test, my log exploded to 10 times the original table size before completion.
When importing a large number of records using the BULK INSERT statement, include the BATCHSIZE parameter and specify how many records to commit at a time. If you don't include this parameter, your entire file is imported as a single transaction, which requires a lot of log space.
The fastest way of getting data into a table with a clustered index is to presort the data first. You can then import it using the BULK INSERT statement with the ORDER parameter.
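Translated into a command, the last two tips might look roughly like this (file path, batch size and table name are placeholders):

BULK INSERT dbo.mapper
FROM 'D:\export\mapper_native.dat'
WITH (
    DATAFILETYPE = 'native',   -- native-format export, as recommended above
    BATCHSIZE = 1000000,       -- commit every million rows instead of one giant transaction
    ORDER ([key] ASC),         -- file presorted by the clustered key
    TABLOCK
);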
Even that is small compared to the multi-petabyte Nasdaq OMX database, which houses tens of petabytes (thousands of terabytes) and trillions of rows on SQL Server.
Have you checked out Cassandra? http://cassandra.apache.org/
As mentioned pretty much all DB's today can handle this situation - what you want to concentrate on is your disk i/o subsystem. You need to configure a RAID 0 or RAID 0+1 situation throwing as many spindles to the problem as you can. Also, divide up your Log/Temp/Data logical drives for performance.
For example, let's say you have 12 drives - in your RAID controller I'd create 3 RAID 0 partitions of 4 drives each. In Windows (let's say) format each group as a logical drive (G, H, I) - now when configuring SQL Server (let's say) assign tempdb to G, the log files to H and the data files to I.
I don't have much input on which is the best system to use, but perhaps this tip could help you get some of the speed you're looking for.
If you're going to be doing exact matches of long varchar strings, especially ones that are longer than allowed for an index, you can do a sort of pre-calculated hash:
CREATE TABLE BigStrings (
BigStringID int identity(1,1) NOT NULL PRIMARY KEY CLUSTERED,
Value varchar(6000) NOT NULL,
Chk AS (CHECKSUM(Value))
);
CREATE NONCLUSTERED INDEX IX_BigStrings_Chk ON BigStrings(Chk);
-- Load 500 million rows into BigStrings
DECLARE @S varchar(6000);
SET @S = '6000-character-long string here';
-- nasty, slow table scan:
SELECT * FROM BigStrings WHERE Value = @S
-- super fast nonclustered seek followed by very fast clustered index range seek:
SELECT * FROM BigStrings WHERE Value = @S AND Chk = CHECKSUM(@S)
This won't help you if you aren't doing exact matches, but in that case you might look into full-text indexing. This will really change the speed of lookups on a 500-million-row table.
I need to create indices (that does not take ages like on mysql) to achieve sufficient performance for my select queries
I'm not sure what you mean by "creating" indexes. That's normally a one-time thing. Now, it's typical when loading a huge amount of data as you might do, to drop the indexes, load your data, and then add the indexes back, so the data load is very fast. Then as you make changes to the database, the indexes would be updated, but they don't necessarily need to be created each time your query runs.
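In SQL Server syntax that load pattern could look like the sketch below (index names are hypothetical; PostgreSQL and MySQL would use DROP INDEX / CREATE INDEX instead):

-- Disable the secondary indexes before the bulk load (their definitions are kept):
ALTER INDEX IX_mapper_attr1 ON dbo.mapper DISABLE;
ALTER INDEX IX_mapper_attr2 ON dbo.mapper DISABLE;

-- ...bulk load the 500 million rows here...

-- One rebuild afterwards re-enables and repopulates them:
ALTER INDEX ALL ON dbo.mapper REBUILD;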
That said, databases do have query optimization engines that will analyze your query and determine the best plan to retrieve the data: how to join the tables (not relevant in your scenario) and which indexes are available. Obviously you'd want to avoid a full table scan, so performance tuning and reviewing the query plan are important, as others have already pointed out.
The point above about a checksum looks interesting, and that could even be an index on attr1 in the same table.

sql server delete slowed drastically by indexes

I am running an archive script which deletes rows from a large (~50m record DB) based on the date they were entered. The date field is the clustered index on the table, and thus what I'm applying my conditional statement to.
I am running this delete in a while loop, trying anything from 1,000 to 100,000 records in a batch. Regardless of batch size, it is surprisingly slow; something like 10,000 records getting deleted a minute. Looking at the execution plan, there is a lot of time spent on "Index Delete"s. There are about 15 fields in the table, and roughly 10 of them have some sort of index on them. Is there any way to get around this issue? I'm not even sure why it takes so long to do each index delete; can someone shed some light on exactly what's happening here? This is a sample of my execution plan:
(Execution plan screenshot: http://img94.imageshack.us/img94/1006/indexdelete.png)
(The Sequence points to the Delete command)
This database is live and is getting inserted into often, which is why I'm hesitant to use the copy and truncate method of trimming the size. Is there any other options I'm missing here?
Deleting 10k records from a clustered index + 5 nonclustered ones should definitely not take 1 minute. Sounds like you have a really, really slow IO subsystem. What are the values for:
Avg. Disk sec/Write
Avg. Disk sec/Read
Avg. Disk Write Queue Length
Avg. Disk Read Queue Length
On each drive involved in the operation (including the log ones!). If you placed indexes in separate filegroups and allocated each filegroup to its own LUN or disk, then you can identify which indexes are more problematic. Also, the log flush may be a major bottleneck. SQL Server doesn't have much control here; it's all in your own hands how to speed things up. That time is not spent in CPU cycles, it is spent waiting for IO to complete, and you need an IO subsystem calibrated for the load you demand.
To reduce the IO load you should look into making indexes narrower. Primarily, make sure the clustered index is the narrowest possible one that works. Then make sure the nonclustered indexes don't include spurious, unused large columns (I've seen that...). A major gain may be had by enabling page compression. And ultimately, inspect index usage stats in sys.dm_db_index_usage_stats and see if any index is good for the axe.
If you can't reduce the IO load much, you should try to split it. Add filegroups to the database, move large indexes on separate filegroups, place the filegroups on separate IO paths (distinct spindles).
For future regular delete operations, the best alternative is to use partition switching, have all indexes aligned with the clustered index partitioning and when the time is due, just drop the last partition for a lightning fast deletion.
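The core of that rolling-window pattern, heavily abbreviated (the partition function/scheme setup is omitted and all names are illustrative):

-- Assumes dbo.ArchiveTable is partitioned on the entry date, all indexes are aligned,
-- and dbo.ArchiveTable_Old has an identical structure on the same filegroup.
ALTER TABLE dbo.ArchiveTable
SWITCH PARTITION 1 TO dbo.ArchiveTable_Old;   -- metadata-only operation, effectively instant

TRUNCATE TABLE dbo.ArchiveTable_Old;          -- or keep it around as the archive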
Assume for each record in the table there are 5 index records.
Now each delete is in essence 5 operations.
Add to that, you have a clustered index. Notice how the clustered index delete time is huge, (10x) longer than the other indexes? This is because your data is being reorganized with every record deleted.
I would suggest dropping at least that index, doing the mass delete, then reapplying it. Index operations on delete and insert are inherently costly. A single rebuild is likely a lot faster.
I second the suggestion that @NickLarsen made in a comment. Find out if you have unused indexes and drop them. This could reduce the overhead of those index deletes, which might be enough of an improvement to make the operation more timely.
Another more radical strategy is to drop all the indexes, perform your deletes, and then quickly recreate the indexes for the now smaller data set. This doesn't necessarily interrupt service, but it could probably make queries a lot slower in the meantime. Though I am not a Microsoft SQL Server expert, so you should take my advice on this strategy with a grain of salt.
More of a workaround, but can you add an IsDeleted flag to the table and update that to 1 rather than deleting the rows? You will need to modify your SELECTs and UPDATEs to use this flag.
Then you can schedule deletion or archiving of these records for off-hours.
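A minimal sketch of that workaround (table, column and cut-off are assumed names/values):

ALTER TABLE dbo.MyBigTable
ADD IsDeleted bit NOT NULL CONSTRAINT DF_MyBigTable_IsDeleted DEFAULT (0);

-- The archive script flips the flag instead of deleting:
UPDATE dbo.MyBigTable
SET IsDeleted = 1
WHERE EntryDate < DATEADD(MONTH, -6, GETDATE());

-- An off-hours job then physically deletes the flagged rows in batches.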
It would take some work to implement given this is in production, but if you are on SQL Server 2005/2008 you should investigate converting the table to a partitioned one; the removal of old data can then be achieved extremely quickly. It is designed for a 'rolling window' type effect and prevents large-scale deletes tying up a table/process.
Unfortunately, with the table in production, migrating it across to this technique will take some T-SQL coding, knowledge and a weekend to upgrade/migrate it. Once in place, though, any existing selects and inserts will work against it seamlessly; the partition maintenance and addition/removal is where you need the T-SQL to control the process.

SQL Server 2008 Performance: No Indexes vs Bad Indexes?

I'm running into a strange problem in Microsoft SQL Server 2008.
I have a large database (20 GB) with about 10 tables and I'm attempting to make a point regarding how to correctly create indexes.
Here's my problem: on some nested queries I'm getting faster results without using indexes! It's close (one or two seconds), but in some cases using no indexes at all seems to make these queries run faster... I'm running a CHECKPOINT and a DBCC DROPCLEANBUFFERS to reset the caches before running the scripts, so I'm kinda lost.
What could be causing this?
I know for a fact that the indexes are poorly constructed (think one index per relevant field), the whole point is to prove the importance of constructing them correctly, but it should never be slower than having no indexes at all, right?
EDIT: here's one of the guilty queries:
SET STATISTICS TIME ON
SET STATISTICS IO ON
USE DBX;
GO
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC FREEPROCCACHE;
GO
SELECT * FROM Identifier where CarId in (SELECT CarID from Car where ManufactId = 14) and DataTypeId = 1
Identifier table:
- IdentifierId int not null
- CarId int not null
- DataTypeId int not null
- Alias nvarchar(300)
Car table:
- CarId int not null
- ManufactId int not null
- (several fields follow, all nvarchar(100))
Each of these bullet points has an index, along with some indexes that simultaneously store two of them at a time (e.g. CarId and DataTypeId).
Finally, the Identifier table has over a million entries, while the Car table has two or three million.
My guess would be that SQL Server is incorrectly deciding to use an index, which is then forcing a bookmark lookup*. Usually when this happens (the incorrect use of an index) it's because the statistics on the table are incorrect.
This can especially happen if you've just loaded large amounts of data into one or more of the tables. Or, it could be that SQL Server is just screwing up. It's pretty rare that this happens (I can count on one hand the times I've had to force index use over a 15 year career with SQL Server), but the optimizer is not perfect.
* A bookmark lookup is when SQL Server finds a row that it needs on an index, but then has to go to the actual data pages to retrieve additional columns that are not in the index. If your result set returns a lot of rows this can be costly and clustered index scans can result in better performance.
One way to get rid of bookmark lookups is to use covering indexes - an index which has the filtering columns first, but then also includes any other columns which you would need in the "covered" query. For example:
SELECT
my_string1,
my_string2
FROM
My_Table
WHERE
my_date > '2000-01-01'
A covering index would be (my_date, my_string1, my_string2).
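In DDL terms, that could be either a key on all three columns or, equivalently for covering purposes, a narrower key with included columns; both are sketches using the example names above:

-- Key on all three columns, as described:
CREATE INDEX IX_My_Table_Covering
ON My_Table (my_date, my_string1, my_string2);

-- Same covering effect, but only my_date participates in the key:
CREATE INDEX IX_My_Table_Covering_Incl
ON My_Table (my_date)
INCLUDE (my_string1, my_string2);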
Indexes don't really have any benefit until you have many records. I say many because I don't really know what that tipping-over point is... it depends on the specific application and circumstances.
It does take time for SQL Server to work with an index, and if that time exceeds the benefit, the index hurts more than it helps. This is especially true in subqueries, where a small difference is multiplied.
If it works better without the index, leave out the index.
Try DBCC FREEPROCCACHE to clear the execution plan cache as well.
This is an empty guess. Maybe if you have a lot of indexes, SQL Server is spending time analyzing and picking one, and then rejecting all of them. If you had no indexes, the engine wouldn't have to waste its time with this vetting process.
How long this vetting process actually takes, I have no idea.
For some queries, it is faster to read directly from the table (clustered index scan), than it is to read the index and fetch records from the table (index scan + bookmark lookup).
Consider that a record lives along with other records in a data page, and the data page is the basic unit of IO. If the table is read directly, you could get 10 records for the cost of 1 IO. If the index is read directly, and then records are fetched from the table, you must pay 1 IO per record.
Generally SQL server is very good at picking the best way to access a table (direct vs index). There may be something in your query that is blinding the optimizer. Query hints can instruct the optimizer to use an index when it is wrong to do so. Join hints can alter the order or method of access of a table. Table Variables are considered to have 0 records by the optimizer, so if you have a large Table Variable - the optimizer may choose a bad plan.
One more thing to look out for - varchar vs nvarchar. Make sure all parameters are of the same type as the target columns. There's a case where SQL Server will convert the whole index to the parameter's type in the event of a type mismatch.
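A generic illustration of that pitfall, with a hypothetical table that has a varchar column:

-- nvarchar has higher data-type precedence than varchar, so comparing a varchar column
-- to an nvarchar parameter converts the column side and can disable an index seek.
DECLARE @p_bad nvarchar(100) = N'ABC123';
SELECT * FROM dbo.SomeTable WHERE SomeVarcharColumn = @p_bad;   -- risky: column gets converted

DECLARE @p_good varchar(100) = 'ABC123';
SELECT * FROM dbo.SomeTable WHERE SomeVarcharColumn = @p_good;  -- parameter matches the column type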
Normally SQL Server does a good job at deciding what index to use if any to retrieve the data in the fastest way. Quite often it will decide not to use any indexes as it can retrieve small amounts of data from small tables quicker without going away to the index (in some situations).
It sounds like in your case SQL Server may not be taking the optimal route. Having lots of badly created indexes may be causing it to pick the wrong route to get to the data.
I would suggest viewing the query plan in management studio to check what indexes its using, and where the time is being taken. This should give you a good idea where to start.
Another note: it may be that these indexes have become fragmented over time and are now not performing at their best; it may be worth checking this and rebuilding some of them if needed.
Check the execution plan to see if it is using one of these indexes that you "know" to be bad?
Generally, indexing slows down writing data and can help to speed up reading data.
So yes, I agree with you. It should never be slower than having no indexes at all.
SQL Server actually creates some indexes for you (e.g. on the primary key).
Indexes can become fragmented.
Too many indexes will always reduce performance (there are FAQs on why not to index every column in the database).
Also, there are some situations where indexes will always be slower.
run:
SET SHOWPLAN_ALL ON
and then run your query with and without the indexes; this will let you see which indexes, if any, are being used, where the "work" is going on, etc.
No. SQL Server analyzes both the indexes and the statistics before deciding whether to use an index to speed up a query. It is entirely possible that running a non-indexed version is faster than an indexed version.
A few things to try
Ensure the indexes are created, rebuilt and reorganized (defragmented).
Ensure that auto create statistics is turned on.
Try using SQL Profiler to capture a tuning workload and then use the Database Engine Tuning Advisor to create your indexes.
Surprisingly, the MS Press examination book for SQL administration explains indexes and statistics pretty well.
See Chapter 4 table of contents in this amazon reader preview of the book
Amazon Reader of Sql 2008 MCTS Exam Book
To me it sounds like your SQL is written very poorly and thus not utilizing the indexes that you are creating.
You can add indexes till you're blue in the face, but if your queries aren't optimized to use those indexes then you won't get any performance gain.
Give us a sample of the queries you're using.
Alright...
Try this and see if you get any performance gains (with the PK indexes):
SELECT i.*
FROM Identifier i
inner join Car c
on i.CarID=c.CarID
where c.ManufactId = 14 and i.DataTypeId = 1
