I have a database which contains data for events. For every event, the database size increases by 1 GB.
I have planned to create a separate database for archive. After completing each event, I plan to move that data to the archive database using a stored procedure.
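To make the plan concrete, this is roughly the kind of move procedure I have in mind (the table, column and database names below are just placeholders, not my real schema):

    CREATE PROCEDURE dbo.ArchiveEvent
        @EventId INT
    AS
    BEGIN
        SET NOCOUNT ON;
        SET XACT_ABORT ON;   -- roll everything back if any statement fails

        BEGIN TRANSACTION;

        -- Copy the finished event's rows into the archive database...
        INSERT INTO ArchiveDB.dbo.EventData (EventId, RecordedAt, Payload)
        SELECT EventId, RecordedAt, Payload
        FROM dbo.EventData
        WHERE EventId = @EventId;

        -- ...then remove them from the live database so it stays small.
        DELETE FROM dbo.EventData
        WHERE EventId = @EventId;

        COMMIT TRANSACTION;
    END;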
I have already added many indexes to the database to improve speed.
So is this a good technique, or is there a better way to improve the speed of the database?
Thanks in advance.
According to this article https://msdn.microsoft.com/en-us/library/bb402876.aspx
having the log file and data file on separate drives will improve performance, as I/O activity occurs at the same time for both the data and log files. Based on this information I have moved the non-system databases.
According to this article http://www.sqlservercentral.com/Forums/Topic1263294-1550-1.aspx the only system database log file that should be moved to a separate drive is tempdb's; the rest won't make a difference to performance, as master and model are hardly ever updated.
I want to know from the Stack Overflow community: will it be beneficial to move the system database log files (master, model, msdb) to another storage drive to improve performance on SQL Server 2012?
I'm currently trying to develop a solution for data logging related to a SCADA application, using SQL Server 2012 Express. The SCADA application is configured to execute a stored procedure on SQL Server to push data into the DB.
The data flow, IMHO, is quite heavy (1.4-1.9 million rows per day, averaging 43 bytes per row, after some tweaks). The table which stores the data has one clustered index on three columns. For now our focus is to store this data as compactly as possible and without generating too much fragmentation (SELECTs are not of major interest right now).
Currently the DB occupies ~250 MB (I have pre-allocated 5120 MB for the DB), and holds only this data table, one other table which can be ignored, and the transaction log.
My questions are:
How can I set up index maintenance on this DB? Being on Express edition I can't use SQL Server Agent, so I'll use Task Scheduler - but should I rebuild or reorganize? Is it advisable to use a fill factor under 100? Should I schedule the task at intervals short enough that it only ever needs to reorganize (fragmentation under 30%)? Once the DB reaches its maximum size, does rebuilding stay cheap if done often (if the index is rebuilt on day x, will day x+1 take less time than rebuilding only once every 2 days)?
Again, SQL Server Express edition limits the database size to 10 GB, and I'm trying to squeeze as much as I can into that amount. I'm planning to build a ring buffer: can I set up the DB so that, once the event log reports that the database could no longer grow (the ALTER DATABASE / expand failed), the stored procedure will UPDATE the oldest rows instead of inserting new ones? (My fear is that even updates will take some new space, and at that point I'll have to somehow aggressively shrink the DB.)
I have also considered storing the DB files on a compressed Windows partition, or using a free, unlimited DBMS such as MySQL for storage and SQL Server only as a front end - the SCADA app must be configured to talk to SQL Server. Is either worth considering?
To optimize inserts I'm using a global temp table which holds up to 1k rows (counted with a sequence) as a buffer; I then push the data into the main table and truncate the temp table. Is that efficient? Should I rely on transactions for efficiency instead - I've tried to begin a named transaction in the stored procedure if one doesn't exist, and commit it when the sequence reaches 1k? Does increasing the threshold to 10k rows lead to less fragmentation?
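For reference, the buffering logic currently looks roughly like this (names simplified; the real procedure is the one the SCADA app calls):

    CREATE PROCEDURE dbo.LogSample
        @TagId INT, @SampleTime DATETIME2(0), @Value REAL
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Buffer the incoming row in the global temp table.
        INSERT INTO ##SampleBuffer (TagId, SampleTime, Value)
        VALUES (@TagId, @SampleTime, @Value);

        -- Once ~1k rows have accumulated, flush them to the main table in one go.
        IF (SELECT COUNT(*) FROM ##SampleBuffer) >= 1000
        BEGIN
            BEGIN TRANSACTION;

            INSERT INTO dbo.Samples (TagId, SampleTime, Value)
            SELECT TagId, SampleTime, Value
            FROM ##SampleBuffer;

            TRUNCATE TABLE ##SampleBuffer;

            COMMIT TRANSACTION;
        END
    END;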
If you're thinking I'm unfamiliar with databases, you are right. At the moment there is only one SCADA application using SQL Server, but the application is set up redundantly, so in the end everything will take twice the resources (and each instance of the SCADA application will get its own storage). I should also mention that I can't just upgrade to a higher edition of SQL Server, but I do have the freedom to use any piece of free software.
Most of the answers cut across your four numbered questions, so I've just put my responses in bullets to help:
Indexes should probably be maintained, but in your case they can be prohibitive. Beyond the clustered index on the table, indexes (the nonclustered kind) generally exist to serve queries.
Since no Agent can be used, let a Scheduled Task do the work via the sqlcmd utility (https://technet.microsoft.com/en-us/library/ms165702%28v=sql.105%29.aspx). The command-line tools may have to be installed, but they let you write a batch script to run the SQL commands.
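For instance, the Scheduled Task could call something like sqlcmd -S .\SQLEXPRESS -E -d ScadaDb -i C:\Maintenance\IndexMaintenance.sql (instance name, database and path are assumptions), where the script picks reorganize or rebuild per index based on fragmentation. A rough sketch of such a script:

    -- IndexMaintenance.sql: reorganize lightly fragmented indexes, rebuild heavily fragmented ones.
    -- The 5% / 30% thresholds and minimum page count are common rules of thumb; tune as needed.
    DECLARE @sql NVARCHAR(MAX) = N'';

    SELECT @sql = @sql +
        CASE WHEN ps.avg_fragmentation_in_percent >= 30
             THEN N'ALTER INDEX ' + QUOTENAME(i.name) + N' ON '
                  + QUOTENAME(s.name) + N'.' + QUOTENAME(o.name) + N' REBUILD;'
             ELSE N'ALTER INDEX ' + QUOTENAME(i.name) + N' ON '
                  + QUOTENAME(s.name) + N'.' + QUOTENAME(o.name) + N' REORGANIZE;'
        END + CHAR(10)
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i ON i.object_id = ps.object_id AND i.index_id = ps.index_id
    JOIN sys.objects AS o ON o.object_id = ps.object_id
    JOIN sys.schemas AS s ON s.schema_id = o.schema_id
    WHERE ps.avg_fragmentation_in_percent >= 5
      AND ps.page_count > 100      -- ignore tiny indexes
      AND ps.index_id > 0;         -- ignore heaps

    EXEC sys.sp_executesql @sql;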
With an app doing as much inserting as you describe, I would design a 2-step process. First, a basic table with no nonclustered indexes to accept the inserts. Second, a table against which you'd query the data. Then use a scheduled task to call a stored proc that transfers the transactional data from table 1 to table 2, perhaps hourly or daily based on your query needs (and also removes the original data from table 1 after it has been transferred to table 2 - this should definitely be done in a transaction).
Otherwise, every insert has to not only insert the raw data for the table, but also insert the records for the indexes.
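A rough sketch of such a transfer procedure (table names are made up), which the same Scheduled Task approach could call hourly or daily:

    CREATE PROCEDURE dbo.MoveSamplesToReporting
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Only move rows that existed when the procedure started, so rows
        -- inserted while it runs are picked up by the next run rather than lost.
        DECLARE @watermark DATETIME2(0) =
            (SELECT MAX(SampleTime) FROM dbo.Samples_Ingest);

        IF @watermark IS NULL
            RETURN;     -- nothing to move

        BEGIN TRANSACTION;

        INSERT INTO dbo.Samples_Reporting (TagId, SampleTime, Value)
        SELECT TagId, SampleTime, Value
        FROM dbo.Samples_Ingest
        WHERE SampleTime <= @watermark;

        DELETE FROM dbo.Samples_Ingest
        WHERE SampleTime <= @watermark;

        COMMIT TRANSACTION;
    END;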
Due to the volume of your inserts, high fill factors should probably be avoided (probably set to less than 50%). A high (100%) fill factor means the nonclustered indexes don't leave any free space in their pages, so every record you insert forces pages to be split and reorganized. A lower fill factor leaves space in each page so new records can be inserted into the indexes without having to reorganize them.
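For example (hypothetical table name; tune the percentage to your insert pattern):

    -- Rebuild the queryable table's indexes leaving half of each leaf page free,
    -- in line with the "under 50%" suggestion above, so inserts cause fewer page splits.
    ALTER INDEX ALL ON dbo.Samples_Reporting
    REBUILD WITH (FILLFACTOR = 50);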
To optimize your inserts, I would use the 2-step process above to insert records straight into your first table. If you can have your app use SQL Bulk Copy, I would explore that as well.
To optimize space, you can explore a few things:
Do you need all the records accessible in real time? Perhaps you can work with the business to create a data retention policy in which you keep every record in the database for 24 hours, then a summary by minute or something for 1 week, hourly for 2 weeks, daily for 6 months, etc. You could enhance this with a daily backup so that you could restore any particular day in its entirety if needed.
Consider changing the database recovery model from full to simple or bulk-logged. This can keep your transaction log under control with the bulk inserts you may be doing.
More Info: https://technet.microsoft.com/en-us/library/ms190692%28v=sql.105%29.aspx
You'll have to work hard to manage your transaction log. Take frequent checkpoints and transaction log backups.
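A minimal example of both ideas (database name and backup path are assumptions; the two statements are alternatives, since a log backup is only meaningful outside simple recovery):

    -- Option A: switch to simple recovery so the log truncates on checkpoint
    -- (you give up point-in-time restore).
    ALTER DATABASE ScadaDb SET RECOVERY SIMPLE;

    -- Option B: stay in full/bulk-logged recovery and back the log up frequently,
    -- so the space inside it is reused instead of the file growing.
    BACKUP LOG ScadaDb TO DISK = N'D:\Backups\ScadaDb_log.trn';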
I have an application that has been in production, with its own database, for more than 10 years.
I'm currently developing a new application (kind of a reporting application) that only needs read access to the database.
In order not to be too tightly tied to that database, and to be able to use a newer DAL (Entity Framework 6 Code First), I decided to start from a new, empty database to which I only added the tables and columns I need (with different names than in the production database).
Now I need some way to update the new database from the production database regularly (ideally almost immediately).
I hesitated to ask this question on http://dba.stackexchange.com but I'm not necessarily limited to only using SQL Server for the job (I can develop and run some custom application if needed).
I have already done some searching and found these (partial) solutions:
Using transactional replication to create a smaller database (with only the tables/columns I need). But as far as I can see, the fact that I have different table names/column names will be problematic. So I could use it to create a smaller database that is automatically replicated by SQL Server, but I would still need to replicate that database into my new one (it might at least keep my production database from being stressed too much?)
Using triggers to insert/update/delete the rows
Creating some custom job (either a SQL Job or some Windows Service that runs every X minutes) that updates the necessary tables (I have a LastEditDate that is updated by a trigger on my tables, so I can know that a row has been updated since my last replication)
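For the custom-job option, this is a rough sketch of the per-table sync statement I have in mind (all names are placeholders, and it assumes both databases are reachable from the same instance or via a linked server; deletes would not be caught by LastEditDate alone):

    -- @lastSync would come from somewhere persistent, e.g. a SyncLog table in the new database.
    DECLARE @lastSync DATETIME2 =
        (SELECT MAX(SyncedUpTo) FROM ReportingDb.dbo.SyncLog);

    MERGE ReportingDb.dbo.Customer AS target
    USING (
        SELECT CustomerId, Name, City
        FROM ProductionDb.dbo.T_CUSTOMER
        WHERE LastEditDate > @lastSync
    ) AS source
    ON target.CustomerId = source.CustomerId
    WHEN MATCHED THEN
        UPDATE SET Name = source.Name,
                   City = source.City
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerId, Name, City)
        VALUES (source.CustomerId, source.Name, source.City);

    -- Record how far we got so the next run only picks up newer edits.
    INSERT INTO ReportingDb.dbo.SyncLog (SyncedUpTo) VALUES (SYSUTCDATETIME());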
Do you have any advice, or maybe other solutions that I didn't foresee?
Thanks
I think that transactional replication is better than using triggers.
Too many resources would be used on the source server/database, because a trigger fires for each DML transaction.
Transactional replication can be scheduled as a SQL job and run a few times a day/night, or as part of a nightly scheduled job. It really depends on how busy the source DB is...
There is one more thing that you could try - database mirroring. It depends on your SQL Server version.
If it were me, I'd use transactional replication, but keep the table/column names the same. If you have some real reason why you need them to change (I honestly can't think of any good ones and a lot of bad ones), wrap each table in a view. At least that way, the view is the documentation of where the data is coming from.
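For example, a view in the replicated database could expose the new names over the replicated table (illustrative names only):

    -- The reporting application (and its EF model) reads dbo.Customer, while the data
    -- still lives in the replicated table under its original production name.
    CREATE VIEW dbo.Customer
    AS
    SELECT cust_id   AS CustomerId,
           cust_name AS Name,
           cust_city AS City
    FROM dbo.T_CUSTOMER;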
I'm gonna throw this out there and say that I'd use Transaction Log shipping. You can even set the secondary DBs to read-only. There would be some setting up for full recovery mode and transaction log backups but that way you can just automatically restore the transaction logs to the secondary database and be hands-off with it and the secondary database would be as current as your last transaction log backup.
Depending on how current the data needs to be, if you only need it done daily you can set up something that will take your daily backups and then just restore them to the secondary.
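The restore side of log shipping looks roughly like this (paths and names are examples, and the secondary must first have been initialized from a full backup restored WITH NORECOVERY or STANDBY):

    -- Apply each shipped log backup on the secondary, keeping the database
    -- readable between restores (STANDBY = read-only with an undo file).
    RESTORE LOG ProductionDb
    FROM DISK = N'D:\LogShip\ProductionDb_20170101_0100.trn'
    WITH STANDBY = N'D:\LogShip\ProductionDb_undo.dat';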
In the end, we went for the trigger solution. We don't have that many changes a day (maybe 500, 1,000 tops), and it didn't put too much pressure on the current database. Thanks for your advice.
I am trying to come up with an archiving solution and would like to implement the following architecture:
Main Table - kept small.
Copy job - takes 3 months' worth of data and copies it into an archive table.
When the archive table reaches a certain number of records, the job creates a new table, so the database keeps rolling and accumulates approximately a calendar year's worth of records.
My questions are:
Are there any ready solutions I can refer to?
Common design practices to execute on?
For SQL Server 2005+, take a look at Partitioned Tables and Indexes in SQL Server 2005, especially the Sliding-Window Scenario portion of the article.
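A minimal sketch of the building blocks involved (boundary dates, names and the single-filegroup mapping are illustrative):

    -- Partition the log table by quarter.
    CREATE PARTITION FUNCTION pfQuarter (DATETIME)
    AS RANGE RIGHT FOR VALUES ('2017-01-01', '2017-04-01', '2017-07-01', '2017-10-01');

    CREATE PARTITION SCHEME psQuarter
    AS PARTITION pfQuarter ALL TO ([PRIMARY]);

    CREATE TABLE dbo.EventLog (
        EventTime DATETIME     NOT NULL,
        Payload   VARCHAR(200) NOT NULL
    ) ON psQuarter (EventTime);

    -- Sliding window: switch the oldest partition out to an archive table
    -- (same structure, same filegroup, currently empty), then remove the
    -- now-unused boundary so the window keeps rolling.
    ALTER TABLE dbo.EventLog SWITCH PARTITION 1 TO dbo.EventLogArchive;
    ALTER PARTITION FUNCTION pfQuarter() MERGE RANGE ('2017-01-01');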
How can I record all the Inserts and Updates being performed on a database (MS SQL Server 2005 and above)?
Basically, I want a table in which I can record all the inserts and updates issued on my database.
Triggers will be tough to manage because there are 100s of tables and growing.
Thanks, Bullish
We have hundreds of tables and growing, and we use triggers. In newer versions of SQL Server you can use Change Data Capture or Change Tracking, but we have not found them adequate for auditing.
What we have are two separate audit tables for each table: one records the details of the statement (1 row even if you updated a million records) and one records the actual old and new values. Each pair has the same structure and is created by running a dynamic SQL proc that looks for unaudited tables and creates the audit triggers. This proc is run every time we deploy.
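As a rough illustration, the generated trigger for one table looks something like this (names simplified; the real ones are produced by the dynamic SQL proc, and AuditHeaderId is assumed to be an identity column):

    CREATE TRIGGER trg_Customer_Audit
    ON dbo.Customer
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;

        -- One header row per statement, even if it touched a million records.
        INSERT INTO dbo.Customer_AuditHeader (AuditDate, AuditUser, RowsAffected)
        SELECT GETDATE(), SUSER_SNAME(), (SELECT COUNT(*) FROM inserted);

        -- One detail row per affected record, with old and new values
        -- (old values are NULL for plain inserts).
        INSERT INTO dbo.Customer_AuditDetail (AuditHeaderId, CustomerId, OldName, NewName)
        SELECT SCOPE_IDENTITY(), i.CustomerId, d.Name, i.Name
        FROM inserted AS i
        LEFT JOIN deleted AS d ON d.CustomerId = i.CustomerId;
    END;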
Then you should also take the time to write a proc to pull the data back out of the audit tables in case you need to restore the old values. This can be tricky to write on the fly with this structure, so it is best to have it handy before you have the CEO breathing down your neck while you restore the 50,000 user records that were accidentally deleted.
As of SQL Server 2008 and above you have change data capture.
Triggers, although unwieldy and a maintenance nightmare, will do the job on versions prior to 2008.
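If Change Data Capture is an option (it needs SQL Server Agent running, and before SQL Server 2016 SP1 it required Enterprise edition), enabling it per table is just a couple of calls; the schema and table names here are examples:

    -- Enable CDC for the database, then for each table you want tracked.
    EXEC sys.sp_cdc_enable_db;

    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'Customer',
        @role_name     = NULL;   -- NULL = no gating role

    -- Changes can then be read from the generated function, e.g.
    -- cdc.fn_cdc_get_all_changes_dbo_Customer(@from_lsn, @to_lsn, 'all').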