Processing data faster using SSIS and an SSD drive - sql-server

We have a new local server that processes data using SQL Server 2008 and SSIS. On this server I have dedicated drives for different things: the C drive is for the OS and software, the D drive is for database storage and SSIS, and the E drive is an SSD onto which we restore each database that SSIS uses.
We process a lot of data, and since the SSD is only 500 GB (because of the cost), our idea was to keep everything on a regular drive and transfer the databases in use to the SSD so that the process runs faster.
When I run the SSIS process without the SSD it takes about 8 hours, and when I run it with the databases restored on the SSD it takes about the same amount of time (in fact, if I include restoring the databases, it takes longer).
As of right now I cannot move the OS and software to the SSD to test whether that would help.
Is there a way to utilize the SSD to speed up the process?

If you want to speed up a given process, you need to find the bottleneck.
Generally speaking (since you give no details of the SSIS process), at each point in the operation one of the system's components (CPU, RAM, I/O, network) is operating at maximum speed. You need to find the component that contributes the most to your run time and then either speed it up by replacing it with a faster one or reduce the load on it by redesigning the process.
Since you have already ruled out the I/O to the user database(s), you need to look elsewhere. For a general ad hoc look, use the system's Resource Monitor (available through Task Manager). For a deeper look, there are lots of performance counters available via perfmon.exe, for the OS (CPU, I/O), SSIS and SQL Server.
If you have reason to believe that DB-I/O is your bottleneck, try moving the tempdb to the SSD (if you generally have lots of load on the tempdb that is a good idea anyway). Instructions here.
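As a rough illustration of the tempdb move mentioned above, the following sketch only builds the T-SQL statements; it does not connect to a server. The logical file names `tempdev` and `templog` and the physical names `tempdb.mdf` and `templog.ldf` are the defaults of a stock installation and are an assumption here, as is the target folder `E:\tempdb`. Check your own instance before running anything.

```python
def tempdb_move_statements(target_dir, files=None):
    """Build ALTER DATABASE statements that repoint the tempdb files to target_dir."""
    if files is None:
        # Assumption: logical and physical names of a default installation.
        files = {"tempdev": "tempdb.mdf", "templog": "templog.ldf"}
    target_dir = target_dir.rstrip("\\")
    return [
        f"ALTER DATABASE tempdb MODIFY FILE "
        f"(NAME = {logical}, FILENAME = '{target_dir}\\{physical}');"
        for logical, physical in files.items()
    ]


# Hypothetical target folder on the SSD; it must already exist on the server.
for statement in tempdb_move_statements("E:\\tempdb"):
    print(statement)
```

The new location only takes effect after the SQL Server service is restarted, because tempdb is recreated at startup.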
Unless you give us some more details about the SSIS process in question, that's all I can say for now.

Related

How to detach data disk in Azure SQL VM

For temporary testing purposes, I created a SQL VM in Azure, and the Azure wizard assigned me an OS disk of 127 GB and a data disk of 1 TB. But the data disk is a little too expensive for me, so I changed the server's default data and log paths to the OS (C) disk, backed up the database to the OS (C) disk, and then detached the data (F) disk.
The problem is that SQL Server fails to start without the data disk. What should I do if I want to run SQL Server without the data (F) disk?
The C drive is dedicated to the OS. While putting data and log files on the same drive as the OS may work for test/dev workloads, it is not recommended for production workloads. Depending on your VM type, the OS drive may use Standard Disks (HDD) or Premium Disks. There is an IOPS limit for each of these disk types. For instance, Standard Disks can support up to 500 IOPS, and it is important to keep these IOPS for the OS so that OS operations do not have to compete for IOPS with other applications. Starving OS operations can result in a VM restart.
Please send an email to AzureDisksPM#microsoft.com if you have additional questions.
Thanks,
Aung
As far as I recall, you don't really need a data disk to run SQL Server on an Azure VM. By default it uses the data disk to host database files, but you can move those to disk C and repoint SQL Server to them. There are many ways to do that; you can consult the official docs.
https://learn.microsoft.com/en-us/sql/relational-databases/databases/move-system-databases?view=sql-server-2017

Shared volume for data (multiple MDF) and another shared volume for logs (multiple LDF) on SAN

I have 3 instances of SQL Server 2008, each on a different machine, with multiple databases on each instance. I have 2 separate LUNs on my SAN for MDF and LDF files. The NDX and TempDB files run on the local drive of each machine. Is it OK for the 3 instances to share one volume for the data files and another volume for the log files?
I don't have thin provisioning on the SAN, so I would like to avoid constraining disk space by creating multiple volumes. I was advised that I should create a volume (drive letter) for each instance, if not for each database. I am aware that I should at least split my log and data files. No instance would share the actual database files, just the space on the drive.
Any help is appreciated.
Of course the answer is: "It depends". I can try to give you some hints on what it depends however.
A SQL Server instance "assumes" that it has exclusive access to its resources. So it will fill all available RAM by default, it will use all CPUs and it will try to saturate the I/O channels to get maximum performance. That's the reason for the general advice to keep your instances from concurrently accessing the same disks.
Another thing is that SQL Server "knows" that sequential I/O access gives you much higher throughput than random I/O, so there are a lot of mechanisms at work (like logfile organization, read-ahead, lazy writer and others) to avoid random I/O as much as possible.
Now, if three instances of SQL Server do sequential I/O requests on a single volume at the same time, then from the perspective of the volume you are getting random I/O requests again, which hurts your performance.
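A toy model can make this concrete. The sketch below is not a measurement of any real SAN: it assumes three perfectly sequential streams whose files start 1,000,000 blocks apart (a made-up layout) and assumes the volume receives their requests in strict round-robin order. It then compares the average jump between consecutive offsets for one stream alone and for the merged stream.

```python
def mean_seek_distance(offsets):
    """Average absolute jump between consecutive request offsets."""
    jumps = [abs(b - a) for a, b in zip(offsets, offsets[1:])]
    return sum(jumps) / len(jumps)


def sequential_stream(start, count, block=1):
    """Offsets of a purely sequential stream starting at `start`."""
    return [start + i * block for i in range(count)]


def interleave(streams):
    """Round-robin merge: an idealized view of what the shared volume receives."""
    return [offset for group in zip(*streams) for offset in group]


# Hypothetical layout: each instance's file begins 1,000,000 blocks after the previous one.
streams = [sequential_stream(start, 1000) for start in (0, 1_000_000, 2_000_000)]

single = mean_seek_distance(streams[0])
shared = mean_seek_distance(interleave(streams))
print(f"one instance alone: {single:.1f} blocks per jump")
print(f"three instances interleaved: {shared:.1f} blocks per jump")
```

Each instance on its own advances one block at a time, while the merged stream jumps across the volume on every request, which is the random-looking pattern described above.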
That being said, it is only a problem if your I/O subsystem is a significant bottleneck. If your logfile volume is fast enough that the intermingled sequential writes from the instances don't create a problem, then go ahead. If you have enough RAM on the instances that data reads can be satisfied from the buffer cache most of the time, you don't need much read performance on your I/O subsystem.
What you should avoid in any case is multiple growth steps on either log or data files. If several files on one filesystem are growing, you will get fragmentation, and fragmentation can turn a sequential read or write request, even from a single source, into random I/O again.
The whole picture changes again if you use SSDs as disks. These have totally different requirements and behaviour, but since you didn't say anything about SSD I will assume that you use a "conventional" disk-based array or RAID configuration.
Short summary: You might get away with it, if the circumstances are right, but it is hard to assess without knowing a lot more about your systems, from both the SAN and SQL perspective.

SQL Server - ETL approach

We get daily files that need to be loaded into our database. The files are delivered to a different server from the one hosting the database. Which of these 2 approaches is better for the ETL from a performance perspective?
1. Transfer the files from the delivery server to the database server, then do a bulk load.
2. Open a DB connection from the delivery server and load from there.
Edited to add: The servers are all on the same network.
It depends on whether the source servers are SQL Server or another technology; on the driver used (if it's Oracle, the Microsoft driver will hurt your performance badly, Oracle's own is better); on the amount of database overhead you want to impose (while one server is feeding the other they are probably both I/O bound); and on the disk layout you have (i.e. reading from one RAID and writing to the other). Compressing and transferring over a 1 Gbit or 100 Mbit link might be more efficient. Dumps usually compress nicely but, as Beth noticed, test it.
With dumps you can exploit parallelism (multiple disk shares, and multiple processors for compression; use 7-Zip, period). Over Ethernet you probably won't be able to exploit as much parallelism. The same applies to the target server.
All in all, as usual with performance, test, quantify, test, quantify, repeat:)
The universal response: "It depends". It depends particularly on what ETL technology you are using. If your ETL is tied to the database server for its processing power (SSIS, and to a lesser degree BODI), then you need to get your files onto the database server ASAP. If you have a more file-based ETL package (Ab Initio, Informatica), then you are free to do your transformation on your delivery server and then move your "ready-to-load" data onto the database server for bulk loading.
in all cases.
Especially if the files are very large, you can compress the data files before transporting them over the network.
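A minimal sketch of the compress-before-transfer step follows. It uses gzip from the Python standard library rather than 7-Zip, which an answer above recommends, simply to stay self-contained; the file name `daily.csv` and its contents are made up for the demonstration.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def compress_file(src, dst=None):
    """Gzip `src` into `dst` (default: the same name plus .gz) and return the new path."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_name(src.name + ".gz")
    # Stream the copy so that very large files are never held in memory at once.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical daily delivery file with repetitive rows.
        sample = Path(tmp) / "daily.csv"
        sample.write_bytes(b"id,value\n" + b"1,abc\n" * 100_000)
        packed = compress_file(sample)
        print(sample.stat().st_size, "bytes ->", packed.stat().st_size, "bytes")
```

Whether the CPU time spent compressing pays for itself depends on the data and the link speed, so measure both variants.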

What's the best storage for SQL Server 2000?

I want to keep my SQL Server database files on an Intel SS4000-E, which is a NAS device. Is it possible to use it as storage for SQL Server 2000? If not, what is the best solution?
I strongly recommend against it.
Put your data files locally on the server itself, with RAID mirrored drives. The reasons are twofold:
1. SQL Server will run much faster for all but the smallest workloads.
2. SQL Server will be much less prone to corruption in case the link to the NAS gets broken.
Use the NAS to store backups of your SQL Server, not to host your datafiles. I don't know what your database size will be, or what your usage pattern will be, so I can't tell you what you MUST have. At a minimum for a database that's going to take any significant load in a production environment, I would recommend two logical drives (one for data, one for your transaction log), each consisting of a RAID 1 array of the fastest drives you can stomach to buy. If that's overkill, put your database on just two physical drives, (one for the transaction log, and one for data). If even THAT is over budget, put your data on a single drive, back up often. But if you choose the single-drive or NAS solution, IMO you are putting your faith in the Power of Prayer (which may not be a bad thing, it just isn't that effective when designing databases).
Note that a NAS is not the same thing as a SAN (on which people typically DO put database files). A NAS typically is much slower and has much less bandwidth than a SAN connection, which is designed for very high reliability, high speed, advanced management, and low latency. A NAS is geared more toward reducing your cost of network storage.
My gut reaction: I think you're mad to risk your data on a NAS. SQL Server expects continuous, low-latency, uninterrupted access to your storage subsystem. The NAS is almost certainly none of those things. You want local or SAN storage (in order of performance, simplicity and therefore preference); leave the NAS for offline file storage and backups.
The following KB lists some of the constraints and issues you'd encounter trying to use a NAS with SQL - while the KB covers SQL 7 through 2005, a lot of the information still applies to SQL 2008 too.
http://support.microsoft.com/kb/304261
local is almost always faster than networked storage.
Your performance for sql will depend on how your objects, files, and filegroups are defined, and how consumers use the data.
Well, "best" means different things to different people, but I think "best" performance would be a TMS RAMSAN or a RAID of SSDs, etc.
Best capacity would be achieved with a RAID of large HDDs.
Best reliability/data safety would be achieved with mirroring across many drives and regular backups (preferably off site).
Best availability: I don't know, maybe clone the system and have a hot backup ready to go at all times.
Best security would require encryption, but mainly limiting physical access to the machine (and its backups) is enough unless it's connected to the internet.
As the other answers point out, there will be a performance penalty here.
It is also worth mentioning that these devices sometimes implement a RAM cache to improve I/O performance. If that is the case and you do trial this configuration, the NAS should be on the same power protection/UPS as the server hardware; otherwise, in case of a power outage, the NAS may lose the part of the file that is in cache. Ouch!
It can work but a dedicated fiber attached SAN will be better.
Local will usually be faster but it has limited size and won't scale easily.
I'm not familiar with the hardware but we initially deployed a warehouse on a shared NAS. Here's what we found.
We were regularly competing for resources on the head unit -- there was only so much bandwidth that it could handle. Massive warehouse queries and data loads were severely impacted.
We needed 1.5 TB for our warehouse (data/indexes/logs), and we put each of these resources onto a separate set of LUNs (like you might do with attached storage). Data was spanning just 10 disks, and we ran into all sorts of I/O bottlenecks. The better solution was to create one big partition across lots of small disks and store data, indexes and logs all in the same place. This sped things up considerably.
If you're dealing with a moderately used OLTP system, you might be fine but a NAS can be troublesome.

Which type of external drives are good for SQL backup files?

As part of database maintenance we are thinking of taking daily backups onto external/FireWire drives. Are there any specific drives recommended for the frequent read/write operations involved in taking backups from SQL Server 2000?
Whatever you do, just don't use USB 1.1.
The simple fact is that hard drives will fail over time. Unfortunately, the two best solutions I can recommend do not involve hard drives.
Tape backup is admittedly slower, but it gives you the option of offsite backups. It is easy to put a tape in the boot of a car. Rotating the tapes means that you have fairly recent protection against any unforeseen situations.
Another option is an online backup solution where the backups are encrypted and copied offsite. My recommendation is definitely to have at least some sort of offsite backup, outside the building that houses the SQL Servers. After all, it is "disaster" recovery.
Pretty much any external drive can be used here, provided it has the space to hold your backups and enough performance to get the backups there. The specifics depend on your exact requirements.
In my experience, FireWire tends to outperform USB for disk activity, regardless of their theoretical maximum transfer rates. And FireWire 800 will perform even better yet. I have found poor performance from FireWire and USB drives when you have multiple concurrent reads/writes going on, but with backups, it's generally more large sequential reads and writes.
Another option that is a little more complex to set up and manage, but can give you greater flexibility and performance, is external SATA (eSATA). You can even get hot-swappable external SATA enclosures for even greater convenience and ease of taking your backups offsite.
However, another related option that I've had excellent success with is to set up a separate server to act as your backup server. You can use whatever disk options you choose (FireWire, SATA, eSATA, SCSI, Fibre Channel, iSCSI, etc.) and share out that disk storage as a network share (I use NFS and Samba on a Linux box, but for a Windows-oriented network, a Windows share will work fine). You can then access the shares across the network and back up multiple machines to it. Also, separating the backup server from your production machines gives you greater flexibility if you need to take it offline for maintenance, adding or removing storage, etc.
Drobo!
A USB hard drive RAID array that uses normal - off the shelf hard drives. 4 bays, when you need more space, buy another hard drive. Out of bays? Buy bigger hard drives and replace your smallest in the array.
http://www.drobo.com/
Depending on the size of the databases, the speed of the drive can be a real factor. I would look into something like a Drobo but with an eSATA or SAS interface. There is nothing more entertaining than watching a terabyte go through USB 2.0. Also, you might consider something like HyperBac or Red Gate SQL Backup to compress the backup and make it easier to fit on the drive.
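To put a number on the terabyte-over-USB-2.0 remark, here is a back-of-the-envelope calculation. The 480 Mbit/s figure is the USB 2.0 signalling rate; the 30 MB/s sustained rate is only an assumption for illustration, since real drives and controllers vary.

```python
def transfer_hours(size_bytes, throughput_mb_per_s):
    """Hours needed to move size_bytes at a steady rate given in MB/s (1 MB = 10**6 bytes)."""
    return size_bytes / (throughput_mb_per_s * 1_000_000) / 3600


ONE_TB = 1_000_000_000_000

usb2_theoretical = 480 / 8  # 480 Mbit/s signalling rate = 60 MB/s, never reached in practice
usb2_assumed = 30           # assumed sustained rate in MB/s, for illustration only

print(round(transfer_hours(ONE_TB, usb2_theoretical), 1))  # 4.6
print(round(transfer_hours(ONE_TB, usb2_assumed), 1))      # 9.3
```

Even at the theoretical ceiling a 1 TB backup needs more than four and a half hours, which is why a faster interface or backup compression matters at this size.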
For the most part, external drives aren't a good option - unless your database is really small.
Other than some of the options others have listed, you can also use UNC/Network shares as a great 'off-box' option.
Check out the following video for some other options:
SQL Server Backup Options (Free Video)
And the videos on configuring backups on the site will show you how to specify a network path for backup purposes.
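As a sketch of what backing up to a network path looks like, the helper below only builds a `BACKUP DATABASE ... TO DISK` statement with a date-stamped file name on a UNC share. The server name `backupsrv`, the share `sqlbackups` and the database `Sales` are hypothetical.

```python
from datetime import date


def backup_statement(database, unc_share, day):
    """Build a full-backup statement that writes a date-stamped file to a UNC share."""
    share = unc_share.rstrip("\\")
    path = f"{share}\\{database}_{day:%Y%m%d}.bak"
    # WITH INIT overwrites any backup sets already stored in that file.
    return f"BACKUP DATABASE [{database}] TO DISK = N'{path}' WITH INIT;"


# Hypothetical server, share and database names.
print(backup_statement("Sales", r"\\backupsrv\sqlbackups", date(2024, 1, 31)))
```

The account that the SQL Server service runs under needs write permission on the share, otherwise the backup fails.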
