What exactly is I/O affinity in SQL Server?
A SQL Server instance uses 64 cores.
A performance issue has been observed when large amounts of data are written to tables under heavy system load.
Why limit the number of cores that handle I/O by using I/O affinity?
When should an affinity mask be created?
The affinity I/O mask controls how many CPU cores are used for disk operations and how many are used for the remaining SQL Server work.
--> affinity I/O mask Option
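As a sketch, the option can be set with sp_configure; the mask value below (binary 11, i.e. cores 0 and 1) is an illustrative assumption, not a recommendation for this workload:

```sql
-- Dedicate cores 0 and 1 to disk I/O (bitmask 0x3).
-- The bits chosen must not overlap the CPU affinity mask,
-- and changing this option requires an instance restart.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity I/O mask', 3;
RECONFIGURE;
```

On a 64-core machine, cores 32-63 are covered by the separate `affinity64 I/O mask` option.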
I'm having an issue with DB2OLEDB performance: SQL Server 2017 performing a data load from IBM i 7.3.
The client is a VMware VM; network settings seem OK and have been tweaked to the best of my ability (vmxnet3 driver 1.8). Load from other VMs or from the internet flies at over 100 Mbit/s.
Troubleshooting so far:
DB2OLEDB (Microsoft) performs substantially faster (3-5x) than IBMDASQL.
Setting I/O Affinity mask to one core doubles performance, but additional cores have no impact.
RSS is on.
DB2OLEDB inprocess on/off has no effect on throughput but off introduces substantial spool up time at the beginning of each query.
Performance is currently around 15 Mbit/s. The same table from another SQL Server (cached) loads about 3x faster at 50+ Mbit/s (different provider, obviously).
Interestingly, enabling rowcache spikes network throughput at the beginning to 100-150 Mbit/s, so I'm inferring that there is plenty of network bandwidth available.
Finally, we are using an in-memory table as the destination in order to eliminate disk I/O as a culprit.
The CPU is burning up one core and the remaining cores are at roughly 20%.
Any thoughts?
I suspect that DB2OLEDB driver or some part of COM is the bottleneck at this point.
edit: @MandyShaw (too long for comment) Windows side. IBM i never breaks 1% for my particular workload and it generally runs 25%-50% load depending on TOD. SQL statements are varied, everything from a straight four-part query to a 7-table snowflake as a passthrough. One interesting thing: throughput (network) varies based on row length. Wider tables appear to pump at roughly the same row rate as thinner tables. This is true for both the IBM and Microsoft drivers. Reducing network latency had a great impact on performance (see RSC issues with vmxnet3 driver 1.6.6.0). If I understand correctly, the OLEDB drivers fetch one row at a time (except possibly when loading the rowset cache).
In other words, for every row we're issuing a request from SQL server to COM/OLEDB driver to Supervisor Network Driver, to Hypervisor Network driver, to physical NIC, through fiber and landing at the IBM i. Then back again. We have successfully been able to multiplex large table loads using the service broker (but this is impractical for most applications). This, as well as other metrics, suggests that the IBM i has plenty of cpu and bandwidth to spare. The fiber fabric is mostly idle, we've tuned the bejeezus out of the hypervisor (VMware) and the supervisor (tcp/ip stack) as well as the SQL server itself.
This is why I'm looking at the COM/OLEDB provider for answers. Something in this model seems to stink. It's either not configured properly or simply doesn't support multiple threads of execution.
I'm also willing to accept that it's something in SQL server, but for the life of me I can't find a way to make a linked server query run multi-threaded using any combination of configuration, options or hints. Again, it may just not be possible by design.
At this point in time, the few known leads that I have involve (1) tuning network interrupt request coalescing and frequency to minimize interrupts to the OLEDB driver thread and (2) throwing the HIS gateway on a consumer x86 box with a high single core frequency (5ghz) in order to maximize single threaded performance.
These are both shitty options.
If you've got something particular in mind with the EBCDIC/ASCII conversion performance, I'd be happy to try it and report back. Shoot me a link/info.
I need to access a huge amount of data in a short time, so I came up with a solution: run the same SELECT query with a BETWEEN clause, changing its parameters, in parallel on the available CPU cores.
For this to work, I wanted to know whether SQL Server inherently uses all the cores efficiently for a SELECT statement, or whether it just uses a single core to process the query. I could not find much help on the internet regarding this.
If I could implement this approach on a quad-core, I would expect roughly a 4x reduction in access time compared with the normal query, scaling with the number of cores.
Is it possible to run multiple threads of a C# application on multiple cores to get parallel execution?
SQL Server does take parallel queries into account. Parallelizing manually is usually best avoided, as there is some overhead (and it is seldom possible to do better by hand). The maximum degree can be configured with sp_configure or in a WORKLOAD GROUP, and you can specify the MAXDOP hint in SELECTs to adjust it up or down.
Typically it depends on I/O throughput (especially if data is distributed over multiple paths) as well as the computational complexity of the queries and calculations.
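The two server-side knobs mentioned above can be set like this (a sketch; the values 8 and 4 are arbitrary examples, and the workload group requires Resource Governor):

```sql
-- Instance-wide cap on the degree of parallelism:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;

-- Or cap it only for sessions classified into a workload group:
CREATE WORKLOAD GROUP capped_dop WITH (MAX_DOP = 4);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```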
I have 2 physical CPU (each one is quad core) and hyperthreading is enabled. Task manager shows 16 logical processors. Current maxdop setting is default to zero.
When parallelism is used, all 16 logical processors can be used if available, so 16 schedulers can span the query. To clarify: when the query optimizer uses parallelism (especially with hyperthreading enabled), does it consider available logical cores rather than physical cores?
When SQL Server generates a parallel plan, it does not fix the DOP at which the plan will execute each time. The plan simply includes parallel operators, and the DOP is selected dynamically when the query is executed, based on factors such as server load and limitations like MAXDOP settings/hints.
While SQL Server is NUMA-aware, AFAIK it's not able to distinguish between a regular core and a Hyperthreaded core. They're all just logical processors.
If your workload benefits from parallel queries running on as much CPU horsepower as you can get, experiment with turning Hyperthreading off to see if it makes a difference. Hyperthreading is great for OLTP where there are lots of concurrent connections with short requests, but maybe not so much in your situation. You'll have to test.
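You can confirm how many schedulers SQL Server actually created for query work (one per logical processor it can see) by querying the `sys.dm_os_schedulers` DMV:

```sql
-- Schedulers available for user queries; on the machine described
-- above this should return 16 with hyperthreading on, 8 with it off.
SELECT COUNT(*) AS visible_online_schedulers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
```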
I am now preparing an export procedure from SQL Server to a third-party ERP system. It scans almost all tables in the database schema and creates XML for every product. The export takes long enough that the client wants it faster. I realised that the export procedure utilizes only one CPU.
So I think that if I run the same procedure several times with different parameters (different ranges of products), I can utilize all processors and it may be faster.
Questions
How can I do that using only SQL Server tools?
One possible solution is using SSIS. Any others?
The number of processors can vary. I can get the processor count from sys.dm_os_sys_info. How can I dynamically start the procedure several times depending on the processor count?
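A sketch of computing one product range per logical CPU; `dbo.Products`, `ProductId`, and `dbo.ExportProducts` are hypothetical names, and actually running the ranges concurrently still requires separate sessions (SQL Agent jobs, Service Broker activation, or SSIS), since a single T-SQL batch runs serially:

```sql
-- Split the product key space into @cpus roughly equal ranges.
DECLARE @cpus int = (SELECT cpu_count FROM sys.dm_os_sys_info);
DECLARE @min  int = (SELECT MIN(ProductId) FROM dbo.Products);
DECLARE @max  int = (SELECT MAX(ProductId) FROM dbo.Products);
DECLARE @step int = (@max - @min) / @cpus + 1;

-- Each range would then be handed to its own session, e.g.:
-- EXEC dbo.ExportProducts @from = @min, @to = @min + @step - 1;
-- EXEC dbo.ExportProducts @from = @min + @step, @to = @min + 2 * @step - 1;
-- ... one invocation per range, started from separate sessions.
```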
A couple of things to bear in mind:
1 - Are you SURE you are CPU bound and not IO bound? In my experience, it's very rare to have the CPU be the bottleneck in a process and much more likely to have disk access speed restrictions. You should do additional testing to avoid restructuring your entire process only to get a 2% speed increase because your hard drives can't keep up.
2 - It could be as simple as checking your server-side parallelism setting. Sometimes DBAs set this to 1 because it can mitigate issues with bad query plans, but normally there is limited benefit.
3 - You can set the max degree of parallelism on a query using OPTION (MAXDOP #) where # is the number of parallel processes you want to allow.
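For point 3, the hint goes at the end of the statement; `dbo.BigTable` below is a hypothetical name:

```sql
-- Allow up to 4 parallel workers for this query only,
-- regardless of the instance-wide setting.
SELECT SUM(Amount)
FROM dbo.BigTable
OPTION (MAXDOP 4);
```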
We have an 8-CPU 2.5 GHz machine with 8 GB of RAM that executes SQL queries more slowly than a dual-core 2.19 GHz machine with 4 GB of RAM.
Why is this the case, given that Microsoft SQL Server 2000 is installed on both machines?
Check these links to help identify where the bottleneck is:
http://www.brentozar.com/sql/
I think the disk layout and the location of the SQL Server database files are causing the trouble.
Our solution for multicore servers (our app executes many very complex queries, which tend to create many threads and these start to interlock and even deadlock sometimes):
sp_configure 'show advanced options', 1
reconfigure
go
sp_configure 'max degree of parallelism', 1
reconfigure
go
This is not an ideal solution, but we haven't noticed any performance loss for other operations.
Of course you should optimize disk layout too and sometimes limit SQL server memory for 64bit server.
Also, you may have different settings of SQL Server (memory assignments and AWE memory, threads, maximum query memory, processor affinity, priority boost).
Check the execution plans for the same query on both machines and, if possible, post them here. Most probably the plans will differ.
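On SQL Server 2000 you can capture a text form of the plan for side-by-side comparison like this (`dbo.Orders` is a hypothetical table):

```sql
-- Show the estimated plan instead of executing the query.
SET SHOWPLAN_TEXT ON
GO
SELECT OrderId, Amount FROM dbo.Orders WHERE Amount > 100
GO
SET SHOWPLAN_TEXT OFF
GO
```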
Keep in mind that just because one machine has more CPUs, a higher clock speed, and more memory than another, it won't necessarily solve a given problem faster.
Though you don't provide details, it's possible that the 8-CPU machine has 8 sockets, each with a single-core CPU (say, a P4-era Xeon) and 1 GB of local (say RDRAM) RAM. The second machine is a modern Core 2 Duo with 4GB of DDR2 RAM.
While each CPU in machine #1 has a higher individual frequency, the NetBurst architecture is much slower clock-for-clock than the Core 2 architecture. Additionally, if you have a light CPU load but a memory-intensive load that doesn't fit in the 1 GB local to the CPU on the first machine, your memory accesses may be much more expensive there (as they have to happen via the other CPUs). The DDR2 in the Core 2 machine is also much quicker than the RDRAM in the Xeon.
CPU frequency and total memory aren't everything -- the CPU architecture, Memory types, and CPU and memory hierarchy also matter.
Of course, it may be a much simpler answer as the other answers suggest -- SQL Server tripping over itself trying to parallelize the query.