SQL Server / SSISDB Jobs suddenly taking longer to complete

Beginning on 12/29/2022, several of our SSISDB packages started taking twice as long to finish their ETL runs. This delays the start of our daily company reporting.
There are no errors that give much of a clue, and there is nothing out of the ordinary in the logs. The company has been on a code deployment freeze for 3 weeks now, so I'm pretty sure it is not that.
The server CPU fluctuates between 30 and 60%, so I don't think it's a server resource issue. This is happening to several different ETLs. I have looked into the reasons why these jobs might "hang" or go "runaway", but I can find no discernible explanation.
Can you recommend steps for debugging? SQL Server Management Studio 2018.
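One possible first step is to check whether every package slowed down or only some of them. The sketch below is a minimal illustration, assuming you have exported package name, start time and end time for each run (for example from the SSISDB `catalog.executions` view) into Python tuples. The function name `slowdown_ratios` and the sample package names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def slowdown_ratios(runs, cutoff):
    """runs: iterable of (package_name, start, end) tuples.
    Returns {package: average duration after cutoff / average duration before}."""
    before, after = defaultdict(list), defaultdict(list)
    for name, start, end in runs:
        bucket = before if start < cutoff else after
        bucket[name].append((end - start).total_seconds())
    # Only packages with runs on both sides of the cutoff can be compared.
    return {name: mean(after[name]) / mean(before[name])
            for name in before
            if name in after and mean(before[name]) > 0}

# Hypothetical sample data: one package doubles in duration, one does not.
cutoff = datetime(2022, 12, 29)
runs = [
    ("LoadSales.dtsx", datetime(2022, 12, 27, 1, 0), datetime(2022, 12, 27, 2, 0)),
    ("LoadSales.dtsx", datetime(2022, 12, 28, 1, 0), datetime(2022, 12, 28, 2, 0)),
    ("LoadSales.dtsx", datetime(2022, 12, 29, 1, 0), datetime(2022, 12, 29, 3, 0)),
    ("LoadDims.dtsx", datetime(2022, 12, 28, 1, 0), datetime(2022, 12, 28, 1, 30)),
    ("LoadDims.dtsx", datetime(2022, 12, 29, 1, 0), datetime(2022, 12, 29, 1, 30)),
]
ratios = slowdown_ratios(runs, cutoff)
for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.1f}x")
```

If only a few packages show a ratio near 2, what they have in common (a shared source, destination or connection) is probably a better lead than server-wide resources.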

Related

Avoiding "schema drift detected" errors in SSDT comparisons

I'm trying to update a SQL Server project in Visual Studio 2019 by using the SSDT schema comparison. My source is a running database server, the destination is the VS SQL Server project.
When the comparison is done and I click "Update", I get the message
Source schema drift detected. Press Compare to refresh the comparison
No matter how many times I refresh the comparison, I always get the same result.
I have tried various connection tweaks (read-only intent, asynchronous processing, multiple active result sets) in the hopes that I can make the comparison run faster and update the project before the drift happens, but to no avail. I have also tried reducing the types of objects included in the comparison, but have not been able to reduce it enough to prevent drift from being detected.
I think the biggest issue I have is that aside from the "schema drift detected" message, I feel like I'm shooting in the dark. By that I mean that I have no idea what is causing SSDT to detect drift, and therefore I can't work around it.
I tried running the SQL Profiler to capture what SSDT is doing so I could find where SSDT is detecting drift. However, I haven't been able to find any query that gives different results when run multiple times within a short period.
So in conclusion, my questions are:
What does SSDT look at to determine when the database schema has drifted?
How can I update my SQL Server project when it always detects schema drift?
I also struggled for months to find the cause of the same error. I was already thinking about reinstalling Windows 10 on my laptop. I won't list all the dead ends. In my final desperation, I copied the SQL Server database and VS project to another machine, and there the comparison worked without a hitch. That raised the suspicion that the error is not in VS, but rather that my SQL Server is confusing VS.
I am running SQL Server 2012. I installed the latest service pack (SP4) and, wonder of wonders, compare and sync started working perfectly right away. Of course, now before every update I pray a little that I won't encounter the "Source schema drift detected" message.
I have been unsuccessfully fighting this annoying error for MANY SSDT versions.
Searching for it you will see multiple places where it is claimed to be fixed, WHICH IS FALSE, as it is happening right now with VS 2022 SSDT.
In my case, it ONLY happens when comparing against ONE out of the 5 database servers I regularly use the tool with.
The only workaround I have found that usually works is to REBOOT the destination database server (NOT just cycle the SQL Server Service) and then run the SSDT compare QUICKLY!
As the server that this happens on is an integration server running on a VM in my local network, I can bounce the server, but in other scenarios this would be a show-stopper.
IMO the most onerous thing about this issue is that you cannot even generate the script to copy / paste into SSMS, which is how I often use the tool.
This issue has not been fixed for YEARS and is very intermittent, so I have no hope of seeing it actually fixed - I hope this workaround is helpful to someone.

SQL Server job hangs when calling an SSIS package until agent is restarted

I have googled and read many questions/answers, but only one question has ever sounded exactly the same and it did not have an answer.
The situation:
My group has several SQL Servers that are running SQL Server 2017. They are configured virtually identically.
These servers are build boxes, meaning they pull data from a data warehouse or an extract file, run some ETL processing and then push to a prod box. SSIS packages are deployed on the box where the DB resides.
Just over a month ago (with no updates having occurred), one of these servers started having an issue where all the jobs that ran an SSIS package would "hang" on the step that ran the package. Any other step runs fine. But a job step that runs a package (all jobs do this), will not even start the package. The package shows no indication in the executions that anything has even tried to start it.
If the user executes the deployed package it will run successfully.
The only thing that will "fix" the issue is restarting the agent service.
I created a simple job to run a simple package every 5 minutes. It had been running for about a week; the last successful run was 4/11/2021 at 2:40am, and the 2:45 run hung. I could find nothing in the event logs that occurred at that time. The server was rebooted as a normal scheduled process at 3:15 and was back online by 3:25, which is the next time the job tried to run, and it again just hung. So even a server reboot did not fix the issue.
I am at my wits' end. Since there is no error (the job hangs and the package does not even start), there is no logging that I can find that shows any issues, and I am at a loss as to what might cause this.
Thanks in advance.
Take a look at the SSISDB catalog database on each of the servers involved. Has it grown very large, so that the history needs to be cleared down or the retention settings changed? How big are the transaction logs for those databases?

SQL Server 2008 R2 Job Launched Step 1 hundreds of times

I have an ETL SSIS package that is scheduled via job to run nightly at 7pm. It is the only step in the job, and the failure action is "quit the job reporting failure". The server is Windows Server 2008 R2, and the SQL Server version is 2008 R2. There is also an instance of SQL Server 2012 installed on this server, but the services are not started for that instance.
I've made no changes to the job, package, or server, and tonight it behaved strangely. When I look at the history of the job and expand tonight's run, it shows step 1 starting over 400 times, all at exactly 7 PM. It looks like it just kept launching the step until the transaction log filled the entire drive and had no more space to grow, then exited the job reporting failure. I shrank the transaction log by setting the recovery model to simple and running DBCC SHRINKFILE. I then restarted all of the SQL services for that instance and re-ran the job. So far, it seems to be running as expected, although I suppose time will tell.
I searched Stack Overflow and have seen nothing like this mentioned. We're actually starting a project to virtualize the box, then upgrade to 2012, so this may end up being one of those oddball things that never happens again, but I thought I'd ask in case anyone has any idea why this might have happened.
Open the job step and go to the Advanced tab. Look at the retry attempts. Could it be set to a big number? That would make the step run many times if it fails.
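To illustrate why a large retry count would match the symptoms, here is a small Python sketch. It is not how SQL Server Agent is implemented; it only models the arithmetic, assuming every attempt fails right away and is retried after the configured interval. The function `launch_times` and the date used are hypothetical.

```python
from datetime import datetime, timedelta

def launch_times(start, retry_attempts, retry_interval_minutes):
    """Return the launch time of the first attempt plus every retry,
    assuming each attempt fails immediately (a simplification)."""
    step = timedelta(minutes=retry_interval_minutes)
    return [start + i * step for i in range(retry_attempts + 1)]

start = datetime(2013, 1, 1, 19, 0)  # hypothetical 7 PM schedule
runs = launch_times(start, retry_attempts=400, retry_interval_minutes=0)

print(len(runs))                             # 401: first attempt plus 400 retries
print({t.strftime("%H:%M") for t in runs})   # {'19:00'}: every launch in the same minute
```

With a retry interval of 0 minutes, all launches carry the same timestamp, which is consistent with a history showing hundreds of step starts "all at exactly 7 PM".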

SSIS Package Takes Longer Time To Complete

I've done a migration from SQL 2008 to SQL 2014. Unfortunately, one of the SSIS packages, which took only 6 hours to run on 2008, is now taking 8 hours on 2014.
Can somebody tell me why this is happening and how I can solve this problem? Is it something to do with a setting?
I appreciate any idea/help from you guys. Thanks in advance.
There could be several causes:
Check that the operating system is the same as on the SQL 2008 server.
Check the SQL Server memory setting:
Right-click: Server properties -> Memory -> Maximum Server Memory
If this is a virtual machine, the virtualization team sometimes lowers its CPU allocation for the benefit of another machine.
What about logging?
In 2012, the concept of project deployments was born. In addition to that concept, a centralized SSIS database was created by default when an Integration Services server was installed. Are you deploying the packages to a server to be run? If so, then logging might slow you down (see http://msdn.microsoft.com/en-us/library/hh231191.aspx), especially if the default is set to verbose and/or you're doing your own custom logging (for each event, two executions happen).
Your SSIS server may be drowning in the default logging on top of the standard workload of the data movements in the package. Try turning logging down or off. Basic works well for us. While the package is executing, monitor any resources that are running too high. That could give you some hints about potential bottlenecks and where else to look.

Understanding SQL Profiler trace

I'm currently experiencing some problems on my DotNetNuke SQL Server 2005 Express site on Win2k8 Server. It runs smoothly most of the time. However, occasionally (on the order of once or twice an hour) it runs very slowly indeed. From a user perspective it's almost like there's a deadlock of some description when this occurs.
To try to work out what the problem is I've run SQL Profiler against the SQL Express database.
Looking at the results, some specific questions I have are:
The SQL trace shows an Audit Logon and Audit Logoff for every RPC:Completed - does this mean Connection Pooling isn't working?
When I look in Performance Monitor at ".NET CLR Data", then none of the "SQL client" counters have any instances - is this just a SQL Express lack-of-functionality problem or does it suggest I have something misconfigured?
The queries running when the slowness occurs don't seem unusual; they run fast at other times. What other perfmon counters or other trace/log files can you suggest as useful tools for my further investigation?
Jumping straight to Profiler is probably the wrong first step. First, try checking the Perfmon stats on the server. I've got a tutorial online here:
http://www.brentozar.com/perfmon
Start capturing those metrics, and then after it's experienced one of those slowdowns, stop the collection. Look at the performance metrics around that time, and the bottleneck will show up. If you want to send me the csv output from Perfmon at brento#brentozar.com I can give you some insight as to what's going on.
You might still need to run Profiler afterwards, but I'd rule out the OS and hardware first. Also, just a thought - have you checked the server's System and Application event logs to make sure nothing's happening during those times? I've seen instances where, say, the antivirus client downloads new patches too often, and does a light scan after each update.
My spidey sense tells me that you may have SQL Server blocking issues. Read this article to help you monitor blocking on your server to check if it's the cause.
If you think the issues may be performance related and want to see what your hardware bottleneck is, then you should gather some CPU, disk and memory stats using perfmon and then correlate them with your Profiler trace to see if the slow response is related.
On connection pooling: no.
On the empty ".NET CLR Data" counters: nothing wrong with that. It shows that you're not using the .NET functionality embedded in SQL Server.
You can check http://www.xsqlsoftware.com/Product/xSQL_Profiler.aspx for more detailed analysis of a Profiler trace. It has reports that show top queries by time or CPU (not one single query, but the sum of all executions of a single query).
Some other things to check:
Make sure your data files or log files are not auto-extending.
Make sure your anti-virus is set to ignore your SQL data and log files.
When looking at the Profiler output, be sure to check the queries that finished just prior to your targets; they could have been blocking.
Make sure you've turned off Auto-close on the database; re-opening after closing takes some time.
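The "finished just prior to your targets" check can be done by eye, but on a long trace a small script helps. Below is a minimal Python sketch, assuming the Profiler trace has been exported to rows with a statement text, a start time and an end time. `TraceRow`, `possible_blockers` and the 2-second window are hypothetical choices, and a match only marks a candidate; it does not prove blocking.

```python
from datetime import datetime, timedelta
from typing import NamedTuple

class TraceRow(NamedTuple):
    text: str
    start: datetime
    end: datetime

def possible_blockers(rows, target, window=timedelta(seconds=2)):
    """Rows that started before the target finished and ended no more
    than `window` before the target ended, latest finisher first."""
    hits = [r for r in rows
            if r is not target
            and r.start < target.end
            and target.end - window <= r.end <= target.end]
    return sorted(hits, key=lambda r: r.end, reverse=True)

# Hypothetical trace: a 30-second SELECT that normally runs fast.
t0 = datetime(2009, 1, 1, 10, 0, 0)
sec = lambda n: timedelta(seconds=n)
target = TraceRow("SELECT slow", t0, t0 + sec(30))
rows = [
    target,
    TraceRow("UPDATE big", t0 - sec(5), t0 + sec(29)),    # ends 1s before the target
    TraceRow("SELECT quick", t0 + sec(1), t0 + sec(2)),   # ends long before the target
    TraceRow("INSERT later", t0 + sec(40), t0 + sec(41)), # starts after the target ended
]
candidates = [r.text for r in possible_blockers(rows, target)]
print(candidates)  # ['UPDATE big']
```

Widening `window` catches more candidates at the cost of more noise, so it is worth starting small and increasing it only if nothing turns up.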
