SSIS stuck in validation when running package (SQL Server 2016) - sql-server

I've got a very basic SSIS package containing a task with a simple Data Flow in it. The source is a table in a database on one SQL Server instance, and the destination is an identical table in a database on a different instance. I've done this kind of thing plenty of times, but this time I'm getting a weird occurrence.
When I run the package, it freezes on validation of the destination once it reaches 50%. I've turned off referential integrity for the table, and I'm doing a fast load keeping identity values and nulls, with table lock and check constraints ticked. The table only contains about 200,000 records and should be quick to copy.
Anyone know what could be causing this?
As an extra bit of information, if I execute the task directly it runs fine.

Well, that was very odd. It turns out that the previous task (an Execute SQL Task) was messing things up. I was turning off referential integrity for the table, which shouldn't have adversely affected the Data Flow. Removing it sorted out the problem. To be honest, I don't need it in this case; it was just copied from a previous package that did need it.
Weird.
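For reference, the "turn off referential integrity" step in an Execute SQL Task usually amounts to something like the sketch below; the table and constraint names here are placeholders, not the actual package. An ALTER TABLE like this takes a schema modification lock on the table, which is one plausible reason the destination's validation sat waiting.

    -- Hypothetical example: disable a foreign key before the load, re-enable afterwards.
    -- Table and constraint names are placeholders.
    ALTER TABLE dbo.TargetTable NOCHECK CONSTRAINT FK_TargetTable_Parent;

    -- ... data flow runs here ...

    -- WITH CHECK revalidates existing rows so the constraint is trusted again.
    ALTER TABLE dbo.TargetTable WITH CHECK CHECK CONSTRAINT FK_TargetTable_Parent;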

Related

SQL Server Complex Add/Update merge million rows daily

I have 7 reports which are downloaded daily, late at night.
These reports can be downloaded in CSV or XML. I am downloading them in CSV format as it is more memory efficient.
This process runs in the background and is managed by Hangfire.
After they are downloaded, I am using Dapper to run a stored procedure which inserts/updates/deletes data using MERGE statements. This stored procedure has seven table-valued parameters.
Instead of deleting, I am updating the record's IsActive column to false.
Note that 2 reports have more than 1 million records.
I am getting timeout exceptions only in Azure SQL; in SQL Server it works fine. As a workaround, I have increased the timeout to 1000 for this query.
This app is running on Azure S2.
I have pondered the option of sending XML, but I have found SQL Server is slow at processing XML, which is counterproductive.
I also cannot use SqlBulkCopy, as I have to update based on some conditions.
Also note that more reports will be added in the future.
Also, when a new report is added there is a large number of inserts; if a previously added report is run again, it is mostly updates.
These tables currently have no indexes other than a clustered integer primary key.
Each row has a unique code, which is used to decide whether to insert, update, or delete.
Can you recommend a way to increase performance?
Is your source sending the whole data set, whether updated or new? From your mention of the unique code (insert/update/delete) I assume you are only considering changes (the delta); if not, that's one area to look at. Another is parallelism: you would then need a separate stored procedure for each table, and non-dependent tables could be processed together.
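Given that the tables only have a clustered integer primary key, an index on the unique code column is likely the single biggest win, since every MERGE has to locate rows by that code. Below is a minimal sketch of how one of the seven merges might look with a table-valued parameter; all object and column names (dbo.ReportRowType, dbo.Report1, Code, Amount) are placeholders, not the actual schema.

    -- Assumed index on the matching key; without it each MERGE scans the whole table.
    CREATE UNIQUE INDEX IX_Report1_Code ON dbo.Report1 (Code);
    GO

    -- Table type keyed on the unique code (placeholder columns).
    CREATE TYPE dbo.ReportRowType AS TABLE
    (
        Code     varchar(50)    NOT NULL PRIMARY KEY,
        Amount   decimal(18, 2) NOT NULL,
        IsActive bit            NOT NULL
    );
    GO

    CREATE PROCEDURE dbo.UpsertReport1
        @Rows dbo.ReportRowType READONLY
    AS
    BEGIN
        SET NOCOUNT ON;

        MERGE dbo.Report1 WITH (HOLDLOCK) AS t
        USING @Rows AS s
            ON t.Code = s.Code
        WHEN MATCHED THEN
            UPDATE SET t.Amount = s.Amount, t.IsActive = s.IsActive
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (Code, Amount, IsActive)
            VALUES (s.Code, s.Amount, s.IsActive)
        -- Soft delete: only valid if @Rows contains the complete report;
        -- drop this clause for delta-only loads.
        WHEN NOT MATCHED BY SOURCE THEN
            UPDATE SET t.IsActive = 0;
    END

Splitting each TVP into smaller batches is also a common way to stay within the limited resources of an S2 tier, which may be why the timeouts only appear in Azure.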

SSIS doesn't finish insert when there are 34,000 rows - does my DB have some kind of limitation?

I created a package in SSIS that performs an insert into different databases; the table is the same.
But my package doesn't finish when there are a lot of rows; it keeps running forever.
I have noticed that this problem doesn't occur for a database located on another instance. I don't know - maybe my database doesn't allow a huge insert?
I did some tests inserting fewer rows and my package worked well. I think my problem may be some role in my DB ...
Flow of integration
Please let me know what the source of the data is.
Thanks

How do I handle rows that were deleted from the source using SSIS Slowly Changing Dimension

I am trying to implement an ETL process for our Type 1 slowly changing dimension tables in a SQL 2014 database. The load needs to happen across servers, and I would prefer not to use linked servers.
I have been looking for ways to do this in SSIS and found the Slowly Changing Dimension wizard, which works fine except that it seems to only allow inserting new rows or updating rows where there is a match on the business key. I haven't found anywhere it lets me handle a record that exists in the dimension table but was deleted from the source; I would like to make sure these are deleted. Am I missing something? Has anyone found a better way to handle this in SSIS?
I know that I could just dump everything into another table on the destination server and write a T-SQL merge, but it seems like there should be a simple way to do this in SSIS.
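For comparison, the staging-table merge mentioned above handles all three cases in one statement. A rough sketch, assuming the full source extract has been landed in a staging table on the destination server; the table and column names (dim.Customer, stg.Customer, CustomerBusinessKey) are placeholders.

    -- Type 1 dimension load from a staging copy of the source (placeholder names).
    -- The change-detection predicate is kept NULL-naive for brevity.
    MERGE dim.Customer AS d
    USING stg.Customer AS s
        ON d.CustomerBusinessKey = s.CustomerBusinessKey
    WHEN MATCHED AND (d.Name <> s.Name OR d.City <> s.City) THEN
        UPDATE SET d.Name = s.Name, d.City = s.City    -- Type 1: overwrite in place
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerBusinessKey, Name, City)
        VALUES (s.CustomerBusinessKey, s.Name, s.City)
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;                                        -- row no longer exists in the source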
First, I would avoid the SCD functionality in SSIS, as its performance tends to be terrible - I've actually been told to avoid it by MS certified trainers, as well as plenty of people with a lot of experience. It's OK-ish on very small dimensions, but quickly tends to become unmanageable. There's a blog post here from someone who thinks it's usable in some situations, but even they suggest using a staging table for updates.
If you want to do this in SSIS you could use a Lookup to find the rows that need to be deleted (find the rows in your destination which aren't in the source using the no match output), then an OLE DB Command to delete them. But I'd give some serious thought to simply moving the data over to a staging area and doing this in TSQL, because SSIS will do it row by agonising row. Similarly to the SCD tool - it might be OK on small amounts of data, but if you're dealing with larger amounts (or might be in future), it may well become unmanageable.
If you don't want to move all of the data over to a staging area, you could use SSIS to build up a table only holding the unique IDs of the rows that need deleting, then fire off an Execute SQL Task from the Control Flow to delete them all at once.
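A hedged sketch of that last option, reusing the placeholder names from above: the data flow's no-match output writes only the business keys into a small work table (here a hypothetical dbo.Customer_DeleteKeys), and the Execute SQL Task then removes the matching dimension rows in one set-based statement.

    -- Set-based delete of the keys staged by the data flow (placeholder names).
    DELETE d
    FROM dim.Customer AS d
    JOIN dbo.Customer_DeleteKeys AS k
        ON k.CustomerBusinessKey = d.CustomerBusinessKey;

    -- Clear the work table ready for the next run.
    TRUNCATE TABLE dbo.Customer_DeleteKeys;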

KISS way to copy a list of tables between SQL instances?

My main application has nightly SSIS jobs that move some (not all) of the data from the production servers to various test and development servers.
These jobs do nothing fancy. They delete the destination table and repopulate it from the source. There are many dozens of tables and 4 or 5 servers in the mix, with plenty of foreign keys, but everything is SQL to SQL and there is no merging or lookups.
Using SSIS to do this has proven painfully brittle. When a new application release changes the schema, jobs begin failing more than half the time. Why? The updates are done by the developers, the packages have been tweaked and changed dozens of times by different developers, and the SSIS changes often happen during crunch time of the development cycle.
It has occurred to me that SSIS may or may not be the right tool for this. SSIS is pretty bloated for simple table copies. (Column Mapping, etc.) Is there a better way? Ideally, it would simply take as input (preferably from a central source):
- an unordered list of tables (nothing but exact-match column naming supported)
- a source server
- a destination server
It would then simply:
- Begin execution on a schedule at the designated server.
- Sort the list referentially (to not violate FK constraints).
- Delete/truncate all the destination tables in referential order.
- Copy tables from source to destination in referential order.
- Report back success or failure.
The only challenging things on the list (I think) would be a good low-maintenance way to schedule the jobs and for the jobs to report back success or failure.
[Note: I'm not looking for technical details. I am simply looking for a lower maintenance, less brittle way to make these data moves happen. I'll post my initial idea below as a possible solution, but I'll be quite happy if there is a simpler solution out there.]
Instead of truncating tables and copying data, you could DROP the tables and use the Transfer SQL Server Objects Task to copy them, which would include DDL changes.
Caveat being that you would have to handle foreign keys accordingly.
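If you stay with the truncate-and-reload style instead, the "sort the list referentially" step can be derived from the catalog views rather than maintained by hand. A rough sketch, assuming there are no circular foreign-key chains (those would need manual handling):

    -- Assign each table a load level: tables that reference nothing load first.
    -- Reverse the order for deletes/truncates.
    WITH ordered AS
    (
        SELECT t.object_id, lvl = 0
        FROM sys.tables AS t
        WHERE NOT EXISTS (SELECT 1
                          FROM sys.foreign_keys AS fk
                          WHERE fk.parent_object_id = t.object_id
                            AND fk.parent_object_id <> fk.referenced_object_id)
        UNION ALL
        -- A table loads one level after any table it references.
        SELECT fk.parent_object_id, o.lvl + 1
        FROM ordered AS o
        JOIN sys.foreign_keys AS fk
          ON fk.referenced_object_id = o.object_id
         AND fk.parent_object_id <> fk.referenced_object_id
    )
    SELECT TableName = OBJECT_SCHEMA_NAME(object_id) + '.' + OBJECT_NAME(object_id),
           LoadOrder = MAX(lvl)
    FROM ordered
    GROUP BY object_id
    ORDER BY LoadOrder;

Load in ascending order and truncate in descending order; a circular reference would hit the default MAXRECURSION limit and error out rather than loop forever.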

Speed Up Database Updates

There is a SQL Server 2000 database we have to update over the weekend.
Its size is almost 10 GB.
The updates range from schema changes and primary key updates to a few million records being updated, corrected, or inserted.
The weekend is hardly enough for the job.
We set up a dedicated server for the job,
turned the database to SINGLE_USER, and
made every optimization we could think of: dropping/recreating indexes, relations, etc.
Can you propose anything to speedup the process?
SQL Server 2000 is not negotiable (not my decision). Updates are run through a custom-made program and not BULK INSERT.
EDIT:
Schema updates are done with Query Analyzer T-SQL scripts (one script per version update).
Data updates are done by a C# .NET 3.5 app.
Data comes from a bunch of text files (with many problems) and is written to the local DB.
The computer is not connected to any network.
Although dropping excess indexes may help, you need to make sure that you keep those indexes that will enable your upgrade script to easily find those rows that it needs to update.
Otherwise, make sure you have plenty of memory in the server (although SQL Server 2000 Standard is limited to 2 GB), and if need be pre-grow your MDF and LDF files to cope with any growth.
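Pre-growing the files is a one-off command per file. A hypothetical example, where the logical file names and target sizes are placeholders (sp_helpfile shows the real names):

    -- Grow the data and log files up front so the upgrade isn't paused by autogrow.
    -- Logical names and sizes are placeholders; the new size must exceed the current one.
    ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Data, SIZE = 15000MB);
    ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Log,  SIZE = 4000MB);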
If possible, your custom program should be processing updates as sets instead of row by row.
EDIT:
Ideally, try and identify which operation is causing the poor performance. If it's the schema changes, it could be because you're making a column larger and causing a lot of page splits to occur. However, page splits can also happen when inserting and updating for the same reason - the row won't fit on the page anymore.
If your C# application is the bottleneck, could you run the changes first into a staging table (before your maintenance window), and then perform a single update onto the actual tables? A single update of 1 million rows will be more efficient than an application making 1 million update calls. Admittedly, if you need to do this this weekend, you might not have a lot of time to set this up.
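A minimal sketch of that staging idea, assuming the corrected values have already been loaded into a staging table keyed the same way as the target; all table and column names here are placeholders, and MERGE isn't available on SQL Server 2000, hence the separate UPDATE and INSERT:

    -- One set-based pass instead of a million single-row calls (placeholder names).
    UPDATE t
    SET    t.Amount      = s.Amount,
           t.Description = s.Description
    FROM   dbo.TargetTable AS t
    JOIN   dbo.Staging_TargetTable AS s
        ON s.TargetId = t.TargetId;

    -- Rows that don't exist yet get inserted.
    INSERT INTO dbo.TargetTable (TargetId, Amount, Description)
    SELECT s.TargetId, s.Amount, s.Description
    FROM   dbo.Staging_TargetTable AS s
    WHERE  NOT EXISTS (SELECT 1 FROM dbo.TargetTable AS t WHERE t.TargetId = s.TargetId);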
What exactly does this "custom made program" look like? i.e. how is it talking to the data? Minimising the amount of network IO (from a db server to an app) would be a good start... typically this might mean doing a lot of work in TSQL, but even just running the app on the db server might help a bit...
If the app is re-writing large chunks of data, it might still be able to use bulk insert to submit the new table data. Either via command-line (bcp etc), or through code (SqlBulkCopy in .NET). This will typically be quicker than individual inserts etc.
But it really depends on this "custom made program".
