I have a VB.NET Windows application that pulls information from an MS Access database. The primary role of the application is to extract information from Excel files in various formats, standardize the file layout, and write the result out to CSV files. The application uses MS Access as the source for the keys and cross-reference files.
The Windows app uses typed datasets for much of the interaction between the user and the database. The standardization is done on each client's machine. The application is not... how can I say this... FAST :-).
Question: What is the best way to migrate the DB and application to SQL Server 2005? I am thinking it might be a good idea to rewrite the standardization code as SSIS packages.
What is the appropriate way to go about this migration?
The application pulls data from 250 Excel files each week and approximately 800 files each month, with an average of about 5,000 rows per file. There are 13 different file formats that are standardized and output into 3 different standard formats. The application takes between 25 and 40 minutes to run, depending on which data run we are talking about. 95% of the application is the standardization process. All the user does is pick a few parameters and start the run.
Microsoft provides a free tool to migrate an Access database to SQL Server. Once you've upgraded, you should be able to change your connection string to point at SQL Server.
You might want to run your app through a profiler to ensure that the Access DB is really what's slowing down your app, and not something else. It would be a shame to go through all the work of converting it over to SQL Server and have nothing to show for it.
The Access upsizing wizard can be used as a starting point.
You may be able to change the backend to be SQL Server with linked tables in Access without changing your front end. Then, you can modify the front end to go directly to SQL Server at will.
Unless you are hitting Access very heavily, I doubt that it is your bottleneck.
As far as reading the Excel files goes, SSIS can do it, but it might not be as reliable as the mechanism you are using in VB.NET right now, particularly if your VB.NET code has a lot of smart logic to deal with the degree of variation in the input files.
As far as writing data out to CSV, SSIS is fine, and I've found SSIS to be a pretty good performer.
If you could give more details about the workflow and how much the user interacts with the database versus the program pulling configuration, it might be easier to help with your architecture.
SSIS is very configurable on the fly (package configuring itself somewhat while it is running), and in many cases it could be programmed to read a variety of Excel files and convert them to CSV, but it's not as configurable on the fly as a hand-coded system. It is also possible to use the SSIS object model to generate packages programmatically and then execute them - this does not have some of the limitations of a package configuring itself, but the object model is pretty complex.
Making sure the scope is clear:
Use a .NET program to
drive an Access database front-end which enables you to
Extract data from a number of Excel spreadsheets,
Massage the data appropriately, and
Save the result in a CSV file.
What sorts of volumes are we talking about? How many clients, how many spreadsheets per client, and how many rows per spreadsheet (I think it would be 32,767 max for a single spreadsheet, right?)? And how much time are we talking about?
Seems like a lot of moving parts. And Access usually is a pretty good tool (with VBA) to do this sort of thing by itself.
It doesn't seem like enough volume to provide a major time sink for a well-designed Access database front-ending Excel to accomplish the whole process using VBA. If your alternative involves installing and operating SQL Server (in place of Access) on each client, I would be surprised if the admin and operational overhead doesn't increase.
So Weekly, per client:
250 files at 25 minutes
= 10 files / minute
or 6 seconds per file.
Monthly, per client:
800 files at 40 minutes
= 20 files/minute
or 3 seconds per file.
My expectation would be less than 1 sec. per file (5000 rows) round trip including:
a. Import or attach xls to mdb,
b. Transform via Access SQL
c. Export to csv
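For what it's worth, a minimal Access SQL sketch of steps (a) and (b): the workbook path, sheet name and column names are placeholders, and it assumes the Jet/ACE Excel ISAM is available.

-- Pull one sheet straight from the workbook into a local staging table,
-- renaming columns to the standard layout on the way in.
SELECT [Client ID]  AS ClientID,
       [Trade Date] AS TradeDate,
       [Amt]        AS Amount
INTO   tmpStandardized
FROM   [Excel 12.0 Xml;HDR=YES;Database=C:\Inbox\client_file.xlsx].[Sheet1$];

Step (c) could then be a DoCmd.TransferText export of tmpStandardized to CSV, with a short VBA loop driving this over every file in the inbox folder.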
The only explanation that comes to mind is that perhaps the .NET app is reading, transforming, and saving a row at a time. Is that possibly the case?
If you convert to SSIS, that probably obsoletes the .NET app, because SSIS will want to handle the ETL (and save) itself, so you would basically be rewriting the software. You may have better resources for SSIS than for Access, but it seems to me like overkill. Then again, .NET rather than VBA is maybe overkill too, and rewriting in VBA is also work. The least effort, I think, would be to see whether you can do the entire ETL (and save) using Access SQL for most of it, with VBA just for scripting, e.g. to iterate through the input files in a directory.
I think you could prototype the basic use cases and find out pretty quickly where the time is being spent now (as suggested in earlier responses). That would be worth knowing before committing redevelopment resources aimed at the wrong part of the problem. If you can expand a bit on those areas, I could probably direct you further. But Access is pretty well suited for this sort of thing, at (IMHO) a lower TCO than SQL Server + SSIS + .NET.
Not to mention that I'd be surprised if the csv files are the true end point, which may play a role in the decision. Isn't the Excel data really ending up further down the path?
Finally, how objectionable is a 25-40 minute process that presumably is unattended, can run over lunch break, and maybe basically works ok?
Notes:
Per week: 250 Excel files in 25 minutes = 0.1 min/file (6 sec/file)
Per month: 800 Excel files in 40 minutes = 0.05 min/file (3 sec/file)
Related
[This is a different approach to a similar question I asked a couple of days ago; I recognize the "cloning" approach was wrong, as it would involve too many changes and take the actual DB offline.]
EXPLANATION:
I have a set of system installations (web app + SQL Server database), where each database has two groups of objects: a small "core" group and a second, larger "instance_specific" group.
Those systems need to be rebuilt from another system from time to time (like an Excel financial scenario case). A rebuild works best if "core" can stay and everything else is deleted and re-imported from a source database (the source database can differ from one rebuild to the next).
SO FAR:
CREATE DATABASE Database1_copy AS COPY OF Database1;
is not an option, as it would mean rebuilding all of "core", and Azure doesn't handle a DB created that way from SSMS well: it doesn't re-establish the needed connections or keep the users and roles... (it's not a good idea to switch to a new DB).
The same goes for restoring from a backup / Import Data-Tier Application...
Scripting doesn't seem like a good option either: it doesn't include SEQUENCES (probably other objects as well) and it weighs too much to be practical (see the notes at the end). How would I even load a 360 MB script in SSMS to execute it on the destination system?
SQLAzureMW v5.15.6 Release Binary for SQL Server 2014: I wasn't really able to try it, as it runs out of memory and throws an exception (with 16 GB of RAM). Are the databases too big, or is it buggy?
Solutions based on manual use of the backup SQL functions, such as:
https://www.mssqltips.com/sqlservertip/1243/auto-generate-sql-server-database-restore-scripts/
are not supported in Azure ...
To get an instance "clear" and ready to receive the backup I really like the idea of the next post, filtering objects out that belong to core:
Drop all the tables, stored procedures, triggers, constraints and all the dependencies in one sql statement
And given the amount of data, it would be great if the data flows went from DB to DB directly, rather than back through my local machine, and even better if the whole thing could be executed from the destination DB.
SO:
In my best dreams I imagine an SP (similar to the one from the Stack Overflow post above) that goes object by object, reconstructs it, loads the data into it, and even initializes the sequences to their last used values.
I am trying to copy TABLES (with constraints and indexes, AND DATA), SEQUENCES (with their actual values), VIEWS, SPs and FUNCTIONS, and I think that's all (I do not need to carry over USERS and ROLES).
I have also come across ApexSQL and RedBeltTools; I haven't tried them yet, but maybe they are the solution, even though I would prefer not to rely on 3rd-party software that runs locally.
Am I out of options?
Should I start digging into how to build my own SP-based migration tool?
(I am not really sure how/where to start ...)
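If you do go down that road, a very rough starting point (not a migration tool in itself) is to let the system catalog drive the object-by-object walk. This sketch assumes the "core" group lives in its own schema, which is only an assumption; adjust the filter to however "core" is actually identified in your databases:

-- List every user object that is NOT part of "core", grouped by type,
-- so a script can drop and recreate them in a sensible order.
SELECT o.type_desc,
       s.name AS schema_name,
       o.name AS object_name
FROM   sys.objects AS o
JOIN   sys.schemas AS s ON s.schema_id = o.schema_id
WHERE  o.is_ms_shipped = 0
  AND  s.name <> 'core'   -- placeholder: however "core" is marked
ORDER  BY o.type_desc, s.name, o.name;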
Just some numbers of an actual source database:
CHECK_CONSTRAINT 12
DEFAULT_CONSTRAINT 259
FOREIGN_KEY_CONSTRAINT 145
PRIMARY_KEY_CONSTRAINT 162
SEQUENCE_OBJECT 7
SERVICE_QUEUE 3
SQL_INLINE_TABLE_VALUED_FUNCTION 1
SQL_SCALAR_FUNCTION 27
SQL_STORED_PROCEDURE 765
SQL_TABLE_VALUED_FUNCTION 6
UNIQUE_CONSTRAINT 54
USER_TABLE 268
VIEW 42
Other than users and roles, I think nothing else is missing here, is it?
Scripting the DB produces a 360 MB file (over 500k lines),
and it seems not to include SEQUENCES, at least
(SSMS scripting only lets me choose: Tables/Views/SPs/UserDefFunctions/Users/DatabaseRoles).
Since you mention you want to replicate 90% of an Azure SQL database into an existing one, I see this scenario as comparing two databases in schema and data and sending selected changes from an origin DB to a destination DB.
That said, I would recommend using SQL Server Data Tools. It has a complete suite of tools that could aid you in that scenario, and multiple ways to update the destination DB: inline, via script, via dacpac, via bacpac.
Also, since your question is so open, I don't know if it will solve your scenario, but take a look at it and its compare functionality.
I've been using Access to rapid-prototype a DB. Now I'd like to do a small group online test so I split the DB and placed the back-end on Azure SQL Server, then re-linked. It's incredibly slow and I've been researching solutions for days without positive results. My local environment is Win10, Office2016 64bit and internet connection is fast and stable.
I have tried different ODBC drivers, including the SQL Native Client v11.
I've disabled auto-tuning level on the NIC.
I've recreated all the queries from Access on the server.
I've made sure that Tracing in ODBC is off.
But I enabled tracing temporarily to see what was happening. If I opened the front-end, logged in (small User table), and did something on the first form (added 1 record with 3 sub-records... and really... nothing fancy or heavy at all, and this only took 1 minute), then closed the DB, I see that the tracing log file is 1.5 MB.
So I created an empty Access file and an ODBC link to only the User table (12 columns, 20 records), and then monitored the tracing log file again. Opening Access doesn't add anything to the log file, but opening this one linked table made the log file grow to 255 KB. Opening this table in Access took 5 seconds.
Access sent about 800 requests to the server just to open this one small table. If I paste all the User table data into a text file, it's only 2 KB. So why is it so slow?
Any ideas on this, and specifically other suggestions to get this working faster?
Kind regards,
Well, the reason using Azure is slower than running Access connected to a local instance of SQL Server is because, well, slow is slow!
I mean, if you going to travel 30 miles, you have a choice to walk, or to take a car.
So here is the question you need to know:
Why is walking slower than driving a car?
Answer: Because you are travelling at a slower speed!
So why is using Azure slower than using an instance of SQL Server running on your local computer or local network?
Answer:
Because the connection speed to Azure is about 100 times slower!
The idea that you are not going to take the DIFFERENCE in connection speed into account is the issue here. It is a disservice to the reading public, who may conclude that such a setup (Access front end on a PC to an Azure instance of SQL Server) is not viable.
So the first issue here is to make a note of your connection speed to the back end database.
A typical office local area network has a speed of 100 Mbits, and today most are 1 gig – even the el-cheapo routers you purchase at Best Buy are now rated at 1 gig (1000 Mbits).
However, your typical high-speed internet is rated at about 5 or 10 Mbits. So that is 100 times slower (actually 1000/5 = 200 times slower!!!).
That means if something NOW takes 3 seconds on your office network with Access and SQL Server, then over a WAN (over the internet) you need to multiply that time by the change in your connection speed (this is so simple – yet it seems to escape everyone!). So, if you are lucky, you might have a 5 Mbit speed rating for your internet. That means you get
1000 / 5 = 200
You now take that 200 and multiply the existing delay of, say, 3 seconds, and you get 600 seconds (that is 10 minutes, if you are wondering!). So you go from 3 seconds to 10 minutes!
This kind of comparison in speed would be like walking into a sports shop to purchase a rubber boat to cross the Atlantic. So not taking into account the change in internet speed and wondering why things are slow is the issue here.
You can most certainly use Access to Azure, but you have to realize two simple concepts.
A test over a connection that is 50-200 times slower than your LAN is a test that is going to run 50 to 200 times slower! The failure to mention and take into consideration the MASSIVE DIFFERENCE between the speed of your LAN connection and a WAN is the simple issue here.
Opening a form bound to a large table of data is going to cause performance issues.
I was sitting at the bus stop talking to a 90 year old granny lady. I asked her the following:
Have you ever used an instant teller?
She said, why yes, I use them all the time.
I then asked her: don't you think it would be bad to have the teller machine download all the people's accounts while you wait and THEN ask you for your account number?
The old lady stated, of course, that would be silly. I type in my account pin and the machine ONLY downloads my account information – this is practical and obvious.
In other words, that old lady realised that downloading a bunch of data BEFORE the user even types in or does anything is a waste of bandwidth.
So you never want to launch a form bound to a table and THEN ask the user what record to work on. Why have Access download large numbers of records into a form and THEN ask the user or allow the user to navigate to the required record?
Even when using Google, it does not download the whole internet into your web browser page and you then go ctrl+f to search the contents of that web page.
The same concepts should be applied to Access applications. A design that asks for what to work on and then launches a form bound to a table with a "where" clause will thus fix this issue.
So if you have a form (and even a sub form) that displays a customer invoice, you would FIRST ASK FOR the invoice number, and then simply launch that form using a where clause that restricts the form to the ONE invoice!
Keep in mind that you can STILL use that invoice form bound to a table of 1 million rows, and ONLY THE ONE record will be pulled down the network connection if one uses the where clause.
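On the wire, a form opened with such a where clause amounts to nothing more than this (table and column names made up for illustration):

-- One narrow request instead of the whole Invoices table; only the single
-- matching row (plus its few child rows) crosses the slow WAN link.
SELECT *
FROM   dbo.Invoices
WHERE  InvoiceNumber = 123456;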
So a typical internet connection has adequate speed to run a browser, and also has MORE than adequate bandwidth speed to pull down a few records. Access often gets a bad rap for poor performance, but that is ONLY DUE to Access developers IGNORING the obvious advice that downloading tons of things that you don’t yet need into a form will run slow.
So web based applications, or even desktop applications written in vb.net perform well with SQL Azure running in the cloud over that MUCH slower internet connection because those applications don’t launch forms bound to large datasets WITHOUT FIRST simply allowing the user to request what they need to see and view.
As for Access and using SharePoint? That setup can be VERY fast, and in fact MUCH faster than SQL Azure, MySQL or any traditional database system, because when you use SharePoint tables with Access, Access automatically syncs a local copy of the data. This setup means your application will continue to run WITHOUT ANY internet connection. The instant the connection is restored, the data sync can resume.
This means that if you have a table with 15,000 rows and run a report on that data, the report can run and launch in an instant with a SharePoint back end, since a local copy of the data exists in the front end at ALL TIMES! So this setup is VERY well suited to an offline mode or to cases where you have a poor and slow internet connection, since, as noted, you always have a local copy of the data – only when a record is changed does a sync occur, and that sync can occur independently of Access. So you change one record – and it starts syncing with SharePoint.
However, for larger data sets that have to be updated, SQL Server is far better, since you can execute a SQL update on 10,000 rows and ZERO network traffic and transfer of data need occur to update those 10,000 rows (a pass-through query), whereas with SharePoint the 10,000 rows WILL transfer over the network, since the local copy requires the rows to be updated. So that massive advantage of using SharePoint for the database back end does not exist for applications that have to update lots of rows or do lots of row-update types of data processing.
So the key concepts and take away here:
The high speed internet connection you have is often 10-200 times slower than your typical cheap office (local) network. So that means a 2 second operation will now take 10-200 times longer.
The Access application needs to be optimized to avoid things like loading too many records into a form. So building search forms etc. that FIRST ASK the user what they need to work on is a basic and simple requirement for all good developers, and that includes Access developers.
Access and SharePoint can be the BEST option, and such a setup allows the application to run EVEN WHEN there is no internet connection at all. If table sizes are below, say, 10,000 rows, then this setup can often be ideal. However, for applications that have to update lots of rows and for data-processing-heavy applications this setup is poor, since updates to any rows will cause data syncing to occur over the network. This setup is also the cheapest, since a single Office 365 account with SharePoint support for Access can be had for $6 per month, and that $6 account allows up to 500 free users, and those 500 users can even use their Gmail or non-Microsoft accounts for this setup. And Access applications that do fit within the bounds of SharePoint tables tend to need far fewer changes and less optimizing than using SQL Server over the internet.
With SQL Server, use of views, pass-through queries and in some cases writing stored procedures allows updates and code to run WITHOUT using ANY bandwidth. So one can send a single update query to the server that updates 10,000 rows of data – the only network cost will be the "tiny" amount of bandwidth to send that SQL statement.
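For example, a pass-through query along these lines (placeholder table and column names) updates 10,000 rows while sending only the statement text over the connection:

-- The statement is the only thing that travels; the affected rows never
-- leave the server, so the bandwidth cost is effectively zero.
UPDATE dbo.Orders
SET    OrderStatus = 'Closed'
WHERE  OrderDate < '2016-01-01';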
So while bound forms can be used with SQL Azure running in the cloud, one needs to build the software like one does for the web, or in VB.NET, in which you FIRST ask the user what account or customer to work on and THEN launch the UI to display that given data.
So in Access, you build a search form that first asks the user which record they want, and only then opens the bound form filtered to that record.
So at the end of the day, it is important to ignore posts here that suggest Access to SQL in the cloud is not viable. Access, with proper designs, will work rather well over typical internet connections to SQL Server running on Azure.
In fact I have seen people use Access to SQL over a 56k modem!
One has to adopt sensible designs in which the data pulled for a given task is restricted – this is a hallmark of all good developers. The only issue is that Access does NOT enforce this approach, while most other developer tools don't let you hang yourself with things like forms bound to large tables! It's not that Access is slow; Access is slow when you make poor design decisions.
Access to SharePoint can be a real winner – especially for poor or spotty bandwidth; even when the connection is lost, the application will continue to run, and in 99% of cases it will run faster than the same application with a SQL back end. There is a BIG caveat here, since only certain types of applications work well with SharePoint tables. Explaining the why, how and when such applications are better is beyond a simple post here, but one simply needs to be aware that SharePoint can be an incredible solution – just not for all applications, and in some cases SQL Server can and will be the better choice. This SharePoint "better" choice can only be determined by a case-by-case evaluation of the given application.
The problem is simply that Azure SQL Database is not very fast running with small DTUs (Database Transaction Units) compared to, say, an in-house instance of SQL Server hosted on even a moderate modern server.
I've checked it out too, and it requires extremely careful design of queries and filtering – far beyond what you normally can get away with – to obtain acceptable overall speed. On the other hand, this is an instructive experience that brings focus to potential bottlenecks you otherwise wouldn't encounter before it might be too late.
OK, so after almost a week of trying to get this to work (Access front-end to SQL Server back-end on Azure), I've come to the conclusion that it's not a viable solution.
I've tried SQL Server, and I set up a SharePoint 2016 server on Azure, which also failed.
What has worked is using a product from Bullzip called MS Access to MySQL to convert the Access tables, then adding a MySQL DB on the server and importing the file generated by Bullzip. The only thing to note here is that Bullzip doesn't like the newer Access formats (it wants an MDB file), so in Access create a new, empty file, but make sure you set its file type to MDB, then import your tables across and run Bullzip.
It's now working a hell of a lot faster than the SQL Server, but I am getting some write conflicts in Access, so I just need to go through the code and do whatever I need to so I can avoid those messages.
Using Access as a front end to Azure SQL tables is the worst solution. But sometimes you have to do it. I have a client who is adamant that she wants to keep her Access database. When she hired her very first employee, it became clear she needed SQL tables behind the scenes.
This was a bit of a nightmare. However, after redesigning some terrible table structures, creating views and many procs, I've been able to do it. I use local tables in some cases, and refill by pulling from a stored proc and inserting into the local table. I use linked tables for basic data edits, and do explicit save records almost constantly.
I also have a first-load module that opens all forms, goes to the last record, back to the first record, and then hides the form until needed. The load limps along for about 3
My only remaining issue is now that Azure will close connections after idle time of (I think) 30 or more minutes -- or maybe it's when the laptop sleeps? That kills the app and it has to be closed and re-opened.
I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, w/400 jobs this qtr, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the temporary links. This works as expected (see the sketch just after this list).
Performance here in US is decent - 10-15 seconds for ~40K records. Our India teams are seeing >5-10 minutes for the same datasets. Their internet connection is decent, not great and a variable I cannot control.
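For context, the make-table step is essentially this (Access/Jet SQL against the temporarily linked table; the names are placeholders, not my real schema):

-- Create and fill a local Access table from the ODBC-linked remote table,
-- restricted to the current job.
SELECT *
INTO   LocalJobData
FROM   dbo_RemoteJobData
WHERE  JobID = 12345;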
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach, i.e., letting the server do all/most of the heavy lifting instead of Access?
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have Access 2007, 2010 and 2013 users.
Thanks!
It's not entirely clear but if the MSAccess database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN, there is too much latency.
When Access queries data, it doesn't open a stream: it fetches a block of it, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances, especially when you consider that communication between the US and India probably has around 200 ms latency, and you can't do much about that; it adds up very quickly when the communication protocol is chatty, all on top of a connection bandwidth that is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest is maybe to automatically perform this dump every day and making it available on an FTP server or an internal website ready for download.
You can also make it available on demand, maybe through an app running on a server and made available through RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens to requests from a small client installed on the local machines everywhere; it would produce the dump and send it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
Renaud and all, thanks for taking the time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (vs. a contiguous download) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock, where User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India also has a local copy of the same DB and grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial download of the local DB tables from the server (huge multi-year DB) for the current "job" - this should happen just once at the start of the effort (~1-week-long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamp-based push-pull of just the daily changes would be an overall plus. It seems to be, but the initial download hurdle remains.
What is the fastest method to fill a database table with 10 million rows? I'm asking about the technique, but also about any specific database engine that would allow for a way to do this as fast as possible. I'm not requiring this data to be indexed during this initial data-table population.
Using SQL to load a lot of data into a database will usually result in poor performance. In order to do things quickly, you need to go around the SQL engine. Most databases (including Firebird I think) have the ability to backup all the data into a text (or maybe XML) file and to restore the entire database from such a dump file. Since the restoration process doesn't need to be transaction aware and the data isn't represented as SQL, it is usually very quick.
I would write a script that generates a dump file by hand, and then use the database's restore utility to load the data.
After a bit of searching I found FBExport, that seems to be able to do exactly that - you'll just need to generate a CSV file and then use the FBExport tool to import that data into your database.
The fastest method is probably running an INSERT SQL statement with a SELECT FROM. I've generated test data to populate tables from other databases, and even the same database, a number of times. But it all depends on the nature and availability of your own data. In my case I had enough rows of collected data that a few select/insert routines with random row selection, applied half-cleverly against real data, yielded decent test data quickly. In some cases where table data was uniquely identifying, I used intermediate tables and frequency-distribution sorting to eliminate things like uncommon names (I eliminated instances where a count with GROUP BY was less than or equal to 2).
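A minimal sketch of that INSERT ... SELECT idea (SQL Server syntax; table and column names are invented) that multiplies existing rows to bulk up a test table:

-- Copy every source row 100 times into the target by cross joining against
-- a small derived "numbers" set; adjust TOP (100) to taste.
INSERT INTO dbo.TargetTable (CustomerName, Amount)
SELECT s.CustomerName, s.Amount
FROM   dbo.SourceTable AS s
CROSS JOIN (SELECT TOP (100) object_id FROM sys.all_objects) AS n;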
Also, Red Gate actually provides a utility to do just what you're asking. It's not free, and I think it's SQL Server-specific, but their tools are top-notch and well worth the cost. There's also a free trial period.
If you don't want to pay for their utility, you could conceivably build your own pretty quickly. What they do is not magic by any means. A decent developer should be able to knock out a similarly featured, though alpha/hard-coded, version of the app in a day or two...
You might be interested in the answers to this question. It looks at uploading a massive CSV file to a SQL Server (2005) database. For SQL Server, it appears that an SSIS package is the fastest way to bulk import data into a database.
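If the 10 million rows already exist in a flat file, the plain T-SQL version of that bulk path looks roughly like this (file path, table name and options are placeholders):

-- Bulk load from a CSV file; TABLOCK helps throughput when nothing else
-- needs the table during the load.
BULK INSERT dbo.TargetTable
FROM 'C:\data\bigfile.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);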
It entirely depends on your DB. For instance, Oracle has something called direct path load (http://download.oracle.com/docs/cd/B10501_01/server.920/a96652/ch09.htm), which effectively disables indexing, and if I understand correctly, builds the binary structures that will be written to disk on the -client- side rather than sending SQL over.
Combined with partitioning and rebuilding indexes per partition, we were able to load a 1 billion row (I kid you not) database in a relatively short order. 10 million rows is nothing.
Use MySQL or MS SQL and built-in functions to generate records inside the database engine, or generate a text file (in a CSV-like format) and then use the bulk copy functionality.
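As a rough illustration of generating records inside the engine (SQL Server 2012+ syntax; the target table is a placeholder), a recursive CTE can manufacture rows with no client round trips at all:

-- Generate 100,000 synthetic rows entirely server-side.
;WITH n AS (
    SELECT 1 AS i
    UNION ALL
    SELECT i + 1 FROM n WHERE i < 100000
)
INSERT INTO dbo.TargetTable (Id, Payload)
SELECT i, CONCAT('row ', i)
FROM   n
OPTION (MAXRECURSION 0);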
I need a solution to pump data from Lotus Notes to SQL Server. Data will be transferred in 2 modes:
Archive data transfer
Current data transfer
Availability of data in SQL is not critical; the data is used for reports. Reports could be created daily, weekly or monthly.
I am considering choosing one of these solutions: DECS or SSIS. Could you please give me some tips about the pros and cons of both technologies? If you suggest something else, it could also be taken into consideration.
DECS - Domino Enterprise Connection Services
SSIS - SQL Server Integration Services
I've personally used XML frequently to get data out of Lotus Notes in a way that can be read easily by other systems. I'd suggest you take a look and see if that fits your needs. You can create views that emit XML or use NotesAgents or Java Servlets, all of which can be accessed using HTTP.
SSIS is a terrific tool for complex ETL tasks. You can even write C# code if you need to. There are lots of pre-written data-cleaning components already out there for you to download if you want. It can pretty much do anything you need to do. It does, however, have a fairly steep learning curve. SSIS comes free with SQL Server, so that is a plus. A couple of things I really like about SSIS are the ability to log errors and the way it handles configuration, so that moving the package from the dev environment to QA and Prod is easy once you have set it up.
We have also set up a metadata database to record a lot of information about our imports, such as the start and stop time, when the file was received, the number of records processed, the types of errors, etc. This has really helped us in researching data issues and has helped us write some processes that automatically stop when a file exceeds the normal parameters by a set amount. This is handy if you normally receive a file with 2 million records and the file comes in one day with 1,000 records. Much better than deleting 2,000,000 potential customer records because you got a bad file. We also now have the ability to report on files that were received but not processed, or files that were expected but not received. This has tremendously improved our importing processes (we have hundreds of imports and exports in our system). If you are designing from scratch, you might want to take some time and think about what metadata you want to have and how it will help you over time.
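A rough sketch of the kind of import-audit table described above (the column names are illustrative, not our actual schema):

CREATE TABLE dbo.ImportLog (
    ImportLogID    int IDENTITY(1,1) PRIMARY KEY,
    PackageName    sysname        NOT NULL,
    SourceFile     nvarchar(260)  NULL,
    FileReceivedAt datetime       NULL,
    StartedAt      datetime       NOT NULL,
    FinishedAt     datetime       NULL,
    RowsProcessed  int            NULL,
    ErrorCount     int            NULL,
    Notes          nvarchar(1000) NULL
);

Each package writes a row at the start of a run and updates it at the end, which is what makes the out-of-range file checks and the received-but-not-processed reporting possible.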
Now depending on your situation at work, if there is a possibility that data will also be sent to the SQL Server database from sources other than Lotus Notes as well as the imports from Notes that you are developing for, I would suggest it might be worth your time to go ahead and start using SSIS as that is how the other imports are likely to be done. As a database person, I would prefer to have all the imports I support using the same technology.
I can't say anything about DECS as I have never used it.
Just a thought - but as Lotus Notes tends to behave a bit "differently" than relational databases (or anything else), you might be safer going with a tool that comes out of the Notes world, versus a tool from the SQL world.
(I have used DECS in the past (prior to Domino 8) and it has worked fine for pumping data out into a SQL Server database. I have not used SSIS).