Copy data from one database to another using VB.NET - sql-server

I need to copy data from one database to another using a VB.NET program.
The target database is SQL Server; the source is a proprietary ODBC-compliant database.
I need to loop through a list of tables to copy. For each table, I read the data from the source table for a given modified date, delete the rows with that date from the target table, and insert the source records in their place. The databases have the same structure, i.e. table names and field names, but the data types may differ (although they are compatible, e.g. double in the source, float in the target). No primary keys exist.
Here's how I might do it:
First, execute a Delete command against the target.
I could then use a DataReader to obtain the data from the source, loop through the rows, and create an Insert command for each one, adding parameters to the command with the appropriate values and executing it. And wrap the whole thing in a transaction.
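Roughly, in code - a sketch of that approach (the connection strings, the table list and the ModifiedDate column name are placeholders for whatever you actually use):
Imports System.Data.Odbc
Imports System.Data.SqlClient

Module RowByRowCopy
    Sub CopyForDate(modified As Date, tables As String(), srcConnStr As String, tgtConnStr As String)
        Using src As New OdbcConnection(srcConnStr), tgt As New SqlConnection(tgtConnStr)
            src.Open()
            tgt.Open()
            Using tx As SqlTransaction = tgt.BeginTransaction()
                For Each tbl As String In tables  ' table names come from our own list
                    ' Delete the target rows for this modified date.
                    Using del As New SqlCommand("DELETE FROM [" & tbl & "] WHERE ModifiedDate = @d", tgt, tx)
                        del.Parameters.AddWithValue("@d", modified)
                        del.ExecuteNonQuery()
                    End Using
                    ' Read the matching source rows and insert them one at a time.
                    Using sel As New OdbcCommand("SELECT * FROM " & tbl & " WHERE ModifiedDate = ?", src)
                        sel.Parameters.AddWithValue("d", modified)
                        Using rdr As OdbcDataReader = sel.ExecuteReader()
                            ' Build one parameterised INSERT from the reader's schema.
                            Dim names(rdr.FieldCount - 1) As String
                            Dim marks(rdr.FieldCount - 1) As String
                            For i As Integer = 0 To rdr.FieldCount - 1
                                names(i) = "[" & rdr.GetName(i) & "]"
                                marks(i) = "@p" & i
                            Next
                            Dim insertSql As String = "INSERT INTO [" & tbl & "] (" & String.Join(", ", names) & ") VALUES (" & String.Join(", ", marks) & ")"
                            Using ins As New SqlCommand(insertSql, tgt, tx)
                                While rdr.Read()
                                    ins.Parameters.Clear()
                                    For i As Integer = 0 To rdr.FieldCount - 1
                                        ' AddWithValue lets SqlClient convert compatible types (double -> float etc.)
                                        ins.Parameters.AddWithValue("@p" & i, rdr.GetValue(i))
                                    Next
                                    ins.ExecuteNonQuery()
                                End While
                            End Using
                        End Using
                    End Using
                Next
                tx.Commit()
            End Using
        End Using
    End Sub
End Module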
I was just wondering if I am missing a trick here. Any suggestions?

I think you should use the right tool for the job, and I'm guessing that that is SSIS in this case, but I could be wrong and perhaps you have already explored that path.
Failing that, yes, a DataReader would do, depending on how much data you have. A DataTable might even be easier and faster to program (no need to worry about data types, since the adapter should take care of that).

The trick would be to use set-based operations rather than the "row at a time" approach we programmers were first taught :)
Here's some pseudocode:
INSERT INTO DestTable (column1, column2, ...)
SELECT column1, column2, ...
FROM SourceTable
WHERE ModifiedDate = @Modified
Perhaps your requirements are more complicated and need the row-by-row approach, but this is normally not the case. (Note that a set-based statement like this assumes SQL Server can reach the source tables, e.g. via a linked server over the ODBC driver.)
I'd opt to put this code in a job step and schedule it with SQL Server Agent. It could also be a stored procedure run from .NET.
Also, using SSIS for a db to db transfer is most likely overkill unless you are going to be using some of the special transformations in there.

Take a look at the SqlBulkCopy class. If you can get the source into a DataTable or read it with an IDataReader then it's eligible. It will also attempt to convert between compatible types. See Single Bulk Copy Operations for more details.
This would be more desirable than using INSERT statements for each row.
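A minimal sketch of that approach, reusing the delete-then-copy pattern from the question (connection strings, table and date column names are placeholders):
Imports System.Data.Odbc
Imports System.Data.SqlClient

Module BulkCopyExample
    Sub BulkCopyTableForDate(tbl As String, modified As Date, srcConnStr As String, tgtConnStr As String)
        Using src As New OdbcConnection(srcConnStr), tgt As New SqlConnection(tgtConnStr)
            src.Open()
            tgt.Open()
            Using tx As SqlTransaction = tgt.BeginTransaction()
                ' Clear the target rows for this modified date first.
                Using del As New SqlCommand("DELETE FROM [" & tbl & "] WHERE ModifiedDate = @d", tgt, tx)
                    del.Parameters.AddWithValue("@d", modified)
                    del.ExecuteNonQuery()
                End Using
                ' Stream the source rows straight into the target table.
                ' Columns are matched by ordinal, which suits identical schemas.
                Using sel As New OdbcCommand("SELECT * FROM " & tbl & " WHERE ModifiedDate = ?", src)
                    sel.Parameters.AddWithValue("d", modified)
                    Using rdr As OdbcDataReader = sel.ExecuteReader()
                        Using bulk As New SqlBulkCopy(tgt, SqlBulkCopyOptions.Default, tx)
                            bulk.DestinationTableName = tbl
                            bulk.WriteToServer(rdr)  ' accepts any IDataReader
                        End Using
                    End Using
                End Using
                tx.Commit()
            End Using
        End Using
    End Sub
End Module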

' Sanity check: make sure the SQL Server data directory is not
' read-only before attempting the copy.
Dim dataDir As System.IO.DirectoryInfo = _
    My.Computer.FileSystem.GetDirectoryInfo( _
        "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data")
If (dataDir.Attributes And System.IO.FileAttributes.ReadOnly) = System.IO.FileAttributes.ReadOnly Then
    MsgBox("Data directory is read-only!")
Else
    MsgBox("Data directory is not read-only")
End If
Check all the tables first.

Related

SSIS - importing identical data from multiple databases

I want to copy and merge data from tables with identical structure (in a number of different source databases) to a single table of similar structure in a destination database. From time to time I need to add or remove a source database.
This is currently achieved using a Data Flow Task containing an OLEDB source with a SQL query within which there is a UNION for each of the databases I am extracting from. There is quite a lot of SQL within each UNION so, if I need to add fields, I need to add the same additional SQL to each UNION. Similarly, when I add or remove a source database I need to add or remove a UNION.
I was hoping that, rather than use such a UNION with a lot of duplicated code, I could instead use a Foreach Loop Container that executes SQL contained in a variable, using parameters to substitute the database name and other database-dependent items in the SQL on each iteration. But I hit problems with that: I assume the Data Flow Task inside the loop could not interpret the incoming fields because of what is effectively dynamic SQL.
Any suggestions as to how I might best achieve this without duplicating a lot of SQL?
It sounds like you have your loop figured out for moving from database to database. As long as the table schemas are identical (other than names as noted) from database to database, this should work for you.
Inside the For Each Loop container, create either a Script Task or an Execute SQL Task, whichever you're more comfortable working with.
Use that task to dynamically generate the SQL of your OLE DB Source query, changing the Customer Code prefix for each iteration. Assign the SQL text to a variable, either directly in a Script Task, or by assigning the Result Set of your Execute SQL Task (the result set being the query text) to a variable.
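For example, in a VB.NET Script Task - a sketch where the variable names (User::DatabaseName, User::SourceQuery) and the column list are placeholders for whatever your package defines:
' ScriptMain in the Script Task. List User::DatabaseName under
' ReadOnlyVariables and User::SourceQuery under ReadWriteVariables.
Public Sub Main()
    Dim dbName As String = CStr(Dts.Variables("User::DatabaseName").Value)
    ' The SELECT list must be identical on every iteration so the
    ' Data Flow's column metadata stays stable.
    Dts.Variables("User::SourceQuery").Value = _
        "SELECT col1, col2, col3 FROM [" & dbName & "].dbo.SourceTable"
    ' ScriptResults is the enum the Script Task template generates.
    Dts.TaskResult = ScriptResults.Success
End Sub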
Inside your Data Flow Task, in the OLE DB Source, under Data Access Mode select "SQL Command from variable". Select the variable that you populated with your query in the last task.
You'll also need to handle changing the connection string between iterations, but, again, it sounds like you have a handle on that part already.

SSIS Insert Row and Get Resulting ID Using OLE DB Command

Having seen other questions with answers that don't totally address what I am after, I am wondering how in SSIS to use an OLE DB Command transformation to do an Insert and immediately get the resulting primary key for each row inserted as a new column, all within the same Data Flow Task. That sounds like it should be a common, built-in, fairly simple thing to ask for in SSIS, right?
So the obvious first choice for me would be to use an OLE DB Command where I do an INSERT and include an OUTPUT clause in my command:
INSERT INTO dbo.MyReleaseTable(releaseDate)
OUTPUT ?=Inserted.id
VALUES (?)
Only I can't figure out how to do this in an OLE DB Command (with an output) without it complaining. I've read about using stored procedures to do this, so am I required to use a stored procedure if I want to do this?
Let's say this won't work. I could use a Script Transformation and execute direct SQL in that, right? Well if that's what I must do, then the line between using custom code and SSIS block-components gets blurred and I am tempted to throw SSIS away and just do the whole ETL in code.
Then I hear talk about using an Execute SQL task. So now I can't even do 1 data flow within 1 data flow task? Am I getting that right? I'd like to keep 1 single data flow contained within 1 data flow task and not have to break my 1 flow out between separate tasks.
If it turns out that this seemingly simple data flow objective is not built into SSIS then I will consider dumping SSIS altogether. Talend has a free ETL offering, don't they?
Well, this can be done in SSIS inside a Data Flow, but with some tricks. You need to create a stored procedure with input and output parameters and call it from the Data Flow, as described here, fetching the result value.
Drawbacks of this approach:
You need to create a stored procedure.
Each row is processed by the SP, which causes an implicit transaction per row instead of batch processing. This can slow down your package.
A solution without the performance penalty is to do it in two Data Flows: the first inserts the values into a temp table, and the second runs a SQL MERGE command at the OLE DB Source and handles the OUTPUT rows as you wish. All of this runs inside a transaction, handled either by MSDTC or by your own code.
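And if you do end up dropping to code, as you mentioned, the same single-statement idea works from plain ADO.NET. A sketch using the question's own table (the id column name is an assumption):
Imports System.Data.SqlClient

Module InsertWithOutput
    ' Insert a row and get its generated key back in one round trip
    ' via an OUTPUT clause.
    Function InsertRelease(conn As SqlConnection, releaseDate As Date) As Integer
        Dim sql As String = "INSERT INTO dbo.MyReleaseTable (releaseDate) OUTPUT INSERTED.id VALUES (@d)"
        Using cmd As New SqlCommand(sql, conn)
            cmd.Parameters.AddWithValue("@d", releaseDate)
            Return CInt(cmd.ExecuteScalar())  ' the OUTPUT row's id
        End Using
    End Function
End Module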

Export large amounts of binary data from one SQL database and import it into another database of the same schema

I have one database with an image table that contains just over 37,000 records. Each record contains an image in the form of binary data. I need to get all of those 37,000 records into another database containing the same table and schema that has about 12,500 records. I need to insert these images into the database with an IF NOT EXISTS approach to make sure that there are no duplicates when I am done.
I tried exporting the data into Excel and formatting it into a script. (I have done this before with other tables.) The problem is that Excel does not support binary data.
I also tried the "generate scripts" wizard in SSMS which did not work because the .sql file was well over 18GB and my PC could not handle it.
Is there some other SQL tool to be able to do this? I have Googled for hours but to no avail. Thanks for your help!
I have used SQL Workbench/J for this.
You can either use WbExport and WbImport through text files (the binary data will be written to separate files, and the text file contains the file names).
Or you can use WbCopy to copy the data directly without intermediate files.
To achieve your "if not exists" approach you could use the update/insert mode, although that would change existing rows.
I don't think there is an "insert only if it does not exist" mode, but you should be able to achieve this by defining a unique index and ignoring errors (that wouldn't be really fast, but it should be OK for this small number of rows).
If the "exists" check is more complicated, you could copy the data into a staging table in the target database, and then use SQL to merge that into the real table.
Why don't you try the 'Export data' feature? This should work.
Right click on the source database, select 'Tasks' and then 'Export data'. Then follow the instructions. You can also save the settings and execute the task on a regular basis.
Also, the bcp.exe utility could work to read data from one database and insert into another.
However, I would recommend using the first method.
Update: in order to avoid duplicates you have to be able to compare the images. Unfortunately, you cannot compare image columns directly, but you can cast them to varbinary(max) for comparison.
So here's my advice:
1. Copy the table into the target database under the name tmp_images.
2. Insert only the images that are not already there, comparing on the varbinary cast:
INSERT INTO DB1.dbo.table_name
SELECT * FROM DB1.dbo.tmp_images t
WHERE CAST(t.column_name AS varbinary(max)) NOT IN
(
SELECT CAST(column_name AS varbinary(max)) FROM DB1.dbo.table_name
)

ADO - Can I edit results of a complex query with multiple join statements?

I'm working on a data conversion utility which can push data from one master database out to a number of different databases. The utility itself will have no knowledge of how data is kept in the destination (table structure), but I would like to support writing a SQL statement that returns data from the destination using a complex query with multiple join statements, as long as the data comes back in a standardized format (field names) that the utility can recognize in an ADO query.
What I would like to do is then modify the live data in this ADO query. However, since there are multiple join statements, I'm not sure if that's possible. I know that BDE (which I've never used) was very strict: you had to return all fields (*) and such. ADO, I know, is more flexible, but I don't know quite how flexible in this case.
Is it supposed to be possible to modify data in a TADOQuery in this manner, when the results include fields from different tables? And even if so, suppose I want to append a new record to the end (TADOQuery.Append). Would it append to two different tables?
The actual primary table I'm selecting from has a complementary table joined on the same primary key field; one is a "Small" table (brief info) and the other is a "Detail" table (more info for each record in the Small table). So a typical statement would include something like this:
select ts.record_uid, ts.SomeField, td.SomeOtherField from table_small ts
join table_detail td on td.record_uid = ts.record_uid
There are also a number of other joins to records in other tables, but I'm not worried about appending to those ones. I'm only worried about appending to the "Small" and "Detail" tables - at the same time.
Is such a thing possible in an ADO Query? I'm willing to tweak and modify the SQL statement in any way necessary to make this possible. I have a bad feeling though that it's not possible.
Compatibility:
SQL Server 2000 through 2008 R2
Delphi XE2
Editing fields which have no influence on the joins is usually no problem.
Appending is. You can limit the append to one of the tables with:
procedure TForm1.ADSBeforePost(DataSet: TDataSet);
begin
  inherited;
  // Tell ADO which base table should receive the inserted row.
  TCustomADODataSet(DataSet).Properties['Unique Table'].Value := 'table_small';
end;
but without a Requery you won't get much further.
The better way would be to set the values in a procedure (e.g. in BeforePost), then Requery and Abort.
If your view were persistent, you would be able to use INSTEAD OF triggers.
Jerry,
I encountered the same problem with Firebird, and from experience I can tell you that it can be done (up to a certain complexity) by using cached updates. A very good resource is this one - http://podgoretsky.com/ftp/Docs/Delphi/D5/dg/11_cache.html. This article has the answers to all your questions.
I have abandoned the original idea of live ADO query updates, as it became more complex than I could wrap my head around. The scope of the data push project has changed, so this is no longer an issue for me, but it is still an interesting subject to know about.
The new structure of the application consists of attaching multiple "Field Links" on various fields from the original set of data. Each of these links references the original field name and a SQL Statement which is to be executed when that field is being imported. Multiple field links can be on one single field, therefore can execute multiple statements, placing the value in various tables, etc. The end goal was an app which I can easily and repeatedly export a common dataset from an original source to any outside source with different data structures, without having to recompile the app.
However, the concept of cached updates did not appeal to me, simply because of the issue pointed out in the link in RBA's answer: the data can be changed in the database in the meantime. So I will instead integrate my own method of customizable data pushes.

dynamic insertion with xml or stored procedure

I have 20 record groups that I need to batch-insert over one connection, so there are two candidate solutions (XML or a stored procedure). This operation is executed frequently, so I need fast performance and minimal overhead.
1) I think XML performs more slowly, but it lets us freely specify how many records to insert in a batch by producing the appropriate XML. Since I don't know the values of each field in a record, there may be characters that would malform the XML, such as " or field tags appearing inside values; how should I prevent that?
2) A stored procedure is faster, but I need to define all the input parameters, which is a tedious task, and if I need to increase or decrease the number of records inserted in a batch, I have to change the SP.
So which solution is better in my environment, given these constraints?
XML is likely the better choice of the two; however, there are other options.
If you're using SQL Server 2008 you can use table-valued parameters instead (sketched below).
Starting with .NET 2.0 you have the option of using SqlBulkCopy.
If you're using Oracle you can pass a user-defined type, but I'm not sure which versions of ODP and Oracle that works with.
Note these are all .NET options. I don't know whether they will work for you; it would probably help if you included the database, version and client technology you're using.
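For illustration, here is what the table-valued parameter route looks like from VB.NET; the server-side type and procedure names (dbo.RecordList, dbo.InsertRecords) and the columns are hypothetical:
Imports System.Data
Imports System.Data.SqlClient

Module TvpExample
    ' Assumes these already exist on the server (names are hypothetical):
    '   CREATE TYPE dbo.RecordList AS TABLE (Id INT, Name NVARCHAR(100))
    '   CREATE PROCEDURE dbo.InsertRecords (@rows dbo.RecordList READONLY) AS ...
    Sub InsertBatch(conn As SqlConnection, rows As DataTable)
        Using cmd As New SqlCommand("dbo.InsertRecords", conn)
            cmd.CommandType = CommandType.StoredProcedure
            Dim p As SqlParameter = cmd.Parameters.AddWithValue("@rows", rows)
            p.SqlDbType = SqlDbType.Structured
            p.TypeName = "dbo.RecordList"  ' must match the server-side type
            cmd.ExecuteNonQuery()          ' one round trip for any batch size
        End Using
    End Sub

    Sub Demo(conn As SqlConnection)
        ' The batch can grow or shrink without touching the procedure,
        ' and values are never spliced into the SQL text.
        Dim tbl As New DataTable()
        tbl.Columns.Add("Id", GetType(Integer))
        tbl.Columns.Add("Name", GetType(String))
        tbl.Rows.Add(1, "first")
        tbl.Rows.Add(2, "second")
        InsertBatch(conn, tbl)
    End Sub
End Module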
