SQL Server - Copy Data Between Tables

I need to copy a large number (~200,000) of records between two tables inside the same SQL Server 2000 database.
I can't change the original table to include the columns I would need, so copying is the only solution.
I made a stored procedure with an INSERT ... SELECT statement. It works, but sometimes the .NET form that triggers the stored procedure catches an exception with a "timeout expired" error.
Is there a more effective way to copy this many records around?
Any tips on how to check where the timeout occurred in the database?

INSERT INTO your_target_table (id, name)
SELECT id, name
FROM your_table
WHERE your_condition
I'd also suggest running the call on a separate thread so the form won't freeze. You can also increase the timeout; note that the connection string's Connect Timeout only covers opening the connection, so for a long-running query you raise the CommandTimeout on the command itself.
If you can't avoid the multiple inserts, you can try to split them into smaller batches, for instance sending only 50 queries at a time (a sketch of batching the INSERT ... SELECT itself follows).
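
A minimal T-SQL sketch of that batching idea, runnable on SQL Server 2000; the target-table name, the assumption that id is unique, and the 50,000-row batch size are placeholders for illustration, not from the original post:

SET ROWCOUNT 50000   -- copy at most 50,000 rows per pass

WHILE 1 = 1
BEGIN
    INSERT INTO your_target_table (id, name)
    SELECT s.id, s.name
    FROM your_table AS s
    WHERE your_condition
      AND NOT EXISTS (SELECT 1 FROM your_target_table AS t WHERE t.id = s.id)

    IF @@ROWCOUNT = 0
        BREAK
END

SET ROWCOUNT 0   -- back to normal behaviour

Each pass copies only rows not yet present in the target, so a timeout or failure mid-way can simply be retried.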

Are you wanting to create an application to copy data between tables or is this just a one-off solution? If you only need to do this once, you should create a script to execute on the database server itself to copy the data you need to transfer between tables.

Are you using a SqlCommand to execute the stored procedure?
If so, set the CommandTimeout:
myCmd.CommandTimeout = 360; //value is in seconds.

1. Compare the two databases with Red Gate Data Compare. Since the other table is empty, the script it generates will be all INSERTs; select the inserts for that table only.
2. Use Multi Script from Red Gate: add those scripts to Multi Script and execute them against that database. It will keep executing until complete, and then you can compare again to check that all the data came across correctly.
3. If you don't want to use Multi Script, create a command-line application to just insert the data.

Related

Can we use the ADF Lookup activity to perform an INSERT operation on a Snowflake table?

I have created a new dataset using the Snowflake connector and used it as the source dataset in a Lookup activity.
Then I am trying to INSERT the record into Snowflake using the following query.
INSERT INTO SAMPLE_TABLE VALUES ('TEST', 1, 1, CURRENT_TIMESTAMP, 'TEST') -- (all values are passed)
Result: the row is getting inserted into Snowflake, but my pipeline fails with the error below.
Failure happened on 'Source' side. ErrorCode=UserErrorOdbcInvalidQueryString,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The following ODBC Query is not valid: 'INSERT INTO SAMPLE_TABLE VALUES('TEST',1,1,CURRENT_TIMESTAMP,'TEST');'
Could you please share your advice or any lead to solve this problem.
Thanks.
Rajesh
Lookup, as the name suggests, is for searching and retrieving data, not for inserting. However, you can enclose your INSERT code in a procedure and execute it using the Lookup activity.
However, I strongly do not recommend such an action. Remember that when inserting data into Snowflake you create at least one micro-partition of about 16 MB; if you insert one row at a time, performance will be terrible and the data will take up a disproportionate amount of space. Remember, Snowflake is not a transactional (OLTP) database.
Instead, it's better to save all the records in an intermediate file and then import the entire file in one move.
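
A hedged sketch of that file-based approach in Snowflake SQL; the stage name, file name, and file-format options are assumptions, not from the thread:

COPY INTO SAMPLE_TABLE
FROM @my_stage/sample_rows.csv
FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"');

The idea is to land all the rows in a staged file first (for example with a Copy activity) and let Snowflake ingest the whole file in a single COPY INTO, rather than issuing row-by-row INSERTs.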
You can use the Lookup activity to perform operations other than SELECTs; it just HAS to have an output. I've gotten around it with a Postgres database, doing CREATE TABLEs, truncates, and one-off inserts, by just concatenating a
select current_date;
after the main query.
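
A minimal sketch of that pattern applied to the statement from the question. This is the Postgres-style workaround; whether the Snowflake connector accepts a multi-statement query like this is not confirmed in the thread, and the trailing SELECT exists only so the Lookup activity has a result set to return:

INSERT INTO SAMPLE_TABLE VALUES ('TEST', 1, 1, CURRENT_TIMESTAMP, 'TEST');
SELECT CURRENT_DATE;   -- dummy output for the Lookup activity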
Note: the Script activity will definitely be better for this; we are waiting on Postgres support in it, though.

How to update a table while using it at the same time?

I have a Local DB (I'm using SQL Server Express) named PCNAME\SQLEXPRESS. I need to load data from the main database at MAINDB\HYEAH, so I linked the mainDB and I was able to insert data from the main DB into local DB using a stored procedure.
The problem I have is that I can't figure out the correct way to do the following:
I'm constantly using the data imported from the mainDB; that data is in the Credits table, and I'm always querying, inserting, or updating records in it.
But every 10 minutes I have to reload the data from the mainDB into Credits, and I can't stop using the data while that happens. I need to find a way to keep using and manipulating this data while it is being reloaded from the mainDB.
I'm not an expert in DB or SQL transactions so I thought about this solution:
The first time I load the data from the mainDB, I'll load it directly into the Credits table. On subsequent loads, I'll load the data into a temporary table and, when the stored procedure finishes, replace Credits with the data from the temporary table. But I think this is a dead end: if I delete all the data from Credits in order to replace it with the temporary table's data, I can't keep using the data while that happens, so I'm stuck.
Is there a way to properly achieve this?
Thank you!
One option would be to use synonyms.
BEGIN TRY
    -- drop the synonym if it already exists (ignore the error if it doesn't)
    DROP SYNONYM working_table;
END TRY
BEGIN CATCH
END CATCH

CREATE SYNONYM working_table FOR import_table_a;
You can now do your selects and updates against working_table and they will go to import_table_a. When you need to reload the data (into import_table_b), you just drop the synonym and point it at the new version of the table, as sketched below.
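
A minimal sketch of the reload-and-swap step, assuming the application only ever references working_table and that the fresh data comes from the linked main DB (the database name in the four-part linked-server name is an assumption):

-- load the fresh copy into the table the synonym is NOT currently pointing at
TRUNCATE TABLE import_table_b;

INSERT INTO import_table_b
SELECT * FROM [MAINDB\HYEAH].MainDb.dbo.Credits;   -- linked-server source; database name is an assumption

-- repoint the synonym so readers immediately see the new data
BEGIN TRY
    DROP SYNONYM working_table;
END TRY
BEGIN CATCH
END CATCH
CREATE SYNONYM working_table FOR import_table_b;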
But do take on board the other comments that imply that you might be fixing the wrong problem :)

Getting a call to a job and a result of a complex select in a single procedure in Microsoft SQL

I have an SSIS job and a relatively complex select that use the same data. I have to make it so that my client doesn't have to call them separately, but can use one thing to both get the result of the select and run the job.
My original plan was to create a procedure, which will take necessary input, and then output a table variable with the select result.
However, after reading the Microsoft documentation, my understanding is that table variables are only really suited to small result sets (on the order of 100 rows), while I might want to select ~10,000 rows. And now I'm stumped. What is the best way to call a job and select data from one component?
I have permissions to create views, procedures, and I can edit the SSIS job. The user will provide me with 2 parameters.
Here is how I would suggest you handle this scenario, to take the complexity away from SSIS.
Create the SP that you wanted to, but instead of a table variable, push your output into a table. This table can be added on the fly (dynamically, using a CREATE TABLE script) or can exist on the DB permanently as a buffer.
Call this SP in your control flow.
In the Data flow task, select from this buffer table.
After completing the SSIS work, flush the buffer table, i.e. truncate the table.
Caveat: you may face problems in concurrency scenarios; to eliminate that, you should have a BatchID or BatchStartTimeStamp column which stores a unique value for each run.
You can pass the BatchID or BatchStartTimeStamp value in from the SSIS package; a sketch of this pattern follows.
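
A rough T-SQL sketch of the buffer-table variant with a BatchID; all table, column, and parameter names here are made up for illustration:

-- persistent buffer table (could also be created on the fly)
CREATE TABLE dbo.ComplexSelectBuffer
(
    BatchID   UNIQUEIDENTIFIER NOT NULL,
    SomeKey   INT              NOT NULL,
    SomeValue NVARCHAR(100)    NULL
)
GO

CREATE PROCEDURE dbo.usp_FillComplexSelectBuffer
    @Param1  INT,
    @Param2  INT,
    @BatchID UNIQUEIDENTIFIER
AS
BEGIN
    SET NOCOUNT ON

    INSERT INTO dbo.ComplexSelectBuffer (BatchID, SomeKey, SomeValue)
    SELECT @BatchID, t.SomeKey, t.SomeValue
    FROM dbo.SourceTable AS t        -- stand-in for the complex SELECT
    WHERE t.Filter1 = @Param1
      AND t.Filter2 = @Param2
END
GO

-- The SSIS data flow then reads its batch:
--   SELECT SomeKey, SomeValue FROM dbo.ComplexSelectBuffer WHERE BatchID = ?
-- and a final task flushes it:
--   DELETE FROM dbo.ComplexSelectBuffer WHERE BatchID = ?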

What is the fastest way of copying data from a DataWindow/DataStore to a SQL Server table using PowerBuilder?

We have a datastore (the PowerBuilder datawindow's twin sister) that contains over 40,000 rows, and it takes more than 30 minutes to insert them into a Microsoft SQL Server table.
Currently, I am using a script generator that generates the SQL table definition and an INSERT command for each row. At the end, the full script is sent to SQL Server for execution.
I have already found that the script-generation process consumes more than 97% of the whole task.
Could you please help me finding a more efficient way of copying my client's data to sql server table?
Edit1 (after NoazDad's comments):
Before answer, please bear in mind that:
Table structure is dynamic;
I am trying to avoid using the datastore.Update() method;
Not sure it would be faster, but you could save the data from the datastore to a tab-delimited file and then do a BULK INSERT via SQL. Something like:
BULK INSERT CSVTest
FROM 'c:\csvtest.txt'
WITH
(
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\n'
)
GO
You can try saving the datastore contents into a string variable via ds.object.datawindow.data syntax then save that to a file then execute the SQL.
The way I read this, you're saying that the table that the data is being inserted into doesn't even exist in the schema until the user presses "GO" and initiates the script? And then you create embedded SQL statements that create the table, and insert rows 1 by 1 in a loop?
That's... Well, let's just say I wouldn't do it this way.
Do you not have any idea what the schema will look like ahead of time? If you do, then paint the datastore against that table, and use ds_1.Update() to generate the INSERT statements. Use the datawindow for what it's good for.
If that's not possible, and you must use embedded SQL, then at least perform a COMMIT every 1000 rows or so. Otherwise, SQL Server keeps building up rollback information in the transaction log for the table, in case something goes wrong and the work has to be rolled back.
Other ideas...
Disable triggers on the updated table while it is being updated (if possible)
Use the PB Pipeline object; it has settings for commit. It might be faster, but not by much.
Best idea: do something on the server side. I'd try to create SQL statements for your 40K inserts and call a stored procedure, sending all 40K insert/update statements and letting the stored procedure handle the inserts/updates.
Create a dummy table with a few columns, one being a long text column; update it with a block of SQL statements like those mentioned in the last idea, and have a process that splits and executes the SQL statements.
Some variant of above but using bulk insert as mentioned by Matt. Bulk insert is the fastest way to insert many rows.
Maybe try something with autocommit so that you commit only at the end, or every 10k rows as mentioned by someone already.
PB has an async option in the transaction object (connection) maybe you could let the update go in the background and let the user continue. This doesn't work with all databases and may not work in your situation. I haven't had much luck using async option.
The reason your process is so slow is that PB does each update separately, so you are hitting the network and database constantly. There may be triggers on the updated table and those are getting hammered too. Slamming the statements in on the server eliminates network lag and is much faster. Using bulk load is faster still because it doesn't run triggers and eliminates a lot of the database-management overhead.
Expanding on the idea of sending SQL statements to a procedure, you can create the sql very easily by doing a dw_1.saveas( SQL! ) (syntax is not right) and send it to the server all at once. Let the server parse it and run the SQL.
Send something like this to the server via procedure, it should update pretty fast as it is only one statement:
INSERT INTO TABLE (col1, col2) VALUES ('a', 'b')|INSERT INTO TABLE (col1, col2) VALUES ('a', 'b')|INSERT INTO TABLE (col1, col2) VALUES ('a', 'b')
In procedure:
Parse the SQL statements and run them. Easy peasy.
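
A rough sketch of such a procedure, assuming the statements arrive as a single pipe-delimited string and that no value inside them contains a pipe; the procedure and parameter names are made up for illustration:

CREATE PROCEDURE dbo.usp_RunDelimitedSql
    @Statements NVARCHAR(MAX)   -- e.g. 'INSERT ...|INSERT ...|INSERT ...'
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @Pos INT
    DECLARE @Sql NVARCHAR(MAX)

    WHILE LEN(@Statements) > 0
    BEGIN
        SET @Pos = CHARINDEX('|', @Statements)

        IF @Pos = 0
        BEGIN
            SET @Sql = @Statements
            SET @Statements = ''
        END
        ELSE
        BEGIN
            SET @Sql = LEFT(@Statements, @Pos - 1)
            SET @Statements = SUBSTRING(@Statements, @Pos + 1, LEN(@Statements))
        END

        IF LEN(@Sql) > 0
            EXEC (@Sql)   -- run one statement at a time on the server
    END
END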
While Matt's answer is probably best, I have another option. (Options are good, right?)
I'm not sure why you're avoiding the datastore.Update() method. I'm assuming it's because the schema doesn't exist at the time of the update. If that's the only reason, it can still be used, thus eliminating 40,000 instances of string manipulation to generate valid SQL.
To do it, you would first create the table. Then you would use SyntaxFromSQL() (on the transaction object) to generate the syntax for a datastore that's bound to the table. It might take a couple of Modify() statements to make the datastore updatable. Then you'd move the data from your original datastore to the updatable, bound datastore (look at RowsMove() or dot notation). After that, an Update() call generates all of your SQL without the overhead of string parsing and looping.

After insert trigger - SQL Server 2008

I have data coming in from datastage that is being put into a table in our SQL Server 2008 database: stg_table_outside_data. The outside source puts the data into that table every morning. I want to move the data from stg_table_outside_data to table_outside_data, where I keep multiple days' worth of data.
I created a stored procedure that inserts the data from stg_table_outside_Data into table_outside_data and then truncates stg_table_outside_Data. The outside datastage process is outside of my control, so I have to do this all within SQL Server 2008. I had originally planned on using a simple AFTER INSERT trigger, but datastage commits after every 100,000 rows. The trigger would run after the first commit and cause a deadlock error for the datastage process.
Is there a way to set up an after insert to wait 30 minutes then make sure there wasn't a new commit within that time frame? Is there a better solution to my problem? The goal is to get the data out of the staging table and into the working table without duplications and then truncate the staging table for the next morning's load.
I appreciate your time and help.
One way you could do this is take advantage of the new MERGE statement in SQL Server 2008 (see the MSDN docs and this blog post) and just schedule that as a SQL job every 30 minutes or so.
The MERGE statement allows you to easily just define operations (INSERT, UPDATE, DELETE, or nothing at all) depending on whether the source data (your staging table) and the target data (your "real" table) match on some criteria, or not.
So in your case, it would be something like:
MERGE table_outside_data AS target
USING stg_table_outside_data AS source
    ON (target.ProductID = source.ProductID) -- whatever join makes sense for you
WHEN NOT MATCHED THEN
    INSERT VALUES(.......);
-- when the rows already match, do nothing: simply leave out the WHEN MATCHED clause
You shouldn't be using a trigger to do this, you should use a scheduled job.
Maybe build a procedure that moves all the data from stg_table_outside_Data to table_outside_data once a day, or use the job scheduler.
Do a row count in the trigger; if the count is less than 100,000, do nothing. Otherwise, run your process (see the sketch below).
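
A rough sketch of that idea, with caveats: the 100,000 threshold is borrowed from the commit size mentioned above and would really need to be the expected size of the full daily load, the procedure name is an assumption, and the trigger still runs inside the datastage transaction, so the deadlock concern from the question does not disappear:

CREATE TRIGGER trg_stg_table_outside_data_ai
ON stg_table_outside_Data
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON

    -- do nothing until the staging table holds the full load
    IF (SELECT COUNT(*) FROM stg_table_outside_Data) < 100000
        RETURN

    EXEC dbo.usp_MoveStagingToWorking   -- assumed name for the existing move-and-truncate procedure
END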
