I want to insert multiple records (~1000) using C# with SQL Server 2000 as the database. Before inserting, how can I check whether the record I'm inserting already exists, so that duplicates are skipped and the next record is inserted? The records come from a structured Excel file; I load them into generic collections, then iterate through each item and perform the insert like this:
// Insert records into the database
private void insertRecords() {
    try {
        // Open the connection once, not on every iteration
        sConnection.Open();
        // Iterate through all records and perform an insert for each
        for (int i = 0; i < Names.Count; i++) {
            // Clear parameters left over from the previous iteration
            sCommand.Parameters.Clear();
            sCommand.Parameters.AddWithValue("@name", Names[i]);
            sCommand.Parameters.AddWithValue("@person", ContactPeople[i]);
            sCommand.Parameters.AddWithValue("@number", Phones[i]);
            sCommand.Parameters.AddWithValue("@address", Addresses[i]);
            sCommand.ExecuteNonQuery();
        }
    } catch (SqlException) {
        throw; // rethrow without resetting the stack trace
    } finally {
        sConnection.Close();
    }
}
This code uses a stored procedure to insert the records, but how can I check for an existing record before inserting?
Inside your stored procedure, you can have a check something like this (guessing table and column names, since you didn't specify):
IF EXISTS (SELECT * FROM dbo.YourTable WHERE Name = @Name)
    RETURN
-- here, after the check, do the INSERT
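Put together, the whole procedure might look like this (a sketch only; the table, column, and parameter names are guesses based on your C# code):

CREATE PROCEDURE dbo.InsertRecord
    @Name varchar(100),
    @Person varchar(100),
    @Number varchar(50),
    @Address varchar(200)
AS
BEGIN
    -- skip the insert if a row with this name already exists
    IF EXISTS (SELECT * FROM dbo.YourTable WHERE Name = @Name)
        RETURN

    INSERT INTO dbo.YourTable (Name, ContactPerson, Phone, Address)
    VALUES (@Name, @Person, @Number, @Address)
END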
You might also want to create a UNIQUE INDEX on your Name column to make sure no two rows with the same value exist:
CREATE UNIQUE NONCLUSTERED INDEX UIX_Name
ON dbo.YourTable(Name)
The easiest way would probably be to have an inner try block inside your loop. Catch any DB errors and re-throw them if they are not a duplicate-record error; if it is a duplicate-record error, don't do anything (eat the exception). On SQL Server, a duplicate key surfaces as SqlException number 2601 (duplicate key in a unique index) or 2627 (violation of a PRIMARY KEY or UNIQUE constraint), so those are the error numbers to swallow.
Within the stored procedure, before the row is added to the database, first check whether the row is present in the table. If it is present, UPDATE it; otherwise INSERT it. SQL Server 2008 also has the MERGE statement, which essentially mashes update and insert together.
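Inside the procedure, that upsert check might look like this (a sketch, reusing the guessed names from above):

IF EXISTS (SELECT * FROM dbo.YourTable WHERE Name = @Name)
    UPDATE dbo.YourTable
    SET ContactPerson = @Person, Phone = @Number, Address = @Address
    WHERE Name = @Name
ELSE
    INSERT INTO dbo.YourTable (Name, ContactPerson, Phone, Address)
    VALUES (@Name, @Person, @Number, @Address)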
Performance-wise, RBAR (row-by-agonizing-row) is pretty inefficient. If speed is an issue, you'd want to look into the various "insert a lot of rows at once" processes: BULK INSERT, the bcp utility, and SSIS packages. You still have the either/or issue, but at least it'd perform better.
Edit:
Bulk inserting data into an empty table is easy. Bulk inserting new data in a non-empty table is easy. Bulk inserting data into a table where some of the data (as, presumably, defined by the primary key) is already present is tricky. Alas, the specific steps get detailed quickly and are very dependent upon your system, code, data structures, etc. etc.
The general steps to follow (sketched after the list) are:
- Create a temporary table
- Load the data into the temporary table
- Compare the contents of the temporary table with those of the target table
- Where they match (old data), UPDATE
- Where they don't match (new data), INSERT
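A minimal sketch of those steps, assuming a hypothetical Id key and Name column and a placeholder file path:

-- create and load a staging table
CREATE TABLE #Staging (Id int PRIMARY KEY, Name varchar(100))

BULK INSERT #Staging
FROM 'C:\import\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

-- where they match (old data), UPDATE
UPDATE t
SET t.Name = s.Name
FROM dbo.TargetTable t
INNER JOIN #Staging s ON s.Id = t.Id

-- where they don't match (new data), INSERT
INSERT INTO dbo.TargetTable (Id, Name)
SELECT s.Id, s.Name
FROM #Staging s
WHERE NOT EXISTS (SELECT * FROM dbo.TargetTable t WHERE t.Id = s.Id)

DROP TABLE #Staging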
I did a quick search on SO for other posts that covered this, and stumbled across something I'd never thought of. Try it; not only would it work, it's elegant.
Does your table have a primary key? If so, you should be able to check that the key value to be inserted is not already in the table.
Related
We have a SQL Server 2008 database with a table containing more than 1.4 billion records. Due to adjustments of the coordinate system, we have to expand the datatype of the coordinate column from decimal(18, 2) to decimal(18, 3).
We've tried multiple things, but everything resulted in an exception (transaction log is full) after about 14 hours of execution.
These are the things we tried:
Alter Table
ALTER TABLE Adress
ALTER COLUMN Coordinate decimal(18, 3) NULL
Designer
Uncheck Tools > Options > Designer > Prevent saving changes that require table re-creation
Open Designer
Change datatype of column to decimal(18, 3)
Right-click > Generate Change Script...
What this script does is create a new table with the new data type, copy the old data into the new table, drop the old table, and rename the new table.
Unfortunately both attempts result in a transaction log full exception after 14 hours of execution.
I thought that changing the data type via ALTER TABLE ... ALTER COLUMN ... only changes the metadata and should finish in a matter of (milli)seconds?
Do you know of any other method I could try?
Why do my attempts (especially #1) need that much time?
Thanks in advance
Well, the main issue seems to be the large amount of data in the table. Both of your attempts look fine, but with this much data they will inevitably take time.
Each time you alter a column's data type, SQL Server has to convert the existing data to the target type. Running that conversion over a huge table is what causes the long execution.
Moreover, I wonder whether you have any triggers on the table?
Finally, I would suggest the following steps. Give them a try at least:
- Remove any primary keys/indexes/constraints pointing to the old column, and disable any triggers (if there are any).
- Introduce a new nullable column with the new data type (even if it is meant to be NOT NULL).
- Run an update query on the table that sets the new column's value to the old column's value. You can do the update in chunks, in batches of 1,000/100,000 records, and you can apply conditions to the query for better results.
- Once the whole table is updated, change the new column from NULL to NOT NULL in the designer (if it is meant to be NOT NULL).
- Drop the old column. Run a SELECT query and verify your changes.
One last point: your transaction log is also full. It can be shrunk, but with some precautions; here is a very good example of how to reset your transaction log, and you should take a look at that too.
Hope this helps. :)
The solution is to do the updates in batches, easing the pressure on the log file.
Method 1:
a) Create a new table with the new definition.
b) Copy the data to the new table in batches.
c) Drop the old table.
d) Rename the new table.
Method 2:
a) Create a new column with the correct definition.
b) Update the new column with data from the old column in batches.
c) Drop the old column.
d) Rename the new column.
Method 3 (sketched below):
a) BCP the data into a file.
b) Truncate the table.
c) Alter the column.
d) Set the recovery model to bulk logged or simple.
e) BCP the data from the file into the table.
f) Set the recovery model back to full.
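A hedged sketch of Method 3 (database, server, and file names are placeholders; the bcp lines run from a command prompt):

-- a) export the data:  bcp MyDb.dbo.Adress out C:\data\adress.dat -n -S myserver -T
TRUNCATE TABLE dbo.Adress

-- c) the column change is now metadata-only, since the table is empty
ALTER TABLE dbo.Adress ALTER COLUMN Coordinate decimal(18, 3) NULL

-- d) minimize logging for the reload
ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED

-- e) reload:  bcp MyDb.dbo.Adress in C:\data\adress.dat -n -S myserver -T -b 100000

-- f) restore full logging
ALTER DATABASE MyDb SET RECOVERY FULL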
Add the new column as the last column; if you try to insert it before the last column, it could take a long time:
ALTER TABLE Adress ADD NewCoordinate decimal(18, 3) NULL

SELECT 1 -- seed @@ROWCOUNT so the loop starts
WHILE (@@ROWCOUNT > 0)
BEGIN
    UPDATE TOP (10000) Adress
    SET NewCoordinate = Coordinate
    -- a <> test never matches a NULL NewCoordinate, so test for NULL instead
    WHERE NewCoordinate IS NULL AND Coordinate IS NOT NULL
END
That is my suggestion:
Add a new column to your table, like below:
ALTER TABLE Adress ADD NewCoordinate DECIMAL(18, 3) NULL

WHILE (1 = 1)
BEGIN
    UPDATE TOP (1000) Adress
    SET NewCoordinate = Coordinate
    WHERE NewCoordinate IS NULL AND Coordinate IS NOT NULL
    IF (@@ROWCOUNT < 1000)
        BREAK
END

This keeps each transaction small. And finally, drop your Coordinate column.
I am trying to execute the query below, and during execution a constraint-violation exception is raised, which terminates the whole INSERT statement.
Suppose that of 10 records, 9 are clean: I want the insert to go through for those 9. Right now the statement is terminated and no rows are inserted at all.
I am using SQL Server 2012. I do not want to roll back the transaction, and there is no INSERT IGNORE command in SQL Server. I do not want to insert the data that contains errors; I just want to insert the clean data.
Query:
INSERT INTO rcmschargepostingmastertable
(clinicid,
clinicsiteid,
appointmentid,
patientid
)
SELECT clinicid,
clinicsiteid,
appointmentid,
patientid
FROM #tempautopostbulkchargepostingmastertable
It is not possible to do what you stated in your comment:
I want to ignore any SQL error and want to continue inserting the clean records
SQL Server doesn't have any pure SQL mechanism for doing this. Your only choice is to use one of the proposed work-arounds (SSIS, WHERE clause).
One work-around that hasn't been mentioned because it's the worst performance-wise, but at least it's one that you haven't shot down, is to replace your set-based insert with a cursor that does the inserts one row at a time.
Then you could put the single-row insert in a TRY block, and if it errors, the cursor will skip it and move on to the next one.
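A minimal sketch of that approach, using the table and column names from your query (the int types are assumptions):

DECLARE @clinicid int, @clinicsiteid int, @appointmentid int, @patientid int
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT clinicid, clinicsiteid, appointmentid, patientid
    FROM #tempautopostbulkchargepostingmastertable
OPEN cur
FETCH NEXT FROM cur INTO @clinicid, @clinicsiteid, @appointmentid, @patientid
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        INSERT INTO rcmschargepostingmastertable
            (clinicid, clinicsiteid, appointmentid, patientid)
        VALUES (@clinicid, @clinicsiteid, @appointmentid, @patientid)
    END TRY
    BEGIN CATCH
        -- constraint violation: log it and carry on with the next row
        PRINT ERROR_MESSAGE()
    END CATCH
    FETCH NEXT FROM cur INTO @clinicid, @clinicsiteid, @appointmentid, @patientid
END
CLOSE cur
DEALLOCATE cur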
I do not want to insert data which contains errors. I just want to insert clean data.
Then you need to identify and filter out the bad, constraint-violating records before inserting into the target table, which will make your life easier.
........
modifiedbyid
FROM #tempautopostbulkchargepostingmastertable
WHERE some_column <> 'bad data'
Since you are using SQL Server 2012, you can use TRY_CONVERT to identify and filter out the bad data.
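For example (a sketch; some_column and the int target type are stand-ins for whatever value is failing to convert):

SELECT modifiedbyid
FROM #tempautopostbulkchargepostingmastertable
WHERE TRY_CONVERT(int, some_column) IS NOT NULL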
I have a DB2 stored procedure and trigger which perform a set of insertions. Some of these rows might already be present in the table. I am trying to avoid checking for the row before every insert, as I believe that would add processing overhead.
I am trying to find the DB2 equivalent of the 'ignore_dup_row' index attribute provided by Sybase. If there is no DB2 equivalent, what other viable options are there to avoid a rollback when a duplicate insert is attempted?
Use a merge statement:
merge into t as x
using (
values (...) -- new row
) y (c1, c2, ..., cn)
on x.c1 = y.c1 -- assuming c1 is the key column
when not matched then
insert (c1,c2,...cn) values (y.c1,y.c2,...y.cn);
If you are inserting rows one by one, you can also include a CONTINUE HANDLER for SQLSTATE '23505' in your stored procedure.
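A minimal SQL PL sketch of that handler (the table and column names are made up):

CREATE PROCEDURE insert_ignoring_dups()
LANGUAGE SQL
BEGIN
  -- '23505' is the duplicate-key SQLSTATE; the empty block swallows the error
  DECLARE CONTINUE HANDLER FOR SQLSTATE '23505'
    BEGIN END;
  INSERT INTO t (c1, c2) VALUES (1, 'a');
  INSERT INTO t (c1, c2) VALUES (1, 'b'); -- if c1 is the key, this duplicate is silently skipped
END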
I have an Excel source going into an OLE DB destination. I'm inserting data into a view that has an INSTEAD OF trigger that handles all inserts. When I try to execute the package I receive this error:
"Failure inserting into the read-only column ColumnName"
What can I do to let SSIS know that this view is safe to insert into because there is an INSTEAD OF trigger that will handle the insert?
EDIT (Additional info):
I have a flat file that is being inserted into a normalized database. My initial problem was how to take a flat file and insert its data into multiple tables while keeping track of all the primary/foreign key relationships. My solution was to create a VIEW that mimics the structure of the flat file and then create an INSTEAD OF trigger on that view. In the INSTEAD OF trigger I handle the logic of maintaining all the relationships between the tables.
My view looks something like this.
CREATE VIEW ImportView
AS
SELECT
CONVERT(varchar(100), NULL) AS CustomerName,
CONVERT(varchar(100), NULL) AS Address1,
CONVERT(varchar(100), NULL) AS Address2,
CONVERT(varchar(100), NULL) AS City,
CONVERT(char(2), NULL) AS State,
CONVERT(varchar(250), NULL) AS ItemOrdered,
CONVERT(int, NULL) AS QuantityOrdered
...
I will never need to select from this view; I only use it to insert the data from this flat file I receive. I need some way to tell SQL Server that the fields aren't really read-only, because there is an INSTEAD OF trigger on this view.
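For reference, the trigger on the view looks something like this (a simplified, hypothetical sketch; the real one splits each row across several tables and wires up the foreign keys):

CREATE TRIGGER trg_ImportView_Insert ON ImportView
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO dbo.Customers (CustomerName, Address1, Address2, City, State)
    SELECT CustomerName, Address1, Address2, City, State
    FROM inserted
    -- further inserts into the order/item tables would follow here
END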
Additionally, you could just select the Keep Identity checkbox in the OLE DB Destination Editor, if your column is an IDENTITY column.
It's not an ideal solution, but I found a workaround to my problem. Since SSIS was complaining about inserting into my view, I created a table with the exact same structure as the view. Then, in an INSTEAD OF trigger on that table, I simply insert the incoming rows into the view. This adds one more step to the import process, but it is not a big deal.
Why is the column "read-only"? Could you post the schema for the view and the underlying table(s)? Is the column an IDENTITY? Is there a WITH CHECK OPTION on the view? Is it a derived (computed) column?
UPDATE:
I see now; it's a somewhat unusual application of a view. Maybe a stored procedure would have been the more appropriate choice: a stored procedure in the DB and an OLE DB Command in SSIS.
Your final solution with a table as a destination is actually faster, provided that you do not use a trigger but instead bulk-insert from the staging table into the "final" tables.
I'm still fairly new to T-SQL and SQL Server 2005. I need to import a column of integers from a table in database1 to an identical table (only missing the column I need) in database2. Both are SQL Server 2005 databases. I've tried the built-in import command in Server Management Studio, but it forces me to copy the entire table, which causes errors due to constraints and 'read-only' columns (whatever 'read-only' means in SQL Server 2005). I just want to grab a single column and copy it to a table.
There must be a simple way of doing this. Something like:
INSERT INTO database1.myTable columnINeed
SELECT columnINeed from database2.myTable
Inserting won't do it, since it will attempt to add new rows to the table. What it sounds like you're trying to do is populate a column on existing rows.
I'm not sure if the syntax is exactly right but, if I understood you, then this will do what you're after.
Create the column allowing nulls in database2.
Perform an update:
UPDATE d2
SET d2.colname = d1.colname
FROM database2.dbo.tablename AS d2
INNER JOIN database1.dbo.tablename AS d1
    ON d2.keycol = d1.keycol
There is a simple way very much like this as long as both databases are on the same server. The fully qualified name is dbname.owner.table - normally the owner is dbo and there is a shortcut for ".dbo." which is "..", so...
INSERT INTO Database1..MyTable
(ColumnList)
SELECT FieldsIWant
FROM Database2..MyTable
first create the column if it doesn't exist:
ALTER TABLE database2..targetTable
ADD targetColumn int null -- or whatever column definition is needed
and, if upgrading to SQL Server 2008 is an option, you can use the new MERGE statement (MERGE is not available in SQL Server 2005).
The MERGE statement has the advantage of being able to treat all situations in one statement like missing rows from source (can do inserts), missing rows from destination (can do deletes), matching rows (can do updates), and everything is done atomically in a single transaction. Example:
MERGE database2..targetTable AS t
USING (SELECT sourceColumn FROM database1..sourceTable) AS s
ON t.PrimaryKeyCol = s.PrimaryKeyCol -- or whatever the match should be based on
WHEN MATCHED THEN
    UPDATE SET t.targetColumn = s.sourceColumn
WHEN NOT MATCHED THEN
    INSERT (targetColumn, [other columns ...]) VALUES (s.sourceColumn, [other values ...]);
The MERGE statement was introduced to solve cases like yours and I recommend using it, it's much more powerful than solutions using multiple sql batch statements that basically accomplish the same thing MERGE does in one statement without the added complexity.
You could also use a cursor. Assuming you want to iterate over all the records in the first table and populate the second table with new rows, something like this would be the way to go:
DECLARE @FirstField nvarchar(100)
DECLARE ACursor CURSOR FOR
    SELECT FirstField FROM FirstTable
OPEN ACursor
FETCH NEXT FROM ACursor INTO @FirstField
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO SecondTable (SecondField) VALUES (@FirstField)
    FETCH NEXT FROM ACursor INTO @FirstField
END
CLOSE ACursor
DEALLOCATE ACursor
MERGE is only available in SQL Server 2008, NOT SQL Server 2005.
INSERT INTO Test2.dbo.MyTable (MyValue) SELECT MyValue FROM Test1.dbo.MyTable
This assumes a great deal. First, that the destination table is empty. Second, that the other columns are nullable. If not, you may need an update instead; to do that you will need a common key.