Database Integrity on 3 Tables - sql-server

I have 3 tables
Project : ProjectID (primary key)
Bugs: ProjectID, BugID (primary key)
BugLogs: BugID, BugLogID (primary key)
There are:
Multiple bugs on a project
Multiple bug logs on the bugs
How would I efficiently insert a project that has bug(s), and then bug logs on those bugs, into these tables?
Thanks

Well, here you need to write individual INSERT statements; joins are only used when querying the data.
You can do the following if BugID is an identity column:
DECLARE @bugid bigint
INSERT INTO Bugs (ProjectID, other columns...)
VALUES (value1, values...)
SELECT @bugid = SCOPE_IDENTITY()
INSERT INTO BugLogs (BugID, other columns...)
VALUES (@bugid, ...)
Alternatively, you can use the OUTPUT clause to get the BugID; this works in all scenarios. Note that OUTPUT ... INTO requires a table or table variable as its target, not a scalar variable:
DECLARE @bugids TABLE (BugID bigint)
INSERT INTO Bugs (ProjectID, other columns...)
OUTPUT INSERTED.BugID INTO @bugids
VALUES (value1, values...)
INSERT INTO BugLogs (BugID, other columns...)
SELECT BugID, ... FROM @bugids

Joins have nothing to do with Inserts. Joins only come into play when you want to query the data.
If you have your declarative referential integrity in place, then you are going to have to insert records in the following order: Project, Bugs, BugLogs.
If you are working in Microsoft SQL Server and are using identity columns, then after you insert a row you can use the SCOPE_IDENTITY() function to retrieve the primary key value just assigned and use it to set your foreign keys.
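A minimal sketch of that insert order, assuming identity primary keys on all three tables and hypothetical column names (Name, Description, LogText):
-- insert in dependency order (Project, Bugs, BugLogs),
-- capturing each new identity value with SCOPE_IDENTITY()
DECLARE @projectId int, @bugId int
INSERT INTO Project (Name) VALUES ('My Project')
SELECT @projectId = SCOPE_IDENTITY()
INSERT INTO Bugs (ProjectID, Description) VALUES (@projectId, 'First bug')
SELECT @bugId = SCOPE_IDENTITY()
INSERT INTO BugLogs (BugID, LogText) VALUES (@bugId, 'Initial log entry')
Wrapping the three statements in a transaction keeps parent rows from being committed without their children if a later step fails.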

Related

Insert records from temp table where record is not already present fails

I am trying to insert from a temporary table into a regular one, but since the temp table contains values that duplicate the primary key of the target table, the insert fails with a primary key constraint violation. That is expected, so I am working around it by inserting only the rows whose primary key is not already present in the target table.
I tried both the EXISTS and NOT IN approaches, checked examples showcasing both, and confirmed both work in SQL Server 2014 in general, yet I am still getting the following error:
Violation of PRIMARY KEY constraint 'PK_dbo.InsuranceObjects'. Cannot insert duplicate key in object 'dbo.InsuranceObjects'. The duplicate key value is (3835fd7c-53b7-4127-b013-59323ea35375).
Here is the NOT IN variant I tried:
print 'insert into InsuranceObjects'
INSERT INTO $(destinDB).InsuranceObjects
(
Id, Value, DefInsuranceObjectId
)
SELECT Id, InsuranceObjectsValue, DefInsuranceObjectId
FROM #vehicle v
WHERE v.Id NOT IN (SELECT Id FROM $(destinDB).InsuranceObjects) -- prevent error when running the script multiple times over
GO
If not apparent:
Id is the primary key in question.
$(destinDB) is a command-line (SQLCMD scripting) variable, different from a T-SQL variable. It lets me define the target database and instance at the level of a single script, or even across multiple scripts. It's used in multiple variations throughout the code and has so far performed perfectly. The only downside is that you have to run in SQLCMD mode.
When creating all the temp tables, USE $(some database) is also used, so that's not an issue.
I must be missing something completely obvious, but it's driving me nuts that such a simple query fails. What is worse, when I run the SELECT without the INSERT part, it returns ALL the records from the temp table, despite my having confirmed there are duplicates that should be filtered out by the NOT IN in the WHERE clause.
I suspect the issue is that you have duplicate Id values within the temp table itself. NOT IN only filters out rows whose Id already exists in the target table; if #vehicle contains the same Id twice, both rows pass the filter (which is why the bare SELECT returns them all), yet the second one still violates the primary key on insert. Check the values there, as that would cause exactly what you are seeing.
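A quick way to confirm, as a sketch against the temp table from the question:
-- rows in #vehicle sharing an Id all survive the NOT IN filter
SELECT Id, COUNT(*) AS cnt
FROM #vehicle
GROUP BY Id
HAVING COUNT(*) > 1
If duplicates show up, deduplicate in the SELECT (for example with DISTINCT or ROW_NUMBER()) before inserting.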

Postgres INSERT INTO... SELECT violates foreign key constraint

I'm having a really, really strange issue with postgres. I'm trying to generate GUIDs for business objects in my database, and I'm using a new schema for this. I've done this with several business objects already; the code I'm using here has been tested and has worked in other scenarios.
Here's the definition for the new table:
CREATE TABLE guid.public_obj
(
guid uuid NOT NULL DEFAULT uuid_generate_v4(),
id integer NOT NULL,
CONSTRAINT obj_guid_pkey PRIMARY KEY (guid),
CONSTRAINT obj_id_fkey FOREIGN KEY (id)
REFERENCES obj (obj_id)
ON UPDATE CASCADE ON DELETE CASCADE
)
However when I try to backfill this using the following code, I get a SQL state 23503 claiming that I'm violating the foreign key constraint.
INSERT INTO guid.public_obj (guid, id)
SELECT uuid_generate_v4(), o.obj_id
FROM obj o;
ERROR: insert or update on table "public_obj" violates foreign key constraint "obj_id_fkey"
SQL state: 23503
Detail: Key (id)=(-2) is not present in table "obj".
However, if I do a SELECT on the source table, the value is definitely present:
SELECT uuid_generate_v4(), o.obj_id
FROM obj o
WHERE obj_id = -2;
"0f218286-5b55-4836-8d70-54cfb117d836";-2
I'm baffled as to why postgres might think I'm violating the fkey constraint when I'm pulling the value directly out of the corresponding table. The only constraint on obj_id in the source table definition is that it's the primary key. It's defined as a serial; the select returns it as an integer. Please help!
Okay, apparently the reason this is failing is because unbeknownst to me the table (which, I stress, does not contain many elements) is partitioned. If I do a SELECT COUNT(*) FROM obj; it returns 348, but if I do a SELECT COUNT(*) FROM ONLY obj; it returns 44. Thus, there are two problems: first, some of the data in the table has not been partitioned correctly (there exists unpartitioned data in the parent table), and second, the data I'm interested in is split out across multiple child tables and the fkey constraint on the parent table fails because the data isn't actually in the parent table. (As a note, this is not my architecture; I'm having to work with something that's been around for quite some time.)
The partitioning is by implicit type (there are three partitions, each of which contains rows relating to a specific subtype of obj) and I think the eventual solution is going to be creating GUID tables for each of the subtypes. I'm going to have to handle the stuff that's actually in the obj table probably by selecting it into a temp table, dropping the rows from the obj table, then reinserting them so that they can be partitioned properly.
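If you run into something similar, Postgres can show which physical table each row actually lives in; a short sketch using the system column tableoid:
-- tableoid::regclass identifies the child table (partition) holding each row
SELECT tableoid::regclass AS actual_table, obj_id
FROM obj
WHERE obj_id = -2;
A foreign key only checks rows physically present in the referenced table itself (what SELECT ... FROM ONLY obj sees), which is why the constraint fails even though a query over the whole hierarchy finds the value.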

Is there a way to update primary key Identity specification Increment 1 without dropping Foreign Keys?

I am trying to change a primary key Id column to an identity that increments by 1 on each entry, but the column is already referenced by other tables. Is there any way to make the primary key auto-increment without dropping the foreign keys from the other tables?
If the table isn't that large, generate a script to create an identical table, but change the schema it creates to:
CREATE TABLE MYTABLE_NEW (
PK INT PRIMARY KEY IDENTITY(1,1),
COL1 TYPEx,
COL2 TYPEx,
...
COLn TYPEx)
Then:
1. Set your database to single-user mode, or make sure no one is in the database or tables you're changing, or make the table you need to change read-only.
2. Import your data into MYTABLE_NEW from MYTABLE using SET IDENTITY_INSERT ON (see the sketch after this list).
3. Script your foreign key constraints and save them, in case you need to back out of your change later and/or re-implement them.
4. Drop all the constraints from MYTABLE.
5. Rename MYTABLE to MYTABLE_SAV.
6. Rename MYTABLE_NEW to MYTABLE.
7. Run the constraint scripts to re-implement the constraints on MYTABLE.
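A sketch of the import in step 2, with hypothetical column names:
-- copy rows into the new table, preserving the existing key
-- values in the new identity column
SET IDENTITY_INSERT MYTABLE_NEW ON;
INSERT INTO MYTABLE_NEW (PK, COL1, COL2)
SELECT PK, COL1, COL2 FROM MYTABLE;
SET IDENTITY_INSERT MYTABLE_NEW OFF;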
P.S.
You did ask if there was a way to avoid dropping the foreign key constraints. Here's something to try on your test system: on step 4, run
ALTER TABLE MYTABLE NOCHECK CONSTRAINT ALL
and on step 7, run ALTER TABLE MYTABLE CHECK CONSTRAINT ALL. I've not tried this myself -- it would be interesting to see if this actually works on renamed tables.
You can script all of this ahead of time on a test SQL Server, or even on a copy of the database staged on a production server, to make implementation day a no-brainer and to gauge your SLAs for any change control procedures at your company.
You can follow a similar methodology by deleting the primary key and re-adding it, but you'll need the same data inserted into the new column before you delete the old one. So with that approach you'll be deleting and inserting schema as well as inserting primary key data. I like to avoid touching a production table if at all possible, and having MYTABLE_SAV around in case "anything" unexpected occurs is a comfort to me personally, as I can tell management "the production data was not touched". But some tables are simply too large for this approach to be worthwhile, and tastes and methodologies differ widely from DBA to DBA.

Setting version column in append only table

We have a table that will store versions of records.
The columns are:
Id (Guid)
VersionNumber (int)
Title (nvarchar)
Description (nvarchar)
etc...
Saving an item will insert a new row into the table with the same Id and an incremented VersionNumber.
I am not sure how best to generate the sequential VersionNumber values. My initial thought is to:
SELECT @NewVersionNumber = MAX(VersionNumber) + 1
FROM VersionTable
WHERE Id = @ObjectId
And then use @NewVersionNumber in my insert statement.
If I use this method, do I need to set my transaction to serializable to avoid concurrency issues? I don't want to end up with duplicate VersionNumbers for the same Id.
Is there a better way to do this that doesn't make me use serializable transactions?
In order to avoid concurrency issues (or, in your specific case, duplicate inserts) you could create a compound key as the primary key for your table, consisting of the Id and VersionNumber columns. This would then enforce a unique constraint across the pair.
Your insert routine/logic can then be devised to handle, or rather CATCH, an insert error due to a duplicate key and simply re-issue the insert.
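A sketch of that constraint, using the columns from the question:
-- composite primary key: a duplicate (Id, VersionNumber) pair now fails the insert
ALTER TABLE VersionTable
ADD CONSTRAINT PK_VersionTable PRIMARY KEY (Id, VersionNumber);
A concurrent writer that computes the same next version number then gets a duplicate-key error (2627), which the insert logic can catch before retrying with a fresh MAX.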
It may also be worth mentioning that, unless you specifically need to use a GUID (e.g. because you are working with SQL Server replication or multiple data sources), you should consider an alternative data type such as BIGINT.
I had thought that the following single insert statement would avoid concurrency issues, but after Heinzi's excellent answer to my question here it turns out that this is not safe at all:
Insert Into VersionTable
(Id, VersionNumber, Title, Description, ...)
Select @ObjectId, max(VersionNumber) + 1, @Title, @Description
From VersionTable
Where Id = @ObjectId
I'm leaving it just for reference. Of course this would work with either table hints or a transaction isolation level of Serializable, but overall the best solution is to use a constraint.
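For reference, the hint-based variant of that single statement looks like the following; a sketch, not a recommendation over the constraint approach:
-- UPDLOCK + HOLDLOCK takes key-range locks on this Id, closing the window
-- between reading MAX(VersionNumber) and inserting the new row
Insert Into VersionTable (Id, VersionNumber, Title, Description)
Select @ObjectId, ISNULL(max(VersionNumber), 0) + 1, @Title, @Description
From VersionTable With (UPDLOCK, HOLDLOCK)
Where Id = @ObjectId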

Detailed error message for violation of Primary Key constraint in sql2008?

I'm inserting a large amount of rows into an empty table with a primary key constraint on one column.
If there is a duplicate key error, is there any way to find out the value of the key (or row) that caused the error?
Validating the data prior to the insert is sadly not something I can do right now.
Using SQL 2008.
Thanks!
Doing the count(*) / group by thing is something I'm trying to avoid; this is an insert of hundreds of millions of rows from hundreds of different DBs (some of which are on remote servers), and I don't have the time or space to do the insert twice.
The data is supposed to be unique from the providers, but unfortunately their validation doesn't seem to work correctly 100% of the time and I'm trying to at least see where it's failing so I can help them troubleshoot.
Thank you!
There's not a way of doing it that won't slow your process down, but here's one way that will make it easier. You can add an instead-of trigger on that table for inserts and updates. The trigger will check each record before inserting it and make sure it won't cause a primary key violation. You can even create a second table to catch violations, and have a different primary key (like an identity field) on that one, and the trigger will insert the rows into your error-catching table.
Here's an example of how the trigger can work:
CREATE TRIGGER mytrigger ON sometable
INSTEAD OF INSERT
AS BEGIN
INSERT INTO sometable SELECT * FROM inserted WHERE ISNUMERIC(somefield) = 1;
INSERT INTO sometableRejects SELECT * FROM inserted WHERE ISNUMERIC(somefield) = 0;
END
In that example, I'm checking a field to make sure it's numeric before I insert the data into the table. You'll need to modify that code to check for primary key violations instead - for example, you might join the INSERTED table to your own existing table and only insert rows where you don't find a match.
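A sketch of that primary-key variant, assuming a key column named pk; note the rejects insert runs first, so the main insert doesn't make its own freshly inserted rows look like duplicates:
CREATE TRIGGER mytrigger ON sometable
INSTEAD OF INSERT
AS BEGIN
-- rows whose key already exists go to the error-catching table
INSERT INTO sometableRejects
SELECT i.* FROM inserted i
WHERE EXISTS (SELECT 1 FROM sometable s WHERE s.pk = i.pk);
-- everything else is inserted normally
INSERT INTO sometable
SELECT i.* FROM inserted i
WHERE NOT EXISTS (SELECT 1 FROM sometable s WHERE s.pk = i.pk);
END
Duplicates within a single inserted batch would still collide and need separate handling (for example, ROW_NUMBER() over the key).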
The solution depends on how often this happens. If it's <10% of the time, then I would do the following:
Insert the data.
If an error occurs, fall back to Bravax's revised solution (remove the constraint, insert, find the duplicates, report and remove them, re-enable the constraint).
This way it only costs you on the few occasions an error occurs.
If this is happening more often, then I'd look at sending the boys over to see the providers :-)
Revised:
Since you don't want to insert twice, could you:
Drop the primary key constraint.
Insert all data into the table
Find any duplicates, and remove them
Then re-add the primary key constraint
Previous reply:
Insert the data into a duplicate of the table without the primary key constraint.
Then run a query on it to determine which rows have duplicate values for the primary key column.
select count(*), <Primary Key>
from table
group by <Primary Key>
having count(*) > 1
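Once found, the duplicates can be removed while keeping one row per key; a sketch using ROW_NUMBER(), with the same placeholders:
-- delete all but the first row for each duplicated key value
;with numbered as (
select <Primary Key>,
row_number() over (partition by <Primary Key> order by <Primary Key>) as rn
from table
)
delete from numbered where rn > 1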
Use SSIS to import the data and have it check for this as part of the data flow. That is the best way to handle it. SSIS can send the bad records to a table (which you can later send to the vendor to help them clean up their act) and process the good ones.
I can't believe that SSIS does not easily address this "reality", because, let's face it, oftentimes you need and want to be able to:
See if a record exists with a certain unique or primary key
If it does not, insert it
If it does, either ignore it or update it.
I don't understand how they would let a product out the door without this capability built in, in an easy-to-use manner. Like, say, setting an attribute of a component to automatically check this.
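For what it's worth, T-SQL itself offers exactly that see-if-exists/insert/update pattern via MERGE (SQL Server 2008 and later); a minimal sketch with hypothetical table and column names:
-- upsert: update rows whose key matches, insert the rest
MERGE INTO TargetTable AS t
USING SourceTable AS s
ON t.Id = s.Id
WHEN MATCHED THEN
UPDATE SET t.Value = s.Value
WHEN NOT MATCHED THEN
INSERT (Id, Value) VALUES (s.Id, s.Value);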
