What should happen with a column after the primary key constraint was removed? - sql-server

I am talking about the normalization of a primary key. Let's say my primary key column is of type nvarchar, which violates the rules of normalization. After removing the primary key constraint and the identity specification from that column, I need to create a new column that will be the new primary key of the table.
My question is, what should happen with the previous primary key?
I got an answer along the lines of "the column should become a semantic key", but I can't understand it.

It's not unusual when designing a database schema to use a SURROGATE primary key. The idea is to give each record a unique and permanent identifier so it can be easily referenced by applications and foreign keys. This key has no meaning. Knowing the surrogate key gives you no information about the content of the record. The user of your application would never see this value.
On the other hand, your record may have a SEMANTIC primary key. This is a unique value that identifies the data in a way that makes sense to the user.
For example, let's say you have a table of Employees. The employer assigns each employee a unique Employee ID Number. Let's say you store this value as a string. To the user that value serves as the unique identifier that refers to that employee. Meanwhile, your table may have a numeric column that serves as the unique identifier for that record.
create table Employee ( EmployeeRecordID int identity(1,1) primary key,
EmployerAssignedID nvarchar(12),
EmployeeName nvarchar(60),
Salary money )
insert into Employee ( EmployerAssignedID, EmployeeName, Salary ) values
( '#ABC100', 'Fred', 25000.12 ),
( '#AZZ314', 'Mary', 37700.00 ),
( '#MAA719', 'Fran', 34444.04 ),
( '#MZA977', 'Mary', 36000.00 )
As each record is added, SQL Server generates a unique EmployeeRecordID for each record, starting with 1. This is the SURROGATE key. Within your database and within your application, you would use this value to reference the record.
But when your application is communicating with the users, you would use the EmployerAssignedID. This is the SEMANTIC primary key. It makes sense to your users to use this value to search for a particular employee.
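Note that nothing in the table above stops two employees from receiving the same EmployerAssignedID. A minimal sketch of enforcing the semantic key with a UNIQUE constraint follows; the table name EmployeeDemo and the constraint names are made up for illustration and are not part of the original answer.

```sql
-- Hypothetical demo table mirroring Employee, with the semantic key enforced.
CREATE TABLE EmployeeDemo (
    EmployeeRecordID   int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_EmployeeDemo PRIMARY KEY,               -- surrogate key
    EmployerAssignedID nvarchar(12) NOT NULL
        CONSTRAINT UQ_EmployeeDemo_EmployerAssignedID UNIQUE, -- semantic key
    EmployeeName       nvarchar(60) NOT NULL,
    Salary             money NULL
);

INSERT INTO EmployeeDemo (EmployerAssignedID, EmployeeName, Salary)
VALUES (N'#ABC100', N'Fred', 25000.12),
       (N'#AZZ314', N'Mary', 37700.00);

-- Users search by the semantic key.
SELECT EmployeeRecordID, EmployeeName
FROM EmployeeDemo
WHERE EmployerAssignedID = N'#AZZ314';

-- A duplicate semantic key is rejected by the UNIQUE constraint.
-- (This statement is expected to fail with a unique-constraint violation.)
INSERT INTO EmployeeDemo (EmployerAssignedID, EmployeeName, Salary)
VALUES (N'#ABC100', N'Frank', 30000.00);
```

Foreign keys in other tables would still reference EmployeeRecordID, so the employer can renumber its IDs without touching any referencing rows.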

A primary key is no more than a unique index whose key columns cannot be NULL. Like any index, it can be clustered or nonclustered.
Dropping a clustered index turns the table into a heap, which changes its structure and behaviour. Dropping a nonclustered index just deallocates its space and does not affect the table or its other indexes.
So after dropping the primary key you simply have one or more columns with unique values, and you can treat them as a semantic key until duplicate values are inserted.
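One way the whole change could look in T-SQL is sketched below. The table ProductDemo, its columns, and the constraint names are assumptions for illustration only. The final UNIQUE constraint is what keeps the old key column usable as a semantic key even after new rows arrive.

```sql
-- Hypothetical starting point: an nvarchar column is the primary key.
CREATE TABLE ProductDemo (
    ProductCode nvarchar(20) NOT NULL
        CONSTRAINT PK_ProductDemo PRIMARY KEY,
    ProductName nvarchar(60) NOT NULL
);

-- 1. Drop the primary key constraint from the old column.
ALTER TABLE ProductDemo DROP CONSTRAINT PK_ProductDemo;

-- 2. Add a surrogate identity column and make it the new primary key.
ALTER TABLE ProductDemo
    ADD ProductId int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_ProductDemo PRIMARY KEY;

-- 3. Keep the old column unique so it remains a valid semantic key.
ALTER TABLE ProductDemo
    ADD CONSTRAINT UQ_ProductDemo_ProductCode UNIQUE (ProductCode);
```

If other tables have foreign keys pointing at the old primary key, those constraints would have to be dropped before step 1 and re-created against the new key afterwards.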

Related

id auto_inc column, but no primary key?

I apologize for the bad title, but I wasn't sure how else to phrase it.
Let's imagine for a moment that I wanted to create a us_states table as follows:
create table us_states
(
id serial,
name varchar(256) not null constraint us_states_pk primary key,
code varchar(256) not null
);
What tangible benefits, if any, are there to having an auto incremental id column in a db_table if I don't plan on leveraging it as the primary key for said db_table?
There is zero value in adding an auto-incremented numerical column if it isn't covered by a primary key or unique constraint.
If you have a unique column like name in your case, there are only two reasons not to use it as the primary key:
Storing a number instead of the name in tables that reference this one will save space.
Modifying primary key columns is painful and should be avoided, so if the names change as part of normal operation, it makes sense to use a different column as the primary key.
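To illustrate those two reasons, here is a rough PostgreSQL sketch in which the serial column does earn its place: it is the primary key, the name stays unique, and a referencing table stores the small integer instead of the name. The table names us_states_demo and addresses_demo, the constraint names, and the two-character code are assumptions, not part of the question.

```sql
-- Hypothetical variant of us_states where id is actually used as the key.
create table us_states_demo
(
    id   serial constraint us_states_demo_pk primary key,
    name varchar(256) not null constraint us_states_demo_name_uq unique,
    code char(2) not null constraint us_states_demo_code_uq unique
);

-- A referencing table stores a 4-byte integer rather than the state name,
-- and is unaffected if a name is ever corrected.
create table addresses_demo
(
    id       serial primary key,
    state_id integer not null references us_states_demo (id),
    city     varchar(256) not null
);
```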

Foreign key to table A and B, where A already have a foreign key to B

Suppose there is a table called Accounts:
CREATE TABLE Accounts
(
[Id] int not null primary key identity(1,1),
[Username] varchar(20) not null unique,
[Password] varchar(20) not null
)
Then, there is another table called Characters. Each account can have N characters, so I can use a foreign key to link the characters to their account.
CREATE TABLE Characters
(
[AccountId] int not null foreign key references Accounts([Id]),
[Id] int not null primary key identity(1,1),
[Nickname] varchar(20) not null unique,
[Level] int not null default 0,
)
Each character can have multiple pieces of equipment (an inventory), so there is an Equipments table.
Since each piece of equipment is linked to a character, I should use a foreign key again, and here comes the problem.
My coworker and I were arguing about which foreign key to use.
Since each character has a unique Id, I told him that a foreign key to that Id would be enough, as follows:
CREATE TABLE Equipments
(
[CharId] int not null foreign key references Characters([Id]),
[ItemId] int not null
)
He told me that we must use a foreign key to the character id AND the account id, as follows:
CREATE TABLE Equipments
(
[AccountId] int not null foreign key references Accounts([Id]), /*is this necessary?*/
[CharId] int not null foreign key references Characters([Id]),
[ItemId] int not null
)
I'm no expert in SQL Server, and in my opinion the foreign key to the account id is completely unnecessary, but he keeps telling me that we must use it and that it will help performance because the more foreign keys you use, the better.
So, should I use foreign keys to both the account id and the character id, or is the character id good enough?
As you said, there is a one-to-many relationship between Account and Character (and hence, a character cannot belong to more than one account).
Similarly, as you described, each record in Equipments corresponds to exactly one record in Characters. The relation from Accounts to Equipments can therefore be inferred, so there is no need for an extra column in the Equipments table. Data integrity is also preserved by the two foreign keys already created (Characters.AccountId and Equipments.CharId), so leaving AccountId out of the Equipments table should not be a problem.
Regarding the performance argument, this is a case-by-case situation that depends on many other things (number of records, business logic, ...). An unnecessary foreign key can even hurt performance, since the server has to maintain that foreign key during normal operation. Also, I have found that if you do not have the key and later find out that you need it, it is easier to add one than to remove an existing one, especially when the key requires a whole new column (this last point is merely a personal opinion).
You should use it only if you plan to query equipment directly by account, which is faster than joining to the account via the character. Otherwise, no, you shouldn't use it.
You are correct, but for a more important reason.
If you include AccountId in the Equipments table, then you have a second relationship to the Accounts table, and nothing forces it to agree with the character's account. Perhaps this is allowed, but in all likelihood you intend Characters.AccountId to be the account id for a row in Equipments.
You would then get the appropriate account id by joining Equipments to the Characters table.
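A small sketch of that join is shown below. The tables carry a Demo suffix and are trimmed to the relevant columns so the script can run on its own; those names and the sample @AccountId value are assumptions for illustration.

```sql
-- Hypothetical cut-down versions of the three tables from the question.
CREATE TABLE AccountsDemo (
    Id       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Username varchar(20) NOT NULL UNIQUE
);
CREATE TABLE CharactersDemo (
    Id        int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    AccountId int NOT NULL FOREIGN KEY REFERENCES AccountsDemo (Id),
    Nickname  varchar(20) NOT NULL UNIQUE
);
CREATE TABLE EquipmentsDemo (
    CharId int NOT NULL FOREIGN KEY REFERENCES CharactersDemo (Id),
    ItemId int NOT NULL
);

-- All equipment for one account, resolved through the character,
-- with no AccountId column on the equipment table.
DECLARE @AccountId int = 1;  -- sample value
SELECT a.Username, c.Nickname, e.ItemId
FROM EquipmentsDemo AS e
JOIN CharactersDemo AS c ON c.Id = e.CharId
JOIN AccountsDemo   AS a ON a.Id = c.AccountId
WHERE a.Id = @AccountId;
```

If this lookup turns out to be slow, indexes on EquipmentsDemo.CharId and CharactersDemo.AccountId are the usual first thing to try before duplicating AccountId.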

Pledges table normalize

I have this table and I don't know how to normalize it.
The first table has name, address, landline no., mobile no., e-mail address, amount, and registration number: https://docs.google.com/file/d/0B99TeByt30n2dDVLVHV4dU1yRFE/edit
The second table will hold their monthly pledges: https://docs.google.com/file/d/0B99TeByt30n2TVh4c1dmLTFYOWs/edit
PersonTable
PersonId int primary key,
Name varchar(50),
Address varchar(50),
etc.
PledgeTable
PledgeId int primary key,
PersonId int foreign key references PersonTable(PersonId),
PledgeDate datetime,
PledgeAmount decimal(10,2)
Ok.
You have to re-engineer the whole idea.
I will give you a few concepts to understand what to do.
Keep in mind that we must NOT have duplicate data.
So a table can be named Customer.
Customer can have a customer id column, named c_id for instance.
c_id must be defined as integer UNIQUE Auto_Increment NOT NULL.
That means it is the attribute that gives each customer its uniqueness.
A customer might have the same name and surname as another customer, but never the same c_id.
Auto increment means that the first entry will automatically be numbered c_id = 1, the next 2, and so on.
Keep in mind that if you insert a record into the database using PHP, you pass null as the value for c_id. Auto increment then does the job.
So you have a customer table, and its first attribute will be defined as the primary key.
For example here is a small table:
CREATE TABLE customer
(c_id integer UNIQUE Auto_Increment NOT NULL,
c_name varchar(100),
c_surname varchar(100),
c_address varchar(100),
PRIMARY KEY (c_id)
);
You have to add all the attributes the customer has in the table.
That was just an example.
The PRIMARY KEY (c_id) line at the end sets c_id as the primary key that distinguishes each record as unique.
Then you have another table: a pledge table.
CREATE TABLE pledge
(pl_id integer UNIQUE Auto_Increment NOT NULL,
pl_date date,
pl_price double(9,2),
pl_c_id integer,
PRIMARY KEY (pl_id),
FOREIGN KEY (pl_c_id) REFERENCES customer (c_id));
What is new here?
Line:
pl_c_id integer,
and line:
FOREIGN KEY (pl_c_id) REFERENCES customer (c_id)
What happens here is that you create a column that will contain an existing c_id!
That way, you make a reference to a customer by using his/her UNIQUE c_id.
It is defined as integer so that it matches the key.
pl_c_id MUST be the same type as the primary key of the other table. Ok?
Also, SQL will know which key we refer to by reading the line:
FOREIGN KEY (pl_c_id) REFERENCES customer (c_id)
In plain English that means:
pl_c_id can only be filled with values that already exist in the primary key of another table, named "customer", whose primary key is the column named "c_id".
Got it?
Now you have the flexibility to use ANY date.
That is more normalized than what you have.
Usually, if you reach 3NF (Third Normal Form), you will be OK.
Oh! Not to mention that you can Google "3NF" or "Third Normal Form" and get nice results for your reference.
;)
Cheers.
Edit: Made this reply more simple to understand.
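As a rough usage sketch in the same MySQL dialect, and assuming the customer and pledge tables above have already been created, monthly pledges become ordinary rows that can be totalled per customer. The names, dates, and amounts are made-up sample data.

```sql
-- One customer (c_id is filled in by Auto_Increment).
INSERT INTO customer (c_name, c_surname, c_address)
VALUES ('Ana', 'Cruz', '1 Example Street');

-- Remember the generated key so the pledges can reference it.
SET @ana_id = LAST_INSERT_ID();

-- One row per pledge, any date, any number of pledges.
INSERT INTO pledge (pl_date, pl_price, pl_c_id)
VALUES ('2013-01-15', 500.00, @ana_id),
       ('2013-02-15', 500.00, @ana_id);

-- Total pledged per customer per month.
SELECT c.c_id,
       c.c_name,
       c.c_surname,
       DATE_FORMAT(p.pl_date, '%Y-%m') AS pledge_month,
       SUM(p.pl_price)                 AS total_pledged
FROM customer AS c
JOIN pledge   AS p ON p.pl_c_id = c.c_id
GROUP BY c.c_id, c.c_name, c.c_surname, DATE_FORMAT(p.pl_date, '%Y-%m')
ORDER BY c.c_id, pledge_month;
```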

Incorrect value for UNIQUE_CONSTRAINT_NAME in REFERENTIAL_CONSTRAINTS

I am listing all FK constraints for a given table using INFORMATION_SCHEMA set of views with the following query:
SELECT X.UNIQUE_CONSTRAINT_NAME,
"C".*, "X".*
FROM "INFORMATION_SCHEMA"."KEY_COLUMN_USAGE" AS "C"
INNER JOIN "INFORMATION_SCHEMA"."REFERENTIAL_CONSTRAINTS" AS "X"
ON "C"."CONSTRAINT_NAME" = "X"."CONSTRAINT_NAME"
AND "C"."TABLE_NAME" = 'MY_TABLE'
AND "C"."TABLE_SCHEMA" = 'MY_SCHEMA'
Everything works perfectly well, but for one particular constraint the value of UNIQUE_CONSTRAINT_NAME column is wrong, and I need it in order to find additional information from the referenced Column. Basically, for most of the rows the UNIQUE_CONSTRAINT_NAME contains the name of the unique constraint (or PK) in the referenced table, but for one particular FK it is the name of some other unique constraint.
I dropped and re-created the FK - did not help.
My assumption is that the meta-data is somehow screwed. Is there a way to rebuild the meta data so that the INFORMATION_SCHEMA views would actually show the correct data?
edit-1: sample db structure
CREATE TABLE MY_PARENT_TABLE (
ID INTEGER,
NAME VARCHAR,
--//...
CONSTRAINT MY_PARENT_TABLE_PK PRIMARY KEY CLUSTERED (ID)
)
CREATE UNIQUE NONCLUSTERED INDEX MY_PARENT_TABLE_u_nci_ID_LongName ON MY_PARENT_TABLE (ID ASC) INCLUDE (SOME_OTHER_COLUMN)
CREATE TABLE MY_CHILD_TABLE (
ID INTEGER,
PID INTEGER,
NAME VARCHAR,
CONSTRAINT MY_CHILD_TABLE_PK PRIMARY KEY CLUSTERED (ID)
,CONSTRAINT MY_CHILD_TABLE__MY_PARENT_TABLE__FK
FOREIGN KEY (PID)
REFERENCES MY_PARENT_TABLE (ID)
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
I expect the UNIQUE_CONSTRAINT_NAME to be MY_PARENT_TABLE_PK, but what I am getting is MY_PARENT_TABLE_u_nci_ID_LongName.
Having looked at the structure, I see that there are in fact 2 UNIQUE constraints on that column: the PK and MY_PARENT_TABLE_u_nci_ID_LongName. So the real question should probably be: why does it take some other unique index and not the PK?
Since you have both a PK and a UNIQUE constraint on the same column, SQL Server picks one to use. I don't know if it picks the UNIQUE constraint because it is thinner (i.e. fewer columns involved) and might require fewer reads to confirm matches(?)
I don't see any way within SQL to enforce which one it chooses, other than ordering your scripts - create the table with the PK, create the other table and the FK, then create the UNIQUE constraint if you really need it - but is that really the case?
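To see which index a foreign key is actually bound to, the catalog views can be queried directly instead of INFORMATION_SCHEMA. The sketch below assumes SQL Server 2005 or later and uses the schema and table names from the question's sample structure as placeholders.

```sql
-- For each FK on the child table, show the index on the referenced table
-- that SQL Server bound the constraint to.
SELECT fk.name                              AS foreign_key_name,
       OBJECT_NAME(fk.parent_object_id)     AS child_table,
       OBJECT_NAME(fk.referenced_object_id) AS parent_table,
       i.name                               AS bound_index_name,
       i.is_primary_key,
       i.is_unique_constraint
FROM sys.foreign_keys AS fk
JOIN sys.indexes AS i
  ON i.object_id = fk.referenced_object_id
 AND i.index_id  = fk.key_index_id
WHERE fk.parent_object_id = OBJECT_ID(N'MY_SCHEMA.MY_CHILD_TABLE');
```

Running it before and after re-creating the constraints in a different order should show whether the binding changed.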

Are foreign keys indexed automatically in SQL Server?

Would the following SQL statement automatically create an index on Table1.Table1Column, or must one be explicitly created?
Database engine is SQL Server 2000
CREATE TABLE [Table1] (
. . .
CONSTRAINT [FK_Table1_Table2] FOREIGN KEY
(
[Table1Column]
) REFERENCES [Table2] (
[Table2ID]
)
)
SQL Server will not automatically create an index on a foreign key. Also from MSDN:
A FOREIGN KEY constraint does not have to be linked only to a PRIMARY KEY constraint in another table; it can also be defined to reference the columns of a UNIQUE constraint in another table. A FOREIGN KEY constraint can contain null values; however, if any column of a composite FOREIGN KEY constraint contains null values, verification of all values that make up the FOREIGN KEY constraint is skipped. To make sure that all values of a composite FOREIGN KEY constraint are verified, specify NOT NULL on all the participating columns.
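The composite-key behaviour described in that passage can be seen with a small sketch. ParentDemo, ChildDemo, and the values used are hypothetical.

```sql
-- Hypothetical parent with a two-column key.
CREATE TABLE ParentDemo (
    A int NOT NULL,
    B int NOT NULL,
    CONSTRAINT PK_ParentDemo PRIMARY KEY (A, B)
);

-- Child columns are nullable, so the FK check can be skipped.
CREATE TABLE ChildDemo (
    A int NULL,
    B int NULL,
    CONSTRAINT FK_ChildDemo_ParentDemo
        FOREIGN KEY (A, B) REFERENCES ParentDemo (A, B)
);

INSERT INTO ParentDemo (A, B) VALUES (1, 1);

-- Accepted: one column is NULL, so the pair is not verified at all,
-- even though no parent row has A = 99.
INSERT INTO ChildDemo (A, B) VALUES (99, NULL);

-- Expected to fail: both columns are non-NULL and (99, 99) has no parent.
INSERT INTO ChildDemo (A, B) VALUES (99, 99);
```

Declaring both child columns NOT NULL removes the loophole, as the quoted text recommends.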
As I read Mike's question, he is asking whether the FK constraint will create an index on the FK column in the table the FK is in (Table1). The answer is no, and generally (for the purposes of the constraint) there is no need to do this. The column(s) defined as the "TARGET" of the constraint, on the other hand, must have a unique index in the referenced table, either a primary key or an alternate key (unique index), or the statement that creates the constraint will fail.
(EDIT: Added to explicitly deal with comment below -)
Specifically, when providing the data consistency that a foreign key constraint is there for, an index on the FK column can affect the performance of a DRI constraint only when a row or rows are deleted on the PK side. During an insert or update on the FK side, the processor knows the FK value and must check for the existence of a row in the referenced table on the PK side. There is already an index there. When deleting a row on the PK side, it must verify that there are no rows on the FK side. An index can be marginally helpful in this case. But this is not a common scenario.
Other than that, in certain types of queries where the query processor needs to find the records on the many side of a join that uses the foreign key column, join performance is increased when an index exists on that foreign key. But this condition is peculiar to the use of the FK column in a join query, not to the existence of the foreign key constraint. It doesn't matter whether the other side of the join is a PK or just some other arbitrary column. Also, if you need to filter or order the results of a query based on that FK column, an index will help. Again, this has nothing to do with the foreign key constraint on that column.
No, creating a foreign key on a column does not automatically create an index on that column. Failing to index a foreign key column will cause a table scan in each of the following situations:
Each time a record is deleted from the referenced (parent) table.
Each time the two tables are joined on the foreign key.
Each time the FK column is updated.
In this example schema:
CREATE TABLE MasterOrder (
MasterOrderID INT PRIMARY KEY)
CREATE TABLE OrderDetail(
OrderDetailID INT,
MasterOrderID INT FOREIGN KEY REFERENCES MasterOrder(MasterOrderID)
)
OrderDetail will be scanned each time a record is deleted in the MasterOrder table. The entire OrderDetail table will also be scanned each time you join MasterOrder and OrderDetail.
SELECT ..
FROM
MasterOrder ord
LEFT JOIN OrderDetail det
ON det.MasterOrderID = ord.MasterOrderID
WHERE ord.MasterOrderID = @MasterOrderID
In general not indexing a foreign key is much more the exception than the rule.
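For the MasterOrder and OrderDetail example, the supporting index could be created as sketched here. It assumes both tables already exist as defined above, and the index name is only a suggested convention.

```sql
-- Index the foreign key column so deletes on MasterOrder and joins on
-- MasterOrderID can seek into OrderDetail instead of scanning it.
CREATE NONCLUSTERED INDEX IX_OrderDetail_MasterOrderID
    ON OrderDetail (MasterOrderID);
```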
A case for not indexing a foreign key is where it would never be utilized. This would make the server's overhead of maintaining it unnecessary. Type tables may fall into this category from time to time, an example might be:
CREATE TABLE CarType (
CarTypeID INT PRIMARY KEY,
CarTypeName VARCHAR(25)
)
INSERT CarType .. VALUES(1,'SEDAN')
INSERT CarType .. VALUES(2,'COUPE')
INSERT CarType .. VALUES(3,'CONVERTIBLE')
CREATE TABLE CarInventory (
CarInventoryID INT,
CarTypeID INT FOREIGN KEY REFERENCES CarType(CarTypeID)
)
Assuming that CarType.CarTypeID is never going to be updated and that records are almost never deleted, the server overhead of maintaining an index on CarInventory.CarTypeID would be unnecessary if CarInventory is never searched by CarTypeID.
According to: https://learn.microsoft.com/en-us/sql/relational-databases/tables/primary-and-foreign-key-constraints?view=sql-server-ver16#indexes-on-foreign-key-constraints
Unlike primary key constraints, creating a foreign key constraint does not automatically create a corresponding index
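Since the indexes are not created for you, it can help to list the foreign key columns that currently have no index leading on them. The query below is a simplified sketch: it assumes SQL Server 2005 or later (the sys catalog views do not exist in SQL Server 2000) and treats each foreign key column on its own, so composite keys are only approximated.

```sql
-- Foreign key columns that are not the leading key column of any index.
SELECT fk.name                                              AS foreign_key_name,
       OBJECT_NAME(fkc.parent_object_id)                    AS table_name,
       COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS column_name
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
  ON fkc.constraint_object_id = fk.object_id
WHERE NOT EXISTS (
    SELECT 1
    FROM sys.index_columns AS ic
    WHERE ic.object_id   = fkc.parent_object_id
      AND ic.column_id   = fkc.parent_column_id
      AND ic.key_ordinal = 1   -- column leads the index key
)
ORDER BY table_name, foreign_key_name;
```

Each row returned is a candidate for an index, subject to the kind of exception described for type tables above.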

Resources