I now have a requirement that a column has to be added to a table that holds a unique ID (guid). This ID is used to match records in different tables and databases but there will be NO FK constraints. Would it be better to store the guid as a varchar(32) or as a uniqueidentifier type?
There will be joins done using this column but not on a regular basis. This ID is NOT a PK. I'm asking in terms of storage and performance.
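If it's a GUID, store it as a uniqueidentifier. The native type occupies 16 bytes, whereas the same value stored as a 32-character varchar takes 32 bytes plus 2 bytes of length overhead, so uniqueidentifier is both smaller to store and cheaper to compare in joins.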
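A minimal sketch of what that could look like, assuming a hypothetical table dbo.CustomerRecord and a hypothetical column name ExternalId (neither comes from the question). The GO lines are batch separators understood by SSMS and sqlcmd, not T-SQL statements.

```sql
-- Hypothetical existing table; only the new GUID column matters here.
CREATE TABLE dbo.CustomerRecord (
    CustomerRecordId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name             varchar(100)      NOT NULL
);
GO

-- Add the GUID column. NOT NULL plus a DEFAULT fills existing rows with new GUIDs.
ALTER TABLE dbo.CustomerRecord
    ADD ExternalId uniqueidentifier NOT NULL
        CONSTRAINT DF_CustomerRecord_ExternalId DEFAULT NEWID();
GO

-- A nonclustered index supports the occasional join on this column (no FK involved).
CREATE NONCLUSTERED INDEX IX_CustomerRecord_ExternalId
    ON dbo.CustomerRecord (ExternalId);
GO

-- Compare the storage size of the two representations.
DECLARE @g uniqueidentifier = NEWID();
SELECT DATALENGTH(@g)                                          AS GuidBytes,    -- 16
       DATALENGTH(REPLACE(CONVERT(varchar(36), @g), '-', '')) AS VarcharBytes; -- 32
```

The varchar figure excludes the 2-byte length prefix that every variable-length column value carries.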
What is the best way to generate Id or Code column(primary key) in a table in SQL Server? My requirement is a big table which stores more than 1000000 rows.
So generally we create auto identity column of a table but is it the best way? For example, I have an employee_master table and there are three columns like
emp_id auto increment
Emp_code varchar(20)
and also default like Rowno of a table.
So in this scenario, it has 3 unique columns of a table. Is it the best way?
Or what is the best way to generate an id/code column for both master and transaction tables?
Also, is there any way (any function) to auto-generate a column value per financial year?
The following SQL statement defines the "ID" column to be an auto-increment primary key field in the "Persons" table. Note that AUTO_INCREMENT is MySQL syntax; SQL Server uses IDENTITY(seed, increment) instead:
CREATE TABLE Persons (
ID int IDENTITY(1,1) NOT NULL, -- starts at 1 and increments by 1
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (ID)
);
I think this will work.
I have read a few posts and articles about NEWID() in MS SQL. Before I decide whether to use this method, I would like to get some information. My single-page app has a few tables, and one of them should store a unique key for each customer. I'm wondering whether I should use NEWID(), and how I should store that ID in the table. I was looking over the data types and there is a uniqueidentifier type. A few articles mentioned potential performance problems, especially when joining over 100k records to other tables, which is a scenario I will have. If anyone can provide some answers or suggestions, that would be great. Thanks in advance!
Here is an example of my table:
Column Name    Data Type          Allow Nulls
hm_id          int                Unchecked     // auto-increment id
hm_studentID   uniqueidentifier   Unchecked     // primary key
hm_firstName   varchar(50)        Checked
hm_lastName    varchar(50)        Checked
hm_dob         datetime           Checked
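For illustration, the table above could be declared roughly as follows. This is only a sketch under assumptions: the table name dbo.hm_customer and the constraint names are hypothetical, and clustering on the int column while keeping the GUID as a nonclustered primary key is one possible layout, not something the question specifies.

```sql
-- Hypothetical table name; columns follow the layout listed above.
CREATE TABLE dbo.hm_customer (
    hm_id        int IDENTITY(1,1) NOT NULL,           -- auto-increment id
    hm_studentID uniqueidentifier  NOT NULL
        CONSTRAINT DF_hm_customer_studentID DEFAULT NEWID(),
    hm_firstName varchar(50) NULL,
    hm_lastName  varchar(50) NULL,
    hm_dob       datetime    NULL,
    -- The GUID stays the primary key, but rows are physically ordered by hm_id,
    -- so random GUID values do not decide where each new row is inserted.
    CONSTRAINT PK_hm_customer PRIMARY KEY NONCLUSTERED (hm_studentID),
    CONSTRAINT UQ_hm_customer_id UNIQUE CLUSTERED (hm_id)
);

-- The default supplies the GUID; OUTPUT returns it to the caller.
INSERT INTO dbo.hm_customer (hm_firstName, hm_lastName, hm_dob)
OUTPUT inserted.hm_id, inserted.hm_studentID
VALUES ('Ada', 'Lovelace', '1815-12-10');
```

NEWSEQUENTIALID() is an alternative default that produces increasing GUIDs; it can only be used in a DEFAULT constraint.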
I am using SQL Server 2012 & am creating a table that will have 8 columns, types below
datetime
varchar(12)
varchar(6)
varchar(100)
float
float
int
datetime
Once a day (normally) there will be an upload of approx 10,000 rows of data. Going forward it's possible it could be 100,000.
The rows will be unique if I group on the first three columns listed above. I have read I can use the unique constraint on multiple columns which will guarantee the rows are unique.
I think I'm correct in saying that the unique constraint by default sets up a non-clustered index. Would a clustered index be better, and can I assume this won't cause any issues when the table starts to contain millions of rows?
My last question: am I right to say that, with the unique constraint applied, querying the data will be quicker than without it (because of the non-clustered or clustered index), and uploading the data will be slower (which is fine)?
Unique index can be non-clustered.
Primary key is unique and can be clustered
Clustered index is not unique by default
Unique clustered index is unique :)
You can get more information from this guide.
So, we should separate uniqueness and index keys.
If you need to keep data unique by some column, create a unique constraint (unique index). You'll protect your data.
Also, you can create a primary key (PK) on your columns, and they will be unique as well. But there is a difference: by default the PK is the clustered index, and every other index uses the clustered key to reference rows, so the PK must be as short as possible. So, my advice: create an identity column (int or bigint) and create the PK on it. Then create a unique index on your unique columns.
Querying data may become faster if you query on your unique columns. If you query on other columns, you need to create other, specific indexes.
So, unique keys are for data consistency, and indexes are for queries.
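The advice above could be sketched like this. The table name and all column names are hypothetical, chosen only to match the eight column types listed in the question.

```sql
-- Hypothetical names; the types follow the question's column list.
CREATE TABLE dbo.DailyUpload (
    DailyUploadId bigint IDENTITY(1,1) NOT NULL,   -- short surrogate key
    ReadingTime   datetime     NOT NULL,
    Code12        varchar(12)  NOT NULL,
    Code6         varchar(6)   NOT NULL,
    Description   varchar(100) NULL,
    Value1        float        NULL,
    Value2        float        NULL,
    Quantity      int          NULL,
    LoadedAt      datetime     NULL,
    -- Short clustered PK: every nonclustered index carries this key.
    CONSTRAINT PK_DailyUpload PRIMARY KEY CLUSTERED (DailyUploadId),
    -- Uniqueness of the three business columns is enforced separately.
    CONSTRAINT UQ_DailyUpload_Key UNIQUE NONCLUSTERED (ReadingTime, Code12, Code6)
);

INSERT INTO dbo.DailyUpload (ReadingTime, Code12, Code6, Value1)
VALUES ('2015-01-01T00:00:00', 'ABC', 'X1', 1.5);

-- Running the same INSERT a second time would fail with a unique
-- constraint violation, which is the data protection described above.
```

Queries that filter on ReadingTime can use the index behind the unique constraint, because ReadingTime is its leading column.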
I think I'm correct in saying that the unique constraint by default
sets up non-clustered index
TRUE
Would a clustered index be better & assuming when the table starts to
contain millions of rows this won't cause any issues?
(1) If you need (datetime, varchar(12), varchar(6)) to be unique, and
(2) if your application, or you, will access rows using datetime, or datetime + varchar(12), or datetime + varchar(12) + varchar(6) in the WHERE condition ALL the time,
then put the primary key on (datetime, varchar(12), varchar(6)).
By default this enforces uniqueness and creates a clustered index on all three columns.
but as you commented above:
the queries will vary to be honest. I imagine most queries will make
use of the first datetime column
and you will be dealing with a large volume of data and might join this table with other tables,
so it's better to have a surrogate key (an ever-increasing unique identifier) in the table and, to satisfy your SELECTs,
to have non-clustered indexes.
Surrogate Key vs Business Key
NON-CLUSTERED INDEX
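The two options discussed in this answer might look roughly as follows. All table, column, and constraint names are hypothetical, and the column list is shortened to keep the sketch small.

```sql
-- Option A: composite primary key. PRIMARY KEY is clustered by default,
-- so this gives uniqueness and a clustered index on the three columns.
CREATE TABLE dbo.UploadA (
    ReadingTime datetime    NOT NULL,
    Code12      varchar(12) NOT NULL,
    Code6       varchar(6)  NOT NULL,
    Amount      float       NULL,
    CONSTRAINT PK_UploadA PRIMARY KEY (ReadingTime, Code12, Code6)
);

-- Option B: ever-increasing surrogate key as the clustered PK,
-- with non-clustered indexes added to serve the actual queries.
CREATE TABLE dbo.UploadB (
    UploadId    bigint IDENTITY(1,1) NOT NULL,
    ReadingTime datetime    NOT NULL,
    Code12      varchar(12) NOT NULL,
    Code6       varchar(6)  NOT NULL,
    Amount      float       NULL,
    CONSTRAINT PK_UploadB PRIMARY KEY CLUSTERED (UploadId),
    CONSTRAINT UQ_UploadB_Key UNIQUE NONCLUSTERED (ReadingTime, Code12, Code6)
);

-- Example of a query-specific index: date-range lookups that only read Amount.
CREATE NONCLUSTERED INDEX IX_UploadB_ReadingTime
    ON dbo.UploadB (ReadingTime)
    INCLUDE (Amount);
```

Which option fits depends on the access pattern described in points (1) and (2); the extra index in option B is only worthwhile if such queries actually occur.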
I am developing a system in which I have a table Employees with multiple columns related to employees. I have a column for the JobTitle and another column for Department.
In my current design, the JobTitle & the Department columns are compound foreign keys in the Employees table and they are linked with the Groups table which has 2 columns compound primary key (JobTitle & Department) and an extra column for the job description.
I am not happy about this design because I think that linking 2 tables using 2 compound varchar columns is not good for performance, and I think it would be better to have an integer column (autonumber) JobTitleID used as the primary key in the Groups table and as a foreign key in the Employees table instead of the textual JobTitle & Department columns.
But I had to do this because when I import the employees list (Excel) into my Employees table it can just be directly mapped (JobTitle --> JobTitle & Department --> Department). Otherwise if I am using an integer index as primary key I would have then to manually rename the textual JobTitle column in the excel sheet to a number based on the generated keys from the Groups table in order to import.
Is it fine to keep my database design like this (textual compound primary key linked with textual compound foreign key)? If not, then if I used an integer column in the Groups table as primary key and the same as a foreign key in the Employees table then how can I import the employees list from excel directly to Employees table?
Is it possible to import the list from Excel to SQL Server in a way that the textual JobTitle from the Excel sheet will be automatically translated to the corresponding JobTitleID from the Groups table? This would be the best solution: I could then add a JobTitleID column in the Groups table as a primary key and as a foreign key in the Employees table.
Thank you,
It sounds like you are trying to make the database table design fit the import of the Excel file, which is not such a good idea. Forget the Excel file and design your db tables first with correct primary keys and relationships. This means either int, bigint or guids for primary keys. This will keep you out of trouble unless you absolutely know the key is unique, such as an SSN. Then, when you import, first populate the departments and job titles into their respective tables, creating their primary keys. Now that they are populated, add those keys to the Excel file so that it can be imported into the employees table.
This is just an example of how I would solve this problem. It is not wrong to use multiple columns as the key, but it will definitely keep you out of harm's way if you stick with int, bigint or guids for your primary keys.
Look at the answer in this post: how-to-use-bulk-insert...
I would create a simple Stored Procedure that imports your excel data into a temporary unrestricted STAGING table and then do the INSERT into your real table by doing the corresponding table joins to get the right foreign keys and dump the rows that failed to import into an IMPORT FAIL table. Just some thoughts...
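A rough sketch of that staging approach follows. Everything here is an assumption made for illustration: the table names EmployeesStaging and EmployeesImportFail, the procedure name, the EmployeeName column, and the column sizes. How the Excel rows reach the staging table (import wizard, BULK INSERT, and so on) is not shown. GO is a batch separator for SSMS and sqlcmd.

```sql
CREATE TABLE dbo.Groups (
    JobTitleID     int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    JobTitle       varchar(100) NOT NULL,
    Department     varchar(100) NOT NULL,
    JobDescription varchar(500) NULL,
    CONSTRAINT UQ_Groups_Title_Dept UNIQUE (JobTitle, Department)
);
CREATE TABLE dbo.Employees (
    EmployeeID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    EmployeeName varchar(100) NOT NULL,
    JobTitleID   int NOT NULL REFERENCES dbo.Groups (JobTitleID)
);
-- Unrestricted staging table: same shape as the Excel sheet, no constraints.
CREATE TABLE dbo.EmployeesStaging (
    EmployeeName varchar(100) NULL,
    JobTitle     varchar(100) NULL,
    Department   varchar(100) NULL
);
CREATE TABLE dbo.EmployeesImportFail (
    EmployeeName varchar(100) NULL,
    JobTitle     varchar(100) NULL,
    Department   varchar(100) NULL,
    FailedAt     datetime NOT NULL DEFAULT GETDATE()
);
GO

CREATE PROCEDURE dbo.ImportEmployeesFromStaging
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;

    -- Rows whose text matches a Groups row get the integer key via the join.
    INSERT INTO dbo.Employees (EmployeeName, JobTitleID)
    SELECT s.EmployeeName, g.JobTitleID
    FROM dbo.EmployeesStaging AS s
    JOIN dbo.Groups AS g
      ON g.JobTitle = s.JobTitle AND g.Department = s.Department
    WHERE s.EmployeeName IS NOT NULL;

    -- Everything else is kept for inspection instead of being lost.
    INSERT INTO dbo.EmployeesImportFail (EmployeeName, JobTitle, Department)
    SELECT s.EmployeeName, s.JobTitle, s.Department
    FROM dbo.EmployeesStaging AS s
    LEFT JOIN dbo.Groups AS g
      ON g.JobTitle = s.JobTitle AND g.Department = s.Department
    WHERE g.JobTitleID IS NULL OR s.EmployeeName IS NULL;

    DELETE FROM dbo.EmployeesStaging;
    COMMIT TRANSACTION;
END;
GO
```

After loading the sheet into dbo.EmployeesStaging, running EXEC dbo.ImportEmployeesFromStaging; would move the matching rows and park the rest in dbo.EmployeesImportFail.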
I want to create a DB where each table's PK will be a GUID that is unique across the DB.
Example: my DB name is 'LOCATION', and I have 3 tables: 'CITY', 'STATE' and 'COUNTRY'.
I want all 3 tables to have the same kind of PK field, a GUID, whose value will be unique across the DB.
How can I do this in SQL Server? Any ideas? I have never used SQL Server before, so a brief explanation would be helpful.
create table CITY (
ID uniqueidentifier not null primary key default newid(),
.
.
.
)
Repeat for the other tables.
What do you mean exactly?
Just create the table, add an Id field to each table, set the data type of the Id field to 'uniqueidentifier', and you're good to go.
Next, add a primary key constraint on those columns, and make sure that, when inserting a new record, you assign a new guid to that column (for instance, by using the newid() function).
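As a small illustration of the insert side, here is a sketch that assumes a hypothetical Name column on the CITY table and hypothetical constraint names. It shows both ways of assigning the guid: letting a default do it, and passing NEWID() explicitly.

```sql
CREATE TABLE dbo.CITY (
    Id   uniqueidentifier NOT NULL
        CONSTRAINT DF_CITY_Id DEFAULT NEWID(),
    Name varchar(100) NOT NULL,          -- hypothetical column
    CONSTRAINT PK_CITY PRIMARY KEY (Id)
);

-- Option 1: omit Id and let the default generate the guid.
INSERT INTO dbo.CITY (Name) VALUES ('Toronto');

-- Option 2: generate the guid first, so the caller already knows the key.
DECLARE @id uniqueidentifier = NEWID();
INSERT INTO dbo.CITY (Id, Name) VALUES (@id, 'Montreal');

SELECT Id, Name FROM dbo.CITY;
```

The same pattern would be repeated for STATE and COUNTRY.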
I can't think of any good reason to have a unique number shared by 3 tables, why not just give each table a unique index with a foreign key reference? Indexed fields are queried quicker than random numbers would be.
I would create a 'Location' table with foreign keys CityId, StateId & CountryId to link them logically.
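One way to read that suggestion is sketched below. The Name columns, the LocationId column, and all constraint names are assumptions; only the table names and the CityId, StateId and CountryId columns come from the thread.

```sql
CREATE TABLE dbo.COUNTRY (
    Id   uniqueidentifier NOT NULL CONSTRAINT DF_COUNTRY_Id DEFAULT NEWID(),
    Name varchar(100) NOT NULL,
    CONSTRAINT PK_COUNTRY PRIMARY KEY (Id)
);
CREATE TABLE dbo.STATE (
    Id   uniqueidentifier NOT NULL CONSTRAINT DF_STATE_Id DEFAULT NEWID(),
    Name varchar(100) NOT NULL,
    CONSTRAINT PK_STATE PRIMARY KEY (Id)
);
CREATE TABLE dbo.CITY (
    Id   uniqueidentifier NOT NULL CONSTRAINT DF_CITY_Id DEFAULT NEWID(),
    Name varchar(100) NOT NULL,
    CONSTRAINT PK_CITY PRIMARY KEY (Id)
);

-- The linking table: one row ties a city, a state and a country together.
CREATE TABLE dbo.Location (
    LocationId uniqueidentifier NOT NULL
        CONSTRAINT DF_Location_Id DEFAULT NEWID(),
    CityId     uniqueidentifier NOT NULL
        CONSTRAINT FK_Location_City REFERENCES dbo.CITY (Id),
    StateId    uniqueidentifier NOT NULL
        CONSTRAINT FK_Location_State REFERENCES dbo.STATE (Id),
    CountryId  uniqueidentifier NOT NULL
        CONSTRAINT FK_Location_Country REFERENCES dbo.COUNTRY (Id),
    CONSTRAINT PK_Location PRIMARY KEY (LocationId)
);
```

Each table keeps its own key, and the foreign keys in Location express the relationship instead of a shared identifier.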
edit:
If you are adding a unique id across the City, State and Country tables then why not just have them as fields in the same table? I would have thought that your reason for splitting them into 3 tables was to reduce duplication in the database.