SQL Server String column as a unique key

I am using a SQL Server table to store information related to a URL. The table has a column named "URL" of type VARCHAR. In the application we have to use this URL as a unique key to query for information (we are using something like SELECT * FROM Table WHERE URL = 'www.google.com/ig').
Are there any disadvantages or known drawbacks to using a URL as a unique key?

Usually it is a better idea to have a numeric value rather than a string as a table key. See a discussion about this subject here: Database Primary Key C# mapping - String or int.
As for using a URL, it should not pose any problem provided that you have some basic rules to avoid inserting the same (equivalent) URL twice. The database will treat "www.google.com" and "http://www.google.com" as different strings, so you should have a rule like "URLs will never have the protocol identifier" or "URLs will never end with a slash", or whatever makes sense for your design.
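A rule like that is easiest to enforce if every URL goes through one normalizing function before it is stored or looked up. Here is a minimal sketch in Python; the function name normalize_url and the specific rules (drop the protocol, lowercase the host, strip trailing slashes) are assumptions for illustration, not something the question prescribes:

```python
from urllib.parse import urlsplit


def normalize_url(raw: str) -> str:
    """Hypothetical canonical form: no scheme, lowercase host, no trailing slash."""
    raw = raw.strip()
    # urlsplit only recognises the host when a scheme or a leading "//" is present.
    parts = urlsplit(raw if "://" in raw else "//" + raw)
    host = (parts.hostname or "").lower()
    if parts.port:
        host += f":{parts.port}"
    path = parts.path.rstrip("/")
    query = "?" + parts.query if parts.query else ""
    return host + path + query
```

With these rules, "http://www.google.com", "www.google.com/" and "WWW.Google.com" all end up as the same key, so the unique check in the database compares like with like.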

As the others have said, I would definitely not use a long string like a URL as the primary/clustering key on a SQL Server table. But of course you should feel free to put a unique constraint on that column, to make sure you don't get any duplicates!
You can create either a UNIQUE CONSTRAINT or a UNIQUE INDEX; the end result is pretty much the same (the unique constraint also uses an index behind the scenes). In SQL Server a foreign key in another table can reference the column either way, so I almost always just create the index:
CREATE UNIQUE NONCLUSTERED INDEX UIX_YourTable_URL ON dbo.YourTable(urlField)
If you should ever try to insert a value that's already in the table, that insert statement will be rejected with a SQL error and nothing bad can happen.
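To see that behaviour end to end, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so the example runs anywhere; the table name your_table and the try_insert helper are hypothetical). It combines a numeric surrogate key with a unique index on the URL column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Numeric surrogate key plus a separate unique index on the URL column.
conn.execute(
    "CREATE TABLE your_table (id INTEGER PRIMARY KEY, url_field TEXT NOT NULL)"
)
conn.execute("CREATE UNIQUE INDEX uix_your_table_url ON your_table (url_field)")


def try_insert(url: str) -> bool:
    """Return True if the row was inserted, False if the unique index rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO your_table (url_field) VALUES (?)", (url,))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("www.google.com/ig")
second = try_insert("www.google.com/ig")  # duplicate, rejected by the index
row_count = conn.execute("SELECT COUNT(*) FROM your_table").fetchone()[0]
```

The second insert fails with an integrity error and the table still holds a single row. SQL Server reports a duplicate key error in the same situation, though the exception type depends on the client library you use.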

I would still create a clustered key column on the table, e.g. an auto-number (identity) column, and then create a unique index on the URL column.
That said, I can't see why a URL would not be unique, so it should all work as is.

You may benefit from a numeric primary key if paging is needed. But you can still add a numeric indexed column in the future, so there is no obstacle to making the URL the PK.

Related

Azure Logic App delete Row is not working

I have a table that has a composite primary key.
CONSTRAINT [PK_FileContainerFiles] PRIMARY KEY CLUSTERED
(
[FileId] ASC,
[ContainerId] ASC
)
I am trying to delete a row using the Logic App connector. It works if the primary key has a single column.
How do I enter two identifiers in the 'RowId' field of the Logic App? When I tried something like below, I got an error. Is this a Microsoft Logic App issue? Any ideas? Please help.

Yes, it is possible. The SQL Connector (which, btw is the same connector used in flows as well as LogicApps and PowerApps) treats primary keys just like SQL. That is, you simply use each key in sequence separated by a comma to construct the "full" key.
My example using composite key:
@{join(createArray(items('For_each')?['BUKRS'],items('For_each')?['LIFNR']),',')}
TLDR: values separated by comma.

The Row Id is the unique identifier of the row you wish to delete.
So if you would like to delete a row based on those 2 input parameters, you would first need to find a way to retrieve the Row Id (unique identifier) of the row(s) you'd like to delete, and then execute Delete row for each of the returned rows.
Another way would be to use a stored procedure to handle the deletion of the rows.
For Reference:
https://learn.microsoft.com/en-us/connectors/sql/

Another working solution is to use the "Execute Query" action instead and run a DELETE with all the conditions you need.
Sorry for answering such an old post but I had the same issue and found it so I think other people may find it useful too.
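A rough sketch of the kind of statement the "Execute Query" suggestion above would run, shown here with Python and an in-memory SQLite database as a stand-in for SQL Server. The sample rows and the delete_file_from_container helper are made up for illustration; only the table and column names come from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE FileContainerFiles (
        FileId INTEGER NOT NULL,
        ContainerId INTEGER NOT NULL,
        PRIMARY KEY (FileId, ContainerId)
    )
    """
)
conn.executemany(
    "INSERT INTO FileContainerFiles VALUES (?, ?)", [(1, 10), (1, 11), (2, 10)]
)


def delete_file_from_container(file_id: int, container_id: int) -> int:
    """Delete one row by naming both key columns; return the number of rows removed."""
    cur = conn.execute(
        "DELETE FROM FileContainerFiles WHERE FileId = ? AND ContainerId = ?",
        (file_id, container_id),
    )
    conn.commit()
    return cur.rowcount


deleted = delete_file_from_container(1, 11)
remaining = conn.execute(
    "SELECT FileId, ContainerId FROM FileContainerFiles ORDER BY FileId, ContainerId"
).fetchall()
```

The point is that the WHERE clause names both halves of the composite key, so no single "row id" value is needed. Passing the values as parameters rather than concatenating them into the SQL text is the safer habit wherever the tool allows it.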

Adding a primary key for Entity Framework to an existing column in a View based on a table where every column is Allow Null

There are a lot of questions asking this but I can't seem to find one with the specific solution.
I have a 3rd party database where every field is set to allow null. One of the columns ("Code") is a unique string ID and it is distinct.
Using entity framework I'd like to add this table, by telling EF to treat the column "Code" as a primary key.
I've created a view but I am not sure where to go from here.
I've seen some solutions that involve adding an extra row number to use as the primary key but I would prefer to use "Code" if possible.
Any ideas?

After some playing around I found a read-only solution.
In the view I changed the column to be:
SELECT ISNULL(Code, -1) AS Code
Specifying ISNULL allows EF to infer a primary key. It is not ideal as I would like it to be writable as well. You get the message:
Error 6002: The table/view 'KittyCat.dbo.View_GetCatDetails' does not
have a primary key defined. The key has been inferred and the
definition was created as a read-only table/view.

When having an identity column is not a good idea?

In tables where you need only one column as the key, and the values in that column can be integers, when shouldn't you use an identity field?
Conversely, for the same table and column, when would you generate the values manually instead of using an autogenerated value for each record?
I guess that would be the case when there are lots of inserts and deletes on the table. Am I right? What other situations could there be?

If you have already settled on the surrogate side of the Great Primary Key Debacle, then I can't find a single reason not to use identity keys. The usual alternatives are GUIDs (they have many disadvantages, primarily from size and randomness) and keys generated in the application layer. But creating a surrogate key in the application layer is a little harder than it seems, and it also does not cover data access that bypasses the application (i.e. batch loads, imports, other apps, etc.). The one special case is distributed applications, where GUIDs and even sequential GUIDs may offer a better alternative to site id + identity keys.

I suppose if you are creating a many-to-many linking table, where both fields are foreign keys, you don't need an identity field.
Nowadays I imagine that most ORMs expect there to be an identity field in every table. In general, it is a good practice to provide one.

I'm not sure I understand enough about your context, but I interpret your question to be:
"If I need the database to create a unique column (for whatever reason), when shouldn't it be a monotonically increasing integer (identity) column?"
In those cases, there's no reason to use anything other than the facility provided by the DBMS for the purpose; in your case (SQL Server?) that's an identity.
Except:
If you'll ever need to merge the table with data from another source, use a GUID, which will prevent duplicate keys from colliding.
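A toy illustration of that collision, sketched in Python with plain dictionaries standing in for two tables (the site names and sample values are invented for the example):

```python
import uuid

# Two sources that each started their identity counter at 1.
site_a_identity = {1: "Smith", 2: "Jones"}
site_b_identity = {1: "Garcia", 2: "Chen"}

# Keys present in both sources would have to be regenerated before a merge.
colliding_ids = sorted(site_a_identity.keys() & site_b_identity.keys())

# The same rows keyed by random GUIDs can be merged as they are.
site_a_guid = {uuid.uuid4(): name for name in site_a_identity.values()}
site_b_guid = {uuid.uuid4(): name for name in site_b_identity.values()}
merged = {**site_a_guid, **site_b_guid}
```

Here colliding_ids comes out as [1, 2], while the GUID-keyed merge keeps all four rows. In a real database the cost of regenerating identity values also includes fixing every foreign key that points at them.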

If you need to merge databases it's a lot easier if you don't have to regenerate keys.

One case of not wanting an identity field would be in a one to one relationship. The secondary table would have as its primary key the same value as the primary table. The only reason to have an identity field in that situation would seem to be to satisfy an ORM.

You cannot (normally) specify values when inserting into identity columns, so for example if the column "id" was specified as an identity, the following SQL would fail:
INSERT INTO MyTable (id, name) VALUES (1, 'Smith')
In order to perform this sort of insert you need to set IDENTITY_INSERT ON for that table. This is not intended to be on normally, and it can be ON for only one table per session at any point in time.

If I need a surrogate, I would either use an IDENTITY column or a GUID column depending on the need for global uniqueness.
If there is a natural primary key, or the primary key is defined as a unique combination of other foreign keys, then I typically do not have an IDENTITY, nor do I use it as the primary key.
There is an exception, which is snapshot configuration tables which I am tracking with an audit trigger. In this case, there is usually a logical "primary key" (usually date of the snapshot and natural key of the row - like a cost center or gl account number for which the row is a configuration record), but instead of using the natural "primary key" as the primary key, I add an IDENTITY and make that the primary key and make a unique index or constraint on the date and natural key. Although theoretically the date and natural key shouldn't change, in these tables, if a user does that instead of adding a new row and deleting the old row, I want the audit (which reflects a change to a row identified by its primary key) to really reflect a change in the row - not the disappearance of a key and the appearance of a new one.

I recently implemented a Suffix Trie in C# that could index novels, and then allow searches to be done extremely fast, linear to the size of the search string. Part of the requirements (this was a homework assignment) was to use offline storage, so I used MS SQL, and needed a structure to represent a Node in a table.
I ended up with the following structure : NodeID Character ParentID, etc, where the NodeID was a primary key.
I didn't want this to be done as an autoincrementing identity for two main reasons:
I wasn't sure how to get the value of a NodeID back after adding the row to the database/data table.
I wanted more control when it came to generating my own IDs.

What is the difference between Unique Key and Index with IsUnique=Yes?

I have a table with a primary key, but I want two other columns to be constrained so the combination of the two is guaranteed always to be unique.
(A dumb example: in a BOOKS table, the ISBN column is the primary key, but the combination of the Title and Author columns should also always be unique.)
In the SQL Server Management Studio it's possible to either create a new Index and set IsUnique to Yes, or I can create a new Unique Key.
What is the difference between the two approaches, and which one suits best for which purposes?

Creating a UNIQUE constraint is a clearer statement of the rule. The IsUnique attribute of the index is an implementation detail - how the rule is implemented, not what the rule is. The effect is the same though.
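To check the "same effect" part, here is a small sketch that declares the rule both ways and tries a duplicate against each. It uses Python's sqlite3 module as a stand-in for SQL Server, and the table names, the book_id column and the rejects_duplicate helper are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Rule stated as a constraint in the table definition.
    CREATE TABLE books_with_constraint (
        book_id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        author TEXT NOT NULL,
        CONSTRAINT uq_title_author UNIQUE (title, author)
    );
    -- Same rule implemented as a separately created unique index.
    CREATE TABLE books_with_index (
        book_id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        author TEXT NOT NULL
    );
    CREATE UNIQUE INDEX uix_title_author ON books_with_index (title, author);
    """
)


def rejects_duplicate(table: str) -> bool:
    """Insert the same title/author twice; True if the second insert is refused."""
    # The table name comes from the fixed tuple below, never from user input.
    sql = f"INSERT INTO {table} (book_id, title, author) VALUES (?, ?, ?)"
    conn.execute(sql, ("book-1", "Dune", "Frank Herbert"))
    try:
        conn.execute(sql, ("book-2", "Dune", "Frank Herbert"))
    except sqlite3.IntegrityError:
        return True
    return False


results = {t: rejects_duplicate(t) for t in ("books_with_constraint", "books_with_index")}
```

Both tables refuse the second row, so the choice between the two is mostly about how clearly the rule is stated in the schema.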

There is a clear difference between the two.
A unique constraint defines which combination of columns has to be unique.
A unique index is just a way of making sure the above is always valid.
But in some database systems it's possible to have a non-unique index supporting a unique constraint
(if the constraint is deferrable, i.e. it only has to be valid at commit time but is allowed to be broken in the middle of a transaction). SQL Server does not support deferrable constraints, so there a unique constraint is always backed by a unique index.

Just so that you know, when you create a unique constraint SQL Server will create an index behind the scenes

One thing I just found out the hard way is that in SSMS, scripting of unique keys was set to True by default but scripting of indexes was set to False. When I used the Script Table As context menu in SSMS, I didn't get my unique indexes.
Also, if the type is set to Unique Key, you can't change the "Ignore Duplicate Key" setting. First you have to change the type from Unique Key to Index, and then you can set Ignore Duplicate Keys to true.

unique indexes are unique keys.

I don't think there is any real difference between them. With a unique index you get two benefits: the column is guaranteed to be unique, and it is indexed, so searches on it are faster. (A unique constraint gives you the same index behind the scenes.)

Preventing Duplicate Inserts Into SQL With PHP

I'm going to be running thousands of queries against a SQL database and I need to prevent duplicate values in the field 'domain'. I have never had to do this before, and any help would be appreciated.

You probably want to create a "UNIQUE" constraint on the field "Domain". This constraint will raise an error if you try to create two rows that have the same domain in the database. For an explanation, see this W3Schools tutorial:
http://www.w3schools.com/sql/sql_unique.asp
If this doesn't solve your problem, please clarify which database you have chosen to use (MySQL?).
NOTE: This constraint is completely separate from your choice of PHP as a programming language, it is a SQL database definition thing. A huge advantage of expressing this constraint in SQL is that you can trust the database to preserve the constraint even when people import / export data from the database, your application is buggy or another application shares the database.

If this is an absolute database integrity requirement (It's not likely to change, nor does existing data have this problem), I would enforce it at the database with a unique constraint.
As far as detecting it before or after the attempt in order to notify the user, there are a number of techniques which could be used.

Where is the data coming from? Is this something you only want to run once, or a couple of times, or often? If the domain value already exists, do you just want to skip the insert or do something else (e.g. increment a counter)?
Depending on your answers, there are many possible solutions:
Pre-sort your data, eliminate duplicates, then insert
(assumes relatively static data, empty table to begin with)
Use an associative array in PHP as a local domain-value cache
(if table already contains data, start by reading existing content;
not thread-safe, but works if it only runs once at a time)
Make domain a UNIQUE column and write wrapper code to handle return errors
Make domain a UNIQUE or PRIMARY KEY column and use an ON DUPLICATE KEY clause:
INSERT INTO mydata ( domain, count ) VALUES
( 'firstdomain', 1 ),
( 'seconddomain', 1 ),
( 'thirddomain', 1 )
ON DUPLICATE KEY
UPDATE count = count+1
Insert all data into the table, then remove duplicates
Note that batching inserts (i.e. using multiple value clauses per statement) can be significantly faster.
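For anyone who wants to try the "insert or bump a counter" option without a MySQL server at hand, here is a sketch using Python's built-in sqlite3 module. Note the assumptions: SQLite spells the clause ON CONFLICT ... DO UPDATE rather than MySQL's ON DUPLICATE KEY UPDATE (it needs SQLite 3.24 or newer), and the hit_count column name and sample domains are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mydata (domain TEXT PRIMARY KEY, hit_count INTEGER NOT NULL)"
)

# Incoming data with repeats, as it might arrive from a crawl or an import file.
incoming = ["firstdomain", "seconddomain", "firstdomain", "thirddomain", "firstdomain"]

with conn:  # one transaction for the whole batch
    conn.executemany(
        "INSERT INTO mydata (domain, hit_count) VALUES (?, 1) "
        "ON CONFLICT(domain) DO UPDATE SET hit_count = hit_count + 1",
        [(d,) for d in incoming],
    )

rows = conn.execute(
    "SELECT domain, hit_count FROM mydata ORDER BY domain"
).fetchall()
```

Each domain ends up in the table exactly once, with hit_count recording how many times it was seen. The database does the duplicate check, so this stays correct even if several scripts insert at the same time.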

I'm not really sure I understood your question, but perhaps you are looking for SQL's "UNIQUE" constraint. If the query tries to insert a pre-existing value to a field, you (PHP) will be notified about this constraint breach.

There are a bunch of ways to approach this. You could set a unique constraint (like a primary key) on that column. This will cause the insert to fail if that domain has already been inserted. You could also insert all of the duplicate domains and just delete them later on. This will work well if not that many of the domains are duplicated. There are a few questions posted already on finding duplicate rows.

This can be done with SQL, rather than with PHP.
I am assuming that you are using MySQL, but the same principles will work with different databases.
Make the Domain column the primary key (which makes sense, as it has to be unique).
Rather than using a plain INSERT, use REPLACE or INSERT ... ON DUPLICATE KEY UPDATE. A plain UPDATE will not help here, because it only changes rows that already exist and never adds one.
If the primary key you are trying to put into the table already exists, these statements change the existing tuple rather than creating a new one.
So you will overwrite existing data if it is different, and if it is identical nothing effectively changes.
