SQL Server Insert Random - sql-server

I have this table:
Create Table Person
(
Consecutive Integer Identity(1,1),
Identification Varchar(15) Primary Key,
)
The Identification column can contain letters and numbers and is optional: the customer can enter a value or leave it empty, and if it is left empty a number should be generated automatically. How can I insert a random number that does not already exist, preferably a low one?
An example could be:
Select Random From Person Where Random Not Exists In Identification
This is my code:
Select Min(Convert(Integer,Identification)) - 1
From Person
Where IsNumeric(Identification) = 1
Or
Select Max(Convert(Integer,Identification)) + 1
From Person
Where IsNumeric(Identification) = 1
This works well, but if the customer enters a high number, for example 1000 or higher, the generated numbers will continue from there and could eventually cause an overflow error.
And if there is no number below the lowest Identification that is still greater than 0, the generated values will be -1, -2, -3, and so on.
Thanks in advance.

I agree with what M.Ali said. You can use the code below, but I still recommend following M.Ali's advice instead.
The loop will continue until it generates a random number that is not in your table. You can change the precision to 5 digits by changing 1000 to 10000 (and VARCHAR(4) to VARCHAR(5)), and so on.
DECLARE @I INT = 0
DECLARE @RANDOM INT;
WHILE(@I=0)
BEGIN
SELECT @RANDOM = 1000 + (CONVERT(INT, CRYPT_GEN_RANDOM(3)) % 1000);
IF NOT EXISTS(SELECT Identification FROM YOURTABLE WHERE Identification = CAST(@RANDOM AS VARCHAR(4)))
BEGIN
-- Do your stuff here
BREAK;
END
ELSE
BEGIN
-- The ELSE part
END
END

Maintaining a random VARCHAR(15) value that depends on the end user's input can be a very expensive approach when you also want it to be unique.
Imagine you have a decent number of rows in this table, say 10,000, and a user comes in trying to insert a random number. Chances are the user may need 5, 10, or even 15 tries to hit a unique value.
On each failed attempt a call is made to the server and a search is done on the table (the more rows, the more expensive this query becomes), and the more failed attempts a user makes, the worse their experience of the application will be.
Would you ever go back to an application (web/Windows) where you had to struggle this many times just to register? Obviously not.
The moral of the story: if you are asking a user to enter some random value, do not expect users to maintain your database integrity and keep that column unique. Take control and pair that value with another column which will definitely be unique. In your case it can be the Identity column. Alternatively, you can generate the value for the user yourself, using a GUID.

select count(*) +1 from Person
This generates a logical ID for Identification that sets the ID to what it 'would have been' with a simple incrementor.
However, you then cannot delete records; instead you must deactivate them, or clear the row.
Alternately, have a separate (hidden) column that only auto-increments, and if Identification is left empty, use the value from the hidden column. Same result, but less risk if deletion is relevant.

Related

SQL create new unique random ID

I'm creating an inventory application which uses an SQL database to keep track of products.
The ProductNumber is in the format yyyy-xxxx (e.g. 8024-1234), where the first 4 digits describe a category and the last 4 digits are an increasing integer, together forming the product number.
When creating a new product, the category should first be approved by an administrator, and therefore all new products will be added as 9999-xxxx. Later, when the product is approved in the category, its product number will change to the correct ProductNumber.
What I need is, when creating a new product, to generate a random number for the last 4 digits and then check that it doesn't already exist in the database (together with the first 4 digits). So, when creating a new product, some SQL query should create for example 9999-0123 and then double check that this one doesn't exist already.
How could one achieve this?
Thanks in advance!
You didn't specify the DBMS you are using, but here is a potential solution using Oracle PL/SQL:
declare
temp varchar2(10);
any_rows_found number;
row_exist boolean := true;
begin
WHILE row_exist = true
LOOP
temp := '9999-' || ceil(DBMS_RANDOM.value(low => 999, high => 9999));
select count(*) into any_rows_found from my_table where my_column = temp;
if any_rows_found = 0 then
row_exist := false;
insert into my_table values (..................., temp);
end if;
end loop;
end;
We use DBMS_RANDOM to generate the random value, concatenate it to 9999-, and then check whether it exists. If it does, we loop to generate another value; if it doesn't, we insert the value.
regards
You can generate your product number with a sequence, if you'd like an incremental number:
CREATE SEQUENCE product_number
START WITH 1000
INCREMENT BY 1
NOCACHE
NOCYCLE;
Whenever you insert a new product and need a valid number, just call product_number.NEXTVAL. Then in your product table set (year, product_number) as the primary key (or the product number itself). If you can't set the primary key that way and want to check whether an item with that serial number already exists, you can generate the sequence number using:
SELECT product_number.NEXTVAL FROM DUAL;
Then check whether a product with the generated number exists.
I didn't know what dialect of SQL you are using. This is Oracle SQL, but it can be applied in other dialects too.
I'm also not sure about the target DB, but I worked it out for MS-SQL.
First of all, I would not recommend generating a random number, checking whether it exists, and potentially doing this over and over again.
Instead you could get the current max product number and work from there. Even with a varchar you can retrieve the max int, since your syntax is always cccc-pppp (c = category / p = product). In addition, you will get your desired value straight away, since the target category is "9999".
You could work with something like this:
DECLARE @newID int;
-- REPLACE removes the hyphen so we are dealing with an actual integer
-- CAST lets us calculate with the value, i.e. add 1 to it
-- MAX retrieves the highest value
SELECT @newID = MAX(CAST(REPLACE(ProductNumber,'-','') AS int)) + 1 FROM Test
-- Default the ID to 99990000 in case there are no values with the 9999 prefix (or the table is empty)
IF @newID IS NULL OR @newID < 99990000
BEGIN
SET @newID = 99990000
END
-- Push the hyphen back into the new ID, giving you the final new ProductNumber
-- 5 is the starting index
-- 0 since no chars from the original ID shall be removed
SELECT STUFF(@newID, 5,0,'-')
So in case you currently have a product with 9999-1423 as your product with the highest number this would return "9999-1424".
If there are no products with the prefix of "9999" you would simply get "9999-0000".
The ProductNumber is in the format yyyy-xxxx (i.e. 8024-1234), where the first 4 digits describe a category and the last 4 digits describe the an increasing integer, together creating the productnumber.
We will implement this with a computed column which puts together the category and the product number, each of which is stored in its own field.
When creating a new product, the category should first be approved by an administrator, and therefor all new products will be added as 9999-xxxx. Then later, when the product is approved in the category, it's product number will change to the correct ProductNumber.
Put simply, by default every new product is automatically assigned product category 9999
What I need for this is when creating a new product, to generate a random number for the last 4 digits, and then check if they don't exist already in the database (together with the first 4 digits). So, when creating a new product, some SQL query should create for example 9999-0123 and then double check if this one doesn't exist already.
This can be implemented as an identity. It is not random, but I assume that is not really a requirement, right?
Keep in mind there are many holes in these requirements.
If your product number changes from 9999-1234 to 8024-1234 but has already appeared on reports / documents as 9999-1234, that's a problem
Four digits only support at most 10,000 products (0000 to 9999). After that your system breaks
Again, does the number really need to be random?
I won't go into the actual mechanism for approval and assignment, you'll need to ask that in another question once this one is solved.
ProductNumber is in fact not a number, it's a code, so I don't agree with that column name
On to the code.
Create a table by running this:
CREATE TABLE dbo.Products
(
ProductID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
ProductName VARCHAR(100),
ProductCategoryID INT NOT NULL DEFAULT (9999),
ProductNumber AS (FORMAT(ProductCategoryID,'0000') + '-' + FORMAT(ProductID,'0000'))
)
Some explanation of the columns:
ProductID will autogenerate an incrementing number, starting at 1, incrementing by 1 each time. It's guaranteed to be unique. It's also defined as the primary key
ProductCategoryID will default to 9999 if you don't specify anything for it
ProductNumber is the special value you were after calculated from two individual columns
Now create a new product and see what happens
INSERT INTO dbo.Products(ProductName)
VALUES ('Brown Shoes')
SELECT * FROM dbo.Products
You can see Product Number 9999-0001
Add some more and note that the product code increments. It is not random. Carefully consider if you actually really need this to be random.
Now set the actual product category:
UPDATE dbo.Products
SET ProductCategoryID = 7 WHERE ProductID = 1
SELECT * FROM dbo.Products
and note that the product number updates.
It is important to note that the real product id is just ProductID. The ProductNumber column is only there to satisfy your requirements.

selecting random rows based on weight on another row

I need to select random rows from a table based on a weight in another column. For example, if the user enters 50, I need to select 50 random rows from the table, with rows that have a higher weight being returned more often. I have seen NEWID() used to select n random rows, and this link
Random Weighted Choice in T-SQL
shows how to select one row based on the weight in another column, but I need to select several rows based on the number the user enters. Is the best way to use the suggested answer from the link above and loop over it n times (though I think it would return the same row), or is there an easier solution?
MY table is like this
ID Name Freq
1 aaa 50
2 bbb 30
3 ccc 10
So when the user enters 50 I need to return 50 random names, with more aaa and bbb than ccc. It might be something like 25 aaa, 15 bbb and 10 ccc. Anything close to this will work too. I saw the answer below, but when I execute it against my DB it runs for 5 minutes and still returns no results.
SQL : select one row randomly, but taking into account a weight
I think the difficult part here is getting any individual row to potentially appear more than once. I'd look into doing something like the following:
1) Build a temp table, duplicating records according to their frequency (I'm sure there's a better way of doing this, but the first answer that came to my mind was a simple while loop... This particular one really only works if the frequency values are integers)
create table #dup
(
id int,
nm varchar(10)
)
declare @curr int, @maxFreq int
select @curr=0, @maxFreq=max(freq)
from tbl
while @curr < @maxFreq
begin
insert into #dup
select id, nm
from tbl
where freq > @curr
set @curr = @curr+1
end
2) Select your top records, ordered by a random value
select top 10 *
from #dup
order by newID()
3) Cleanup
drop table #dup
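The same idea can be sketched outside the database. The following is a minimal Python sketch, not T-SQL, that assumes the three rows from the question have been loaded into a list (the `rows` list and the `weighted_sample` helper are hypothetical names). Note one difference: it samples with replacement, so a name can be drawn any number of times, whereas selecting the top rows from the duplicated temp table samples without replacement.

```python
import random
from collections import Counter

# Hypothetical in-memory copy of the (ID, Name, Freq) table from the question.
rows = [(1, "aaa", 50), (2, "bbb", 30), (3, "ccc", 10)]

def weighted_sample(rows, n, rng=random):
    """Draw n names with replacement, each with probability proportional to Freq."""
    names = [name for _id, name, _freq in rows]
    weights = [freq for _id, _name, freq in rows]
    return rng.choices(names, weights=weights, k=n)

picked = weighted_sample(rows, 50, random.Random(42))
counts = Counter(picked)
print(counts)  # names with a higher Freq tend to appear more often
```

With weights 50, 30 and 10, the expected split of 50 draws is roughly 28 aaa, 17 bbb and 6 ccc, but any single run will vary.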
Maybe could you try something like the following:
ORDER BY Freq * rand()
in your sql? So columns with a higher Freq value should in theory get returned more often than those with a lower Freq value. It seems a bit hackish but it might work!

Creating a unique id (PIN) for each record of a table

I want to create a PIN that is unique within a table but not incremental to make it harder for people to guess.
Ideally I'd like to be able to create this within SQL Server but I can do it via ASP.Net if needed.
EDIT
Sorry if I wasn't clear: I'm not looking for a GUID as all I need is a unique id for that table; I just don't want it to be incremental.
Add a uniqueidentifier column to your table, with a default value of NEWID(). This will ensure that each row gets a new unique identifier, which is not incremental.
CREATE TABLE MyTable (
...
PIN uniqueidentifier NOT NULL DEFAULT newid()
...
)
The uniqueidentifier is guaranteed to be unique, not just for this table, but for all tables.
If it's too large for your application, you can derive a smaller PIN from this value, like this:
SELECT RIGHT(REPLACE((SELECT PIN from MyTable WHERE UserID=...), '-', ''), 4/*PinLength*/)
Note that the returned smaller PIN is not guaranteed to be unique for all users, but may be more manageable, depending upon your application.
EDIT: If you want a small PIN with guaranteed uniqueness, the tricky part is that you need to know at least the maximum number of users in order to choose an appropriate size for the PIN. As the number of users increases, the chance of a PIN collision increases. This is similar to the Coupon Collector's problem, and approaches n log n complexity, which will cause very slow inserts (insert time proportional to the number of existing elements, so inserting N items then becomes O(N^2)). The simplest way to avoid this is to use a large unique ID and select only a portion of it for your PIN, assuming that you can forgo uniqueness of PIN values.
EDIT2:
If you have a table definition like this
CREATE TABLE YourTable (
[id] [int] IDENTITY(1,1) NOT NULL,
[pin] AS (CONVERT(varchar(9),id,0)+RIGHT(pinseed,3)) PERSISTED,
[pinseed] [uniqueidentifier] NOT NULL
)
This will create the pin from the pinseed (a unique ID) and the row id. (RAND does not work here, since SQL Server will use the same value to initialize multiple rows; this is not the case with NEWID().)
Just so that it is said, I advise that you do not consider this in any way secure. You should consider it always possible that another user could guess someone else's PIN, unless you somehow limit the number of allowed guesses (e.g. stop accepting requests after 3 attempts, similar to a bank withholding your card after 3 incorrect PIN entries).
What you want is a GUID
http://en.wikipedia.org/wiki/Globally_unique_identifier
Most languages have some sort of API for generating this... a google search will help ;)
How about a UNIQUEIDENTIFIER type column with a default value of NEWID()?
That will generate a new GUID for each row.
Please keep in mind that by requiring a unique PIN (which is uncommon) you will be limiting the maximum number of users to what the PIN specification allows. Are you sure you want this?
A not very elegant solution, but one which works, is to use a UNIQUE field and then loop, attempting to insert a randomly generated PIN until the insert is successful.
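The retry loop described above can be sketched as follows. This is a minimal Python illustration using an in-memory SQLite database rather than SQL Server; the `users` table, the `insert_with_random_pin` helper, and the attempt limit are all assumptions made for the example. The important point is that the UNIQUE constraint, not a prior SELECT, decides whether a PIN is free.

```python
import random
import sqlite3

def insert_with_random_pin(conn, name, rng, digits=4, max_attempts=100):
    """Insert a row with a random PIN, retrying when the UNIQUE constraint rejects it."""
    for _ in range(max_attempts):
        pin = f"{rng.randrange(10 ** digits):0{digits}d}"
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO users (name, pin) VALUES (?, ?)", (name, pin)
                )
            return pin
        except sqlite3.IntegrityError:
            continue  # PIN already taken; generate another one
    raise RuntimeError("no free PIN found; the PIN space may be nearly full")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT NOT NULL, pin TEXT NOT NULL UNIQUE)")
rng = random.Random(0)
pins = [insert_with_random_pin(conn, f"user{i}", rng) for i in range(200)]
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Letting the insert fail and retrying avoids the race between checking for a value and inserting it, but the number of retries grows as the PIN space fills up, which is why the attempt limit is there.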
You can use the following to generate a BIGINT, or other datatype.
SELECT CAST(ABS(CHECKSUM(NEWID()))%2000000000+1 as BIGINT) as [PIN]
This creates a number between 1 and 2 billion. You will simulate some level of randomness since it's derived from the NEWID function. You can also format the result as you wish.
This doesn't guarantee uniqueness. I suggest that you use a unique constraint on the PIN column. And, your code that creates the new PIN should check that the new value is unique before it assigns the value.
Use a random number.
SET @uid = ROUND(RAND() * 100000, 0)
The more sparse your values are in the table, the better this works. If the number of assigned values gets large in relation to the number of available values, it does not work as well.
Once the number is generated you have a couple of options.
1) INSERT the value inside of a retry loop. If you get a dupe error, regenerate the value (or try the value +/-1) and try again.
2) Generate the value and look for the MAX and MIN existing unique identifiers.
DECLARE
@uid INTEGER
SET @uid = ROUND(RAND() * 10000, 1)
SELECT @uid
SELECT MAX(uid) FROM table1 WHERE uid < @uid
SELECT MIN(uid) FROM table1 WHERE uid > @uid
The MIN and MAX value give you a range of available values to work from if the random value is already assigned.

Sql Server Column with Auto-Generated Data

I have a customer table, and my requirement is to add a new varchar column that automatically obtains a random unique value each time a new customer is created.
I thought of writing an SP that randomizes a string, then check and re-generate if the string already exists. But to integrate the SP into the customer record creation process would require transactional SQL stuff at code level, which I'd like to avoid.
Help please?
edit:
I should've emphasized, the varchar has to be 5 characters long with numeric values between 1000 and 99999, and if the number is less than 10000, pad 0 on the left.
if it has to be varchar, you can cast a uniqueidentifier to varchar.
to get a random uniqueidentifier do NewId()
here's how you cast it:
CAST(NewId() as varchar(36))
EDIT
as per your comment to @Brannon:
Are you saying you'll NEVER have over 99k records in the table? If so, just make your PK an identity column, seed it with 1000, and take care of the "0" left padding in your business logic.
This question gives me the same feeling I get when users won't tell me what they want done, or why, they only want to tell me how to do it.
"Random" and "Unique" are conflicting requirements unless you create a serial list and then choose randomly from it, deleting the chosen value.
But what's the problem this is intended to solve?
With your edit/update, sounds like what you need is an auto-increment and some padding.
Below is an approach that uses a bogus table, then adds an IDENTITY column (assuming that you don't have one) which starts at 1000, and then which uses a Computed Column to give you some padding to make everything work out as you requested.
CREATE TABLE Customers (
CustomerName varchar(20) NOT NULL
)
GO
INSERT INTO Customers
SELECT 'Bob Thomas' UNION
SELECT 'Dave Winchel' UNION
SELECT 'Nancy Davolio' UNION
SELECT 'Saded Khan'
GO
ALTER TABLE Customers
ADD CustomerId int IDENTITY(1000,1) NOT NULL
GO
ALTER TABLE Customers
ADD SuperId AS right(replicate('0',5)+ CAST(CustomerId as varchar(5)),5)
GO
SELECT * FROM Customers
GO
DROP TABLE Customers
GO
I think Michael's answer with the auto-increment should work well - your customer will get "01000" and then "01001" and then "01002" and so forth.
If you want to or have to make it more random, in this case, I'd suggest you create a table that contains all possible values, from "01000" through "99999". When you insert a new customer, use a technique (e.g. randomization) to pick one of the existing rows from that table (your pool of still available customer ID's), and use it, and remove it from the table.
Anything else will become really bad over time. Imagine you've used up 90% or 95% of your available customer ID's - trying to randomly find one of the few remaining possibility could lead to an almost endless retry of "is this one taken? Yes -> try a next one".
Marc
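The pool idea from this answer can be sketched in a few lines. This is a hedged Python illustration rather than a database implementation: the `IdPool` class and its method names are hypothetical, and in a real system the pool would be a table from which a row is picked and deleted inside one transaction.

```python
import random

class IdPool:
    """Pre-generated pool of zero-padded 5-character IDs, handed out in random order."""

    def __init__(self, low=1000, high=99999, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self._free = [f"{n:05d}" for n in range(low, high + 1)]
        self._rng.shuffle(self._free)  # shuffle once, so each take is O(1)

    def take(self):
        """Remove and return one unused ID; it can never be handed out again."""
        if not self._free:
            raise RuntimeError("ID pool exhausted")
        return self._free.pop()

    def remaining(self):
        return len(self._free)

pool = IdPool(rng=random.Random(1))
first = pool.take()
second = pool.take()
print(first, second, pool.remaining())
```

Because every ID is removed when it is used, picking the last free ID costs the same as picking the first one, which is what avoids the endless retry described above.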
Does the random string data need to be a certain format? If not, why not use a uniqueidentifier?
insert into Customer ([Name], [UniqueValue]) values (@Name, NEWID())
Or use NEWID() as the default value of the column.
EDIT:
I agree with @rm, use a numeric value in your database, and handle the conversion to string (with padding, etc) in code.
Try this:
ALTER TABLE Customer ADD AVarcharColumn varchar(50)
CONSTRAINT DF_Customer_AVarcharColumn DEFAULT CONVERT(varchar(50), GETDATE(), 109)
It returns a date and time with millisecond precision, which would be enough in most cases.
Do you really need a unique value?

Random record from a database table (T-SQL)

Is there a succinct way to retrieve a random record from a sql server table?
I would like to randomize my unit test data, so am looking for a simple way to select a random id from a table. In English, the select would be "Select one id from the table where the id is a random number between the lowest id in the table and the highest id in the table."
I can't figure out a way to do it without having to run the query, test for a null value, then re-run if null.
Ideas?
Is there a succinct way to retrieve a random record from a sql server table?
Yes
SELECT TOP 1 * FROM table ORDER BY NEWID()
Explanation
A NEWID() is generated for each row and the table is then sorted by it. The first record is returned (i.e. the record with the "lowest" GUID).
Notes
GUIDs are generated as pseudo-random numbers since version four:
The version 4 UUID is meant for generating UUIDs from truly-random or
pseudo-random numbers.
The algorithm is as follows:
Set the two most significant bits (bits 6 and 7) of the
clock_seq_hi_and_reserved to zero and one, respectively.
Set the four most significant bits (bits 12 through 15) of the
time_hi_and_version field to the 4-bit version number from
Section 4.1.3.
Set all the other bits to randomly (or pseudo-randomly) chosen
values.
—A Universally Unique IDentifier (UUID) URN Namespace - RFC 4122
The alternative SELECT TOP 1 * FROM table ORDER BY RAND() will not work as one would think. RAND() returns one single value per query, thus all rows will share the same value.
While GUID values are pseudo-random, you will need a better PRNG for the more demanding applications.
Typical performance is less than 10 seconds for around 1,000,000 rows — of course depending on the system. Note that it's impossible to hit an index, thus performance will be relatively limited.
On larger tables you can also use TABLESAMPLE for this to avoid scanning the whole table.
SELECT TOP 1 *
FROM YourTable
TABLESAMPLE (1000 ROWS)
ORDER BY NEWID()
The ORDER BY NEWID is still required to avoid just returning rows that appear first on the data page.
The number to use needs to be chosen carefully for the size and definition of table and you might consider retry logic if no row is returned. The maths behind this and why the technique is not suited to small tables is discussed here
Also try your method to get a random Id between MIN(Id) and MAX(Id) and then
SELECT TOP 1 * FROM table WHERE Id >= @yourrandomid ORDER BY Id
It will always get you one row.
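One caveat with this approach is that it is not uniform when the ids have gaps: the first id after a gap is returned for every random value that falls inside the gap. The following Python sketch illustrates this with a hypothetical list of ids (`pick_row_id` is an illustrative helper that mimics "first row with Id >= random value").

```python
import bisect
import random

def pick_row_id(sorted_ids, rng=random):
    """Pick a random value in [min, max] and return the first existing id >= it."""
    target = rng.randint(sorted_ids[0], sorted_ids[-1])
    return sorted_ids[bisect.bisect_left(sorted_ids, target)]

ids = [1, 2, 3, 10]  # hypothetical ids with a gap between 3 and 10
rng = random.Random(7)
counts = {i: 0 for i in ids}
for _ in range(10_000):
    counts[pick_row_id(ids, rng)] += 1
print(counts)  # id 10 absorbs every target from 4 to 10, so it dominates
```

For unit test data this bias is often acceptable; if every row must be equally likely, ordering by NEWID() avoids it at the cost of a full scan.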
If you want to select large data the best way that I know is:
SELECT * FROM Table1
WHERE (ABS(CAST(
(BINARY_CHECKSUM
(keycol1, NEWID())) as int))
% 100) < 10
Source: MSDN
I was looking to improve on the methods I had tried and came across this post. I realize it's old but this method is not listed. I am creating and applying test data; this shows the method for "address" in a SP called with @st (two char state)
Create Table ##TmpAddress (id Int Identity(1,1), street VarChar(50), city VarChar(50), st VarChar(2), zip VarChar(5))
Insert Into ##TmpAddress(street, city, st, zip)
Select street, city, st, zip
From tbl_Address (NOLOCK)
Where st = @st
-- unseeded RAND() will return the same number when called in rapid succession so
-- here, I seed it with a guaranteed different number each time. @@ROWCOUNT is the count from the most recent table operation.
Set @csr = Ceiling(RAND(convert(varbinary, newid())) * @@ROWCOUNT)
Select street, city, st, Right(('00000' + ltrim(zip)),5) As zip
From ##tmpAddress (NOLOCK)
Where id = @csr
If you really want a random sample of individual rows, modify your query to filter out rows randomly, instead of using TABLESAMPLE. For example, the following query uses the NEWID function to return approximately one percent of the rows of the Sales.SalesOrderDetail table:
SELECT * FROM Sales.SalesOrderDetail
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
/ CAST (0x7fffffff AS int)
The SalesOrderID column is included in the CHECKSUM expression so that
NEWID() evaluates once per row to achieve sampling on a per-row basis.
The expression CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS
float) / CAST (0x7fffffff AS int) evaluates to a random float value
between 0 and 1.
Source: http://technet.microsoft.com/en-us/library/ms189108(v=sql.105).aspx
This is further explained below:
How does this work? Let's split out the WHERE clause and explain it.
The CHECKSUM function is calculating a checksum over the items in the
list. It is arguable over whether SalesOrderID is even required, since
NEWID() is a function that returns a new random GUID, so multiplying a
random figure by a constant should result in a random in any case.
Indeed, excluding SalesOrderID seems to make no difference. If you are
a keen statistician and can justify the inclusion of this, please use
the comments section below and let me know why I'm wrong!
The CHECKSUM function returns a VARBINARY. Performing a bitwise AND
operation with 0x7fffffff, which is the equivalent of (111111111...)
in binary, yields a decimal value that is effectively a representation
of a random string of 0s and 1s. Dividing by the co-efficient
0x7fffffff effectively normalizes this decimal figure to a figure
between 0 and 1. Then to decide whether each row merits inclusion in
the final result set, a threshold of 1/x is used (in this case, 0.01)
where x is the percentage of the data to retrieve as a sample.
Source: https://www.mssqltips.com/sqlservertip/3157/different-ways-to-get-random-data-for-sql-server-data-sampling
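The per-row filter quoted above amounts to keeping each row independently with a fixed probability. A minimal Python sketch of that idea follows; `sample_rows` and the row list are hypothetical stand-ins for the table and the WHERE clause, and Python's `random.random()` plays the role of the normalized checksum.

```python
import random

def sample_rows(rows, fraction, rng=random):
    """Keep each row independently with probability `fraction`."""
    return [row for row in rows if rng.random() <= fraction]

rows = list(range(100_000))  # stand-in for the table's rows
sample = sample_rows(rows, 0.01, random.Random(3))
print(len(sample))  # close to 1,000, but not exactly
```

Because every row is decided independently, the sample size is only approximately the requested fraction, which matches the "approximately one percent" wording in the quoted documentation.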

Resources