Checking existence of dynamic value before update/insert - sql-server

I am trying to mass update a table column with values, but the query needs to check whether each value already exists. If it does, it should adjust the value, check again, and only then update the table.
The database primarily holds staff information and I need to create a unique username. The script to create the username is:
select upper(LEFT(first_name,1))+LEFT(surname,3)+'1'
from staff_test
If this was used for an example user it would generate a username of ABit1 for user Andrew Bithell. What I need it to do is check whether there is already an ABit1 username in the STAFF_TEST table and, if so, change Andrew's username to ABit2 before it moves on to the next user, as the usernames have to be unique.
I have created another table which lists all the current usernames splitting the existing usernames into 2 columns, so they display in this table as
column1 | column2
------------------
ABit |1
I have experimented with a function and I am now thinking a Merge statement might be the way to go.
Any suggestions are welcomed.

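Using row_number you can generate all the unique names at once:
select
upper(LEFT(first_name,1))+LEFT(surname,3)+
-- row_number() requires an ORDER BY inside OVER(); the cast makes the number a string
cast(row_number() over (partition by upper(LEFT(first_name,1))+LEFT(surname,3) order by surname, first_name) as varchar(10))
,first_name
,surname
from staff_test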
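To write the generated names back into the table, one option is an updatable CTE. The following is a minimal sketch, assuming staff_test has a username column to fill and a unique staff_id column that gives ROW_NUMBER() a stable order; both column names are hypothetical and not taken from the question. It renumbers every row, so it does not preserve usernames that already exist.

```sql
-- Assumption: staff_test has columns username (to be filled) and
-- staff_id (unique per row). Both names are hypothetical.
WITH numbered AS
(
    SELECT username,
           UPPER(LEFT(first_name, 1)) + LEFT(surname, 3)
           + CAST(ROW_NUMBER() OVER (
                      PARTITION BY UPPER(LEFT(first_name, 1)) + LEFT(surname, 3)
                      ORDER BY staff_id) AS VARCHAR(10)) AS new_username
    FROM staff_test
)
-- Updating the CTE updates the underlying staff_test rows.
UPDATE numbered
SET username = new_username;
```

Because every row in the same prefix group receives a different row number, the resulting usernames are unique within the table.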

Perform an up front check to see if there are any clashes:
SELECT UPPER(LEFT(first_name, 1)) + LEFT(surname, 3) + '1' AS username ,
COUNT(1) counter
FROM staff_test
GROUP BY UPPER(LEFT(first_name, 1)) + LEFT(surname, 3) + '1'
HAVING COUNT(1) > 1
ORDER BY COUNT(1) DESC
This returns only the generated usernames that occur more than once, along with a count of how many times each occurs.
You can either sanitize the data, if that's what you're looking to do, or, as I would suggest, append an Id column value or some other per-record unique value instead of 1 on the end.
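The suggestion to append a per-record value could look roughly like this. It is only a sketch: staff_id is a hypothetical name for whatever unique key staff_test actually has.

```sql
-- Assumption: staff_id is a unique integer key on staff_test (hypothetical name).
SELECT UPPER(LEFT(first_name, 1)) + LEFT(surname, 3)
       + CAST(staff_id AS VARCHAR(10)) AS username,
       first_name,
       surname
FROM staff_test;
```

The trade-off is that the suffix is no longer a small per-prefix counter, so Andrew Bithell might become ABit1042 rather than ABit2.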

Related

How to get rid of duplicates with T-SQL

Hi, I have a login table that has some duplicated usernames.
Yes I know I should have put a constraint on it, but it's a bit too late for that now!
So essentially what I want to do is to first identify the duplicates. I can't just delete them since I can't be too sure which account is the correct one. The accounts have the same username and both of them have roughly the same information with a few small variances.
Is there any way to efficiently script it so that I can add "_duplicate" to only one of the accounts per duplicate?
You can use ROW_NUMBER with a PARTITION BY in the OVER() clause to find the duplicates and an updateable CTE to change the values accordingly:
DECLARE @dummyTable TABLE(ID INT IDENTITY, UserName VARCHAR(100));
INSERT INTO @dummyTable VALUES('Peter'),('Tom'),('Jane'),('Victoria')
,('Peter') ,('Jane')
,('Peter');
WITH UpdateableCTE AS
(
SELECT t.UserName AS OldValue
,t.UserName + CASE WHEN ROW_NUMBER() OVER(PARTITION BY UserName ORDER BY ID)=1 THEN '' ELSE '_duplicate' END AS NewValue
FROM @dummyTable AS t
)
UPDATE UpdateableCTE SET OldValue = NewValue;
SELECT * FROM @dummyTable;
The result
ID UserName
1 Peter
2 Tom
3 Jane
4 Victoria
5 Peter_duplicate
6 Jane_duplicate
7 Peter_duplicate
You might include ROW_NUMBER() as another column to find each duplicate's ordinal. If you've got a sort clause that numbers the earliest (or most current) row with 1, it should be easy to find and correct the duplicates.
Once you've cleaned this mess, you should ensure not to get new dups. But you know this already :-D
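Including ROW_NUMBER() as its own column, as suggested above, might look like the sketch below. The table variable @logins and the alias DuplicateOrdinal are made up for illustration.

```sql
-- Hypothetical sample data, in the same shape as the example above.
DECLARE @logins TABLE (ID INT IDENTITY, UserName VARCHAR(100));
INSERT INTO @logins VALUES ('Peter'), ('Tom'), ('Peter'), ('Jane'), ('Peter');

-- DuplicateOrdinal is 1 for the earliest row of each name (lowest ID)
-- and 2, 3, ... for its later duplicates.
SELECT ID,
       UserName,
       ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY ID) AS DuplicateOrdinal
FROM @logins
ORDER BY UserName, ID;
```

Rows with DuplicateOrdinal greater than 1 are the candidates to rename or review.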
There is no easy way to get rid of this nightmare. Some manual action is required.
First identify duplicates.
select * from dbo.users
where username in
(select username from dbo.users
group by username
having count(*) > 1)
Next identify "useless" users (for example those who registered but never placed an order).
Rerun the query above. From this list, find duplicates that are the same person (matched by email, for example) and combine them into a single record. If they did something useful previously (for example, placed orders), first reassign those orders to the user that survives. Remove the others.
Continue with other criteria until you get rid of the duplicates.
Then set a unique constraint on the username field. It is also a good idea to set a unique constraint on the email field.
Again, it is not easy and not automatic.
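Once the duplicates are gone, the unique constraint mentioned above could be added along these lines. The constraint name is made up, and the statement fails if any duplicate usernames remain.

```sql
-- Assumption: the table is dbo.users with a username column, as in the query above.
-- UQ_users_username is a hypothetical constraint name.
ALTER TABLE dbo.users
ADD CONSTRAINT UQ_users_username UNIQUE (username);
```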
In this case, where the duplicates and the original rows differ slightly, it is practically impossible to select the non-duplicate rows automatically, since you do not know which row is the real one and which is the duplicate.
I think the best thing to do is to correct your data and then fix the source that is producing these slightly varying duplicates.

SQL SERVER - Retrieve Last Entered Data

I've searched for a long time for a way to get the last entered data in a table, but I always get the same answer.
SELECT TOP 1 CustomerName FROM Customers
ORDER BY CustomerID DESC;
My scenario is: how do I get the last data if the Customers table has a CustomerName column only, with no other columns such as ID or createdDate? I entered four names in the following order.
James
Arun
Suresh
Bryen
Now I want to select the last entered CustomerName, i.e., Bryen. How can I get it?
If the table is not designed with a column that records insertion order (IDENTITY, TIMESTAMP, an identifier generated using a SEQUENCE, etc.), INSERT order is not kept by SQL Server. So the "last" record is meaningless without some criterion to use for ordering.
One possible workaround applies if, by chance, records in this table are linked to records in some other table (FKs, 1:1 or 1:n connection) and that table has a timestamp or something similar from which you can deduce the insertion order.
More details about "ordering without criteria" can be found here and here.
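For rows inserted from now on, adding an ordering column makes "last" well defined. This is a sketch under the assumption that altering the Customers table is acceptable; the numbers assigned to rows that already exist do not reflect their original insertion order.

```sql
-- Assumption: the Customers table may be altered.
-- Existing rows are numbered in an unspecified order;
-- only rows inserted afterwards are numbered in insertion order.
ALTER TABLE Customers
ADD CustomerID INT IDENTITY(1,1) NOT NULL;
GO  -- batch separator used by SSMS and sqlcmd, not a T-SQL statement

-- Latest row among those inserted after the column was added.
SELECT TOP 1 CustomerName
FROM Customers
ORDER BY CustomerID DESC;
```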
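-- Caution: ordering by a constant gives no guaranteed order, so row 4 is
-- not necessarily the last inserted row; replace tablename and 4 as needed.
; with cte_new as (
select *,row_number() over(order by(select 1000)) as new from tablename
)
select * from cte_new where new=4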

SQL Server : autoincrement fields separately

I have a table with two columns, where none of the columns is unique. I need to auto increment the column number separately for each user.
user | number
1 | 1
2 | 1
1 | 2
3 | 1
The only idea I could come up with is to search for the last number used and manually increment by one. Is there a more efficient way?
Instead of the number field, you can create an auto-increment field in the table (I call it id) and get the desired number via a query.
First, add id:
alter table table_name add id int not null IDENTITY(1,1)
You do not need the number field anymore:
alter table table_name drop column number
The query to get number (you can use it to create a view); user is a reserved word in SQL Server, so it is bracketed:
select [user],
row_number() over(partition by [user] order by id) as number
from table_name
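Wrapping that query in a view could look like the sketch below. The view name is hypothetical, and it assumes the id column has already been added as described.

```sql
-- Assumption: table_name lives in the dbo schema and already has the id column.
-- dbo.user_numbers is a hypothetical view name.
CREATE VIEW dbo.user_numbers
AS
SELECT [user],
       id,
       ROW_NUMBER() OVER (PARTITION BY [user] ORDER BY id) AS number
FROM dbo.table_name;
```

Callers can then select [user] and number from the view as if number were stored. The numbers are recomputed on every read, so deleting a row renumbers that user's later rows.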
Search for the user's current maximum and increment it (@user_id is the id of the user being inserted):
INSERT INTO YOUR_TABLE ([user], number)
SELECT @user_id, COALESCE(MAX(number), 0) + 1
FROM YOUR_TABLE
WHERE [user] = @user_id
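If several sessions can insert for the same user at once, the read of the maximum and the insert need to be protected, or two sessions may compute the same number. One common approach, sketched here as an assumption rather than a tested recipe for this table, is to take an update lock on the range being read.

```sql
-- Assumption: @user_id is supplied by the caller; 1 is just a sample value.
DECLARE @user_id INT = 1;

BEGIN TRANSACTION;

-- UPDLOCK + HOLDLOCK keep other sessions from reading the same maximum
-- for this user until the transaction commits.
INSERT INTO YOUR_TABLE ([user], number)
SELECT @user_id, COALESCE(MAX(number), 0) + 1
FROM YOUR_TABLE WITH (UPDLOCK, HOLDLOCK)
WHERE [user] = @user_id;

COMMIT TRANSACTION;
```

An index on ([user], number) helps keep the locked range narrow.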
The term auto-increment covers only primary key auto-increment. See this page for more information.
You can do it in two queries, where @userno is the integer id of the user you want to insert.
First run query 1, which covers the case where @userno is already in the table:
insert into table_name
select @userno, count(*) + 1 from table_name where [user] = @userno
group by [user];
If it inserts no rows, @userno is a new user, so run query 2:
insert into table_name values(@userno, 1);

Show records where most recent 'x' records meet criteria

Here's a simplified SQLFiddle example of data
Basically, I'm looking to identify users in a login audit table whose most recent 'x' records (let's say 3, for this example) are all failed logins.
I am able to get this data for individual users by doing a SELECT TOP 3 and ordering by the log date in descending order and evaluating those records, but I know there's got to be a better way to do this.
I have tried a few queries using ROW_NUMBER(), partitioning by UserName and Success and ordering by LogDate, but I can't quite get it to do what I want. Essentially, every time a successful login occurs, I want the failed login counter to be reset.
Try this code:
select * from (
select distinct a.UserName,
(select sum(cast(Success as int)) from (
SELECT TOP 3 Success --- here 3, change it to your number
FROM tbl as b
WHERE b.UserName=a.UserName
ORDER BY LogDate DESC
) as q
having count(*) >= 3 --- this line removes users who made fewer than 3 attempts
) as cnts
from tbl as a
) as q2
where q2.cnts=0
It shows users whose last 3 attempts all failed. With small modifications, you can use this approach to count how many successful or failed attempts were made during the last N rows.
NOTE: this query works, but it is not optimal. from tbl as a should be changed to a table where only users are stored, so you can get rid of distinct. Also, store the user's ID instead of the username in tbl.
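The same result can also be expressed with ROW_NUMBER(), which the question mentions trying. This is an alternative sketch, not the answer's method, and it assumes Success is a bit column where 0 means a failed login.

```sql
-- Assumption: tbl has UserName, Success (bit, 0 = failed) and LogDate columns.
WITH ranked AS
(
    SELECT UserName,
           Success,
           ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY LogDate DESC) AS rn
    FROM tbl
)
SELECT UserName
FROM ranked
WHERE rn <= 3                              -- keep each user's 3 most recent attempts
GROUP BY UserName
HAVING COUNT(*) = 3                        -- skip users with fewer than 3 attempts
   AND SUM(CAST(Success AS INT)) = 0;      -- none of the 3 succeeded
```

Partitioning by UserName only, rather than by UserName and Success, is what lets a successful login break the run of failures.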

How can I assign a number to each row in a table representing the record number?

How can I show the number of rows in a table in a way that when a new record is added the number representing the row goes higher and when a record is deleted the number gets updated accordingly?
To be clearer, suppose I have a simple table like this:
ID int (primary key) Name varchar(5)
The ID is set to get incremented by itself (using identity specification) so it can't represent the number of row(record) since if I have for example 3 records as:
ID NAME
1 Alex
2 Scott
3 Sara
and I delete Alex and Scott and add a new record it will be:
3 Sara
4 Mina
So basically I'm looking for a sql-side solution for doing this so that I don't change anything else in the source code in multiple places.
I tried to write something to get the job done, but it fails. Here it is:
SELECT COUNT(*) AS [row number],Name
FROM dbo.Test
GROUP BY ID, Name
HAVING (ID = ID)
This shows as:
row number Name
1 Alex
1 Scott
1 Sara
while I want it to get shown as:
row number Name
1 Alex
2 Scott
3 Sara
If you just want the number against the rows while selecting the data and not in the database then you can use this
select row_number() over(order by id) from dbo.Test
This will give the row number n for nth row.
Try
SELECT id, name, ROW_NUMBER() OVER (ORDER BY id) AS RowNumber
FROM MyTable
What you want is called an auto increment.
For SQL-Server this is achieved by adding the IDENTITY(1,1) attribute to the table definition.
Other RDBMS use a different syntax. Firebird for example has generators, which do the counting. In a BEFORE-INSERT trigger you would assign the ID-field to the current value of the generator (which will be increased automatically).
I had this exact problem a while ago, but I was using SQL Server 2000, where ROW_NUMBER(), although the best solution, isn't available. A workaround is to create a temporary table, insert all the values into it with an auto-increment column, and replace the current table with the new table in T-SQL.
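The temporary-table workaround described above might be sketched as follows. It assumes the dbo.Test table from the question; #numbered and RowNumber are made-up names.

```sql
-- Assumption: dbo.Test(ID, Name) as in the question.
-- #numbered and RowNumber are hypothetical names.
CREATE TABLE #numbered
(
    RowNumber INT IDENTITY(1,1),
    Name      VARCHAR(5)
);

-- With INSERT ... SELECT ... ORDER BY, identity values are assigned
-- in the order given by the ORDER BY clause.
INSERT INTO #numbered (Name)
SELECT Name
FROM dbo.Test
ORDER BY ID;

SELECT RowNumber, Name
FROM #numbered
ORDER BY RowNumber;

DROP TABLE #numbered;
```

The numbering is a snapshot, so the temporary table has to be rebuilt whenever rows are added or deleted.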

Resources