Handling duplicate values in update query with UQ column

I am migrating user information from a few source databases and generating usernames in the destination for those users. Usernames are built in one of two ways: 1. If the user does not belong to a 'usergroup', the username is lastname+runningnumber (for duplicates). 2. If they belong to a usergroup, the username is usergroup+runningnumber. The running number should not be 'universal' across all users, but limited to duplicates, e.g.
User1,User2,User3,UserGroup1,UserGroup2,UserGroup3
It was suggested to me that this would be simpler to achieve outside of SSIS.
When the query runs, duplicate values might already exist in the destination database, or they might be generated by the query itself.
This is my current query (not working properly), and I need some help in approaching this issue:
SET XACT_ABORT OFF ;
BEGIN
DECLARE @Any_error int;
WITH Kep as (SELECT Id, CASE WHEN UserGroup IS NULL THEN CAST(LEFT(LOWER(REPLACE(LastName, ' ','')), 10) +'-'+
RIGHT(CAST(ROW_NUMBER() OVER(PARTITION BY LEFT(LOWER(REPLACE(LastName, ' ','')), 10) ORDER BY LastName) as nvarchar(50)),4) AS nvarchar(50)) ELSE CAST(LEFT(LOWER(REPLACE(UserGroup, ' ','')), 10) +'-'+
RIGHT(+CAST(ROW_NUMBER() OVER(PARTITION BY LEFT(LOWER(REPLACE(UserGroup, ' ','')), 10) ORDER BY UserGroup) as nvarchar(50)),4) AS nvarchar(50)) END AS rn
FROM Users)
UPDATE TOP(1000) Users
SET UserName = rn
SELECT @Any_error = @@ERROR
IF @Any_error = 2627 GOTO ErrorHandler
ErrorHandler:
DECLARE @RunningNumber INT
SET @RunningNumber = 1
Loop:
BEGIN
WHILE (@Any_error = 2627)
SET @RunningNumber = @RunningNumber+1;
WITH Kep as (SELECT Id, CASE WHEN UserGroup IS NULL THEN CAST(LEFT(LOWER(REPLACE(LastName, ' ','')), 10) +'-'+
RIGHT(+CAST(ROW_NUMBER() OVER(PARTITION BY LEFT(LOWER(REPLACE(LastName, ' ','')), 10) ORDER BY LastName)+@RunningNumber as nvarchar(50)),4) AS nvarchar(50)) ELSE CAST(LEFT(LOWER(REPLACE(UserGroup, ' ','')), 10) +'-'+
RIGHT(+CAST(ROW_NUMBER() OVER(PARTITION BY LEFT(LOWER(REPLACE(UserGroup, ' ','')), 10) ORDER BY UserGroup)+@RunningNumber as nvarchar(50)),4) AS nvarchar(50)) END AS rn
FROM Users)
UPDATE Users
SET UserName = rn
SELECT @Any_error = @@ERROR
IF (@Any_error = 2627) GOTO Loop;
END
END
Any advice / help is highly appreciated!
EDIT: Data
Source
LastName  UserGroup
Smith     Sales
Smith     Sales
Smith     NULL
Smith     NULL
Johnson   Development
Johnson   NULL
Destination
LastName  UserGroup    Username
Smith     Sales        sales-1
Smith     Sales        sales-2
Smith     NULL         smith-1
Smith     NULL         smith-2
Johnson   Development  development-1
Johnson   NULL         johnson-1

Yikes, your query looks scary. Try something like this:
My Versions of Your Tables
DECLARE @yourTable TABLE (LastName VARCHAR(15),UserGroup VARCHAR(15));
INSERT INTO @yourTable
VALUES ('Smith','Sales'),
('Smith','Sales'),
('Brown','Sales'), --added this row
('Smith',NULL),
('Smith',NULL),
('Johnson','Development'),
('Johnson',NULL);
DECLARE @destinationTable TABLE (LastName VARCHAR(15),UserGroup VARCHAR(15),UserName VARCHAR(15))
INSERT INTO @destinationTable
VALUES ('Smith',NULL,'Smith-1'),
('Stevens','Sales','Sales-1'),
('Stevens','Sales','Sales-2'),
('Lopez','Development','Development-1');
Actual Query
--INSERT INTO @destinationTable
SELECT LastName,
UserGroup,
COALESCE(UserGroup,LastName)
+ '-'
+ CAST(ROW_NUMBER() OVER (PARTITION BY COALESCE(UserGroup,LastName) ORDER BY (SELECT NULL)) + COALESCE(max_num,0) AS VARCHAR(10))
AS UserName
FROM @yourTable AS A
CROSS APPLY (
SELECT MAX(CAST(SUBSTRING(UserName,CHARINDEX('-',UserName) + 1,1000) AS INT)) --finds maximum number already used in destination table
FROM @destinationTable AS B
WHERE COALESCE(A.UserGroup,A.LastName) = COALESCE(B.UserGroup,B.LastName)
) CA(max_num)
Results:
LastName UserGroup UserName
--------------- --------------- --------------------------
Johnson Development Development-2
Johnson NULL Johnson-1
Smith Sales Sales-3
Smith Sales Sales-4
Brown Sales Sales-5
Smith NULL Smith-2
Smith NULL Smith-3

Note that if a LastName is the same as a UserGroup, this could produce duplicates.
SELECT LastName,
UserGroup,
LEFT(LOWER(REPLACE(UserGroup, ' ','')), 10)
+ '-'
+ CAST(ROW_NUMBER() OVER (PARTITION BY LEFT(LOWER(REPLACE(UserGroup, ' ','')), 10) ORDER BY (SELECT NULL)) AS VARCHAR(10))
AS UserName
FROM Users
WHERE UserGroup is not null
SELECT LastName,
UserGroup,
LEFT(LOWER(REPLACE(LastName, ' ','')), 10)
+ '-'
+ CAST(ROW_NUMBER() OVER (PARTITION BY LEFT(LOWER(REPLACE(LastName, ' ','')), 10) ORDER BY (SELECT NULL)) AS VARCHAR(10))
AS UserName
FROM Users
WHERE UserGroup is null
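The asker's original UPDATE never references its CTE, so one way to apply the generated names is to update through the CTE itself. The following is a minimal sketch rather than a tested migration script: the table variable @Users and its sample rows are hypothetical stand-ins for the real Users table, and it assumes every row is numbered from scratch (it does not look at numbers already used in the destination). Partitioning on the final prefix, whichever column it came from, means a last name that matches a group name shares one numbering sequence instead of producing the same username twice.

```sql
-- Hypothetical stand-in for the Users table from the question.
DECLARE @Users TABLE (
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    LastName  NVARCHAR(50),
    UserGroup NVARCHAR(50) NULL,
    UserName  NVARCHAR(50) NULL
);

INSERT INTO @Users (LastName, UserGroup)
VALUES (N'Smith', N'Sales'),         (N'Smith', N'Sales'),
       (N'Smith', NULL),             (N'Smith', NULL),
       (N'Johnson', N'Development'), (N'Johnson', NULL);

-- Number the rows per final prefix, then write the result back through the CTE.
WITH Numbered AS (
    SELECT UserName,
           LEFT(LOWER(REPLACE(COALESCE(UserGroup, LastName), ' ', '')), 10) AS Prefix,
           ROW_NUMBER() OVER (
               PARTITION BY LEFT(LOWER(REPLACE(COALESCE(UserGroup, LastName), ' ', '')), 10)
               ORDER BY Id
           ) AS rn
    FROM @Users
)
UPDATE Numbered
SET UserName = Prefix + N'-' + CAST(rn AS NVARCHAR(10));

SELECT Id, LastName, UserGroup, UserName
FROM @Users
ORDER BY Id;
```

Because the prefix is cut to 10 characters, as in the question's own query, long group names are truncated before the number is appended.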

Related

T-SQL : can this data be displayed using PIVOT without an aggregate

I have two tables that I've been asked to create a PIVOT table from. However, there is no aggregate, and after messing around with this I'm unsure whether it is possible.
Table A looks like:
ID CustomerID ItemCode CustomerItemCode CustomerItemDescription
-------------------------------------------------------------------
1 1 123 321 product x
2 2 123 456 product x
3 1 987 789 product y
4 2 987 567 product y
Table B:
CustomerID CustomerName
------------------------
1 Customer ABC
2 Customer XYZ
What the result should look like is:
ItemCode CustomerItemDescription Customer ABC Customer XYZ
-------------------------------------------------------------
123 product x 321 456
987 product y 789 567
Because customers could always be added, I'm trying to make this as dynamic as possible. I've gotten as far as setting up the customer columns and creating a temp table with the data, but without an aggregate I'm unsure how to make this display properly.

That is an interesting one, and yes, it is possible.
Here are two possibilities to achieve this.
Option 1 (the simpler to read and understand):
You can use this if you know the names of the customers and can generate parts of the required query outside of SQL, then execute the full query at the end.
SELECT
ItemCode,
CustomerItemDescription,
max(case when (CustomerName='Customer ABC') then CustomerItemCode else NULL end) as 'Customer ABC',
max(case when (CustomerName='Customer XYZ') then CustomerItemCode else NULL end) as 'Customer XYZ'
FROM Table_A
JOIN Table_B ON Table_A.CustomerID = Table_B.CustomerID
GROUP BY ItemCode, CustomerItemDescription
ORDER BY ItemCode;
SQL-Fiddle for Option 1: https://www.db-fiddle.com/f/eEEDSao6Qy9v6um8N4zjqn/0
Option 2 (dynamic but hard to understand):
Here you generate the query dynamically inside SQL using a variable and execute it at the end (note that this example uses MySQL syntax).
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'max(case when (CustomerName = ''',
CustomerName,
''') then CustomerItemCode else NULL end) as ''',
CustomerName,''''
)
) INTO @sql
FROM Table_A
JOIN Table_B ON Table_A.CustomerID = Table_B.CustomerID;
SET @sql = CONCAT('SELECT ItemCode, CustomerItemDescription, ', @sql, '
FROM Table_A
JOIN Table_B ON Table_A.CustomerID = Table_B.CustomerID
GROUP BY ItemCode, CustomerItemDescription
ORDER BY ItemCode');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SQL-Fiddle for Option 2: https://www.db-fiddle.com/f/xkAaZM9Z7Wszh89PMYNYWP/0
See also this article for additional information: https://ubiq.co/database-blog/display-row-values-columns-mysql/
SQL-Fiddle for the Article: https://www.db-fiddle.com/f/qZedfre2FtowmsxRB5TWt8/0

Concept is right: use dynamic SQL. If you are specifically working with T-SQL and SQL Server 2012+, try the following:
declare @fldlist varchar(max) = stuff((
select ', '+ concat('max(case when customername = ''', CustomerName , ''' then CustomerItemCode else NULL end) [', CustomerName , ']')
from Table_B
for xml path('')),1,1,'')
/*
need to use a global temp table because a local temp table created inside dynamic sql is dropped when that batch ends
*/
declare @tbname varchar(255) = '##UnlikelyToBeUsedGlobalTempTblName';
/*
use varchar(max) so the statement is not truncated when there are many customers with long names
*/
declare @sql varchar(max) = CONCAT('
if object_id(''tempdb..', @tbname, ''') is not null drop table ', @tbname, '
SELECT ItemCode, CustomerItemDescription, ', @fldlist, ' into ' , @tbname , '
FROM Table_A
JOIN Table_B ON Table_A.CustomerID = Table_B.CustomerID
GROUP BY ItemCode, CustomerItemDescription
ORDER BY ItemCode');
exec(@sql)
select * from ##UnlikelyToBeUsedGlobalTempTblName
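For comparison, the native PIVOT operator can also be used when the customer names are known up front. PIVOT always requires an aggregate, so MAX() is used purely as a formality here: each item/customer pair has at most one CustomerItemCode. This is a minimal sketch under that assumption; the table variables @Table_A and @Table_B are hypothetical stand-ins built from the sample data in the question.

```sql
-- Hypothetical table variables mirroring Table A and Table B from the question.
DECLARE @Table_A TABLE (ID INT, CustomerID INT, ItemCode INT,
                        CustomerItemCode INT, CustomerItemDescription VARCHAR(50));
DECLARE @Table_B TABLE (CustomerID INT, CustomerName VARCHAR(50));

INSERT INTO @Table_A VALUES
    (1, 1, 123, 321, 'product x'), (2, 2, 123, 456, 'product x'),
    (3, 1, 987, 789, 'product y'), (4, 2, 987, 567, 'product y');
INSERT INTO @Table_B VALUES (1, 'Customer ABC'), (2, 'Customer XYZ');

-- MAX() only satisfies PIVOT's syntax; each cell holds at most one value.
SELECT ItemCode, CustomerItemDescription, [Customer ABC], [Customer XYZ]
FROM (
    SELECT a.ItemCode, a.CustomerItemDescription, b.CustomerName, a.CustomerItemCode
    FROM @Table_A AS a
    JOIN @Table_B AS b ON a.CustomerID = b.CustomerID
) AS src
PIVOT (
    MAX(CustomerItemCode) FOR CustomerName IN ([Customer ABC], [Customer XYZ])
) AS p
ORDER BY ItemCode;
```

The column list in the IN clause cannot be a variable, which is why the dynamic SQL approaches above are still needed when customers are added over time.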

How to get one column of one record in SQL query?

I am trying to get the full name from the @MechanicExpertTable table with a SELECT query, but I get this error:
Incorrect syntax near the keyword 'SELECT'.
My code:
DECLARE @MechanicExpertTable AS TABLE
(
Id INT,
FirstName NVARCHAR(128),
LastName NVARCHAR(128)
);
INSERT INTO @MechanicExpertTable
SELECT
PROFILE.Id,
PROFILE.FirstName,
PROFILE.LastName
FROM
EstimatedRialMechanicExpert
INNER JOIN
PROFILE ON EstimatedRialMechanicExpert.ProfileId = PROFILE.Id
WHERE
EstimatedRialId = @id
DECLARE @MechanicExpert1 NVARCHAR(128) =
SELECT TOP(1)
ROW_NUMBER() OVER(ORDER BY Id ASC) AS rownumber,
@MechanicExpertTable.FirstName + ' ' + @MechanicExpertTable.LastName
FROM
@MechanicExpertTable
WHERE
rownumber = 3
How to fix this?

If you want to get the full name for a specific position, try the following example. In your case, ROW_NUMBER() is used without PARTITION BY, so TOP(1) is not necessary.
Input:
DECLARE @MechanicExpertTable AS TABLE (
Id INT,
FirstName NVARCHAR(128),
LastName NVARCHAR(128)
);
INSERT INTO @MechanicExpertTable
(Id, FirstName, LastName)
VALUES
(1, 'FirstName1', 'LastName1'),
(2, 'FirstName2', 'LastName2'),
(6, 'FirstName6', 'LastName6'),
(7, 'FirstName7', 'LastName7'),
(9, 'FirstName9', 'LastName9')
T-SQL:
DECLARE @MechanicExpert1 NVARCHAR(128)
SELECT TOP(1) @MechanicExpert1 = FullName
FROM (
SELECT
ROW_NUMBER() OVER(ORDER BY Id ASC) AS rownumber,
FirstName + ' ' + LastName AS FullName
FROM @MechanicExpertTable
) t
WHERE rownumber = 3
PRINT @MechanicExpert1
Output:
FirstName6 LastName6

The parentheses are missing. Wrap the subquery in (), and reference the table variable's columns without the @MechanicExpertTable prefix, so the code looks like this:
DECLARE @MechanicExpert1 NVARCHAR(128) =
(
SELECT TOP 1 T
FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY Id ASC) AS rownumber,
FirstName + ' ' + LastName AS T
FROM @MechanicExpertTable
)A WHERE rownumber = 3
)
Note: I kept TOP 1 in the selection because it was in your script. When you filter on rownumber = 3 there is no chance of getting multiple rows, so you can remove "TOP 1" from the script.
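On SQL Server 2012 and later, the same "third row by Id" lookup can also be written with OFFSET ... FETCH instead of ROW_NUMBER(). This is a minimal sketch, assuming the sample rows from the answer above; the variable name @MechanicExpert3 is made up for the illustration.

```sql
DECLARE @MechanicExpertTable TABLE (
    Id        INT,
    FirstName NVARCHAR(128),
    LastName  NVARCHAR(128)
);

INSERT INTO @MechanicExpertTable (Id, FirstName, LastName)
VALUES (1, N'FirstName1', N'LastName1'),
       (2, N'FirstName2', N'LastName2'),
       (6, N'FirstName6', N'LastName6'),
       (7, N'FirstName7', N'LastName7'),
       (9, N'FirstName9', N'LastName9');

-- Skip the first two rows by Id and take the next one (the third row).
-- NVARCHAR(257) leaves room for both 128-character names plus the space.
DECLARE @MechanicExpert3 NVARCHAR(257) = (
    SELECT FirstName + N' ' + LastName
    FROM @MechanicExpertTable
    ORDER BY Id
    OFFSET 2 ROWS FETCH NEXT 1 ROWS ONLY
);

PRINT @MechanicExpert3;  -- FirstName6 LastName6
```

If the table has fewer than three rows, the subquery returns no row and the variable stays NULL.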

Counting number of Populated Columns by Row

I wanted to know if there is a way of counting the number of populated columns per row of a table.
For example if I have the simple table below Called Customer:
Name         Customer   DOB      Order number  Populated Columns
ABC Ltd      Jo Blogg   2/1/78   123           3
Umbrella Co  A Sherman           232           2
Nike                    14/5/98                1
What I want is a query which will give me an extra column with a number saying how many columns have a value in them.
Any ideas?

Can be done via trivial check on NULL (and empty strings for such columns):
SELECT
[Name]
, [Customer]
, [DOB]
, [Order number]
, CASE WHEN ISNULL([Name], '') != '' THEN 1 ELSE 0 END
+ CASE WHEN ISNULL([Customer], '') != '' THEN 1 ELSE 0 END
+ CASE WHEN [DOB] IS NOT NULL THEN 1 ELSE 0 END
+ CASE WHEN [Order number] IS NOT NULL THEN 1 ELSE 0 END AS [Populated Columns]
FROM Customer
This works nicely for a fixed and known number of columns.
The approach can be made more general if the column list is fetched from the metadata. The downside is that this requires dynamic SQL.
Below is an example for SQL Server 2017 and higher:
DECLARE @_SQL NVARCHAR(max)
DECLARE @_TableName sysname = 'Table1'
SELECT @_SQL =
'SELECT '
+ STRING_AGG(QUOTENAME(COLUMN_NAME), ',
')
+ ', '
+ STRING_AGG('
CASE WHEN ['+COLUMN_NAME+'] IS NOT NULL THEN 1 ELSE 0 END', ' +')
+ ' AS [Populated Columns]
FROM ' + QUOTENAME(MIN(TABLE_SCHEMA)) + '.' + QUOTENAME(MIN(TABLE_NAME))
FROM INFORMATION_SCHEMA.COLUMNs
WHERE TABLE_NAME = @_TableName
EXEC sys.sp_executesql @_SQL
It will generate and execute code like this:
SELECT
[Col1],
[Col2],
[Col3],
CASE WHEN [Col1] IS NOT NULL THEN 1 ELSE 0 END +
CASE WHEN [Col2] IS NOT NULL THEN 1 ELSE 0 END +
CASE WHEN [Col3] IS NOT NULL THEN 1 ELSE 0 END AS [Populated Columns]
FROM [dbo].[Table1]
In older versions, the same result is achievable with other string aggregation workarounds, such as STUFF with FOR XML PATH or SQLCLR functions.
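As a variation on the fixed-column check above, the columns can be turned into rows with CROSS APPLY (VALUES ...) and counted, since COUNT(expression) ignores NULLs. This is a minimal sketch: the table variable @Customer, its column types, and its sample rows are assumptions for illustration, and like the query above it counts all four columns, including Name.

```sql
-- Hypothetical stand-in for the Customer table; types and dates are assumed.
DECLARE @Customer TABLE (
    [Name]         VARCHAR(100),
    Customer       VARCHAR(100),
    DOB            DATE,
    [Order number] INT
);

INSERT INTO @Customer VALUES
    ('ABC Ltd',     'Jo Blogg',  '1978-01-02', 123),
    ('Umbrella Co', 'A Sherman', NULL,         232),
    ('Nike',        NULL,        '1998-05-14', NULL);

SELECT c.[Name], c.Customer, c.DOB, c.[Order number], p.[Populated Columns]
FROM @Customer AS c
CROSS APPLY (
    -- COUNT(v.val) skips NULLs; NULLIF treats empty strings as unpopulated.
    SELECT COUNT(v.val) AS [Populated Columns]
    FROM (VALUES (NULLIF(c.[Name], '')),
                 (NULLIF(c.Customer, '')),
                 (CONVERT(VARCHAR(30), c.DOB, 23)),
                 (CONVERT(VARCHAR(30), c.[Order number]))) AS v(val)
) AS p;
```

Every value is converted to a string only so that the VALUES rows share one data type; the converted text itself is never used.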

Here is another approach that uses UNPIVOT to calculate the same thing, assuming your table has a primary key or identity column.
declare @tmp table (id int, [Name] varchar(100), Customer varchar(100), dob datetime, orderno int)
insert into @tmp select 1, 'name1','c1',getdate(),123
insert into @tmp select 2,'name2',null,getdate(),123
insert into @tmp select 3,'name3',null,null,null
SELECT t.*,
t1.notpopulated
FROM @tmp t
INNER JOIN (SELECT 4 - Count(*) AS NotPopulated,
id
FROM
(SELECT id,
u.x,
u.y
FROM (SELECT id,
Cast([name]AS VARCHAR(100)) [name],
Cast(customer AS VARCHAR(100)) AS customer,
Cast(dob AS VARCHAR(100)) AS dob1,
Cast(orderno AS VARCHAR(100)) orderno
FROM @tmp) AS s
UNPIVOT ( [y]
FOR [x] IN ([name],
[Customer],
dob1,
[orderno]) ) u) t
GROUP BY id) t1
ON t1.id = t.id

Transform row in column in a table

I am trying to rotate the visualization of a table, showing the rows as columns without any kind of aggregation.
Suppose I have this table
create table user (
id int,
name nvarchar(100),
company nvarchar(100),
division nvarchar(100),
city nvarchar(100)
)
that can be retrieved with this select
select name, company, division, city from user order by id
which gives me this result
john Company1 division1 City1
Peter Company2 division2 City2
Mary Company3 division3 City3
.
.
but what I need is to show each line as a column and the first column with the name of the field like this
Name john Peter Mary ....
Company Company1 Company2 Company3 ....
Division division1 division2 division3 ....
City City1 City2 City3 ....
How can I accomplish this? I tried using this unpivot
select col,value
from
(select cast(name as varchar) as name,
cast(Company as varchar) as company,
cast(Division as varchar) as division,
cast(City as varchar) as city
from user) p
unpivot
(value for col in (name,company,division,city)) as unpvt
but this is what I got (Note: I want all the names in the same row)
name john
Company Company1
Division division1
City City1
name peter // this should be in the first row as a second column
Company Company2
Division division2
City City2
...

This is super ugly, but it's the only way I could figure out how to do what you want solely in SQL Server. If you copy and paste the code it should run and give you results and leave your database clean. I use a couple permanent tables to work around some dynamic sql scoping limitations, but I drop them both before it's done.
If Object_ID('tempdb..#userInfo') Is Not Null Drop Table #userInfo
Create Table #userInfo (id Int, name Varchar(100), company Varchar(100), division Varchar(100), city Varchar(100))
Insert #userInfo (id, name, company, division, city)
Values (1, 'john','company1', 'division1', 'city1'),
(2, 'peter','company2', 'division2', 'city2'),
(3, 'mary','company3', 'division3', 'city3'),
(4, 'timmy','company4', 'division4', 'city4'),
(5, 'nancy','company5', 'division5', 'city5'),
(6, 'james','company6', 'division6', 'city6'),
(7, 'brandon','company7', 'division7', 'city7'),
(8, 'jay','company8', 'division8', 'city8')
If Object_ID('tempdb..#unPivoted') Is Not Null Drop Table #unPivoted
Create Table #unPivoted (id Int, rid Int, col Varchar(100), value Varchar(100))
Insert #unPivoted
Select id, Row_Number() Over (Partition By id Order By col) As rID, col, value
From #userInfo p
Unpivot (value For col In (name, company, division, city)) As u
If Object_ID('dbo.TempQueryOutput') Is Not Null Drop Table dbo.TempQueryOutput
Select 1 As OrderCol,'City' As ColName Into dbo.TempQueryOutput
Union
Select 2,'Company'
Union
Select 3,'Division'
Union
Select 4,'Name'
Declare @sql Nvarchar(Max),
@maxID Int,
@loopIter Int = 1
Select @maxID = Max(id)
From #userInfo
While @loopIter <= @maxID
Begin
Set @sql = 'Select o.*, u.value As Col' + Convert(Nvarchar(100),@loopIter) + ' Into dbo.TempQueryTable
From dbo.TempQueryOutput o
Join #unPivoted u
On o.OrderCol = u.rid
And u.id = ' + Convert(Nvarchar(100),@loopIter)
Exec sp_executeSQL @sql
If Object_ID('dbo.TempQueryOutput') Is Not Null Drop Table dbo.TempQueryOutput
Select * Into dbo.TempQueryOutput
From dbo.TempQueryTable
If Object_ID('dbo.TempQueryTable') Is Not Null Drop Table dbo.TempQueryTable
Set @loopIter = @loopIter + 1
End
Update dbo.TempQueryOutput
Set OrderCol = Case
When ColName = 'Name' Then 1
When ColName = 'Company' Then 2
When ColName = 'Division' Then 3
When ColName = 'City' Then 4
End
Select *
From dbo.TempQueryOutput
Order By OrderCol
If Object_ID('dbo.TempQueryOutput') Is Not Null Drop Table dbo.TempQueryOutput
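When the set of ids is known in advance, the same transposition can be sketched in one statement by chaining UNPIVOT and PIVOT, with no permanent tables. This is only an illustration under that assumption: the table variable @userInfo and the hard-coded ids 1 to 3 are hypothetical, and a dynamic version would have to build the id list as a string, as the loop above does.

```sql
-- Hypothetical sample data; ids 1..3 are hard-coded in the PIVOT list below.
DECLARE @userInfo TABLE (
    id INT, name VARCHAR(100), company VARCHAR(100),
    division VARCHAR(100), city VARCHAR(100)
);

INSERT INTO @userInfo (id, name, company, division, city)
VALUES (1, 'john',  'company1', 'division1', 'city1'),
       (2, 'peter', 'company2', 'division2', 'city2'),
       (3, 'mary',  'company3', 'division3', 'city3');

-- UNPIVOT turns each field into a row; PIVOT then spreads the ids into columns.
-- MAX() is required by PIVOT but each cell holds exactly one value.
SELECT col AS Field, [1] AS Col1, [2] AS Col2, [3] AS Col3
FROM (
    SELECT id, col, value
    FROM @userInfo AS t
    UNPIVOT (value FOR col IN (name, company, division, city)) AS u
) AS src
PIVOT (MAX(value) FOR id IN ([1], [2], [3])) AS p
ORDER BY CASE col WHEN 'name' THEN 1 WHEN 'company' THEN 2
                  WHEN 'division' THEN 3 ELSE 4 END;
```

UNPIVOT needs all listed columns to have the same type and length, which is why every field here is VARCHAR(100).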

How to query data when primary key exists twice as foreign key in another table?

I am trying to query data, but it shows the same record twice. How can I make the results distinct?
Table 1: UserBasics
user_Id , user_Fullname , user_Zip , user_Need
--------------------------------------------------------------------
10 Alberto Cesaro 98001 Sales, Marketing & Public Relations
Table 2: UserProfession
Prof_ID , Company , Designation
----------------------------------
10 Young's Marketing Manager
10 Young's Regional Manager
My procedure:
CREATE PROC P @Zip VARCHAR(20)=NULL,
@Company VARCHAR(200)=NULL,
@Designation VARCHAR(100)=NULL,
@Interest VARCHAR(200)=NULL,
@CurrentID VARCHAR(200)=NULL
--@JobFunc varchar(200)=NULL
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
SELECT ub.user_Id,
ub.user_Fullname,
up.Designation,
up.Company
FROM UserBasics UB
INNER JOIN UserProfession up
ON ub.user_Id = up.Prof_ID
WHERE ( @Zip IS NULL
OR ub.user_Zip LIKE '%' + @Zip + '%' )
AND ( @Interest IS NULL
OR ub.user_Need LIKE '%' + @Interest + '%' )
AND ( @Company IS NULL
OR up.Company LIKE '%' + @Company + '%' )
AND ( ub.user_Id != @CurrentID )
AND ( @Designation IS NULL
OR up.Designation LIKE '%' + @Designation + '%' )
END
The above uses a stored procedure just to make the conditions and variables clear.
How can I show each user's data only once? I hope for your suggestions.
Thanks!
EDIT:
The output should look like this:
10 Alberto Cesaro Marketing Manager,Regional Manager Young's
EDIT (second time):
I have solved it for the company name, but there is a problem: if I want to filter on a column of the user profession table, how do I relate it? My query:
SELECT
user_Id, user_Fullname,
STUFF(
(SELECT ', ' + Designation
FROM UserProfession as up
WHERE Prof_ID = a.user_Id
FOR XML PATH (''))
, 1, 1, '') Designation,
STUFF(
(SELECT ', ' + Company
FROM UserProfession
WHERE Prof_ID = a.user_Id
FOR XML PATH (''))
, 1, 1, '') AS Company
FROM UserBasics AS a
where (@Zip is null or a.user_Zip like '%'+@Zip+'%') and
(@Interest is null or a.user_Need like '%'+@Interest+'%') and
-- (@JobFunc is null or m.mentor_jobFunction= @JobFunc) and
(@Company is null or up.Company like '%'+@Company+'%') and
(a.user_Id != @CurrentID) and
(@Designation is null or up.Designation like '%'+@Designation+'%')
--where a.user_Zip like '%90005%'
-- WHERE clause here
GROUP BY user_Id, user_Fullname
I am getting an error on Company and Designation in the WHERE clause. I hope for one last suggestion.

SELECT
user_Id, user_Fullname,
STUFF(
(SELECT ', ' + Designation
FROM UserProfession
WHERE Prof_ID = a.user_Id
FOR XML PATH (''))
, 1, 1, '') AS DesignationList
FROM UserBasics AS a
-- WHERE clause here
GROUP BY user_Id, user_Fullname
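The follow-up in the question, filtering on UserProfession columns while still returning one row per user, is not covered by the query above. One possible approach is to move those filters into an EXISTS subquery, because the alias up is only visible inside the STUFF subquery and cannot be used in the outer WHERE clause. This is a minimal sketch: the table variables and the filter values are hypothetical stand-ins for the real tables and procedure parameters.

```sql
-- Hypothetical stand-ins for the two tables in the question.
DECLARE @UserBasics TABLE (user_Id INT, user_Fullname VARCHAR(100),
                           user_Zip VARCHAR(20), user_Need VARCHAR(200));
DECLARE @UserProfession TABLE (Prof_ID INT, Company VARCHAR(200),
                               Designation VARCHAR(100));

INSERT INTO @UserBasics VALUES
    (10, 'Alberto Cesaro', '98001', 'Sales, Marketing & Public Relations');
INSERT INTO @UserProfession VALUES
    (10, 'Young''s', 'Marketing Manager'),
    (10, 'Young''s', 'Regional Manager');

-- Example filter values; in the procedure these would be parameters.
DECLARE @Company VARCHAR(200) = 'Young',
        @Designation VARCHAR(100) = NULL;

SELECT a.user_Id,
       a.user_Fullname,
       -- Length 2 removes both the leading comma and the space.
       STUFF((SELECT ', ' + up.Designation
              FROM @UserProfession AS up
              WHERE up.Prof_ID = a.user_Id
              FOR XML PATH('')), 1, 2, '') AS DesignationList
FROM @UserBasics AS a
WHERE EXISTS (
    -- Keep the user if at least one profession row passes the filters.
    SELECT 1
    FROM @UserProfession AS f
    WHERE f.Prof_ID = a.user_Id
      AND (@Company IS NULL OR f.Company LIKE '%' + @Company + '%')
      AND (@Designation IS NULL OR f.Designation LIKE '%' + @Designation + '%')
);
```

With this shape, a user matches only when a single profession row satisfies both filters, and the designation list still shows all of that user's designations, not just the matching ones.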
