Identify "Character space Character" - sql-server

I'm needing to identify all records within FirstName that contain values such as A & B or B C
Is there a way I could go about identifying these records? I have this currently, but it doesn't quite get me there:
select * from #temp
WHERE FirstName LIKE '%[a-z] [a-z]%' or
FirstName LIKE '%[a-z] & [a-z]%'
Sample Code:
Create table #temp
(
FirstName varchar(50)
)
insert into #temp
(
FirstName
)
select
'Mary Smith'
union
select
'John'
union
select
'Bob'
union
select
'Bruce'
union
select
'Sally'
union
select
'A & B'
union
select
'B C'
select * from #temp
drop table #temp

If you remove the wildcard character from your like expression (%) it will match your current test data e.g.
select *
from #temp
where FirstName like '[a-z] [a-z]'
or FirstName like '[a-z] & [a-z]';

Related

Query column headings data in SQL

I need to display column headings that occur in a different table as a list.
Example:
Column headings - Adam, Cory, Jack, Jane, John, Josef, Mary, Timothy, Charlotte, Jessica, Kristal, Clive
Required column headings (contained within another table) - Jack, Jane, John, Mary, Maria, Josef
How would I check if the column headings are equal to any in the "required" list and then display only those?
I recommend you make use of Pivot and Dynamic SQL. Join from your complete names table to the required names table to end up with just the names you need as columns. Format those values to be later used in pivot with dynamic SQL. With pivot you are forced to aggregate something. Look at the Pivot rows to columns without aggregate post for more information. I hard coded 1 as the aggregate value we need to be able to pivot. You can ignore that value and join on name assuming that is your key to other tables to get additional data for each name.
https://learn.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-ver15
https://learn.microsoft.com/en-us/sql/odbc/reference/dynamic-sql?view=sql-server-ver15
Pivot rows to columns without aggregate
declare #PivotColumns nvarchar(max) = ''
create table #required ( name nvarchar(500) )
insert into #required ([name])
select 'Jack' union select 'Jane' union select 'John' union select 'Mary' union select 'Maria' union select 'Josef'
create table #allnames ( name nvarchar(500) )
insert into #allnames ([name])
select 'Adam' union select 'Cory' union select 'Jack' union select 'Jane' union select 'John' union select 'Josef' union select 'Mary' union select 'Timothy' union select 'Charlotte' union select 'Jessica' union select 'Kristal' union select 'Clive' union select 'Maria'
select #PivotColumns = #PivotColumns + QUOTENAME([name]) + ',' from #required
select #PivotColumns = substring(#PivotColumns,1,len(#PivotColumns)-1)
declare #SQL nvarchar(max) = ''
select #SQL = N'
select
*
from (
select
req.*, 1 [Val]
from #allnames as [All]
join #required as Req
on [All].[name] = Req.[name]
) src
pivot
(
max([Val])
FOR [name] in ('+#PivotColumns+')
) piv';
exec sp_executesql #SQL
drop table #required
drop table #allnames

Query Large (in the millions) Data Faster

I have two tables:
Tbl1 has 2 columns: name and state
Tbl2 has name and state and additional columns about the fields
I am trying to match tbl1 name and state with tbl2 name and state. I have remove all exact matches, but I see that I could match more if I could account for misspelling and name variations by using a scalar function that compares the 2 names and returns an integer showing how close of a match they are (the lower the number returned the better the match).
The issue is that Tbl1 has over 2M records and Tbl2 has over 4M records – it takes about 30sec to just to search one record from Tbl1 in Tbl2.
Is there some way I could arrange the data or query so the search could be completed faster?
Here’s the table structure:
CREATE TABLE Tbl1
(
Id INT NOT NULL IDENTITY( 1, 1 ) PRIMARY KEY,
Name NVARCHAR(255),
[State] VARCHAR(50),
Phone VARCHAR(50),
DoB SMALLDATETIME
)
GO
CREATE INDEX tbl1_Name_indx ON dbo.Tbl1( Name )
GO
CREATE INDEX tbl1_State_indx ON dbo.Tbl1( [State] )
GO
CREATE TABLE Tbl2
(
Id INT NOT NULL IDENTITY( 1, 1 ) PRIMARY KEY,
Name NVARCHAR(255),
[State] VARCHAR(50)
)
GO
CREATE INDEX tbl2_Name_indx ON dbo.Tbl1( Name )
GO
CREATE INDEX tbl2_State_indx ON dbo.Tbl1( [State] )
GO
Here's a sample function that I tested with to try to rule out function complexity:
CREATE FUNCTION [dbo].ScoreHowCloseOfMatch
(
#SearchString VARCHAR(200) ,
#MatchString VARCHAR(200)
)
RETURNS INT
AS
BEGIN
DECLARE #Result INT;
SET #Result = 1;
RETURN #Result;
END;
Here's some sample data:
INSERT INTO Tbl1
SELECT 'Bob Jones', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Melcome T Homes', 'CA', '927-333-2222', 'June 10, 1971' UNION
SELECT 'Janet Rengal', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Matt Francis', 'TN', '234-333-2222', 'June 10, 1971' UNION
SELECT 'Same Bojen', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Frank Tonga', 'NY', '903-333-2222', 'June 10, 1971' UNION
SELECT 'Jill Rogers', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Tim Jackson', 'OR', '757-333-2222', 'June 10, 1971'
GO
INSERT INTO Tbl2
SELECT 'BobJonez', 'WA' UNION
SELECT 'Malcome X', 'CA' UNION
SELECT 'Jan Regal', 'WA'
GO
Here's the query:
WITH cte as (
SELECT t1Id = t1.Id ,
t1Name = t1.Name ,
t1State = t1.State,
t2Name = t2.Name ,
t2State = t2.State ,
t2.Phone ,
t2.DoB,
Score = dbo.ScoreHowCloseOfMatch(t1.Name, t2.Name)
FROM dbo.Tbl1 t2
JOIN dbo.Tbl2 t1
ON t1.State = t2.State
)
SELECT *
INTO CompareResult
FROM cte
ORDER BY cte.Score ASC
GO
One possibility would be to add a column with a normalized name used only for matching purposes. You would remove all the white spaces, remove accents, replace first names by abbreviated first names, replace known nicknames by real names etc.
You could even sort the first name and the last name of one person alphabetically in order to allow swapping both.
Then you can simply join the two tables by this normalized name column.
JOIN dbo.Tbl2 t1
ON t1.State = t2.State
You are joining 2Mx4M rows on a max 50 distinct values join criteria. No wonder this is slow. You need to go back to the drawing board and redefine your problem. If you really want to figure out the 'close match' of every body with everybody else in the same state, then be prepared to pay the price...

Unpivot dynamic table columns into key value rows

The problem that I need to resolve is data transfer from one table with many dynamic fields into other structured key value table.
The first table comes from a data export from another system, and has the following structure ( it can have any column name and data):
[UserID],[FirstName],[LastName],[Email],[How was your day],[Would you like to receive weekly newsletter],[Confirm that you are 18+] ...
The second table is where I want to put the data, and it has the following structure:
[UserID uniqueidentifier],[QuestionText nvarchar(500)],[Question Answer nvarchar(max)]
I saw many examples showing how to unpivot table, but my problem is that I dont know what columns the Table 1 will have. Can I somehow dynamically unpivot the first table,so no matter what columns it has, it is converted into a key-value structure and import the data into the second table.
I will really appreciate your help with this.
You can't pivot or unpivot in one query without knowing the columns.
What you can do, assuming you have privileges, is query sys.columns to get the field names of your source table then build an unpivot query dynamically.
--Source table
create table MyTable (
id int,
Field1 nvarchar(10),
Field2 nvarchar(10),
Field3 nvarchar(10)
);
insert into MyTable (id, Field1, Field2, Field3) values ( 1, 'aaa', 'bbb', 'ccc' );
insert into MyTable (id, Field1, Field2, Field3) values ( 2, 'eee', 'fff', 'ggg' );
insert into MyTable (id, Field1, Field2, Field3) values ( 3, 'hhh', 'iii', 'jjj' );
--key/value table
create table MyValuesTable (
id int,
[field] sysname,
[value] nvarchar(10)
);
declare #columnString nvarchar(max)
--This recursive CTE examines the source table's columns excluding
--the 'id' column explicitly and builds a string of column names
--like so: '[Field1], [Field2], [Field3]'.
;with columnNames as (
select column_id, name
from sys.columns
where object_id = object_id('MyTable','U')
and name <> 'id'
),
columnString (id, string) as (
select
2, cast('' as nvarchar(max))
union all
select
b.id + 1, b.string + case when b.string = '' then '' else ', ' end + '[' + a.name + ']'
from
columnNames a
join columnString b on b.id = a.column_id
)
select top 1 #columnString = string from columnString order by id desc
--Now I build a query around the column names which unpivots the source and inserts into the key/value table.
declare #sql nvarchar(max)
set #sql = '
insert MyValuestable
select id, field, value
from
(select * from MyTable) b
unpivot
(value for field in (' + #columnString + ')) as unpvt'
--Query's ready to run.
exec (#sql)
select * from MyValuesTable
In case you're getting your source data from a stored procedure, you can use OPENROWSET to get the data into a table, then examine that table's column names. This link shows how to do that part.
https://stackoverflow.com/a/1228165/300242
Final note: If you use a temporary table, remember that you get the column names from tempdb.sys.columns like so:
select column_id, name
from tempdb.sys.columns
where object_id = object_id('tempdb..#MyTable','U')

Entering 40000 rows in sql server through loop

I have to make a database system that is purely on SQL Server. It's about a diagnostic lab. It should contain at least 40,000 distinct patient records. I have a table named "Patient" which contains an auto-generated ID, Name, DOB, Age and Phone number. Our teacher provided us with a dummy stored procedure which contained 2 temporary tables that has 200 names each and in the end he makes a Cartesian product which is supposed to give 40,000 distinct rows. I have used the same dummy stored procedure and modified it according to our table. But the rows inserted are only 1260 every time. Each time we run the query it does not give us more than 1260 records. I have added a part of temporary name tables and the stored procedure.
Declare #tFirstNames Table( FirstName Varchar(50) Not Null )
Declare #tLastNames Table ( LastName Varchar(50) Not Null )
Declare #tNames Table ( Id Int Identity Not Null, Name Varchar(50) Not Null)
Insert Into #tFirstNames (FirstName)
Select 'Julianne' Union All Select 'Sharyl' Union All Select 'Yoshie'
Union All Select 'Germaine' Union All Select 'Ja' Union All
Select 'Kandis' Select 'Hannelore' Union All Select 'Laquanda' Union All
Select 'Clayton' Union All Select 'Ollie' Union All
Select 'Rosa' Union All Select 'Deloras' Union All
Select 'April' Union All Select 'Garrett' Union All
Select 'Mariette' Union All Select 'Carline' Union All
Insert Into #tLastNames (LastName)
Select 'Brown' Union All Select 'Chrichton' Union All Select 'Bush'
Union All Select 'Clinton' Union All Select 'Blair'
Union All Select 'Wayne' Union All Select 'Hanks'
Union All Select 'Cruise' Union All Select 'Campbell'
Union All Select 'Turow' Union All Select 'Tracey'
Union All Select 'Arnold' Union All Select 'Derick'
Union All Select 'Nathanael' Union All Select 'Buddy'
Insert Into #tNames
Select FirstName + ' ' + LastName
From #tFirstNames, #tLastNames
Declare #iIndex Integer
Declare #iPatientTotalRecords Integer
Declare #vcName Varchar(50)
Declare #iAge Integer
--Set #iIndex = 1
Select #iPatientTotalRecords = Max(Id), #iIndex = Min(Id) From #tNames
While #iIndex <= #iPatientTotalRecords
Begin
Select #vcName = Name From #tNames Where Id = #iIndex
Set #iAge = Cast( Rand() * 70 As Integer ) + 10
Insert into Patient values
(#vcName, #iAge,
Case Cast( Rand() * 3 As Integer)
When 0 Then 'Male'
When 1 Then 'Female'
Else 'Female'
End,
Cast( Rand() * 8888889 As Integer ) + 1111111, DateAdd ( year, -#iAge, GetDate()))
Set #iIndex = #iIndex + 1
End
Possible you miss type UNION ALL -
Select 'Julianne' Union All
Select 'Sharyl' Union All
Select 'Yoshie' Union All
Select 'Germaine' Union All
Select 'Ja' Union All
Select 'Kandis' --<-- missing union all
Select 'Hannelore' Union All
Select 'Laquanda' Union All
Select 'Clayton' Union All
Select 'Ollie' Union All
Select 'Rosa' Union All
Select 'Deloras' Union All
Select 'April' Union All
Select 'Garrett' Union All
Select 'Mariette' Union All
Select 'Carline'
Try this one (without WHILE and additional variables):
DECLARE #tFirstNames TABLE (FirstName VARCHAR(50) NOT NULL)
INSERT INTO #tFirstNames (FirstName)
VALUES
('Julianne'), ('Sharyl'), ('Yoshie'), ('Germaine'),
('Ja'), ('Kandis'), ('Hannelore'), ('Laquanda'), ('Clayton'),
('Ollie'), ('Rosa'), ('Deloras'), ('April'), ('Garrett'),
('Mariette'), ('Carline')
DECLARE #tLastNames TABLE (LastName VARCHAR(50) NOT NULL)
INSERT INTO #tLastNames (LastName)
VALUES
('Brown'), ('Chrichton'), ('Bush'), ('Clinton'),
('Blair'), ('Wayne'), ('Hanks'), ('Cruise'), ('Campbell'),
('Turow'), ('Tracey'), ('Arnold'), ('Derick'),
('Nathanael'), ('Buddy')
INSERT INTO dbo.Patient (...)
SELECT
-- Possible problem: String or binary data would be truncated
d.FLName -- <-- FirstName + LastName i.e. 50 + 1 + 50 = 101 chars
, d.Age
, Gender = CASE ABS(CAST((BINARY_CHECKSUM(NEWID(), NEWID())) AS INT)) % 3
WHEN 0 THEN 'Male'
ELSE 'Female'
END
, (ABS(CAST((BINARY_CHECKSUM(NEWID(), NEWID())) AS INT)) % 8888889) + 1111111
, BirthDay = CONVERT(VARCHAR(10), DATEADD( year, -d.Age, GETDATE()), 112)
FROM (
SELECT
FLName = f.FirstName + ' ' + l.LastName
, Age = (ABS(CAST((BINARY_CHECKSUM(f.FirstName, NEWID())) AS INT)) % 70) + 10
FROM #tFirstNames f
CROSS JOIN #tLastNames l
) d

Select row data as ColumnName and Value

I have a history table and I need to select the values from this table in ColumnName, ColumnValue form. I am using SQL Server 2008 and I wasn’t sure if I could use the PIVOT function to accomplish this. Below is a simplified example of what I need to accomplish:
This is what I have:
The table’s schema is
CREATE TABLE TABLE1 (ID INT PRIMARY KEY, NAME VARCHAR(50))
The “history” table’s schema is
CREATE TABLE TABLE1_HISTORY(
ID INT,
NAME VARCHAR(50),
TYPE VARCHAR(50),
TRANSACTION_ID VARCHAR(50))
Here is the data from TABLE1_HISTORY
ID NAME TYPE TRANSACTION_ID
1 Joe INSERT a
1 Bill UPDATE b
1 Bill DELETE c
I need to extract the data from TABLE1_HISTORY into this format:
TransactionId Type ColumnName ColumnValue
a INSERT ID 1
a INSERT NAME Joe
b UPDATE ID 1
b UPDATE NAME Bill
c DELETE ID 1
c DELETE NAME Bill
Other than upgrading to Enterprise Edition and leveraging the built in change tracking functionality, what is your suggestion for accomplishing this task?
You could try using a UNION
Test Data
DECLARE #TABLE1_HISTORY TABLE (
ID INT,
NAME VARCHAR(50),
TYPE VARCHAR(50),
TRANSACTION_ID VARCHAR(50))
INSERT INTO #TABLE1_HISTORY
SELECT 1, 'Joe', 'INSERT', 'a'
UNION ALL SELECT 1, 'Bill', 'UPDATE', 'b'
UNION ALL SELECT 1, 'Bill', 'DELETE', 'c'
SQL Statement
SELECT [TransactionID] = Transaction_ID
, [Type] = [Type]
, [ColumnName] = 'ID'
, [ColumnValue] = CAST(ID AS VARCHAR(50))
FROM #Table1_History
UNION ALL
SELECT [TransactionID] = Transaction_ID
, [Type] = [Type]
, [ColumnName] = 'NAME'
, [ColumnValue] = [Name]
FROM #Table1_History
ORDER BY TransactionID
, ColumnName
This can be done with the UNPIVOT function in SQL Server:
select transaction_id,
type,
ColumnName,
ColumnValue
from
(
select transaction_id,
type,
cast(id as varchar(50)) id,
name
from TABLE1_HISTORY
) src
unpivot
(
ColumnValue
for ColumnName in (ID, Name)
) un

Resources