Separate data with substring and findstring - sql-server

I have a problem that I think is simple, but I'm just getting started with this. I have a .txt file that contains:
Kayle;Osvorn;35;4399900433
These would be my columns: First name;Last name;Age;Phone
I need to split them with a Derived Column transformation in my ETL package, but so far I have only been able to extract the first and last name, and I don't know how to continue with the rest.
This is what I have for the first two columns:
Name = SUBSTRING(CustomerData,1,FINDSTRING(CustomerData,";",1) - 1)
Last Name = SUBSTRING(CustomerData,FINDSTRING(CustomerData,";",1) + 1,LEN(CustomerData))
Age = ?
Phone = ?
Does anyone have any idea how the expression would go?

There's no need to use a Derived Column transformation in an SSIS package. Instead, in your Flat File Connection Manager, define your field separator as the semicolon ; instead of the default comma ','. Indicate that it should ... identify columns and now your single column of CustomerData goes away and you have nice delimited columns.
If you have column headers, it should pull that out. Otherwise, you will need to specify no header and then go into the advanced tab and give them friendly names.

You can use the logic below to achieve this:
DECLARE @T VARCHAR(200) = 'Kayle;Osvorn;35;4399900433'
DECLARE @index_1 INT
DECLARE @index_2 INT
DECLARE @index_3 INT
DECLARE @name VARCHAR(100)
DECLARE @last_name VARCHAR(100)
DECLARE @age VARCHAR(100)
DECLARE @phone VARCHAR(100)
-- Each index points at the first character after a semicolon
SELECT @index_1 = CHARINDEX(';',@T,0) + 1
SELECT @index_2 = CHARINDEX(';',@T,@index_1 + 1) + 1
SELECT @index_3 = CHARINDEX(';',@T,@index_2 + 1) + 1
-- Cut out the text between consecutive semicolons
SELECT
@name = SUBSTRING(@T,0,@index_1 - 1),
@last_name = SUBSTRING(@T, @index_1 ,@index_2 - @index_1 - 1),
@age = SUBSTRING(@T,@index_2, @index_3 - @index_2 - 1),
@phone = SUBSTRING(@T,@index_3,LEN(@T))
SELECT @name,@last_name, @age,@phone
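For exactly four fields, a shorter alternative to the index arithmetic above is the PARSENAME trick. This is a sketch of a different technique, not the method used in the answer. It assumes the values never contain a period, that there are at most four parts, and that each part is short, because PARSENAME is designed for object names.

```sql
DECLARE @T VARCHAR(200) = 'Kayle;Osvorn;35;4399900433';

-- PARSENAME splits on periods, so swap the semicolons for periods first
DECLARE @dotted VARCHAR(200) = REPLACE(@T, ';', '.');

-- Parts are numbered from the right: part 4 is the leftmost field
SELECT PARSENAME(@dotted, 4) AS name,
       PARSENAME(@dotted, 3) AS last_name,
       PARSENAME(@dotted, 2) AS age,
       PARSENAME(@dotted, 1) AS phone;
```

If a row has fewer than four fields, the missing leftmost parts come back as NULL, so this only suits input with a fixed layout.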

One simple way is to apply the same operations to the REVERSEd string:
[Name] = SUBSTRING(@CustomerData,1,FINDSTRING(@CustomerData,";",1) - 1)
[Last Name] = SUBSTRING(@CustomerData, FINDSTRING(@CustomerData, ";",1) + 1,
FINDSTRING(SUBSTRING(@CustomerData, FINDSTRING(@CustomerData, ";",1)+1, LEN(@CustomerData)),";",1)-1)
Age = REVERSE(SUBSTRING(REVERSE(@CustomerData), FINDSTRING(REVERSE(@CustomerData),";",1)+1,
FINDSTRING(SUBSTRING(REVERSE(@CustomerData), FINDSTRING(REVERSE(@CustomerData),";",1) + 1, LEN(@CustomerData)),";",1)-1))
Phone = REVERSE(SUBSTRING(REVERSE(@CustomerData),1,FINDSTRING(REVERSE(@CustomerData),";",1) - 1))

If you need to do that using a transformation, why not use the TOKEN() function?
Name = TOKEN(CustomerData,";",1)
Last Name = TOKEN(CustomerData,";",2)
Age = TOKEN(CustomerData,";",3)
Phone = TOKEN(CustomerData,";",4)

Related

How to check if a list of comma separated string values match an integer?

I've added the SQL query below.
I want to retrieve the list of records that match a condition. I pass integer values in the @ClassID and @SectionID parameters. The problem is that ce.Class_ID and ce.Section_ID hold lists of comma-separated string values.
SELECT ce.ID AS CircularEntryCount
FROM dbo.CircularEntry ce
WHERE ce.AcademicYearID = 1
AND (ce.Circular_Date = @CurrentDate OR CAST(ce.Created_Date AS date) = @CurrentDate)
AND (ce.CircularApplicableForID = 1 OR ce.CircularApplicableForID = 3)
AND (ce.Class_ID = @ClassID OR ce.Class_ID = '0')
AND (ce.Section_ID = @SectionID OR ce.Section_ID = '0')
PS: I used a split-string function to split the lists into individual values and compared those with the parameters, but it fails with:
Error converting data type nvarchar to bigint
(
@List nvarchar(2000),
@SplitOn nvarchar(1)
)
RETURNS @RtnValue table (
Id int identity(1,1),
Value nvarchar(100)
)
AS
BEGIN
While (Charindex(@SplitOn,@List)>0)
Begin
Insert Into @RtnValue (value)
Select
Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))
Set @List = Substring(@List,Charindex(@SplitOn,@List)+len(@SplitOn),len(@List))
End
Insert Into @RtnValue (Value)
Select Value = ltrim(rtrim(@List))
Return
END
The correct solution is to fix the problem - which means changing the structure of the database to not store delimited strings at all, but instead normalize the data and use foreign keys.
For more information, read Is storing a delimited list in a database column really that bad?, and not only the accepted answer by Bill Karwin, but other answers as well.
In case you can't change the database structure, you can use a workaround based on LIKE:
SELECT ce.ID AS CircularEntryCount
FROM dbo.CircularEntry ce
WHERE ce.AcademicYearID = 1
AND (ce.Circular_Date = @CurrentDate OR CAST(ce.Created_Date AS date) = @CurrentDate)
AND (ce.CircularApplicableForID = 1 OR ce.CircularApplicableForID = 3)
AND (','+ ce.Class_ID +',' LIKE '%,'+ CAST(@ClassID as varchar(20)) +',%' OR ce.Class_ID = '0')
AND (','+ ce.Section_ID +',' LIKE '%,'+ CAST(@SectionID as varchar(20)) +',%' OR ce.Section_ID = '0')
Note the cast to varchar(20): bigint's minimum value contains a minus sign and 19 digits. If the data type of @ClassID or @SectionID is int, you can cast to varchar(11) instead.
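To see why both the stored list and the search value are wrapped in commas, here is a minimal sketch that runs the same LIKE test against a few made-up lists. The sample values are assumptions for illustration only and do not come from the question's table.

```sql
DECLARE @ClassID bigint = 5;

-- Wrapping both sides in commas makes ',5,' match only a whole list item
SELECT v.Class_ID,
       CASE WHEN ',' + v.Class_ID + ',' LIKE '%,' + CAST(@ClassID AS varchar(20)) + ',%'
            THEN 1 ELSE 0 END AS IsMatch
FROM (VALUES ('5'), ('1,5,9'), ('15,25'), ('51')) AS v(Class_ID);
```

The lists '5' and '1,5,9' match, while '15,25' and '51' do not, because the pattern requires a comma on both sides of the 5. A leading wildcard prevents an index seek on the column, which is one more reason to prefer normalizing the data.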

SQL Server Full Text Search to match contact name to prevent duplicates

Using SQL Server Azure or 2017 with Full Text Search, I need to return possible matches on names.
Here's the simple scenario: an administrator is entering contact information for a new employee, first name, last name, address, etc. I want to be able to search the Employee table for a possible match on the name(s) to see if this employee has already been entered in the database.
This might happen as an autosuggest type of feature, or simply display some similar results, like here in Stackoverflow, while the admin is entering the data.
I need to prevent duplicates!
If the admin enters "Bob", "Johnson", I want to be able to match on:
Bob Johnson
Rob Johnson
Robert Johnson
This will give the administrator the option of seeing if this person has already been entered into the database and choose one of those choices.
Is it possible to do this type of match on words like "Bob" and include "Robert" in the results? If so, what is necessary to accomplish this?
Try this.
You need to change the @per parameter value to suit your requirement. It indicates how many letters of the first name must match for a row to be returned. I set it to 50% of the name's length for testing purposes.
The dynamic SQL built inside the loop adds up the CHARINDEX results for each letter of the first name in question, evaluated against every existing first name.
Caveats:
Repeated letters are misleading: Bob counts 3 matches in Rob because there are 2 Bs in Bob.
I didn't consider two first names, like Bob Robert Johnson, so this will fail for those. You can improve on that, but you get the idea.
The final SQL query returns the rows whose LetterMatch is greater than or equal to the value set in @per.
DECLARE @name varchar(MAX) = 'Bobby Johnson' --sample name
DECLARE @first varchar(50) = SUBSTRING(@name, 0, CHARINDEX(' ', @name)) --get the first part of the name before space
DECLARE @last varchar(50) = SUBSTRING(@name, CHARINDEX(' ', @name) + 1, LEN(@name) - LEN(@first) - 1) --get the last part of the name after space
DECLARE @walker int = 1 --for looping
DECLARE @per float = LEN(@first) * 0.50 --declare percentage of how many letters out of the length of the first name should match. I just used 50% for testing
DECLARE @char char --for looping
DECLARE @sql varchar(MAX) --for dynamic SQL use
DECLARE @matcher varchar(MAX) = '' --for dynamic SQL use
WHILE @walker <> LEN(@first) + 1 BEGIN --loop through all the letters of the first name saved in @first variable
SET @char = SUBSTRING(@first, @walker, 1) --save the current letter in the iteration
SET @matcher = @matcher + IIF(@matcher = '', '', ' + ') + 'IIF(CHARINDEX(''' + @char + ''', FirstName) > 0, 1, 0)' --build the additional column to be added to the dynamic SQL
SET @walker = @walker + 1 --move the loop
END
SET @sql = 'SELECT * FROM (SELECT FirstName, LastName, ' + @matcher + ' AS LetterMatch
FROM TestName
WHERE LastName LIKE ' + '''%' + @last + '%''' + ') AS src
WHERE CAST(LetterMatch AS int) >= ROUND(' + CAST(@per AS varchar(50)) + ', 0)'
SELECT @sql
EXEC(@sql)
SELECT * FROM tbl_Names
WHERE Name LIKE '% user defined text %';
Placing the text between the % wildcards matches it at any position in the data.

Need help making my stored procedure more efficient

I would like some help making my SQL Server 2016 stored procedure more efficient. I got it to work and that is 50% of my battle but I know that many (if not most) of you folks have much more experience with SQL Server stored procedures than I do.
My code so far:
DECLARE @U1A nvarchar(50), @U2A nvarchar(50),
@U3A nvarchar(50), @U4A nvarchar(50),
@U5A nvarchar(50), @U6A nvarchar(50),
@U7A nvarchar(50), @U8A nvarchar(50),
@U9A nvarchar(50)
DECLARE @Jsonstring nvarchar(max)
DECLARE @recCount int
SELECT
@recCount = COUNT(*)
FROM
[dbo].[Staging_PersonalInformation]
WHERE
jsondata IS NULL
WHILE @recCount > 0
BEGIN
SELECT TOP 1
@U1A = [FirstName], @U2A = [MiddleName],
@U3A = [LastName], @U4A = [EmailAddress],
@U5A = eraCommons, @U6A = [PositionTitle],
@U7A = [MyNCBILink], @U8A = [UniqueID],
@U9A = [ReferenceID]
FROM
[dbo].[Staging_PersonalInformation]
WHERE
jsondata IS NULL
SET @Jsonstring = '[{"name":"FirstName","value":"'+isnull(@U1A, '')+'"},{"name":"Middlename","value":"'+ISNULL(@U2A, '')+'"},{"name":"LastName","value":"'+isnull(@U3A, '')+'"},{"name":"emailaddress","value":"'+isnull(@U4A, '')+'"},{"name":"eRACommons","value":"'+ISNULL(@U5A, '')+'"},{"name":"positionTitle","value":"'+ISNULL(@U6A, '')+'"},{"name":"MyNCBILink","value":"'+ISNULL(@U7A, '')+'"},{"name":" uniqueid","value":"'+ISNULL(@U8A, '')+'"},{"name":"ReferenceID","value":"'+ISNULL(@U9A, '')+'"}]'
UPDATE Staging_PersonalInformation
SET JsonData = @Jsonstring
WHERE (EmailAddress = @U4A);
SET @recCount = @recCount - 1
END
The purpose of this is to take the individual column values and build a string that my serialized JavaScript form can use to repopulate itself. I would rather store the string than build it on the fly each time.
Thanks for your help
Well the biggest issue is that looping is horribly inefficient. And since you are always going to update this column based on values already in the table you could use a computed column and avoid all this work entirely.
I would suggest that in the future you give your variable names something meaningful instead of just numbering them.
Here is how you could make this a computed column. You can read more about computed columns here. https://technet.microsoft.com/en-us/library/ms191250.aspx
alter table [dbo].[Staging_PersonalInformation]
add jsondata as '[{"name":"FirstName","value":"' + isnull(FirstName, '')
+ '"},{"name":"Middlename","value":"' + ISNULL(MiddleName, '')
+ '"},{"name":"LastName","value":"' + isnull(LastName, '')
+ '"},{"name":"emailaddress","value":"'+isnull(EmailAddress, '')
+ '"},{"name":"eRACommons","value":"'+ISNULL(eraCommons, '')
+ '"},{"name":"positionTitle","value":"'+ISNULL(PositionTitle, '')
+ '"},{"name":"MyNCBILink","value":"'+ISNULL(MyNCBILink, '')
+ '"},{"name":" uniqueid","value":"'+ISNULL(UniqueID, '')
+ '"},{"name":"ReferenceID","value":"'+ISNULL(ReferenceID, '')
+ '"}]'
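If you would rather keep jsondata as an ordinary stored column, the loop can also be replaced by one set-based UPDATE. This is a minimal sketch under assumptions: the column types are not shown in the question, so UniqueID and ReferenceID are cast to nvarchar(50) to mirror the question's variables, and values are not escaped (on SQL Server 2016, STRING_ESCAPE with the 'json' type can handle embedded quotes).

```sql
-- One statement updates every row that still lacks its JSON string
UPDATE dbo.Staging_PersonalInformation
SET JsonData =
      '[{"name":"FirstName","value":"'      + ISNULL(FirstName, '')
    + '"},{"name":"Middlename","value":"'   + ISNULL(MiddleName, '')
    + '"},{"name":"LastName","value":"'     + ISNULL(LastName, '')
    + '"},{"name":"emailaddress","value":"' + ISNULL(EmailAddress, '')
    + '"},{"name":"eRACommons","value":"'   + ISNULL(eraCommons, '')
    + '"},{"name":"positionTitle","value":"' + ISNULL(PositionTitle, '')
    + '"},{"name":"MyNCBILink","value":"'   + ISNULL(MyNCBILink, '')
    -- Column types are assumed unknown, so cast before concatenating
    + '"},{"name":" uniqueid","value":"'    + ISNULL(CAST(UniqueID AS nvarchar(50)), '')
    + '"},{"name":"ReferenceID","value":"'  + ISNULL(CAST(ReferenceID AS nvarchar(50)), '')
    + '"}]'
WHERE JsonData IS NULL;
```

Unlike the loop, this does not depend on EmailAddress being unique or non-NULL to find the row it is updating.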
The answer by Sean Lange is a great answer, but I am curious as to why you are not taking advantage of SQL Server 2016 support of for json.
I realize that the format is not the same as you specified, so I suppose that could be the reason. Perhaps this format would also work?:
select *
from Staging_PersonalInformation
for json auto, include_null_values
dbfiddle.uk demo: http://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=d533c6d3b82fdd7865b3817fba94037d
returns:
[{"Id":1,"FirstName":"Sean","MiddleName":null,"LastName":"Lange","EmailAddress":null,"PositionTitle":null,"MyNCBILink":null,"UniqueID":"6E6732A9-9FC9-4B6E-8695-AF6BB2DA2152","ReferenceID":0}
,{"Id":2,"FirstName":"Sql","MiddleName":null,"LastName":"Zim","EmailAddress":null,"PositionTitle":null,"MyNCBILink":null,"UniqueID":"FA33808B-E8BE-41B5-AA89-DA8A37503F8F","ReferenceID":0}]
Reference:
JSON support in SQL Server 2016 - Robert Sheldon
JSON Data
Include Null Values in JSON - include_null_values Option
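If the form needs the original name/value layout, FOR JSON PATH can produce it by unpivoting the columns with a VALUES constructor. The following is a sketch only: the table variable @Staging and its three columns are stand-ins invented for illustration, not the real staging table.

```sql
-- Hypothetical stand-in for the staging table, reduced to three columns
DECLARE @Staging TABLE (FirstName nvarchar(50), MiddleName nvarchar(50), LastName nvarchar(50));
INSERT INTO @Staging VALUES (N'Sean', NULL, N'Lange');

-- Turn each column into a (name, value) row, then serialize the rows per record
SELECT (SELECT v.name, ISNULL(v.value, N'') AS value
        FROM (VALUES (N'FirstName',  s.FirstName),
                     (N'Middlename', s.MiddleName),
                     (N'LastName',   s.LastName)) AS v(name, value)
        FOR JSON PATH) AS JsonData
FROM @Staging AS s;
```

Each row should yield an array of {"name": ..., "value": ...} objects, with FOR JSON taking care of escaping special characters in the values.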

Generate column name dynamically in sql server

Please look at the query below.
select name as [Employee Name] from table name.
I want to generate [Employee Name] dynamically based on other column value.
Here is the sample table
s_dt dt01 dt02 dt03
2015-10-26
I want dt01 value to display as column name 26 and dt02 column value will be 26+1=27
I'm not sure if I understood you correctly. If I'm going in the wrong direction, please add comments to your question to make it more precise.
If you really want to create columns via SQL, you could try a variation of this script:
DECLARE @name NVARCHAR(MAX) = 'somename'
DECLARE @sql NVARCHAR(MAX) = 'ALTER TABLE aps.tbl_Fabrikkalender ADD '+@name+' nvarchar(10) NULL'
EXEC sys.sp_executesql @sql;
To retrieve the column name from another query, insert the following between the two declares above and fill in the placeholders as needed:
SELECT @name = <some column> FROM <some table> WHERE <some condition>
You would need to dynamically build the SQL as a string then execute it. Something like this...
DECLARE @s_dt INT
DECLARE @query NVARCHAR(MAX)
SET @s_dt = (SELECT DATEPART(dd, s_dt) FROM TableName WHERE 1 = 1)
SET @query = 'SELECT s_dt'
+ ', NULL as dt' + RIGHT('0' + CAST(@s_dt as VARCHAR), 2)
+ ', NULL as dt' + RIGHT('0' + CAST((@s_dt + 1) as VARCHAR), 2)
+ ', NULL as dt' + RIGHT('0' + CAST((@s_dt + 2) as VARCHAR), 2)
+ ', NULL as dt' + RIGHT('0' + CAST((@s_dt + 3) as VARCHAR), 2)
+ ' FROM TableName WHERE 1 = 1'
EXECUTE(@query)
You will need to replace WHERE 1 = 1 in the two places above to select your data, and change TableName to the name of your table. It currently puts NULL in as the dynamic column data; you probably want something else there.
To explain what it is doing:
SET @s_dt selects the date value from your table and returns only the day part as an INT.
SET @query dynamically builds your SELECT statement based on the day part (@s_dt).
Each line takes @s_dt, adds 0, 1, 2, 3 and so on, casts the result to VARCHAR, prepends '0' (so that it is at least 2 characters long) and then takes the rightmost two characters. The '0' and the RIGHT operation just ensure that anything under 10 gets a leading '0'.
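The zero-padding step can be checked on its own. A minimal sketch, using two made-up day numbers rather than data from the question:

```sql
-- '0' + '7' = '07' and '0' + '26' = '026'; RIGHT(..., 2) keeps the last two characters
SELECT d AS DayNumber,
       RIGHT('0' + CAST(d AS varchar(2)), 2) AS Padded
FROM (VALUES (7), (26)) AS t(d);
```

This returns 07 for 7 and 26 for 26. Note that adding 1, 2 or 3 to a day number near the end of a month yields values such as 32, so if the columns are meant to be real calendar days it may be safer to add the offset with DATEADD before taking the day part.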
It is possible to do this using dynamic SQL, however I would also consider looking at the pivot operators to see if they can achieve what you are after a lot more efficiently.
https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx

vb table adapter does not allow more than one parameter in the IN clause

What I need to achieve is to send a list containing an unknown number of values to a SQL Server NOT IN clause, but I can only get it to work with a single value. Below is my SQL statement:
SELECT SorMaster.LastInvoice
, SorMaster.SalesOrder
, SorMaster.OrderStatus
, ArCustomer.RouteCode
, SorMaster.Customer
, SorMaster.CustomerName
, SorMaster.CustomerPoNumber
, SorMaster.OrderDate
, SorMaster.DateLastInvPrt
, ArInvoice.InvoiceBal1
, ArInvoice.TermsCode
FROM SorMaster AS SorMaster
INNER JOIN ArCustomer AS ArCustomer ON SorMaster.Customer = ArCustomer.Customer
INNER JOIN ArInvoice AS ArInvoice ON SorMaster.LastInvoice = ArInvoice.Invoice
WHERE (SorMaster.OrderStatus = '9')
AND (SorMaster.Branch LIKE 'J%')
AND (SorMaster.DocumentType = 'O')
AND (SorMaster.LastInvoice > @Last_Invoice)
AND (SorMaster.OrderDate > DATEADD(Month, - 4, GETDATE()))
AND (SorMaster.LastInvoice NOT IN (@ExclusionList))
ORDER BY SorMaster.LastInvoice
The @ExclusionList value is generated by this code as a string from a listbox:
Dim exclusion As String = ""
If MenuForm.ExclusionCB.Checked = True Then
For i = 0 To MenuForm.ExclusionLB.Items.Count - 2
exclusion = exclusion & MenuForm.ExclusionLB.Items(i) & ","
Next
exclusion = exclusion & MenuForm.ExclusionLB.Items(MenuForm.ExclusionLB.Items.Count - 1)
Else
exclusion = ""
End If
I have also tried sending the entire listbox as a collection.
Does anyone know how I can send more than one value (something like 1,2,3,4,5,6) and have SQL understand that there is more than one? I don't mind the SELECT statement changing, as long as it returns the same information.
The reason I need the exclusion list is that our remote DB's PK is on the SalesOrder column and the local DB's is on the LastInvoice column.
I hope this makes sense. If you need more info, please let me know.
You can send it as a string and use dynamic sql. Here's a simple example how to do that.
DECLARE @vals VARCHAR(50) = '1,2,3,4,5,6'
DECLARE @sql VARCHAR(MAX) = 'SELECT * FROM TABLE WHERE FIELD1 IN'
SET @sql = @sql + ' (' + @vals + ')'
-- @sql = 'SELECT * FROM TABLE WHERE FIELD1 IN (1,2,3,4,5,6)'
EXEC (@sql)
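An alternative that avoids dynamic SQL is to keep passing the list as a single string parameter and split it inside the query. This sketch assumes SQL Server 2016 or later with database compatibility level 130 or higher (required for STRING_SPLIT), which the question does not confirm, and it uses made-up invoice values in place of the SorMaster table.

```sql
DECLARE @ExclusionList varchar(8000) = '1,2,3,4,5,6';

-- STRING_SPLIT returns one row per list item in a column named value
SELECT s.LastInvoice
FROM (VALUES ('3'), ('7'), ('10')) AS s(LastInvoice)  -- stand-in for SorMaster
WHERE s.LastInvoice NOT IN (SELECT LTRIM(RTRIM(value))
                            FROM STRING_SPLIT(@ExclusionList, ','));
```

With these sample values the query returns 7 and 10. Because the list stays a parameter, the table adapter query does not need to be rebuilt as a string, and the values are never executed as SQL.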
