get lowercase data in SQL server - sql-server

I have data like below in one of the column in table.
john;144;ny;
Nelson;154;NY;
john;144;NC;
john;144;kw;
I want to retrieve the rows which has lowercase in 3rd part of the data
so i need to get
john;144;kw;
john;144;ny;
is possible to get the data like this?

Force a case-sensitive matching, and then compare forced-lowercase to original:
SELECT ...
FROM ..
WHERE LOWER(name) = name COLLATE Latin1_General_CS_AS
^^---case sensitive
If the name is all-lower to start with, then LOWER() won't change it, and you'll get a match. If it's something like John, then you'd be doing john = John and the case-sensitivity will fail the match.

This does not really answer your question, it certainly adds nothing to Marc's existing answer in terms of resolving your actual problem, it is merely meant as a demonstration of how simple it is to correct your design (this whole script runs in about a second on my local express instance of SQL Server 2012).
CREATE TABLE dbo.T
(
ThreePartData VARCHAR(60)
);
-- INSERT 20,000 ROWS
INSERT dbo.T (ThreePartData)
SELECT t.ThreePartName
FROM (VALUES ('john;144;ny;'), ('Nelson;154;NY;'), ('john;144;NC;'), ('john;144;kw;')) t (ThreePartName)
CROSS JOIN
( SELECT TOP (5000) Number = 1
FROM sys.all_objects a
CROSS APPLY sys.all_objects b
) n;
GO
-- HERE IS WHERE THE CHANGES START
/**********************************************************************/
-- ADD A COLUMN FOR EACH COMPONENT
ALTER TABLE dbo.T ADD PartOne VARCHAR(20),
PartTwo VARCHAR(20),
PartThree VARCHAR(20);
GO
-- UPDATE THE PARTS WITH THEIR CORRESPONDING COMPONENT
UPDATE dbo.T
SET PartOne = PARSENAME(REPLACE(ThreePartData, ';', '.') + 't', 4),
PartTwo = PARSENAME(REPLACE(ThreePartData, ';', '.') + 't', 3),
PartThree = PARSENAME(REPLACE(ThreePartData, ';', '.') + 't', 2);
GO
-- GET RID OF CURRENT COLUMN
ALTER TABLE dbo.T DROP COLUMN ThreePartData;
GO
-- CREATE A NEW COMPUTED COLUMN THAT REBUILDS THE CONCATENATED STRING
ALTER TABLE dbo.T ADD ThreePartData AS CONCAT(PartOne, ';', PartTwo, ';', PartThree, ';');
GO
-- OR FOR VERSIONS BEFORE 2012
--ALTER TABLE dbo.T ADD ThreePartData AS PartOne + ';' + PartTwo + ';' + PartThree + ';';
Then your query is as simple as:
SELECT *
FROM T
WHERE LOWER(PartThree) = PartThree COLLATE Latin1_General_CS_AS;
And since you have recreated a computed column with the same name, any select statements in use will not be affected, although updates and inserts will need addressing.

Using BINARY_CHECKSUM we can retrieve the rows which has lowercase or upper case
CREATE TABLE #test
(
NAME VARCHAR(50)
)
INSERT INTO #test
VALUES ('john;144;ny;'),
('Nelson;154;NY;'),
('john;144;NC;'),
('john;144;kw;')
SELECT *
FROM #test
WHERE Binary_checksum(NAME) = Binary_checksum(Lower(NAME))
OUTPUT
name
-----------
john;144;ny;
john;144;kw;

Related

How to make SQL server recognize unique Thai characters?

Here's my table:
id name
1 ទឹក កាបូន
2 លីអូ បៀរ
3 ស្របៀរ ២៤
4 ស្រាបៀរ ឌ្រាប់
When I query using this statement: SELECT * FROM t1 WHERE name = N'លីអូ បៀរ', it returns all rows. This is weird since those Thai characters are different.
Also, this will cause an issue if I make the name column as unique. Does anyone encounter the same issue and what is the possible fix? I tried changing the collation but still no avail.
How to make SQL server recognize unique Thai characters?
Bing translate identifies those characters as Khmer, not Thai. So you need to pick a collation that has language-specific rules for those characters, eg
drop table if exists t1
create table t1(id int, name nvarchar(200) collate Khmer_100_CI_AI )
insert into t1(id,name)
values (1, N'ទឹក កាបូន'),(2, N'លីអូ បៀរ'),(3, N'ស្របៀរ ២៤'),(4, N'ស្រាបៀរ ឌ្រាប់')
SELECT * FROM t1 WHERE name = N'លីអូ បៀរ'
Or use binary collation, that simply compares the characters by their code point values. eg
drop table if exists t1
create table t1(id int, name nvarchar(200) collate Latin1_General_100_BIN2 )
insert into t1(id,name)
values (1, N'ទឹក កាបូន'),(2, N'លីអូ បៀរ'),(3, N'ស្របៀរ ២៤'),(4, N'ស្រាបៀរ ឌ្រាប់')
SELECT * FROM t1 WHERE name = N'លីអូ បៀរ'
Even some of the newer Latin collations will work, eg
drop table if exists t1
create table t1(id int, name nvarchar(200) collate Latin1_General_100_CI_AI )
insert into t1(id,name)
values (1, N'ទឹក កាបូន'),(2, N'លីអូ បៀរ'),(3, N'ស្របៀរ ២៤'),(4, N'ស្រាបៀរ ឌ្រាប់')
SELECT * FROM t1 WHERE name = N'លីអូ បៀរ'

SQL Server - parameter with unknown number of multiple values

I'm creating a grid that has two columns: Name and HotelId. The problem is that data for this grid should be sent with a single parameter of VARCHAR type and should look like this:
#Parameter = 'Name1:5;Name2:10;Name3:6'
As you can see, the parameter contains Name and a number that represents ID value and you can have multiple such entries, separated by ";" symbol.
My first idea was to write a query that creates a temp table that will have two columns and populate it with data from the parameter.
How could I achieve this? It seems like I need to split the parameter two times: by the ";" symbol for each row and then by ":" symbol for each column.
How should I approach this?
Also, if there is any other more appropriate solution, I'm open to suggestions.
First Drop the #temp table if Exists...
IF OBJECT_ID('tempdb..#temp', 'U') IS NOT NULL
/*Then it exists*/
DROP TABLE #temp
Then create #temp table
CREATE TABLE #temp (v1 VARCHAR(100))
Declare all the #Paramter....
DECLARE #Parameter VARCHAR(50)
SET #Parameter= 'Name1:5;Name2:10;Name3:6'
DECLARE #delimiter nvarchar(1)
SET #delimiter= N';';
Here, Inserting all #parameter value into #temp table using ';' separated..
INSERT INTO #temp(v1)
SELECT * FROM(
SELECT v1 = LTRIM(RTRIM(vals.node.value('(./text())[1]', 'nvarchar(4000)')))
FROM (
SELECT x = CAST('<root><data>' + REPLACE(#Parameter, #delimiter, '</data><data>') + '</data></root>' AS XML).query('.')
) v
CROSS APPLY x.nodes('/root/data') vals(node)
)abc
After inserting the value into #temp table..get all the value into ':' seprated...
select Left(v1, CHARINDEX(':', v1)-1) as Name , STUFF(v1, 1, CHARINDEX(':', v1), '') as HotelId FROM #temp
Then you will get this type of Output

Execution time for a SQL Query seems excessive

I have a data set of about 33 million rows and 20 columns. One of the columns is a raw data tab I'm using to extract relevant data from, inlcuding ID's and account numbers.
I extracted a column for User ID's into a temporary table to trim the User ID's of spaces. I'm now trying to add the trimmed User ID column back into the original data set using this code:
SELECT *
FROM [dbo].[DATA] AS A
INNER JOIN #TempTable AS B ON A. [RawColumn] = B. [RawColumn]
Extracting the User ID's and trimming the spaces took about a minute for each query. However, running this last query I'm at the 2 hour mark and I'm only 2% of the way through the dataset.
Is there a better way to run the query?
I'm running the query in SQL Server 2014 Management Studio
Thanks
Update:
I continued to let it run through the night. When I got back into work, only 6 million rows had been completed of the 33 million rows. I cancelled the execution and I'm trying to add a smaller primary key (The only other key I could see on the table was the [RawColumn], which was a very long string of text) using:
ALTER TABLE [dbo].[DATA]
ADD ID INT IDENTITY(1,1)
Right now I'm an hour into the execution.
Next, I'm planning to make it the primary key using
ALTER TABLE dbo.[DATA]
ADD CONSTRAINT PK_[DATA] PRIMARY KEY(ID)
I'm not familiar with using Indexes.. I've tried looking up on Stack Overflow how to create one, but from what I'm reading it sounds like it would take just as long to create an index as it would to run this query. Am I wrong about that?
For context on the RawColumn data, it looks something like this:
FirstName: John LastName: Smith UserID: JohnS Account#: 000-000-0000
Update #2:
I'm now learning that using "ALTER TABLE" is a bad idea. I should have done a little bit more research into how to add a primary key to a table.
Update #3
Here's the code I used to extract the "UserID" code out of the "RawColumn" data.
DROP #TEMPTABLE1
GO
SELECT [RAWColumn],
SUBSTRING([RAWColumn], CHARINDEX('USERID:', [RAWColumn])+LEN('USERID:'), CHARINDEX('Account#:', [RAWColumn])-Charindex('Username:', [RAWColumn]) - LEN('Account#:') - LEN('USERID:')) AS 'USERID_NEW'
INTO #TempTable1
FROM [dbo].[DATA]
Next I trimmed the data from the temporary tables
DROP #TEMPTABLE2
GO
SELECT [RawColumn],
LTRIM([USERID_NEW]) AS 'USERID_NEW'
INTO #TempTable2
FROM #TempTable1
So now I'm trying to get the data from #TEMPTABLE2 back into my original [DATA] table. Hopefully this is more clear now.
So I think your parsing code is a little bit wrong. Here's an approach that doesn't assume that the values appear in any particular order. It does assume that the header/tag name has a space after the colon character and it assumes that the value end at the subsequent space character. Here's a snippet that manipulates a single value.
declare #dat varchar(128) = 'FirstName: John LastName: Smith UserID: JohnS Account#: 000-000-0000';
declare #tag varchar(16) = 'UserID: ';
/* datalength() counts the trailing space character unlike len() */
declare #idx int = charindex(#tag, #dat) + datalength(#tag);
select substring(#dat, #idx, charindex(' ', #dat + ' ', #idx + 1) - #idx) as UserID
To use it in a single query without the temporary variable, the most straightforward approach is to just replace each instance of "#idx" with the original expression:
declare #tag varchar(16) = 'UserID: ';
select RawColumn,
substring(
RawColumn,
charindex(#tag, RawColumn) + datalength(#tag),
charindex(
' ', RawColumn + ' ',
charindex(#tag, RawColumn) + datalength(#tag) + 1
) - charindex(#tag, RawColumn) + datalength(#tag)
) as UserID
from dbo.DATA;
As an update it looks something like this:
declare #tag varchar(16) = 'UserID: ';
update dbo.DATA
set UserID =
substring(
RawColumn,
charindex(#tag, RawColumn) + datalength(#tag),
charindex(
' ', RawColumn + ' ',
charindex(#tag, RawColumn) + datalength(#tag) + 1
) - charindex(#tag, RawColumn) + datalength(#tag)
) as UserID;
You also appear to be ignoring upper/lower case in your string matches. It's not clear to me whether you need to consider that more carefully.

How to find specific characters in a string and replace them with values fetched from a table in SQL Server

I have text stored in the table "StructureStrings"
Create Table StructureStrings(Id INT Primary Key,String nvarchar(4000))
Sample Data:
Id String
1 Select * from Employee where Id BETWEEN ### and ### and Customer Id> ###
2 Select * from Customer where Id BETWEEN ### and ###
3 Select * from Department where Id=###
and I want to replace the "###" word with a values fetched from another table
named "StructureValues"
Create Table StructureValues (Id INT Primary Key,Value nvarcrhar(255))
Id Value
1 33
2 20
3 44
I want to replace the "###" token present in the strings like
Select * from Employee where Id BETWEEN 33 and 20 and Customer Id> 44
Select * from Customer where Id BETWEEN 33 and 20
Select * from Department where Id=33
PS: 1. Here an assumption is that the values will be replaced with the tokens in the same order i.e first occurence of "###" will be replaced by first value of
"StructureValues.Value" column and so on.
Posting this as a new answer, rather than editting my previous.
This uses Jeff Moden's DelimitedSplit8K; it does not use the built in splitter available in SQL Server 2016 onwards, as it does not provide an item number (thus no join criteria).
You'll need to firstly put the function on your server, then you'll be able to use this. DO NOT expect it to perform well. There's a lot of REPLACE in this, which will hinder performance.
SELECT (SELECT REPLACE(DS.Item, '###', CONVERT(nvarchar(100), SV.[Value]))
FROM StructureStrings sq
CROSS APPLY DelimitedSplit8K (REPLACE(sq.String,'###','###|'), '|') DS --NOTE this uses a varchar, not an nvarchar, you may need to change this if you really have Unicode characters
JOIN StructureValues SV ON DS.ItemNumber = SV.Id
WHERE SS.Id = sq.id
FOR XML PATH ('')) AS NewString
FROM StructureStrings SS;
If you have any question, please place the comments on this answer; do not put them under the question which has already become quite a long discussion.
Maybe this is what you are looking for.
DECLARE #Employee TABLE (Id int)
DECLARE #StructureValues TABLE (Id int, Value int)
INSERT INTO #Employee
VALUES (1), (2), (3), (10), (15), (20), (21)
INSERT INTO #StructureValues
VALUES (1, 10), (2, 20)
SELECT *
FROM #Employee
WHERE Id BETWEEN (SELECT MIN(Value) FROM #StructureValues) AND (SELECT MAX(Value) FROM #StructureValues)
Very different take here:
CREATE TABLE StructureStrings(Id int PRIMARY KEY,String nvarchar(4000));
INSERT INTO StructureStrings
VALUES (1,'SELECT * FROM Employee WHERE Id BETWEEN ### AND ###'),
(2,'SELECT * FROM Customer WHERE Id BETWEEN ### AND ###');
CREATE TABLE StructureValues (Id int, [Value] int);
INSERT INTO StructureValues
VALUES (1,10),
(2,20);
GO
DECLARE #SQL nvarchar(4000);
--I'm asuming that as you gave one output you are supplying an ID or something?
DECLARE #Id int = 1;
WITH CTE AS(
SELECT SS.Id,
SS.String,
SV.[Value],
LEAD([Value]) OVER (ORDER BY SV.Id) AS NextValue,
STUFF(SS.String,PATINDEX('%###%',SS.String),3,CONVERT(varchar(10),[Value])) AS ReplacedString
FROM StructureStrings SS
JOIN StructureValues SV ON SS.Id = SV.Id)
SELECT #SQL = STUFF(ReplacedString,PATINDEX('%###%',ReplacedString),3,CONVERT(varchar(10),NextValue))
FROM CTE
WHERE Id = #Id;
PRINT #SQL;
--EXEC (#SQL); --yes, I should really be using sp_executesql
GO
DROP TABLE StructureValues;
DROP TABLE StructureStrings;
Edit: Note that Id 2 will return NULL, as there isn't a value to LEAD to. If this needs to change, we'll need more logic on what the value should be if there is not value to LEAD to.
Edit 2: This was based on the OP's original post, not what he puts it as later. As it currently stands, it's impossible.

Most effective way to check sub-string exists in comma-separated string in SQL Server

I have a comma-separated list column available which has values like
Product1, Product2, Product3
I need to search whether the given product name exists in this column.
I used this SQL and it is working fine.
Select *
from ProductsList
where productname like '%Product1%'
This query is working very slowly. Is there a more efficient way I can search for a product name in the comma-separated list to improve the performance of the query?
Please note I have to search comma separated list before performing any other select statements.
user defined functions for comma separation of the string
Create FUNCTION [dbo].[BreakStringIntoRows] (#CommadelimitedString varchar(max))
RETURNS #Result TABLE (Column1 VARCHAR(max))
AS
BEGIN
DECLARE #IntLocation INT
WHILE (CHARINDEX(',', #CommadelimitedString, 0) > 0)
BEGIN
SET #IntLocation = CHARINDEX(',', #CommadelimitedString, 0)
INSERT INTO #Result (Column1)
--LTRIM and RTRIM to ensure blank spaces are removed
SELECT RTRIM(LTRIM(SUBSTRING(#CommadelimitedString, 0, #IntLocation)))
SET #CommadelimitedString = STUFF(#CommadelimitedString, 1, #IntLocation, '')
END
INSERT INTO #Result (Column1)
SELECT RTRIM(LTRIM(#CommadelimitedString))--LTRIM and RTRIM to ensure blank spaces are removed
RETURN
END
Declare #productname Nvarchar(max)
set #productname='Product1,Product2,Product3'
select * from product where [productname] in(select * from [dbo].[![enter image description here][1]][1][BreakStringIntoRows](#productname))
Felix is right and the 'right answer' is to normalize your table. Although, maybe you have 500k lines of code that expect this column to exist as it is. So your next best (non-destructive) answer is:
Create a table to hold normalize data:
CREATE TABLE ProductsList2 (ProductId INT, ProductName VARCHAR)
Create a TRIGGER that on UPDATE/INSERT/DELETE maintains ProductList2 by splitting the string 'Product1,Product2,Product3' into three records.
Index your new table.
Query against your new table:
SELECT *
FROM ProductsList
WHERE ProductId IN (SELECT x.ProductId
FROM ProductsList2 x
WHERE x.ProductName = 'Product1')

Resources