How to split single column into 2 columns by delimiter

I have a table with the following data:
a | b | c
qqq | www | ddd/ff
fff | ggg | xx/zz
jjj | gwq | as/we
How would I write a query so that my data comes out as:
a | b | c_1 | c_2
qqq | www | ddd | ff

declare @t table(a varchar(20),b varchar(20),c varchar(20))
insert into @t values('qqq','www','ddd/ff')
SELECT a, b,
left(c,charindex('/',c)-1) As c_1,
right(c,charindex('/',reverse(c))-1) As c_2
FROM @t
Or, if column c does not always have the format xxx/yyy, you need to check the CHARINDEX position first:
declare @t table(a varchar(20),b varchar(20),c varchar(20))
insert into @t values('qqq','www','ddd/ff'), ('qqq','www','dddff')
SELECT a, b,
case when charindex('/',c) > 0 then left(c,charindex('/',c)-1) else c end As c_1,
case when charindex('/',c) > 0 then right(c,charindex('/',reverse(c))-1) else null end As c_2
FROM @t
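One thing to be aware of with the query above: LEFT cuts at the first slash while RIGHT cuts at the last one, so a value with two slashes loses its middle part. The following is only a sketch of a variant that always splits at the first slash; the sample rows 'xx/yy/zz' and 'noslash' are made up for illustration. The position is computed once in a CROSS APPLY and reused.

```sql
-- Sample data; the second and third rows are hypothetical edge cases.
DECLARE @t TABLE (a varchar(20), b varchar(20), c varchar(20));
INSERT INTO @t VALUES
    ('qqq', 'www', 'ddd/ff'),
    ('fff', 'ggg', 'xx/yy/zz'),
    ('jjj', 'gwq', 'noslash');

SELECT t.a, t.b,
       -- everything before the first slash, or the whole value if there is none
       CASE WHEN p.pos > 0 THEN LEFT(t.c, p.pos - 1) ELSE t.c END AS c_1,
       -- everything after the first slash, or NULL if there is none
       CASE WHEN p.pos > 0 THEN SUBSTRING(t.c, p.pos + 1, LEN(t.c)) END AS c_2
FROM @t AS t
CROSS APPLY (SELECT CHARINDEX('/', t.c) AS pos) AS p;
```

With this, 'xx/yy/zz' comes back as c_1 = 'xx' and c_2 = 'yy/zz', and 'noslash' comes back with c_2 = NULL.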

You can use the following:
select LEFT(name, CHARINDEX('/', name)-1) from test_table;
This returns the left part of the string name, before the slash. The following command returns the right part, after the slash:
select RIGHT(name, LEN(name) - CHARINDEX('/', name)) from test_table;
Here is a complete example:
create table test_table ( name varchar(50), substr1 varchar(50), substr2 varchar(50));
insert into test_table(name) values ('sub1/sub2');
-- number of characters before the slash = position - 1
update test_table set substr1 = LEFT(name, CHARINDEX('/', name)-1);
-- number of characters after the slash = length - position
update test_table set substr2 = RIGHT(name, LEN(name) - CHARINDEX('/', name));
select * from test_table;
The result is:
name | substr1 | substr2
sub1/sub2 | sub1 | sub2

PATINDEX can also be used instead of CHARINDEX:
SELECT a,b,LEFT(c,PATINDEX('%/%',c)-1), RIGHT(c,PATINDEX('%/%',REVERSE(c))-1) FROM @t

Related

SQL Dynamic Charindex

I have a field in a SQL table that I need to parse with CHARINDEX, but the little caveat is that I don't know how many pieces there are.
The field data looks like the following:
(Image: "filename=a.jpg"), (Image: "filename=b.jpg")
The problem is that I'm not sure how many filenames there will be in this string, so I need to build this out dynamically. There could be 1 or there could be 100.
Any suggestions?
Thanks

Since you cannot know in advance how many values you will extract from each string, I would suggest representing the results as rows, not columns.
If you are using SQL Server 2016 or higher, you can use the STRING_SPLIT() function to turn the comma-separated parts into rows. Then SUBSTRING() and CHARINDEX() can be used to extract the relevant information:
declare @t table ([txt] varchar(200))
insert into @t VALUES ('(Image: "filename=a.jpg"),(Image: "filename=b.jpg")')
SELECT value, SUBSTRING(
value,
CHARINDEX('=', value) + 1,
LEN(value) - CHARINDEX('=', value) - 2
)
FROM @t t
CROSS APPLY STRING_SPLIT(t.txt , ',')
Demo on DB Fiddle:
value | (No column name)
:------------------------ | :---------------
(Image: "filename=a.jpg") | a.jpg
(Image: "filename=b.jpg") | b.jpg
NB: this assumes that the value to extract always starts right after the first equals sign and ends 2 characters before the end of the string. If the pattern is different, you may need to adapt the SUBSTRING()/CHARINDEX() calls.
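If the fixed "2 characters before the end" assumption is too fragile for your data, one option is to search for the closing double quote instead. This is only a sketch under the assumption that every filename is terminated by a double quote; the aliases s, e and q are made up here. Pieces that do not match the pattern return NULL instead of raising an error.

```sql
DECLARE @t TABLE (txt varchar(200));
INSERT INTO @t VALUES ('(Image: "filename=a.jpg"), (Image: "filename=b.jpg")');

SELECT s.value,
       -- take the text between '=' and the next double quote, if both exist
       CASE WHEN e.eq > 0 AND q.endq > e.eq
            THEN SUBSTRING(s.value, e.eq + 1, q.endq - e.eq - 1)
       END AS filename
FROM @t AS t
CROSS APPLY STRING_SPLIT(t.txt, ',') AS s
CROSS APPLY (SELECT CHARINDEX('=', s.value) AS eq) AS e
CROSS APPLY (SELECT CHARINDEX('"', s.value, e.eq + 1) AS endq) AS q;
```

For the sample string this returns a.jpg and b.jpg, and the leading space after the comma does not matter because both positions are searched for rather than counted from the end.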

The real issue is that this violates first normal form (1NF). You should never store more than one piece of data in one cell. Such CSV formats are a pain in the neck, and you really should use a related side table to store your image hints one by one.
Nevertheless, this can be handled:
--A mockup table
DECLARE @mockup TABLE(ID INT IDENTITY,YourString VARCHAR(1000));
INSERT INTO @mockup VALUES
('(Image: "filename=a.jpg"), (Image: "filename=b.jpg") ')
,('(Image: "filename=aa.jpg"), (Image: "filename=bb.jpg"), (Image: "filename=cc.jpg"), (Image: "filename=dd.jpg"), (Image: "filename=ee.jpg")');
--Pick one element by its position:
DECLARE @position INT=2;
SELECT CAST('<x>' + REPLACE(t.YourString,',','</x><x>') + '</x>' AS XML)
.value('/x[position()=sql:variable("@position")][1]','nvarchar(max)')
FROM @mockup t;
The trick is to transform the string to XML and use XQuery to fetch the needed element by its position. The intermediate XML looks like this:
<x>(Image: "filename=a.jpg")</x>
<x> (Image: "filename=b.jpg") </x>
You can use some more replacements and L/RTRIM() to get it cleaner.
Read table data
And if you want to create a clean side table and you need all data neatly separated, you can use a bit more of the same:
SELECT CAST('<x><y><z>'
+ REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
t.YourString,'(','') --no opening parenthesis
,')','') --no closing parenthesis
,'"','') --no quotes
,' ','') --no blanks
,'=','</z><z>') --Split at "="
,':','</z></y><y><z>') --Split at ":"
,',','</z></y></x><x><y><z>') --Split at ","
+ '</z></y></x>' AS XML)
FROM @mockup t;
This returns
<x>
<y>
<z>Image</z>
</y>
<y>
<z>filename</z>
<z>a.jpg</z>
</y>
</x>
<x>
<y>
<z>Image</z>
</y>
<y>
<z>filename</z>
<z>b.jpg</z>
</y>
</x>
And with this you would get a clean EAV table:
WITH Casted AS
(
SELECT ID
,CAST('<x><y><z>'
+ REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
t.YourString,'(','')
,')','')
,'"','')
,' ','')
,'=','</z><z>')
,':','</z></y><y><z>')
,',','</z></y></x><x><y><z>')
+ '</z></y></x>' AS XML) AS CastedToXml
FROM @mockup t
)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ID
,ID AS oldId
,eachElement.value('y[1]/z[1]','varchar(max)') AS DataType
,eachElement.value('y[2]/z[1]','varchar(max)') AS ContentType
,eachElement.value('y[2]/z[2]','varchar(max)') AS Content
FROM Casted
CROSS APPLY CastedToXml.nodes('/x') A(eachElement)
The result
+----+-------+----------+-------------+---------+
| ID | oldId | DataType | ContentType | Content |
+----+-------+----------+-------------+---------+
| 1 | 1 | Image | filename | a.jpg |
+----+-------+----------+-------------+---------+
| 2 | 1 | Image | filename | b.jpg |
+----+-------+----------+-------------+---------+
| 3 | 2 | Image | filename | aa.jpg |
+----+-------+----------+-------------+---------+
| 4 | 2 | Image | filename | bb.jpg |
+----+-------+----------+-------------+---------+
| 5 | 2 | Image | filename | cc.jpg |
+----+-------+----------+-------------+---------+
| 6 | 2 | Image | filename | dd.jpg |
+----+-------+----------+-------------+---------+
| 7 | 2 | Image | filename | ee.jpg |
+----+-------+----------+-------------+---------+

I used a table-valued function:
ALTER FUNCTION [dbo].[Fn_sqllist_to_table](@list AS VARCHAR(8000),
@delim AS VARCHAR(10))
RETURNS @listTable TABLE(
Position INT,
Value VARCHAR(8000))
AS
BEGIN
DECLARE @myPos INT
SET @myPos = 1
WHILE Charindex(@delim, @list) > 0
BEGIN
INSERT INTO @listTable
(Position,Value)
VALUES (@myPos,LEFT(@list, Charindex(@delim, @list) - 1))
SET @myPos = @myPos + 1
IF Charindex(@delim, @list) = Len(@list)
INSERT INTO @listTable
(Position,Value)
VALUES (@myPos,'')
SET @list = RIGHT(@list, Len(@list) - Charindex(@delim, @list))
END
IF Len(@list) > 0
INSERT INTO @listTable
(Position,Value)
VALUES (@myPos,@list)
RETURN
END
I called it via
select * into #test from tableX as T
cross apply [Fn_sqllist_to_table](fieldname,'(')
and then used SUBSTRING on the value to fill the final table.

How to remove extension dates in SQL Server

How to remove extension dates in SQL server?
FileName | id
-------------------------+---
c:\abc_20181008.txt | 1
c:\xyz_20181007.dat | 2
c:\abc_xyz_20181007.dat | 3
c:\ab.xyz_20181007.txt | 4
Based on the above data, I want output like this:
Table: emp
FileName | id
-------------------+---
c:\abc.txt | 1
c:\xyz.dat | 2
c:\abc_xyz.dat | 3
c:\ab.xyz.txt | 4
I have tried like this:
select
substring (Filename, replace(filename, '.', ''), len(filename)), id
from
emp
But this query is not returning the expected result in SQL Server.
Please tell me how to write a query to achieve this task in SQL Server.

You can use the following query:
SELECT id, filename,
LEFT(filename, LEN(filename) - i1) + RIGHT(filename, i2 - 1)
FROM emp
CROSS APPLY
(
SELECT CHARINDEX('_', REVERSE(filename)) AS i1,
PATINDEX('%[0-9]%', REVERSE(filename)) AS i2
) AS x

You can try this as well:
declare @t table (a varchar(50))
insert into @t values ('c:\abc_20181008.txt')
insert into @t values ('c:\abc_xyz_20181007.dat')
insert into @t values ('c:\ab.xyz_20181007.txt')
insert into @t values ('c:\ab.xyz_20182007.txt')
select replace(SUBSTRING(a,1,CHARINDEX('2',a) - 1) + SUBSTRING(a,len(a)-3,LEN(a)),'_.','.') from @t
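The query above depends on the first '2' in the path being the start of the date and on a three-letter extension. Here is a hedged sketch of a pattern-based alternative, assuming the date is always an underscore followed by exactly eight digits directly before the dot of the extension. The row 'c:\nodate.txt' is a made-up case to show what happens when nothing matches.

```sql
DECLARE @t TABLE (a varchar(50));
INSERT INTO @t VALUES
    ('c:\abc_20181008.txt'),
    ('c:\abc_xyz_20181007.dat'),
    ('c:\ab.xyz_20181007.txt'),
    ('c:\nodate.txt');

SELECT t.a,
       -- remove the 9 characters "_yyyymmdd" when the pattern is found;
       -- STUFF returns NULL for a start position of 0, hence the CASE
       CASE WHEN p.pos > 0 THEN STUFF(t.a, p.pos, 9, '') ELSE t.a END AS cleaned
FROM @t AS t
CROSS APPLY (
    -- [_] matches a literal underscore (a bare _ is a single-character wildcard)
    SELECT PATINDEX('%[_][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].%', t.a) AS pos
) AS p;
```

For 'c:\abc_20181008.txt' the pattern starts at position 7, so the result is 'c:\abc.txt'. The row without a date is returned unchanged.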

How to check what column in INSERT do not have the correct data type?

Imagine I have 200 columns in one INSERT statement, and I occasionally get a "Cannot convert" error for one of the columns. The thing is, I do not know which column causes this error.
Is there any way in T-SQL or mybatis to check WHICH column has the incorrect format? (I have just date, char, numeric). I can use ISNUMERIC, ISDATE for every column, but this is not so elegant.
I'm using mybatis in Java, so I cannot use any PreparedStatement or so.

You could build a query that tries to convert each of the suspected columns,
and limit the query to rows where one of the conversion attempts fails.
Mostly the bad data will be in CHARs or VARCHARs that you are trying to cast or convert to a datetime or number type.
So you can limit your research to those.
Also, the error message should show which value failed to convert to which type, which can also help to narrow down which fields to research.
A simplified example using table variables:
declare @T1 table (id int identity(1,1) primary key, field1 varchar(30), field2 varchar(30), field3 varchar(30));
declare @T2 table (id int identity(1,1) primary key, field1_int int, field2_date date, field3_dec decimal(10,2));
insert into @T1 (field1, field2, field3) values
('1','2018-01-01','1.23'),
('not an int','2018-01-01','1.23'),
('1','not a date','1.23'),
('1','2018-01-01','not a decimal'),
(null,'2018-01-01','1.23'),
('1',null,'1.23'),
('1','2018-01-01',null)
;
select top 1000
id,
case when try_convert(int, field1) is null then field1 end as field1,
case when try_convert(date, field2) is null then field2 end as field2,
case when try_convert(decimal(10,4), field3) is null then field3 end as field3
from @T1
where
try_convert(int, coalesce(field1, '0')) is null
or try_convert(date, coalesce(field2, '1900-01-01')) is null
or try_convert(decimal(10,4), coalesce(field3, '0.0')) is null;
Returns:
id field1 field2 field3
-- ---------- ----------- -------------
2 not an int NULL NULL
3 NULL not a date NULL
4 NULL NULL not a decimal
If the original data doesn't contain too much bad data, you could try to fix the original data first.
Or use try_convert for the problematic columns with bad data.
For example:
insert into @T2 (field1_int, field2_date, field3_dec)
select
try_convert(int, field1),
try_convert(date, field2),
try_convert(decimal(10,4), field3)
from @T1;

With larger imports, especially when you expect issues, a two-step approach is highly recommended:
1. Import the data into a very tolerant staging table (all NVARCHAR(MAX)).
2. Check, evaluate, manipulate, and correct whatever is needed, and do the real insert from there.
Here is a generic approach you might adapt to your needs. It checks all of the table's values against a type-map table and outputs all values that fail in TRY_CAST (needs SQL Server 2012+).
A table to mock up the staging table (partly borrowed from LukStorms' answer, thx!):
CREATE TABLE #T1 (id INT IDENTITY(1,1) PRIMARY KEY
,fldInt VARCHAR(30)
,fldDate VARCHAR(30)
,fldDecimal VARCHAR(30));
GO
INSERT INTO #T1 (fldInt, fldDate, fldDecimal) values
('1','2018-01-01','1.23'),
('blah','2018-01-01','1.23'),
('1','blah','1.23'),
('1','2018-01-01','blah'),
(null,'2018-01-01','1.23'),
('1',null,'1.23'),
('1','2018-01-01',null);
--a type map (might be taken from INFORMATION_SCHEMA of an existing target table automatically)
DECLARE @type_map TABLE(ColumnName VARCHAR(100),ColumnType VARCHAR(100));
INSERT INTO @type_map VALUES('fldInt','int')
,('fldDate','date')
,('fldDecimal','decimal(10,2)');
--The staging table's name
DECLARE @TableName NVARCHAR(100)='#T1';
--dynamically created statements for each column
DECLARE @columnSelect NVARCHAR(MAX)=
(SELECT
' UNION ALL SELECT id ,''' + tm.ColumnName + ''',''' + tm.ColumnType + ''',' + QUOTENAME(tm.ColumnName)
+ ',CASE WHEN TRY_CAST(' + QUOTENAME(tm.ColumnName) + ' AS ' + tm.ColumnType + ') IS NULL THEN 0 ELSE 1 END ' +
'FROM ' + QUOTENAME(@TableName)
FROM @type_map AS tm
FOR XML PATH('')
);
--The final dynamically created statement
DECLARE @cmd NVARCHAR(MAX)=
'SELECT tbl.*
FROM
(
SELECT 0 AS id,'''' AS ColumnName,'''' AS ColumnType,'''' AS ColumnValue,0 AS IsValid WHERE 1=0 '
+ @columnSelect +
') AS tbl
WHERE tbl.IsValid = 0;'
--Execution with EXEC()
EXEC(@cmd);
The result:
+----+------------+---------------+-------------+---------+
| id | ColumnName | ColumnType | ColumnValue | IsValid |
+----+------------+---------------+-------------+---------+
| 2 | fldInt | int | blah | 0 |
+----+------------+---------------+-------------+---------+
| 5 | fldInt | int | NULL | 0 |
+----+------------+---------------+-------------+---------+
| 3 | fldDate | date | blah | 0 |
+----+------------+---------------+-------------+---------+
| 6 | fldDate | date | NULL | 0 |
+----+------------+---------------+-------------+---------+
| 4 | fldDecimal | decimal(10,2) | blah | 0 |
+----+------------+---------------+-------------+---------+
| 7 | fldDecimal | decimal(10,2) | NULL | 0 |
+----+------------+---------------+-------------+---------+
The generated statement looks like this:
SELECT tbl.*
FROM
(
SELECT 0 AS id,'' AS ColumnName,'' AS ColumnType,'' AS ColumnValue,0 AS IsValid WHERE 1=0
UNION ALL SELECT id
,'fldInt'
,'int'
,[fldInt]
,CASE WHEN TRY_CAST([fldInt] AS int) IS NULL THEN 0 ELSE 1 END
FROM [#T1]
UNION ALL SELECT id
,'fldDate'
,'date',[fldDate]
,CASE WHEN TRY_CAST([fldDate] AS date) IS NULL THEN 0 ELSE 1 END
FROM [#T1]
UNION ALL SELECT id
,'fldDecimal'
,'decimal(10,2)'
,[fldDecimal]
,CASE WHEN TRY_CAST([fldDecimal] AS decimal(10,2)) IS NULL THEN 0 ELSE 1 END
FROM [#T1]
) AS tbl
WHERE tbl.IsValid = 0;

Reverse order of a XML Column in SQL Server

In a SQL Server table, I have an XML column that records statuses in the order they happened (the first is the oldest, the last is the current status).
I have to write a stored procedure that returns the statuses: newest first, oldest last.
This is what I wrote:
ALTER PROCEDURE [dbo].[GetDeliveryStatus]
@invoiceID nvarchar(255)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @xml xml
SET @xml = (SELECT statusXML
FROM Purchase
WHERE invoiceID = @invoiceID )
SELECT
t.n.value('text()[1]', 'nvarchar(50)') as DeliveryStatus
FROM
@xml.nodes('/statuses/status') as t(n)
ORDER BY
DeliveryStatus DESC
END
Example of value in the statusXML column:
<statuses>
<status>A</status>
<status>B</status>
<status>A</status>
<status>B</status>
<status>C</status>
</statuses>
I want the procedure to return:
C
B
A
B
A
With ORDER BY ... DESC it returns the statuses in reverse alphabetical order (C B B A A).
How should I correct my procedure?

Create a sequence for the nodes based on the existing order then reverse it.
WITH [x] AS (
SELECT
t.n.value('text()[1]', 'nvarchar(50)') as DeliveryStatus
,ROW_NUMBER() OVER (ORDER BY t.n.value('..', 'NVARCHAR(100)')) AS [Order]
FROM
@xml.nodes('/statuses/status') as t(n)
)
SELECT
DeliveryStatus
FROM [x]
ORDER BY [x].[Order] DESC
... results ...
DeliveryStatus
C
B
A
B
A
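Since ROW_NUMBER() over a constant expression does not strictly guarantee document order, here is a sketch of an alternative that derives each node's position inside XQuery, by counting the siblings that come before it with the node order operator <<. The variable @xml below just holds the sample document from the question, and the names x and Pos are made up for this example.

```sql
DECLARE @xml xml = N'<statuses>
  <status>A</status>
  <status>B</status>
  <status>A</status>
  <status>B</status>
  <status>C</status>
</statuses>';

SELECT x.DeliveryStatus
FROM (
    SELECT s.n.value('text()[1]', 'nvarchar(50)') AS DeliveryStatus,
           -- position = number of sibling elements that precede this node, plus 1
           s.n.value('for $i in . return count(../*[. << $i]) + 1', 'int') AS Pos
    FROM @xml.nodes('/statuses/status') AS s(n)
) AS x
ORDER BY x.Pos DESC;
```

Ordering by Pos descending gives newest first (C, B, A, B, A) without depending on how the engine happens to number the rows.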

There is no need to declare a variable first. You can (and you should!) read the needed values from your table column directly. Best would be an inline table-valued function (rather than an SP just to read something):
Better performance
inlineable
You can query many InvoiceIDs at once
set-based
Try this (I drop the mock table at the end; be careful with real data!):
CREATE TABLE Purchase(ID INT IDENTITY,statusXML XML, InvocieID INT, OtherValues VARCHAR(100));
INSERT INTO Purchase VALUES('<statuses>
<status>A</status>
<status>B</status>
<status>A</status>
<status>B</status>
<status>C</status>
</statuses>',100,'Other values of your row');
GO
WITH NumberedStatus AS
(
SELECT ID
,InvocieID
, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Nr
,stat.value('.','nvarchar(max)') AS [Status]
,OtherValues
FROM Purchase
CROSS APPLY statusXML.nodes('/statuses/status') AS A(stat)
WHERE InvocieID=100
)
SELECT *
FROM NumberedStatus
ORDER BY Nr DESC
GO
--Clean-Up
--DROP TABLE Purchase;
The result
+---+-----+---+---+--------------------------+
| 1 | 100 | 5 | C | Other values of your row |
+---+-----+---+---+--------------------------+
| 1 | 100 | 4 | B | Other values of your row |
+---+-----+---+---+--------------------------+
| 1 | 100 | 3 | A | Other values of your row |
+---+-----+---+---+--------------------------+
| 1 | 100 | 2 | B | Other values of your row |
+---+-----+---+---+--------------------------+
| 1 | 100 | 1 | A | Other values of your row |
+---+-----+---+---+--------------------------+

How to retrieve unique records having unique values in two columns from a table in SQL Server

I want to query a table so that the result contains unique values from two columns together. For example:
Table
EnquiryId | EquipmentId | Price
-----------+--------------+-------
1 | E20 | 10
1 | E50 | 40
1 | E60 | 20
2 | E30 | 90
2 | E20 | 10
2 | E90 | 10
3 | E90 | 10
3 | E60 | 10
For each EnquiryId, EquipmentId will be unique in the table. Now I want a result where I can get something like this
EnquiryId | EquipmentId | Price
-----------+--------------+-------
1 | E20 | 10
2 | E30 | 90
3 | E90 | 10
In the result, each EnquiryId present in the table should appear only once.
Suppose I have 3 EquipmentIds "E20,E50,E60" for EnquiryId "1": any random EquipmentId from these three values should be displayed.
Any help would be appreciated. Thank you in advance.

QUERY
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER
(PARTITION BY enquiryID
ORDER BY enquiryID ) AS RN
FROM tbl
)
SELECT enquiryID,equipmentID,Price
FROM cte
WHERE RN=1
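Ordering by the partition column itself leaves the choice of row up to the engine. If a genuinely random pick per enquiry is wanted, one hedged variant is to order by NEWID() inside the window. The table variable @tbl below simply mirrors the sample data from the question.

```sql
DECLARE @tbl TABLE (EnquiryId int, EquipmentId varchar(10), Price int);
INSERT INTO @tbl VALUES
    (1,'E20',10), (1,'E50',40), (1,'E60',20),
    (2,'E30',90), (2,'E20',10), (2,'E90',10),
    (3,'E90',10), (3,'E60',10);

WITH cte AS
(
    SELECT EnquiryId, EquipmentId, Price,
           -- NEWID() yields a fresh random value per row, so RN = 1 is a random row per enquiry
           ROW_NUMBER() OVER (PARTITION BY EnquiryId ORDER BY NEWID()) AS RN
    FROM @tbl
)
SELECT EnquiryId, EquipmentId, Price
FROM cte
WHERE RN = 1;
```

This always returns exactly one row per EnquiryId, but which EquipmentId is picked can change from run to run.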

The following code should help you.
Sorry that I only ended up with a lengthy solution. Run it in your SSMS and see the result.
Declare @tab table (EnquiryId int, EquipmentId varchar(10),Price int)
Insert into @tab values
(1,'E20',10),
(1,'E50',40),
(1,'E60',20),
(2,'E30',90),
(2,'E20',10),
(2,'E90',10),
(3,'E90',10),
(3,'E60',10)
----------------------------------------------
Declare @s int = 1
Declare @e int,@z varchar(10)
Declare @Equipment table (EquipmentId varchar(10),ind int)
Insert into @Equipment (EquipmentId) Select Distinct EquipmentId From @tab
Declare @Enquiry table (id int identity(1,1),EnquiryId int,EquipmentId varchar(10))
Insert into @Enquiry (EnquiryId) Select Distinct EnquiryId From @tab
Set @e = @@ROWCOUNT
While @s <= @e
begin
Select Top 1 @z = T.EquipmentId
From @tab T
Join @Enquiry E On T.EnquiryId = E.EnquiryId
Join @Equipment Eq On Eq.EquipmentId = T.EquipmentId
Where E.id = @s
And Eq.ind is Null
Order by NEWID()
update @Enquiry
Set EquipmentId = @z
Where id = @s
update @Equipment
Set ind = 1
Where EquipmentId = @z
Set @s = @s + 1
End
Select T.EnquiryId,T.EquipmentId,T.Price
From @tab T
left join @Enquiry E on T.EnquiryId = E.EnquiryId
Where T.EquipmentId = E.EquipmentId

You can use GROUP BY (the typical way) to remove duplicate values.
The basic steps are:
1. Alter the table and add an identity column.
2. Group by the columns that can be duplicated.
3. Delete those records.
See: Remove Duplicate Rows from a Table in SQL Server
